[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2019-02-06 Thread huaxiang sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16762016#comment-16762016
 ] 

huaxiang sun commented on HBASE-18693:
--

By the way, do you know how to update the email address in JIRA? Mine still 
seems to be associated with my old address, which is no longer valid.

> adding an option to restore_snapshot to move mob files from archive dir to 
> working dir
> --
>
> Key: HBASE-18693
> URL: https://issues.apache.org/jira/browse/HBASE-18693
> Project: HBase
>  Issue Type: Improvement
>  Components: mob
>Affects Versions: 2.0.0-alpha-2
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Major
> Attachments: HBASE-18693.master.001.patch, 
> HBASE-18693.master.002.patch, HBASE-18693.master.003.patch
>
>
> Today, there is a single mob region where mob files for all user regions are 
> saved. There could be many files (one million) in a single mob directory. 
> When one mob table is restored or cloned from snapshot, links are created for 
> these mob files. This creates a scaling issue for mob compaction. In mob 
> compaction's select() logic, for each hFileLink, it needs to call NN's 
> getFileStatus() to get the size of the linked hfile. Assume that one such 
> call takes 20ms: 20ms * 1,000,000 = 20,000 seconds, or roughly 6 hours. 
> To avoid this overhead, we want to add an option so that restore_snapshot can 
> move mob files from archive dir to working dir. clone_snapshot is more 
> complicated as it can clone a snapshot to a different table so moving that 
> can destroy the snapshot. No option will be added for clone_snapshot.
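
The estimate in the description can be checked with a short calculation. This is only a sketch of the arithmetic: the file count (one million) and the per-call latency (20ms) are the illustrative figures from the description, not measurements, and `serial_lookup_seconds` is a name made up for this example.

```ruby
# Back-of-the-envelope cost of calling getFileStatus() once per HFileLink,
# one call after another. Inputs are the description's illustrative figures.
def serial_lookup_seconds(file_count, millis_per_call)
  file_count * millis_per_call / 1000.0
end

seconds = serial_lookup_seconds(1_000_000, 20)
hours = seconds / 3600.0
puts format('%.0f s, about %.1f hours', seconds, hours)
# prints "20000 s, about 5.6 hours"
```

The cost grows linearly with the number of links, which is why moving the files (so that no link has to be resolved) removes the overhead rather than merely reducing it.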



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2019-02-06 Thread huaxiang sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16762013#comment-16762013
 ] 

huaxiang sun commented on HBASE-18693:
--

Sorry [~pankaj2461], can you take over this jira? I do not have bandwidth now. 
Thanks.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2019-01-29 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754754#comment-16754754
 ] 

Hadoop QA commented on HBASE-18693:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 7s | HBASE-18693 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.8.0/precommit-patchnames for help. |

|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-18693 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12895096/HBASE-18693.master.003.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/15765/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |


This message was automatically generated.





[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2019-01-29 Thread Pankaj Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754752#comment-16754752
 ] 

Pankaj Kumar commented on HBASE-18693:
--

Ping [~huaxiang], any plan to commit this change?



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2018-01-05 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313764#comment-16313764
 ] 

huaxiang sun commented on HBASE-18693:
--

Thanks [~jingcheng.du] for the review.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2018-01-04 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312387#comment-16312387
 ] 

Jingcheng Du commented on HBASE-18693:
--

Thanks [~huaxiang]!
I am +1 to V3.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2018-01-04 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311744#comment-16311744
 ] 

huaxiang sun commented on HBASE-18693:
--

Hi [~jingcheng.du], just want to follow up on the review status, thanks.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-12-18 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295419#comment-16295419
 ] 

huaxiang sun commented on HBASE-18693:
--

Hi [~dujin...@gmail.com], ping for review, thanks.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-12-13 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16289742#comment-16289742
 ] 

huaxiang sun commented on HBASE-18693:
--

Hi [~dujin...@gmail.com], v3 is up to date. The only difference is 

diff --git a/hbase-shell/src/main/ruby/shell/commands/restore_snapshot.rb b/hbase-shell/src/main/rub
diff --git hbase-shell/src/main/ruby/hbase_constants.rb hbase-shell/src/main/ruby/hbase_constants.r
index ebaae78..12df9ff 100644
--- hbase-shell/src/main/ruby/hbase_constants.rb
+++ hbase-shell/src/main/ruby/hbase_constants.rb
@@ -84,6 +84,7 @@ module HBaseConstants
   SERVER_NAME = 'SERVER_NAME'.freeze
   LOCALITY_THRESHOLD = 'LOCALITY_THRESHOLD'.freeze
   RESTORE_ACL = 'RESTORE_ACL'.freeze
+  MOVE_MOB_FILES_FROM_ARCHIVE_TO_WORKDIR = 'MOVE_MOB_FILES_FROM_ARCHIVE_TO_WORKDIR'.freeze
   FORMATTER = 'FORMATTER'.freeze
   FORMATTER_CLASS = 'FORMATTER_CLASS'.freeze

Which is to address the TestShell failure. Can you review the v2 in review 
board? Thanks.
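
For illustration only, here is a minimal sketch of how a shell-style options hash keyed by the new constant might be read. The `HBaseConstants` module below is a stand-in that reproduces just two names from the diff, and `restore_flags` is a hypothetical helper. Neither is taken from the patch, whose restore_snapshot.rb changes are not shown in this thread.

```ruby
# Stand-in for the constants module touched by the diff; only two of the
# names that appear in the diff are reproduced here.
module HBaseConstants
  RESTORE_ACL = 'RESTORE_ACL'.freeze
  MOVE_MOB_FILES_FROM_ARCHIVE_TO_WORKDIR = 'MOVE_MOB_FILES_FROM_ARCHIVE_TO_WORKDIR'.freeze
end

# Hypothetical helper (an assumption, not from the patch): reads boolean
# flags out of an options hash, treating a missing key as false.
def restore_flags(options = {})
  {
    restore_acl: options.fetch(HBaseConstants::RESTORE_ACL, false) == true,
    move_mob_files: options.fetch(HBaseConstants::MOVE_MOB_FILES_FROM_ARCHIVE_TO_WORKDIR, false) == true
  }
end

flags = restore_flags({ HBaseConstants::MOVE_MOB_FILES_FROM_ARCHIVE_TO_WORKDIR => true })
puts flags[:move_mob_files]  # the flag was supplied, so this is true
puts flags[:restore_acl]     # not supplied, so it defaults to false
```

Defaulting the flag to false keeps the existing link-based restore as the behavior when the option is omitted, in line with the description's framing of the move as opt-in.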



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-12-12 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288010#comment-16288010
 ] 

huaxiang sun commented on HBASE-18693:
--

Thanks [~dujin...@gmail.com] for the reminder. I think I have the latest code, 
which I will polish and post.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-12-12 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16287449#comment-16287449
 ] 

Jingcheng Du commented on HBASE-18693:
--

Thanks [~huaxiang] for the patch!
Have you already posted the v3 patch to RB?



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-11-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233801#comment-16233801
 ] 

Hadoop QA commented on HBASE-18693:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 4m 29s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 30s | Maven dependency ordering for branch |
| +1 | mvninstall | 5m 7s | master passed |
| +1 | compile | 1m 56s | master passed |
| +1 | checkstyle | 1m 58s | master passed |
| +1 | shadedjars | 4m 41s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 1m 12s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 5m 2s | the patch passed |
| +1 | compile | 1m 48s | the patch passed |
| +1 | cc | 1m 48s | the patch passed |
| +1 | javac | 1m 48s | the patch passed |
| -1 | checkstyle | 0m 31s | hbase-client: The patch generated 2 new + 228 unchanged - 0 fixed = 230 total (was 228) |
| -1 | checkstyle | 1m 8s | hbase-server: The patch generated 24 new + 270 unchanged - 0 fixed = 294 total (was 270) |
| -1 | rubocop | 0m 11s | The patch generated 4 new + 340 unchanged - 1 fixed = 344 total (was 341) |
| -1 | ruby-lint | 0m 5s | The patch generated 2 new + 835 unchanged - 0 fixed = 837 total (was 835) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 54s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 49m 23s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | hbaseprotoc | 1m 44s | the patch passed |
| +1 | javadoc | 1m 7s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 29s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 2m 31s | hbase-client in the patch passed. |
| +1 | unit | 89m 37s | hbase-server in the patch passed. |
| +1 | unit | 7m 16s | hbase-shell 

[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-10-16 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206319#comment-16206319
 ] 

huaxiang sun commented on HBASE-18693:
--

I am checking these failed unit tests locally and will do another QA run after 
local verification, thanks.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-10-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206314#comment-16206314
 ] 

Ted Yu commented on HBASE-18693:


Can you get a clean QA run?
See if the 3 failed tests can be reproduced locally.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-10-16 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206304#comment-16206304
 ] 

huaxiang sun commented on HBASE-18693:
--

@tedyu and [~jingcheng.du], I posted v2 on the review board. Any comments on 
v2? Thanks.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199628#comment-16199628
 ] 

Hadoop QA commented on HBASE-18693:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| 0 | mvndep | 0m 37s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 19s | master passed |
| +1 | compile | 1m 48s | master passed |
| +1 | checkstyle | 0m 36s | master passed |
| +1 | mvneclipse | 0m 55s | master passed |
| +1 | shadedjars | 4m 10s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 6m 6s | master passed |
| +1 | javadoc | 1m 19s | master passed |
| 0 | mvndep | 0m 20s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 19s | the patch passed |
| +1 | compile | 2m 18s | the patch passed |
| +1 | cc | 2m 18s | the patch passed |
| +1 | javac | 2m 18s | the patch passed |
| +1 | checkstyle | 0m 35s | the patch passed |
| +1 | mvneclipse | 1m 0s | the patch passed |
| -1 | rubocop | 0m 9s | The patch generated 3 new + 334 unchanged - 1 fixed = 337 total (was 335) |
| -1 | ruby-lint | 0m 5s | The patch generated 1 new + 743 unchanged - 0 fixed = 744 total (was 743) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 46s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 60m 34s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | hbaseprotoc | 2m 43s | the patch passed |
| +1 | findbugs | 11m 42s | the patch passed |
| +1 | javadoc | 2m 28s | the patch passed |
| +1 | unit | 0m 57s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 4m 15s | hbase-client in the patch passed. |
| -1 | unit | 124m 16s | hbase-server in the patch failed. |
| -1 | unit | 7m 43s | hbase-shell in the patch failed. |
| +1 | asflicense | 1m 

[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-10-10 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16199017#comment-16199017
 ] 

huaxiang sun commented on HBASE-18693:
--

Thanks [~dujin...@gmail.com] and [~mdrob], I will take care of the comments.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-10-10 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198767#comment-16198767
 ] 

Mike Drob commented on HBASE-18693:
---

The rubocop & ruby-lint warnings look fine to ignore for now. We'll need to do 
a major cleanup pass later on line length anyway.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-10-09 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198114#comment-16198114
 ] 

Jingcheng Du commented on HBASE-18693:
--

Thanks a lot [~huaxiang], I've updated the comments in RB.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-10-09 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198008#comment-16198008
 ] 

Jingcheng Du commented on HBASE-18693:
--

Sure, [~huaxiang]. I am looking at it. It may take a few days. Thanks.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-10-09 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16197198#comment-16197198
 ] 

huaxiang sun commented on HBASE-18693:
--

[~dujin...@gmail.com], can you help to look at the patch? Thanks.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-10-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16195558#comment-16195558
 ] 

Hadoop QA commented on HBASE-18693:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| 0 | mvndep | 0m 32s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 41s | master passed |
| +1 | compile | 1m 28s | master passed |
| +1 | checkstyle | 0m 31s | master passed |
| +1 | mvneclipse | 0m 49s | master passed |
| +1 | shadedjars | 3m 34s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 4m 47s | master passed |
| +1 | javadoc | 1m 4s | master passed |
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 42s | the patch passed |
| +1 | compile | 1m 32s | the patch passed |
| +1 | cc | 1m 32s | the patch passed |
| +1 | javac | 1m 32s | the patch passed |
| +1 | checkstyle | 0m 36s | the patch passed |
| +1 | mvneclipse | 0m 48s | the patch passed |
| -1 | rubocop | 0m 8s | The patch generated 3 new + 332 unchanged - 1 fixed = 335 total (was 333) |
| -1 | ruby-lint | 0m 3s | The patch generated 1 new + 731 unchanged - 0 fixed = 732 total (was 731) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 3m 35s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 35m 18s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | hbaseprotoc | 1m 14s | the patch passed |
| +1 | findbugs | 5m 16s | the patch passed |
| +1 | javadoc | 1m 1s | the patch passed |
| +1 | unit | 0m 28s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 2m 43s | hbase-client in the patch passed. |
| +1 | unit | 95m 45s | hbase-server in the patch passed. |
| -1 | unit | 7m 10s | hbase-shell in the patch failed. |
| +1 | asflicense | 

[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-08-31 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149413#comment-16149413
 ] 

huaxiang sun commented on HBASE-18693:
--

Thanks [~dujin...@gmail.com]! I will upload a patch.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-08-31 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149191#comment-16149191
 ] 

Jingcheng Du commented on HBASE-18693:
--

bq. I see what your concern was. restore_snapshot always restores to the same 
table, which is why I am adding the option here. clone_snapshot is a different 
story: it can clone to different tables. If the option were added to 
clone_snapshot, it would corrupt the snapshot.
You are right. I am +1 on this option. Thanks!



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-08-30 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16147677#comment-16147677
 ] 

huaxiang sun commented on HBASE-18693:
--

Hi Jingcheng,

{quote}
Restoring a snapshot to the same table is okay. What if we try to restore the 
snapshot to another table? The same MOB file cannot be in different locations, 
right?
{quote}

I see what your concern was. restore_snapshot always restores to the same 
table, which is why I am adding the option here. clone_snapshot is a different 
story: it can clone to different tables. If the option were added to 
clone_snapshot, it would corrupt the snapshot.

{quote}
You are right, this is a problem. How about selecting files with multiple 
threads, with each thread handling part of the file selection? Thanks.
{quote}
HBASE-17043 has been created for this effort. I think it is not enough, and it 
still adds overhead (pressure on the NN). We need to give users an option in 
this case.
If this option looks good to you, I will post a patch.

Thanks
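
The difference between restore_snapshot and clone_snapshot described in this comment can be illustrated with a toy model. This is only a sketch under a stated assumption: a snapshot's reference to a mob file resolves if the file is found in the owning table's working dir or in that table's archive dir. Directories are in-memory sets, and every name here is hypothetical rather than HBase code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class MobMoveModel {
    // Toy file system: directory name -> set of file names.
    private final Map<String, Set<String>> dirs = new HashMap<>();

    void put(String dir, String file) {
        dirs.computeIfAbsent(dir, d -> new HashSet<>()).add(file);
    }

    // Rename-style move: the file leaves src and appears in dst.
    void move(String src, String dst, String file) {
        if (!dirs.getOrDefault(src, new HashSet<>()).remove(file)) {
            throw new IllegalStateException(file + " not in " + src);
        }
        put(dst, file);
    }

    // Assumed resolution rule: check the owning table's working dir,
    // then the owning table's archive dir.
    boolean resolves(String table, String file) {
        return has("data/" + table, file) || has("archive/" + table, file);
    }

    private boolean has(String dir, String file) {
        return dirs.getOrDefault(dir, Set.of()).contains(file);
    }

    public static void main(String[] args) {
        // Restore: the file moves back under the same table t1.
        MobMoveModel restore = new MobMoveModel();
        restore.put("archive/t1", "mob1");
        restore.move("archive/t1", "data/t1", "mob1");
        System.out.println(restore.resolves("t1", "mob1")); // true

        // Clone: the file moves under a different table t2.
        MobMoveModel clone = new MobMoveModel();
        clone.put("archive/t1", "mob1");
        clone.move("archive/t1", "data/t2", "mob1");
        System.out.println(clone.resolves("t1", "mob1")); // false
    }
}
```

Under that assumption, a move during restore keeps the snapshot's references resolvable, while a move during clone leaves the original table's snapshot pointing at a file that is no longer in either of its directories.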



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-08-30 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16147396#comment-16147396
 ] 

Jingcheng Du commented on HBASE-18693:
--

Thanks Huaxiang.
bq. The snapshot itself is not destroyed after moving mob files from the archive 
to the working directory. I do not see an issue with restoring a snapshot twice 
here. Can you share more details?
Restoring a snapshot to the same table is okay. What if we try to restore the 
snapshot to another table? The same MOB file cannot be in different locations, 
right?

bq. In one of our use cases, a user exported a snapshot with millions of mob 
files and restored the table on a remote cluster. The select() call took more 
than one day to complete before the actual compaction started. We hacked the 
code to skip hfile links so that compaction could finish within several minutes. 
Even if links are compacted at a longer interval, this is still a huge overhead. 
What do you think?
You are right, this is a problem. How about selecting files with multiple 
threads, with each thread handling part of the file selection? Thanks.
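
The multi-threaded selection suggested here could look roughly like the sketch below. It is not the HBase implementation: the size lookup is passed in as a function that stands in for the NameNode getFileStatus() call, and the class name, the threshold parameter and the chunking scheme are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToLongFunction;

public class ParallelSelectSketch {
    // Returns the files whose size is below threshold, in input order.
    // sizeOf stands in for one remote size lookup per file.
    static List<String> select(List<String> files, ToLongFunction<String> sizeOf,
                               long threshold, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Split the list into contiguous chunks, one task per chunk.
            int chunk = Math.max(1, (files.size() + threads - 1) / threads);
            List<Future<List<String>>> parts = new ArrayList<>();
            for (int start = 0; start < files.size(); start += chunk) {
                List<String> slice =
                    files.subList(start, Math.min(files.size(), start + chunk));
                parts.add(pool.submit(() -> {
                    List<String> picked = new ArrayList<>();
                    for (String f : slice) {
                        if (sizeOf.applyAsLong(f) < threshold) {
                            picked.add(f);
                        }
                    }
                    return picked;
                }));
            }
            // Collect the partial results in submission order.
            List<String> selected = new ArrayList<>();
            for (Future<List<String>> part : parts) {
                selected.addAll(part.get());
            }
            return selected;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // String length is used as a fake file size.
        List<String> files = List.of("a", "bb", "ccc", "dddd");
        System.out.println(select(files, String::length, 3, 2)); // [a, bb]
    }
}
```

Splitting the work shortens wall-clock time, but the total number of lookups stays the same, so the load on the NameNode is not reduced.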



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-08-29 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146177#comment-16146177
 ] 

huaxiang sun commented on HBASE-18693:
--

I ran one manual test to confirm that restore_snapshot works with mob files 
moved to the working directory. The snapshot is still there, and I can run 
restore_snapshot/clone_snapshot with it.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-08-29 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16145814#comment-16145814
 ] 

huaxiang sun commented on HBASE-18693:
--

Hi [~dujin...@gmail.com],

{quote}
My concern is that a snapshot can be restored twice. How would we handle
that?
{quote}

The snapshot itself is not destroyed after moving mob files from the archive 
to the working directory. I do not see an issue with restoring a snapshot twice 
here. Can you share more details?

{quote}
Or we could skip the hfile links in most MOB compactions and compact the
links at a longer interval (like a month)?
{quote}

In one of our use cases, a user exported a snapshot with millions of mob files 
and restored the table on a remote cluster. The select() call took more than 
one day to complete before the actual compaction started. We hacked the code to 
skip hfile links so that compaction could finish within several minutes. Even 
if links are compacted at a longer interval, this is still a huge overhead. 
What do you think? 

Thanks. 
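
The workaround mentioned in this comment (skipping hfile links during selection) can be sketched as follows. The actual hack is not shown in this thread, so this only illustrates the idea: the link test is supplied by the caller, and the example predicate (a name containing `=`) is an assumption, as are the class name and the threshold parameter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.ToLongFunction;

public class SkipLinksSelectSketch {
    // Picks files below threshold, never calling sizeOf for links.
    // sizeOf stands in for one NameNode getFileStatus() call.
    static List<String> select(List<String> names, Predicate<String> isLink,
                               ToLongFunction<String> sizeOf, long threshold) {
        List<String> picked = new ArrayList<>();
        for (String name : names) {
            if (isLink.test(name)) {
                continue; // skip the remote size lookup for links
            }
            if (sizeOf.applyAsLong(name) < threshold) {
                picked.add(name);
            }
        }
        return picked;
    }

    public static void main(String[] args) {
        List<String> names = List.of("mobfile1", "t1=region1-mobfile2", "mobfile3");
        // Assumed link test and fake size function, for illustration only.
        List<String> picked =
            select(names, n -> n.contains("="), String::length, 100);
        System.out.println(picked); // [mobfile1, mobfile3]
    }
}
```

The saving comes from the lookups that never happen: with a table made up mostly of links, nearly all of the per-file calls disappear, at the price of leaving those links uncompacted.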



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-08-29 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144864#comment-16144864
 ] 

Jingcheng Du commented on HBASE-18693:
--

Thanks Huaxiang.
Right, an HDFS move doesn't copy the data; it is supposed to be a rename 
operation.
My concern is that a snapshot can be restored twice. How would we handle 
that?

In HBase, we compact the hfile links during compaction, so I think compacting 
hfile links in MOB compaction is reasonable too.
Or we could skip the hfile links in most MOB compactions and compact the links 
at a longer interval (like a month)?
I prefer the first option. What do you think? Thanks.


> adding an option to restore_snapshot to move mob files from archive dir to 
> working dir
> --
>
> Key: HBASE-18693
> URL: https://issues.apache.org/jira/browse/HBASE-18693
> Project: HBase
>  Issue Type: Improvement
>  Components: mob
>Affects Versions: 2.0.0-alpha-2
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>
> Today, there is a single mob region where mob files for all user regions are 
> saved. There could be many files (one million) in a single mob directory. 
> When one mob table is restored or cloned from snapshot, links are created for 
> these mob files. This creates a scaling issue for mob compaction. In mob 
> compaction's select() logic, for each hFileLink, it needs to call NN's 
> getFileStatus() to get the size of the linked hfile. Assume that one such 
> call takes 20ms, 20ms * 100 = 6 hours. 
> To avoid this overhead, we want to add an option so that restore_snapshot can 
> move mob files from archive dir to working dir. clone_snapshot is more 
> complicated as it can clone a snapshot to a different table so moving that 
> can destroy the snapshot. No option will be added for clone_snapshot.
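
As a sanity check on the estimate in the description, here is a minimal Java
sketch of the arithmetic. It assumes the getFileStatus() calls are issued one
after another, one per HFileLink, and that each takes a fixed 20ms; the class
and method names are made up for illustration and are not HBase code.

```java
import java.time.Duration;
import java.util.Locale;

final class MobSelectCostEstimate {
    /** Total time for sequential per-link calls: links * millisPerCall. */
    static Duration estimate(long hfileLinks, long millisPerCall) {
        // multiplyExact throws on overflow instead of silently wrapping.
        return Duration.ofMillis(Math.multiplyExact(hfileLinks, millisPerCall));
    }

    public static void main(String[] args) {
        // Figures from the issue description: one million links, 20ms per call.
        Duration total = estimate(1_000_000L, 20L);
        double hours = total.getSeconds() / 3600.0;
        System.out.println(String.format(Locale.ROOT,
            "%d s, about %.1f hours", total.getSeconds(), hours));
    }
}
```

With those inputs the total comes to 20,000 seconds, a bit under six hours,
which matches the figure quoted in the description.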





[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-08-29 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144813#comment-16144813
 ] 

huaxiang sun commented on HBASE-18693:
--

Thanks [~jingcheng.du]. An HDFS move is a very lightweight operation. I browsed
the source code and also looked at
https://stackoverflow.com/questions/34512596/how-does-hdfs-mv-command-work. As
far as I understand, it only involves the NameNode, and no data is actually
moved. Skipping HFileLinks may not work, because some mob HFileLinks need to be
compacted to reduce the number of files in the directory. If these HFileLinks
are always compacted, that will cause a lot of unnecessary I/O, since new mob
files will be created each time. Please share your thoughts, thanks.
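
To illustrate the "rename, not copy" point, here is a small Java sketch. It is
only a local-filesystem analogy built on java.nio, not HDFS or HBase code: the
directory names, file name and class are hypothetical. On HDFS the
corresponding call would be FileSystem.rename(src, dst), which this sketch does
not exercise.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class MoveIsRenameSketch {
    /** Moves one file from archiveDir to workingDir, keeping its name. */
    static Path moveToWorkingDir(Path archiveDir, Path workingDir, String fileName)
            throws IOException {
        Path src = archiveDir.resolve(fileName);
        Path dst = workingDir.resolve(fileName);
        // ATOMIC_MOVE requests a rename-style move; it throws rather than
        // falling back to copy-then-delete if a rename is not possible.
        return Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Returns true if the file ends up in the working dir and nowhere else. */
    static boolean runDemo() {
        try {
            Path root = Files.createTempDirectory("mob-sketch");
            Path archive = Files.createDirectory(root.resolve("archive"));
            Path working = Files.createDirectory(root.resolve("working"));
            String name = "mobfile-0001"; // hypothetical file name
            Files.write(archive.resolve(name), new byte[] {1, 2, 3});

            Path moved = moveToWorkingDir(archive, working, name);
            return Files.exists(moved) && Files.notExists(archive.resolve(name));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Because the file is renamed rather than copied, it no longer exists under the
archive directory afterwards, which is the property behind the "restore twice"
question raised earlier in this thread.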



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-08-29 Thread Jingcheng Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16144792#comment-16144792
 ] 

Jingcheng Du commented on HBASE-18693:
--

Thanks [~huangxiangang].
The restore operation should be a fast one, so I don't think moving data as
part of it is appropriate.
Could we just skip the hfile links, or just compact them regardless of their
size during compaction? What do you think?
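
The two alternatives could be sketched roughly as below. This is a simplified
illustration, not the actual MOB compactor: MobEntry, LinkPolicy, the size
threshold and the sizeLookup callback (standing in for the per-file
getFileStatus() call) are all hypothetical names. The point is only that under
either policy no size lookup is made for a link.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToLongFunction;

final class MobSelectSketch {
    /** The two ways of treating HFileLinks suggested above. */
    enum LinkPolicy { SKIP_LINKS, ALWAYS_COMPACT_LINKS }

    /** Hypothetical stand-in for one entry in the mob directory. */
    static final class MobEntry {
        final String name;
        final boolean isLink;

        MobEntry(String name, boolean isLink) {
            this.name = name;
            this.isLink = isLink;
        }
    }

    /** Returns the names of the entries picked for compaction. */
    static List<String> select(List<MobEntry> entries, LinkPolicy policy,
                               long sizeThreshold,
                               ToLongFunction<MobEntry> sizeLookup) {
        List<String> selected = new ArrayList<>();
        for (MobEntry e : entries) {
            if (e.isLink) {
                // Links are decided by policy alone, so sizeLookup is never
                // called for them.
                if (policy == LinkPolicy.ALWAYS_COMPACT_LINKS) {
                    selected.add(e.name);
                }
                continue;
            }
            // Regular files still pay for one size lookup each.
            if (sizeLookup.applyAsLong(e) < sizeThreshold) {
                selected.add(e.name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<MobEntry> entries = Arrays.asList(
            new MobEntry("link-a", true),
            new MobEntry("small-b", false),
            new MobEntry("big-c", false));
        ToLongFunction<MobEntry> size =
            e -> e.name.startsWith("small") ? 10L : 100L;

        // prints [small-b]
        System.out.println(select(entries, LinkPolicy.SKIP_LINKS, 50L, size));
        // prints [link-a, small-b]
        System.out.println(
            select(entries, LinkPolicy.ALWAYS_COMPACT_LINKS, 50L, size));
    }
}
```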



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-08-27 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143157#comment-16143157
 ] 

huaxiang sun commented on HBASE-18693:
--

The plan is to call fs.move() once for each mob file. Today, restore has to
create two files per mob file for the HFileLink. So I expect moving files from
the archive dir to the working dir to be as efficient as the current
implementation.



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-08-27 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16143051#comment-16143051
 ] 

Anoop Sam John commented on HBASE-18693:


> we want to add an option so that restore_snapshot can move mob files from
> archive dir to working dir

How efficient will this operation be? I assume it won't involve actually moving
data, just rename operations. What if there are a million files under the MOB
region?



[jira] [Commented] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-08-25 Thread huaxiang sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142402#comment-16142402
 ] 

huaxiang sun commented on HBASE-18693:
--

Ping [~jingcheng.du] and [~anoop.hbase]: any comments before I go ahead and
implement the proposed option? Thanks.
