[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16170830#comment-16170830 ]
Ted Yu commented on HBASE-14417:
Vlad: All the patches on this JIRA were from me. This issue was resolved almost half a year ago. To add DistCp support, please open a new JIRA.

Key: HBASE-14417
URL: https://issues.apache.org/jira/browse/HBASE-14417
Project: HBase
Issue Type: New Feature
Reporter: Vladimir Rodionov
Assignee: Ted Yu
Priority: Blocker
Labels: backup
Fix For: 2.0.0
Attachments: 14417-tbl-ext.v10.txt, 14417-tbl-ext.v11.txt, 14417-tbl-ext.v14.txt, 14417-tbl-ext.v18.txt, 14417-tbl-ext.v19.txt, 14417-tbl-ext.v20.txt, 14417-tbl-ext.v21.txt, 14417-tbl-ext.v22.txt, 14417-tbl-ext.v23.txt, 14417-tbl-ext.v24.txt, 14417-tbl-ext.v9.txt, 14417.v11.txt, 14417.v13.txt, 14417.v1.txt, 14417.v21.txt, 14417.v23.txt, 14417.v24.txt, 14417.v25.txt, 14417.v2.txt, 14417.v6.txt

Description:
Currently, incremental backup is based on WAL files. Bulk data loading bypasses WALs for obvious reasons, breaking incremental backups. The only way to continue backups after a bulk load is to create a new full backup of the table. This may not be feasible for customers who bulk load regularly (say, every day).
Review board (out of date): https://reviews.apache.org/r/54258/
In order not to miss hfiles that are loaded into region directories when the postBulkLoadHFile() hook is not called (for example, when the bulk load is interrupted), we record hfile names through the preCommitStoreFile() hook. At incremental backup time, we check for the presence of these hfiles; any that are present become part of the incremental backup image.
Review board: https://reviews.apache.org/r/57790/
Design doc: https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
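The bookkeeping described in the issue (record bulk-loaded hfile names at preCommitStoreFile() time, then include only the hfiles still present at incremental backup time) can be sketched as a small standalone model. This is an illustrative simulation, not HBase's actual coprocessor or BackupSystemTable API; the class and method names here (BulkLoadRegistry and friends) are hypothetical:

```python
import os
import tempfile

class BulkLoadRegistry:
    """Toy stand-in for the bookkeeping that records bulk-loaded hfile names.

    In the design above, names are recorded via the preCommitStoreFile()
    hook so that a load interrupted before postBulkLoadHFile() fires still
    leaves a record behind.
    """

    def __init__(self):
        self.recorded = set()

    def pre_commit_store_file(self, hfile_path):
        # Record the name *before* the file is committed to the region
        # directory, so an interrupted load is still visible later.
        self.recorded.add(hfile_path)

    def files_for_incremental_backup(self):
        # At backup time, check which recorded hfiles actually exist;
        # only those become part of the incremental backup image.
        return sorted(p for p in self.recorded if os.path.exists(p))

# Demo: one load that completed and one that was interrupted before
# the hfile ever reached the region directory.
region_dir = tempfile.mkdtemp()
committed = os.path.join(region_dir, "hfile-aaaa")
open(committed, "w").close()                          # file is on disk
interrupted = os.path.join(region_dir, "hfile-bbbb")  # never written

registry = BulkLoadRegistry()
registry.pre_commit_store_file(committed)
registry.pre_commit_store_file(interrupted)

backup_set = registry.files_for_incremental_backup()
# backup_set contains only the committed hfile
```

The key design point mirrored here is that recording happens before the commit rather than after it, which is why an interrupted load cannot silently escape the next incremental backup.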
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16170820#comment-16170820 ]
Vladimir Rodionov commented on HBASE-14417:
Reopened to add DistCp support.
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990397#comment-15990397 ]
Hudson commented on HBASE-14417:
ABORTED: Integrated in Jenkins build HBase-HBASE-14614 #190 (see https://builds.apache.org/job/HBase-HBASE-14614/190/)
HBASE-14417 Incremental backup and bulk loading (tedyu: rev 0345fc87759a7d44ecc385327ebb586fc632fb65)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/mapreduce/MapReduceRestoreJob.java
* (add) hbase-server/src/test/java/org/apache/hadoop/hbase/backup/TestIncrementalBackupWithBulkLoad.java
* (add) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/BackupHFileCleaner.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManager.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/backup/TestBackupBase.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFiles.java
* (add) hbase-server/src/test/java/org/apache/hadoop/hbase/backup/TestBackupHFileCleaner.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalTableBackupClient.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupSystemTable.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupAdminImpl.java
* (add) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/BackupObserver.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/util/HFileArchiveUtil.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/backup/impl/RestoreTablesClient.java
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946469#comment-15946469 ]
Hudson commented on HBASE-14417:
FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #2758 (see https://builds.apache.org/job/HBase-Trunk_matrix/2758/)
HBASE-14417 Incremental backup and bulk loading (tedyu: rev 0345fc87759a7d44ecc385327ebb586fc632fb65)
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946136#comment-15946136 ]
Hadoop QA commented on HBASE-14417:
-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 46s | Docker mode activated. |
| 0 | patch | 0m 4s | The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for instructions. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| +1 | mvninstall | 3m 51s | master passed |
| +1 | compile | 0m 37s | master passed |
| +1 | checkstyle | 0m 47s | master passed |
| +1 | mvneclipse | 0m 15s | master passed |
| +1 | findbugs | 1m 37s | master passed |
| +1 | javadoc | 0m 24s | master passed |
| +1 | mvninstall | 0m 38s | the patch passed |
| +1 | compile | 0m 37s | the patch passed |
| +1 | javac | 0m 37s | the patch passed |
| +1 | checkstyle | 0m 45s | the patch passed |
| +1 | mvneclipse | 0m 16s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | hadoopcheck | 26m 11s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | findbugs | 1m 47s | the patch passed |
| +1 | javadoc | 0m 24s | the patch passed |
| +1 | unit | 103m 5s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 142m 37s | |

|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12860929/14417-tbl-ext.v24.txt |
| JIRA Issue | HBASE-14417 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 649abde09dfb 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / cb4fac1 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| whitespace | https://builds.apache.org/job/PreCommit-HBASE-Build/6246/artifact/patchprocess/whitespace-eol.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/6246/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/6246/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |

This message was automatically generated.
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943830#comment-15943830 ]
Vladimir Rodionov commented on HBASE-14417:
Yes. Let me do one more round today.
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943762#comment-15943762 ]
Ted Yu commented on HBASE-14417:
[~vrodionov]: Do you have any other comments?
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941317#comment-15941317 ]
Ted Yu commented on HBASE-14417:
The TestAcidGuarantees failure was not related to my patch.
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941315#comment-15941315 ]
Hadoop QA commented on HBASE-14417:
-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
| 0 | patch | 0m 4s | The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for instructions. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| +1 | mvninstall | 3m 6s | master passed |
| +1 | compile | 0m 36s | master passed |
| +1 | checkstyle | 0m 46s | master passed |
| +1 | mvneclipse | 0m 14s | master passed |
| +1 | findbugs | 1m 44s | master passed |
| +1 | javadoc | 0m 27s | master passed |
| +1 | mvninstall | 0m 43s | the patch passed |
| +1 | compile | 0m 36s | the patch passed |
| +1 | javac | 0m 36s | the patch passed |
| +1 | checkstyle | 0m 49s | the patch passed |
| +1 | mvneclipse | 0m 14s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | hadoopcheck | 27m 7s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | findbugs | 1m 53s | the patch passed |
| +1 | javadoc | 0m 26s | the patch passed |
| -1 | unit | 102m 25s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 142m 1s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.TestAcidGuarantees |

|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12860450/14417-tbl-ext.v23.txt |
| JIRA Issue | HBASE-14417 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux f1256c2f7f5c 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / faf81d5 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| whitespace | https://builds.apache.org/job/PreCommit-HBASE-Build/6215/artifact/patchprocess/whitespace-eol.txt |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/6215/artifact/patchprocess/patch-unit-hbase-server.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/6215/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/P |
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938860#comment-15938860 ]
Vladimir Rodionov commented on HBASE-14417:
[~tedyu], please do not commit until I finish RB. Thanks.
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938772#comment-15938772 ]
Josh Elser commented on HBASE-14417:
Passing over the +1 from RB. Also, mind the whitespace error on commit.
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15937212#comment-15937212 ] Hadoop QA commented on HBASE-14417:
---
-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| 0 | patch | 0m 2s | The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for instructions. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| +1 | mvninstall | 3m 22s | master passed |
| +1 | compile | 0m 35s | master passed |
| +1 | checkstyle | 0m 48s | master passed |
| +1 | mvneclipse | 0m 14s | master passed |
| +1 | findbugs | 1m 44s | master passed |
| +1 | javadoc | 0m 26s | master passed |
| +1 | mvninstall | 0m 42s | the patch passed |
| +1 | compile | 0m 36s | the patch passed |
| +1 | javac | 0m 36s | the patch passed |
| +1 | checkstyle | 0m 47s | the patch passed |
| +1 | mvneclipse | 0m 14s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | hadoopcheck | 27m 22s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | findbugs | 1m 53s | the patch passed |
| +1 | javadoc | 0m 26s | the patch passed |
| +1 | unit | 99m 44s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 139m 44s | |

|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12859991/14417-tbl-ext.v22.txt |
| JIRA Issue | HBASE-14417 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 9e7da8794801 3.13.0-103-generic #150-Ubuntu SMP Thu Nov 24 10:34:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / f2d1b8d |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| whitespace | https://builds.apache.org/job/PreCommit-HBASE-Build/6198/artifact/patchprocess/whitespace-eol.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/6198/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/6198/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933289#comment-15933289 ] Vladimir Rodionov commented on HBASE-14417: --- [~te...@apache.org], can you submit the patch to RB?
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932201#comment-15932201 ] Hadoop QA commented on HBASE-14417:
---
-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
| 0 | patch | 0m 4s | The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for instructions. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| +1 | mvninstall | 3m 55s | master passed |
| +1 | compile | 0m 49s | master passed |
| +1 | checkstyle | 0m 55s | master passed |
| +1 | mvneclipse | 0m 17s | master passed |
| +1 | findbugs | 2m 13s | master passed |
| +1 | javadoc | 0m 30s | master passed |
| +1 | mvninstall | 0m 52s | the patch passed |
| +1 | compile | 0m 48s | the patch passed |
| +1 | javac | 0m 48s | the patch passed |
| +1 | checkstyle | 1m 3s | the patch passed |
| +1 | mvneclipse | 0m 19s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | hadoopcheck | 35m 17s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | findbugs | 2m 49s | the patch passed |
| +1 | javadoc | 0m 40s | the patch passed |
| +1 | unit | 107m 37s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 159m 5s | |

|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12859488/14417-tbl-ext.v21.txt |
| JIRA Issue | HBASE-14417 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux a58f6ac38258 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
| git revision | master / 7c03a21 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| whitespace | https://builds.apache.org/job/PreCommit-HBASE-Build/6145/artifact/patchprocess/whitespace-eol.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/6145/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/6145/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932085#comment-15932085 ] Hadoop QA commented on HBASE-14417:
---
-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 22s | Docker mode activated. |
| 0 | patch | 0m 5s | The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for instructions. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| +1 | mvninstall | 3m 58s | master passed |
| +1 | compile | 0m 52s | master passed |
| +1 | checkstyle | 0m 52s | master passed |
| +1 | mvneclipse | 0m 16s | master passed |
| +1 | findbugs | 1m 59s | master passed |
| +1 | javadoc | 0m 32s | master passed |
| +1 | mvninstall | 0m 51s | the patch passed |
| +1 | compile | 0m 45s | the patch passed |
| +1 | javac | 0m 45s | the patch passed |
| +1 | checkstyle | 0m 52s | the patch passed |
| +1 | mvneclipse | 0m 17s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| +1 | hadoopcheck | 31m 4s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| -1 | findbugs | 2m 12s | hbase-server generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) |
| +1 | javadoc | 0m 29s | the patch passed |
| +1 | unit | 119m 56s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 166m 4s | |

|| Reason || Tests ||
| FindBugs | module:hbase-server |
| | Using .equals to compare two byte[]'s (equivalent to ==) in org.apache.hadoop.hbase.backup.impl.BackupSystemTable.readBulkloadRows(List) At BackupSystemTable.java:[line 416] |
| | Boxing/unboxing to parse a primitive in org.apache.hadoop.hbase.backup.impl.RestoreTablesClient.getTsFromBackupId(String) At RestoreTablesClient.java:[line 263] |
| | Exception is caught when Exception is not thrown in org.apache.hadoop.hbase.backup.impl.RestoreTablesClient.restore(HashMap, TableName[], TableName[], boolean) At RestoreTablesClient.java:[line 249] |

|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:8d52d23 |
| JIRA Patch URL
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733635#comment-15733635 ] Ted Yu commented on HBASE-14417: The final filename is available here in HRegion#bulkLoadHFiles():
{code}
Path commitedStoreFile = store.bulkLoadHFile(finalPath, seqId);
{code}
Even if we add one more hook above which records the final filename in the hbase:backup table, we still depend on the postBulkLoadHFile() hook to write the final filename one more time (with completion state), because bulk load event persistence (done in the finally block) may fail. This means BackupHFileCleaner would not have enough information to tell whether the bulk load succeeded simply by checking for the existence of the store file(s) in the region directory:
{code}
// write a bulk load event when not all hfiles are loaded
try {
  WALProtos.BulkLoadDescriptor loadDescriptor = ProtobufUtil.toBulkLoadDescriptor(
      this.getRegionInfo().getTable(),
{code}
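The two-phase bookkeeping above can be sketched in plain Java. This is an illustrative model only, not the actual patch: the ledger class, its map, and the path strings are invented for the example; only the two hook names come from the discussion.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: why one write per hook is needed. A write from the
// preCommitStoreFile() hook records the final hfile name in a PENDING state;
// a second write from postBulkLoadHFile() marks it COMPLETE. If the load is
// interrupted before the second write, the lingering PENDING entry tells the
// cleaner that the store file's mere existence does not prove success.
public class BulkLoadLedger {
    public enum State { PENDING, COMPLETE }

    private final Map<String, State> entries = new HashMap<>();

    // Stand-in for the write done from the preCommitStoreFile() hook.
    public void preCommitStoreFile(String finalPath) {
        entries.put(finalPath, State.PENDING);
    }

    // Stand-in for the write done from the postBulkLoadHFile() hook.
    public void postBulkLoadHFile(String finalPath) {
        entries.put(finalPath, State.COMPLETE);
    }

    public State stateOf(String finalPath) {
        return entries.get(finalPath);
    }

    public static void main(String[] args) {
        BulkLoadLedger ledger = new BulkLoadLedger();
        // Interrupted load: postBulkLoadHFile() never fires for hfile-1.
        ledger.preCommitStoreFile("/hbase/data/ns/t/r1/cf/hfile-1");
        ledger.preCommitStoreFile("/hbase/data/ns/t/r1/cf/hfile-2");
        ledger.postBulkLoadHFile("/hbase/data/ns/t/r1/cf/hfile-2");
        System.out.println(ledger.stateOf("/hbase/data/ns/t/r1/cf/hfile-1"));
        System.out.println(ledger.stateOf("/hbase/data/ns/t/r1/cf/hfile-2"));
    }
}
```

The incremental backup (or cleaner) can then treat PENDING entries as files whose load outcome is unknown.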
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733183#comment-15733183 ] Ted Yu commented on HBASE-14417: The sequence id for a bulk loaded hfile is generated inside HRegion#bulkLoadHFiles(). Coming out of the bulkLoadHFiles() call, the postBulkLoadHFile() hook is called with the actual file names.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15732531#comment-15732531 ] Ted Yu commented on HBASE-14417: More response to Vlad's review comment w.r.t. fault tolerance in bulk load. When a bulk load fails midway, the user should provide the complete set of hfiles again, because the staging directory is not exposed to end users. With this in mind, the benefit of using another hook (prior to postBulkLoadHFile()) to persist the location of bulk loaded hfiles is minimal: in subsequent bulk load attempt(s), the same set of (source) hfiles would be loaded again. Another factor is that the more writes go to the hbase:backup table, the higher the chance of a (write) failure. One optimization we can do in the future is to combine the writes (performed in postBulkLoadHFile()) from several regions on the same region server, provided those writes are sufficiently close together (e.g., 300 ms apart). The completion of a bulk load on a single region server is determined by the slowest participating region, so this optimization would keep the response time on par with the current implementation (where the hbase:backup table is not involved).
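The write-combining optimization sketched above can be modeled with a simple time-window batcher. This is purely illustrative: the class name and the batching logic are invented for the example, and the 300 ms window is the figure floated in the comment, not a tuned value.

```java
import java.util.List;

// Sketch: group per-region completion writes that land on the same region
// server within a 300 ms window into one combined hbase:backup write.
public class WriteCombiner {
    static final long WINDOW_MS = 300;

    // Given sorted completion times (ms), return how many combined
    // backup-table writes would be issued instead of one write per region.
    public static int combinedWrites(List<Long> arrivalsMs) {
        int writes = 0;
        long windowStart = Long.MIN_VALUE;
        for (long t : arrivalsMs) {
            if (windowStart == Long.MIN_VALUE || t - windowStart > WINDOW_MS) {
                writes++;          // flush the previous batch, start a new one
                windowStart = t;
            }                      // else: absorbed into the current batch
        }
        return writes;
    }

    public static void main(String[] args) {
        // Three regions finish close together and a fourth much later:
        // two combined writes instead of four individual ones.
        System.out.println(combinedWrites(List.of(0L, 100L, 250L, 5000L)));
    }
}
```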
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15715023#comment-15715023 ] Ashish Singhi commented on HBASE-14417: --- I don't want to block this jira. I will be on holiday for the next two weeks, so I will not be able to check this before then. If anyone else is fine with it, please go ahead and commit; I will review it later. Thanks
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712616#comment-15712616 ] Ted Yu commented on HBASE-14417: Due to lack of access to gmail, I didn't see the above until several minutes ago. Here is the review board: https://reviews.apache.org/r/54258/
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15694965#comment-15694965 ] Ashish Singhi commented on HBASE-14417: --- Can you upload the patch to RB? It will be easier to review.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15685093#comment-15685093 ] Ted Yu commented on HBASE-14417: BackupHFileCleaner should be registered through hbase.master.hfilecleaner.plugins. It is responsible for keeping bulk loaded hfiles around so that incremental backup can pick them up. BackupObserver should be registered through hbase.coprocessor.region.classes. It is notified when bulk load completes and writes records into the hbase:backup table.
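The registration described above would look roughly like this in hbase-site.xml. Note the fully qualified class names are assumptions inferred from the package names appearing elsewhere in this thread; verify them against the committed patch.

```xml
<!-- Sketch: registering the backup cleaner and observer.
     The exact FQCNs below are assumed, not confirmed by this thread. -->
<property>
  <name>hbase.master.hfilecleaner.plugins</name>
  <value>org.apache.hadoop.hbase.backup.BackupHFileCleaner</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.backup.BackupObserver</value>
</property>
```

If other cleaner plugins or region coprocessors are already configured, the values are comma-separated lists and these entries would be appended rather than replacing them.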
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15650599#comment-15650599 ] Ted Yu commented on HBASE-14417: I am proceeding to implement the above proposal.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15638081#comment-15638081 ] Devaraj Das commented on HBASE-14417: - A summary of some internal discussions on the high-level flow that doesn't use ZK...
1. The client updates the hbase:backup table with the set of paths that are to be bulk loaded (if the tables in question have been fully backed up at least once in the past).
2. The client performs the bulk load of the data. If the client fails before the bulk load fully completes, the cleaner chore in (5) takes care of cleaning up the unneeded entries from hbase:backup.
3. An HFileCleaner makes sure that the paths recorded in (1) are held until the next incremental backup.
4. As part of the incremental backup, the hbase:backup table is updated to reflect the location the earlier bulk loaded file was copied to.
5. A chore runs periodically (in the BackupController) that eliminates entries from the hbase:backup table if the corresponding paths have not existed in the filesystem for a configured time period (default, say, 24 hours; the bulk load timeout is assumed to be much smaller than this, so all bulk loads that are meant to complete successfully will complete).
Thoughts?
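Step 5 of the flow above can be sketched as a small, self-contained routine. This is an illustrative model, not the actual chore: the in-memory map stands in for hbase:backup, the set stands in for a filesystem existence check, and all names are invented.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Sketch of the cleanup chore: drop entries whose paths no longer exist on
// the filesystem, but only after a grace period (24 h here), so that
// in-flight bulk loads are never evicted prematurely.
public class BackupEntryChore {
    static final long GRACE_MS = 24L * 60 * 60 * 1000;

    // entries: path -> time the entry was written (ms)
    // existing: paths currently present on the filesystem
    // now: current time (ms); expired orphan entries are removed in place
    public static void run(Map<String, Long> entries, Set<String> existing, long now) {
        Iterator<Map.Entry<String, Long>> it = entries.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            boolean orphan = !existing.contains(e.getKey());
            boolean expired = now - e.getValue() > GRACE_MS;
            if (orphan && expired) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Long> entries = new HashMap<>();
        entries.put("/staging/hfile-old", 0L);           // orphan, past grace
        entries.put("/staging/hfile-new", 90_000_000L);  // orphan, within grace
        entries.put("/staging/hfile-live", 0L);          // still on the filesystem
        run(entries, Set.of("/staging/hfile-live"), 100_000_000L);
        System.out.println(entries.size());              // only hfile-old removed
    }
}
```

The grace period is what makes the scheme safe without ZK: a path that vanished because its bulk load was abandoned will eventually age out, while a path that vanished because a load just committed it elsewhere is still protected until the window closes.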
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629891#comment-15629891 ] Vladimir Rodionov commented on HBASE-14417: --- [~tedyu], calling a remote region in the postAppend hook does not look like a good idea. How about putting all this logic into RSRpcServices.bulkLoadHFile? There is no need to make a synchronous call: you can use a queue and update the backup table when the bulk load finishes.
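The queue-based suggestion above can be sketched as follows. This is a toy model of the idea, not the patch: the class and the list standing in for the backup table are invented; only RSRpcServices.bulkLoadHFile and the deferred-write idea come from the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: the bulk-load path enqueues loaded file names without blocking on
// the backup table; a single batched write happens when the load finishes.
public class AsyncBackupWriter {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> backupTable = new ArrayList<>();  // stand-in

    // Called from the bulk-load path; never touches the backup table.
    public void enqueue(String hfilePath) {
        queue.add(hfilePath);
    }

    // Called once the bulk load finishes: drain and write one batch.
    public void flush() {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        backupTable.addAll(batch);
    }

    public List<String> recorded() {
        return backupTable;
    }
}
```

Decoupling the append path from the backup-table write also sidesteps the self-deadlock shown in the stack traces of the next comment, where the WAL append thread blocks on a put to hbase:backup.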
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629369#comment-15629369 ] Ted Yu commented on HBASE-14417:
I observed this in the TestHRegionServerBulkLoad output for the version (v11 and earlier) where the bulk load marker is written directly to the hbase:backup table in the postAppend hook:
{code}
2016-09-13 23:10:14,072 DEBUG [B.defaultRpcServer.handler=4,queue=0,port=35667] ipc.CallRunner(112): B.defaultRpcServer.handler=4,queue=0,port=35667: callId: 10646 service: ClientService methodName: Scan size: 264 connection: 172.18.128.12:59780
org.apache.hadoop.hbase.RegionTooBusyException: failed to get a lock in 6 ms. regionName=atomicBulkLoad,,1473808150804.6b6c67612b01bce3348c144b959b7f0e., server=cn012.l42scl.hortonworks.com,35667,1473808145352
  at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:7744)
  at org.apache.hadoop.hbase.regionserver.HRegion.lock(HRegion.java:7725)
  at org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:7634)
  at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2588)
  at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2582)
  at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2569)
  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33516)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2229)
  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:136)
  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:111)
  at java.lang.Thread.run(Thread.java:745)
{code}
Here was the state of the BulkLoadHandler thread (stuck):
{code}
"RS:0;cn012:36301.append-pool9-t1" #453 prio=5 os_prio=0 tid=0x7fc3945bb000 nid=0x18ec in Object.wait() [0x7fc30dada000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
  at java.lang.Object.wait(Native Method)
  at org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:1727)
  - locked <0x000794750580> (a java.util.concurrent.atomic.AtomicLong)
  at org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1756)
  at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:241)
  at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:191)
  - locked <0x000794750048> (a org.apache.hadoop.hbase.client.BufferedMutatorImpl)
  at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:949)
  at org.apache.hadoop.hbase.client.HTable.put(HTable.java:569)
  at org.apache.hadoop.hbase.backup.impl.BackupSystemTable.writeBulkLoadDesc(BackupSystemTable.java:227)
  at org.apache.hadoop.hbase.backup.impl.BulkLoadHandler.postAppend(BulkLoadHandler.java:83)
  at org.apache.hadoop.hbase.regionserver.wal.FSHLog.postAppend(FSHLog.java:1448)
{code}
Even increasing the handler count didn't help:
{code}
diff --git a/hbase-server/src/test/resources/hbase-site.xml b/hbase-server/src/test/resources/hbase-site.xml
index bca90a3..829fcc9 100644
--- a/hbase-server/src/test/resources/hbase-site.xml
+++ b/hbase-server/src/test/resources/hbase-site.xml
@@ -30,6 +30,10 @@
+hbase.backup.enable
+true
+
+
 hbase.defaults.for.version.skip
 true
@@ -48,11 +52,11 @@
 hbase.regionserver.handler.count
-5
+50
{code}
Post v11, the data stored in zookeeper is temporary: once an incremental backup is run for the table receiving the bulk load, the data is persisted for that backup Id and removed from zookeeper.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15626239#comment-15626239 ] stack commented on HBASE-14417:
I like what [~vrodionov] asks here. We are busy elsewhere trying to undo our reliance on zk for all but the bare minimum yet here we are dev'ing new features on it.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15623137#comment-15623137 ] Vladimir Rodionov commented on HBASE-14417:
[~te...@apache.org], is patch ready for review? If yes, then please open review on RB.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15622809#comment-15622809 ] Ted Yu commented on HBASE-14417:
The hbase:backup table is used to retrieve the set of tables which have gone through full backup, after a region server comes up (since the region server may have missed prior zk notifications). Afterwards, BulkLoadHandler receives notifications from zookeeper and doesn't need to poll the hbase:backup table.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15622790#comment-15622790 ] Vladimir Rodionov commented on HBASE-14417:
{quote}
Patch v24 registers tables being fully backed up in zookeeper so that BulkLoadHandler on respective region server can avoid unnecessary post to zookeeper.
{quote}
Why can't we use the hbase:backup table for that?
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605864#comment-15605864 ] Ted Yu commented on HBASE-14417:
In IncrementalTableBackupClient, before marking the backup complete, I am adding code to copy the bulk-loaded hfiles to the destination filesystem and persist the list of copied files to the hbase:backup table.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605689#comment-15605689 ] Ted Yu commented on HBASE-14417:
When running TestFullRestore, I found an interesting issue: bulk load can happen to a table which is fully backed up, when the overwrite option is specified. I can think of two ways to omit these bulk-loaded files:
1. Under the backup zookeeper subtree, create a znode with the tables which are being restored with the overwrite option. This allows the postAppend() hook to skip these tables for the duration of the restore.
2. At the end of the bulk load, issue deletes against the znodes which were added during the bulk load.
I am leaning toward the first option.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15589109#comment-15589109 ] Ted Yu commented on HBASE-14417:
Discussed with Enis. The persistence of bulk loaded hfiles should be synchronous with the bulk load. Writing to hbase:backup table through postAppend() hook is tricky. The precedent of HBASE-13153 can be used for the persistence of reference to bulk loaded hfiles.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15564243#comment-15564243 ] Ted Yu commented on HBASE-14417:
Constructive suggestion is welcome.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563507#comment-15563507 ] Vladimir Rodionov commented on HBASE-14417:
>> One potential approach is to store hfile information in zookeeper.
-1. No Zk, please.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563463#comment-15563463 ] Ted Yu commented on HBASE-14417:
While working on BackupHFileCleaner, the counterpart to ReplicationHFileCleaner, I noticed the potential impact on the server hosting hbase:backup, because we need up-to-date information on the hfiles which are still referenced by incremental backups. One potential approach is to store the hfile information in zookeeper. This would also alleviate the issue mentioned above about reducing the number of BulkLoadDescriptor writes to the hbase:backup table. Any suggestions, [~mbertozzi] [~vrodionov]?
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514628#comment-15514628 ] Ted Yu commented on HBASE-14417:
Currently there is one BulkLoadHandler inside each region server which periodically writes BulkLoadDescriptors to the hbase:backup table. Ideally, only BulkLoadDescriptors for tables which have gone through full backup should be written. I am looking for a way to pass the set of such tables to the region servers so that each server doesn't have to poll the hbase:backup table periodically.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15511698#comment-15511698 ] Ted Yu commented on HBASE-14417:
I was adding a new test which does one more full table backup after the incremental restore and verifies that BulkLoadDescriptors are cleaned up from hbase:backup. It turns out the bulk-loaded hfiles were renamed during the incremental restore, which resulted in FileNotFoundException. Filed HBASE-16672 so that the bulk-loaded hfiles can be used for multiple restore destinations.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453795#comment-15453795 ] Vladimir Rodionov commented on HBASE-14417:
{quote}
What about multiple restore destinations ?
{quote}
Multiple-destination support is tricky, agreed. This is why we need to see a full design doc, step by step, to start a productive discussion.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453774#comment-15453774 ] Ted Yu commented on HBASE-14417:
bq. we need ship bulk load only once to backup destination
What about multiple restore destinations ?
I can write up some doc after getting consensus.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453728#comment-15453728 ] Vladimir Rodionov commented on HBASE-14417:
{quote}
Suppose there're two backup sets involving common table(s). The bulk loaded hfiles for the common table should be shared by the two backups.
{quote}
It's fine. We keep backup data per table (in incremental mode), which means we need to ship a bulk load only once to the backup destination. Design doc, [~tedyu]?
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453160#comment-15453160 ] Ted Yu commented on HBASE-14417:
HFile references would stay, assisted by BackupHFileCleaner. Please read the earlier comments. Suppose there are two backup sets involving common table(s). The bulk-loaded hfiles for the common table should be shared by the two backups.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15453063#comment-15453063 ] Vladimir Rodionov commented on HBASE-14417:
{quote}
The above is restore, right ? For backup, we only keep hfile refs.
{quote}
No, for backup. Where are you going to keep the bulk-loaded files after the backup is complete? Not sure I am following you here.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452950#comment-15452950 ] Ted Yu commented on HBASE-14417:
bq. HFiles reached the backup destination
The above is restore, right ? For backup, we only keep hfile refs.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452938#comment-15452938 ] Vladimir Rodionov commented on HBASE-14417:
I thought we need these HFile references in hbase:backup only until we ship them (the bulk-loaded files) to the backup destination? What do we need them for after the backup is complete and the HFiles have reached the backup destination, [~tedyu]?
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452824#comment-15452824 ] Ted Yu commented on HBASE-14417:
bq. Delete file ref from hbase:backup after backup completes
Did you mean after restore completes ? What if the user wants to restore to a different destination afterward ? The removal of hfile refs from hbase:backup can be coupled with the deletion of backup(s).
Restoring an incremental backup would ship hfiles along with WAL files to the destination.
Suppose we're given this sequence of events, where f means full backup, b means bulk load and i means incremental backup:
* f
* b1 (with hfile1)
* b2 (with hfile2)
* i1
* b3 (with hfile3)
* i2
The design of hfile refs in hbase:backup should make BackupHFileCleaner operation efficient. If we consolidate the hfile refs from b1 and b2 into i1, and b3 into i2, BackupHFileCleaner needs to search backward (across all outstanding incremental backups): i2 -> i1 -> f. If instead we consolidate the hfile refs from b1 and b2 by merging them and storing b1', there is no need to search the incremental backup(s).
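The event sequence above can be modeled to show which hfile refs each backup image ends up owning under the consolidate-into-incremental strategy. This is a toy model for illustration only; RefConsolidation and its event encoding are hypothetical, not part of any patch:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the sequence f, b1, b2, i1, b3, i2: each full ("f") or
// incremental ("iN") backup consolidates the bulk-load refs ("bN") that
// arrived since the previous backup. Because refs end up scattered across
// incremental images, a cleaner must search backwards across all of them
// to locate a given hfile ref.
class RefConsolidation {
    static Map<String, List<String>> consolidate(List<String> events) {
        Map<String, List<String>> ownedRefs = new LinkedHashMap<>();
        List<String> pending = new ArrayList<>();
        for (String e : events) {
            if (e.startsWith("b")) {
                // Bulk load: its hfile ref is pending until the next backup.
                pending.add("hfile" + e.substring(1));
            } else {
                // Full or incremental backup: absorb all pending refs.
                ownedRefs.put(e, new ArrayList<>(pending));
                pending.clear();
            }
        }
        return ownedRefs;
    }
}
```

Running the model on the sequence from the comment shows i1 owning hfile1 and hfile2, and i2 owning hfile3, which is exactly why the cleaner's lookup has to walk i2 -> i1 -> f under this strategy.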
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452786#comment-15452786 ] Vladimir Rodionov commented on HBASE-14417:
Delete the file refs from hbase:backup after the backup completes - do not keep them there forever. Now we have two different types of files in backup: HFile (snapshot/full) and WAL (incremental). Bulk load adds a third type: HFile (bulk load/incremental).
* Shipping HFiles to the backup destination?
* New storage layout for an incremental backup image that has bulk-loaded files?
* The algorithm for restoring an incremental backup that has both WALs and HFiles?
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15452524#comment-15452524 ] Ted Yu commented on HBASE-14417:

ReplicationHFileCleaner retrieves hfile refs from zookeeper in order to check for deletable files. The new BackupHFileCleaner would retrieve hfile refs by scanning the hbase:backup table. The hfile refs may be stored separately (if no incremental / full backup has been performed since the bulk load) or in the manifest of some incremental backup. Since we don't know which incremental backup manifest may contain a related hfile ref, we need to scan backwards until an incremental backup or a full backup is found.
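A minimal sketch of the cleaner check described above, assuming the hfile refs scanned from hbase:backup are already available as a plain set (the function name is hypothetical; HBase's real cleaner delegates operate on FileStatus objects, which are simplified away here):

```python
def get_deletable_files(candidate_files, referenced_hfiles):
    """Return the subset of candidate hfiles that are safe to delete.
    Anything still referenced by a pending backup must be kept;
    referenced_hfiles stands in for a scan of the hbase:backup table."""
    return [f for f in candidate_files if f not in referenced_hfiles]

# refs recorded at bulk-load time and not yet captured by a backup
refs = {"hfile1", "hfile3"}
print(get_deletable_files(["hfile1", "hfile2", "hfile3"], refs))  # ['hfile2']
```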
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15449854#comment-15449854 ] Ted Yu commented on HBASE-14417:

In principle, the above is in line with my earlier comments.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15449794#comment-15449794 ] Enis Soztutar commented on HBASE-14417:
---
We should use a design for this issue similar to "HFile replication" via HBASE-13153. Replication of hfiles is conceptually very similar to incremental bulk load, and we already have the tools. Please read the design doc and the corresponding code. For this issue, we can mainly do this:
- Similar to replication, every time a bulk load (BL) happens, we create a reference per incremental backup as part of the BL process. These references can be saved in the hbase:backup table.
- A custom hfile cleaner (like the ReplicationHFileCleaner) will monitor the bulk loaded hfiles and make sure that they are not deleted, even after compactions, etc., if there is still an incremental backup to be performed.
- In the next round, the incremental backup will also copy the files that came from BL, by referring to the references saved in the hbase:backup table.
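The three steps above can be sketched end to end with a small hypothetical model (names and data shapes are illustrative; the hbase:backup table is modeled as a set of refs):

```python
# Hypothetical model of the replication-style flow:
# (1) record a ref at bulk-load time, (2) the cleaner keeps referenced
# files alive, (3) the next incremental backup copies the referenced
# hfiles and clears the refs.

backup_table_refs = set()   # stands in for rows in hbase:backup

def on_bulk_load(hfiles):
    """Step 1: record a ref for each bulk loaded hfile."""
    backup_table_refs.update(hfiles)

def is_deletable(hfile):
    """Step 2: the cleaner refuses to delete referenced hfiles."""
    return hfile not in backup_table_refs

def run_incremental_backup(copy_to_destination):
    """Step 3: ship referenced hfiles, then release the refs."""
    for hfile in sorted(backup_table_refs):
        copy_to_destination(hfile)
    backup_table_refs.clear()

on_bulk_load(["hfile1", "hfile2"])
assert not is_deletable("hfile1")      # protected until backed up
shipped = []
run_incremental_backup(shipped.append)
print(shipped)                          # ['hfile1', 'hfile2']
print(is_deletable("hfile1"))           # True, ref released after backup
```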
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439447#comment-15439447 ] Ted Yu commented on HBASE-14417:

There may be more than one round of bulk load between the full backup and the incremental backup(s). For each round, we may use the timestamp of completion of the bulk load for the loaded hfiles (in terms of the record in hbase:backup). When the next incremental backup takes place, we consolidate all the recorded bulk loaded hfiles and save the list in the manifest of the incremental backup.
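A sketch of the consolidation step above, assuming each bulk-load round is recorded with its completion timestamp (the record and manifest shapes are hypothetical):

```python
# Hypothetical sketch: each bulk-load round is recorded with its completion
# timestamp; the next incremental backup consolidates every round that
# completed before the backup started into the backup's manifest.

rounds = [
    (1000, ["hfileA"]),            # (completion_ts, loaded hfiles) per round
    (2000, ["hfileB", "hfileC"]),
]

def build_manifest(rounds, backup_start_ts):
    hfiles = []
    for ts, files in rounds:
        if ts <= backup_start_ts:  # only rounds completed before the backup
            hfiles.extend(files)
    return {"bulk_loaded_hfiles": hfiles}

print(build_manifest(rounds, 2500))
# {'bulk_loaded_hfiles': ['hfileA', 'hfileB', 'hfileC']}
print(build_manifest(rounds, 1500))
# {'bulk_loaded_hfiles': ['hfileA']}
```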
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15437430#comment-15437430 ] Ted Yu commented on HBASE-14417:

During an offline discussion, Vladimir suggested that we record the list of bulk loaded hfiles in the hbase:backup table at the end of each bulk load.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414459#comment-15414459 ] Ted Yu commented on HBASE-14417:

I think HBASE-14141 should be done first, which would lay the foundation for a proper resolution of this issue.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414450#comment-15414450 ] Devaraj Das commented on HBASE-14417:

The question is whether we should do this first or HBASE-14141 first. We need both in reality. We could put in a short term solution for backing up bulk-loaded data, but I am wondering if we should bite the bullet and do HBASE-14141 first, and then this.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414447#comment-15414447 ] Ted Yu commented on HBASE-14417:

Discussed with Devaraj. The quick solution is to detect the presence of bulk loaded hfiles between the full backup and the incremental backup. If a bulk load is detected, convert the incremental backup to a full backup. The better solution, related to HBASE-14141, is to list the bulk loaded hfiles during the incremental backup and put them in their own backup image.
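The quick solution above amounts to a single decision at backup time; a minimal sketch (the function and its inputs are hypothetical, standing in for a check against recorded bulk-load activity):

```python
# Hypothetical sketch of the quick solution: if any bulk load happened since
# the last full backup, upgrade the requested incremental backup to a full
# one, since the bulk loaded hfiles never went through the WALs.

def choose_backup_type(requested, bulk_loads_since_full):
    if requested == "incremental" and bulk_loads_since_full:
        return "full"
    return requested

print(choose_backup_type("incremental", ["hfile1"]))  # full
print(choose_backup_type("incremental", []))          # incremental
print(choose_backup_type("full", ["hfile1"]))         # full
```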
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382805#comment-15382805 ] Ted Yu commented on HBASE-14417:

If bulk loading is scheduled at a regular interval, one approach is to align the incremental backup activity with the bulk load.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375729#comment-15375729 ] Ted Yu commented on HBASE-14417:

After registering these files, would they still participate in the next round of major compaction (assuming major compaction comes before the next incremental backup)?
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15368619#comment-15368619 ] Vladimir Rodionov commented on HBASE-14417:
---
Bulk loading produces new HFiles; we need to register these files and move them to the backup destination during the next incremental phase. Something like this.
[jira] [Commented] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15368585#comment-15368585 ] Ted Yu commented on HBASE-14417:

Looks like bulk data loading should be applied to target table(s) as well.