[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 2:40 AM:


Oops, maybe a bug.

Update:

No, it cannot be. When the tool starts, we create an instance of Configuration:

{code}
  public static void main(String[] args) throws Exception {
    // Loads hbase-default.xml and hbase-site.xml from the classpath
    Configuration conf = HBaseConfiguration.create();
    int ret = ToolRunner.run(conf, new BackupDriver(), args);
    System.exit(ret);
  }
{code}

That configuration is whatever is found on the CLASSPATH. Check your setup, [~saint@gmail.com]
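
To illustrate the classpath point, here is a minimal, self-contained sketch of how a site configuration that is not picked up can lead to the "Wrong FS: ..., expected: file:///" error discussed in this thread. It does not use Hadoop or HBase classes: resolveDefaultFs and checkPath are hypothetical stand-ins that model, in simplified form, a client falling back to the local filesystem when no fs.defaultFS value is found, and a path with a different scheme then being rejected. The host name is a placeholder.

```java
import java.net.URI;
import java.util.Collections;
import java.util.Map;
import java.util.Objects;

public class WrongFsSketch {

    // Hypothetical stand-in for the fs.defaultFS lookup. When the site files
    // are not on the classpath, the built-in default file:/// is what remains.
    public static URI resolveDefaultFs(Map<String, String> siteConfig) {
        return URI.create(siteConfig.getOrDefault("fs.defaultFS", "file:///"));
    }

    // Simplified model of the filesystem path check: the scheme and authority
    // of the path must match those of the filesystem it is handed to.
    public static void checkPath(URI fsUri, URI path) {
        if (path.getScheme() == null) {
            return; // no scheme: the path is interpreted against fsUri
        }
        boolean sameScheme = path.getScheme().equalsIgnoreCase(fsUri.getScheme());
        boolean sameAuthority = Objects.equals(path.getAuthority(), fsUri.getAuthority());
        if (!sameScheme || !sameAuthority) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        // Placeholder NameNode address, for illustration only.
        URI walDir = URI.create("hdfs://namenode.example.com:8020/hbase/WALs");

        // Site configuration visible: the default filesystem is HDFS, check passes.
        URI withSite = resolveDefaultFs(
            Collections.singletonMap("fs.defaultFS", "hdfs://namenode.example.com:8020"));
        checkPath(withSite, walDir);

        // Site configuration missing: the default is the local filesystem, check fails.
        URI withoutSite = resolveDefaultFs(Collections.<String, String>emptyMap());
        try {
            checkPath(withoutSite, walDir);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the second case matches what a node reports, a reasonable first check is whether the directory holding hbase-site.xml and the Hadoop site files is actually on the classpath of the process that runs the tool.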





was (Author: vrodionov):
Ooops, may be a bug.




> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
> Issue Type: Sub-task
> Reporter: Richard Ding
> Assignee: Vladimir Rodionov
> Labels: backup
> Fix For: 2.0.0
>
> Attachments: Backup-and-Restore-Apache_19Sep2016.pdf, 
> Backup-and-Restore-Apache_9Sep2016.pdf, HBaseBackupAndRestore - v0.8.pdf, 
> HBaseBackupAndRestore -0.91.pdf, HBaseBackupAndRestore-v0.9.pdf, 
> HBaseBackupAndRestore.pdf, HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> We have completed the implementation of our backup/restore solution and 
> would like to share it with the community through this JIRA. 
> We leverage the existing HBase snapshot feature to provide a general 
> solution for common users. Our full backup uses a snapshot to capture 
> metadata locally and ExportSnapshot to move data to another cluster; 
> the incremental backup uses offline-WALPlayer to back up HLogs; we also 
> leverage global distributed log roll and flush to improve performance. Other 
> added value includes convert, merge, progress report, and CLI commands, so 
> that a common user can back up HBase data without in-depth knowledge of HBase. 
> Our solution also contains some usability features for enterprise users. 
> The detailed design document and CLI commands will be attached to this JIRA. We 
> plan to use 10~12 subtasks to share each of the following features, and to 
> document the detailed implementation in the subtasks: 
> * *Full Backup*: provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLogs to HFiles offline (for incremental 
> backup)
> * *distributed* log roll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: builds on top of a full backup as a daily/weekly backup 
> * *Convert* incremental backup WAL files into HFiles
> * *Merge* several backup images into one (like merging weekly into monthly)
> * *add and remove* tables to and from a backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on an *existing snapshot*
> *-*
> *Below is the original description, kept here as the history of the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618).  Recently, there have been many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice for backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store a massive 
> amount of data.  Combining full backups with incremental backups has 
> tremendous benefit for HBase as well.  The following is a typical scenario 
> for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodic incremental backups to capture the changes 
> from the full backup, or from the last incremental backup.
> # The user needs to restore table data to a past point in time.
> # The full backup is restored to the table(s) or to different table name(s).  
> Then the incremental backups that are up to the desired point in time are 
> applied on top of the full backup. 
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshot to capture HFiles.
> * Use HBase WALs to capture incremental changes, but we use bulk load of 
> HFiles for fast incremental

[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 2:15 AM:


Ooops, may be a bug.





was (Author: vrodionov):
[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to configure HBase cluster. From 
your log file, we  see that LocalFileSystem is used instead of hdfs, that looks 
like your own configuration problem.  If you have cluster, go to any node to 
HBASE_HOME/bin and run backup command. If it fails, then you will not be able 
to run any hbase command, including 'shell'. Can you run "hbase shell" and 
connect to cluster?





[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 2:12 AM:


[~stack]
You can file a JIRA, of course, but I think it will be closed with the standard 
suggestion to ask the community for help on how to configure an HBase cluster. From 
your log file, we see that LocalFileSystem is used instead of HDFS, which looks 
like a problem with your own configuration.  If you have a cluster, go to 
HBASE_HOME/bin on any node and run the backup command. If it fails, then you will 
not be able to run any hbase command, including 'shell'. Can you run "hbase shell" 
and connect to the cluster?





was (Author: vrodionov):
[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you will not be able to run any hbase command, including 
'shell'. Can you run "hbase shell" and connect to cluster?





[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 2:11 AM:


[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you will not be able to run any hbase command, including 
'shell'. Can you run "hbase shell" and connect to cluster?





was (Author: vrodionov):
[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you will not be able to run any hbase command, including 
'shell'. 





[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 2:00 AM:


[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you will not be able to run any hbase command, including 
'shell'. 





was (Author: vrodionov):
[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you are will not be able to run any hbase command, including 
'shell'. 





[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 1:58 AM:


[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you are will not be able to run any hbase command, including 
'shell'. 





was (Author: vrodionov):
[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you are not able to run any hbase command, including 'shell'. 





[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570534#comment-15570534
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 1:56 AM:


[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 

If you have cluster, go to any node to 
HBASE_HOME/bin

and run backup command

If it fails, then you are not able to run any hbase command, including 'shell'. 





was (Author: vrodionov):
[~stack]
You can file JIRA, of course, but I think it will be closed with standard 
suggestion to ask community for help on how to
configure HBase cluster. From your log file, we clearly see that 
LocalFileSystem is used instead of hdfs, that is clearly, your own 
configuration issue. 





[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570273#comment-15570273
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 12:11 AM:
-

{quote}
It failed with java.lang.IllegalArgumentException Wrong FS: 
hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:/// and then 
later I got an NPE:
{quote}

Can you post the full stack trace, [~saint@gmail.com]? It seems something is 
wrong with your HBase conf file.

Google: "java.lang.IllegalArgumentException Wrong FS: hdfs:" 
https://www.google.com/webhp?sourceid=chrome-instant&rlz=1C5CHFA_enUS548US548&ion=1&espv=2&ie=UTF-8#q=java.lang.IllegalArgumentException+Wrong+FS%3A+hdfs%3A

Next time, can you try the recommended form:
{code}
$ ./hbase/bin/hbase backup create full 
hdfs://ve0524.halxg.cloudera.com:8020/backu
{code}

hbase picks up its config from the CLASSPATH; other options, such as --config, 
may require you to put all the Hadoop XML files on the command's CLASSPATH 
manually.
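
For anyone hitting the same error, here is a minimal sketch of the kind of check that produces a "Wrong FS" message. It is a simplified model written only against java.net.URI, not Hadoop's actual code, and the class and method names are hypothetical. The point it illustrates: when no cluster configuration is found on the CLASSPATH, fs.defaultFS falls back to file:///, and an hdfs:// path then fails the scheme comparison.

```java
import java.net.URI;

// Simplified model of a "does this path belong to this file system" check.
// Hypothetical names; this is not Hadoop's implementation.
final class WrongFsSketch {

    /** Returns "ok", or a message shaped like the one quoted in this thread. */
    static String check(URI defaultFs, URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            return "ok"; // scheme-less paths resolve against the default file system
        }
        boolean sameScheme = scheme.equalsIgnoreCase(defaultFs.getScheme());
        boolean sameAuthority =
            equalsIgnoreCase(path.getAuthority(), defaultFs.getAuthority());
        if (sameScheme && sameAuthority) {
            return "ok";
        }
        return "Wrong FS: " + path + ", expected: " + defaultFs;
    }

    // Null-safe, case-insensitive comparison; two missing authorities count as equal.
    private static boolean equalsIgnoreCase(String a, String b) {
        return a == null ? b == null : a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        URI wals = URI.create("hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs");
        // Default file system when no cluster config is on the CLASSPATH.
        System.out.println(check(URI.create("file:///"), wals));
        // Default file system when the cluster's core-site.xml is picked up.
        System.out.println(check(URI.create("hdfs://ve0524.halxg.cloudera.com:8020"), wals));
    }
}
```

If the first case matches what you see, the tool is most likely running without the cluster's core-site.xml on its CLASSPATH.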


was (Author: vrodionov):
{quote}
It failed with java.lang.IllegalArgumentException Wrong FS: 
hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:/// and then 
later I got an NPE:
{quote}

Can you post full stack trace, [~saint@gmail.com]? It seems something is 
wrong with your hbase conf file.

Google: "java.lang.IllegalArgumentException Wrong FS: hdfs:" 
https://www.google.com/webhp?sourceid=chrome-instant&rlz=1C5CHFA_enUS548US548&ion=1&espv=2&ie=UTF-8#q=java.lang.IllegalArgumentException+Wrong+FS%3A+hdfs%3A

Next time, can try recommended:
{code}
$ ./hbase/bin/hbase backup create full 
hdfs://ve0524.halxg.cloudera.com:8020/backu
{code}

hbase pick up config from CLASSPATH, all other options, such as --config may 
require you putting all hadoop xml files into command CLASSPATH manually.


[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570273#comment-15570273
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 12:05 AM:
-

{quote}
It failed with java.lang.IllegalArgumentException Wrong FS: 
hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:/// and then 
later I got an NPE:
{quote}

Can you post the full stack trace, [~saint@gmail.com]? It seems something is 
wrong with your HBase conf file.

Google: "java.lang.IllegalArgumentException Wrong FS: hdfs:" 
https://www.google.com/webhp?sourceid=chrome-instant&rlz=1C5CHFA_enUS548US548&ion=1&espv=2&ie=UTF-8#q=java.lang.IllegalArgumentException+Wrong+FS%3A+hdfs%3A

Next time, can you try the recommended form:
{code}
$ ./hbase/bin/hbase backup create full 
hdfs://ve0524.halxg.cloudera.com:8020/backu
{code}

hbase picks up its config from the CLASSPATH; other options, such as --config, 
may require you to put all the Hadoop XML files on the command's CLASSPATH 
manually.


was (Author: vrodionov):
{quote}
It failed with java.lang.IllegalArgumentException Wrong FS: 
hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:/// and then 
later I got an NPE:
{quote}

Can you post full stack trace, [~saint@gmail.com]? It seems something is 
wrong with your hbase conf file.

Google: "java.lang.IllegalArgumentException Wrong FS: hdfs:" 
https://www.google.com/webhp?sourceid=chrome-instant&rlz=1C5CHFA_enUS548US548&ion=1&espv=2&ie=UTF-8#q=java.lang.IllegalArgumentException+Wrong+FS%3A+hdfs%3A


[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570273#comment-15570273
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/13/16 12:02 AM:
-

{quote}
It failed with java.lang.IllegalArgumentException Wrong FS: 
hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:/// and then 
later I got an NPE:
{quote}

Can you post the full stack trace, [~saint@gmail.com]? It seems something is 
wrong with your HBase conf file.

Google: "java.lang.IllegalArgumentException Wrong FS: hdfs:" 
https://www.google.com/webhp?sourceid=chrome-instant&rlz=1C5CHFA_enUS548US548&ion=1&espv=2&ie=UTF-8#q=java.lang.IllegalArgumentException+Wrong+FS%3A+hdfs%3A


was (Author: vrodionov):
{quote}
It failed with java.lang.IllegalArgumentException Wrong FS: 
hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:/// and then 
later I got an NPE:
{quote}

Can you post full stack trace, [~saint@gmail.com]? It seems something is 
wrong with your hbase conf file.


[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-10-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570273#comment-15570273
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 10/12/16 11:59 PM:
-

{quote}
It failed with java.lang.IllegalArgumentException Wrong FS: 
hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:/// and then 
later I got an NPE:
{quote}

Can you post the full stack trace, [~saint@gmail.com]? It seems something is 
wrong with your HBase conf file.


was (Author: vrodionov):
{quote}
It failed with java.lang.IllegalArgumentException Wrong FS: 
hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:/// and then 
later I got an NPE:
{quote}

Can you post full stack trace?


[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-09-20 Thread Frank Welsch (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504659#comment-15504659
 ] 

Frank Welsch edited comment on HBASE-7912 at 9/21/16 3:18 AM:
--

I am attaching a revised doc for the feature. The revisions are as follows:
--correct the backupId argument spelling and format on the CLI
--remove the hbase backup cancel functionality from the doc because it's not 
yet supported
--list limitations of the current HBase backup-and-restore functionality
--add the required property setting in the container-executor.cfg file


was (Author: fwelsch):
I am attaching revised doc of the feature. The revisions are for the following:
--correct the backupId argument spelling and format on the CLI
--to remove the hbase backup cancel functionality from the doc because it's not 
yet supported
--list limitations of the current HBase backup-and-restore functionality


[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-09-12 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15485053#comment-15485053
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 9/12/16 7:34 PM:
---

Thanks for the review, [~saint@gmail.com] :)

We will address command-line tool usability shortly. 

{quote}
Could also say how long a backup is going to take roughly. Me as an operator 
would be afraid to run a backup because I'd think the command could run for 
ever on my 100 node cluster.
{quote} 

There are too many unknowns to make a good estimate here. A backup is a 
snapshot followed by a distcp. Both should run reasonably fast on a 100-node 
cluster. An incremental backup is just a distcp, hence it will run even faster.
{quote}
Unable to get table hbase:backup state
org.apache.hadoop.hbase.TableNotFoundException: hbase:backup
{quote}

That issue affects not only the backup system table but all tables in HBase. 
This error is annoying, I agree. It needs to be addressed in a separate JIRA 
for CreateTableProcedure.

{quote}
Convert is unexplained as is silent (I can guess what the latter means)
{quote}
These are artifacts from the past. We will remove them.
{quote}
I would have liked file: scheme as an option for backup location if only for 
testing purposes (the timelinev2 folks might like this too). I can file an 
issue.
{quote}

You can file a JIRA, of course. 

{quote}
-w workers are threads or mapreduce tasks? Thats what I asked myself when I saw 
it.
Would be great working the doc explaination of each of these options back into 
the command usage. More folks will read the cmd output than doc 
(unfortunately). e.g. the doc explains what the -w option is about where usage 
output does not.
{quote}

Yes, that is the number of M/R tasks for the M/R-based distributed 
backup/restore service. In theory, one can provide an entirely non-M/R 
implementation of all the distributed services, and the exact meaning of *-w* 
could be different in that case.

{quote}
Does that '-b' for bandwidth actually work? If so, how.
{quote}
Yes, it works for both export snapshot and distcp. You can specify the 
bandwidth per map task in both tools.
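
To make the interaction of *-w* and *-b* concrete, here is a rough arithmetic sketch. It assumes the bandwidth cap is expressed in MB/s per map task and that every task runs at its cap, so the result is only a lower bound on copy time, not a real estimate. The class name, method names, and the numbers are hypothetical.

```java
// Back-of-the-envelope bound on copy time from the worker count (-w)
// and the per-task bandwidth cap (-b). Hypothetical helper, assumed units: MB/s.
final class BackupCopyEstimate {

    /** Upper bound on the aggregate copy rate: map tasks times per-task cap. */
    static long aggregateMbPerSec(int workers, int mbPerSecPerTask) {
        return (long) workers * mbPerSecPerTask;
    }

    /** Lower bound on copy time in seconds for dataMb megabytes, rounded up. */
    static long minCopySeconds(long dataMb, int workers, int mbPerSecPerTask) {
        long rate = aggregateMbPerSec(workers, mbPerSecPerTask);
        if (rate <= 0) {
            throw new IllegalArgumentException("workers and bandwidth must be positive");
        }
        return (dataMb + rate - 1) / rate; // ceiling division
    }

    public static void main(String[] args) {
        long dataMb = 1024L * 1024L; // 1 TiB expressed in MiB, an illustrative size
        int workers = 10;            // value passed as -w
        int capPerTask = 100;        // value passed as -b
        System.out.println("aggregate cap (MB/s): " + aggregateMbPerSec(workers, capPerTask));
        System.out.println("min copy time (s): " + minCopySeconds(dataMb, workers, capPerTask));
    }
}
```

Actual run time also depends on snapshot time, cluster load, and network topology, which is why no firm estimate is given above.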

{quote}
I tried history command. It emitted nothing. I add a -h and got the above.
Is 'history' the 'list' of backups taken? They the same thing?
{quote}

Yes, history is the list of backups in the system table. You saw nothing 
because you had nothing in the system table. 

Overall, thanks for the quick review of the command-line tools; we will 
address the usability issues shortly. 
 


  


was (Author: vrodionov):
Thanks, [~saint@gmail.com] for review :)

We will address command-line tool usability shortly. 

{quote}
Could also say how long a backup is going to take roughly. Me as an operator 
would be afraid to run a backup because I'd think the command could run for 
ever on my 100 node cluster.
{quote} 

Too many unknowns to make a good estimate here. Backup is a sequence of 
snapshot and distcp. Both should work reasonably fast for 100 node cluster. 
Incremental backup is just distcp, hence will run even faster.
{quote}
Unable to get table hbase:backup state
org.apache.hadoop.hbase.TableNotFoundException: hbase:backup
{quote}

That is the issue not only for backup system table - for all tables in HBase. 
This error is annoying I agree. Needs to be addressed in a separate JIRA for 
CreateTableProcedure.

{quote}
Convert is unexplained as is silent (I can guess what the latter means)
{quote}
Artefacts from the past. Will remove them.
{quote}
I would have liked file: scheme as an option for backup location if only for 
testing purposes (the timelinev2 folks might like this too). I can file an 
issue.
{quote}

You can file JIRA, of course. 

{quote}
-w workers are threads or mapreduce tasks? Thats what I asked myself when I saw 
it.
Would be great working the doc explaination of each of these options back into 
the command usage. More folks will read the cmd output than doc 
(unfortunately). e.g. the doc explains what the -w option is about where usage 
output does not.
{quote}

Yes, that is M/R task number for M/R - based distributed backup/restore 
service. In theory, one can provide totally non-M/R based implementation for 
all distributed services and exact meaning of *-w* can be different in this 
case.

{quote}
Does that '-b' for bandwidth actually work? If so, how.
{quote}
Yes, it works for both: export snapshot and distcp copy. You can specify 
bandwidth per map task in both tools.

{quote}
I tried history command. It emitted nothing. I add a -h and got the above.
Is 'history' the 'list' of backups taken? They the same thing?
{quote}

Yes, history is the list of backups in system table. You saw nothing because I 
had nothing. 

Overall, thanks for quick review of command-line tools and we will address 
usability issues shortly. 
 


  


[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-09-09 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478012#comment-15478012
 ] 

Vladimir Rodionov edited comment on HBASE-7912 at 9/9/16 7:23 PM:
--

The Backup User Guide is attached. Prepared by [~fwelsch].


was (Author: vrodionov):
Backup User Guide is attached.

> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: Backup-and-Restore-Apache_9Sep2016.pdf, 
> HBaseBackupAndRestore - v0.8.pdf, HBaseBackupAndRestore -0.91.pdf, 
> HBaseBackupAndRestore-v0.9.pdf, HBaseBackupAndRestore.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share with community through this jira. 
> We are leveraging existing hbase snapshot feature, and provide a general 
> solution to common users. Our full backup is using snapshot to capture 
> metadata locally and using exportsnapshot to move data to another cluster; 
> the incremental backup is using offline-WALplayer to backup HLogs; we also 
> leverage global distribution rolllog and flush to improve performance; other 
> added-on values such as convert, merge, progress report, and CLI commands. So 
> that a common user can backup hbase data without in-depth knowledge of hbase. 
>  Our solution also contains some usability features for enterprise users. 
> The detail design document and CLI command will be attached in this jira. We 
> plan to use 10~12 subtasks to share each of the following features, and 
> document the detail implement in the subtasks: 
> * *Full Backup* : provide local and remote back/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert*  incremental backup WAL files into hfiles
> * *Merge* several backup images into one(like merge weekly into monthly)
> * *add and remove* table to and from Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-*
> *Below is the original description, to keep here as the history for the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618).  Recently, there are many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in database is to first take full 
> baseline backup, and then periodically take incremental backup that capture 
> the changes since the full baseline backup. HBase cluster can store massive 
> amount data.  Combination of full backups with incremental backups has 
> tremendous benefit for HBase as well.  The following is a typical scenario 
> for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodical incremental backups to capture the changes 
> from the full backup, or from last incremental backup.
> # The user needs to restore table data to a past point of time.
> # The full backup is restored to the table(s) or to different table name(s).  
> Then the incremental backups that are up to the desired point in time are 
> applied on top of the full backup. 
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshot to capture HFiles.
> * Use HBase WALs to capture incremental changes, but we use bulk load of 
> HFiles for fast incremental restore.
> * Support single table or a set of tables, and column family level backup and 
> restore.
> * Restore to different table names.
> * Support adding additional tables or CF to backup set without interruption 
> of incremental backup schedule.
> * Support rollup/combining of incremental backups into longer period and 
> bigger incremental backups.
> * Unified command line interface 

[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-04-12 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238048#comment-15238048
 ] 

Devaraj Das edited comment on HBASE-7912 at 4/12/16 9:43 PM:
-

bq. We are not repeating, ns is upper directory level, similar to hbase layout. 
I am not sure I am following you here
I think [~enis] is referring to the fact that you have the namespace repeated 
in the path twice. For example, in this the 'default' namespace appears twice:
{code}ROOT/default/test-1459375580152/backup_1459375618126/archive/data/default/test-1459375580152/543f8c02c388dd931fb9bcd1c38e7372/f/a6ce4789f9b444d89bbc755254afd27d{code}


was (Author: devaraj):
bq. We are not repeating, ns is upper directory level, similar to hbase layout. 
I am not sure I am following you here
I think [~enis] is referring to the fact that you have the namespace repeated 
in the path twice. For example, in this:
{code}ROOT/default/test-1459375580152/backup_1459375618126/archive/data/default/test-1459375580152/543f8c02c388dd931fb9bcd1c38e7372/f/a6ce4789f9b444d89bbc755254afd27d{code}

> HBase Backup/Restore Based on HBase Snapshot
> 
>
> Key: HBASE-7912
> URL: https://issues.apache.org/jira/browse/HBASE-7912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Richard Ding
>Assignee: Vladimir Rodionov
>  Labels: backup
> Fix For: 2.0.0
>
> Attachments: HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBaseBackupandRestore.pdf, 
> HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>

[jira] [Comment Edited] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

2016-03-30 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219257#comment-15219257
 ] 

Enis Soztutar edited comment on HBASE-7912 at 3/31/16 3:18 AM:
---

Thanks [~vrodionov] for updating the design doc for layout and backup image 
formats. It helps in understanding the patches. 

We had discussed this offline, and I thought the plan was to use the 
{{backupId}} as the leading dir name. Instead of 
{code}ROOT/default/test-1459375580152/backup_1459375618126/archive/data/default/test-1459375580152/543f8c02c388dd931fb9bcd1c38e7372/f/a6ce4789f9b444d89bbc755254afd27d{code}
I think we should use the same directory layout as the {{hbase.rootdir}} under 
the backup root dir: 

So for full backups it should look like this: 
{code}ROOT/backup_1459375618126/archive/data/default/test-1459375580152/543f8c02c388dd931fb9bcd1c38e7372/f/a6ce4789f9b444d89bbc755254afd27d{code}
And for incremental backups the layout should still follow the same 
{{hbase.rootdir}} layout. So instead of: 
{code}ROOT/WALs/backup_1459378688723/10.22.11.177%2C58809%2C1459378626079.1459378659124{code}
it should be: 
{code}ROOT/backup_1459378688723/WALs/10.22.11.177,58809,1459378626079/10.22.11.177%2C58809%2C1459378626079.1459378659124{code}
Notice that there is an extra  in the layout as well. 

This structure will allow the rest of the code base (like hfile links, etc.) to 
work seamlessly, and will also help with a large number of files under WALs. 
Also, all the tables in the table set for a backup will be hosted together. 
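
To make the proposed layout concrete, here is a minimal sketch that assembles the two example paths with the {{backupId}} as the leading directory. The class and method names are hypothetical and exist only for illustration; they are not taken from the patches.

```java
// Hypothetical helper for illustration only; not part of the HBase backup patches.
final class BackupLayoutSketch {

    // Full backup file: ROOT/<backupId>/archive/data/<namespace>/<table>/<region>/<family>/<hfile>
    static String fullBackupFile(String root, String backupId, String namespace,
                                 String table, String region, String family, String hfile) {
        return String.join("/", root, backupId, "archive", "data",
                namespace, table, region, family, hfile);
    }

    // Incremental backup WAL: ROOT/<backupId>/WALs/<serverName>/<walFile>
    static String incrementalWal(String root, String backupId,
                                 String serverName, String walFile) {
        return String.join("/", root, backupId, "WALs", serverName, walFile);
    }

    public static void main(String[] args) {
        System.out.println(fullBackupFile("ROOT", "backup_1459375618126", "default",
                "test-1459375580152", "543f8c02c388dd931fb9bcd1c38e7372", "f",
                "a6ce4789f9b444d89bbc755254afd27d"));
        System.out.println(incrementalWal("ROOT", "backup_1459378688723",
                "10.22.11.177,58809,1459378626079",
                "10.22.11.177%2C58809%2C1459378626079.1459378659124"));
    }
}
```

Everything below {{ROOT/<backupId>}} then mirrors what would sit under {{hbase.rootdir}}, so the namespace appears only once in the path.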

You have {{not used}} and {{obsolite}} fields in the PB structures. Since this 
is new work, whatever is not used and not needed should be removed from the 
patches. 

The backup images contain this: {{required string root_dir = 3;}}, which I 
think we should remove. The problem with having the absolute path in the 
manifests is that it makes the backup directory un-relocatable. If the operator 
renames the directory, or otherwise changes the NN info etc., the result will 
be silent data loss. I think we should make every path in the image / manifest 
relative, with all ancestors implicitly under the same remote backup location. 
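
A minimal sketch of the relative-path idea, using plain string handling rather than the Hadoop {{Path}} API so that it stays self-contained. The helper names and the example URIs are made up for illustration.

```java
// Hypothetical helper for illustration only; real code would likely use Hadoop's Path class.
final class RelativeManifestPathSketch {

    private static String withTrailingSlash(String dir) {
        return dir.endsWith("/") ? dir : dir + "/";
    }

    // Strip the backup root so that only a relative path is stored in the manifest.
    static String toRelative(String backupRoot, String absolutePath) {
        String prefix = withTrailingSlash(backupRoot);
        if (!absolutePath.startsWith(prefix)) {
            throw new IllegalArgumentException(absolutePath + " is not under " + backupRoot);
        }
        return absolutePath.substring(prefix.length());
    }

    // Resolve a stored relative path against wherever the backup root lives now.
    static String resolve(String currentBackupRoot, String relativePath) {
        return withTrailingSlash(currentBackupRoot) + relativePath;
    }

    public static void main(String[] args) {
        String stored = toRelative("hdfs://nn1:8020/backups",
                "hdfs://nn1:8020/backups/backup_1459375618126/archive/data/default/t1");
        // After the directory is moved or the NN changes, the same entry still resolves.
        System.out.println(resolve("hdfs://nn2:8020/moved", stored));
    }
}
```

With only relative entries in the manifest, the location passed in at restore time is the single place where an absolute path appears.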

This is from the doc: 
bq. There was concern that we can lose data in between incremental backup 
sessions and this why the tracking of already copied WAL files has been added, 
but it turned out that is not necessary to do this because we ALWAYS include 
ALL tables which have at least one backup session into final backup table list 
for incremental backup.
Without this, the issue with HBASE-15442 will still happen, no? With the 
dependency generation algorithm as in the doc, if I have: 
 - full backup t1 with backup_id = bid1
 - incremental backup t2 with backup_id = bid2
 - incremental backup t1 with backup_id = bid3

then bid3 will NOT depend on bid2, so it is data loss still, no? 
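
The scenario can be played through with a small model. This is only a sketch: it assumes, as a simplification of my own, that ancestors are picked purely by table overlap while walking back through the history, which may not match the algorithm in the doc exactly. The {{Backup}} class here is hypothetical and is not the {{BackupImage}} structure from the patches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model for illustration; not the BackupImage/manifest classes from the patches.
final class BackupDependencySketch {

    static final class Backup {
        final String id;
        final boolean full;
        final Set<String> tables;

        Backup(String id, boolean full, String... tables) {
            this.id = id;
            this.full = full;
            this.tables = new HashSet<>(Arrays.asList(tables));
        }
    }

    // ASSUMPTION: ancestors are chosen purely by table overlap, walking back
    // through the history until a full backup covers every table of the target.
    static List<String> ancestorsByTable(List<Backup> history, int index) {
        Backup target = history.get(index);
        List<String> ancestors = new ArrayList<>();
        if (target.full) {
            return ancestors; // a full backup stands alone
        }
        Set<String> uncovered = new HashSet<>(target.tables);
        for (int i = index - 1; i >= 0 && !uncovered.isEmpty(); i--) {
            Backup earlier = history.get(i);
            Set<String> shared = new HashSet<>(earlier.tables);
            shared.retainAll(uncovered);
            if (shared.isEmpty()) {
                continue; // no common table, so this session is skipped
            }
            ancestors.add(earlier.id);
            if (earlier.full) {
                uncovered.removeAll(shared); // the chain for these tables is complete
            }
        }
        return ancestors;
    }

    public static void main(String[] args) {
        List<Backup> history = Arrays.asList(
                new Backup("bid1", true, "t1"),
                new Backup("bid2", false, "t2"),
                new Backup("bid3", false, "t1"));
        System.out.println(ancestorsByTable(history, 2)); // prints [bid1]
    }
}
```

Under that assumption bid2 is skipped for bid3, so any t1 edits that sat in the WAL files copied during bid2 would not be replayed on restore.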

{{BackupImage}} itself just duplicates the information that we already have in 
the manifest files. Do we really need that structure at all? Can we instead 
keep a list of backup_ids and read the manifests at the time of restore? 


