[ https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15572912#comment-15572912 ]

stack commented on HBASE-7912:
------------------------------

Here is some more:

I played with sets. When I add to a set, it lists: Added tables [clicks 
students] to 's' backup set ... but when I list the set, I get 
s={clicks,students}; i.e. in the former it is square brackets and 
space-delimited, but in the latter it is curly braces and commas.
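
The two should render the same way; a minimal sketch of the kind of shared 
helper I mean (hypothetical names, not the patch's actual code):

{code}
import java.util.List;

public final class BackupSetFormat {
  private BackupSetFormat() {}

  // One formatter used by both the add path and the list path,
  // e.g. format("s", tables) -> s={clicks,students}
  static String format(String setName, List<String> tables) {
    return setName + "={" + String.join(",", tables) + "}";
  }
}
{code}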

Why is this an error?

stack@ve0524:~$ ./hbase/bin/hbase --config ~/conf_hbase backup set
ERROR: Command line format

It should just dump the help/usage w/o complaint. When I do './hbase/bin/hbase --config ~/conf_hbase backup'... it does the right thing (dumping out help/usage but not reporting ERROR). Then again, this one reports an ERROR:

stack@ve0524:~$ ./hbase/bin/hbase --config ~/conf_hbase backup create
ERROR: wrong number of arguments: 1

When required args are not supplied, dumping help/usage rather than an ERROR is 
usual and user-friendly.
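
A guard like this in the command entry point would do it (hypothetical sketch; 
names are mine, not the actual BackupCommands code):

{code}
public final class UsageCheck {
  static void requireArgs(String[] args, int min, String usage) {
    if (args.length < min) {
      // Dump help/usage instead of an ERROR banner for a merely-missing argument.
      System.out.println(usage);
      System.exit(1); // still exit non-zero so scripts can detect it
    }
  }

  public static void main(String[] args) {
    requireArgs(args, 2,
        "Usage: hbase backup create <type> <BACKUP_ROOT> [tables] [-set name] [-w workers] [-b bandwidth]");
    // ... proceed with the command ...
  }
}
{code}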

When I run a backup, it finishes the job with these:

{code}
2016-10-13 11:42:47,634 INFO  [main] mapreduce.Job:  map 100% reduce 0%
2016-10-13 11:42:47,635 INFO  [main] mapreduce.Job: Job job_local781524854_0001 completed successfully
2016-10-13 11:42:47,660 INFO  [main] mapreduce.Job: Counters: 27
        File System Counters
                FILE: Number of bytes read=55034661
                FILE: Number of bytes written=56196072
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=24283676687
                HDFS: Number of bytes written=24283672819
                HDFS: Number of read operations=116
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=42
        Map-Reduce Framework
                Map input records=12
                Map output records=0
                Input split bytes=1322
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=1077
                Total committed heap usage (bytes)=3061186560
        org.apache.hadoop.hbase.snapshot.ExportSnapshot$Counter
                BYTES_COPIED=15861720126
                BYTES_EXPECTED=15861720126
                BYTES_SKIPPED=0
                COPY_FAILED=0
                FILES_COPIED=12
                FILES_SKIPPED=0
                MISSING_FILES=0
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=0
2016-10-13 11:42:47,660 INFO  [main] snapshot.ExportSnapshot: Finalize the Snapshot Export
2016-10-13 11:42:47,661 ERROR [main] snapshot.ExportSnapshot: Snapshot export failed
java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
        at org.apache.hadoop.hdfs.DFSClient.rename(DFSClient.java:1956)
        at org.apache.hadoop.hdfs.DistributedFileSystem.rename(DistributedFileSystem.java:626)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:1010)
        at org.apache.hadoop.hbase.backup.mapreduce.MapReduceBackupCopyService.copy(MapReduceBackupCopyService.java:302)
        at org.apache.hadoop.hbase.backup.impl.FullTableBackupClient.snapshotCopy(FullTableBackupClient.java:288)
        at org.apache.hadoop.hbase.backup.impl.FullTableBackupClient.execute(FullTableBackupClient.java:510)
        at org.apache.hadoop.hbase.backup.impl.HBaseBackupAdmin.backupTables(HBaseBackupAdmin.java:532)
        at org.apache.hadoop.hbase.backup.impl.BackupCommands$CreateCommand.execute(BackupCommands.java:225)
        at org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:114)
        at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:135)
        at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:171)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:140)
2016-10-13 11:42:47,766 ERROR [main] impl.FullTableBackupClient: Unexpected BackupException : java.io.IOException: Filesystem closed
java.io.IOException: java.io.IOException: Filesystem closed
        at org.apache.hadoop.hbase.backup.mapreduce.MapReduceBackupCopyService.copy(MapReduceBackupCopyService.java:325)
        at org.apache.hadoop.hbase.backup.impl.FullTableBackupClient.snapshotCopy(FullTableBackupClient.java:288)
        at org.apache.hadoop.hbase.backup.impl.FullTableBackupClient.execute(FullTableBackupClient.java:510)
        at org.apache.hadoop.hbase.backup.impl.HBaseBackupAdmin.backupTables(HBaseBackupAdmin.java:532)
        at org.apache.hadoop.hbase.backup.impl.BackupCommands$CreateCommand.execute(BackupCommands.java:225)
        at org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:114)
        at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:135)
        at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:171)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:140)
Caused by: java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:808)
...
{code}

The command I ran was:

$ ./hbase/bin/hbase --config ~/conf_hbase backup create full clicks

I'm missing the backup location, which probably explains the above. I'd think 
the tool would verify the command before going ahead and running a mapreduce 
job?
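
All I mean is a fail-fast check against the stock Hadoop FileSystem API before 
the MR job launches; a hypothetical sketch, not the patch's code:

{code}
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class BackupRootCheck {
  // Resolve the backup root against its own filesystem and make sure it is
  // usable before any snapshot/export work starts.
  static void verifyBackupRoot(String backupRoot, Configuration conf) throws IOException {
    FileSystem fs = FileSystem.get(URI.create(backupRoot), conf);
    Path root = new Path(backupRoot);
    if (!fs.exists(root) && !fs.mkdirs(root)) {
      throw new IOException("Backup root " + backupRoot + " is missing and could not be created");
    }
  }
}
{code}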

This is interesting... if I pass both a set and a table, as the command lets me 
do, it runs the backup twice. I was sort of expecting one backup dir with all 
the content in it. No biggie. Here is the command I ran (the -set included the 
table, so the tool's behavior avoids a backup overwrite):

./hbase/bin/hbase --config ~/conf_hbase backup create full hdfs://ve0524.halxg.cloudera.com:8020/backup clicks -set s

Probably worth documenting this behavior.

Oh. Just noticed that usage says tables and -set are mutually exclusive, but 
above I was able to run a command with both tables and -set. Fix the usage?
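
If mutual exclusion is the intent, a guard like this would enforce it 
(hypothetical sketch):

{code}
public final class ExclusiveCheck {
  static void checkTablesVsSet(String tables, String setName, String usage) {
    if (tables != null && setName != null) {
      System.err.println("ERROR: [tables] and -set are mutually exclusive");
      System.out.println(usage);
      System.exit(1);
    }
  }
}
{code}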

Why in here:
{code}
stack@ve0524:~$ ./hbase/bin/hbase --config ~/conf_hbase backup create
ERROR: wrong number of arguments: 1
Usage: hbase backup create <type> <BACKUP_ROOT> [tables] [-set name] [-w workers][-b bandwith]
 type           "full" to create a full backup image
                "incremental" to create an incremental backup image
 BACKUP_ROOT     The full root path to store the backup image,
                 the prefix can be hdfs, webhdfs or gpfs
Options:
 tables          If no tables ("") are specified, all tables are backed up.
                 Otherwise it is a comma separated list of tables.
 -w              number of parallel workers (MapReduce tasks).
 -b              bandwith per one worker (MapReduce task) in MBs per sec
 -set            name of backup set to use (mutually exclusive with [tables])
{code}

... does the tables 'option' not take a hyphen when all the others do? Most of 
the time options are space-delimited, but not always....
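
For what it's worth, if tables took a flag too, the whole command line would 
parse uniformly. A sketch assuming Apache Commons CLI (1.3+ for DefaultParser); 
the -t flag is my suggestion, not the tool's current syntax:

{code}
import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.DefaultParser;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.ParseException;

public final class CreateOpts {
  public static void main(String[] args) throws ParseException {
    Options opts = new Options();
    // Every option hyphenated and space-delimited, tables included.
    opts.addOption("t", "tables", true, "comma-separated list of tables (all if omitted)");
    opts.addOption("set", true, "name of backup set to use");
    opts.addOption("w", true, "number of parallel workers (MapReduce tasks)");
    opts.addOption("b", true, "bandwidth per worker in MB/s");
    CommandLine cmd = new DefaultParser().parse(opts, args);
    String tables = cmd.getOptionValue("t"); // null means "all tables"
    String[] positionals = cmd.getArgs();    // remaining: <type> <BACKUP_ROOT>
  }
}
{code}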

I tried an incremental against the clicks table. I got a SUCCESS at the end, 
but it seems to have updated the manifests of unrelated tables. Is that 
expected? Here is the command:

stack@ve0524:~$ ./hbase/bin/hbase --config ~/conf_hbase backup create incremental hdfs://ve0524.halxg.cloudera.com:8020/backup clicks

... here is the output:

{code}
...
2016-10-13 12:00:02,154 INFO  [main] impl.BackupManifest: Manifest file stored to hdfs://ve0524.halxg.cloudera.com:8020/backup/backup_1476385197835/default/clicks/.backup.manifest
2016-10-13 12:00:02,156 DEBUG [AsyncRpcChannel-pool2-t88] ipc.AsyncRpcChannel: Use SIMPLE authentication for service ClientService, sasl=false
2016-10-13 12:00:02,213 INFO  [main] impl.BackupManifest: Manifest file stored to hdfs://ve0524.halxg.cloudera.com:8020/backup/backup_1476385197835/default/students/.backup.manifest
2016-10-13 12:00:02,271 INFO  [main] impl.BackupManifest: Manifest file stored to hdfs://ve0524.halxg.cloudera.com:8020/backup/backup_1476385197835/default/tsdb-tree/.backup.manifest
2016-10-13 12:00:02,330 INFO  [main] impl.BackupManifest: Manifest file stored to hdfs://ve0524.halxg.cloudera.com:8020/backup/backup_1476385197835/default/tsdb/.backup.manifest
2016-10-13 12:00:02,379 INFO  [main] impl.BackupManifest: Manifest file stored to hdfs://ve0524.halxg.cloudera.com:8020/backup/backup_1476385197835/WALs/.backup.manifest
2016-10-13 12:00:02,388 INFO  [main] impl.FullTableBackupClient: Backup backup_1476385197835 completed.
Backup session backup_1476385197835 finished. Status: SUCCESS
...
{code}

I'm just surprised to see manifests for unrelated tables being updated. Maybe 
you can't run an incremental against a single table? Should it complain if so?
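
If single-table incremental isn't supported, a warning along these lines would 
at least tell the user (hypothetical sketch):

{code}
import java.util.Set;

public final class IncrementalScopeCheck {
  // If the incremental pass will touch more tables than the user asked for,
  // say so rather than silently updating their manifests.
  static void warnIfWiderScope(Set<String> requested, Set<String> covered) {
    if (!requested.containsAll(covered)) {
      System.out.println("WARNING: incremental backup covers " + covered
          + " rather than just the requested " + requested);
    }
  }
}
{code}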

If I describe the incremental, it talks about other tables:

{code}
ID             : backup_1476385425149
Type           : INCREMENTAL
Tables         : ycsb,tsdb-uid,tsdb-meta,clicks,students,tsdb-tree,tsdb
State          : COMPLETE
Start time     : Thu Oct 13 12:03:45 PDT 2016
End time       : Thu Oct 13 12:03:49 PDT 2016
Progress       : 100
{code}

... that formatting and the ':' placement is a bit wonky... A 100 is 100%, I 
suppose.
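
A fixed-width label plus an explicit '%' would tidy it; illustrative sketch 
only:

{code}
public final class DescribeFormat {
  // Fixed-width label, one space around ':', explicit '%' on progress.
  static void printField(String label, Object value) {
    System.out.printf("%-10s : %s%n", label, value);
  }

  public static void main(String[] args) {
    printField("ID", "backup_1476385425149");
    printField("Progress", "100%"); // a bare "100" reads ambiguously
  }
}
{code}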

It is nice I can ask about the backups. That's good. The history is good too... 
especially the bit where it says why it FAILED:

State          : FAILED
Start time     : Wed Oct 12 14:49:46 PDT 2016
Failed message : Wrong FS: hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:///

I was able to back up and restore a small table. Nice. Checks out.

I made an incremental and was then able to do a full restore (an incremental 
w/o a FULL first complains, saying 'wrong type'... the messaging could be 
better, saying a FULL is needed first).
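
Something like this for the message (hypothetical sketch):

{code}
public final class RestoreTypeCheck {
  // Replace the terse 'wrong type' with a message that says what to do.
  static void checkFirstImage(String imageType) {
    if (!"FULL".equals(imageType)) {
      throw new IllegalArgumentException(
          "Cannot start a restore from an INCREMENTAL image: restore the FULL "
          + "backup it depends on first, then apply incrementals on top");
    }
  }
}
{code}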

That's enough for now. It would be interesting to spend more time trying the 
combos, but the basics seem to work. That's good. The interface is messy and 
inconsistent, though, and will leave the operator with a bad impression. If 
that were fixed, and the stuff above, then I'd be game for merge (no incentive 
to fix once it is in).

> HBase Backup/Restore Based on HBase Snapshot
> --------------------------------------------
>
>                 Key: HBASE-7912
>                 URL: https://issues.apache.org/jira/browse/HBASE-7912
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Richard Ding
>            Assignee: Vladimir Rodionov
>              Labels: backup
>             Fix For: 2.0.0
>
>         Attachments: Backup-and-Restore-Apache_19Sep2016.pdf, 
> Backup-and-Restore-Apache_9Sep2016.pdf, HBaseBackupAndRestore - v0.8.pdf, 
> HBaseBackupAndRestore -0.91.pdf, HBaseBackupAndRestore-v0.9.pdf, 
> HBaseBackupAndRestore.pdf, HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share it with the community through this jira. 
> We are leveraging the existing hbase snapshot feature and providing a general 
> solution for common users. Our full backup uses snapshot to capture metadata 
> locally and exportsnapshot to move data to another cluster; the incremental 
> backup uses an offline WALPlayer to back up HLogs; we also leverage globally 
> distributed log roll and flush to improve performance; plus other added 
> values such as convert, merge, progress report, and CLI commands, so that a 
> common user can back up hbase data without in-depth knowledge of hbase. 
> Our solution also contains some usability features for enterprise users. 
> The detailed design document and CLI commands will be attached in this jira. 
> We plan to use 10~12 subtasks to share each of the following features, and 
> document the detailed implementation in the subtasks: 
> * *Full Backup*: provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert*  incremental backup WAL files into hfiles
> * *Merge* several backup images into one (like merging weekly into monthly)
> * *add and remove* tables to and from a Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-------------------------------------------------------------------------------------------------------------*
> *Below is the original description, kept here as the history of the design 
> and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618). Recently, there have been many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store a 
> massive amount of data. The combination of full backups with incremental 
> backups has tremendous benefit for HBase as well. The following is a typical 
> scenario for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodic incremental backups to capture the changes 
> from the full backup, or from the last incremental backup.
> # The user needs to restore table data to a past point in time.
> # The full backup is restored to the table(s) or to different table name(s).  
> Then the incremental backups that are up to the desired point in time are 
> applied on top of the full backup. 
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshot to capture HFiles.
> * Use HBase WALs to capture incremental changes, but we use bulk load of 
> HFiles for fast incremental restore.
> * Support single table or a set of tables, and column family level backup and 
> restore.
> * Restore to different table names.
> * Support adding additional tables or CFs to the backup set without 
> interrupting the incremental backup schedule.
> * Support rollup/combining of incremental backups into longer-period, 
> bigger incremental backups.
> * Unified command line interface for all the above.
> The solution will support HBase backup to a FileSystem, either on the same 
> cluster or across clusters. It has the flexibility to support backup to 
> other devices and servers in the future.


