[jira] [Commented] (HBASE-7912) HBase Backup/Restore Based on HBase Snapshot

stack (JIRA) Wed, 12 Oct 2016 17:38:54 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570379#comment-15570379
 ]


stack commented on HBASE-7912:
------------------------------

{code}
2016-10-12 17:19:49,538 DEBUG [AsyncRpcChannel-pool2-t13] ipc.AsyncRpcChannel: Use SIMPLE authentication for service ClientService, sasl=false
2016-10-12 17:19:49,554 ERROR [main] impl.FullTableBackupClient: Unexpected BackupException : Wrong FS: hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:///
java.lang.IllegalArgumentException: Wrong FS: hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:///
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:647)
        at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:82)
        at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:425)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1515)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1555)
        at org.apache.hadoop.fs.FileSystem$4.<init>(FileSystem.java:1712)
        at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1711)
        at org.apache.hadoop.fs.ChecksumFileSystem.listLocatedStatus(ChecksumFileSystem.java:589)
        at org.apache.hadoop.fs.FileSystem$6.<init>(FileSystem.java:1787)
        at org.apache.hadoop.fs.FileSystem.listFiles(FileSystem.java:1783)
        at org.apache.hadoop.hbase.backup.util.BackupClientUtil.getFiles(BackupClientUtil.java:161)
        at org.apache.hadoop.hbase.backup.util.BackupServerUtil.getWALFilesOlderThan(BackupServerUtil.java:381)
        at org.apache.hadoop.hbase.backup.impl.FullTableBackupClient.execute(FullTableBackupClient.java:492)
        at org.apache.hadoop.hbase.backup.impl.HBaseBackupAdmin.backupTables(HBaseBackupAdmin.java:532)
        at org.apache.hadoop.hbase.backup.impl.BackupCommands$CreateCommand.execute(BackupCommands.java:225)
        at org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:114)
        at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:135)
        at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:171)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:140)
2016-10-12 17:19:49,557 ERROR [main] impl.FullTableBackupClient: BackupId=backup_1476317988066,startts=1476317988637,failedts=1476317989557,failedphase=null,failedmessage=Wrong FS: hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs, expected: file:///
2016-10-12 17:19:49,559 DEBUG [AsyncRpcChannel-pool2-t14] ipc.AsyncRpcChannel: Use SIMPLE authentication for service ClientService, sasl=false
2016-10-12 17:19:49,656 DEBUG [main] ipc.AsyncRpcClient: Stopping async HBase RPC client
{code}

I pass --config pointing at my conf dir, which points at my hdfs and hbase. 
If I don't pass --config, it uses all defaults and doesn't find the clusters 
(my conf dir includes a symlink to hdfs-site.xml).

Yeah, I've seen similar mismatch issues in the past in my own code (your google 
pointer is for a code writer, not for a 'user' like me).  I can bang my head 
and try 'fixing' it, but I am trying to convey a user's experience following 
the instructions and using the tool. What is a little odd here is that the 
complaint comes out of the backup tool and is not about the arg I'm passing (it 
must not be reading it immediately... because it doesn't matter whether I pass 
a file:/// or hdfs:/// scheme for the backup location).
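
For readers who have not hit this class of error before, here is a minimal sketch of the kind of check that produces the "Wrong FS" message in the trace above. It is a simplified stand-in, not Hadoop's actual code: the class WrongFsSketch, its methods, and the host namenode.example.com are all hypothetical, and only the URI scheme is compared. The trace shows the hdfs:// WAL path reaching RawLocalFileSystem, which suggests the filesystem was resolved from a default of file:/// (Hadoop's built-in fs.defaultFS) rather than from the path itself; why that happened in this run is an assumption, not something the log confirms.

```java
import java.net.URI;

public class WrongFsSketch {

    // Hypothetical helper: the filesystem URI a client ends up with when it
    // asks for "the default filesystem". A null setting falls back to
    // file:///, which is Hadoop's built-in default for fs.defaultFS.
    static URI defaultFs(String configuredDefaultFs) {
        return URI.create(configuredDefaultFs == null ? "file:///" : configuredDefaultFs);
    }

    // Simplified scheme check (assumption: the real check also considers the
    // authority). Rejects a path whose scheme differs from the filesystem's.
    static void checkPath(URI fsUri, URI path) {
        String pathScheme = path.getScheme();
        if (pathScheme == null) {
            return; // scheme-less paths are resolved against the filesystem
        }
        if (!pathScheme.equalsIgnoreCase(fsUri.getScheme())) {
            throw new IllegalArgumentException(
                "Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        URI walDir = URI.create("hdfs://namenode.example.com:8020/hbase/WALs");

        // Filesystem taken from the path's own scheme and authority: passes.
        checkPath(URI.create("hdfs://namenode.example.com:8020"), walDir);

        // Filesystem taken from an unset default: the same path is rejected.
        try {
            checkPath(defaultFs(null), walDir);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In Hadoop client code, the usual way to avoid this mismatch is to obtain the filesystem from the path (Path.getFileSystem(conf)) rather than from the default (FileSystem.get(conf)). Whether that is the right fix for the backup tool is for its authors to confirm.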

Let me know what you want me to try... 

This is a straight cluster deploy on a Hadoop 2.7.3 build. Everything 
generally checks out.

> HBase Backup/Restore Based on HBase Snapshot
> --------------------------------------------
>
>                 Key: HBASE-7912
>                 URL: https://issues.apache.org/jira/browse/HBASE-7912
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Richard Ding
>            Assignee: Vladimir Rodionov
>              Labels: backup
>             Fix For: 2.0.0
>
>         Attachments: Backup-and-Restore-Apache_19Sep2016.pdf, 
> Backup-and-Restore-Apache_9Sep2016.pdf, HBaseBackupAndRestore - v0.8.pdf, 
> HBaseBackupAndRestore -0.91.pdf, HBaseBackupAndRestore-v0.9.pdf, 
> HBaseBackupAndRestore.pdf, HBaseBackupRestore-Jira-7912-DesignDoc-v1.pdf, 
> HBaseBackupRestore-Jira-7912-DesignDoc-v2.pdf, 
> HBaseBackupRestore-Jira-7912-v4.pdf, HBaseBackupRestore-Jira-7912-v5 .pdf, 
> HBaseBackupRestore-Jira-7912-v6.pdf, HBase_BackupRestore-Jira-7912-CLI-v1.pdf
>
>
> Finally, we completed the implementation of our backup/restore solution, and 
> would like to share it with the community through this jira. 
> We are leveraging the existing hbase snapshot feature to provide a general 
> solution for common users. Our full backup uses snapshot to capture 
> metadata locally and exportsnapshot to move data to another cluster; 
> the incremental backup uses offline-WALplayer to back up HLogs; we also 
> leverage globally distributed log roll and flush to improve performance; other 
> added values include convert, merge, progress report, and CLI commands, so 
> that a common user can back up hbase data without in-depth knowledge of hbase. 
>  Our solution also contains some usability features for enterprise users. 
> The detailed design document and CLI commands will be attached to this jira. We 
> plan to use 10~12 subtasks to share each of the following features, and 
> document the detailed implementation in the subtasks: 
> * *Full Backup* : provide local and remote backup/restore for a list of tables
> * *offline-WALPlayer* to convert HLog to HFiles offline (for incremental 
> backup)
> * *distributed* Logroll and distributed flush 
> * Backup *Manifest* and history
> * *Incremental* backup: to build on top of full backup as daily/weekly backup 
> * *Convert*  incremental backup WAL files into hfiles
> * *Merge* several backup images into one (like merge weekly into monthly)
> * *add and remove* tables to and from a Backup image
> * *Cancel* a backup process
> * backup progress *status*
> * full backup based on *existing snapshot*
> *-------------------------------------------------------------------------------------------------------------*
> *Below is the original description, to keep here as the history for the 
> design and discussion back in 2013*
> There have been attempts in the past to come up with a viable HBase 
> backup/restore solution (e.g., HBASE-4618).  Recently, there are many 
> advancements and new features in HBase, for example, FileLink, Snapshot, and 
> Distributed Barrier Procedure. This is a proposal for a backup/restore 
> solution that utilizes these new features to achieve better performance and 
> consistency. 
>  
> A common practice of backup and restore in databases is to first take a full 
> baseline backup, and then periodically take incremental backups that capture 
> the changes since the full baseline backup. An HBase cluster can store a massive 
> amount of data.  Combining full backups with incremental backups has 
> tremendous benefit for HBase as well.  The following is a typical scenario 
> for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase. 
> # The user schedules periodic incremental backups to capture the changes 
> from the full backup, or from the last incremental backup.
> # The user needs to restore table data to a past point of time.
> # The full backup is restored to the table(s) or to different table name(s).  
> Then the incremental backups that are up to the desired point in time are 
> applied on top of the full backup. 
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshot to capture HFiles.
> * Use HBase WALs to capture incremental changes, but we use bulk load of 
> HFiles for fast incremental restore.
> * Support single table or a set of tables, and column family level backup and 
> restore.
> * Restore to different table names.
> * Support adding additional tables or CF to backup set without interruption 
> of incremental backup schedule.
> * Support rollup/combining of incremental backups into longer period and 
> bigger incremental backups.
> * Unified command line interface for all the above.
> The solution will support HBase backup to FileSystem, either on the same 
> cluster or across clusters.  It has the flexibility to support backup to 
> other devices and servers in the future.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
