[ 
https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453183#comment-13453183
 ] 

Jagane Sundar commented on HDFS-3370:
-------------------------------------

Pardon my naive question, but are hard links adequate for the purposes of 
HBase backup? The first line in this JIRA says "This provides a lightweight way 
for applications like hbase to create a snapshot".

Perhaps HBase experts can answer this question: Are single-file hard links 
adequate for HBase backup? Don't you want a point-in-time snapshot of the 
entire filesystem, or at least of all the files under the HBase data directory?

Don't you really want a sequence of events such as:
1. Flush all HBase MemStores
2. Quiesce HBase, i.e. get it to stop writing to HDFS
3. Call the underlying HDFS to create a point-in-time (PIT), read-only (RO) 
snapshot with copy-on-write (COW) semantics
4. Tell HBase to end quiesce, i.e. it can start writing to HDFS again
5. Backup program now reads from RO snapshot and writes to backup device, while 
HBase continues to write to the real directory tree
6. When the backup program is done, it deletes the RO snapshot
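
The sequence above can be sketched with a toy in-memory model. This is only an 
illustration of the ordering of the steps, not real HBase or HDFS code: 
FakeHDFS, FakeHBase, take_snapshot and run_backup are hypothetical names made 
up for this sketch, and the "snapshot" is just a read-only view over shared 
byte strings, which only loosely imitates COW.

```python
from types import MappingProxyType


class FakeHDFS:
    """Toy in-memory file system (hypothetical; not the HDFS API)."""

    def __init__(self):
        self.files = {}      # path -> bytes
        self.snapshots = {}  # snapshot name -> read-only view

    def write(self, path, data):
        self.files[path] = data

    def create_snapshot(self, name):
        # Copy only the path -> data references. The byte strings themselves
        # are shared, which loosely mimics copy-on-write.
        self.snapshots[name] = MappingProxyType(dict(self.files))

    def delete_snapshot(self, name):
        del self.snapshots[name]


class FakeHBase:
    """Toy stand-in for HBase (hypothetical; not a real client API)."""

    def __init__(self, fs):
        self.fs = fs
        self.memstore = {}   # edits buffered in memory
        self.quiesced = False

    def put(self, path, data):
        self.memstore[path] = data

    def flush(self):
        if self.quiesced:
            raise RuntimeError("quiesced: not writing to the file system")
        for path, data in self.memstore.items():
            self.fs.write(path, data)
        self.memstore.clear()


def take_snapshot(hbase, fs, name):
    hbase.flush()                 # 1. flush all MemStores
    hbase.quiesced = True         # 2. quiesce: stop writing to the file system
    try:
        fs.create_snapshot(name)  # 3. create the PIT RO snapshot
    finally:
        hbase.quiesced = False    # 4. end quiesce


def run_backup(fs, name, backup_device):
    backup_device.update(fs.snapshots[name])  # 5. read from the RO snapshot
    fs.delete_snapshot(name)                  # 6. delete the snapshot


fs = FakeHDFS()
hbase = FakeHBase(fs)
hbase.put("/hbase/t1/a", b"v1")
take_snapshot(hbase, fs, "backup")

# HBase continues to write to the real directory tree during the backup.
hbase.put("/hbase/t1/a", b"v2")
hbase.flush()

device = {}
run_backup(fs, "backup", device)
```

In this sketch the backup device ends up with the data as of the snapshot, 
while the live tree already holds the newer write.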

> HDFS hardlink
> -------------
>
>                 Key: HDFS-3370
>                 URL: https://issues.apache.org/jira/browse/HDFS-3370
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>            Assignee: Liyin Tang
>         Attachments: HDFS-HardLink.pdf
>
>
> We'd like to add a new feature, hardlink, to HDFS that allows hardlinked files 
> to share data without copying. Currently we will support hardlinking only 
> closed files, but it could be extended to unclosed files as well.
> Among many potential use cases of the feature, the following two are 
> primarily used at Facebook:
> 1. This provides a lightweight way for applications like hbase to create a 
> snapshot;
> 2. This also allows an application like Hive to move a table to a different 
> directory without breaking current running hive queries.
