[ 
https://issues.apache.org/jira/browse/HBASE-6233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408764#comment-13408764
 ] 

Matteo Bertozzi commented on HBASE-6233:
----------------------------------------

{quote}I was wondering if you have any concern around creation of all the 
symlinks on a table of some decent size taking a good bit of time Matteo? The 
window during which the snapshot is being made could be pretty wide. Would that 
be a problem?
{quote}
The time is (fs.rename() * nfiles + fs.symlink() * nfiles), but these are just 
metadata operations on HDFS. I don't have numbers for how long it takes, but I 
can come up with a benchmark, maybe with HDFS under heavy load.
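
A rough sketch of the kind of benchmark mentioned above, run against a local 
filesystem rather than HDFS, so the absolute numbers say nothing about a real 
cluster. It only shows the shape of the cost: one rename plus one symlink per 
file. The function names, the file naming scheme, and the direction of the 
move (file goes to an archive directory, symlink stays at the old path) are 
assumptions made for illustration, not HBase code.

```python
import os
import tempfile
import time


def make_files(directory, nfiles):
    """Create nfiles tiny placeholder files and return their names."""
    names = []
    for i in range(nfiles):
        name = "hfile-%05d" % i
        with open(os.path.join(directory, name), "w") as f:
            f.write("x")
        names.append(name)
    return names


def snapshot_by_rename_and_symlink(table_dir, archive_dir, names):
    """Move each file to archive_dir and leave a symlink at the old path.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    for name in names:
        src = os.path.join(table_dir, name)
        dst = os.path.join(archive_dir, name)
        os.rename(src, dst)    # first metadata operation for this file
        os.symlink(dst, src)   # second metadata operation for this file
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        table_dir = os.path.join(root, "table")
        archive_dir = os.path.join(root, "archive")
        os.mkdir(table_dir)
        os.mkdir(archive_dir)
        names = make_files(table_dir, 1000)
        elapsed = snapshot_by_rename_and_symlink(table_dir, archive_dir, names)
        print("files: %d, elapsed: %.4f s" % (len(names), elapsed))
```

Since the loop is linear in the number of files, the snapshot window grows 
with table size, which is the concern raised in the quoted question.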

Anyway, you need to keep track of the files in some way: either create one 
reference file for each file or add a reference in .META., and both seem much 
heavier since they require interaction with both the namenode and the datanodes.

{quote}we do not delete files; rather we just rename them w/ a '.del' ending 
and leave them in place.{quote}
But if you want to remove the table, these files should be moved.
And by doing this you need to add some logic to the current code so that it 
doesn't read the .del files.
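
To make the extra logic concrete, here is a minimal sketch of the filtering 
that every reader of a store directory would need under the '.del' scheme. 
The suffix comes from the quoted proposal; the function names are 
hypothetical and do not correspond to anything in the HBase code base.

```python
DELETED_SUFFIX = ".del"


def mark_deleted(file_name):
    """Return the name a file would get when it is 'deleted' by renaming."""
    return file_name + DELETED_SUFFIX


def live_store_files(file_names):
    """Return only the files that have not been marked as deleted."""
    return [n for n in file_names if not n.endswith(DELETED_SUFFIX)]


if __name__ == "__main__":
    listing = ["hfile-1", mark_deleted("hfile-2"), "hfile-3"]
    print(live_store_files(listing))
```

Every code path that lists store files would have to apply such a filter, 
which is the added complexity referred to above.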

{quote}
Adding list of files to .META. might make for our being able to do other 
fancyness such as the Accumulo fast table copy, etc.
{quote}
The Accumulo clone table is one of the features that we can easily get with 
snapshots.
I've called it "mount snapshot", which is essentially the Accumulo clone table. 
(Take a look at HBASE-6353 for a description of the snapshot operations.)

Again, if you think about restore with hardlink support, you can easily have 
everything. So we just need to come up with an alternative to hardlinks.
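
For reference, the hardlink behaviour that an alternative has to reproduce 
can be sketched on a local POSIX filesystem, where hardlinks do exist. This 
is only an illustration of the semantics listed in the issue description 
(link to take, link back to restore, delete removes one reference); the 
directory layout and function names are assumptions, and a flat directory of 
regular files is assumed.

```python
import os
import tempfile


def take_snapshot(table_dir, snapshot_dir):
    """Hardlink every file of table_dir into snapshot_dir."""
    os.makedirs(snapshot_dir, exist_ok=True)
    for name in sorted(os.listdir(table_dir)):
        os.link(os.path.join(table_dir, name),
                os.path.join(snapshot_dir, name))


def restore_snapshot(snapshot_dir, table_dir):
    """Hardlink every snapshot file back into table_dir if it is missing."""
    os.makedirs(table_dir, exist_ok=True)
    for name in sorted(os.listdir(snapshot_dir)):
        target = os.path.join(table_dir, name)
        if not os.path.exists(target):
            os.link(os.path.join(snapshot_dir, name), target)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        table_dir = os.path.join(root, "table")
        snapshot_dir = os.path.join(root, ".snapshot", "name")
        os.mkdir(table_dir)
        with open(os.path.join(table_dir, "hfile-1"), "w") as f:
            f.write("data")

        take_snapshot(table_dir, snapshot_dir)
        # Deleting the table's copy removes only one reference to the data.
        os.remove(os.path.join(table_dir, "hfile-1"))
        restore_snapshot(snapshot_dir, table_dir)
        with open(os.path.join(table_dir, "hfile-1")) as f:
            print(f.read())
```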
                
> [brainstorm] snapshots: hardlink alternatives
> ---------------------------------------------
>
>                 Key: HBASE-6233
>                 URL: https://issues.apache.org/jira/browse/HBASE-6233
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>         Attachments: Restore-Snapshot-Hardlink-alternatives.pdf
>
>
> Discussion ticket around snapshots and hardlink alternatives.
> (See the HDFS-3370 discussion about hardlink and implementation problems)
> (taking for a moment WAL out of the discussion and focusing on hfiles)
> With hardlinks available taking snapshot will be fairly easy:
> * (hfiles are immutable)
> * hardlink to .snapshot/name to take snapshot
> * hardlink from .snapshot/name to restore the snapshot
> * No code change needed (on fs.delete() only one reference is deleted)
> but we don't have hardlinks, what are the alternatives?


        
