[ https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855071#action_12855071 ]

Todd Lipcon commented on HBASE-50:
----------------------------------

bq. In the current implementation, a write lock is acquired when the system is 
trying to do a transition. When a snapshot is requested, we can try to 
acquire this write lock. The snapshot is initiated only if the write lock can 
be obtained.

This is the part I'm not clear on. Are you attempting to achieve a simultaneous 
write lock across all region servers in the cluster? We will also have to make 
sure that we "lock" any regions that are currently being moved, etc. Doing this 
without impacting the realtime workload on the cluster is the tricky part, in 
my opinion.

Perhaps, since we're accepting a "sloppy snapshot", we can do without this kind 
of lock, and the master instead sends out a "request for snapshot" and each 
regionserver can do its thing at the appropriate time.
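
The lock-free flow suggested above might look something like the following sketch. All class and method names here are hypothetical illustrations, not actual HBase code: the master broadcasts a request, and each regionserver completes its part at its own pace with no cluster-wide lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of the "sloppy snapshot" protocol discussed above:
// the master broadcasts a snapshot request and collects acknowledgements
// as each regionserver finishes, without any simultaneous write lock.
public class SloppySnapshot {

    // A regionserver handles the request whenever convenient,
    // without blocking other servers or the realtime workload.
    static class RegionServer {
        final String name;
        RegionServer(String name) { this.name = name; }

        // Flush in-memory state and record current store files, then
        // acknowledge. The actual flush/record details are elided.
        String handleSnapshotRequest(String snapshotId) {
            return name + " completed " + snapshotId;
        }
    }

    public static void main(String[] args) {
        List<RegionServer> servers = new ArrayList<>();
        servers.add(new RegionServer("rs1"));
        servers.add(new RegionServer("rs2"));

        // Master broadcasts the request; acknowledgements arrive
        // asynchronously in a real cluster, sequentially in this sketch.
        ConcurrentLinkedQueue<String> acks = new ConcurrentLinkedQueue<>();
        for (RegionServer rs : servers) {
            acks.add(rs.handleSnapshotRequest("snap-001"));
        }

        // The snapshot is done once every server has acknowledged.
        System.out.println(acks.size() == servers.size()
                ? "snapshot complete" : "pending");
    }
}
```

The trade-off, as noted, is that the resulting snapshot is not a point-in-time view across the cluster; each server captures its regions at a slightly different moment.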

bq. If hard links are supported in HDFS, then everything is simple, since HDFS 
will handle the reference counting of the files

Yes, hard links would make it easier, but in a large cluster with thousands of 
regions each with many hfiles and many column families, iterating over every 
store file could be prohibitively expensive if we have to lock everything while 
doing it. I think it's fine if snapshots take 15 seconds to create, but if the 
cluster is frozen for those 15 seconds, it's a much less useful feature, no?
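
To put a rough (entirely hypothetical) number on that cost: even modest per-region figures multiply into tens of thousands of per-file link operations that would all have to happen while everything is locked.

```java
// Back-of-envelope estimate of hard-link operations for one snapshot.
// All sizing figures are hypothetical, just to illustrate the scale.
public class LinkCountEstimate {
    public static void main(String[] args) {
        int regions = 2000;                // regions in the cluster
        int columnFamiliesPerRegion = 3;   // column families per region
        int hfilesPerFamily = 5;           // store files per family

        int linkOps = regions * columnFamiliesPerRegion * hfilesPerFamily;
        System.out.println(linkOps + " hard links to create");
        // 2000 * 3 * 5 = 30000 link operations under the lock
    }
}
```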

bq. If old files are moved aside after a compaction, how do we deal with 
concurrent readers of the snapshot?

The hardlink solution avoids this issue. I was imagining a "snapshot 
manifest" that would point to explicit paths on HDFS.

> Snapshot of table
> -----------------
>
>                 Key: HBASE-50
>                 URL: https://issues.apache.org/jira/browse/HBASE-50
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Billy Pearson
>            Priority: Minor
>
> Having an option to take a snapshot of a table would be very useful in 
> production.
> What I would like to see this option do is merge all the data into 
> one or more files stored in the same folder on the DFS. This way we could 
> save the data in case of a software bug in Hadoop or user code. 
> The other advantage would be the ability to export a table to multiple 
> locations. Say I had a read-only table that must be online. I could take a 
> snapshot of it when needed, export it to a separate data center, and have it 
> loaded there; then I would have it online at multiple data centers for load 
> balancing and failover.
> I understand that Hadoop removes the need for backups to protect against 
> failed servers, but this does not protect us from software bugs that 
> might delete or alter data in ways we did not plan. We should have a way to 
> roll back a dataset.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
