[ 
https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12875943#action_12875943
 ] 

stack commented on HBASE-50:
----------------------------

Comments on '3 Design Overview':

+ How are you going to ensure the table is in 'good status'?  Can you not 
snapshot it whatever its state?  Is it a requirement that all regions be online?
+ I like the bit where we roll the WAL rather than wait on memstore flushes.  
Good.
+ FYI, WAL logs are now archived, not deleted.  Replication needs them.  
Replication might also be managing cleanup of the archives (j-d, what's the 
story here?).  If there is an outstanding snapshot, one that has not been 
deleted, then none of its WALs should be removed.
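
The retention rule in the last point could be sketched roughly as below. This 
is only an illustration in Python, not HBase code: `Snapshot`, 
`removable_wals` and the file names are hypothetical, and it ignores whatever 
replication also needs to keep.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    # Hypothetical record: a snapshot and the archived WAL files it references.
    name: str
    wals: set = field(default_factory=set)
    deleted: bool = False


def removable_wals(archived, snapshots):
    """Return the archived WALs that no outstanding snapshot references."""
    pinned = set()
    for snap in snapshots:
        if not snap.deleted:  # outstanding snapshot: keep all of its WALs
            pinned |= snap.wals
    return sorted(w for w in archived if w not in pinned)


archived = {"wal.1", "wal.2", "wal.3"}
snapshots = [
    Snapshot("s1", {"wal.1", "wal.2"}),       # still outstanding
    Snapshot("s2", {"wal.3"}, deleted=True),  # already deleted
]
print(removable_wals(archived, snapshots))
```

Only the WAL referenced solely by the deleted snapshot comes back as 
removable; a real cleaner would also have to check with replication first.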


Comments on '4 Message Passing Via ZK':

+ Can I say 'snapshot all tables'?  Can I say 'snapshot catalog tables -- meta 
and root tables'?
+ If a RS fails between 'ready' and 'finish', does this mean we abandon the 
snapshot?
+ I'd say if a RS is not ready for snapshot, just fail it.  Something is badly 
wrong if a RS can't snapshot.
+ Would it make sense to have a state between ready and finish, with the data 
in this intermediate state being the RS's progress?
+ Diagram looks good.
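
To make the questions above concrete, here is a rough sketch of the 
per-RS states with an intermediate progress state added. It is an assumption 
for discussion, not what the design report specifies: 'ready' and 'finish' 
come from the report, while `IN_PROGRESS`, `SnapshotTracker` and the 
abandon-on-failure policy are hypothetical. The real state would live in ZK; 
a plain in-memory dict stands in for it here.

```python
from enum import Enum


class State(Enum):
    READY = "ready"
    IN_PROGRESS = "in_progress"  # hypothetical intermediate state
    FINISH = "finish"


class SnapshotTracker:
    """Tracks per-RS snapshot state; a dict stands in for ZK nodes."""

    def __init__(self, servers):
        self.states = {rs: State.READY for rs in servers}
        self.progress = {rs: 0 for rs in servers}
        self.aborted = False

    def report_progress(self, rs, regions_done):
        # Data carried by the intermediate state is the RS's progress.
        self.states[rs] = State.IN_PROGRESS
        self.progress[rs] = regions_done

    def finish(self, rs):
        self.states[rs] = State.FINISH

    def server_failed(self, rs):
        # One possible policy: abandon the snapshot if a RS dies
        # before it has reached 'finish'.
        if self.states[rs] is not State.FINISH:
            self.aborted = True

    def is_complete(self):
        return not self.aborted and all(
            s is State.FINISH for s in self.states.values()
        )
```

With this shape, a RS failing between 'ready' and 'finish' marks the whole 
snapshot as abandoned, and the master can read `progress` for any RS that is 
still working.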





> Snapshot of table
> -----------------
>
>                 Key: HBASE-50
>                 URL: https://issues.apache.org/jira/browse/HBASE-50
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Billy Pearson
>            Assignee: Li Chongxin
>            Priority: Minor
>         Attachments: HBase Snapshot Design Report V2.pdf, snapshot-src.zip
>
>
> Having an option to take a snapshot of a table would be very useful in 
> production.
> What I would like to see this option do is do a merge of all the data into 
> one or more files stored in the same folder on the dfs. This way we could 
> save data in case of a software bug in hadoop or user code. 
> The other advantage would be to be able to export a table to multiple 
> locations. Say I had a read_only table that must be online. I could take a 
> snapshot of it when needed and export it to a separate data center and have 
> it loaded there, and then I would have it online at multiple data centers 
> for load balancing and failover.
> I understand that hadoop takes away the need for having backups to protect 
> from failed servers, but this does not protect us from software bugs that 
> might delete or alter data in ways we did not plan. We should have a way we 
> can roll back a dataset.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
