[ https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876259#action_12876259 ]

Li Chongxin commented on HBASE-50:
----------------------------------

@Stack, Thanks for the comments. Here are some replies and follow-up 
questions.

bq. + I don't think you should take on requirement 1), only the hbase admin can 
create a snapshot. There is no authentication/access control in hbase currently 
- its coming but not here yet - and without it, this would be hard for you to 
enforce.

I think I didn't state it properly. I know access control is not included in 
hbase currently. What I mean is that snapshot should be exposed on the 
HBaseAdmin class rather than on HTable. Isn't the division of client-side 
operations between these two classes also in anticipation of the access 
control that is coming in the future?
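
For illustration, a minimal sketch of the proposed placement - the snapshot 
call itself is hypothetical, not in the current API:

{code:java}
// Hypothetical usage sketched against the current client API: snapshot
// is an administrative operation, so it would sit on HBaseAdmin next to
// createTable/disableTable, not on HTable with the per-row operations.
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class SnapshotUsage {
  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
    admin.disableTable("mytable");      // existing admin operation
    // Proposed addition (hypothetical signature):
    // admin.snapshot("mytable", "snap-20100610");
    admin.enableTable("mytable");
  }
}
{code}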

bq. + Regards requirement 2., I'd suggest that how the snapshot gets copied out 
from under hbase should also be outside the scope of your work. I'd say your 
work is making a viable snapshot that can be copied with perhaps some tests to 
prove it works - that might copy off data - but in general, i'd say how actual 
copying is done is outside of the scope of this issue. 

Strictly speaking, requirement 2 is not about how the snapshot is copied out 
from under hbase. In the current design, table data is not actually copied 
when a snapshot is taken. To keep it fast, the snapshot just captures the 
state of the table, in particular references to all of the table files. So 
requirement 2 is just about making sure the table data (the hfiles, really) is 
not mutated while the snapshot is taken.
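
To make "captures the state" concrete, here is a rough sketch, not the actual 
patch: the snapshot records references to the store files in a manifest 
instead of copying them. The directory walk and the manifest layout are 
assumptions for the sketch.

{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SnapshotManifest {
  // Walk region/family dirs under the table dir and record a reference
  // to every hfile; no hfile data is read or copied.
  public static void writeManifest(Configuration conf, Path tableDir,
      Path snapshotDir) throws IOException {
    FileSystem fs = FileSystem.get(conf);
    FSDataOutputStream out = fs.create(new Path(snapshotDir, "manifest"));
    try {
      for (FileStatus region : fs.listStatus(tableDir)) {
        if (!region.isDir()) continue;
        for (FileStatus family : fs.listStatus(region.getPath())) {
          if (!family.isDir()) continue;
          for (FileStatus hfile : fs.listStatus(family.getPath())) {
            out.writeUTF(hfile.getPath().toString());
          }
        }
      }
    } finally {
      out.close();
    }
  }
}
{code}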

bq. + How you going to ensure table is in 'good status'. Can you not snapshot 
it whatever its state? All regions being on line is a requirement?

For tables that are disabled, all regions being online should not be a 
requirement. As for 'good status', what I'm thinking of is that a table region 
could be in the PENDING_OPEN or PENDING_CLOSE state, in which case it might be 
half open, and I'm not sure whether the RS or the master should take on the 
responsibility of performing the snapshot at that point. On the other hand, if 
the table is completely opened or closed, the snapshot can be taken by the RS 
or the master.

bq. + FYI, wal logs are now archived, not deleted. Replication needs them. 
Replication might also be managing clean up of the archives (j-d, whats the 
story here?) If there is an outstanding snapshot, one that has not been deleted, then 
none of its wals should be removed.

Great. In the current design, WAL log files are the only data files that are 
actually copied. If they are now archived instead of deleted, we can create 
references to the log files, just as we do for hfiles, instead of copying the 
actual data. That would further shorten the snapshot time. Another 
LogCleanerDelegate, say ReferencedLogCleaner, could be created to check 
whether a log file can be deleted, taking outstanding snapshots into account, 
along the lines of the sketch below. What do you think?
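
A minimal sketch, assuming the LogCleanerDelegate interface as implemented by 
the existing TimeToLiveLogCleaner (the exact signature may differ on trunk); 
isLogReferencedBySnapshot is a hypothetical helper that would consult the 
snapshot manifests:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.master.LogCleanerDelegate;

public class ReferencedLogCleaner implements LogCleanerDelegate {
  private Configuration conf;

  // Keep any archived WAL that an outstanding (not yet deleted)
  // snapshot still references.
  public boolean isLogDeletable(Path filePath) {
    return !isLogReferencedBySnapshot(filePath);
  }

  // Hypothetical helper: scan the manifests of all outstanding
  // snapshots for a reference to this log file.
  private boolean isLogReferencedBySnapshot(Path filePath) {
    return false; // placeholder for the sketch
  }

  public void setConf(Configuration conf) {
    this.conf = conf;
  }

  public Configuration getConf() {
    return conf;
  }
}
{code}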

bq. + I can say 'snapshot' all tables? Can I say 'snapshot catalog tables - 
meta and root tables?'

I think a snapshot of .META. works fine, but a snapshot of the root table is a 
little tricky. When a snapshot is taken of a user table, .META. is updated to 
keep track of the file references. If the .META. table is snapshotted, -ROOT- 
can be updated to keep track of the file references. But where do we keep the 
file references for the -ROOT- table (region) if it is snapshotted - still in 
-ROOT-? And should this newly updated file reference information itself also 
be included in the snapshot?

bq. + If a RS fails between 'ready' and 'finish', does this mean we abandon the 
snapshot?

Yes. If a RS fails between 'ready' and 'finish', it should notify the client 
or the master, whichever orchestrates the snapshot, and the client or the 
master will then send a signal via ZK to stop the snapshot on all RSs. 
Something like the sketch below.
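
A rough sketch under an assumed znode layout (the /hbase/snapshot/<name>/abort 
path is made up for illustration):

{code:java}
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

public class SnapshotAbort {
  // Orchestrator side (client or master): create the abort znode to
  // fan the stop signal out to every RS.
  static void signalAbort(ZooKeeper zk, String snapshot) throws Exception {
    zk.create("/hbase/snapshot/" + snapshot + "/abort", new byte[0],
        Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
  }

  // RS side: watch for the abort znode and tear down local snapshot
  // work when it appears.
  static void watchForAbort(ZooKeeper zk, String snapshot) throws Exception {
    zk.exists("/hbase/snapshot/" + snapshot + "/abort", new Watcher() {
      public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeCreated) {
          // stopLocalSnapshot(); // hypothetical RS cleanup hook
        }
      }
    });
  }
}
{code}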

bq. + I'd say if RS is not ready for snapshot, just fail it. Something is badly 
wrong if a RS can't snapshot.

Currently there is a timeout on snapshot readiness. When a RS is ready, it 
waits for all the other RSs to become ready, and then the snapshot starts on 
all RSs. Otherwise the ready RSs time out and the snapshot does not start on 
any RS. It's a synchronous approach - see the sketch below. Do you think this 
is appropriate? Will it create too much load to perform the snapshot 
concurrently on the RSs? (Jonathan prefers an asynchronous method.)
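
For clarity, a sketch of that synchronous barrier, again under an assumed 
znode layout; the polling loop just keeps the sketch short (a watch would do 
the same job):

{code:java}
import java.util.List;
import org.apache.zookeeper.ZooKeeper;

public class ReadyBarrier {
  // Each ready RS registers an ephemeral child under .../ready (path
  // assumed created by the orchestrator beforehand). Start the snapshot
  // only if every RS checks in before the timeout; otherwise fail it so
  // that no RS starts.
  static boolean waitForAllReady(ZooKeeper zk, String snapshot,
      int regionServerCount, long timeoutMs) throws Exception {
    String readyPath = "/hbase/snapshot/" + snapshot + "/ready";
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      List<String> ready = zk.getChildren(readyPath, false);
      if (ready.size() >= regionServerCount) {
        return true; // all RSs ready: signal start
      }
      Thread.sleep(100);
    }
    return false; // timed out: abort, snapshot starts on no RS
  }
}
{code}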

bq. + Would it make sense for there to be a state between ready and finish and 
the data in this intermediate state would be the RS's progress?

Do you mean a znode is created for each RS to keep track of its progress? Then 
how do you define a RS's progress, and what data would be kept in this znode?

Thanks again for the comments. I will update the design document based on them.


> Snapshot of table
> -----------------
>
>                 Key: HBASE-50
>                 URL: https://issues.apache.org/jira/browse/HBASE-50
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Billy Pearson
>            Assignee: Li Chongxin
>            Priority: Minor
>         Attachments: HBase Snapshot Design Report V2.pdf, snapshot-src.zip
>
>
> Having an option to take a snapshot of a table would be very useful in 
> production.
> What I would like to see this option do is a merge of all the data into 
> one or more files stored in the same folder on the dfs. This way we could 
> save data in case of a software bug in hadoop or user code. 
> The other advantage would be to be able to export a table to multiple 
> locations. Say I had a read_only table that must be online. I could take a 
> snapshot of it when needed and export it to a separate data center and have 
> it loaded there, and then I would have it online at multiple data centers 
> for load balancing and failover.
> I understand that hadoop takes away the need for backups to protect from 
> failed servers, but this does not protect us from software bugs that might 
> delete or alter data in ways we did not plan. We should have a way to roll 
> back a dataset.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
