[ https://issues.apache.org/jira/browse/HBASE-13031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Purtell updated HBASE-13031:
-----------------------------------
    Fix Version/s:     (was: 0.98.15)
                   0.98.16

> Ability to snapshot based on a key range
> ----------------------------------------
>
>                 Key: HBASE-13031
>                 URL: https://issues.apache.org/jira/browse/HBASE-13031
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: churro morales
>            Assignee: churro morales
>             Fix For: 2.0.0, 1.3.0, 0.98.16
>
>         Attachments: HBASE-13031-v1.patch, HBASE-13031.patch
>
>
> I posted this on the mailing list, and it seems some people are interested.
> A little background for everyone.
> We have a very large table that we would like to snapshot and transfer to
> another cluster (compressed data is always better to ship). Our problem is
> that transferring all of the data could take many weeks, and during that
> time major compactions could double the data stored in DFS, which would
> cause us to run out of disk space.
> So we were thinking about adding the ability to snapshot a specific key
> range.
> Ideally, the user would specify a start and stop key, each corresponding to
> a region boundary. If the boundaries change (due to merging or splitting of
> regions) between the time the user submits the request and the time the
> snapshot is taken, the snapshot should fail.
> We would know which regions to snapshot, and if those changed between when
> the request was submitted and when the regions were locked, the snapshot
> could simply fail and the user would try again, rather than potentially
> receiving more or less data than anticipated. I was planning on storing the
> start and stop keys in the SnapshotDescription; from there it looks pretty
> straightforward: we just have to change the verifier code to accommodate
> the key ranges.
> If this design sounds good to anyone, or if I am overlooking anything,
> please let me know. Once we agree on the design, I'll write and submit the
> patches.
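
The fail-on-boundary-change behaviour proposed in the description can be
illustrated with a small standalone sketch. This is not taken from the
attached patches and does not use any HBase API: the Region class, the
String keys (an empty string marks the open start or end of the table), and
the selectRegions / boundariesUnchanged helpers are all hypothetical names
chosen for illustration. It assumes the region list is sorted by start key
and contiguous.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class KeyRangeSnapshotSketch {

    // Hypothetical stand-in for a region covering [startKey, endKey).
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Region)) {
                return false;
            }
            Region r = (Region) o;
            return startKey.equals(r.startKey) && endKey.equals(r.endKey);
        }

        @Override
        public int hashCode() {
            return Objects.hash(startKey, endKey);
        }
    }

    // Returns the contiguous regions that exactly cover [startKey, stopKey).
    // Throws if either key does not fall on a region boundary.
    static List<Region> selectRegions(List<Region> regions, String startKey, String stopKey) {
        List<Region> selected = new ArrayList<>();
        boolean inRange = false;
        for (Region r : regions) {
            if (!inRange && r.startKey.equals(startKey)) {
                inRange = true;
            }
            if (inRange) {
                selected.add(r);
                if (r.endKey.equals(stopKey)) {
                    return selected;
                }
            }
        }
        throw new IllegalArgumentException(
            "Key range [" + startKey + ", " + stopKey + ") does not align with region boundaries");
    }

    // Re-resolves the key range at snapshot time and compares it with the
    // regions recorded at request time. Any split or merge inside the range
    // changes the list, so the snapshot would be rejected.
    static boolean boundariesUnchanged(List<Region> atRequest, List<Region> atSnapshot,
                                       String startKey, String stopKey) {
        try {
            return selectRegions(atSnapshot, startKey, stopKey).equals(atRequest);
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<Region> atRequest = Arrays.asList(
            new Region("", "b"), new Region("b", "d"),
            new Region("d", "f"), new Region("f", ""));
        List<Region> wanted = selectRegions(atRequest, "b", "f");

        // Region [b, d) splits into [b, c) and [c, d) before the snapshot runs.
        List<Region> afterSplit = Arrays.asList(
            new Region("", "b"), new Region("b", "c"), new Region("c", "d"),
            new Region("d", "f"), new Region("f", ""));

        System.out.println(boundariesUnchanged(wanted, atRequest, "b", "f"));
        System.out.println(boundariesUnchanged(wanted, afterSplit, "b", "f"));
    }
}
```

In this sketch the second check fails even though the outer keys "b" and "f"
still sit on region boundaries, because the set of regions inside the range
changed. That matches the stricter reading of the proposal (fail whenever
the regions to snapshot change); a real implementation would compare
whatever region identity the snapshot verifier already tracks.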
-- This message was sent by Atlassian JIRA (v6.3.4#6332)