[
https://issues.apache.org/jira/browse/HBASE-7735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13610868#comment-13610868
]
Jonathan Hsieh commented on HBASE-7735:
---------------------------------------
[~terry_zhang], there definitely is a comment in the code about coordinating
using the region name list as opposed to the region server list (a legacy of the
initial implementation), and it is similar to what you are suggesting. Doing that
could make snapshots robust in the face of region moves -- a region moves but
it is still the same region. I think that with the approach you suggest, splits
will still cause failures.
Region splits and merges introduce a different problem because now the
region identities change. Since those operations will likely be guarded by the
table read lock (not the exclusive write lock), we'd need something else to
either fail-fast (current behavior could be improved to do this faster),
detect-and-recover, or block the final steps of a merge or split.
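To make the move-versus-split distinction concrete, here is a minimal sketch in
plain Java. It does not use any HBase APIs; the class, the method names, and the
region and server names are all hypothetical and are not the actual snapshot
code. It only illustrates why keying on region names tolerates a move but not a
split.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: maps region name -> hosting region server.
class SnapshotMembershipSketch {

    // Coordinating on region names: passes as long as the same regions exist,
    // no matter which server currently hosts them.
    static boolean sameRegions(Map<String, String> before, Map<String, String> after) {
        return before.keySet().equals(after.keySet());
    }

    // Coordinating on region servers: passes only if every region is still
    // hosted by the same server as when the snapshot started.
    static boolean sameAssignments(Map<String, String> before, Map<String, String> after) {
        return before.equals(after);
    }

    public static void main(String[] args) {
        Map<String, String> start = new HashMap<>();
        start.put("r1", "rs1");
        start.put("r2", "rs2");

        // A move: r1 is reassigned from rs1 to rs2, but its identity is unchanged.
        Map<String, String> afterMove = new HashMap<>(start);
        afterMove.put("r1", "rs2");

        // A split: r1 disappears and two daughter regions take its place.
        Map<String, String> afterSplit = new HashMap<>(start);
        afterSplit.remove("r1");
        afterSplit.put("r1a", "rs1");
        afterSplit.put("r1b", "rs1");

        System.out.println("move,  by name:   " + sameRegions(start, afterMove));
        System.out.println("move,  by server: " + sameAssignments(start, afterMove));
        System.out.println("split, by name:   " + sameRegions(start, afterSplit));
    }
}
```

In this sketch the by-name check survives the move, while the split changes the
key set, so the snapshot would still have to fail fast, recover, or block the
split from completing.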
Do you agree?
> Prevent regions from moving during online snapshot.
> ---------------------------------------------------
>
> Key: HBASE-7735
> URL: https://issues.apache.org/jira/browse/HBASE-7735
> Project: HBase
> Issue Type: Sub-task
> Reporter: Jonathan Hsieh
>
> To increase the probability of snapshots succeeding, we should attempt to
> prevent splits and region moves from happening. Currently we take region
> locks but this could be "too late" and results in an aborted snapshot.
> We should probably take the table lock (0.96) when starting a snapshot and
> for a 0.94 backport we should probably disable the balancer.
> This will probably not be tackled until after trunk merge.