[
https://issues.apache.org/jira/browse/HBASE-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tianying Chang updated HBASE-6070:
----------------------------------
@stack
Thanks. I would like to get a second opinion from others. I think it is better
to do this in a separate JIRA, so I have created HBASE-7058 for this purpose.
If nobody finds any other potential problem, I can provide a patch.
> AM.nodeDeleted and SSH races creating problems for regions under SPLIT
> ----------------------------------------------------------------------
>
> Key: HBASE-6070
> URL: https://issues.apache.org/jira/browse/HBASE-6070
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.92.1, 0.94.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 0.92.2, 0.94.1, 0.96.0
>
> Attachments: HBASE-6070_0.92_1.patch, HBASE-6070_0.92.patch,
> HBASE-6070_0.94_1.patch, HBASE-6070_0.94.patch, HBASE-6070_trunk_1.patch,
> HBASE-6070_trunk.patch
>
>
> We tried to address the problems with Master restart and RS restart while a
> region SPLIT is in progress as part of HBASE-5806.
> While looking into this further, we found that there is still one race
> condition:
> 1. The split has just started and the znode is in RS_SPLIT state.
> 2. The RS goes down.
> 3. The first callback for SSH comes.
> 4. As part of the fix for HBASE-5806, SSH knows that some region is in RIT.
> 5. But now the nodeDeleted event comes for the SPLIT node, and there we try
> to delete the region from RIT.
> 6. After this, we check in the SSH whether any node is in RIT. As we
> don't find the region in RIT, the region is never assigned.
> When we fixed HBASE-5806, step 6 happened first and then step 5 happened, so
> we missed this case. Now we have found it. Will come up with a patch shortly.
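
The ordering problem described above can be illustrated with a small, single-threaded model. This is only a sketch under stated assumptions: the class `SplitRaceSketch`, its methods, and the region name are hypothetical stand-ins, not the actual AssignmentManager or ServerShutdownHandler code. It shows that the region is assigned only when the SSH check of RIT (step 6) runs before nodeDeleted (step 5) clears the entry.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the race; none of these names come from HBase itself.
class SplitRaceSketch {
    // Stand-in for the master's regions-in-transition (RIT) bookkeeping.
    private final Set<String> regionsInTransition = new HashSet<>();
    // Regions that the simulated SSH decided to assign.
    private final Set<String> assigned = new HashSet<>();

    // Step 1: the split starts, so the region is tracked in RIT.
    void splitStarted(String region) {
        regionsInTransition.add(region);
    }

    // Step 5: the nodeDeleted event for the SPLIT znode removes the RIT entry.
    void nodeDeleted(String region) {
        regionsInTransition.remove(region);
    }

    // Step 6: the SSH assigns the region only if it still finds it in RIT.
    void sshCheckRit(String region) {
        if (regionsInTransition.contains(region)) {
            assigned.add(region);
        }
    }

    // Runs steps 5 and 6 in the requested order and reports whether the
    // region ended up assigned.
    static boolean regionAssigned(boolean nodeDeletedFirst) {
        SplitRaceSketch model = new SplitRaceSketch();
        String region = "region-A"; // illustrative name only
        model.splitStarted(region);
        if (nodeDeletedFirst) {
            model.nodeDeleted(region);
            model.sshCheckRit(region);
        } else {
            model.sshCheckRit(region);
            model.nodeDeleted(region);
        }
        return model.assigned.contains(region);
    }

    public static void main(String[] args) {
        System.out.println("SSH check before nodeDeleted: assigned="
                + regionAssigned(false));
        System.out.println("nodeDeleted before SSH check: assigned="
                + regionAssigned(true));
    }
}
```

In the real system the two events arrive on different threads, so either ordering can occur; the model fixes the order explicitly only to make the outcome of each case visible.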