[ https://issues.apache.org/jira/browse/HBASE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13277172#comment-13277172 ]
nkeywal commented on HBASE-5926:
--------------------------------

The race condition is reduced to a production-acceptable minimum, imho. We do a compare & delete in the Java code, so the remaining race window is between the comparison and the delete. We fail if, and only if, within that window the session expires, the master znode is deleted, and the backup master recreates the znode. That's unlikely.

> Delete the master znode after a master crash
> --------------------------------------------
>
> Key: HBASE-5926
> URL: https://issues.apache.org/jira/browse/HBASE-5926
> Project: HBase
> Issue Type: Improvement
> Components: master, scripts
> Affects Versions: 0.96.0
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 5926.v6.patch
>
>
> This is the continuation of the work done in HBASE-5844.
> But we can't apply exactly the same strategy: for the region server, there is
> a znode per region server, while for the master & backup master there is a
> single znode for both.
> So if we apply the same strategy as for a regionserver, we may have this
> scenario:
> 1) Master starts
> 2) Backup master starts
> 3) Master dies
> 4) ZK detects it
> 5) Backup master receives the update from ZK
> 6) Backup master creates the new master node and becomes the main master
> 7) Previous master script continues
> 8) Previous master script deletes the master node in ZK
> 9) => issue: we deleted the node just created by the new master
> This should not happen often (usually the znode will be deleted soon enough),
> but it can happen.
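
The compare & delete described in the comment can be illustrated with a minimal sketch. It is not the code from 5926.v6.patch: `FakeZk` is a hypothetical in-memory stand-in for ZooKeeper, and `deleteIfEquals`, the `/hbase/master` path, and the `master-a:60000` / `master-b:60000` addresses are names assumed for illustration only. The sketch deletes the master znode only if it still holds the address of the master that crashed, which covers the scenario from the issue description where the backup master has already recreated the znode.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MasterZNodeCleanup {

    /** Hypothetical in-memory stand-in for ZooKeeper: znode path -> stored master address. */
    static final class FakeZk {
        private final Map<String, String> nodes = new ConcurrentHashMap<>();

        void create(String path, String data) { nodes.put(path, data); }

        /** Returns the stored data, or null if the znode does not exist. */
        String getData(String path) { return nodes.get(path); }

        void delete(String path) { nodes.remove(path); }

        boolean exists(String path) { return nodes.containsKey(path); }
    }

    /**
     * Compare, then delete: removes the znode only if it still names the
     * crashed master. Returns true if the znode was deleted.
     */
    static boolean deleteIfEquals(FakeZk zk, String path, String expectedMaster) {
        String current = zk.getData(path);
        if (current == null || !current.equals(expectedMaster)) {
            // Znode is gone or already belongs to another master: leave it alone.
            return false;
        }
        // Remaining race window: the comparison and the delete are two separate
        // steps, so the znode could still change owner between them.
        zk.delete(path);
        return true;
    }

    public static void main(String[] args) {
        FakeZk zk = new FakeZk();
        String path = "/hbase/master";            // illustrative path

        zk.create(path, "master-a:60000");        // 1) master starts
        zk.delete(path);                          // 3-4) master dies, ZK drops its znode
        zk.create(path, "master-b:60000");        // 6) backup master takes over

        // 7-8) the previous master's script runs its cleanup
        boolean deleted = deleteIfEquals(zk, path, "master-a:60000");
        System.out.println("deleted by old master's script: " + deleted);
        System.out.println("new master's znode still present: " + zk.exists(path));
    }
}
```

The comparison does not close the race completely, because the check and the delete are not atomic; it only narrows the window to the interval between the two calls, as the comment notes.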