[
https://issues.apache.org/jira/browse/ACCUMULO-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915851#comment-13915851
]
Mike Drob commented on ACCUMULO-2422:
-------------------------------------
I think I'm missing part of the picture here. The first master gets the
lock, and the second master sets up a watch. After the first master dies and
restarts, what prevents it from getting the lock again? Presumably there is no
contention for it, right?
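
For illustration only, here is a toy in-memory model of the lock-and-watch
arrangement described above. It assumes a ZooKeeper-style recipe in which each
master creates a sequentially numbered node, the lowest number holds the lock,
and a waiter watches the holder's node. That ordering rule is an assumption,
not something stated in this thread, and the names ToyLockQueue, enqueue,
watch, and remove are hypothetical; this is neither the ZooKeeper client API
nor Accumulo's lock code.

```python
class ToyLockQueue:
    """In-memory stand-in for a lock directory (hypothetical, not ZooKeeper)."""

    def __init__(self):
        self._next_seq = 0
        self._nodes = {}    # sequence number -> owner name
        self._watches = {}  # watched sequence number -> callback

    def enqueue(self, owner):
        """Register a candidate; assumed rule: lowest sequence holds the lock."""
        seq = self._next_seq
        self._next_seq += 1
        self._nodes[seq] = owner
        return seq

    def holder(self):
        """Return the owner of the lowest-numbered node, or None if empty."""
        if not self._nodes:
            return None
        return self._nodes[min(self._nodes)]

    def watch(self, seq, callback):
        """Ask to be called once when the node with this sequence goes away."""
        self._watches[seq] = callback

    def remove(self, seq):
        """Simulate a process dying: drop its node and fire any watch on it."""
        owner = self._nodes.pop(seq)
        callback = self._watches.pop(seq, None)
        if callback is not None:
            callback(owner)


events = []
queue = ToyLockQueue()
first = queue.enqueue("master-1")    # first master gets the lock
backup = queue.enqueue("master-2")   # second master queues behind it
queue.watch(first, lambda gone: events.append(
    f"{gone} gone, holder is now {queue.holder()}"))

queue.remove(first)                      # first master dies
restarted = queue.enqueue("master-1")    # first master restarts
```

Under these assumptions the restarted master receives a higher sequence number
than the backup, so it queues behind the backup rather than reclaiming the
lock. The toy does not reproduce the hang reported in this issue; it only
shows the behaviour one would expect if the backup's watch fires normally.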
> Backup master can miss acquiring lock when primary exits
> --------------------------------------------------------
>
> Key: ACCUMULO-2422
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2422
> Project: Accumulo
> Issue Type: Bug
> Components: fate, master
> Affects Versions: 1.5.0
> Reporter: Bill Havanki
> Assignee: Bill Havanki
> Priority: Critical
> Labels: failover, locking
>
> While running randomwalk tests with agitation for the 1.5.1 release, I've
> seen situations where a backup master that is eligible to grab the master
> lock continues to wait. When this condition arises and the other master
> restarts, both wait for the lock without success.
> I cannot reproduce the problem reliably, and I think more investigation is
> needed to see what circumstances could be causing the problem.