dlmarion opened a new issue, #6197:
URL: https://github.com/apache/accumulo/issues/6197

   When restarting processes after an unclean stop, an exception can be thrown 
in a ZooKeeper Watcher thread. This can happen when a process is killed and 
then restarted before the ZooKeeper server realizes the old process is dead 
and removes its lock.
   
   The restarted process calls `ServiceLock.tryLock`, which calls 
`ServiceLock.lock` and does the following:
   
     1. Creates an ephemeral node at some path `p`
     2. Sets a watcher on the node created in step 1
     3. In `determineLockOwnership()`, sorts the children of `p` to determine 
whether the node created in step 1 is the first child
     4. The node in step 1 is *not* the first child because the ZK server has 
not yet removed the prior ephemeral node left by the server process that was 
killed
     5. Establishes a Watcher on the node prior to the node in step 1
     6. The ZK server deletes the prior ephemeral node concurrently with step 5
     7. Because the prior node no longer exists in ZooKeeper, 
`determineLockOwnership` is called again; this time it obtains the lock and 
sets `createdNodeName` to null
     8. The watcher for the prior node fires because the ZK server removed it 
and calls `determineLockOwnership` at line 363, which throws an 
`IllegalStateException` because `createdNodeName` is null
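
   The sequence above can be replayed without ZooKeeper using a small 
in-memory model. This is only a sketch of the ordering problem, not the 
actual `ServiceLock` code: the class `LockRaceSketch`, its methods, the node 
names, and the `guarded` flag are all hypothetical and exist only for 
illustration. The guard shown (ignoring a stale watcher event once the lock 
is held) is one possible way to avoid the exception and is an assumption, 
not necessarily the fix the project will adopt.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical in-memory model of the race; NOT Accumulo's ServiceLock. */
class LockRaceSketch {
  // Stands in for the children of path p in ZooKeeper.
  private final List<String> children = new ArrayList<>();
  // Set to null once the lock is obtained, as described in step 7.
  private String createdNodeName;
  private boolean lockHeld;

  LockRaceSketch(String priorNode, String ourNode) {
    children.add(priorNode); // ephemeral node left by the killed process
    children.add(ourNode);   // step 1: node created by the restarted process
    createdNodeName = ourNode;
  }

  /** Models the ZK server removing the killed process's ephemeral node. */
  void serverDeletes(String node) {
    children.remove(node);
  }

  /** Models steps 3 and 7: take the lock if our node sorts first. */
  void determineLockOwnership() {
    if (createdNodeName == null) {
      throw new IllegalStateException("createdNodeName is null");
    }
    List<String> sorted = new ArrayList<>(children);
    Collections.sort(sorted);
    if (sorted.get(0).equals(createdNodeName)) {
      lockHeld = true;
      createdNodeName = null;
    }
  }

  /** Models the watcher on the prior node firing (step 8). */
  void onPriorNodeDeleted(boolean guarded) {
    if (guarded && lockHeld) {
      return; // assumed guard: the event is stale, the lock is already held
    }
    determineLockOwnership();
  }

  /** Replays steps 1 to 8; returns true if step 8 throws. */
  static boolean replay(boolean guarded) {
    LockRaceSketch lock = new LockRaceSketch("zlock#0000000001", "zlock#0000000002");
    lock.determineLockOwnership();          // steps 3-4: not the first child
    lock.serverDeletes("zlock#0000000001"); // step 6: prior node removed
    lock.determineLockOwnership();          // step 7: lock obtained
    try {
      lock.onPriorNodeDeleted(guarded);     // step 8: late watcher event
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }
}
```

   In this model `LockRaceSketch.replay(false)` hits the 
`IllegalStateException` in step 8, while `LockRaceSketch.replay(true)` 
returns without throwing because the late event is dropped.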

