Github user mkedwards commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/707#discussion_r235467449
  
    --- Diff: 
zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/QuorumPeer.java
 ---
    @@ -1556,32 +1573,50 @@ public String getNextDynamicConfigFilename() {
         }
         
         public void setLastSeenQuorumVerifier(QuorumVerifier qv, boolean 
writeToDisk){
    -        synchronized (QV_LOCK) {
    -            if (lastSeenQuorumVerifier != null && 
lastSeenQuorumVerifier.getVersion() > qv.getVersion()) {
    -                LOG.error("setLastSeenQuorumVerifier called with stale 
config " + qv.getVersion() +
    -                        ". Current version: " + 
quorumVerifier.getVersion());
    +        // If qcm is non-null, we may call qcm.connectOne(), which will 
take the lock on qcm
    +        // and then take QV_LOCK.  Take the locks in the same order to 
ensure that we don't
    +        // deadlock against other callers of connectOne().  If qcmRef gets 
set in another
    +        // thread while we're inside the synchronized block, that does no 
harm; if we didn't
    +        // take a lock on qcm (because it was null when we sampled it), we 
won't call
    +        // connectOne() on it.  (Use of an AtomicReference is enough to 
guarantee visibility
    +        // of updates that provably happen in another thread before 
entering this method.)
    +        QuorumCnxManager qcm = qcmRef.get();
    +        Object outerLockObject = (qcm != null) ? qcm : QV_LOCK;
    --- End diff --
    
    It does.  I'm fairly sure that this guarantees that the `QuorumCnxManager` 
lock can only be taken inside `QV_LOCK` if it's already held outside it.  
Perhaps we can get confirmation from @castuardo and his team's Distributed 
System Model Checking tool?


---

Reply via email to