GitHub user nicktrav commented on the issue:

    https://github.com/apache/zookeeper/pull/330
  
    @DanBenediktson - I've been looking into writing a test for this patch, but 
I can't seem to reproduce the case you describe on the original ticket.
    
    Specifically:
    
    > The exact code path it goes through in this case is complicated, because 
there has to be a previously-closed socket still waiting in the selector 
(otherwise, the first timeout evaluation will not fail because "now" still 
hasn't been updated, and then the actual connect timeout will be applied in 
ClientCnxnSocket.doTransport()) so that select() will harvest the IO from the 
previous socket and updateNow(), resulting in the next loop through 
ClientCnxnSocket.SendThread.run() observing the spurious timeout and failing.
    
    Are you able to provide some more details on how this client can get into 
this state? Walking through the code, I'm having difficulty understanding how 
the client can end up in a reconnect loop.
    
    We are keen to see this patch land as it would make a fix for 
ZOOKEEPER-2869 inherently safer.


