GitHub user kevinconaway commented on the pull request:

    https://github.com/apache/storm/pull/639#issuecomment-219860409
  
    Actually it looks like this issue
    
    >There is another more serious issue that led to huge problems in one of 
our topologies whenever a worker crashed due to some exception.
    >If worker A successfully connects to worker B for the first time during 
startup but worker B closes the connection for some reason before the 
:worker-active-flag is set to true (here 
https://github.com/apache/storm/blob/v0.9.6/storm-core/src/clj/backtype/storm/daemon/worker.clj#L356),
 there will be no further reconnect attempts, since no messages will be 
processed and neither send() nor flushMessages() will ever be called.
    
    may be fixed by STORM-1609 with the addition of the client keepalive 
TimerTask
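
To illustrate the failure mode and why a periodic check would help, here is a
minimal sketch in Java. It is not Storm's Netty client and not the actual
STORM-1609 change; every name in it (KeepaliveSketch, checkConnection,
startKeepalive, the connected flag) is hypothetical, and the "reconnect" is
only simulated by flipping a flag. It assumes, as the quoted text describes,
that the connection state is otherwise only checked from send().

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names come from the Storm code base.
class KeepaliveSketch {
    // Stand-in for the state of a worker-to-worker connection.
    final AtomicBoolean connected = new AtomicBoolean(true);
    // Counts simulated reconnects so the effect can be observed.
    final AtomicInteger reconnects = new AtomicInteger();
    // Daemon timer thread, so it never keeps the JVM alive on its own.
    private final Timer timer = new Timer("client-keepalive", true);

    // Without a keepalive, this is the only path that notices a closed
    // connection. If no message is ever sent, it is never reached.
    void send(String message) {
        checkConnection();
        // ... write the message to the channel ...
    }

    // Simulated reconnect: if the peer closed the connection, restore it.
    void checkConnection() {
        if (connected.compareAndSet(false, true)) {
            reconnects.incrementAndGet();
        }
    }

    // Periodic check, so reconnecting no longer depends on traffic.
    void startKeepalive(long periodMs) {
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                checkConnection();
            }
        }, periodMs, periodMs);
    }

    void stop() {
        timer.cancel();
    }

    public static void main(String[] args) throws InterruptedException {
        KeepaliveSketch client = new KeepaliveSketch();
        client.startKeepalive(50);
        client.connected.set(false); // peer closes before any message is sent
        Thread.sleep(500);           // no send() happens during this time
        System.out.println("reconnects = " + client.reconnects.get());
        client.stop();
    }
}
```

In this sketch the timer task notices the closed connection even though
send() is never called, which is the behaviour the keepalive is hoped to
provide.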


