GitHub user Sergeant007 opened a pull request:
https://github.com/apache/storm/pull/304
[STORM-537] A worker reconnects infinitely to another dead worker
A fix for [STORM-537](https://issues.apache.org/jira/browse/STORM-537). The
bug: when a worker tries to send a batch of messages to another worker that is
dead, it reconnects infinitely, because each message in the batch triggers a
new reconnect. More details are in the JIRA issue.
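
The failure mode described above can be illustrated with a minimal sketch. This is not the actual patch and not Storm's Netty client: `ReconnectSketch`, `DeadPeer`, `Client`, `maxRetries`, and the `giveUpAfterFailure` flag are hypothetical names introduced only for illustration. The sketch assumes that each reconnect is bounded by a retry limit and shows how, without a "give up" state, the number of connection attempts still grows with every message in the batch.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical model of the failure mode; NOT the actual Storm client code.
public class ReconnectSketch {

    /** Simulated peer that is permanently down: every connect attempt fails. */
    static class DeadPeer {
        int connectAttempts = 0;

        boolean connect() {
            connectAttempts++;
            return false;
        }
    }

    /** Simplified sender that reconnects before delivering each message. */
    static class Client {
        private final DeadPeer peer;
        private final int maxRetries;
        private final boolean giveUpAfterFailure; // assumption: models the fixed behavior
        private boolean closed = false;

        Client(DeadPeer peer, int maxRetries, boolean giveUpAfterFailure) {
            this.peer = peer;
            this.maxRetries = maxRetries;
            this.giveUpAfterFailure = giveUpAfterFailure;
        }

        /** One bounded reconnect cycle; returns true if the peer was reached. */
        private boolean reconnect() {
            for (int i = 0; i < maxRetries; i++) {
                if (peer.connect()) {
                    return true;
                }
            }
            return false;
        }

        /** Returns the number of messages delivered. */
        int sendBatch(List<String> batch) {
            int sent = 0;
            for (String msg : batch) {
                if (closed) {
                    break; // drop the rest of the batch instead of reconnecting again
                }
                if (reconnect()) {
                    sent++;
                } else if (giveUpAfterFailure) {
                    closed = true; // remember that the peer is unreachable
                }
            }
            return sent;
        }
    }

    /** Counts connect attempts made while sending one batch to a dead peer. */
    static int attemptsFor(boolean giveUpAfterFailure, int batchSize, int maxRetries) {
        DeadPeer peer = new DeadPeer();
        Client client = new Client(peer, maxRetries, giveUpAfterFailure);
        client.sendBatch(Collections.nCopies(batchSize, "msg"));
        return peer.connectAttempts;
    }

    public static void main(String[] args) {
        System.out.println(attemptsFor(false, 100, 3)); // prints 300
        System.out.println(attemptsFor(true, 100, 3));  // prints 3
    }
}
```

With the flag off, a batch of 100 messages and a limit of 3 retries produces 300 connection attempts; with the flag on, the client stops after the first failed cycle. How the real fix records the failed state may differ from this sketch.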
The pull request contains a simple fix and tests.
"test-reconnect-to-permanently-failed-server" covers exactly this bug.
"test-reconnect-to-temporarily-failed-server" was added as an extra, because
that functionality is not covered by other tests.
Note that Storm with the fix applied works well and resolved the issue in our
staging environment.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/Sergeant007/storm storm-537-infinite-reconnection
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/304.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #304
----
commit 1aacccf286829e9289d86a6ed10b23cb2b21bc47
Author: Sergey Tryuber <[email protected]>
Date: 2014-10-29T15:27:56Z
[STORM-537] A worker reconnects infinitely to another dead worker
----