[
https://issues.apache.org/jira/browse/STORM-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956077#comment-14956077
]
ASF GitHub Bot commented on STORM-1106:
---------------------------------------
Github user HeartSaVioR commented on the pull request:
https://github.com/apache/storm/pull/795#issuecomment-147896620
@kishorvpatil
To confirm that I understand this issue correctly, could you elaborate on
it a bit more?
As far as I understand, when the retry count is exceeded, Connect.run() throws
a RuntimeException, but the worker is not killed because Connect is a TimerTask.
So it just closes the connection and waits for that worker to be reassigned.
If Nimbus reassigns the dead worker to another one after the retry limit is
exceeded, a new connection is made and everything is fine.
But if for some reason (for example, a stop-the-world (STW) pause) the
problematic worker is unable to connect to other workers for longer than the
connection retry period, but not forever, and Nimbus doesn't reassign the
problematic worker, the other workers will never be able to connect to it again.
Is my assumption right? Or is there another reason?
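The point above, that an exception thrown from a timer task does not bring down the whole process, can be illustrated with a minimal sketch. This is not Storm's code: it uses java.util.Timer as a stand-in for whatever timer the Netty client actually uses, and the class RetryLimitSketch, its nested Connect class, and the maxRetries parameter are hypothetical names chosen for illustration.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class RetryLimitSketch {

    // Hypothetical stand-in for a reconnect task; NOT Storm's actual Connect class.
    static class Connect extends TimerTask {
        private final int maxRetries;
        private final AtomicInteger attempts = new AtomicInteger();
        private final CountDownLatch gaveUp = new CountDownLatch(1);

        Connect(int maxRetries) {
            this.maxRetries = maxRetries;
        }

        @Override
        public void run() {
            int n = attempts.incrementAndGet();
            if (n > maxRetries) {
                gaveUp.countDown();
                // The exception escapes only into the timer thread.
                // The rest of the process keeps running.
                throw new RuntimeException("Giving up after " + maxRetries + " retries");
            }
            // Assume every attempt fails; the timer simply runs the task again.
        }
    }

    /** Returns how many times run() was invoked, including the call that gave up. */
    static int runUntilGiveUp(int maxRetries) throws InterruptedException {
        Timer timer = new Timer("reconnect-timer", true); // daemon timer thread
        Connect task = new Connect(maxRetries);
        timer.schedule(task, 0L, 10L); // retry every 10 ms
        if (!task.gaveUp.await(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("task never gave up");
        }
        // Reaching this line shows the calling thread survived the task's exception.
        return task.attempts.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int attempts = runUntilGiveUp(3);
        System.out.println("attempts=" + attempts + ", process still alive");
    }
}
```

In this sketch the timer thread dies with the exception and nothing ever retries again, while the calling thread continues. That is the analogue of a worker that stays up but no longer attempts to reconnect.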
> Netty Client Connection Attempts should not be limited
> ------------------------------------------------------
>
> Key: STORM-1106
> URL: https://issues.apache.org/jira/browse/STORM-1106
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Affects Versions: 0.10.0
> Reporter: Kishor Patil
> Assignee: Kishor Patil
> Priority: Blocker
>
> The workers should not give-up making connection with other workers. This
> could cause the worker to be blocked forever.