Please unsubscribe me from your list. The amount of mail received is just unbearable. Thank you.
2014-10-31 8:22 GMT+01:00 ASF GitHub Bot (JIRA) <[email protected]>:

>
> [ https://issues.apache.org/jira/browse/STORM-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191532#comment-14191532 ]
>
> ASF GitHub Bot commented on STORM-537:
> --------------------------------------
>
> Github user Sergeant007 commented on the pull request:
>
> https://github.com/apache/storm/pull/304#issuecomment-61226979
>
> @d2r , @harshach , @HeartSaVioR
> Could you please take a look at this pull request?
>
>
> > A worker reconnects infinitely to another dead worker
> > -----------------------------------------------------
> >
> > Key: STORM-537
> > URL: https://issues.apache.org/jira/browse/STORM-537
> > Project: Apache Storm
> > Issue Type: Bug
> > Affects Versions: 0.9.3
> > Reporter: Sergey Tryuber
> >
> > We're using 0.9.3-rc1. Most probably this wrong behavior was introduced as a side effect of STORM-409. When I kill a worker, another worker starts to print messages like:
> > {noformat}
> > 2014-10-20 11:45:03 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-<HOST>:4706... [0]
> > 2014-10-20 11:45:03 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-<HOST>:4706... [1]
> > 2014-10-20 11:45:03 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-<HOST>:4706... [2]
> > ..... so on
> > {noformat}
> > Then it reaches the default 300 max_retries and starts the cycle again:
> > {noformat}
> > 2014-10-20 11:54:38 b.s.m.n.Client [INFO] connection established to a remote host Netty-Client-<HOST>:4706, [id: 0xec088412, /<HOST>:39795 :> <HOST>:4706]
> > 2014-10-20 11:54:38 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-<HOST>:4706... [0]
> > 2014-10-20 11:54:38 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-<HOST>:4706... [1]
> > 2014-10-20 11:54:38 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-<HOST>:4706... [2]
> > {noformat}
> > And so on infinitely...
> > The issue is most probably in the backtype.storm.messaging.netty.Client#connect method, in the following place, which determines whether we give up on reconnection:
> > {code}
> > if (null != channel) {
> >     LOG.info("connection established to a remote host " + name() + ", " + channel.toString());
> >     channelRef.set(channel);
> > } else {
> >     close();
> >     throw new RuntimeException("Remote address is not reachable. We will close this client " + name());
> > }
> > {code}
> > I guess (not tried yet) that the _channel_ object is not _null_ if this is a real reconnection. So the method returns a _channel_ object and then reconnection starts again and again.
> > This might be fixed by explicitly adding *current = null;* to the following code block of the same method:
> > {code}
> > if (!future.isSuccess()) {
> >     if (null != current) {
> >         current.close();
> >     }
> > }
> > {code}
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>
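For reference, the hypothesis in the quoted issue can be illustrated with a minimal, self-contained Java sketch. It does not reproduce Storm's or Netty's code: ReconnectSketch, FakeChannel, Attempt, connect and the clearOnFailure flag are all hypothetical names made up for this illustration, and the reporter states that the guess was not yet tested. The sketch only shows how a non-null reference left over from a failed attempt could pass a "null != channel" check, and how clearing it after close() would let the give-up branch run.

```java
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from Storm or Netty.
class ReconnectSketch {

    /** Stand-in for a network channel (not the Netty Channel API). */
    static final class FakeChannel {
        boolean open = true;

        void close() {
            open = false;
        }
    }

    /** Outcome of one connection attempt: a channel object plus a success flag. */
    static final class Attempt {
        final FakeChannel channel;
        final boolean success;

        Attempt(FakeChannel channel, boolean success) {
            this.channel = channel;
            this.success = success;
        }
    }

    /**
     * Makes up to maxRetries + 1 attempts. When clearOnFailure is true, the
     * reference to a failed channel is dropped after closing it, which is the
     * change suggested in the issue. Throws if no attempt leaves a channel.
     */
    static FakeChannel connect(Supplier<Attempt> dial, int maxRetries, boolean clearOnFailure) {
        FakeChannel current = null;
        for (int tried = 0; tried <= maxRetries; tried++) {
            Attempt attempt = dial.get();
            current = attempt.channel;
            if (attempt.success) {
                break;
            }
            if (current != null) {
                current.close();
                if (clearOnFailure) {
                    current = null; // the suggested "current = null;"
                }
            }
        }
        if (current != null) {
            return current; // a caller would treat this as "connection established"
        }
        throw new RuntimeException("Remote address is not reachable");
    }

    public static void main(String[] args) {
        // Simulates a dead peer: every attempt fails but still yields a channel object.
        Supplier<Attempt> deadPeer = () -> new Attempt(new FakeChannel(), false);

        FakeChannel stale = connect(deadPeer, 3, false);
        System.out.println("without fix: returned channel open = " + stale.open);

        try {
            connect(deadPeer, 3, true);
        } catch (RuntimeException e) {
            System.out.println("with fix: gave up: " + e.getMessage());
        }
    }
}
```

Without the flag, connect returns a closed but non-null channel, so a caller cannot tell that every attempt failed. With the flag, the retry budget is honoured and the method throws. Whether the real client behaves this way is exactly what the issue leaves open.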
