Hi - 
We are seeing workers die and restart frequently, apparently because of Netty 
connection issues.

For example, the log below shows the following for the worker at 
10.27.13.121:6700 (abbreviated here as 121:6700):
* Reconnect for worker at 121:6700
* connection established to 121:6700
* closing connection to 121:6700
* Reconnect started to 121:6700

all within 1 second.

We have updated the Netty config to:
storm.messaging.netty.max_retries: 30
storm.messaging.netty.max_wait_ms: 10000
storm.messaging.netty.min_wait_ms: 1000
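
To put those settings in perspective, here is a rough sketch of the time a 
client could spend retrying before it gives up. It assumes (this is an 
assumption, not something confirmed from the Storm code) that every retry waits 
somewhere between min_wait_ms and max_wait_ms. The function name 
retry_budget_seconds is made up for illustration.

```python
# Values copied from the config above.
MAX_RETRIES = 30
MIN_WAIT_MS = 1000
MAX_WAIT_MS = 10000


def retry_budget_seconds(retries, min_wait_ms, max_wait_ms):
    """Return (lower, upper) bounds in seconds on total time spent waiting.

    Assumption: each retry sleeps at least min_wait_ms and at most
    max_wait_ms. The real backoff schedule may differ.
    """
    lower = retries * min_wait_ms / 1000
    upper = retries * max_wait_ms / 1000
    return lower, upper


low, high = retry_budget_seconds(MAX_RETRIES, MIN_WAIT_MS, MAX_WAIT_MS)
print(f"retry budget: between {low:.0f}s and {high:.0f}s")
```

Under that assumption a worker has between 30 seconds and 5 minutes to get a 
stable connection before the retries run out.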

The workers still die fairly quickly, because 30 retries often do not result 
in a connection. 

Any suggestions for how to prevent Netty from closing a connection 
immediately? I could not see any obvious reason in the code why this would 
happen.

Thanks
Tyson

2014-09-26 09:32:03 b.s.m.n.Client [INFO] Reconnect started for 
Netty-Client-/10.27.13.121:6700... [5]
2014-09-26 09:32:04 b.s.m.n.Client [INFO] Reconnect started for 
Netty-Client-/10.27.13.121:6701... [6]
2014-09-26 09:32:11 b.s.m.n.Client [INFO] Reconnect started for 
Netty-Client-/10.27.10.180:6701... [6]
2014-09-26 09:32:12 b.s.m.n.Client [INFO] Reconnect started for 
Netty-Client-/10.27.10.180:6702... [6]
2014-09-26 09:32:13 b.s.m.n.Client [INFO] Reconnect started for 
Netty-Client-/10.27.13.121:6700... [6]
2014-09-26 09:32:14 b.s.m.n.Client [INFO] Reconnect started for 
Netty-Client-/10.27.13.121:6701... [7]
2014-09-26 09:32:18 b.s.m.n.Client [INFO] Reconnect started for 
Netty-Client-/10.27.13.121:6700... [7]
2014-09-26 09:32:18 b.s.m.n.Client [INFO] connection established to a remote 
host Netty-Client-/10.27.13.121:6700, [id: 0xb8b33bef, /10.27.10.180:33880 => 
/10.27.13.121:6700]
2014-09-26 09:32:18 b.s.m.n.Client [INFO] Closing Netty Client 
Netty-Client-/10.27.13.121:6700
2014-09-26 09:32:18 b.s.m.n.Client [INFO] Waiting for pending batchs to be sent 
with Netty-Client-/10.27.13.121:6700..., timeout: 600000ms, pendings: 0
2014-09-26 09:32:19 b.s.m.n.Client [INFO] New Netty Client, connect to 
10.27.13.121, 6700, config: , buffer_size: 5242880
2014-09-26 09:32:19 b.s.m.n.Client [INFO] Reconnect started for 
Netty-Client-/10.27.13.121:6700... [0]
2014-09-26 09:32:19 b.s.m.n.Client [INFO] connection established to a remote 
host Netty-Client-/10.27.13.121:6700, [id: 0x9dc224e6, /10.27.10.180:33881 => 
/10.27.13.121:6700]
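
In case it helps anyone reproduce the pattern on their own logs, below is a 
minimal sketch that groups the Client messages by endpoint and flags a 
"connection established" followed by "Closing Netty Client" within one second. 
It only assumes the line format shown above (timestamp first, long lines 
wrapped onto the next line). The helper names unwrap, events_by_endpoint and 
quick_closes are hypothetical, not part of Storm.

```python
import re
from collections import defaultdict
from datetime import datetime

TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) ")
ENDPOINT = re.compile(r"Netty-Client-/([\d.]+:\d+)")
# Message fragments taken from the log above, mapped to short event names.
EVENTS = {
    "Reconnect started": "reconnect",
    "connection established": "established",
    "Closing Netty Client": "closing",
}


def unwrap(lines):
    """Join wrapped continuation lines onto the record that starts with a timestamp."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def events_by_endpoint(lines):
    """Map 'ip:port' to a list of (datetime, event name) in log order."""
    timeline = defaultdict(list)
    for record in unwrap(lines):
        ts = TIMESTAMP.match(record)
        ep = ENDPOINT.search(record)
        if not ts or not ep:
            continue
        kind = next((name for text, name in EVENTS.items() if text in record), None)
        if kind:
            when = datetime.strptime(ts.group(1), "%Y-%m-%d %H:%M:%S")
            timeline[ep.group(1)].append((when, kind))
    return timeline


def quick_closes(timeline, within_seconds=1):
    """Return (endpoint, established_at, closed_at) for connections closed right away."""
    hits = []
    for endpoint, events in timeline.items():
        for (t1, k1), (t2, k2) in zip(events, events[1:]):
            if (k1, k2) == ("established", "closing") \
                    and (t2 - t1).total_seconds() <= within_seconds:
                hits.append((endpoint, t1, t2))
    return hits


# A few lines copied from the log above, including the wrapping.
SAMPLE = """\
2014-09-26 09:32:18 b.s.m.n.Client [INFO] Reconnect started for
Netty-Client-/10.27.13.121:6700... [7]
2014-09-26 09:32:18 b.s.m.n.Client [INFO] connection established to a remote
host Netty-Client-/10.27.13.121:6700, [id: 0xb8b33bef, /10.27.10.180:33880 =>
/10.27.13.121:6700]
2014-09-26 09:32:18 b.s.m.n.Client [INFO] Closing Netty Client
Netty-Client-/10.27.13.121:6700
""".splitlines()

for endpoint, opened, closed in quick_closes(events_by_endpoint(SAMPLE)):
    print(endpoint, "established", opened.time(), "closed", closed.time())
```

Timestamps in the log only have one-second resolution, so "within one second" 
here really means "same or next second".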
