Hello,

After increasing the config parameters I no longer see the ZooKeeper SUSPENDED message, but the worker is still restarted on another machine. Does this have anything to do with the Netty connection settings?

storm.zookeeper.session.timeout: 40000
storm.zookeeper.connection.timeout: 30000
storm.messaging.transport: "backtype.storm.messaging.netty.Context"
storm.messaging.netty.buffer_size: 209715200
storm.messaging.netty.max_retries: 10
storm.messaging.netty.max_wait_ms: 5000
storm.messaging.netty.min_wait_ms: 10000

2015-02-18 15:13:47 b.s.m.n.Client [INFO] New Netty Client, connect to realtimeslave1.novalocal, 6702, config: , buffer_size: 209715200
2015-02-18 15:13:47 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-realtimeslave1.novalocal/10.0.0.14:6702... [0]
2015-02-18 15:13:47 b.s.m.n.Client [INFO] Closing Netty Client Netty-Client-realtimeslave1.novalocal/10.0.0.14:6703
2015-02-18 15:13:47 b.s.m.n.Client [INFO] Waiting for pending batchs to be sent with Netty-Client-realtimeslave1.novalocal/10.0.0.14:6703..., timeout: 600000ms, pendings: 0
2015-02-18 15:13:52 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-realtimeslave1.novalocal/10.0.0.14:6702... [1]
2015-02-18 15:13:57 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-realtimeslave1.novalocal/10.0.0.14:6702... [2]
2015-02-18 15:14:02 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-realtimeslave1.novalocal/10.0.0.14:6702... [3]
2015-02-18 15:14:03 b.s.m.n.Client [INFO] connection established to a remote host Netty-Client-realtimeslave1.novalocal/10.0.0.14:6702, [id: 0x320fd4e4, /10.0.0.11:48658 => realtimeslave1.novalocal/10.0.0.14:6702]

On Tue, Feb 17, 2015 at 10:31 PM, Tousif <[email protected]> wrote:

> Thanks,
> I will try out these config properties.
> On Feb 17, 2015 7:58 PM, "Harsha" <[email protected]> wrote:
>
>> You might be losing the ZooKeeper connection. Try increasing these two values:
>> storm.zookeeper.session.timeout: 20000
>> storm.zookeeper.connection.timeout: 15000
>>
>> On Tue, Feb 17, 2015, at 06:03 AM, Tousif wrote:
>>
>> Hello,
>>
>> I have a bolt which uses a pool of large objects.
>> When the pool reinitialises (once every 4 hours), the bolt waits for a few seconds and
>> disconnects from ZooKeeper.
>>
>> I have specified the following properties in yaml, but the worker still dies.
>>
>> supervisor.worker.start.timeout.secs 300
>> supervisor.worker.timeout.secs 60
>>
>> Here are the logs:
>>
>> 2015-02-17 04:35:01 o.a.z.ClientCnxn [INFO] Client session timed out, have not heard from server in 15906ms for sessionid 0x14b9200ea400009, closing socket connection and attempting reconnect
>> 2015-02-17 04:35:01 o.a.c.f.s.ConnectionStateManager [INFO] State change: SUSPENDED
>> 2015-02-17 04:35:01 o.a.c.f.s.ConnectionStateManager [WARN] There are no ConnectionStateListeners registered.
>> 2015-02-17 04:35:01 b.s.cluster [WARN] Received event :disconnected::none: with disconnected Zookeeper.
>> 2015-02-17 04:35:02 o.a.z.ClientCnxn [INFO] Opening socket connection to server realtimeanalytics.novalocal/10.0.0.11:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
>> 2015-02-17 04:35:02 o.a.z.ClientCnxn [INFO] Socket connection established to realtimeanalytics.novalocal/10.0.0.11:2181, initiating session
>> 2015-02-17 04:35:02 o.a.z.ClientCnxn [INFO] Session establishment complete on server realtimeanalytics.novalocal/10.0.0.11:2181, sessionid = 0x14b9200ea400009, negotiated timeout = 20000
>> 2015-02-17 04:35:02 o.a.c.f.s.ConnectionStateManager [INFO] State change: RECONNECTED
>> 2015-02-17 04:35:02 o.a.c.f.s.ConnectionStateManager [WARN] There are no ConnectionStateListeners registered.
>> 2015-02-17 04:35:33 o.a.z.ClientCnxn [INFO] Client session timed out, have not heard from server in 13499ms for sessionid 0x14b9200ea400009, closing socket connection and attempting reconnect
>> 2015-02-17 04:35:34 o.a.c.f.s.ConnectionStateManager [INFO] State change: SUSPENDED
>> 2015-02-17 04:35:34 b.s.cluster [WARN] Received event :disconnected::none: with disconnected Zookeeper.
>> 2015-02-17 04:35:34 o.a.c.f.s.ConnectionStateManager [WARN] There are no ConnectionStateListeners registered.
>>
>> --
>> Regards
>> Tousif Khazi

--
Regards
Tousif Khazi
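
A note on the Netty settings quoted at the top of this message: storm.messaging.netty.min_wait_ms (10000) is larger than storm.messaging.netty.max_wait_ms (5000), which may be worth ruling out as a factor. The following is only a minimal sketch of that sanity check. The helper names check_netty_wait_bounds and clamped_delay_ms are hypothetical, and the assumption that the client caps each retry delay at max_wait_ms has not been verified against the Storm source; it is merely consistent with the reconnect attempts in the log being 5 seconds apart.

```python
# Values copied from the storm.yaml settings quoted in this message.
conf = {
    "storm.messaging.netty.max_retries": 10,
    "storm.messaging.netty.max_wait_ms": 5000,
    "storm.messaging.netty.min_wait_ms": 10000,
}


def check_netty_wait_bounds(conf):
    """Hypothetical helper: return warnings for inconsistent retry waits."""
    warnings = []
    lo = conf["storm.messaging.netty.min_wait_ms"]
    hi = conf["storm.messaging.netty.max_wait_ms"]
    if lo > hi:
        warnings.append(
            f"min_wait_ms ({lo}) is greater than max_wait_ms ({hi})"
        )
    return warnings


def clamped_delay_ms(conf):
    """Assumed behaviour (not verified): a retry delay never exceeds max_wait_ms."""
    return min(
        conf["storm.messaging.netty.min_wait_ms"],
        conf["storm.messaging.netty.max_wait_ms"],
    )


for warning in check_netty_wait_bounds(conf):
    print(warning)
print("assumed delay between reconnects (ms):", clamped_delay_ms(conf))
```

If the assumption holds, every reconnect attempt would wait the full max_wait_ms, so swapping the two values (or lowering min_wait_ms) would be the first thing to try.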
