[Contd..]
BTW, I'm running Storm 2.1.0 on OpenJDK 11, and I can reproduce this
issue every time during a pacemaker rolling bounce.
Today I downgraded to Storm 2.0.0 and ran into the same issue. Has anyone
else faced this? Let me know if you need more details.
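
To make the symptom concrete, here is a minimal Java sketch of slot-based
request/response matching as I understand it in general. It is a toy model,
not the actual PacemakerClient code: the class SlotCorrelator, its methods,
and the assumption that pending slots are cleared on reconnect are all
hypothetical. It only shows one way a response carrying a valid message_id
could land on an empty slot.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Toy model of slot-based request/response matching.
// Hypothetical illustration only; this is NOT Storm's PacemakerClient.
class SlotCorrelator {
    private final String[] pending;                          // request parked per slot, null if empty
    private final Deque<Integer> free = new ArrayDeque<>();  // unused slot ids

    SlotCorrelator(int maxPending) {
        pending = new String[maxPending];
        for (int i = 0; i < maxPending; i++) {
            free.add(i);
        }
    }

    // Park a request; the returned slot id is used as its message_id.
    synchronized int put(String request) {
        Integer slot = free.poll();
        if (slot == null) {
            throw new IllegalStateException("no free slot");
        }
        pending[slot] = request;
        return slot;
    }

    // Match a response to its request. Returns null when the slot is empty,
    // which is the "No message for slot" case.
    synchronized String onResponse(int messageId) {
        String request = pending[messageId];
        if (request == null) {
            return null;
        }
        pending[messageId] = null;
        free.add(messageId);
        return request;
    }

    // ASSUMPTION: a reconnect drops everything in flight and resets the slots.
    synchronized void resetOnReconnect() {
        Arrays.fill(pending, null);
        free.clear();
        for (int i = 0; i < pending.length; i++) {
            free.add(i);
        }
    }

    public static void main(String[] args) {
        SlotCorrelator c = new SlotCorrelator(4);
        int id = c.put("pulse");
        c.resetOnReconnect();                 // connection bounced while waiting
        System.out.println(c.onResponse(id)); // slot is empty, so no match
    }
}
```

If the real client behaves anything like this, every retry for the same
message_id would keep finding an empty slot, which would match the retry
loop in the logs. I have not confirmed this against the PacemakerClient
source.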

On Sun, Nov 17, 2019 at 7:48 PM Sharath Raghavan <[email protected]>
wrote:

> Hello everyone,
> I'm working on upgrading Storm from 1.2.1 to 2.1.0. While performing some
> fault tolerance testing, I noticed some odd behavior.
>
> *Scenario:*
> I submitted a topology and everything works fine. Now, if I bounce the
> pacemaker server that the pacemaker client is connected to, the client
> fails to heartbeat and gets stuck in a retry loop forever.
>
> From what I understand, the pacemaker server receives the SEND_PULSE
> messages and responds correctly, and the client receives each
> SEND_PULSE_RESPONSE with the right message_id. However, the client then
> fails to look up the message it sent previously. The log says:
> "No message for slot: <message_id>".
>
> Any idea why this could be?
>
> Detailed logs below -
> 2019-11-18 03:04:19,023+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:159 - Sending pacemaker message to host0979.com:
> HBMessage(type:SEND_PULSE, data:<HBMessageData
> pulse:HBPulse(id:/workerbeats/if-usmsc-streams-p1-19_7-storm2-SNAPSHOT-3-1574046188/c40d6ed2-c0f7-414b-8452-6af03036caef-<ip>-6700,
> details:1F 8B 08 00 00 00 00 00 00 00 E5 5D 0B 7C 96 B5 D5 0F 50 04 A1 A5
> 17 0A 82 C2 A8 80 DC 0B 45 CA 55 11 54 2E 0A 8A 5C 3A 01 07 02 52 44 94 4B
> A1 45 C1 21 A2 02 DE 50 51 D0 79 C1 FD EA 74 8A 8A 13 1D 2A 6E BA 5F 99 A8
> 38 71 63 EA 26 4A 51 E6 07 53 3F DD 07 9B B8 39 71 FB BE F2 A4 4F C2 DB E4
> 9C E4 24 EF DE 2D EF D7 1F 6E 6E E7 FF 27 49 93 7F 9E 93 9C E4 A4 29 AB C7
> 18 EB 3B E7 D2 C2...)>)
> 2019-11-18 03:04:19,024+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:165 - Put message in slot: 1 for host0979.com
> 2019-11-18 03:04:19,028+0000  [host0979.com-pm-1] DEBUG
> PacemakerClientHandler:43 - Got Message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:1)
> 2019-11-18 03:04:19,028+0000  [host0979.com-pm-1] DEBUG
> PacemakerClient:216 - Pacemaker client got message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:1)
> 2019-11-18 03:04:19,028+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:179 - Got Response: HBMessage(type:SEND_PULSE_RESPONSE,
> data:null, message_id:1)
>
> 2019-11-18 03:05:19,041+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:159 - Sending pacemaker message to host0979.com:
> HBMessage(type:SEND_PULSE, data:<HBMessageData
> pulse:HBPulse(id:/workerbeats/if-usmsc-streams-p1-19_7-storm2-SNAPSHOT-3-1574046188/c40d6ed2-c0f7-414b-8452-6af03036caef-<ip>-6700,
> details:1F 8B 08 00 00 00 00 00 00 00 E5 9D 09 98 56 C5 99 EF 5F 36 6D 43
> A3 88 A8 08 74 D3 3B 2D C8 BE 35 8A 01 14 15 E2 86 82 0A 04 04 64 8F EC B4
> BB D1 06 41 8D 41 41 24 88 09 3E 83 A6 EF C4 05 13 50 16 31 10 35 B6 19 46
> A3 62 42 AE A2 E8 70 91 64 D0 90 E8 B8 E4 D1 C8 24 43 9F E2 54 F1 7D 55 EF
> 5B F5 56 9D 7C 93 D3 D7 49 66 BC B7 FE FF AE 53 5F D5 AF 4E AD EF 69 0E 8D
> 00 A0 CF 8C A9 5D...)>)
> 2019-11-18 03:05:19,041+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:165 - Put message in slot: 2 for host0979.com
> 2019-11-18 03:05:19,044+0000  [host0979.com-pm-1] DEBUG
> PacemakerClientHandler:43 - Got Message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:2)
> 2019-11-18 03:05:19,044+0000  [host0979.com-pm-1] DEBUG
> PacemakerClient:216 - Pacemaker client got message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:2)
> 2019-11-18 03:05:19,044+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:179 - Got Response: HBMessage(type:SEND_PULSE_RESPONSE,
> data:null, message_id:2)
>
>
>
> *------- <Bounced pacemaker server @ host0979.com> -------*
>
> 2019-11-18 03:06:19,054+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:159 - Sending pacemaker message to host0979.com:
> HBMessage(type:SEND_PULSE, data:<HBMessageData
> pulse:HBPulse(id:/workerbeats/if-usmsc-streams-p1-19_7-storm2-SNAPSHOT-3-1574046188/c40d6ed2-c0f7-414b-8452-6af03036caef-<ip>-6700,
> details:1F 8B 08 00 00 00 00 00 00 00 ED 7D 0B 94 56 C5 95 EE A6 01 45 40
> 40 A3 23 30 18 10 45 50 FA F1 F7 FB 41 2B 60 30 20 22 0F 41 45 14 BA 91 97
> 08 34 84 97 20 2A 18 11 11 05 89 BC BA D5 44 54 12 8C 90 2C B8 A2 92 51 07
> 34 4C 24 2B 4C 02 09 C9 90 84 49 48 F4 5E 88 71 46 46 E5 C6 18 35 B7 39 C5
> A9 E2 FF AB F6 AE DA 75 CE FC 97 D3 F7 F6 02 69 D7 FE BE 53 A7 4E D5 57 67
> 9F AA 5D BB DA 40...)>)
>
> *2019-11-18 03:06:19,055+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:165 - Put message in slot: 3 for host0979.com*
> 2019-11-18 03:06:20,056+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 9 more attempts for host0979.com.
> 2019-11-18 03:06:21,056+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 8 more attempts for host0979.com.
> 2019-11-18 03:06:22,057+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 7 more attempts for host0979.com.
> 2019-11-18 03:06:23,057+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 6 more attempts for host0979.com.
> 2019-11-18 03:06:24,057+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 5 more attempts for host0979.com.
> 2019-11-18 03:06:25,058+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 4 more attempts for host0979.com.
> 2019-11-18 03:06:26,058+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 3 more attempts for host0979.com.
> 2019-11-18 03:06:27,059+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 2 more attempts for host0979.com.
> 2019-11-18 03:06:28,059+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 1 more attempts for host0979.com.
> 2019-11-18 03:06:29,059+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 0 more attempts for host0979.com.
> 2019-11-18 03:06:32,275+0000  [executor-heartbeat-timer] ERROR
> rejectedExecution:770 - Failed to submit a listener notification task.
> Event loop shut down?
> java.util.concurrent.RejectedExecutionException: event executor terminated
> at
> org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:855)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:328)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:321)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:778)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.util.concurrent.DefaultPromise.safeExecute(DefaultPromise.java:768)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:432)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.util.concurrent.DefaultPromise.setFailure(DefaultPromise.java:112)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.channel.DefaultChannelPromise.setFailure(DefaultChannelPromise.java:89)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.safeExecute(AbstractChannelHandlerContext.java:1010)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:610)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:465)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.channel.DefaultChannelPipeline.close(DefaultChannelPipeline.java:1003)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.channel.AbstractChannel.close(AbstractChannel.java:238)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.pacemaker.PacemakerClient.close_channel(PacemakerClient.java:260)
> ~[storm-client-2.1.0.jar:2.1.0]
> at
> org.apache.storm.pacemaker.PacemakerClient.close(PacemakerClient.java:267)
> ~[storm-client-2.1.0.jar:2.1.0]
> at
> org.apache.storm.pacemaker.PacemakerClientPool.rotateClients(PacemakerClientPool.java:92)
> ~[storm-client-2.1.0.jar:2.1.0]
> at
> org.apache.storm.pacemaker.PacemakerClientPool.send(PacemakerClientPool.java:54)
> ~[storm-client-2.1.0.jar:2.1.0]
> at
> org.apache.storm.cluster.PaceMakerStateStorage.set_worker_hb(PaceMakerStateStorage.java:127)
> ~[storm-client-2.1.0.jar:2.1.0]
> at
> org.apache.storm.cluster.StormClusterStateImpl.workerHeartbeat(StormClusterStateImpl.java:509)
> ~[storm-client-2.1.0.jar:2.1.0]
> at
> org.apache.storm.daemon.worker.Worker.doExecutorHeartbeats(Worker.java:372)
> ~[storm-client-2.1.0.jar:2.1.0]
> at org.apache.storm.StormTimer$1.run(StormTimer.java:110)
> [storm-client-2.1.0.jar:2.1.0]
> at org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:226)
> [storm-client-2.1.0.jar:2.1.0]
> 2019-11-18 03:06:32,276+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:261 - channel host0979.com/<ip>:6699 closed
> 2019-11-18 03:06:32,276+0000  [executor-heartbeat-timer] ERROR
> PaceMakerStateStorage:138 - couldn't get response after 10 attempts. Failed
> to set_worker_hb. Will make 9 more attempts.
> 2019-11-18 03:06:32,278+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:159 - Sending pacemaker message to host0977.com:
> HBMessage(type:SEND_PULSE, data:<HBMessageData
> pulse:HBPulse(id:/workerbeats/if-usmsc-streams-p1-19_7-storm2-SNAPSHOT-3-1574046188/c40d6ed2-c0f7-414b-8452-6af03036caef-<ip>-6700,
> details:1F 8B 08 00 00 00 00 00 00 00 ED 7D 0B 94 56 C5 95 EE A6 01 45 40
> 40 A3 23 30 18 10 45 50 FA F1 F7 FB 41 2B 60 30 20 22 0F 41 45 14 BA 91 97
> 08 34 84 97 20 2A 18 11 11 05 89 BC BA D5 44 54 12 8C 90 2C B8 A2 92 51 07
> 34 4C 24 2B 4C 02 09 C9 90 84 49 48 F4 5E 88 71 46 46 E5 C6 18 35 B7 39 C5
> A9 E2 FF AB F6 AE DA 75 CE FC 97 D3 F7 F6 02 69 D7 FE BE 53 A7 4E D5 57 67
> 9F AA 5D BB DA 40...)>)
> 2019-11-18 03:06:32,280+0000  [host0977.com-pm-1] DEBUG
> PacemakerClient:143 - Channel is ready: [id: 0xa4d6db58]
>
> *2019-11-18 03:06:32,282+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:165 - Put message in slot: 0 for host0977.com*
> 2019-11-18 03:06:32,294+0000  [host0977.com-pm-1]
> ERROR PacemakerClientHandler:60 - Exception occurred in Pacemaker.
> java.nio.channels.NotYetConnectedException: null
> at
> org.apache.storm.shade.io.netty.channel.AbstractChannel$AbstractUnsafe.flush0()(Unknown
> Source) ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> 2019-11-18 03:06:32,296+0000  [host0977.com-pm-1] INFO
>  PacemakerClientHandler:37 - Connection established from /<ip>:58206 to
> host0977.com/<ip>:6699
> 2019-11-18 03:06:32,397+0000  [Timer-0] INFO  PacemakerClient:246 -
> reconnecting to host0977.com
> 2019-11-18 03:06:32,397+0000  [Timer-0] DEBUG PacemakerClient:261 -
> channel host0977.com/<ip>:6699 closed
> 2019-11-18 03:06:32,404+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:143 - Channel is ready: [id: 0xcf170c9f]
> 2019-11-18 03:06:32,409+0000  [host0977.com-pm-2] INFO
>  PacemakerClientHandler:37 - Connection established from /<ip>:58212 to
> host0977.com/<ip>:6699
> 2019-11-18 03:06:33,284+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 9 more attempts for host0977.com.
> 2019-11-18 03:06:33,305+0000  [host0977.com-pm-2] DEBUG
> PacemakerClientHandler:43 - Got Message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:33,306+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:216 - Pacemaker client got message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
>
> *2019-11-18 03:06:33,306+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:220 - No message for slot: 0*
> 2019-11-18 03:06:34,284+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 8 more attempts for host0977.com.
> 2019-11-18 03:06:34,285+0000  [host0977.com-pm-2] DEBUG
> PacemakerClientHandler:43 - Got Message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:34,286+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:216 - Pacemaker client got message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:34,286+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:220 - No message for slot: 0
> 2019-11-18 03:06:35,284+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 7 more attempts for host0977.com.
> 2019-11-18 03:06:35,286+0000  [host0977.com-pm-2] DEBUG
> PacemakerClientHandler:43 - Got Message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:35,286+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:216 - Pacemaker client got message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:35,286+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:220 - No message for slot: 0
> 2019-11-18 03:06:36,285+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 6 more attempts for host0977.com.
> 2019-11-18 03:06:36,287+0000  [host0977.com-pm-2] DEBUG
> PacemakerClientHandler:43 - Got Message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:36,287+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:216 - Pacemaker client got message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:36,287+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:220 - No message for slot: 0
> 2019-11-18 03:06:37,285+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 5 more attempts for host0977.com.
> 2019-11-18 03:06:37,287+0000  [host0977.com-pm-2] DEBUG
> PacemakerClientHandler:43 - Got Message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:37,287+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:216 - Pacemaker client got message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:37,287+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:220 - No message for slot: 0
> 2019-11-18 03:06:38,286+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 4 more attempts for host0977.com.
> 2019-11-18 03:06:38,288+0000  [host0977.com-pm-2] DEBUG
> PacemakerClientHandler:43 - Got Message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:38,288+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:216 - Pacemaker client got message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:38,288+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:220 - No message for slot: 0
> 2019-11-18 03:06:39,286+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 3 more attempts for host0977.com.
> 2019-11-18 03:06:39,289+0000  [host0977.com-pm-2] DEBUG
> PacemakerClientHandler:43 - Got Message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:39,289+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:216 - Pacemaker client got message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:39,289+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:220 - No message for slot: 0
> 2019-11-18 03:06:40,287+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 2 more attempts for host0977.com.
> 2019-11-18 03:06:40,289+0000  [host0977.com-pm-2] DEBUG
> PacemakerClientHandler:43 - Got Message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:40,289+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:216 - Pacemaker client got message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:40,289+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:220 - No message for slot: 0
> 2019-11-18 03:06:41,288+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 1 more attempts for host0977.com.
> 2019-11-18 03:06:41,289+0000  [host0977.com-pm-2] DEBUG
> PacemakerClientHandler:43 - Got Message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:41,289+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:216 - Pacemaker client got message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:41,289+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:220 - No message for slot: 0
> 2019-11-18 03:06:42,288+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 0 more attempts for host0977.com.
> 2019-11-18 03:06:42,289+0000  [host0977.com-pm-2] DEBUG
> PacemakerClientHandler:43 - Got Message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:42,289+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:216 - Pacemaker client got message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:42,290+0000  [host0977.com-pm-2] DEBUG
> PacemakerClient:220 - No message for slot: 0
> 2019-11-18 03:06:45,527+0000  [executor-heartbeat-timer] ERROR
> rejectedExecution:770 - Failed to submit a listener notification task.
> Event loop shut down?
> java.util.concurrent.RejectedExecutionException: event executor terminated
> at
> org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:855)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:328)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:321)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:778)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.util.concurrent.DefaultPromise.safeExecute(DefaultPromise.java:768)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:432)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.util.concurrent.DefaultPromise.setFailure(DefaultPromise.java:112)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.channel.DefaultChannelPromise.setFailure(DefaultChannelPromise.java:89)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.safeExecute(AbstractChannelHandlerContext.java:1010)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:610)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:465)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.channel.DefaultChannelPipeline.close(DefaultChannelPipeline.java:1003)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.shade.io.netty.channel.AbstractChannel.close(AbstractChannel.java:238)
> ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> at
> org.apache.storm.pacemaker.PacemakerClient.close_channel(PacemakerClient.java:260)
> ~[storm-client-2.1.0.jar:2.1.0]
> at
> org.apache.storm.pacemaker.PacemakerClient.close(PacemakerClient.java:267)
> ~[storm-client-2.1.0.jar:2.1.0]
> at
> org.apache.storm.pacemaker.PacemakerClientPool.rotateClients(PacemakerClientPool.java:92)
> ~[storm-client-2.1.0.jar:2.1.0]
> at
> org.apache.storm.pacemaker.PacemakerClientPool.send(PacemakerClientPool.java:54)
> ~[storm-client-2.1.0.jar:2.1.0]
> at
> org.apache.storm.cluster.PaceMakerStateStorage.set_worker_hb(PaceMakerStateStorage.java:127)
> ~[storm-client-2.1.0.jar:2.1.0]
> at
> org.apache.storm.cluster.StormClusterStateImpl.workerHeartbeat(StormClusterStateImpl.java:509)
> ~[storm-client-2.1.0.jar:2.1.0]
> at
> org.apache.storm.daemon.worker.Worker.doExecutorHeartbeats(Worker.java:372)
> ~[storm-client-2.1.0.jar:2.1.0]
> at org.apache.storm.StormTimer$1.run(StormTimer.java:110)
> [storm-client-2.1.0.jar:2.1.0]
> at org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:226)
> [storm-client-2.1.0.jar:2.1.0]
> 2019-11-18 03:06:45,528+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:261 - channel host0977.com/<ip>:6699 closed
> 2019-11-18 03:06:45,529+0000  [executor-heartbeat-timer] ERROR
> PaceMakerStateStorage:138 - couldn't get response after 10 attempts. Failed
> to set_worker_hb. Will make 8 more attempts.
> 2019-11-18 03:06:45,532+0000  [host0978.com-pm-1] DEBUG
> PacemakerClient:143 - Channel is ready: [id: 0x94bfcb44]
> 2019-11-18 03:06:45,530+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:159 - Sending pacemaker message to host0978.com:
> HBMessage(type:SEND_PULSE, data:<HBMessageData
> pulse:HBPulse(id:/workerbeats/if-usmsc-streams-p1-19_7-storm2-SNAPSHOT-3-1574046188/c40d6ed2-c0f7-414b-8452-6af03036caef-<ip>-6700,
> details:1F 8B 08 00 00 00 00 00 00 00 ED 7D 0B 94 56 C5 95 EE A6 01 45 40
> 40 A3 23 30 18 10 45 50 FA F1 F7 FB 41 2B 60 30 20 22 0F 41 45 14 BA 91 97
> 08 34 84 97 20 2A 18 11 11 05 89 BC BA D5 44 54 12 8C 90 2C B8 A2 92 51 07
> 34 4C 24 2B 4C 02 09 C9 90 84 49 48 F4 5E 88 71 46 46 E5 C6 18 35 B7 39 C5
> A9 E2 FF AB F6 AE DA 75 CE FC 97 D3 F7 F6 02 69 D7 FE BE 53 A7 4E D5 57 67
> 9F AA 5D BB DA 40...)>)
> 2019-11-18 03:06:45,535+0000  [executor-heartbeat-timer] DEBUG
> PacemakerClient:165 - Put message in slot: 0 for host0978.com
> 2019-11-18 03:06:45,536+0000  [host0978.com-pm-1] ERROR
> PacemakerClientHandler:60 - Exception occurred in Pacemaker.
> java.nio.channels.NotYetConnectedException: null
> at
> org.apache.storm.shade.io.netty.channel.AbstractChannel$AbstractUnsafe.flush0()(Unknown
> Source) ~[storm-shaded-deps-2.1.0.jar:2.1.0]
> 2019-11-18 03:06:45,537+0000  [host0978.com-pm-1] INFO
>  PacemakerClientHandler:37 - Connection established from /<ip>:49534 to
> host0978.com/<ip>:6699
> 2019-11-18 03:06:45,637+0000  [Timer-0] INFO  PacemakerClient:246 -
> reconnecting to host0978.com
> 2019-11-18 03:06:45,638+0000  [Timer-0] DEBUG PacemakerClient:261 -
> channel host0978.com/<ip>:6699 closed
> 2019-11-18 03:06:45,644+0000  [host0978.com-pm-2] DEBUG
> PacemakerClient:143 - Channel is ready: [id: 0x58e0cd56]
> 2019-11-18 03:06:45,649+0000  [host0978.com-pm-2] INFO
>  PacemakerClientHandler:37 - Connection established from /<ip>:49536 to
> host0978.com/<ip>:6699
> 2019-11-18 03:06:46,535+0000  [executor-heartbeat-timer] WARN
>  PacemakerClient:192 - Not getting response or getting null response.
> Making 9 more attempts for host0978.com.
> 2019-11-18 03:06:46,556+0000  [host0978.com-pm-2] DEBUG
> PacemakerClientHandler:43 - Got Message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:46,556+0000  [host0978.com-pm-2] DEBUG
> PacemakerClient:216 - Pacemaker client got message:
> HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
> 2019-11-18 03:06:46,556+0000  [host0978.com-pm-2] DEBUG
> PacemakerClient:220 - No message for slot: 0
>
>
>
> --
> Thanks
> *Sharath*
>


-- 
Thanks
*Sharath*
