Hello everyone,
I'm working on upgrading Storm from 1.2.1 to 2.1.0. While performing some
fault tolerance testing, I noticed some odd behavior.

*Scenario:*
I submitted a topology and everything works fine. Now, if I bounce the
pacemaker server that the pacemaker client is connected to, the client fails
to heartbeat and gets stuck in a retry loop forever.

From what I understand, the pacemaker server receives the SEND_PULSE messages
and responds correctly, and the client then receives the SEND_PULSE_RESPONSE
for the right message_id. However, the client fails when looking up the
message it sent previously. The log says: "No message for slot: <message_id>".
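
A minimal sketch of the slot-based correlation described above may help frame the question. It is not the actual PacemakerClient code: the class SlotTable and its methods put, complete, and clear are hypothetical names, requests are plain strings, and the idea that a reconnect empties the table is an assumption made only for illustration.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical illustration only; not org.apache.storm.pacemaker.PacemakerClient.
final class SlotTable {
    // One entry per in-flight request; the array index doubles as message_id.
    private final AtomicReferenceArray<String> pending;

    SlotTable(int maxPending) {
        this.pending = new AtomicReferenceArray<>(maxPending);
    }

    // Park a request in the first free slot and return that slot's index.
    int put(String request) {
        for (int i = 0; i < pending.length(); i++) {
            if (pending.compareAndSet(i, null, request)) {
                return i;
            }
        }
        throw new IllegalStateException("no free slot");
    }

    // Look up the request for a response's message_id and free the slot.
    // Returns null when the slot is empty, which is the situation the
    // "No message for slot" log line would correspond to.
    String complete(int messageId) {
        if (messageId < 0 || messageId >= pending.length()) {
            return null;
        }
        return pending.getAndSet(messageId, null);
    }

    // Assumption for illustration: something (e.g. a reconnect) drops all
    // pending requests before their responses arrive.
    void clear() {
        for (int i = 0; i < pending.length(); i++) {
            pending.set(i, null);
        }
    }
}
```

In this model, put followed by complete returns the original request. If anything empties the slot between the put and the arrival of the response, complete returns null even though the message_id on the response is correct.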

Any idea why this could be?

Detailed logs below:
2019-11-18 03:04:19,023+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:159 - Sending pacemaker message to host0979.com:
HBMessage(type:SEND_PULSE, data:<HBMessageData
pulse:HBPulse(id:/workerbeats/if-usmsc-streams-p1-19_7-storm2-SNAPSHOT-3-1574046188/c40d6ed2-c0f7-414b-8452-6af03036caef-<ip>-6700,
details:1F 8B 08 00 00 00 00 00 00 00 E5 5D 0B 7C 96 B5 D5 0F 50 04 A1 A5
17 0A 82 C2 A8 80 DC 0B 45 CA 55 11 54 2E 0A 8A 5C 3A 01 07 02 52 44 94 4B
A1 45 C1 21 A2 02 DE 50 51 D0 79 C1 FD EA 74 8A 8A 13 1D 2A 6E BA 5F 99 A8
38 71 63 EA 26 4A 51 E6 07 53 3F DD 07 9B B8 39 71 FB BE F2 A4 4F C2 DB E4
9C E4 24 EF DE 2D EF D7 1F 6E 6E E7 FF 27 49 93 7F 9E 93 9C E4 A4 29 AB C7
18 EB 3B E7 D2 C2...)>)
2019-11-18 03:04:19,024+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:165 - Put message in slot: 1 for host0979.com
2019-11-18 03:04:19,028+0000  [host0979.com-pm-1] DEBUG
PacemakerClientHandler:43 - Got Message:
HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:1)
2019-11-18 03:04:19,028+0000  [host0979.com-pm-1] DEBUG PacemakerClient:216
- Pacemaker client got message: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:1)
2019-11-18 03:04:19,028+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:179 - Got Response: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:1)

2019-11-18 03:05:19,041+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:159 - Sending pacemaker message to host0979.com:
HBMessage(type:SEND_PULSE, data:<HBMessageData
pulse:HBPulse(id:/workerbeats/if-usmsc-streams-p1-19_7-storm2-SNAPSHOT-3-1574046188/c40d6ed2-c0f7-414b-8452-6af03036caef-<ip>-6700,
details:1F 8B 08 00 00 00 00 00 00 00 E5 9D 09 98 56 C5 99 EF 5F 36 6D 43
A3 88 A8 08 74 D3 3B 2D C8 BE 35 8A 01 14 15 E2 86 82 0A 04 04 64 8F EC B4
BB D1 06 41 8D 41 41 24 88 09 3E 83 A6 EF C4 05 13 50 16 31 10 35 B6 19 46
A3 62 42 AE A2 E8 70 91 64 D0 90 E8 B8 E4 D1 C8 24 43 9F E2 54 F1 7D 55 EF
5B F5 56 9D 7C 93 D3 D7 49 66 BC B7 FE FF AE 53 5F D5 AF 4E AD EF 69 0E 8D
00 A0 CF 8C A9 5D...)>)
2019-11-18 03:05:19,041+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:165 - Put message in slot: 2 for host0979.com
2019-11-18 03:05:19,044+0000  [host0979.com-pm-1] DEBUG
PacemakerClientHandler:43 - Got Message:
HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:2)
2019-11-18 03:05:19,044+0000  [host0979.com-pm-1] DEBUG PacemakerClient:216
- Pacemaker client got message: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:2)
2019-11-18 03:05:19,044+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:179 - Got Response: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:2)



*------- <Bounced pacemaker server @ host0979.com> -------*

2019-11-18 03:06:19,054+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:159 - Sending pacemaker message to host0979.com:
HBMessage(type:SEND_PULSE, data:<HBMessageData
pulse:HBPulse(id:/workerbeats/if-usmsc-streams-p1-19_7-storm2-SNAPSHOT-3-1574046188/c40d6ed2-c0f7-414b-8452-6af03036caef-<ip>-6700,
details:1F 8B 08 00 00 00 00 00 00 00 ED 7D 0B 94 56 C5 95 EE A6 01 45 40
40 A3 23 30 18 10 45 50 FA F1 F7 FB 41 2B 60 30 20 22 0F 41 45 14 BA 91 97
08 34 84 97 20 2A 18 11 11 05 89 BC BA D5 44 54 12 8C 90 2C B8 A2 92 51 07
34 4C 24 2B 4C 02 09 C9 90 84 49 48 F4 5E 88 71 46 46 E5 C6 18 35 B7 39 C5
A9 E2 FF AB F6 AE DA 75 CE FC 97 D3 F7 F6 02 69 D7 FE BE 53 A7 4E D5 57 67
9F AA 5D BB DA 40...)>)

*2019-11-18 03:06:19,055+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:165 - Put message in slot: 3 for host0979.com*
2019-11-18 03:06:20,056+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 9 more attempts for host0979.com.
2019-11-18 03:06:21,056+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 8 more attempts for host0979.com.
2019-11-18 03:06:22,057+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 7 more attempts for host0979.com.
2019-11-18 03:06:23,057+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 6 more attempts for host0979.com.
2019-11-18 03:06:24,057+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 5 more attempts for host0979.com.
2019-11-18 03:06:25,058+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 4 more attempts for host0979.com.
2019-11-18 03:06:26,058+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 3 more attempts for host0979.com.
2019-11-18 03:06:27,059+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 2 more attempts for host0979.com.
2019-11-18 03:06:28,059+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 1 more attempts for host0979.com.
2019-11-18 03:06:29,059+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 0 more attempts for host0979.com.
2019-11-18 03:06:32,275+0000  [executor-heartbeat-timer] ERROR
rejectedExecution:770 - Failed to submit a listener notification task.
Event loop shut down?
java.util.concurrent.RejectedExecutionException: event executor terminated
at
org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:855)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:328)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:321)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:778)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.util.concurrent.DefaultPromise.safeExecute(DefaultPromise.java:768)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:432)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.util.concurrent.DefaultPromise.setFailure(DefaultPromise.java:112)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.channel.DefaultChannelPromise.setFailure(DefaultChannelPromise.java:89)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.safeExecute(AbstractChannelHandlerContext.java:1010)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:610)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:465)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.channel.DefaultChannelPipeline.close(DefaultChannelPipeline.java:1003)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.channel.AbstractChannel.close(AbstractChannel.java:238)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.pacemaker.PacemakerClient.close_channel(PacemakerClient.java:260)
~[storm-client-2.1.0.jar:2.1.0]
at
org.apache.storm.pacemaker.PacemakerClient.close(PacemakerClient.java:267)
~[storm-client-2.1.0.jar:2.1.0]
at
org.apache.storm.pacemaker.PacemakerClientPool.rotateClients(PacemakerClientPool.java:92)
~[storm-client-2.1.0.jar:2.1.0]
at
org.apache.storm.pacemaker.PacemakerClientPool.send(PacemakerClientPool.java:54)
~[storm-client-2.1.0.jar:2.1.0]
at
org.apache.storm.cluster.PaceMakerStateStorage.set_worker_hb(PaceMakerStateStorage.java:127)
~[storm-client-2.1.0.jar:2.1.0]
at
org.apache.storm.cluster.StormClusterStateImpl.workerHeartbeat(StormClusterStateImpl.java:509)
~[storm-client-2.1.0.jar:2.1.0]
at
org.apache.storm.daemon.worker.Worker.doExecutorHeartbeats(Worker.java:372)
~[storm-client-2.1.0.jar:2.1.0]
at org.apache.storm.StormTimer$1.run(StormTimer.java:110)
[storm-client-2.1.0.jar:2.1.0]
at org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:226)
[storm-client-2.1.0.jar:2.1.0]
2019-11-18 03:06:32,276+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:261 - channel host0979.com/<ip>:6699 closed
2019-11-18 03:06:32,276+0000  [executor-heartbeat-timer] ERROR
PaceMakerStateStorage:138 - couldn't get response after 10 attempts. Failed
to set_worker_hb. Will make 9 more attempts.
2019-11-18 03:06:32,278+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:159 - Sending pacemaker message to host0977.com:
HBMessage(type:SEND_PULSE, data:<HBMessageData
pulse:HBPulse(id:/workerbeats/if-usmsc-streams-p1-19_7-storm2-SNAPSHOT-3-1574046188/c40d6ed2-c0f7-414b-8452-6af03036caef-<ip>-6700,
details:1F 8B 08 00 00 00 00 00 00 00 ED 7D 0B 94 56 C5 95 EE A6 01 45 40
40 A3 23 30 18 10 45 50 FA F1 F7 FB 41 2B 60 30 20 22 0F 41 45 14 BA 91 97
08 34 84 97 20 2A 18 11 11 05 89 BC BA D5 44 54 12 8C 90 2C B8 A2 92 51 07
34 4C 24 2B 4C 02 09 C9 90 84 49 48 F4 5E 88 71 46 46 E5 C6 18 35 B7 39 C5
A9 E2 FF AB F6 AE DA 75 CE FC 97 D3 F7 F6 02 69 D7 FE BE 53 A7 4E D5 57 67
9F AA 5D BB DA 40...)>)
2019-11-18 03:06:32,280+0000  [host0977.com-pm-1] DEBUG PacemakerClient:143
- Channel is ready: [id: 0xa4d6db58]

*2019-11-18 03:06:32,282+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:165 - Put message in slot: 0 for host0977.com*
2019-11-18 03:06:32,294+0000  [host0977.com-pm-1]
ERROR PacemakerClientHandler:60 - Exception occurred in Pacemaker.
java.nio.channels.NotYetConnectedException: null
at
org.apache.storm.shade.io.netty.channel.AbstractChannel$AbstractUnsafe.flush0()(Unknown
Source) ~[storm-shaded-deps-2.1.0.jar:2.1.0]
2019-11-18 03:06:32,296+0000  [host0977.com-pm-1] INFO
 PacemakerClientHandler:37 - Connection established from /<ip>:58206 to
host0977.com/<ip>:6699
2019-11-18 03:06:32,397+0000  [Timer-0] INFO  PacemakerClient:246 -
reconnecting to host0977.com
2019-11-18 03:06:32,397+0000  [Timer-0] DEBUG PacemakerClient:261 - channel
host0977.com/<ip>:6699 closed
2019-11-18 03:06:32,404+0000  [host0977.com-pm-2] DEBUG PacemakerClient:143
- Channel is ready: [id: 0xcf170c9f]
2019-11-18 03:06:32,409+0000  [host0977.com-pm-2] INFO
 PacemakerClientHandler:37 - Connection established from /<ip>:58212 to
host0977.com/<ip>:6699
2019-11-18 03:06:33,284+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 9 more attempts for host0977.com.
2019-11-18 03:06:33,305+0000  [host0977.com-pm-2] DEBUG
PacemakerClientHandler:43 - Got Message:
HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
2019-11-18 03:06:33,306+0000  [host0977.com-pm-2] DEBUG PacemakerClient:216
- Pacemaker client got message: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:0)

*2019-11-18 03:06:33,306+0000  [host0977.com-pm-2] DEBUG
PacemakerClient:220 - No message for slot: 0*
2019-11-18 03:06:34,284+0000
 [executor-heartbeat-timer] WARN  PacemakerClient:192 - Not getting
response or getting null response. Making 8 more attempts for host0977.com.
2019-11-18 03:06:34,285+0000  [host0977.com-pm-2] DEBUG
PacemakerClientHandler:43 - Got Message:
HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
2019-11-18 03:06:34,286+0000  [host0977.com-pm-2] DEBUG PacemakerClient:216
- Pacemaker client got message: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:0)
2019-11-18 03:06:34,286+0000  [host0977.com-pm-2] DEBUG PacemakerClient:220
- No message for slot: 0
2019-11-18 03:06:35,284+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 7 more attempts for host0977.com.
2019-11-18 03:06:35,286+0000  [host0977.com-pm-2] DEBUG
PacemakerClientHandler:43 - Got Message:
HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
2019-11-18 03:06:35,286+0000  [host0977.com-pm-2] DEBUG PacemakerClient:216
- Pacemaker client got message: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:0)
2019-11-18 03:06:35,286+0000  [host0977.com-pm-2] DEBUG PacemakerClient:220
- No message for slot: 0
2019-11-18 03:06:36,285+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 6 more attempts for host0977.com.
2019-11-18 03:06:36,287+0000  [host0977.com-pm-2] DEBUG
PacemakerClientHandler:43 - Got Message:
HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
2019-11-18 03:06:36,287+0000  [host0977.com-pm-2] DEBUG PacemakerClient:216
- Pacemaker client got message: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:0)
2019-11-18 03:06:36,287+0000  [host0977.com-pm-2] DEBUG PacemakerClient:220
- No message for slot: 0
2019-11-18 03:06:37,285+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 5 more attempts for host0977.com.
2019-11-18 03:06:37,287+0000  [host0977.com-pm-2] DEBUG
PacemakerClientHandler:43 - Got Message:
HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
2019-11-18 03:06:37,287+0000  [host0977.com-pm-2] DEBUG PacemakerClient:216
- Pacemaker client got message: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:0)
2019-11-18 03:06:37,287+0000  [host0977.com-pm-2] DEBUG PacemakerClient:220
- No message for slot: 0
2019-11-18 03:06:38,286+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 4 more attempts for host0977.com.
2019-11-18 03:06:38,288+0000  [host0977.com-pm-2] DEBUG
PacemakerClientHandler:43 - Got Message:
HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
2019-11-18 03:06:38,288+0000  [host0977.com-pm-2] DEBUG PacemakerClient:216
- Pacemaker client got message: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:0)
2019-11-18 03:06:38,288+0000  [host0977.com-pm-2] DEBUG PacemakerClient:220
- No message for slot: 0
2019-11-18 03:06:39,286+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 3 more attempts for host0977.com.
2019-11-18 03:06:39,289+0000  [host0977.com-pm-2] DEBUG
PacemakerClientHandler:43 - Got Message:
HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
2019-11-18 03:06:39,289+0000  [host0977.com-pm-2] DEBUG PacemakerClient:216
- Pacemaker client got message: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:0)
2019-11-18 03:06:39,289+0000  [host0977.com-pm-2] DEBUG PacemakerClient:220
- No message for slot: 0
2019-11-18 03:06:40,287+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 2 more attempts for host0977.com.
2019-11-18 03:06:40,289+0000  [host0977.com-pm-2] DEBUG
PacemakerClientHandler:43 - Got Message:
HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
2019-11-18 03:06:40,289+0000  [host0977.com-pm-2] DEBUG PacemakerClient:216
- Pacemaker client got message: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:0)
2019-11-18 03:06:40,289+0000  [host0977.com-pm-2] DEBUG PacemakerClient:220
- No message for slot: 0
2019-11-18 03:06:41,288+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 1 more attempts for host0977.com.
2019-11-18 03:06:41,289+0000  [host0977.com-pm-2] DEBUG
PacemakerClientHandler:43 - Got Message:
HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
2019-11-18 03:06:41,289+0000  [host0977.com-pm-2] DEBUG PacemakerClient:216
- Pacemaker client got message: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:0)
2019-11-18 03:06:41,289+0000  [host0977.com-pm-2] DEBUG PacemakerClient:220
- No message for slot: 0
2019-11-18 03:06:42,288+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 0 more attempts for host0977.com.
2019-11-18 03:06:42,289+0000  [host0977.com-pm-2] DEBUG
PacemakerClientHandler:43 - Got Message:
HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
2019-11-18 03:06:42,289+0000  [host0977.com-pm-2] DEBUG PacemakerClient:216
- Pacemaker client got message: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:0)
2019-11-18 03:06:42,290+0000  [host0977.com-pm-2] DEBUG PacemakerClient:220
- No message for slot: 0
2019-11-18 03:06:45,527+0000  [executor-heartbeat-timer] ERROR
rejectedExecution:770 - Failed to submit a listener notification task.
Event loop shut down?
java.util.concurrent.RejectedExecutionException: event executor terminated
at
org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:855)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:328)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:321)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:778)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.util.concurrent.DefaultPromise.safeExecute(DefaultPromise.java:768)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:432)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.util.concurrent.DefaultPromise.setFailure(DefaultPromise.java:112)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.channel.DefaultChannelPromise.setFailure(DefaultChannelPromise.java:89)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.safeExecute(AbstractChannelHandlerContext.java:1010)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:610)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:465)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.channel.DefaultChannelPipeline.close(DefaultChannelPipeline.java:1003)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.shade.io.netty.channel.AbstractChannel.close(AbstractChannel.java:238)
~[storm-shaded-deps-2.1.0.jar:2.1.0]
at
org.apache.storm.pacemaker.PacemakerClient.close_channel(PacemakerClient.java:260)
~[storm-client-2.1.0.jar:2.1.0]
at
org.apache.storm.pacemaker.PacemakerClient.close(PacemakerClient.java:267)
~[storm-client-2.1.0.jar:2.1.0]
at
org.apache.storm.pacemaker.PacemakerClientPool.rotateClients(PacemakerClientPool.java:92)
~[storm-client-2.1.0.jar:2.1.0]
at
org.apache.storm.pacemaker.PacemakerClientPool.send(PacemakerClientPool.java:54)
~[storm-client-2.1.0.jar:2.1.0]
at
org.apache.storm.cluster.PaceMakerStateStorage.set_worker_hb(PaceMakerStateStorage.java:127)
~[storm-client-2.1.0.jar:2.1.0]
at
org.apache.storm.cluster.StormClusterStateImpl.workerHeartbeat(StormClusterStateImpl.java:509)
~[storm-client-2.1.0.jar:2.1.0]
at
org.apache.storm.daemon.worker.Worker.doExecutorHeartbeats(Worker.java:372)
~[storm-client-2.1.0.jar:2.1.0]
at org.apache.storm.StormTimer$1.run(StormTimer.java:110)
[storm-client-2.1.0.jar:2.1.0]
at org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:226)
[storm-client-2.1.0.jar:2.1.0]
2019-11-18 03:06:45,528+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:261 - channel host0977.com/<ip>:6699 closed
2019-11-18 03:06:45,529+0000  [executor-heartbeat-timer] ERROR
PaceMakerStateStorage:138 - couldn't get response after 10 attempts. Failed
to set_worker_hb. Will make 8 more attempts.
2019-11-18 03:06:45,532+0000  [host0978.com-pm-1] DEBUG PacemakerClient:143
- Channel is ready: [id: 0x94bfcb44]
2019-11-18 03:06:45,530+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:159 - Sending pacemaker message to host0978.com:
HBMessage(type:SEND_PULSE, data:<HBMessageData
pulse:HBPulse(id:/workerbeats/if-usmsc-streams-p1-19_7-storm2-SNAPSHOT-3-1574046188/c40d6ed2-c0f7-414b-8452-6af03036caef-<ip>-6700,
details:1F 8B 08 00 00 00 00 00 00 00 ED 7D 0B 94 56 C5 95 EE A6 01 45 40
40 A3 23 30 18 10 45 50 FA F1 F7 FB 41 2B 60 30 20 22 0F 41 45 14 BA 91 97
08 34 84 97 20 2A 18 11 11 05 89 BC BA D5 44 54 12 8C 90 2C B8 A2 92 51 07
34 4C 24 2B 4C 02 09 C9 90 84 49 48 F4 5E 88 71 46 46 E5 C6 18 35 B7 39 C5
A9 E2 FF AB F6 AE DA 75 CE FC 97 D3 F7 F6 02 69 D7 FE BE 53 A7 4E D5 57 67
9F AA 5D BB DA 40...)>)
2019-11-18 03:06:45,535+0000  [executor-heartbeat-timer] DEBUG
PacemakerClient:165 - Put message in slot: 0 for host0978.com
2019-11-18 03:06:45,536+0000  [host0978.com-pm-1] ERROR
PacemakerClientHandler:60 - Exception occurred in Pacemaker.
java.nio.channels.NotYetConnectedException: null
at
org.apache.storm.shade.io.netty.channel.AbstractChannel$AbstractUnsafe.flush0()(Unknown
Source) ~[storm-shaded-deps-2.1.0.jar:2.1.0]
2019-11-18 03:06:45,537+0000  [host0978.com-pm-1] INFO
 PacemakerClientHandler:37 - Connection established from /<ip>:49534 to
host0978.com/<ip>:6699
2019-11-18 03:06:45,637+0000  [Timer-0] INFO  PacemakerClient:246 -
reconnecting to host0978.com
2019-11-18 03:06:45,638+0000  [Timer-0] DEBUG PacemakerClient:261 - channel
host0978.com/<ip>:6699 closed
2019-11-18 03:06:45,644+0000  [host0978.com-pm-2] DEBUG PacemakerClient:143
- Channel is ready: [id: 0x58e0cd56]
2019-11-18 03:06:45,649+0000  [host0978.com-pm-2] INFO
 PacemakerClientHandler:37 - Connection established from /<ip>:49536 to
host0978.com/<ip>:6699
2019-11-18 03:06:46,535+0000  [executor-heartbeat-timer] WARN
 PacemakerClient:192 - Not getting response or getting null response.
Making 9 more attempts for host0978.com.
2019-11-18 03:06:46,556+0000  [host0978.com-pm-2] DEBUG
PacemakerClientHandler:43 - Got Message:
HBMessage(type:SEND_PULSE_RESPONSE, data:null, message_id:0)
2019-11-18 03:06:46,556+0000  [host0978.com-pm-2] DEBUG PacemakerClient:216
- Pacemaker client got message: HBMessage(type:SEND_PULSE_RESPONSE,
data:null, message_id:0)
2019-11-18 03:06:46,556+0000  [host0978.com-pm-2] DEBUG PacemakerClient:220
- No message for slot: 0



--
Thanks
*Sharath*
