Worker restart when Nimbus leader goes down - version 2.2.1

Pradeep Badiger Mon, 21 Mar 2022 08:23:40 -0700

Hi,

We are running into an issue with Storm 2.2.1: when the leader Nimbus node goes 
down, the active worker is restarted automatically. Our cluster has 3 Nimbus 
nodes and 3 Supervisor nodes, and topology.min.replication.count is set to 2.


As per https://storm.apache.org/releases/current/nimbus-ha-design.html, the 
worker should continue to process tuples irrespective of leadership changes 
within Nimbus.

We turned on trace logging on Nimbus and the Supervisors and found that the 
Supervisor kills the worker and downloads the latest blobs (code, conf, jar) 
from the new leader Nimbus. The relevant Supervisor log lines are:

apache.storm.daemon.supervisor.Slot - STATE running msInState: 777557 
topo:SourceGenToSinkPVB3_1-1-1647543081 
worker:550735af-4c91-420e-8396-d957e0e1967b -> kill-blob-update msInState: 3002 
topo:SourceGenToSinkPVB3_1-1-1647543081 
worker:550735af-4c91-420e-8396-d957e0e1967b
DEBUG apache.storm.daemon.supervisor.Slot - STATE waiting-for-blob-update
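
For anyone checking their own Supervisor logs for the same pattern, here is a 
minimal Python sketch that picks out Slot state transitions such as 
running -> kill-blob-update. The regular expression is an assumption based only 
on the two log lines quoted above, not on Storm's actual log format, and the 
names TRANSITION, SAMPLE and find_transitions are made up for illustration.

```python
import re

# Assumed layout, inferred only from the log excerpt quoted in this message:
#   STATE <old> msInState: <ms> topo:<id> worker:<id> -> <new> ...
TRANSITION = re.compile(
    r"STATE\s+(?P<old>[\w-]+)\s+msInState:\s*\d+\s+"
    r"topo:(?P<topo>\S+)\s+worker:(?P<worker>\S+)\s+->\s+"
    r"(?P<new>[\w-]+)"
)

# The excerpt from this message, including the line wrapping added by email.
SAMPLE = """\
apache.storm.daemon.supervisor.Slot - STATE running msInState: 777557
topo:SourceGenToSinkPVB3_1-1-1647543081
worker:550735af-4c91-420e-8396-d957e0e1967b -> kill-blob-update msInState: 3002
topo:SourceGenToSinkPVB3_1-1-1647543081
worker:550735af-4c91-420e-8396-d957e0e1967b
DEBUG apache.storm.daemon.supervisor.Slot - STATE waiting-for-blob-update
"""


def find_transitions(log_text):
    """Return (old_state, new_state, topology, worker) for each transition."""
    # Wrapping can split one log entry over several lines, so collapse all
    # whitespace into single spaces before matching.
    flat = " ".join(log_text.split())
    return [
        (m["old"], m["new"], m["topo"], m["worker"])
        for m in TRANSITION.finditer(flat)
    ]


if __name__ == "__main__":
    for old, new, topo, worker in find_transitions(SAMPLE):
        print(f"{topo} worker {worker}: {old} -> {new}")
```

Filtering the result for a new state of kill-blob-update and comparing the 
timestamps with the Nimbus leadership change would show whether every restart 
lines up with a failover.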

Any help on this is appreciated.

Thanks,
Pradeep V.B.
