Hi,

We are running into an issue with Storm 2.2.1: when the leader Nimbus node goes down, the active worker is restarted automatically. We have a cluster with 3 Nimbus nodes and 3 Supervisor nodes, and we have set topology.min.replication.count to 2.

As per https://storm.apache.org/releases/current/nimbus-ha-design.html, the worker should continue to process tuples irrespective of leadership changes within Nimbus. We turned on trace logging on Nimbus and the Supervisor and found that the Supervisor kills the worker and downloads the latest blobs (code, conf, jar) from the new leader Nimbus:

apache.storm.daemon.supervisor.Slot - STATE running msInState: 777557 topo:SourceGenToSinkPVB3_1-1-1647543081 worker:550735af-4c91-420e-8396-d957e0e1967b -> kill-blob-update msInState: 3002 topo:SourceGenToSinkPVB3_1-1-1647543081 worker:550735af-4c91-420e-8396-d957e0e1967b
DEBUG apache.storm.daemon.supervisor.Slot - STATE waiting-for-blob-update

Any help on this is appreciated.

Thanks,
Pradeep V.B.
