This does not seem to work as expected. I killed the Storm cluster and then ran 'storm kill_workers'. That part worked, and all of the topologies' workers were killed. However, when the supervisor was restarted, it also restarted all of the topologies. We would like to kill all of the topologies permanently. We are using Storm version 1.2.1.
From: [email protected] At: 05/01/19 16:49:39To: [email protected] Subject: Re: Kill_workers cli not working as expected I don't believe the supervisor needs to be running to run kill_workers. You can kill the supervisor first, then run kill_workers to get rid of the workers, do the restart and reboot the supervisor. Den ons. 1. maj 2019 kl. 22.09 skrev Mitchell Rathbun (BLOOMBERG/ 731 LEX) <[email protected]>: Say that we want to kill all topologies when a machine is brought down. The machine will be brought back up shortly after, which includes restarting supervisor. If supervisor always restarts worker processes after kill_workers, then won't they still be restarted when supervisor is brought back up, since active topologies are kept within ZooKeeper? And if supervisor is required for this command to work, then supervisor must be running until kill_workers completes. How can we guarantee that supervisor is then killed before the worker processes are restarted? From: [email protected] At: 04/30/19 16:58:17To: [email protected] Subject: Re: Kill_workers cli not working as expected I believe kill_workers is for cleaning up workers if e.g. you want to shut down a supervisor node, or if you have an unstable machine you want to take out of the cluster. The command was introduced because simply killing the supervisor process would leave the workers alive. If you want to kill the workers and keep them dead, you should also kill the supervisor on that machine. More context at https://issues.apache.org/jira/browse/STORM-1058 Den tir. 30. apr. 2019 kl. 22.28 skrev Mitchell Rathbun (BLOOMBERG/ 731 LEX) <[email protected]>: We currently run both Nimbus and Supervisor on the same cluster. When running 'storm kill_workers', I have noticed that all of the workers are killed, but then are restarted. 
In the supervisor log I see the following for each topology:

2019-04-30 16:21:17,571 INFO Slot [SLOT_19227] STATE KILL_AND_RELAUNCH msInState: 5 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b -> WAITING_FOR_WORKER_START msInState: 0 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b
2019-04-30 16:21:25,574 INFO Slot [SLOT_19227] STATE WAITING_FOR_WORKER_START msInState: 8003 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b -> RUNNING msInState: 0 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b

Is this the expected behavior (worker process is bounced, not killed)? I thought that kill_workers would essentially run 'storm kill' for each of the worker processes.
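The distinction the thread settles on can be sketched in shell: kill_workers only terminates worker JVMs on one node, while removing topologies permanently means running 'storm kill' per topology, so that Nimbus clears them from ZooKeeper and a restarted supervisor has nothing to relaunch. The `storm list` parsing below (skipping header lines, first column is the topology name) is an assumption about its output format; verify it against your Storm version. DRY_RUN=1 (the default) only prints the commands:

```shell
# Hedged sketch: permanently remove every topology via `storm kill`,
# which is the fix for "topologies come back when the supervisor restarts".
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi; }

list_topologies() {
  if [ "$DRY_RUN" = 1 ]; then
    printf 'WingmanTopology998\n'           # sample name taken from the log above
  else
    storm list | awk 'NR > 2 { print $1 }'  # assumed `storm list` output format
  fi
}

for topo in $(list_topologies); do
  run storm kill "$topo" -w 0               # -w 0: don't wait before deactivating
done
```

Unlike kill_workers, this goes through Nimbus, so it works cluster-wide rather than per machine.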
