I don't believe the supervisor needs to be running for kill_workers to work. You can kill the supervisor first, then run kill_workers to get rid of the workers, perform the restart, and bring the supervisor back up.
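A minimal sketch of that sequence, assuming the supervisor daemon runs under systemd with a unit named `storm-supervisor` (a hypothetical name; adjust to however the supervisor is managed on your machines) and the `storm` CLI is on the PATH:

```shell
# Stop the supervisor first so it cannot relaunch workers.
sudo systemctl stop storm-supervisor

# Now clean up the worker processes left behind on this node.
storm kill_workers

# ... perform the machine maintenance / restart here ...

# Bring the supervisor back up; it will reschedule workers based on
# the assignments still stored in ZooKeeper.
sudo systemctl start storm-supervisor
```

The ordering matters: because kill_workers only cleans up local worker processes and does not remove the topology assignments from ZooKeeper, a supervisor that is still (or again) running will simply restart the workers.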
On Wed, May 1, 2019 at 22:09 Mitchell Rathbun (BLOOMBERG/ 731 LEX) <[email protected]> wrote:

> Say that we want to kill all topologies when a machine is brought down.
> The machine will be brought back up shortly after, which includes
> restarting supervisor. If supervisor always restarts worker processes after
> kill_workers, then won't they still be restarted when supervisor is brought
> back up, since active topologies are kept within ZooKeeper? And if
> supervisor is required for this command to work, then supervisor must be
> running until kill_workers completes. How can we guarantee that supervisor
> is then killed before the worker processes are restarted?
>
> From: [email protected] At: 04/30/19 16:58:17
> To: [email protected]
> Subject: Re: Kill_workers cli not working as expected
>
> I believe kill_workers is for cleaning up workers if e.g. you want to shut
> down a supervisor node, or if you have an unstable machine you want to take
> out of the cluster. The command was introduced because simply killing the
> supervisor process would leave the workers alive.
>
> If you want to kill the workers and keep them dead, you should also kill
> the supervisor on that machine.
>
> More context at https://issues.apache.org/jira/browse/STORM-1058
>
> On Tue, Apr 30, 2019 at 22:28 Mitchell Rathbun (BLOOMBERG/ 731 LEX)
> <[email protected]> wrote:
>
>> We currently run both Nimbus and Supervisor on the same cluster. When
>> running 'storm kill_workers', I have noticed that all of the workers are
>> killed, but then are restarted.
>> In the supervisor log I see the following for each topology:
>>
>> 2019-04-30 16:21:17,571 INFO Slot [SLOT_19227] STATE KILL_AND_RELAUNCH msInState: 5 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b -> WAITING_FOR_WORKER_START msInState: 0 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b
>> 2019-04-30 16:21:25,574 INFO Slot [SLOT_19227] STATE WAITING_FOR_WORKER_START msInState: 8003 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b -> RUNNING msInState: 0 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b
>>
>> Is this the expected behavior (worker process is bounced, not killed)? I
>> thought that kill_workers would essentially run 'storm kill' for each of
>> the worker processes.
