Say that we want to kill all topologies when a machine is brought down. The 
machine will be brought back up shortly after, which includes restarting the 
supervisor. If the supervisor always restarts worker processes after 
kill_workers, won't they still be restarted when the supervisor is brought 
back up, since active topologies are kept in ZooKeeper? And if the supervisor 
is required for this command to work, then it must keep running until 
kill_workers completes. How can we guarantee that the supervisor is then 
killed before the worker processes are restarted?

From: user@storm.apache.org At: 04/30/19 16:58:17 To: user@storm.apache.org
Subject: Re: Kill_workers cli not working as expected

I believe kill_workers is for cleaning up workers if e.g. you want to shut down 
a supervisor node, or if you have an unstable machine you want to take out of 
the cluster. The command was introduced because simply killing the supervisor 
process would leave the workers alive.

If you want to kill the workers and keep them dead, you should also kill the 
supervisor on that machine.
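
A minimal sketch of that advice, not a confirmed procedure. It stops the supervisor first and then runs the kill_workers command discussed in this thread. The stop command (`systemctl stop storm-supervisor`) and the function name `decommission_node` are hypothetical placeholders. The sketch also assumes that kill_workers still works once the supervisor daemon is down, which this thread does not confirm, so check that on your Storm version first.

```python
import subprocess

# Hypothetical command for stopping the supervisor daemon; substitute whatever
# your process manager (systemd, supervisord, ...) actually uses.
STOP_SUPERVISOR_CMD = ["systemctl", "stop", "storm-supervisor"]

# The CLI command discussed in this thread.
KILL_WORKERS_CMD = ["storm", "kill_workers"]


def decommission_node(run=subprocess.run):
    """Stop the supervisor, then kill the workers it left behind.

    `run` is injectable so the ordering can be exercised without a cluster.
    """
    # Stop the supervisor first so nothing is left to relaunch the slots.
    run(STOP_SUPERVISOR_CMD, check=True)
    # Assumption: kill_workers can run while the supervisor daemon is down.
    run(KILL_WORKERS_CMD, check=True)
```

With `check=True`, a failure to stop the supervisor raises before kill_workers is attempted, so workers are never killed while something could still relaunch them.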

More context at https://issues.apache.org/jira/browse/STORM-1058

On Tue, Apr 30, 2019 at 22:28, Mitchell Rathbun (BLOOMBERG/ 731 LEX) 
<mrathb...@bloomberg.net> wrote:

We currently run both Nimbus and Supervisor on the same cluster. When running 
'storm kill_workers', I have noticed that all of the workers are killed, but 
then are restarted. In the supervisor log I see the following for each topology:

2019-04-30 16:21:17,571 INFO  Slot [SLOT_19227] STATE KILL_AND_RELAUNCH msInState: 5 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b -> WAITING_FOR_WORKER_START msInState: 0 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b
2019-04-30 16:21:25,574 INFO  Slot [SLOT_19227] STATE WAITING_FOR_WORKER_START msInState: 8003 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b -> RUNNING msInState: 0 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b

Is this the expected behavior (worker process is bounced, not killed)? I 
thought that kill_workers would essentially run 'storm kill' for each of the 
worker processes.
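
To check for this bounce across many slots, a rough sketch like the following can scan the supervisor log for slots that pass through KILL_AND_RELAUNCH and later reach RUNNING. The regular expression is only inferred from the two log lines quoted above and assumes each transition sits on a single unwrapped line. The function name `relaunched_slots` is made up for illustration.

```python
import re

# Pattern inferred from the log excerpt in this thread, not from Storm's source:
#   Slot [<slot>] STATE <from-state> ... -> <to-state> ...
TRANSITION = re.compile(
    r"Slot \[(?P<slot>[^\]]+)\] STATE (?P<src>\w+) .*?-> (?P<dst>\w+)"
)


def relaunched_slots(log_lines):
    """Return slots that left KILL_AND_RELAUNCH and later reached RUNNING."""
    pending = set()   # slots seen in KILL_AND_RELAUNCH, not yet RUNNING again
    relaunched = []   # slots confirmed as bounced, in the order they came back
    for line in log_lines:
        match = TRANSITION.search(line)
        if match is None:
            continue  # not a slot state transition line
        slot = match["slot"]
        if match["src"] == "KILL_AND_RELAUNCH":
            pending.add(slot)
        if match["dst"] == "RUNNING" and slot in pending:
            pending.discard(slot)
            relaunched.append(slot)
    return relaunched
```

An empty result after running kill_workers would suggest the workers stayed down. A non-empty one lists the slots whose workers were bounced.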

