This does not seem to work as expected. I killed the Storm cluster and then ran 'storm kill_workers'. That part worked, and all of the topologies' workers were killed. However, when the supervisor was restarted, it also restarted all of the topologies. We would like to kill all of the topologies permanently. We are using Storm version 1.2.1.
From: [email protected] At: 05/01/19 16:49:39To: [email protected] Subject: Re: Kill_workers cli not working as expected I don't believe the supervisor needs to be running to run kill_workers. You can kill the supervisor first, then run kill_workers to get rid of the workers, do the restart and reboot the supervisor. Den ons. 1. maj 2019 kl. 22.09 skrev Mitchell Rathbun (BLOOMBERG/ 731 LEX) <[email protected]>: Say that we want to kill all topologies when a machine is brought down. The machine will be brought back up shortly after, which includes restarting supervisor. If supervisor always restarts worker processes after kill_workers, then won't they still be restarted when supervisor is brought back up, since active topologies are kept within ZooKeeper? And if supervisor is required for this command to work, then supervisor must be running until kill_workers completes. How can we guarantee that supervisor is then killed before the worker processes are restarted? From: [email protected] At: 04/30/19 16:58:17To: [email protected] Subject: Re: Kill_workers cli not working as expected I believe kill_workers is for cleaning up workers if e.g. you want to shut down a supervisor node, or if you have an unstable machine you want to take out of the cluster. The command was introduced because simply killing the supervisor process would leave the workers alive. If you want to kill the workers and keep them dead, you should also kill the supervisor on that machine. More context at https://issues.apache.org/jira/browse/STORM-1058 Den tir. 30. apr. 2019 kl. 22.28 skrev Mitchell Rathbun (BLOOMBERG/ 731 LEX) <[email protected]>: We currently run both Nimbus and Supervisor on the same cluster. When running 'storm kill_workers', I have noticed that all of the workers are killed, but then are restarted. 
In the supervisor log I see the following for each topology:

2019-04-30 16:21:17,571 INFO Slot [SLOT_19227] STATE KILL_AND_RELAUNCH msInState: 5 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b -> WAITING_FOR_WORKER_START msInState: 0 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b
2019-04-30 16:21:25,574 INFO Slot [SLOT_19227] STATE WAITING_FOR_WORKER_START msInState: 8003 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b -> RUNNING msInState: 0 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b

Is this the expected behavior (worker process is bounced, not killed)? I thought that kill_workers would essentially run 'storm kill' for each of the worker processes.
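The distinction the thread settles on can be sketched in shell: kill_workers only terminates worker JVMs on one node, while removing topologies permanently means running 'storm kill' per topology, so that Nimbus clears them from ZooKeeper and a restarted supervisor has nothing to relaunch. The `storm list` parsing below (skipping header lines, first column is the topology name) is an assumption about its output format; verify it against your Storm version. DRY_RUN=1 (the default) only prints the commands:

```shell
# Hedged sketch: permanently remove every topology via `storm kill`,
# which is the fix for "topologies come back when the supervisor restarts".
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi; }

list_topologies() {
  if [ "$DRY_RUN" = 1 ]; then
    printf 'WingmanTopology998\n'           # sample name taken from the log above
  else
    storm list | awk 'NR > 2 { print $1 }'  # assumed `storm list` output format
  fi
}

for topo in $(list_topologies); do
  run storm kill "$topo" -w 0               # -w 0: don't wait before deactivating
done
```

Unlike kill_workers, this goes through Nimbus, so it works cluster-wide rather than per machine.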
