I don't believe the supervisor needs to be running to run kill_workers. You
can kill the supervisor first, then run kill_workers to get rid of the
workers, do the restart, and then bring the supervisor back up.
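
For concreteness, a rough sketch of that sequence on the supervisor host
could look like the lines below. The systemd unit name "storm-supervisor" is
just an assumption on my part; use whatever actually manages the supervisor
process in your deployment (supervisord, a wrapper script, etc.).

    # 1. Stop the supervisor so it cannot relaunch the workers it manages
    #    (unit name "storm-supervisor" is assumed; adjust for your setup).
    sudo systemctl stop storm-supervisor

    # 2. With the supervisor down, clean up the worker processes left behind.
    storm kill_workers

    # 3. Do the machine restart / maintenance here.

    # 4. Bring the supervisor back up; it will read assignments from
    #    ZooKeeper and start workers again for any topologies still
    #    scheduled on this node.
    sudo systemctl start storm-supervisor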

On Wed, 1 May 2019 at 22:09, Mitchell Rathbun (BLOOMBERG/ 731 LEX) <
[email protected]> wrote:

> Say that we want to kill all topologies when a machine is brought down.
> The machine will be brought back up shortly after, which includes
> restarting the supervisor. If the supervisor always restarts worker
> processes after kill_workers, then won't they still be restarted when the
> supervisor is brought back up, since active topologies are kept within
> ZooKeeper? And if the supervisor is required for this command to work,
> then the supervisor must be running until kill_workers completes. How can
> we guarantee that the supervisor is then killed before the worker
> processes are restarted?
>
> From: [email protected] At: 04/30/19 16:58:17
> To: [email protected]
> Subject: Re: Kill_workers cli not working as expected
>
> I believe kill_workers is for cleaning up workers if e.g. you want to shut
> down a supervisor node, or if you have an unstable machine you want to take
> out of the cluster. The command was introduced because simply killing the
> supervisor process would leave the workers alive.
>
> If you want to kill the workers and keep them dead, you should also kill
> the supervisor on that machine.
>
> More context at https://issues.apache.org/jira/browse/STORM-1058
>
> On Tue, 30 Apr 2019 at 22:28, Mitchell Rathbun (BLOOMBERG/ 731
> LEX) <[email protected]> wrote:
>
>> We currently run both Nimbus and Supervisor on the same cluster. When
>> running 'storm kill_workers', I have noticed that all of the workers are
>> killed, but then are restarted. In the supervisor log I see the following
>> for each topology:
>>
>> 2019-04-30 16:21:17,571 INFO Slot [SLOT_19227] STATE KILL_AND_RELAUNCH msInState: 5 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b -> WAITING_FOR_WORKER_START msInState: 0 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b
>> 2019-04-30 16:21:25,574 INFO Slot [SLOT_19227] STATE WAITING_FOR_WORKER_START msInState: 8003 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b -> RUNNING msInState: 0 topo:WingmanTopology998-1-1556594165 worker:f0de554d-81a1-48ce-82e8-9beef009969b
>>
>> Is this the expected behavior (worker process is bounced, not killed)? I
>> thought that kill_workers would essentially run 'storm kill' for each of
>> the worker processes.
>>
>>
>
