You can rebalance your topology gracefully by setting a proper wait time,
without killing all workers manually.
When 'kill' or 'rebalance' is issued, the topology is immediately
'deactivated', so spouts stop fetching / emitting tuples. During the wait
time, bolts process the tuples that were already emitted by the spouts. If
the bolts can drain all in-flight tuples within that window, you get a
graceful restart. The same applies to kill, which becomes a 'graceful stop'
in this case.
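For example, a kill or rebalance with a wait time can be issued from the Storm CLI; the `-w` flag sets the wait time in seconds (the topology name, worker count, and component name below are placeholders):

```shell
# Deactivate the topology, wait 60 seconds so bolts can drain
# in-flight tuples, then kill it ("graceful stop").
storm kill my-topology -w 60

# Deactivate, wait 60 seconds, then redistribute workers across the
# cluster ("graceful restart"). -n sets the new number of workers;
# -e changes the parallelism of a named component.
storm rebalance my-topology -w 60 -n 4 -e my-spout=2
```

If `-w` is omitted, Storm falls back to the topology's configured message timeout as the default wait.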

- Jungtaek Lim (HeartSaVioR)


On Thu, May 26, 2016 at 11:06 PM, Julián Bermejo Ferreiro | BEEVA <
[email protected]> wrote:

> Hi Jungtaek,
>
> We are running Storm 0.9.4, but we are planning to migrate to version
> 1.0.1.
>
> We deploy our topologies to move messages inside RabbitMQ brokers.
>
> Certainly, we have run the test of forcing a worker to die, and once the
> nimbus timeout occurred, a new worker appeared on another node, but the
> system didn't behave as well as it should. It was necessary to kill some
> other workers and rebalance a couple of times in order to get everything
> OK (a constant message flow inside our brokers).
>
> Is it possible to kill all the workers inside a topology and rebalance
> (a kind of graceful shutdown)? Or once you kill all of them, must you
> redeploy the whole topology?
>
> Is version 1.0.1 a possible solution?
>
> Thanks again.
>
> *JULIÁN BERMEJO FERREIRO*
> *Departamento de Tecnología *
> *[email protected] <[email protected]>*
> <http://www.beeva.com/>
>
> 2016-05-26 15:34 GMT+02:00 Jungtaek Lim <[email protected]>:
>
>> Hi Julián,
>>
>> Which version of Storm do you use?
>> I remember that some Storm 0.9.x versions have issues when workers
>> fail, so I'd like to know which version you're on.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Thu, May 26, 2016 at 5:53 PM, Julián Bermejo Ferreiro | BEEVA <
>> [email protected]> wrote:
>>
>>> Hello,
>>>
>>> We have a multi-node Storm cluster running in a production
>>> environment. We have had some issues with a couple of machines, which
>>> were out of service for a few hours.
>>>
>>> Because some workers of the deployed topologies were running on the
>>> failed machines, the cluster's behaviour was unusual (it kept running,
>>> but not as it should).
>>>
>>> Once we recovered the failed nodes and rebalanced the topologies, the
>>> cluster returned to working properly.
>>>
>>> We would like to know if there is any way to alert nimbus when a node
>>> falls down, so that it rebalances the affected topologies and creates
>>> new workers on the healthy nodes of the cluster to replace those that
>>> were running on the failed ones.
>>>
>>> This would have helped us a lot, because we could have kept our
>>> service consistent in spite of the failed nodes.
>>>
>>> Any advice?
>>>
>>> Thanks in advance!
>>>
>>> *JULIÁN BERMEJO FERREIRO*
>>> *Departamento de Tecnología *
>>> *[email protected] <[email protected]>*
>>> <http://www.beeva.com/>
>>>
>
