Sorry I have not responded sooner; I am trying to catch up on the mailing
list.

1. There is a patch for Nimbus HA, but it looks like it has been
abandoned.  If you want to pick it up and address the review comments,
that would be great.

https://issues.apache.org/jira/browse/STORM-166
https://github.com/apache/incubator-storm/pull/61


2. The spout has its own timeout and will call fail on itself if it has
not received an ack or fail message from the acker within the timeout
interval.  The acker also has a timeout, but it simply throws away the
tree after that timeout and relies on the spout to time out as well.  If
the acker gets a fail message from a bolt, it propagates the fail to the
spout and removes the tree.  If the tree is ever fully acked, it
propagates the ack message to the spout.
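
As a rough illustration of the acker behavior in point 2 (this is a toy
model, not Storm's code), here is a short Python sketch.  The class name
`Acker`, its method names, and the explicit `expire` call are all
assumptions made for the example.  The "ack val" is modeled as an XOR of
tuple ids that returns to 0 once every tuple in the tree has been acked.

```python
import time


class Acker:
    """Toy model of an acker: one XOR checksum per spout tuple tree."""

    def __init__(self, timeout_secs, notify_spout):
        self.timeout_secs = timeout_secs
        # Hypothetical callback: notify_spout(root_id, "ack" or "fail").
        self.notify_spout = notify_spout
        self.trees = {}  # root_id -> [ack_val, start_time]

    def init(self, root_id, tuple_id, now=None):
        """Start tracking a tree when the spout emits its root tuple."""
        now = time.time() if now is None else now
        self.trees[root_id] = [tuple_id, now]

    def ack(self, root_id, xor_val):
        """A bolt reports (acked id XOR ids of tuples it emitted)."""
        entry = self.trees.get(root_id)
        if entry is None:
            return  # tree was already completed, failed, or expired
        entry[0] ^= xor_val
        if entry[0] == 0:  # every emitted tuple has been acked
            del self.trees[root_id]
            self.notify_spout(root_id, "ack")

    def fail(self, root_id):
        """A bolt failed a tuple: tell the spout and drop the tree."""
        if self.trees.pop(root_id, None) is not None:
            self.notify_spout(root_id, "fail")

    def expire(self, now=None):
        """Drop timed-out trees silently; the spout's own timeout fails them."""
        now = time.time() if now is None else now
        stale = [r for r, (_, started) in self.trees.items()
                 if now - started >= self.timeout_secs]
        for root_id in stale:
            del self.trees[root_id]
```

Note that `expire` never calls the spout.  That mirrors the point above:
the acker just forgets the tree and relies on the spout timing out too.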

3. It is up to the spout to decide how it will replay the tuple.  In some
cases it can ask the pub/sub system to replay the tuple; in others it may
not be able to, and all it can do is keep track of the failure.
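
To show the two options in point 3 side by side, here is a small Python
sketch of a spout-like object.  It is not the Storm spout API: the class
`ReplayingSpout`, the `can_replay` flag, and the in-memory `source` list
standing in for a pub/sub system are all hypothetical.

```python
class ReplayingSpout:
    """Toy spout: replays failed tuples if it can, else records the failure."""

    def __init__(self, source, can_replay=True):
        self.source = list(source)    # stand-in for a pub/sub system
        self.can_replay = can_replay  # hypothetical capability flag
        self.pending = {}             # msg_id -> payload awaiting ack/fail
        self.replay_queue = []        # msg_ids waiting to be re-emitted
        self.failed_ids = []          # failures we could only keep track of
        self.next_id = 0

    def next_tuple(self):
        """Return (msg_id, payload) to emit, or None if there is nothing."""
        if self.replay_queue:
            msg_id = self.replay_queue.pop(0)
            return msg_id, self.pending[msg_id]
        if not self.source:
            return None
        payload = self.source.pop(0)
        msg_id = self.next_id
        self.next_id += 1
        self.pending[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id):
        """Tree fully processed: forget the tuple."""
        self.pending.pop(msg_id, None)

    def fail(self, msg_id):
        """Failure or timeout: replay if possible, otherwise just record it."""
        if msg_id not in self.pending:
            return
        if self.can_replay:
            self.replay_queue.append(msg_id)
        else:
            del self.pending[msg_id]
            self.failed_ids.append(msg_id)
```

With `can_replay=True` a failed message id comes back from the next
`next_tuple()` call; with `can_replay=False` it only ends up in
`failed_ids`.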

4. If a worker exits, the supervisor will time it out and restart it.  If
the supervisor does not restart it fast enough, nimbus may detect the
timeout and reschedule it on a different supervisor.
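
The two levels of recovery in point 4 can be pictured with a tiny Python
function.  This is only a sketch: the function name, the return strings,
and the assumption that the nimbus timeout is longer than the supervisor
timeout are mine, and the numbers in the usage lines are made up.

```python
def recovery_action(secs_since_heartbeat, supervisor_timeout, nimbus_timeout):
    """Decide who reacts to a silent worker, given two timeouts in seconds.

    Assumes nimbus_timeout > supervisor_timeout, so the supervisor gets the
    first chance to restart the worker locally.
    """
    if secs_since_heartbeat >= nimbus_timeout:
        return "nimbus reschedules on a different supervisor"
    if secs_since_heartbeat >= supervisor_timeout:
        return "supervisor restarts worker"
    return "healthy"


# Made-up timeouts: supervisor reacts after 5s, nimbus after 30s.
print(recovery_action(2, 5, 30))
print(recovery_action(10, 5, 30))
print(recovery_action(45, 5, 30))
```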

5. The supervisor starts and stops workers.  It also downloads their
dependencies and cleans up after them.

- Bobby

On 7/24/14, 3:13 AM, "Zhang,Anzhan" <[email protected]> wrote:

>Dear all,
>I have several questions from studying the Storm implementation and
>architecture. Although I read
>http://storm.incubator.apache.org/documentation/Home.html carefully,
>I still cannot find the answers, so I am writing this email to ask for
>your help. Any comments are much appreciated.
>
>1.       Is Nimbus a singleton in a Storm cluster? I think it’s a single
>entry in the storm.yaml configuration file; if more than one were
>supported, they would be configured there. If so, why not run more than
>one Nimbus and let ZK manage the leader election, so that Nimbus is HA
>and not a SPOF? If Nimbus dies, who takes responsibility for
>restarting it?
>
>2.       The acker reports the successful handling of a tuple back to the
>task, and the design of the acker is excellent. My question is how the
>acker detects a failure in the tuple handler. Is it only when the
>ack val is not 0 at the timeout?
>
>3.       If the acker reports the failure to the spout task, how does the
>spout task re-emit the tuple? Will it choose some other worker? If
>it emits the tuple to the same call stack, it may fail at the same
>place.
>
>4.       If a worker exits, who takes responsibility for restarting
>the worker?
>
>5.       What’s the duty of the Supervisor? Just starting the defined
>number of workers?
>
>Thank you in advance!
>
>Best Regards
>Anzhan Zhang 张安站
>Baidu
>
>PS
>
>Ext: 3153
>Hi:   anzhsoft
>Cubicle:F4-B180
>
