Good question. One thing I do know: when a worker dies, it can't ack the tuples it had received, so those tuples will fail (time out) and you need code in your Spout's fail() method to re-emit them. My best guess is that your tuple with key 1 will keep going to task 1 once the worker is restarted, since the task IDs themselves don't change on a restart, but it'd help if someone could confirm.
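To make the fail()-and-replay part concrete, here's a rough sketch of a replaying spout, assuming Storm 1.x-style org.apache.storm packages. The pending cache and the toy source are just illustrative; a real spout would bound the cache and cap retries:

import java.util.HashMap;
import java.util.Map;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

public class ReplayingSpout extends BaseRichSpout {
    private SpoutOutputCollector collector;
    // In-flight tuples, kept so fail() can re-emit them.
    private Map<Object, Values> pending;
    private int seq = 0;

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
        this.pending = new HashMap<>();
    }

    @Override
    public void nextTuple() {
        // Toy source: cycle through ten keys.
        Values values = new Values("key-" + (seq % 10), seq);
        Object msgId = seq++;
        pending.put(msgId, values);
        // Emitting with a message ID is what opts the tuple
        // into Storm's ack/fail tracking.
        collector.emit(values, msgId);
    }

    @Override
    public void ack(Object msgId) {
        pending.remove(msgId);  // fully processed downstream, forget it
    }

    @Override
    public void fail(Object msgId) {
        // Called when the tuple tree failed or timed out, e.g. because
        // a worker died before acking: re-emit the same tuple.
        Values values = pending.get(msgId);
        if (values != null) {
            collector.emit(values, msgId);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("key", "value"));
    }
}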
On Wed, Jun 22, 2016 at 8:39 PM, Evgeniy Khyst <[email protected]> wrote:

> Hi,
>
> I can't find information on how fields grouping behaves when a worker
> fails.
>
> With fields grouping, tuples are partitioned by some key and sent to
> different tasks. Tuples with the same key go to the same task.
>
> When a worker dies, the supervisor will restart it. If it continuously
> fails on startup and is unable to heartbeat to Nimbus, Nimbus will
> reassign the worker to another machine.
>
> Does this mean that while the supervisor restarts the worker, or Nimbus
> reassigns it, tuples that would have been processed by tasks on the
> failed worker are routed to other tasks?
>
> If so, is it possible that while the worker is down, fields grouping
> directs tuples to some other task, and that after the worker is
> successfully restarted or reassigned, tuples are routed to tasks on the
> just-restarted worker again?
>
> In that case there is a chance that a tuple with key "1", for example,
> is processed by task 1 while the worker for task 2 is restarting. After
> the worker restarts, a new tuple emitted by the spout with the same key
> "1" would be routed to task 2 on the just-restarted worker, while the
> tuple with key "1" is still being processed on task 1.
>
> Does Storm guarantee that the described situation can never happen, and
> that with fields grouping all tuples with the same key are processed by
> the same task even in the case of worker failure?
>
> Best regards,
> Evgeniy

--
Regards,
Navin
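P.S. For anyone following along, this is roughly how the setup in the question looks. As I understand it, fields grouping hashes the grouped field values modulo the consumer's task count, and task IDs are fixed for the lifetime of the topology, so the key-to-task mapping doesn't change when a worker is restarted or moved. A sketch using the ReplayingSpout above; KeyedBolt and the parallelism numbers are just for illustration:

import java.util.Map;

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;

public class FieldsGroupingExample {

    // Minimal bolt that just logs which task saw which key.
    public static class KeyedBolt extends BaseBasicBolt {
        private int taskId;

        @Override
        public void prepare(Map stormConf, TopologyContext context) {
            taskId = context.getThisTaskId();
        }

        @Override
        public void execute(Tuple input, BasicOutputCollector collector) {
            // A given key should always show up on the same task ID,
            // even across worker restarts.
            System.out.println("task " + taskId + " got "
                    + input.getStringByField("key"));
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // Terminal bolt: nothing to declare.
        }
    }

    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new ReplayingSpout(), 1);
        // Partition on the "key" field: same key -> same bolt task.
        builder.setBolt("keyed-bolt", new KeyedBolt(), 4)
               .fieldsGrouping("spout", new Fields("key"));

        Config conf = new Config();
        conf.setMessageTimeoutSecs(30);  // unacked tuples fail after this

        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("fields-grouping-demo", conf, builder.createTopology());
        Thread.sleep(60_000);
        cluster.shutdown();
    }
}

As far as I know, Storm never reroutes fields-grouped tuples to a different task: tuples headed to a task on a dead worker just fail and get replayed to the same task once it's reassigned, which I believe answers the original concern.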
