When Nimbus went down, other topologies were still processing messages correctly. Only the topology that lost half of its workers stopped processing messages.
Actually, now that I've said that: I am using 1 spout for 2 workers. Maybe the worker that went down was hosting the spout, and that's why Storm wasn't processing messages. I am going to try setting the number of spouts equal to the number of workers in the topology. Maybe that will fix this issue.

Thanks,
Ganesh

From: Annabel Melongo [mailto:[email protected]]
Sent: Friday, January 08, 2016 5:16 PM
To: [email protected]
Subject: Re: Storm worker crash scenario

Ganesh,

Nimbus is a sort of JobTracker. It makes sense that the job resumes only after Nimbus starts working correctly again; otherwise, the state of the running job would have been lost.

Thanks

On Friday, January 8, 2016 1:07 PM, Ganesh Chandrasekaran <[email protected]> wrote:

I wanted to understand how Storm behaves when one of its workers crashes. Here is the situation I ran into recently. My topology is distributed across 2 workers with a total of 6 threads. 3 threads died because one worker went down. At the same time, the Nimbus service was also down, so it could not spin up replacement threads on other available workers. I noticed that Storm wasn't processing messages for the topology until Nimbus was restored and spun up the threads that had died. Is this the expected behavior? I was expecting Storm to continue processing messages with the half of the threads still up on the other worker.

Thanks,
Ganesh
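The fix proposed above (a spout parallelism hint matching the worker count, so each worker is likely to host a spout executor) might be sketched roughly like this against a Storm 1.x-style API. This is a minimal, untested illustration, not the poster's actual code; the component IDs, the `MyKafkaSpout`/`MyBolt` classes, and the topology name are all placeholders:

```java
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class TwoWorkerTopology {
    public static void main(String[] args) throws Exception {
        int numWorkers = 2;

        TopologyBuilder builder = new TopologyBuilder();
        // Parallelism hint = number of workers, so the scheduler can place
        // one spout executor per worker; if one worker dies, the spout
        // executor on the surviving worker can keep emitting tuples.
        builder.setSpout("my-spout", new MyKafkaSpout(), numWorkers);
        builder.setBolt("my-bolt", new MyBolt(), 4)
               .shuffleGrouping("my-spout");

        Config conf = new Config();
        conf.setNumWorkers(numWorkers);

        StormSubmitter.submitTopology("two-worker-topology", conf, builder.createTopology());
    }
}
```

Note the caveat: the parallelism hint controls the number of executors, but Storm's scheduler decides their placement, so this raises the odds of one spout per worker rather than guaranteeing it.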
