Hi Prasun,

Acks and fails should continue to be handled. For step 2 I would consider adding a timeout just in case.
-Nathan

On Thu, May 1, 2014 at 3:36 PM, Prasun Ghosh <[email protected]> wrote:

> Thanks Nathan,
>
> So, my shutdown script should be
> 1. Deactivate the topology
> 2. Wait for the WIP size to become “Zero”.
>    The piece of code that removes from WIP resides in the spout's ack/fail
>    methods. By deactivating the topology and hence the spout, will these pieces
>    of code (in the spout) still execute to remove items from WIP?
>    In short, when I deactivate the topology, are we just pausing the call to
>    “nextTuple()” on the spout, and everything else will continue to work as is?
> 3. Initiate shutdown...
>
> - Thanks,
> Prasun Ghosh
> Apple Inc.
> Information Security
>
> On May 1, 2014, at 12:27 PM, Nathan Leung <[email protected]> wrote:
>
> You can deactivate the topology, which will shut off the spouts. Then
> after a period of time (enough for your bolts to all drain), kill the
> topology. I believe this is what kill with a non-zero timeout does as
> well. Kill with a zero timeout will kill the worker process/es without
> letting them drain, hence the tuples that were not acked or failed.
>
> On Thu, May 1, 2014 at 3:16 PM, P Ghosh <[email protected]> wrote:
>
>> I have a few topologies running. The spout puts the ID of the object it is
>> emitting into a WIP list in Redis. When the spout gets its ack or fail
>> method called, it takes the ID out of the WIP list.
>>
>> The environment and application are undergoing a lot of changes, and as a
>> result I'm required to occasionally restart the topology or the Storm
>> cluster itself.
>>
>> The problem is, when I restart, I see quite a few messages left in WIP,
>> which means that for these messages the spout didn't receive any ack or fail.
>>
>> My restart process has been
>> 1. Kill the topology from the UI (I find killing from the UI more
>>    responsive than from the command line: the killed topology goes off very
>>    quickly, whereas if I do it from the command line, the "killed" topology
>>    remains in the list for a long time, hindering my ability to relaunch the
>>    topology). I typically kill it with a 0 sec. wait time (maybe this is
>>    where I'm going wrong).
>> 2. Go to each VM and stop the
>>    a> supervisor
>>    b> logviewer
>> 3. Go to nimbus and shut down
>>    a> ui/nimbus/logviewer
>> 4. Go to zookeeper and shut down zookeeper
>>
>> I thought this was the proper flow, but I doubt that, given the leftover
>> messages I see in WIP.
>>
>> Any thoughts will be helpful.
>>
>> Thanks,
>> Prasun
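The deactivate / wait-for-WIP-to-drain / kill sequence discussed in this thread, including the suggested timeout on the wait step, could be sketched roughly as below. This is only an illustration under stated assumptions, not anyone's actual shutdown script: the function names `wait_for_drain` and `shutdown_topology` are hypothetical, and the three callables (`deactivate`, `wip_size`, `kill`) are placeholders the caller has to supply for their own cluster and WIP store.

```python
import time
from typing import Callable


def wait_for_drain(wip_size: Callable[[], int],
                   timeout_s: float,
                   poll_interval_s: float = 1.0,
                   clock: Callable[[], float] = time.monotonic,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll wip_size() until it reports 0 or timeout_s has elapsed.

    Returns True if the WIP list drained, False if the timeout expired first.
    wip_size is a caller-supplied (hypothetical) function that returns the
    current number of in-flight IDs in the WIP store.
    """
    deadline = clock() + timeout_s
    while True:
        if wip_size() == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)


def shutdown_topology(deactivate: Callable[[], None],
                      wip_size: Callable[[], int],
                      kill: Callable[[], None],
                      drain_timeout_s: float = 60.0,
                      **wait_kwargs) -> bool:
    """Step 1: deactivate. Step 2: wait for WIP to drain (bounded). Step 3: kill.

    The kill step runs even if the wait times out, so the script cannot hang
    forever; the return value tells the caller whether leftovers are expected.
    """
    deactivate()                      # spouts stop emitting new tuples
    drained = wait_for_drain(wip_size, drain_timeout_s, **wait_kwargs)
    kill()                            # proceed with shutdown either way
    return drained
```

In a real script, `deactivate` and `kill` might wrap the Storm command line, for example `subprocess.run(["storm", "deactivate", name], check=True)` and `subprocess.run(["storm", "kill", name, "-w", "30"], check=True)`, and `wip_size` might be a length query against the Redis key that holds the WIP entries. The topology name, wait time, and Redis key are all assumptions to be replaced with your own values. A `False` return means the timeout expired with entries still in WIP, which is worth logging before relaunching.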
