Have you figured out the root cause or a fix for this issue? I just hit it and would really appreciate some time-saving advice.

----------
Andrey Yegorov

On Wed, Mar 12, 2014 at 10:31 AM, Josh Walton <[email protected]> wrote:

> Overnight last night, it appears my Storm Trident topology restarted
> itself. When I checked the Storm UI, it said the topology had been
> running for 24 hours, and showed no errors or exceptions in any of the
> bolts.
>
> I checked the nimbus log and saw the following:
>
> 2014-03-12 10:55:06 b.s.d.nimbus [INFO] Executor MITAS3-74-1394565794:[34 34] not alive
> 2014-03-12 10:55:06 b.s.d.nimbus [INFO] Executor MITAS3-74-1394565794:[4 4] not alive
> 2014-03-12 10:55:06 b.s.d.nimbus [INFO] Executor MITAS3-74-1394565794:[40 40] not alive
> 2014-03-12 10:55:06 b.s.d.nimbus [INFO] Executor MITAS3-74-1394565794:[10 10] not alive
> 2014-03-12 10:55:06 b.s.d.nimbus [INFO] Executor MITAS3-74-1394565794:[16 16] not alive
> 2014-03-12 10:55:06 b.s.d.nimbus [INFO] Executor MITAS3-74-1394565794:[22 22] not alive
> 2014-03-12 10:55:06 b.s.d.nimbus [INFO] Executor MITAS3-74-1394565794:[28 28] not alive
> 2014-03-12 10:55:06 b.s.s.EvenScheduler [INFO] Available slots: (["5d105f66-1add-421b-8265-e7340a95928c" 6700] ["32ab1745-c260-4491-ae4d-92dcc5d14a62" 6700])
> 2014-03-12 10:55:06 b.s.d.nimbus [INFO] Reassigning MITAS3-74-1394565794 to 6 slots
> 2014-03-12 10:55:06 b.s.d.nimbus [INFO] Reassign executors: [[34 34] [4 4] [40 40] [10 10] [16 16] [22 22] [28 28]]
>
> It appears that an executor was alive, and must have timed out somehow,
> since I didn't see any exceptions or stack traces in the logs.
>
> Is there a way to change the timeout? I see several timeout settings,
> but I'm not sure if any of those would help prevent this type of
> restart. I am using a custom TridentState which holds data in memory,
> so we lost data as a result of this restart, and I would like to
> prevent this from happening again.
>
> Thanks
>
> Josh
