[ https://issues.apache.org/jira/browse/STORM-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim resolved STORM-1750.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 1.0.2
                   0.10.2
                   2.0.0

Thanks [~Srdo], I merged into master, 1.x, 1.0.x, 0.10.x branches.

> Report-error-and-die may not kill the worker
> --------------------------------------------
>
>                 Key: STORM-1750
>                 URL: https://issues.apache.org/jira/browse/STORM-1750
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>    Affects Versions: 0.10.0, 1.0.0, 2.0.0
>            Reporter: Stig Rohde Døssing
>            Assignee: Stig Rohde Døssing
>            Priority: Critical
>             Fix For: 2.0.0, 0.10.2, 1.0.2
>
>
> The report-error-and-die function in executor.clj calls report-error, which
> can throw exceptions if Curator runs into any kind of trouble while
> registering the error. I suspect this may happen with network errors, but it
> can also happen if two executors for the same component throw exceptions at
> the same time and no errors have been registered for the component
> previously. This is because both calls to report-error-and-die update the
> lastErrorPath, and ZkStateStorage set_data doesn't catch the potential
> NodeExistsException that may be thrown from the create call.
> If an exception is thrown from report-error, the suicide-fn is never called,
> and the worker keeps running sans the crashed executor.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
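
The failure mode in the description comes down to control flow: if the reporting step throws, the kill step is never reached. Below is a minimal Java sketch of one way to guard against that. It is not necessarily the change that was merged. The class ReportErrorAndDieSketch, the method reportErrorAndDie and its parameters are hypothetical stand-ins for the Clojure functions named in the issue, and the failing reporter only simulates the exception that might come from Curator.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

class ReportErrorAndDieSketch {

    // Hypothetical stand-in for report-error-and-die in executor.clj.
    // reportError plays the role of report-error; suicideFn plays the role of suicide-fn.
    static void reportErrorAndDie(Consumer<Throwable> reportError,
                                  Runnable suicideFn,
                                  Throwable error) {
        try {
            // May throw, e.g. when the error node cannot be written to ZooKeeper.
            reportError.accept(error);
        } catch (Exception e) {
            // A failure to register the error is logged, not propagated.
            System.err.println("Could not report error: " + e);
        } finally {
            // Runs whether or not reporting succeeded, so the worker still dies.
            suicideFn.run();
        }
    }

    public static void main(String[] args) {
        AtomicBoolean died = new AtomicBoolean(false);

        // Simulated reporter that always fails, standing in for a Curator error.
        Consumer<Throwable> failingReporter = t -> {
            throw new IllegalStateException("simulated NodeExistsException");
        };

        reportErrorAndDie(failingReporter,
                          () -> died.set(true),
                          new RuntimeException("executor crashed"));

        System.out.println("suicide-fn called: " + died.get()); // prints "suicide-fn called: true"
    }
}
```

Putting the kill step in a finally block means it does not depend on how the reporting step fails. A complementary measure, not shown here, would be to tolerate NodeExistsException in the create call itself.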