[ https://issues.apache.org/jira/browse/STORM-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim resolved STORM-1879.
---------------------------------
Resolution: Fixed
Assignee: Jungtaek Lim
Fix Version/s: 1.1.0
               1.0.2
               2.0.0

Resolving this since STORM-1934 was merged.

> Supervisor may not shut down workers cleanly
> --------------------------------------------
>
> Key: STORM-1879
> URL: https://issues.apache.org/jira/browse/STORM-1879
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Affects Versions: 1.0.1
> Reporter: Stig Rohde Døssing
> Assignee: Jungtaek Lim
> Fix For: 2.0.0, 1.0.2, 1.1.0
>
> Attachments: fix_missing_worker_pid.patch, nimbus-supervisor.zip, supervisor.log
>
>
> We've run into a strange issue with a zombie worker process. It looks like
> the worker pid file somehow got deleted without the worker process shutting
> down. This causes the supervisor to try repeatedly to kill the worker
> unsuccessfully, and means multiple workers may be assigned to the same port.
> The worker root folder sticks around because the worker is still heartbeating
> to it.
> It may or may not be related that we've seen Nimbus occasionally enter an
> infinite loop of printing logs similar to the below.
> {code}
> 2016-05-19 14:55:14.196 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormconf.ser
> 2016-05-19 14:55:14.210 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormcode.ser
> 2016-05-19 14:55:14.218 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormconf.ser
> 2016-05-19 14:55:14.256 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormcode.ser
> 2016-05-19 14:55:14.273 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormcode.ser
> 2016-05-19 14:55:14.316 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormconf.ser
> {code}
> Which continues until Nimbus is rebooted. We also see repeating blocks
> similar to the logs below.
> {code}
> 2016-06-02 07:45:03.656 o.a.s.d.nimbus [INFO] Cleaning up ZendeskTicketTopology-127-1464780171
> 2016-06-02 07:45:04.132 o.a.s.d.nimbus [INFO] ExceptionKeyNotFoundException(msg:ZendeskTicketTopology-127-1464780171-stormjar.jar)
> 2016-06-02 07:45:04.144 o.a.s.d.nimbus [INFO] ExceptionKeyNotFoundException(msg:ZendeskTicketTopology-127-1464780171-stormconf.ser)
> 2016-06-02 07:45:04.155 o.a.s.d.nimbus [INFO] ExceptionKeyNotFoundException(msg:ZendeskTicketTopology-127-1464780171-stormcode.ser)
> {code}
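
For readers following the report, the failure mode it describes (a worker whose pid file is gone while it keeps heartbeating) can be illustrated with a minimal sketch. This is not Storm's supervisor code. The directory layout (`pids/` holding one file per pid, plus a `heartbeat` file), the function names, and the 30-second timeout are all hypothetical, chosen only to show why a kill routine driven purely by pid files has nothing to act on in that state.

```python
import time
from pathlib import Path
from typing import Callable, List


def read_worker_pids(worker_root: Path) -> List[int]:
    """Return pids recorded as file names under <worker_root>/pids.

    The pids/ layout is an assumption made for this sketch.
    """
    pid_dir = worker_root / "pids"
    if not pid_dir.is_dir():
        return []
    return sorted(int(p.name) for p in pid_dir.iterdir() if p.name.isdigit())


def heartbeat_is_fresh(worker_root: Path, timeout_secs: float) -> bool:
    """True if the (hypothetical) heartbeat file was modified recently."""
    heartbeat = worker_root / "heartbeat"
    if not heartbeat.exists():
        return False
    return time.time() - heartbeat.stat().st_mtime < timeout_secs


def shutdown_worker(
    worker_root: Path,
    kill: Callable[[int], None],
    timeout_secs: float = 30.0,
) -> str:
    """Kill every recorded pid and report what happened.

    With no pid files there is nothing to kill. If the heartbeat is
    still fresh, the process is presumably alive, so the state is
    reported instead of being treated as a clean shutdown.
    """
    pids = read_worker_pids(worker_root)
    for pid in pids:
        kill(pid)
    if pids:
        return "killed"
    if heartbeat_is_fresh(worker_root, timeout_secs):
        return "possible-zombie"
    return "nothing-to-do"
```

Passing `kill` as a parameter keeps the sketch free of side effects: a real caller might supply something like `lambda pid: os.kill(pid, signal.SIGTERM)`, while a test can pass `list.append` to record which pids would have been signalled.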