Stig Rohde Døssing created STORM-1879:
-----------------------------------------
Summary: Supervisor may not shut down workers cleanly
Key: STORM-1879
URL: https://issues.apache.org/jira/browse/STORM-1879
Project: Apache Storm
Issue Type: Bug
Components: storm-core
Affects Versions: 1.0.1
Reporter: Stig Rohde Døssing
We've run into a strange issue with a zombie worker process. It looks like the
worker pid file was somehow deleted without the worker process shutting down.
As a result, the supervisor repeatedly tries and fails to kill the worker, and
multiple workers may end up assigned to the same port. The worker root folder
sticks around because the worker is still heartbeating to it.
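As a rough diagnostic for the state described above, the following sketch scans a supervisor's workers directory for worker folders that still receive heartbeats but have no pid file. It is only an illustration under assumptions: the `pids` and `heartbeats` subfolder names, the function names, and the 30-second freshness threshold are not taken from this report and should be checked against the actual installation.

```python
import os
import time


def newest_mtime(directory):
    """Return the newest modification time of any entry in directory, or None."""
    if not os.path.isdir(directory):
        return None
    mtimes = [os.path.getmtime(os.path.join(directory, name))
              for name in os.listdir(directory)]
    return max(mtimes) if mtimes else None


def find_orphaned_workers(workers_root, max_heartbeat_age_secs=30.0, now=None):
    """List worker ids with a fresh heartbeat but no pid file.

    Assumed (hypothetical) layout: <workers_root>/<worker-id>/pids/<pid>
    and <workers_root>/<worker-id>/heartbeats/<files>.
    """
    now = time.time() if now is None else now
    orphaned = []
    for worker_id in sorted(os.listdir(workers_root)):
        worker_dir = os.path.join(workers_root, worker_id)
        if not os.path.isdir(worker_dir):
            continue
        pids_dir = os.path.join(worker_dir, "pids")
        has_pid_file = os.path.isdir(pids_dir) and bool(os.listdir(pids_dir))
        last_heartbeat = newest_mtime(os.path.join(worker_dir, "heartbeats"))
        is_heartbeating = (last_heartbeat is not None
                           and now - last_heartbeat <= max_heartbeat_age_secs)
        # A worker that still heartbeats but has no pid file cannot be
        # killed by pid, which matches the symptom in this report.
        if is_heartbeating and not has_pid_file:
            orphaned.append(worker_id)
    return orphaned
```

Calling `find_orphaned_workers` on the workers folder under the supervisor's local directory would return the ids of suspect workers. The process itself would still have to be located by other means, for example by checking which process holds the worker port.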
It may or may not be related, but we've also seen Nimbus occasionally enter an
infinite loop of printing log lines like the ones below.
{code}
2016-05-19 14:55:14.196 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormconf.ser
2016-05-19 14:55:14.210 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormcode.ser
2016-05-19 14:55:14.218 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormconf.ser
2016-05-19 14:55:14.256 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormcode.ser
2016-05-19 14:55:14.273 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormcode.ser
2016-05-19 14:55:14.316 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormconf.ser
{code}
This continues until Nimbus is rebooted. We also see repeating blocks similar
to the logs below.
{code}
2016-06-02 07:45:03.656 o.a.s.d.nimbus [INFO] Cleaning up ZendeskTicketTopology-127-1464780171
2016-06-02 07:45:04.132 o.a.s.d.nimbus [INFO] ExceptionKeyNotFoundException(msg:ZendeskTicketTopology-127-1464780171-stormjar.jar)
2016-06-02 07:45:04.144 o.a.s.d.nimbus [INFO] ExceptionKeyNotFoundException(msg:ZendeskTicketTopology-127-1464780171-stormconf.ser)
2016-06-02 07:45:04.155 o.a.s.d.nimbus [INFO] ExceptionKeyNotFoundException(msg:ZendeskTicketTopology-127-1464780171-stormcode.ser)
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)