Stig Rohde Døssing created STORM-1879:
-----------------------------------------

             Summary: Supervisor may not shut down workers cleanly
                 Key: STORM-1879
                 URL: https://issues.apache.org/jira/browse/STORM-1879
             Project: Apache Storm
          Issue Type: Bug
          Components: storm-core
    Affects Versions: 1.0.1
            Reporter: Stig Rohde Døssing


We've run into a strange issue with a zombie worker process. It looks like the 
worker's pid file was somehow deleted without the worker process shutting down. 
As a result, the supervisor repeatedly tries and fails to kill the worker, and 
multiple workers may end up assigned to the same port. The worker root folder 
sticks around because the worker is still heartbeating to it.
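
To illustrate the suspected failure mode, here is a minimal sketch. It is not 
Storm's actual supervisor code; it only assumes that the supervisor learns which 
processes to kill from pid files in a per-worker directory. The function names 
`pids_to_kill` and `demo`, the pid 4242, and the directory layout are 
hypothetical.

```python
import os
import tempfile


def pids_to_kill(worker_pids_dir):
    """Return the pids recorded as file names under a worker's pids directory.

    Hypothetical stand-in for a supervisor that only knows about worker
    processes through their pid files.
    """
    if not os.path.isdir(worker_pids_dir):
        return []
    return sorted(int(name) for name in os.listdir(worker_pids_dir) if name.isdigit())


def demo():
    """Show that a missing pid file leaves the supervisor with nothing to kill."""
    with tempfile.TemporaryDirectory() as root:
        pids_dir = os.path.join(root, "pids")
        os.makedirs(pids_dir)
        # Assumption: the worker records its pid as an empty file named after the pid.
        open(os.path.join(pids_dir, "4242"), "w").close()
        before = pids_to_kill(pids_dir)
        # The pid file disappears while the (imaginary) process keeps running.
        os.remove(os.path.join(pids_dir, "4242"))
        after = pids_to_kill(pids_dir)
        return before, after
```

Under these assumptions, once the pid file is gone the kill step has no pid to 
act on, while the still-running worker keeps heartbeating and holding its port.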

This may or may not be related, but we've also seen Nimbus occasionally enter an 
infinite loop of printing logs similar to the following:

{code}
2016-05-19 14:55:14.196 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormconf.ser
2016-05-19 14:55:14.210 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormcode.ser
2016-05-19 14:55:14.218 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormconf.ser
2016-05-19 14:55:14.256 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormcode.ser
2016-05-19 14:55:14.273 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormcode.ser
2016-05-19 14:55:14.316 o.a.s.b.BlobStoreUtils [ERROR] Could not update the blob with keyZendeskTicketTopology-5-1463647641-stormconf.ser
{code}

This continues until Nimbus is rebooted. We also see repeating blocks similar 
to the logs below.

{code}
2016-06-02 07:45:03.656 o.a.s.d.nimbus [INFO] Cleaning up ZendeskTicketTopology-127-1464780171
2016-06-02 07:45:04.132 o.a.s.d.nimbus [INFO] ExceptionKeyNotFoundException(msg:ZendeskTicketTopology-127-1464780171-stormjar.jar)
2016-06-02 07:45:04.144 o.a.s.d.nimbus [INFO] ExceptionKeyNotFoundException(msg:ZendeskTicketTopology-127-1464780171-stormconf.ser)
2016-06-02 07:45:04.155 o.a.s.d.nimbus [INFO] ExceptionKeyNotFoundException(msg:ZendeskTicketTopology-127-1464780171-stormcode.ser)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
