[ 
https://issues.apache.org/jira/browse/STORM-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15331520#comment-15331520
 ] 

Simon Whittemore commented on STORM-1732:
-----------------------------------------

I think we are seeing this issue as well, also on Windows, with Storm 1.0.1.

We can reproduce this issue by "rebalancing" the topology. After the rebalance, 
some of our multilang resource files are missing.

> Resources are deleted when worker dies
> --------------------------------------
>
>                 Key: STORM-1732
>                 URL: https://issues.apache.org/jira/browse/STORM-1732
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>    Affects Versions: 1.0.0
>         Environment: Windows
>            Reporter: gareth smith
>            Priority: Critical
>         Attachments: potentalFix.patch
>
>
> *Let's say a worker has been started by the supervisor*
> 2016-04-26 16:11:48.716 [o.a.s.d.supervisor] INFO: Launching worker with 
> assignment {:storm-id "Lightning-1-1461683473", :executors [[12 12] [54 54] 
> [42 42] [24 24] [18 18] [6 6] [48 48] [30 30] [36 36]], :resources 
> #object[org.apache.storm.generated.WorkerResources 0x10bac1e4 
> "WorkerResources(mem_on_heap:0.0, mem_off_heap:0.0, cpu:0.0)"]} for this 
> supervisor 477ae22e-1a2b-4ea3-afd5-cb969f25e732 on port 6700 with id 
> a5d51626-6e9f-4614-9ebb-a6263c140ca2
> 2016-04-26 16:11:48.727 [o.a.s.d.supervisor] INFO: Launching worker with 
> command: 'C:\LightningDeployment\Java\bin\java' '-cp' ........ 
> 2016-04-26 16:11:48.910 [o.a.s.config] INFO: SET worker-user 
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 LIGHTNINGVM14$
> *Note: this bit is new for Storm 1.0.0*
> 2016-04-26 16:11:49.405 [o.a.s.d.supervisor] INFO: Creating symlinks for 
> worker-id: a5d51626-6e9f-4614-9ebb-a6263c140ca2 storm-id: 
> Lightning-1-1461683473 to its port artifacts directory
> 2016-04-26 16:11:50.251 [o.a.s.d.supervisor] INFO: Creating symlinks for 
> worker-id: a5d51626-6e9f-4614-9ebb-a6263c140ca2 storm-id: 
> Lightning-1-1461683473 for files(1): ("resources")
> *When a worker dies we correctly see some clean up and a new worker 
> started...*
> 2016-04-26 16:15:35.520 [o.a.s.d.supervisor] INFO: Worker Process 
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 exited with code: 20
> 2016-04-26 16:15:39.674 [o.a.s.d.supervisor] INFO: Worker Process 
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
> 2016-04-26 16:15:39.675 [o.a.s.d.supervisor] INFO: Shutting down and clearing 
> state for id a5d51626-6e9f-4614-9ebb-a6263c140ca2. Current supervisor time: 
> 1461683739. State: :timed-out, Heartbeat: {:time-secs 1461683734, :storm-id 
> "Lightning-1-1461683473", :executors [[12 12] [54 54] [42 42] [24 24] [18 18] 
> [6 6] [48 48] [30 30] [-1 -1] [36 36]], :port 6700}
> 2016-04-26 16:15:39.676 [o.a.s.d.supervisor] INFO: Shutting down 
> 477ae22e-1a2b-4ea3-afd5-cb969f25e732:a5d51626-6e9f-4614-9ebb-a6263c140ca2
> 2016-04-26 16:15:39.676 [o.a.s.config] INFO: GET worker-user 
> a5d51626-6e9f-4614-9ebb-a6263c140ca2
> 2016-04-26 16:15:39.677 [o.a.s.d.supervisor] INFO: Worker Process 
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
> 2016-04-26 16:15:39.681 [o.a.s.d.supervisor] INFO: Worker Process 
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
> 2016-04-26 16:15:39.857 [o.a.s.util] INFO: Error when trying to kill 1352. 
> Process is probably already dead.
> 2016-04-26 16:15:39.955 [o.a.s.util] INFO: Error when trying to kill 2372. 
> Process is probably already dead.
> 2016-04-26 16:15:40.009 [o.a.s.util] INFO: Error when trying to kill 4932. 
> Process is probably already dead.
> 2016-04-26 16:15:40.009 [o.a.s.d.supervisor] INFO: Sleep 10 seconds for 
> execution of cleanup threads on worker.
> 2016-04-26 16:15:49.677 [o.a.s.d.supervisor] INFO: Worker Process 
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
> 2016-04-26 16:15:49.679 [o.a.s.d.supervisor] INFO: Worker Process 
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
> 2016-04-26 16:15:50.056 [o.a.s.util] INFO: Error when trying to kill 1352. 
> Process is probably already dead.
> 2016-04-26 16:15:50.119 [o.a.s.util] INFO: Error when trying to kill 2372. 
> Process is probably already dead.
> 2016-04-26 16:15:50.175 [o.a.s.util] INFO: Error when trying to kill 4932. 
> Process is probably already dead.
> 2016-04-26 16:15:50.257 [o.a.s.config] INFO: REMOVE worker-user 
> a5d51626-6e9f-4614-9ebb-a6263c140ca2
> 2016-04-26 16:15:50.257 [o.a.s.d.supervisor] INFO: Shut down 
> 477ae22e-1a2b-4ea3-afd5-cb969f25e732:a5d51626-6e9f-4614-9ebb-a6263c140ca2
> 2016-04-26 16:15:50.257 [o.a.s.d.supervisor] INFO: Launching worker with 
> assignment {:storm-id "Lightning-1-1461683473", :executors [[12 12] [54 54] 
> [42 42] [24 24] [18 18] [6 6] [48 48] [30 30] [36 36]], :resources 
> #object[org.apache.storm.generated.WorkerResources 0x20e1ad4f 
> "WorkerResources(mem_on_heap:0.0, mem_off_heap:0.0, cpu:0.0)"]} for this 
> supervisor 477ae22e-1a2b-4ea3-afd5-cb969f25e732 on port 6700 with id 
> e413447b-c9ca-417d-8e55-e10dd0edc6a4
> *When the worker has been cleaned up, it seems the folders that the symlinks 
> point to are also deleted (this may be a Windows-only problem).*
> *This is bad because it deletes the contents of the "resources" directory, and 
> hence any multilang files that were in those directories.*
> *I also think STORM-876 introduced this problem.*
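
To illustrate the failure mode described above, here is a minimal sketch in Python (not Storm's actual cleanup code) of a directory cleanup that removes a symlink itself instead of descending into its target. The directory layout, the file name bolt.py, and the helper names remove_worker_dir and demo are hypothetical and chosen only for this example. It also assumes the links are real symlinks rather than NTFS junctions, and that the process is allowed to create symlinks.

```python
import os
import shutil
import tempfile


def remove_worker_dir(path):
    """Delete a directory tree, unlinking symlinks instead of following them."""
    for entry in os.scandir(path):
        if entry.is_symlink():
            # Remove only the link; the directory it points to is left alone.
            os.unlink(entry.path)
        elif entry.is_dir(follow_symlinks=False):
            remove_worker_dir(entry.path)
        else:
            os.unlink(entry.path)
    os.rmdir(path)


def demo():
    """Build a hypothetical layout, clean up the worker dir, report the result."""
    root = tempfile.mkdtemp()
    # Hypothetical shared "resources" directory holding a multilang script.
    shared = os.path.join(root, "stormdist", "resources")
    os.makedirs(shared)
    script = os.path.join(shared, "bolt.py")
    with open(script, "w") as f:
        f.write("# multilang script\n")
    # Hypothetical worker directory containing a symlink to the shared files.
    worker = os.path.join(root, "workers", "worker-1")
    os.makedirs(worker)
    os.symlink(shared, os.path.join(worker, "resources"), target_is_directory=True)

    remove_worker_dir(worker)
    worker_gone = not os.path.exists(worker)
    script_survived = os.path.exists(script)
    shutil.rmtree(root)
    return worker_gone, script_survived
```

Calling demo() returns a pair of booleans: whether the worker directory was removed and whether the shared script is still present. A cleanup that instead recursed through the "resources" link would empty the shared directory, which matches the symptom reported here.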



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
