[ https://issues.apache.org/jira/browse/STORM-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
gareth smith updated STORM-1732:
--------------------------------
Description:
*Let's say a worker has been started by the supervisor*
2016-04-26 16:11:48.716 [o.a.s.d.supervisor] INFO: Launching worker with
assignment {:storm-id "Lightning-1-1461683473", :executors [[12 12] [54 54] [42
42] [24 24] [18 18] [6 6] [48 48] [30 30] [36 36]], :resources
#object[org.apache.storm.generated.WorkerResources 0x10bac1e4
"WorkerResources(mem_on_heap:0.0, mem_off_heap:0.0, cpu:0.0)"]} for this
supervisor 477ae22e-1a2b-4ea3-afd5-cb969f25e732 on port 6700 with id
a5d51626-6e9f-4614-9ebb-a6263c140ca2
2016-04-26 16:11:48.727 [o.a.s.d.supervisor] INFO: Launching worker with
command: 'C:\LightningDeployment\Java\bin\java' '-cp' ........
2016-04-26 16:11:48.910 [o.a.s.config] INFO: SET worker-user
a5d51626-6e9f-4614-9ebb-a6263c140ca2 LIGHTNINGVM14$
*Note: this bit is new for Storm 1.0.0*
2016-04-26 16:11:49.405 [o.a.s.d.supervisor] INFO: Creating symlinks for
worker-id: a5d51626-6e9f-4614-9ebb-a6263c140ca2 storm-id:
Lightning-1-1461683473 to its port artifacts directory
2016-04-26 16:11:50.251 [o.a.s.d.supervisor] INFO: Creating symlinks for
worker-id: a5d51626-6e9f-4614-9ebb-a6263c140ca2 storm-id:
Lightning-1-1461683473 for files(1): ("resources")
*When a worker dies, we correctly see some cleanup and a new worker being started...*
2016-04-26 16:15:35.520 [o.a.s.d.supervisor] INFO: Worker Process
a5d51626-6e9f-4614-9ebb-a6263c140ca2 exited with code: 20
2016-04-26 16:15:39.674 [o.a.s.d.supervisor] INFO: Worker Process
a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
2016-04-26 16:15:39.675 [o.a.s.d.supervisor] INFO: Shutting down and clearing
state for id a5d51626-6e9f-4614-9ebb-a6263c140ca2. Current supervisor time:
1461683739. State: :timed-out, Heartbeat: {:time-secs 1461683734, :storm-id
"Lightning-1-1461683473", :executors [[12 12] [54 54] [42 42] [24 24] [18 18]
[6 6] [48 48] [30 30] [-1 -1] [36 36]], :port 6700}
2016-04-26 16:15:39.676 [o.a.s.d.supervisor] INFO: Shutting down
477ae22e-1a2b-4ea3-afd5-cb969f25e732:a5d51626-6e9f-4614-9ebb-a6263c140ca2
2016-04-26 16:15:39.676 [o.a.s.config] INFO: GET worker-user
a5d51626-6e9f-4614-9ebb-a6263c140ca2
2016-04-26 16:15:39.677 [o.a.s.d.supervisor] INFO: Worker Process
a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
2016-04-26 16:15:39.681 [o.a.s.d.supervisor] INFO: Worker Process
a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
2016-04-26 16:15:39.857 [o.a.s.util] INFO: Error when trying to kill 1352.
Process is probably already dead.
2016-04-26 16:15:39.955 [o.a.s.util] INFO: Error when trying to kill 2372.
Process is probably already dead.
2016-04-26 16:15:40.009 [o.a.s.util] INFO: Error when trying to kill 4932.
Process is probably already dead.
2016-04-26 16:15:40.009 [o.a.s.d.supervisor] INFO: Sleep 10 seconds for
execution of cleanup threads on worker.
2016-04-26 16:15:49.677 [o.a.s.d.supervisor] INFO: Worker Process
a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
2016-04-26 16:15:49.679 [o.a.s.d.supervisor] INFO: Worker Process
a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
2016-04-26 16:15:50.056 [o.a.s.util] INFO: Error when trying to kill 1352.
Process is probably already dead.
2016-04-26 16:15:50.119 [o.a.s.util] INFO: Error when trying to kill 2372.
Process is probably already dead.
2016-04-26 16:15:50.175 [o.a.s.util] INFO: Error when trying to kill 4932.
Process is probably already dead.
2016-04-26 16:15:50.257 [o.a.s.config] INFO: REMOVE worker-user
a5d51626-6e9f-4614-9ebb-a6263c140ca2
2016-04-26 16:15:50.257 [o.a.s.d.supervisor] INFO: Shut down
477ae22e-1a2b-4ea3-afd5-cb969f25e732:a5d51626-6e9f-4614-9ebb-a6263c140ca2
2016-04-26 16:15:50.257 [o.a.s.d.supervisor] INFO: Launching worker with
assignment {:storm-id "Lightning-1-1461683473", :executors [[12 12] [54 54] [42
42] [24 24] [18 18] [6 6] [48 48] [30 30] [36 36]], :resources
#object[org.apache.storm.generated.WorkerResources 0x20e1ad4f
"WorkerResources(mem_on_heap:0.0, mem_off_heap:0.0, cpu:0.0)"]} for this
supervisor 477ae22e-1a2b-4ea3-afd5-cb969f25e732 on port 6700 with id
e413447b-c9ca-417d-8e55-e10dd0edc6a4
*When the worker is cleaned up, it seems the folders that the symlinks
point to are also cleaned up (this may be a Windows-only problem).*
*This is bad because it deletes the contents of the "resources" directory, and
hence any multilang files in those directories.*
*I also think STORM-876 introduced this problem.*
was:
*Let's say a worker has been started by the supervisor*
2016-04-26 16:11:48.716 [o.a.s.d.supervisor] INFO: Launching worker with
assignment {:storm-id "Lightning-1-1461683473", :executors [[12 12] [54 54] [42
42] [24 24] [18 18] [6 6] [48 48] [30 30] [36 36]], :resources
#object[org.apache.storm.generated.WorkerResources 0x10bac1e4
"WorkerResources(mem_on_heap:0.0, mem_off_heap:0.0, cpu:0.0)"]} for this
supervisor 477ae22e-1a2b-4ea3-afd5-cb969f25e732 on port 6700 with id
a5d51626-6e9f-4614-9ebb-a6263c140ca2
2016-04-26 16:11:48.727 [o.a.s.d.supervisor] INFO: Launching worker with
command: 'C:\LightningDeployment\Java\bin\java' '-cp'
'C:\LightningDeployment\Storm\lib\asm-5.0.3.jar;C:\LightningDeployment\Storm\lib\clojure-1.7.0.jar;C:\LightningDeployment\Storm\lib\disruptor-3.3.2.jar;C:\LightningDeployment\Storm\lib\kryo-3.0.3.jar;C:\LightningDeployment\Storm\lib\log4j-api-2.1.jar;C:\LightningDeployment\Storm\lib\log4j-core-2.1.jar;C:\LightningDeployment\Storm\lib\log4j-over-slf4j-1.6.6.jar;C:\LightningDeployment\Storm\lib\log4j-slf4j-impl-2.1.jar;C:\LightningDeployment\Storm\lib\minlog-1.3.0.jar;C:\LightningDeployment\Storm\lib\objenesis-2.1.jar;C:\LightningDeployment\Storm\lib\reflectasm-1.10.1.jar;C:\LightningDeployment\Storm\lib\servlet-api-2.5.jar;C:\LightningDeployment\Storm\lib\slf4j-api-1.7.7.jar;C:\LightningDeployment\Storm\lib\storm-core-1.0.0.jar;C:\LightningDeployment\Storm\lib\storm-rename-hack-1.0.0.jar;C:\LightningDeployment\Storm\conf;C:\LightningDeployment\Storm\storm-local\supervisor\stormdist\Lightning-1-1461683473\stormjar.jar'
'-Xmx64m' '-Dlogfile.name=worker.log'
'-Dstorm.home=C:\LightningDeployment\Storm'
'-Dworkers.artifacts=C:\LightningDeployment\Storm\logs\workers-artifacts'
'-Dstorm.id=Lightning-1-1461683473'
'-Dworker.id=a5d51626-6e9f-4614-9ebb-a6263c140ca2' '-Dworker.port=6700'
'-Dstorm.log.dir=C:\LightningDeployment\Storm\logs'
'-Dlog4j.configurationFile=file:///C:/LightningDeployment\Storm\log4j2\worker.xml'
'-DLog4jContextSelector=org.apache.logging.log4j.core.selector.BasicContextSelector'
'org.apache.storm.LogWriter' 'C:\LightningDeployment\Java\bin\java' '-server'
'-Xmx20g' '-XX:+UseStringDeduplication' '-XX:+UseG1GC'
'-Djava.net.preferIPv4Stack=true'
'-Djava.library.path=C:\LightningDeployment\Storm\storm-local\supervisor\stormdist\Lightning-1-1461683473\resources\Windows_Server_2012_R2-amd64;C:\LightningDeployment\Storm\storm-local\supervisor\stormdist\Lightning-1-1461683473\resources;/usr/local/lib:/opt/local/lib:/usr/lib'
'-Dlogfile.name=worker.log' '-Dstorm.home=C:\LightningDeployment\Storm'
'-Dworkers.artifacts=C:\LightningDeployment\Storm\logs\workers-artifacts'
'-Dstorm.conf.file=' '-Dstorm.options='
'-Dstorm.log.dir=C:\LightningDeployment\Storm\logs'
'-Djava.io.tmpdir=C:\LightningDeployment\Storm\storm-local\workers\a5d51626-6e9f-4614-9ebb-a6263c140ca2\tmp'
'-Dlogging.sensitivity=S3'
'-Dlog4j.configurationFile=file:///C:/LightningDeployment\Storm\log4j2\worker.xml'
'-DLog4jContextSelector=org.apache.logging.log4j.core.selector.BasicContextSelector'
'-Dstorm.id=Lightning-1-1461683473'
'-Dworker.id=a5d51626-6e9f-4614-9ebb-a6263c140ca2' '-Dworker.port=6700' '-cp'
'C:\LightningDeployment\Storm\lib\asm-5.0.3.jar;C:\LightningDeployment\Storm\lib\clojure-1.7.0.jar;C:\LightningDeployment\Storm\lib\disruptor-3.3.2.jar;C:\LightningDeployment\Storm\lib\kryo-3.0.3.jar;C:\LightningDeployment\Storm\lib\log4j-api-2.1.jar;C:\LightningDeployment\Storm\lib\log4j-core-2.1.jar;C:\LightningDeployment\Storm\lib\log4j-over-slf4j-1.6.6.jar;C:\LightningDeployment\Storm\lib\log4j-slf4j-impl-2.1.jar;C:\LightningDeployment\Storm\lib\minlog-1.3.0.jar;C:\LightningDeployment\Storm\lib\objenesis-2.1.jar;C:\LightningDeployment\Storm\lib\reflectasm-1.10.1.jar;C:\LightningDeployment\Storm\lib\servlet-api-2.5.jar;C:\LightningDeployment\Storm\lib\slf4j-api-1.7.7.jar;C:\LightningDeployment\Storm\lib\storm-core-1.0.0.jar;C:\LightningDeployment\Storm\lib\storm-rename-hack-1.0.0.jar;C:\LightningDeployment\Storm\conf;C:\LightningDeployment\Storm\storm-local\supervisor\stormdist\Lightning-1-1461683473\stormjar.jar'
'org.apache.storm.daemon.worker' 'Lightning-1-1461683473'
'477ae22e-1a2b-4ea3-afd5-cb969f25e732' '6700'
'a5d51626-6e9f-4614-9ebb-a6263c140ca2'
2016-04-26 16:11:48.910 [o.a.s.config] INFO: SET worker-user
a5d51626-6e9f-4614-9ebb-a6263c140ca2 LIGHTNINGVM14$
*Note: this bit is new for Storm 1.0.0*
2016-04-26 16:11:49.405 [o.a.s.d.supervisor] INFO: Creating symlinks for
worker-id: a5d51626-6e9f-4614-9ebb-a6263c140ca2 storm-id:
Lightning-1-1461683473 to its port artifacts directory
2016-04-26 16:11:50.251 [o.a.s.d.supervisor] INFO: Creating symlinks for
worker-id: a5d51626-6e9f-4614-9ebb-a6263c140ca2 storm-id:
Lightning-1-1461683473 for files(1): ("resources")
*When a worker dies, we correctly see some cleanup and a new worker being started...*
2016-04-26 16:15:35.520 [o.a.s.d.supervisor] INFO: Worker Process
a5d51626-6e9f-4614-9ebb-a6263c140ca2 exited with code: 20
2016-04-26 16:15:39.674 [o.a.s.d.supervisor] INFO: Worker Process
a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
2016-04-26 16:15:39.675 [o.a.s.d.supervisor] INFO: Shutting down and clearing
state for id a5d51626-6e9f-4614-9ebb-a6263c140ca2. Current supervisor time:
1461683739. State: :timed-out, Heartbeat: {:time-secs 1461683734, :storm-id
"Lightning-1-1461683473", :executors [[12 12] [54 54] [42 42] [24 24] [18 18]
[6 6] [48 48] [30 30] [-1 -1] [36 36]], :port 6700}
2016-04-26 16:15:39.676 [o.a.s.d.supervisor] INFO: Shutting down
477ae22e-1a2b-4ea3-afd5-cb969f25e732:a5d51626-6e9f-4614-9ebb-a6263c140ca2
2016-04-26 16:15:39.676 [o.a.s.config] INFO: GET worker-user
a5d51626-6e9f-4614-9ebb-a6263c140ca2
2016-04-26 16:15:39.677 [o.a.s.d.supervisor] INFO: Worker Process
a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
2016-04-26 16:15:39.681 [o.a.s.d.supervisor] INFO: Worker Process
a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
2016-04-26 16:15:39.857 [o.a.s.util] INFO: Error when trying to kill 1352.
Process is probably already dead.
2016-04-26 16:15:39.955 [o.a.s.util] INFO: Error when trying to kill 2372.
Process is probably already dead.
2016-04-26 16:15:40.009 [o.a.s.util] INFO: Error when trying to kill 4932.
Process is probably already dead.
2016-04-26 16:15:40.009 [o.a.s.d.supervisor] INFO: Sleep 10 seconds for
execution of cleanup threads on worker.
2016-04-26 16:15:49.677 [o.a.s.d.supervisor] INFO: Worker Process
a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
2016-04-26 16:15:49.679 [o.a.s.d.supervisor] INFO: Worker Process
a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
2016-04-26 16:15:50.056 [o.a.s.util] INFO: Error when trying to kill 1352.
Process is probably already dead.
2016-04-26 16:15:50.119 [o.a.s.util] INFO: Error when trying to kill 2372.
Process is probably already dead.
2016-04-26 16:15:50.175 [o.a.s.util] INFO: Error when trying to kill 4932.
Process is probably already dead.
2016-04-26 16:15:50.257 [o.a.s.config] INFO: REMOVE worker-user
a5d51626-6e9f-4614-9ebb-a6263c140ca2
2016-04-26 16:15:50.257 [o.a.s.d.supervisor] INFO: Shut down
477ae22e-1a2b-4ea3-afd5-cb969f25e732:a5d51626-6e9f-4614-9ebb-a6263c140ca2
2016-04-26 16:15:50.257 [o.a.s.d.supervisor] INFO: Launching worker with
assignment {:storm-id "Lightning-1-1461683473", :executors [[12 12] [54 54] [42
42] [24 24] [18 18] [6 6] [48 48] [30 30] [36 36]], :resources
#object[org.apache.storm.generated.WorkerResources 0x20e1ad4f
"WorkerResources(mem_on_heap:0.0, mem_off_heap:0.0, cpu:0.0)"]} for this
supervisor 477ae22e-1a2b-4ea3-afd5-cb969f25e732 on port 6700 with id
e413447b-c9ca-417d-8e55-e10dd0edc6a4
*When the worker is cleaned up, it seems the folders that the symlinks
point to are also cleaned up (this may be a Windows-only problem).*
*This is bad because it deletes the contents of the "resources" directory, and
hence any multilang files in those directories.*
*I also think STORM-876 introduced this problem.*
> Resources are deleted when worker dies
> --------------------------------------
>
> Key: STORM-1732
> URL: https://issues.apache.org/jira/browse/STORM-1732
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Affects Versions: 1.0.0
> Environment: Windows
> Reporter: gareth smith
> Priority: Critical
>
> *Let's say a worker has been started by the supervisor*
> 2016-04-26 16:11:48.716 [o.a.s.d.supervisor] INFO: Launching worker with
> assignment {:storm-id "Lightning-1-1461683473", :executors [[12 12] [54 54]
> [42 42] [24 24] [18 18] [6 6] [48 48] [30 30] [36 36]], :resources
> #object[org.apache.storm.generated.WorkerResources 0x10bac1e4
> "WorkerResources(mem_on_heap:0.0, mem_off_heap:0.0, cpu:0.0)"]} for this
> supervisor 477ae22e-1a2b-4ea3-afd5-cb969f25e732 on port 6700 with id
> a5d51626-6e9f-4614-9ebb-a6263c140ca2
> 2016-04-26 16:11:48.727 [o.a.s.d.supervisor] INFO: Launching worker with
> command: 'C:\LightningDeployment\Java\bin\java' '-cp' ........
> 2016-04-26 16:11:48.910 [o.a.s.config] INFO: SET worker-user
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 LIGHTNINGVM14$
> *Note: this bit is new for Storm 1.0.0*
> 2016-04-26 16:11:49.405 [o.a.s.d.supervisor] INFO: Creating symlinks for
> worker-id: a5d51626-6e9f-4614-9ebb-a6263c140ca2 storm-id:
> Lightning-1-1461683473 to its port artifacts directory
> 2016-04-26 16:11:50.251 [o.a.s.d.supervisor] INFO: Creating symlinks for
> worker-id: a5d51626-6e9f-4614-9ebb-a6263c140ca2 storm-id:
> Lightning-1-1461683473 for files(1): ("resources")
> *When a worker dies, we correctly see some cleanup and a new worker being
> started...*
> 2016-04-26 16:15:35.520 [o.a.s.d.supervisor] INFO: Worker Process
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 exited with code: 20
> 2016-04-26 16:15:39.674 [o.a.s.d.supervisor] INFO: Worker Process
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
> 2016-04-26 16:15:39.675 [o.a.s.d.supervisor] INFO: Shutting down and clearing
> state for id a5d51626-6e9f-4614-9ebb-a6263c140ca2. Current supervisor time:
> 1461683739. State: :timed-out, Heartbeat: {:time-secs 1461683734, :storm-id
> "Lightning-1-1461683473", :executors [[12 12] [54 54] [42 42] [24 24] [18 18]
> [6 6] [48 48] [30 30] [-1 -1] [36 36]], :port 6700}
> 2016-04-26 16:15:39.676 [o.a.s.d.supervisor] INFO: Shutting down
> 477ae22e-1a2b-4ea3-afd5-cb969f25e732:a5d51626-6e9f-4614-9ebb-a6263c140ca2
> 2016-04-26 16:15:39.676 [o.a.s.config] INFO: GET worker-user
> a5d51626-6e9f-4614-9ebb-a6263c140ca2
> 2016-04-26 16:15:39.677 [o.a.s.d.supervisor] INFO: Worker Process
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
> 2016-04-26 16:15:39.681 [o.a.s.d.supervisor] INFO: Worker Process
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
> 2016-04-26 16:15:39.857 [o.a.s.util] INFO: Error when trying to kill 1352.
> Process is probably already dead.
> 2016-04-26 16:15:39.955 [o.a.s.util] INFO: Error when trying to kill 2372.
> Process is probably already dead.
> 2016-04-26 16:15:40.009 [o.a.s.util] INFO: Error when trying to kill 4932.
> Process is probably already dead.
> 2016-04-26 16:15:40.009 [o.a.s.d.supervisor] INFO: Sleep 10 seconds for
> execution of cleanup threads on worker.
> 2016-04-26 16:15:49.677 [o.a.s.d.supervisor] INFO: Worker Process
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
> 2016-04-26 16:15:49.679 [o.a.s.d.supervisor] INFO: Worker Process
> a5d51626-6e9f-4614-9ebb-a6263c140ca2 has died!
> 2016-04-26 16:15:50.056 [o.a.s.util] INFO: Error when trying to kill 1352.
> Process is probably already dead.
> 2016-04-26 16:15:50.119 [o.a.s.util] INFO: Error when trying to kill 2372.
> Process is probably already dead.
> 2016-04-26 16:15:50.175 [o.a.s.util] INFO: Error when trying to kill 4932.
> Process is probably already dead.
> 2016-04-26 16:15:50.257 [o.a.s.config] INFO: REMOVE worker-user
> a5d51626-6e9f-4614-9ebb-a6263c140ca2
> 2016-04-26 16:15:50.257 [o.a.s.d.supervisor] INFO: Shut down
> 477ae22e-1a2b-4ea3-afd5-cb969f25e732:a5d51626-6e9f-4614-9ebb-a6263c140ca2
> 2016-04-26 16:15:50.257 [o.a.s.d.supervisor] INFO: Launching worker with
> assignment {:storm-id "Lightning-1-1461683473", :executors [[12 12] [54 54]
> [42 42] [24 24] [18 18] [6 6] [48 48] [30 30] [36 36]], :resources
> #object[org.apache.storm.generated.WorkerResources 0x20e1ad4f
> "WorkerResources(mem_on_heap:0.0, mem_off_heap:0.0, cpu:0.0)"]} for this
> supervisor 477ae22e-1a2b-4ea3-afd5-cb969f25e732 on port 6700 with id
> e413447b-c9ca-417d-8e55-e10dd0edc6a4
> *When the worker is cleaned up, it seems the folders that the symlinks
> point to are also cleaned up (this may be a Windows-only problem).*
> *This is bad because it deletes the contents of the "resources" directory,
> and hence any multilang files in those directories.*
> *I also think STORM-876 introduced this problem.*
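
To illustrate the failure mode described above, here is a minimal Java sketch of a recursive delete that removes a symlink found inside a worker directory without touching the directory the link points to. This is not Storm's supervisor code and not a confirmed fix for this issue: the class name SafeDelete, the method deleteTree, and every path in main are hypothetical, chosen only for the demonstration. It relies on the documented behaviour that Files.walkFileTree does not follow symbolic links unless FOLLOW_LINKS is requested. Behaviour with Windows junctions, or on accounts that lack the privilege to create symlinks, is not covered here.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper for illustration only; not part of Apache Storm.
public class SafeDelete {

    /**
     * Recursively deletes root. Symbolic links found inside the tree are
     * removed as links; the files and directories they point to are kept.
     */
    public static void deleteTree(Path root) throws IOException {
        // Without FOLLOW_LINKS, a link to a directory is handed to visitFile
        // instead of being descended into.
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file); // a regular file, or the link itself
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.delete(dir); // all entries were removed above
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        // Made-up layout loosely mirroring stormdist/<topology>/resources
        // and workers/<worker-id>/resources (a symlink).
        Path base = Files.createTempDirectory("storm-1732-demo");
        Path resources = Files.createDirectories(
                base.resolve("stormdist").resolve("resources"));
        Path script = Files.write(resources.resolve("bolt.py"),
                "print('hi')".getBytes());
        Path workerDir = Files.createDirectories(
                base.resolve("workers").resolve("worker-1"));
        Files.createSymbolicLink(workerDir.resolve("resources"), resources);

        deleteTree(workerDir);

        System.out.println("worker dir exists: " + Files.exists(workerDir));
        System.out.println("resource file exists: " + Files.exists(script));
    }
}
```

A cleanup routine that instead resolves or follows the link before deleting would empty the shared "resources" directory, which matches the symptom reported here. Whether that is the actual cause in Storm 1.0.0 is not established by the logs above.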
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)