Sunsy Sun created STORM-3784:
--------------------------------
Summary: Supervisors shut down at 2:00 am every day
Key: STORM-3784
URL: https://issues.apache.org/jira/browse/STORM-3784
Project: Apache Storm
Issue Type: Bug
Components: storm-server
Affects Versions: 2.1.0
Environment: CentOS 7 x64
Reporter: Sunsy Sun
The cluster has one Nimbus and two supervisors. One of the supervisors runs
on the same host as Nimbus.
I deployed two topologies, PradarLinkTopology and PradarLogTopology.
PradarLogTopology runs with 4 workers. PradarLinkTopology runs with 1 worker.
At 2:00 am every day, all supervisors shut down. I haven't found the
reason.
I tried cleaning up the status directory, but the problem still exists.
This is my supervisor.log:
{code:java}
// code placeholder
2021-07-21 02:03:42.070 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28:Error occurred during initialization of VM
2021-07-21 02:03:42.070 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28:Error occurred during initialization of VM
2021-07-21 02:03:42.071 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28:java.lang.Error: Properties init: Could not determine current working directory.
2021-07-21 02:03:42.071 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28: at java.lang.System.initProperties(Native Method)
2021-07-21 02:03:42.071 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28: at java.lang.System.initializeSystemClass(System.java:1166)
2021-07-21 02:03:42.071 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28:
2021-07-21 02:03:42.323 o.a.s.d.s.BasicContainer SLOT_6702 [INFO] Removed Worker ID dcae9231-4be4-4842-9ed0-988e1b8a2b28
2021-07-21 02:03:42.329 o.a.s.d.s.Slot SLOT_6702 [INFO] STATE kill msInState: 68588 topo:PradarLogTopology-3-1626751922 worker:null -> empty msInState: 3
2021-07-21 02:03:42.329 o.a.s.d.s.Slot SLOT_6702 [INFO] SLOT 6702: Changing current assignment from LocalAssignment(topology_id:PradarLogTopology-3-1626751922, executors:[ExecutorInfo(task_start:4, task_end:4), ExecutorInfo(task_start:1, task_end:1)], resources:WorkerResources(mem_on_heap:256.0, mem_off_heap:0.0, cpu:20.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, resources:{offheap.memory.mb=0.0, onheap.memory.mb=256.0, cpu.pcore.percent=20.0}, shared_resources:{}), owner:root) to null
2021-07-21 02:03:42.353 o.a.s.d.s.Supervisor pool-10-thread-1 [WARN] Topology config is not localized yet...
2021-07-21 02:03:42.449 o.a.s.d.s.Slot SLOT_6700 [INFO] SLOT 6700 all processes are dead...
2021-07-21 02:03:42.449 o.a.s.d.s.Container SLOT_6700 [INFO] Cleaning up 8cbbfd6c-961b-482d-9175-cf9b79473808-172.26.137.86:b7963273-452a-43af-bc00-d814e0629f96
2021-07-21 02:03:42.450 o.a.s.d.s.Container SLOT_6700 [INFO] GET worker-user for b7963273-452a-43af-bc00-d814e0629f96
2021-07-21 02:03:42.450 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/b7963273-452a-43af-bc00-d814e0629f96/pids/16326
2021-07-21 02:03:43.322 o.a.s.d.s.AdvancedFSOps SLOT_6701 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/26b5ffbd-08b6-46df-aa04-6b86f78b8ad8/pids
2021-07-21 02:03:43.322 o.a.s.d.s.AdvancedFSOps SLOT_6701 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/26b5ffbd-08b6-46df-aa04-6b86f78b8ad8/tmp
2021-07-21 02:03:45.209 o.a.s.d.s.BasicContainer Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28 exited with code: 1
2021-07-21 02:03:45.224 o.a.s.d.s.AdvancedFSOps SLOT_6701 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/26b5ffbd-08b6-46df-aa04-6b86f78b8ad8
2021-07-21 02:03:45.224 o.a.s.d.s.Supervisor pool-10-thread-7 [WARN] Topology config is not localized yet...
2021-07-21 02:03:45.224 o.a.s.d.s.Container SLOT_6701 [INFO] REMOVE worker-user 26b5ffbd-08b6-46df-aa04-6b86f78b8ad8
2021-07-21 02:03:45.224 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/b7963273-452a-43af-bc00-d814e0629f96/heartbeats
2021-07-21 02:03:45.224 o.a.s.d.s.AdvancedFSOps SLOT_6701 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers-users/26b5ffbd-08b6-46df-aa04-6b86f78b8ad8
2021-07-21 02:03:45.224 o.a.s.t.ProcessFunction pool-10-thread-7 [ERROR] Internal error processing sendSupervisorWorkerHeartbeat
org.apache.storm.utils.WrappedNotAliveException: PradarLinkTopology-2-1626337413 does not appear to be alive, you should probably exit
    at org.apache.storm.daemon.supervisor.Supervisor$1.sendSupervisorWorkerHeartbeat(Supervisor.java:442) ~[storm-server-2.1.0.jar:2.1.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:374) ~[storm-client-2.1.0.jar:2.1.0]
    at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:353) ~[storm-client-2.1.0.jar:2.1.0]
    at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.1.0.jar:2.1.0]
    at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:39) [storm-shaded-deps-2.1.0.jar:2.1.0]
    at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:174) [storm-client-2.1.0.jar:2.1.0]
    at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518) [storm-shaded-deps-2.1.0.jar:2.1.0]
    at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.1.0.jar:2.1.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
2021-07-21 02:03:45.225 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/b7963273-452a-43af-bc00-d814e0629f96/pids
2021-07-21 02:03:45.225 o.a.s.d.s.BasicContainer Thread-16 [INFO] Worker Process b7963273-452a-43af-bc00-d814e0629f96 exited with code: 254
2021-07-21 02:03:45.225 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/b7963273-452a-43af-bc00-d814e0629f96/tmp
2021-07-21 02:03:45.226 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/b7963273-452a-43af-bc00-d814e0629f96
2021-07-21 02:03:45.226 o.a.s.d.s.Container SLOT_6700 [INFO] REMOVE worker-user b7963273-452a-43af-bc00-d814e0629f96
2021-07-21 02:03:45.226 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers-users/b7963273-452a-43af-bc00-d814e0629f96
2021-07-21 02:03:45.227 o.a.s.d.s.BasicContainer SLOT_6701 [INFO] Removed Worker ID 26b5ffbd-08b6-46df-aa04-6b86f78b8ad8
2021-07-21 02:03:45.228 o.a.s.d.s.BasicContainer SLOT_6700 [INFO] Removed Worker ID b7963273-452a-43af-bc00-d814e0629f96
2021-07-21 02:03:45.229 o.a.s.d.s.Slot SLOT_6700 [INFO] STATE kill msInState: 81385 topo:PradarLogTopology-3-1626751922 worker:null -> empty msInState: 0
2021-07-21 02:03:45.229 o.a.s.d.s.Slot SLOT_6700 [INFO] SLOT 6700: Changing current assignment from LocalAssignment(topology_id:PradarLogTopology-3-1626751922, executors:[ExecutorInfo(task_start:3, task_end:3)], resources:WorkerResources(mem_on_heap:128.0, mem_off_heap:0.0, cpu:10.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, resources:{offheap.memory.mb=0.0, onheap.memory.mb=128.0, cpu.pcore.percent=10.0}, shared_resources:{}), owner:root) to null
2021-07-21 02:03:45.230 o.a.s.d.s.Slot SLOT_6701 [INFO] STATE kill-and-relaunch msInState: 95356 topo:PradarLogTopology-3-1626751922 worker:null -> waiting-for-blob-localization msInState: 1
2021-07-21 02:03:45.231 o.a.s.d.s.Slot SLOT_6701 [INFO] SLOT 6701: Changing current assignment from LocalAssignment(topology_id:PradarLogTopology-3-1626751922, executors:[ExecutorInfo(task_start:3, task_end:3)], resources:WorkerResources(mem_on_heap:128.0, mem_off_heap:0.0, cpu:10.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, resources:{offheap.memory.mb=0.0, onheap.memory.mb=128.0, cpu.pcore.percent=10.0}, shared_resources:{}), owner:root) to null
2021-07-21 02:03:45.231 o.a.s.d.s.Slot SLOT_6700 [INFO] STATE empty msInState: 2 -> waiting-for-blob-localization msInState: 0
2021-07-21 02:03:45.232 o.a.s.d.s.Slot SLOT_6701 [ERROR] Error when processing event
java.io.FileNotFoundException: File '/data/apache-storm-2.1.0/status/supervisor/stormdist/PradarLinkTopology-4-1626751925/stormconf.ser' does not exist
    at org.apache.storm.shade.org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:297) ~[storm-shaded-deps-2.1.0.jar:2.1.0]
    at org.apache.storm.shade.org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1851) ~[storm-shaded-deps-2.1.0.jar:2.1.0]
    at org.apache.storm.utils.ConfigUtils.readSupervisorStormConfGivenPath(ConfigUtils.java:303) ~[storm-client-2.1.0.jar:2.1.0]
    at org.apache.storm.utils.ConfigUtils.readSupervisorStormConfImpl(ConfigUtils.java:464) ~[storm-client-2.1.0.jar:2.1.0]
    at org.apache.storm.utils.ConfigUtils.readSupervisorStormConf(ConfigUtils.java:298) ~[storm-client-2.1.0.jar:2.1.0]
    at org.apache.storm.localizer.AsyncLocalizer.getLocalResources(AsyncLocalizer.java:351) ~[storm-server-2.1.0.jar:2.1.0]
    at org.apache.storm.localizer.AsyncLocalizer.releaseSlotFor(AsyncLocalizer.java:452) ~[storm-server-2.1.0.jar:2.1.0]
    at org.apache.storm.daemon.supervisor.Slot.handleWaitingForBlobLocalization(Slot.java:440) ~[storm-server-2.1.0.jar:2.1.0]
    at org.apache.storm.daemon.supervisor.Slot.stateMachineStep(Slot.java:228) ~[storm-server-2.1.0.jar:2.1.0]
    at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:931) [storm-server-2.1.0.jar:2.1.0]
2021-07-21 02:03:45.234 o.a.s.u.Utils SLOT_6701 [ERROR] Halting process: Error when processing an event
java.lang.RuntimeException: Halting process: Error when processing an event
    at org.apache.storm.utils.Utils.exitProcess(Utils.java:512) [storm-client-2.1.0.jar:2.1.0]
    at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:978) [storm-server-2.1.0.jar:2.1.0]
2021-07-21 02:03:45.235 o.a.s.d.s.BasicContainer SLOT_6700 [INFO] Created Worker ID 68102ac7-a341-4d84-b1aa-db0f72934f99
2021-07-21 02:03:45.236 o.a.s.d.s.Container SLOT_6700 [INFO] Setting up 8cbbfd6c-961b-482d-9175-cf9b79473808-172.26.137.86:68102ac7-a341-4d84-b1aa-db0f72934f99
2021-07-21 02:03:45.236 o.a.s.d.s.Container SLOT_6700 [INFO] GET worker-user for 68102ac7-a341-4d84-b1aa-db0f72934f99
2021-07-21 02:03:45.240 o.a.s.d.s.Container SLOT_6700 [INFO] SET worker-user 68102ac7-a341-4d84-b1aa-db0f72934f99 root
2021-07-21 02:03:45.241 o.a.s.d.s.Container SLOT_6700 [INFO] Creating symlinks for worker-id: 68102ac7-a341-4d84-b1aa-db0f72934f99 storm-id: PradarLogTopology-3-1626751922 for files(1): [resources]
2021-07-21 02:03:45.241 o.a.s.d.s.BasicContainer SLOT_6700 [INFO] Launching worker with assignment LocalAssignment(topology_id:PradarLogTopology-3-1626751922, executors:[ExecutorInfo(task_start:4, task_end:4), ExecutorInfo(task_start:1, task_end:1)], resources:WorkerResources(mem_on_heap:256.0, mem_off_heap:0.0, cpu:20.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, resources:{offheap.memory.mb=0.0, onheap.memory.mb=256.0, cpu.pcore.percent=20.0}, shared_resources:{}), owner:root) for this supervisor 8cbbfd6c-961b-482d-9175-cf9b79473808-172.26.137.86 on port 6700 with id 68102ac7-a341-4d84-b1aa-db0f72934f99
2021-07-21 02:03:45.243 o.a.s.d.s.BasicContainer SLOT_6700 [INFO] Launching worker with command:
'/usr/local/java/bin/java' '-cp'
'/data/apache-storm-2.1.0/lib-worker/*:/data/apache-storm-2.1.0/extlib/*:/data/apache-storm-2.1.0/conf:/data/apache-storm-2.1.0/status/supervisor/stormdist/PradarLogTopology-3-1626751922/stormjar.jar'
'-Xmx64m' '-Dlogging.sensitivity=S3' '-Dlogfile.name=worker.log'
'-Dstorm.home=/data/apache-storm-2.1.0'
'-Dworkers.artifacts=/data/apache-storm-2.1.0/logs/workers-artifacts'
'-Dstorm.id=PradarLogTopology-3-1626751922'
'-Dworker.id=68102ac7-a341-4d84-b1aa-db0f72934f99' '-Dworker.port=6700'
'-Dstorm.log.dir=/data/apache-storm-2.1.0/logs'
'-DLog4jContextSelector=org.apache.logging.log4j.core.selector.BasicContextSelector'
'-Dstorm.local.dir=/data/apache-storm-2.1.0/status'
'-Dworker.memory_limit_mb=256'
'-Dlog4j.configurationFile=/data/apache-storm-2.1.0/log4j2/worker.xml'
'org.apache.storm.LogWriter' '/usr/local/java/bin/java' '-server'
'-Dlogging.sensitivity=S3' '-Dlogfile.name=worker.log'
'-Dstorm.home=/data/apache-storm-2.1.0'
'-Dworkers.artifacts=/data/apache-storm-2.1.0/logs/workers-artifacts'
'-Dstorm.id=PradarLogTopology-3-1626751922'
'-Dworker.id=68102ac7-a341-4d84-b1aa-db0f72934f99' '-Dworker.port=6700'
'-Dstorm.log.dir=/data/apache-storm-2.1.0/logs'
'-DLog4jContextSelector=org.apache.logging.log4j.core.selector.BasicContextSelector'
'-Dstorm.local.dir=/data/apache-storm-2.1.0/status'
'-Dworker.memory_limit_mb=256'
'-Dlog4j.configurationFile=/data/apache-storm-2.1.0/log4j2/worker.xml'
'-Xmx256m' '-XX:+PrintGCDetails' '-Xloggc:artifacts/gc.log'
'-XX:+PrintGCDateStamps' '-XX:+PrintGCTimeStamps' '-XX:+UseGCLogFileRotation'
'-XX:NumberOfGCLogFiles=10' '-XX:GCLogFileSize=1M'
'-XX:+HeapDumpOnOutOfMemoryError' '-XX:HeapDumpPath=artifacts/heapdump'
'-Xms2g' '-Xmx2g' '-XX:MaxDirectMemorySize=512m'
'-XX:+HeapDumpOnOutOfMemoryError' '-XX:HeapDumpPath=java.hprof'
'-XX:MetaspaceSize=256m' '-XX:MaxMetaspaceSize=256m'
'-XX:-OmitStackTraceInFastThrow'
'-Djava.library.path=/data/apache-storm-2.1.0/status/supervisor/stormdist/PradarLogTopology-3-1626751922/resources/Linux-amd64:/data/apache-storm-2.1.0/status/supervisor/stormdist/PradarLogTopology-3-1626751922/resources:/usr/local/lib:/opt/local/lib:/usr/lib:/usr/lib64'
'-Dstorm.conf.file=' '-Dstorm.options='
'-Djava.io.tmpdir=/data/apache-storm-2.1.0/status/workers/68102ac7-a341-4d84-b1aa-db0f72934f99/tmp'
'-cp'
'/data/apache-storm-2.1.0/lib-worker/*:/data/apache-storm-2.1.0/extlib/*:/data/apache-storm-2.1.0/conf:/data/apache-storm-2.1.0/status/supervisor/stormdist/PradarLogTopology-3-1626751922/stormjar.jar'
'org.apache.storm.daemon.worker.Worker' 'PradarLogTopology-3-1626751922'
'8cbbfd6c-961b-482d-9175-cf9b79473808-172.26.137.86' '6628' '6700'
'68102ac7-a341-4d84-b1aa-db0f72934f99'.
2021-07-21 02:03:45.243 o.a.s.u.Utils Thread-5 [INFO] Halting after 1 seconds
2021-07-21 02:03:45.244 o.a.s.d.s.Supervisor Thread-6 [INFO] Shutting down supervisor 8cbbfd6c-961b-482d-9175-cf9b79473808-172.26.137.86
{code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)