[ https://issues.apache.org/jira/browse/STORM-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Richard Zowalla closed STORM-3784.
----------------------------------
Resolution: Cannot Reproduce

> my supervisor will shut down on 2:00 am everyday
> ------------------------------------------------
>
> Key: STORM-3784
> URL: https://issues.apache.org/jira/browse/STORM-3784
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-server
> Affects Versions: 2.1.0
> Environment: centos 7 x64
> Reporter: Sunsy Sun
> Priority: Major
> Attachments: supervisor(1).log
>
>
> The cluster has one Nimbus and two supervisors. One of the supervisors runs
> alongside Nimbus.
> I deployed two topologies, PradarLinkTopology and PradarLogTopology.
> PradarLogTopology runs with 4 workers; PradarLinkTopology runs with 1 worker.
> At 2:00 am every day, all supervisors shut down. I haven't found out the
> reason.
> I tried cleaning up the status directory, but the problem still exists.
> This is my supervisor.log:
> {code:java}
> // code placeholder
> 2021-07-21 02:03:42.070 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28:Error occurred during initialization of VM
> 2021-07-21 02:03:42.070 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28:Error occurred during initialization of VM
> 2021-07-21 02:03:42.071 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28:java.lang.Error: Properties init: Could not determine current working directory.
> 2021-07-21 02:03:42.071 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28: at java.lang.System.initProperties(Native Method)
> 2021-07-21 02:03:42.071 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28: at java.lang.System.initializeSystemClass(System.java:1166)
> 2021-07-21 02:03:42.071 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28:
> 2021-07-21 02:03:42.323 o.a.s.d.s.BasicContainer SLOT_6702 [INFO] Removed Worker ID
> dcae9231-4be4-4842-9ed0-988e1b8a2b28
> 2021-07-21 02:03:42.329 o.a.s.d.s.Slot SLOT_6702 [INFO] STATE kill msInState: 68588 topo:PradarLogTopology-3-1626751922 worker:null -> empty msInState: 3
> 2021-07-21 02:03:42.329 o.a.s.d.s.Slot SLOT_6702 [INFO] SLOT 6702: Changing current assignment from LocalAssignment(topology_id:PradarLogTopology-3-1626751922, executors:[ExecutorInfo(task_start:4, task_end:4), ExecutorInfo(task_start:1, task_end:1)], resources:WorkerResources(mem_on_heap:256.0, mem_off_heap:0.0, cpu:20.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, resources:{offheap.memory.mb=0.0, onheap.memory.mb=256.0, cpu.pcore.percent=20.0}, shared_resources:{}), owner:root) to null
> 2021-07-21 02:03:42.353 o.a.s.d.s.Supervisor pool-10-thread-1 [WARN] Topology config is not localized yet...
> 2021-07-21 02:03:42.449 o.a.s.d.s.Slot SLOT_6700 [INFO] SLOT 6700 all processes are dead...
> 2021-07-21 02:03:42.449 o.a.s.d.s.Container SLOT_6700 [INFO] Cleaning up 8cbbfd6c-961b-482d-9175-cf9b79473808-172.26.137.86:b7963273-452a-43af-bc00-d814e0629f96
> 2021-07-21 02:03:42.450 o.a.s.d.s.Container SLOT_6700 [INFO] GET worker-user for b7963273-452a-43af-bc00-d814e0629f96
> 2021-07-21 02:03:42.450 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/b7963273-452a-43af-bc00-d814e0629f96/pids/16326
> 2021-07-21 02:03:43.322 o.a.s.d.s.AdvancedFSOps SLOT_6701 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/26b5ffbd-08b6-46df-aa04-6b86f78b8ad8/pids
> 2021-07-21 02:03:43.322 o.a.s.d.s.AdvancedFSOps SLOT_6701 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/26b5ffbd-08b6-46df-aa04-6b86f78b8ad8/tmp
> 2021-07-21 02:03:45.209 o.a.s.d.s.BasicContainer Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28 exited with code: 1
> 2021-07-21 02:03:45.224 o.a.s.d.s.AdvancedFSOps SLOT_6701 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/26b5ffbd-08b6-46df-aa04-6b86f78b8ad8
> 2021-07-21 02:03:45.224 o.a.s.d.s.Supervisor pool-10-thread-7 [WARN] Topology config is not localized yet...
> 2021-07-21 02:03:45.224 o.a.s.d.s.Container SLOT_6701 [INFO] REMOVE worker-user 26b5ffbd-08b6-46df-aa04-6b86f78b8ad8
> 2021-07-21 02:03:45.224 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/b7963273-452a-43af-bc00-d814e0629f96/heartbeats
> 2021-07-21 02:03:45.224 o.a.s.d.s.AdvancedFSOps SLOT_6701 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers-users/26b5ffbd-08b6-46df-aa04-6b86f78b8ad8
> 2021-07-21 02:03:45.224 o.a.s.t.ProcessFunction pool-10-thread-7 [ERROR] Internal error processing sendSupervisorWorkerHeartbeat
> org.apache.storm.utils.WrappedNotAliveException: PradarLinkTopology-2-1626337413 does not appear to be alive, you should probably exit
>     at org.apache.storm.daemon.supervisor.Supervisor$1.sendSupervisorWorkerHeartbeat(Supervisor.java:442) ~[storm-server-2.1.0.jar:2.1.0]
>     at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:374) ~[storm-client-2.1.0.jar:2.1.0]
>     at org.apache.storm.generated.Supervisor$Processor$sendSupervisorWorkerHeartbeat.getResult(Supervisor.java:353) ~[storm-client-2.1.0.jar:2.1.0]
>     at org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:38) [storm-shaded-deps-2.1.0.jar:2.1.0]
>     at org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:39) [storm-shaded-deps-2.1.0.jar:2.1.0]
>     at org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:174) [storm-client-2.1.0.jar:2.1.0]
>     at org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518) [storm-shaded-deps-2.1.0.jar:2.1.0]
>     at org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) [storm-shaded-deps-2.1.0.jar:2.1.0]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
> 2021-07-21 02:03:45.225 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/b7963273-452a-43af-bc00-d814e0629f96/pids
> 2021-07-21 02:03:45.225 o.a.s.d.s.BasicContainer Thread-16 [INFO] Worker Process b7963273-452a-43af-bc00-d814e0629f96 exited with code: 254
> 2021-07-21 02:03:45.225 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/b7963273-452a-43af-bc00-d814e0629f96/tmp
> 2021-07-21 02:03:45.226 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/b7963273-452a-43af-bc00-d814e0629f96
> 2021-07-21 02:03:45.226 o.a.s.d.s.Container SLOT_6700 [INFO] REMOVE worker-user b7963273-452a-43af-bc00-d814e0629f96
> 2021-07-21 02:03:45.226 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers-users/b7963273-452a-43af-bc00-d814e0629f96
> 2021-07-21 02:03:45.227 o.a.s.d.s.BasicContainer SLOT_6701 [INFO] Removed Worker ID 26b5ffbd-08b6-46df-aa04-6b86f78b8ad8
> 2021-07-21 02:03:45.228 o.a.s.d.s.BasicContainer SLOT_6700 [INFO] Removed Worker ID b7963273-452a-43af-bc00-d814e0629f96
> 2021-07-21 02:03:45.229 o.a.s.d.s.Slot SLOT_6700 [INFO] STATE kill msInState: 81385 topo:PradarLogTopology-3-1626751922 worker:null -> empty msInState: 0
> 2021-07-21 02:03:45.229 o.a.s.d.s.Slot SLOT_6700 [INFO] SLOT 6700: Changing current assignment from LocalAssignment(topology_id:PradarLogTopology-3-1626751922, executors:[ExecutorInfo(task_start:3, task_end:3)], resources:WorkerResources(mem_on_heap:128.0, mem_off_heap:0.0, cpu:10.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, resources:{offheap.memory.mb=0.0, onheap.memory.mb=128.0, cpu.pcore.percent=10.0}, shared_resources:{}), owner:root) to null
> 2021-07-21 02:03:45.230 o.a.s.d.s.Slot SLOT_6701 [INFO] STATE kill-and-relaunch msInState: 95356 topo:PradarLogTopology-3-1626751922 worker:null -> waiting-for-blob-localization msInState: 1
> 2021-07-21 02:03:45.231 o.a.s.d.s.Slot SLOT_6701 [INFO] SLOT 6701: Changing current assignment from LocalAssignment(topology_id:PradarLogTopology-3-1626751922, executors:[ExecutorInfo(task_start:3, task_end:3)], resources:WorkerResources(mem_on_heap:128.0, mem_off_heap:0.0, cpu:10.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, resources:{offheap.memory.mb=0.0, onheap.memory.mb=128.0, cpu.pcore.percent=10.0}, shared_resources:{}), owner:root) to null
> 2021-07-21 02:03:45.231 o.a.s.d.s.Slot SLOT_6700 [INFO] STATE empty msInState: 2 -> waiting-for-blob-localization msInState: 0
> 2021-07-21 02:03:45.232 o.a.s.d.s.Slot SLOT_6701 [ERROR] Error when processing event
> java.io.FileNotFoundException: File '/data/apache-storm-2.1.0/status/supervisor/stormdist/PradarLinkTopology-4-1626751925/stormconf.ser' does not exist
>     at org.apache.storm.shade.org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:297) ~[storm-shaded-deps-2.1.0.jar:2.1.0]
>     at org.apache.storm.shade.org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1851) ~[storm-shaded-deps-2.1.0.jar:2.1.0]
>     at org.apache.storm.utils.ConfigUtils.readSupervisorStormConfGivenPath(ConfigUtils.java:303) ~[storm-client-2.1.0.jar:2.1.0]
>     at org.apache.storm.utils.ConfigUtils.readSupervisorStormConfImpl(ConfigUtils.java:464) ~[storm-client-2.1.0.jar:2.1.0]
>     at org.apache.storm.utils.ConfigUtils.readSupervisorStormConf(ConfigUtils.java:298) ~[storm-client-2.1.0.jar:2.1.0]
>     at org.apache.storm.localizer.AsyncLocalizer.getLocalResources(AsyncLocalizer.java:351) ~[storm-server-2.1.0.jar:2.1.0]
>     at org.apache.storm.localizer.AsyncLocalizer.releaseSlotFor(AsyncLocalizer.java:452) ~[storm-server-2.1.0.jar:2.1.0]
>     at org.apache.storm.daemon.supervisor.Slot.handleWaitingForBlobLocalization(Slot.java:440) ~[storm-server-2.1.0.jar:2.1.0]
>     at org.apache.storm.daemon.supervisor.Slot.stateMachineStep(Slot.java:228) ~[storm-server-2.1.0.jar:2.1.0]
>     at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:931) [storm-server-2.1.0.jar:2.1.0]
> 2021-07-21 02:03:45.234 o.a.s.u.Utils SLOT_6701 [ERROR] Halting process: Error when processing an event
> java.lang.RuntimeException: Halting process: Error when processing an event
>     at org.apache.storm.utils.Utils.exitProcess(Utils.java:512) [storm-client-2.1.0.jar:2.1.0]
>     at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:978) [storm-server-2.1.0.jar:2.1.0]
> 2021-07-21 02:03:45.235 o.a.s.d.s.BasicContainer SLOT_6700 [INFO] Created Worker ID 68102ac7-a341-4d84-b1aa-db0f72934f99
> 2021-07-21 02:03:45.236 o.a.s.d.s.Container SLOT_6700 [INFO] Setting up 8cbbfd6c-961b-482d-9175-cf9b79473808-172.26.137.86:68102ac7-a341-4d84-b1aa-db0f72934f99
> 2021-07-21 02:03:45.236 o.a.s.d.s.Container SLOT_6700 [INFO] GET worker-user for 68102ac7-a341-4d84-b1aa-db0f72934f99
> 2021-07-21 02:03:45.240 o.a.s.d.s.Container SLOT_6700 [INFO] SET worker-user 68102ac7-a341-4d84-b1aa-db0f72934f99 root
> 2021-07-21 02:03:45.241 o.a.s.d.s.Container SLOT_6700 [INFO] Creating symlinks for worker-id: 68102ac7-a341-4d84-b1aa-db0f72934f99 storm-id: PradarLogTopology-3-1626751922 for files(1): [resources]
> 2021-07-21 02:03:45.241 o.a.s.d.s.BasicContainer SLOT_6700 [INFO] Launching worker with assignment LocalAssignment(topology_id:PradarLogTopology-3-1626751922, executors:[ExecutorInfo(task_start:4, task_end:4), ExecutorInfo(task_start:1, task_end:1)], resources:WorkerResources(mem_on_heap:256.0, mem_off_heap:0.0, cpu:20.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, resources:{offheap.memory.mb=0.0, onheap.memory.mb=256.0, cpu.pcore.percent=20.0}, shared_resources:{}), owner:root) for this supervisor 8cbbfd6c-961b-482d-9175-cf9b79473808-172.26.137.86 on port 6700 with id 68102ac7-a341-4d84-b1aa-db0f72934f99
> 2021-07-21 02:03:45.243 o.a.s.d.s.BasicContainer SLOT_6700 [INFO] Launching worker with command: '/usr/local/java/bin/java' '-cp' '/data/apache-storm-2.1.0/lib-worker/*:/data/apache-storm-2.1.0/extlib/*:/data/apache-storm-2.1.0/conf:/data/apache-storm-2.1.0/status/supervisor/stormdist/PradarLogTopology-3-1626751922/stormjar.jar' '-Xmx64m' '-Dlogging.sensitivity=S3' '-Dlogfile.name=worker.log' '-Dstorm.home=/data/apache-storm-2.1.0' '-Dworkers.artifacts=/data/apache-storm-2.1.0/logs/workers-artifacts' '-Dstorm.id=PradarLogTopology-3-1626751922' '-Dworker.id=68102ac7-a341-4d84-b1aa-db0f72934f99' '-Dworker.port=6700' '-Dstorm.log.dir=/data/apache-storm-2.1.0/logs' '-DLog4jContextSelector=org.apache.logging.log4j.core.selector.BasicContextSelector' '-Dstorm.local.dir=/data/apache-storm-2.1.0/status' '-Dworker.memory_limit_mb=256' '-Dlog4j.configurationFile=/data/apache-storm-2.1.0/log4j2/worker.xml' 'org.apache.storm.LogWriter' '/usr/local/java/bin/java' '-server' '-Dlogging.sensitivity=S3' '-Dlogfile.name=worker.log' '-Dstorm.home=/data/apache-storm-2.1.0' '-Dworkers.artifacts=/data/apache-storm-2.1.0/logs/workers-artifacts' '-Dstorm.id=PradarLogTopology-3-1626751922' '-Dworker.id=68102ac7-a341-4d84-b1aa-db0f72934f99' '-Dworker.port=6700' '-Dstorm.log.dir=/data/apache-storm-2.1.0/logs' '-DLog4jContextSelector=org.apache.logging.log4j.core.selector.BasicContextSelector' '-Dstorm.local.dir=/data/apache-storm-2.1.0/status' '-Dworker.memory_limit_mb=256' '-Dlog4j.configurationFile=/data/apache-storm-2.1.0/log4j2/worker.xml' '-Xmx256m' '-XX:+PrintGCDetails' '-Xloggc:artifacts/gc.log' '-XX:+PrintGCDateStamps' '-XX:+PrintGCTimeStamps' '-XX:+UseGCLogFileRotation' '-XX:NumberOfGCLogFiles=10' '-XX:GCLogFileSize=1M' '-XX:+HeapDumpOnOutOfMemoryError' '-XX:HeapDumpPath=artifacts/heapdump' '-Xms2g' '-Xmx2g' '-XX:MaxDirectMemorySize=512m' '-XX:+HeapDumpOnOutOfMemoryError' '-XX:HeapDumpPath=java.hprof' '-XX:MetaspaceSize=256m' '-XX:MaxMetaspaceSize=256m' '-XX:-OmitStackTraceInFastThrow' '-Djava.library.path=/data/apache-storm-2.1.0/status/supervisor/stormdist/PradarLogTopology-3-1626751922/resources/Linux-amd64:/data/apache-storm-2.1.0/status/supervisor/stormdist/PradarLogTopology-3-1626751922/resources:/usr/local/lib:/opt/local/lib:/usr/lib:/usr/lib64' '-Dstorm.conf.file=' '-Dstorm.options=' '-Djava.io.tmpdir=/data/apache-storm-2.1.0/status/workers/68102ac7-a341-4d84-b1aa-db0f72934f99/tmp' '-cp' '/data/apache-storm-2.1.0/lib-worker/*:/data/apache-storm-2.1.0/extlib/*:/data/apache-storm-2.1.0/conf:/data/apache-storm-2.1.0/status/supervisor/stormdist/PradarLogTopology-3-1626751922/stormjar.jar' 'org.apache.storm.daemon.worker.Worker' 'PradarLogTopology-3-1626751922' '8cbbfd6c-961b-482d-9175-cf9b79473808-172.26.137.86' '6628' '6700' '68102ac7-a341-4d84-b1aa-db0f72934f99'.
> 2021-07-21 02:03:45.243 o.a.s.u.Utils Thread-5 [INFO] Halting after 1 seconds
> 2021-07-21 02:03:45.244 o.a.s.d.s.Supervisor Thread-6 [INFO] Shutting down supervisor 8cbbfd6c-961b-482d-9175-cf9b79473808-172.26.137.86
> {code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)