TongWenbin commented on PR #10541:
URL:
https://github.com/apache/dolphinscheduler/pull/10541#issuecomment-1842268243
> The old task instance logs are still in the same place as the new task
instance. I can successfully reproduce it in version 2.0.8-release.
i just run a test shell node which sleep 61s and set timeout = 1min, the
number of retry times is 3, and the retry interval is 1 min
and there are logs of each time
origin:
`[LOG-PATH]:
/opt/apache-dolphinscheduler-2.0.9-bin/logs/11829436092416_1/2508353/10894304.log,
[HOST]: 7.212.150.146
[INFO] 2023-12-06 15:39:38.213 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task params
{"resourceList":[],"localParams":[],"rawScript":"sleep
61s","dependence":{"dependTaskList":[],"relation":"AND"},"conditionResult":{"successNode":[],"failedNode":[]},"waitStartTimeout":{},"switchResult":{}}
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - raw script : sleep 61s
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task execute path :
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - tenantCode user:root,
task dir:2508353_10894304
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - create command
file:/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - command : #!/bin/sh
BASEDIR=$(cd `dirname $0`; pwd)
cd $BASEDIR
source
/opt/apache-dolphinscheduler-2.0.9-bin/conf/env/dolphinscheduler_env.sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304_node.sh
[INFO] 2023-12-06 15:39:38.216 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task run command: sudo
-u root sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:39:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - process start, process
id is: 105015
[INFO] 2023-12-06 15:39:39.218 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]-getOutputLogService] - ->
welcome to use bigdata scheduling system...
[ERROR] 2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task error
java.lang.IllegalThreadStateException: process hasn't exited
at java.lang.UNIXProcess.exitValue(UNIXProcess.java:421)
at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.run(AbstractCommandExecutor.java:197)
at
org.apache.dolphinscheduler.plugin.task.shell.ShellTask.handle(ShellTask.java:97)
at
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:198)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[INFO] 2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - FINALIZE_SESSION
[INFO] 2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - cancel process: 105015
[INFO] 2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - soft kill
task:2508353_10894304, process id:105015, cmd:sudo -u root kill 105015`
first retry:
`[LOG-PATH]:
/opt/apache-dolphinscheduler-2.0.9-bin/logs/11829436092416_1/2508353/10894304.log,
[HOST]: 7.212.150.146
[INFO] 2023-12-06 15:39:38.213 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task params
{"resourceList":[],"localParams":[],"rawScript":"sleep
61s","dependence":{"dependTaskList":[],"relation":"AND"},"conditionResult":{"successNode":[],"failedNode":[]},"waitStartTimeout":{},"switchResult":{}}
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - raw script : sleep 61s
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task execute path :
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - tenantCode user:root,
task dir:2508353_10894304
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - create command
file:/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - command : #!/bin/sh
BASEDIR=$(cd `dirname $0`; pwd)
cd $BASEDIR
source
/opt/apache-dolphinscheduler-2.0.9-bin/conf/env/dolphinscheduler_env.sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304_node.sh
[INFO] 2023-12-06 15:39:38.216 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task run command: sudo
-u root sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:39:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - process start, process
id is: 105015
[INFO] 2023-12-06 15:39:39.218 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]-getOutputLogService] - ->
welcome to use bigdata scheduling system...
[ERROR] 2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task error
java.lang.IllegalThreadStateException: process hasn't exited
at java.lang.UNIXProcess.exitValue(UNIXProcess.java:421)
at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.run(AbstractCommandExecutor.java:197)
at
org.apache.dolphinscheduler.plugin.task.shell.ShellTask.handle(ShellTask.java:97)
at
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:198)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[INFO] 2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - FINALIZE_SESSION
[INFO] 2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - cancel process: 105015
[INFO] 2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - soft kill
task:2508353_10894304, process id:105015, cmd:sudo -u root kill 105015
[INFO] 2023-12-06 15:40:39.225 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]-getOutputLogService] -
FINALIZE_SESSION
[INFO] 2023-12-06 15:41:40.372 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task params
{"resourceList":[],"localParams":[],"rawScript":"sleep
61s","dependence":{"dependTaskList":[],"relation":"AND"},"conditionResult":{"successNode":[],"failedNode":[]},"waitStartTimeout":{},"switchResult":{}}
[INFO] 2023-12-06 15:41:40.373 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - raw script : sleep 61s
[INFO] 2023-12-06 15:41:40.373 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task execute path :
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304
[INFO] 2023-12-06 15:41:40.373 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - tenantCode user:root,
task dir:2508353_10894304
[INFO] 2023-12-06 15:41:40.373 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - create command
file:/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:41:40.373 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - command : #!/bin/sh
BASEDIR=$(cd `dirname $0`; pwd)
cd $BASEDIR
source
/opt/apache-dolphinscheduler-2.0.9-bin/conf/env/dolphinscheduler_env.sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304_node.sh
[INFO] 2023-12-06 15:41:40.375 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task run command: sudo
-u root sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:41:40.376 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - process start, process
id is: 105632
[ERROR] 2023-12-06 15:41:40.376 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task error
java.lang.RuntimeException: task execution time out
at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.getRemainTime(AbstractCommandExecutor.java:431)
at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.run(AbstractCommandExecutor.java:181)
at
org.apache.dolphinscheduler.plugin.task.shell.ShellTask.handle(ShellTask.java:97)
at
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:198)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[INFO] 2023-12-06 15:41:40.376 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - -> welcome to use
bigdata scheduling system...
[INFO] 2023-12-06 15:41:40.376 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - FINALIZE_SESSION
[INFO] 2023-12-06 15:41:40.376 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - cancel process: 105632
[INFO] 2023-12-06 15:41:40.376 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - soft kill
task:2508353_10894304, process id:105632, cmd:sudo -u root kill 105632`
second retry:
`[LOG-PATH]:
/opt/apache-dolphinscheduler-2.0.9-bin/logs/11829436092416_1/2508353/10894304.log,
[HOST]: 7.212.150.151
[INFO] 2023-12-06 15:42:45.506 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task params
{"resourceList":[],"localParams":[],"rawScript":"sleep
61s","dependence":{"dependTaskList":[],"relation":"AND"},"conditionResult":{"successNode":[],"failedNode":[]},"waitStartTimeout":{},"switchResult":{}}
[INFO] 2023-12-06 15:42:45.506 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - raw script : sleep 61s
[INFO] 2023-12-06 15:42:45.507 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task execute path :
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304
[INFO] 2023-12-06 15:42:45.507 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - tenantCode user:root,
task dir:2508353_10894304
[INFO] 2023-12-06 15:42:45.507 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - create command
file:/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:42:45.507 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - command : #!/bin/sh
BASEDIR=$(cd `dirname $0`; pwd)
cd $BASEDIR
source
/opt/apache-dolphinscheduler-2.0.9-bin/conf/env/dolphinscheduler_env.sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304_node.sh
[INFO] 2023-12-06 15:42:45.509 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task run command: sudo
-u root sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:42:45.510 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - process start, process
id is: 3850
[ERROR] 2023-12-06 15:42:45.510 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task error
java.lang.RuntimeException: task execution time out
at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.getRemainTime(AbstractCommandExecutor.java:431)
at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.run(AbstractCommandExecutor.java:181)
at
org.apache.dolphinscheduler.plugin.task.shell.ShellTask.handle(ShellTask.java:97)
at
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:198)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[INFO] 2023-12-06 15:42:45.510 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - -> welcome to use
bigdata scheduling system...
[INFO] 2023-12-06 15:42:45.510 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - FINALIZE_SESSION
[INFO] 2023-12-06 15:42:45.510 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - cancel process: 3850
[INFO] 2023-12-06 15:42:45.510 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - soft kill
task:2508353_10894304, process id:3850, cmd:sudo -u root kill 3850`
third retry:
`[LOG-PATH]:
/opt/apache-dolphinscheduler-2.0.9-bin/logs/11829436092416_1/2508353/10894304.log,
[HOST]: 7.212.150.223
[INFO] 2023-12-06 15:43:50.443 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task params
{"resourceList":[],"localParams":[],"rawScript":"sleep
61s","dependence":{"dependTaskList":[],"relation":"AND"},"conditionResult":{"successNode":[],"failedNode":[]},"waitStartTimeout":{},"switchResult":{}}
[INFO] 2023-12-06 15:43:50.444 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - raw script : sleep 61s
[INFO] 2023-12-06 15:43:50.444 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task execute path :
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304
[INFO] 2023-12-06 15:43:50.444 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - tenantCode user:root,
task dir:2508353_10894304
[INFO] 2023-12-06 15:43:50.444 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - create command
file:/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:43:50.444 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - command : #!/bin/sh
BASEDIR=$(cd `dirname $0`; pwd)
cd $BASEDIR
source
/opt/apache-dolphinscheduler-2.0.9-bin/conf/env/dolphinscheduler_env.sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304_node.sh
[INFO] 2023-12-06 15:43:50.447 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task run command: sudo
-u root sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:43:50.447 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - process start, process
id is: 113359
[ERROR] 2023-12-06 15:43:50.447 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task error
java.lang.RuntimeException: task execution time out
at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.getRemainTime(AbstractCommandExecutor.java:431)
at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.run(AbstractCommandExecutor.java:181)
at
org.apache.dolphinscheduler.plugin.task.shell.ShellTask.handle(ShellTask.java:97)
at
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:198)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[INFO] 2023-12-06 15:43:50.448 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - -> welcome to use
bigdata scheduling system...
[INFO] 2023-12-06 15:43:50.448 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - FINALIZE_SESSION
[INFO] 2023-12-06 15:43:50.448 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - cancel process: 113359
[INFO] 2023-12-06 15:43:50.448 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - soft kill
task:2508353_10894304, process id:113359, cmd:sudo -u root kill 113359
`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]