TongWenbin commented on issue #15283:
URL:
https://github.com/apache/dolphinscheduler/issues/15283#issuecomment-1842346225
i just run a test shell node which sleep 61s and set timeout = 1min, the
number of retry times is 3, and the retry interval is 1 min
and there are logs of each time
**origin:**
```
[LOG-PATH]:
/opt/apache-dolphinscheduler-2.0.9-bin/logs/11829436092416_1/2508353/10894304.log,
[HOST]: 7.212.150.146 [INFO] 2023-12-06 15:39:38.213 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task params
{"resourceList":[],"localParams":[],"rawScript":"sleep
61s","dependence":{"dependTaskList":[],"relation":"AND"},"conditionResult":{"successNode":[],"failedNode":[]},"waitStartTimeout":{},"switchResult":{}}
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - raw script : sleep 61s
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task execute path :
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - tenantCode user:root,
task dir:2508353_10894304 [INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436
092416_1-2508353-10894304]] - create command
file:/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - command : #!/bin/sh
BASEDIR=$(cd dirname $0; pwd) cd $BASEDIR source
/opt/apache-dolphinscheduler-2.0.9-bin/conf/env/dolphinscheduler_env.sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304_node.sh
[INFO] 2023-12-06 15:39:38.216 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task run command: sudo
-u root sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:39:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - process start, process
id is: 105015 [INFO] 2023-12-06 15:39:39.218 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]-getOut
putLogService] - -> welcome to use bigdata scheduling system... [ERROR]
2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task error
java.lang.IllegalThreadStateException: process hasn't exited at
java.lang.UNIXProcess.exitValue(UNIXProcess.java:421) at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.run(AbstractCommandExecutor.java:197)
at
org.apache.dolphinscheduler.plugin.task.shell.ShellTask.handle(ShellTask.java:97)
at
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:198)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenabl
eFutureTask.java:78) at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748) [INFO] 2023-12-06 15:40:38.217
[TaskLogInfo- - [taskAppId=TASK-11829436092416_1-2508353-10894304]] -
FINALIZE_SESSION [INFO] 2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - cancel process: 105015
[INFO] 2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - soft kill
task:2508353_10894304, process id:105015, cmd:sudo -u root kill 105015
```
**first retry:**
```
[LOG-PATH]:
/opt/apache-dolphinscheduler-2.0.9-bin/logs/11829436092416_1/2508353/10894304.log,
[HOST]: 7.212.150.146 [INFO] 2023-12-06 15:39:38.213 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task params
{"resourceList":[],"localParams":[],"rawScript":"sleep
61s","dependence":{"dependTaskList":[],"relation":"AND"},"conditionResult":{"successNode":[],"failedNode":[]},"waitStartTimeout":{},"switchResult":{}}
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - raw script : sleep 61s
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task execute path :
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - tenantCode user:root,
task dir:2508353_10894304 [INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436
092416_1-2508353-10894304]] - create command
file:/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:39:38.214 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - command : #!/bin/sh
BASEDIR=$(cd dirname $0; pwd) cd $BASEDIR source
/opt/apache-dolphinscheduler-2.0.9-bin/conf/env/dolphinscheduler_env.sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304_node.sh
[INFO] 2023-12-06 15:39:38.216 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task run command: sudo
-u root sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:39:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - process start, process
id is: 105015 [INFO] 2023-12-06 15:39:39.218 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]-getOut
putLogService] - -> welcome to use bigdata scheduling system... [ERROR]
2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task error
java.lang.IllegalThreadStateException: process hasn't exited at
java.lang.UNIXProcess.exitValue(UNIXProcess.java:421) at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.run(AbstractCommandExecutor.java:197)
at
org.apache.dolphinscheduler.plugin.task.shell.ShellTask.handle(ShellTask.java:97)
at
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:198)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenabl
eFutureTask.java:78) at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748) [INFO] 2023-12-06 15:40:38.217
[TaskLogInfo- - [taskAppId=TASK-11829436092416_1-2508353-10894304]] -
FINALIZE_SESSION [INFO] 2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - cancel process: 105015
[INFO] 2023-12-06 15:40:38.217 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - soft kill
task:2508353_10894304, process id:105015, cmd:sudo -u root kill 105015 [INFO]
2023-12-06 15:40:39.225 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]-getOutputLogService] -
FINALIZE_SESSION [INFO] 2023-12-06 15:41:40.372 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task params
{"resourceList":[],"localParams":[],"rawScript":"sleep
61s","dependence":{"dependTa
skList":[],"relation":"AND"},"conditionResult":{"successNode":[],"failedNode":[]},"waitStartTimeout":{},"switchResult":{}}
[INFO] 2023-12-06 15:41:40.373 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - raw script : sleep 61s
[INFO] 2023-12-06 15:41:40.373 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task execute path :
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304
[INFO] 2023-12-06 15:41:40.373 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - tenantCode user:root,
task dir:2508353_10894304 [INFO] 2023-12-06 15:41:40.373 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - create command
file:/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:41:40.373 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - command : #!/bin/sh
BASEDIR=$(cd dirname $0; pwd) cd $BASEDIR
source /opt/apache-dolphinscheduler-2.0.9-bin/conf/env/dolphinscheduler_env.sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304_node.sh
[INFO] 2023-12-06 15:41:40.375 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task run command: sudo
-u root sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:41:40.376 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - process start, process
id is: 105632 [ERROR] 2023-12-06 15:41:40.376 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task error
java.lang.RuntimeException: task execution time out at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.getRemainTime(AbstractCommandExecutor.java:431)
at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.run(AbstractCommandExecutor.java:181)
at org.apache.dolphinschedu
ler.plugin.task.shell.ShellTask.handle(ShellTask.java:97) at
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:198)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748) [INFO] 2023-12-06 15:41:40.376
[TaskLogInfo- - [taskAppId=TASK-11829436092416_1-2508353-10894304]] - ->
welcome to use bigdata scheduling system... [INFO] 2023-12-06 15:41:40.376
[TaskLogInfo- - [taskAppId=TASK
-11829436092416_1-2508353-10894304]] - FINALIZE_SESSION [INFO] 2023-12-06
15:41:40.376 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - cancel process: 105632
[INFO] 2023-12-06 15:41:40.376 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - soft kill
task:2508353_10894304, process id:105632, cmd:sudo -u root kill 105632
```
**second retry:**
```
[LOG-PATH]:
/opt/apache-dolphinscheduler-2.0.9-bin/logs/11829436092416_1/2508353/10894304.log,
[HOST]: 7.212.150.151 [INFO] 2023-12-06 15:42:45.506 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task params
{"resourceList":[],"localParams":[],"rawScript":"sleep
61s","dependence":{"dependTaskList":[],"relation":"AND"},"conditionResult":{"successNode":[],"failedNode":[]},"waitStartTimeout":{},"switchResult":{}}
[INFO] 2023-12-06 15:42:45.506 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - raw script : sleep 61s
[INFO] 2023-12-06 15:42:45.507 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task execute path :
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304
[INFO] 2023-12-06 15:42:45.507 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - tenantCode user:root,
task dir:2508353_10894304 [INFO] 2023-12-06 15:42:45.507 [TaskLogInfo- -
[taskAppId=TASK-11829436
092416_1-2508353-10894304]] - create command
file:/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:42:45.507 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - command : #!/bin/sh
BASEDIR=$(cd dirname $0; pwd) cd $BASEDIR source
/opt/apache-dolphinscheduler-2.0.9-bin/conf/env/dolphinscheduler_env.sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304_node.sh
[INFO] 2023-12-06 15:42:45.509 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task run command: sudo
-u root sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:42:45.510 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - process start, process
id is: 3850 [ERROR] 2023-12-06 15:42:45.510 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - she
ll task error java.lang.RuntimeException: task execution time out at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.getRemainTime(AbstractCommandExecutor.java:431)
at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.run(AbstractCommandExecutor.java:181)
at
org.apache.dolphinscheduler.plugin.task.shell.ShellTask.handle(ShellTask.java:97)
at
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:198)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.co
ncurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at
java.lang.Thread.run(Thread.java:748) [INFO] 2023-12-06 15:42:45.510
[TaskLogInfo- - [taskAppId=TASK-11829436092416_1-2508353-10894304]] - ->
welcome to use bigdata scheduling system... [INFO] 2023-12-06 15:42:45.510
[TaskLogInfo- - [taskAppId=TASK-11829436092416_1-2508353-10894304]] -
FINALIZE_SESSION [INFO] 2023-12-06 15:42:45.510 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - cancel process: 3850
[INFO] 2023-12-06 15:42:45.510 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - soft kill
task:2508353_10894304, process id:3850, cmd:sudo -u root kill 3850
```
**third retry:**
```
[LOG-PATH]:
/opt/apache-dolphinscheduler-2.0.9-bin/logs/11829436092416_1/2508353/10894304.log,
[HOST]: 7.212.150.223 [INFO] 2023-12-06 15:43:50.443 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - shell task params
{"resourceList":[],"localParams":[],"rawScript":"sleep
61s","dependence":{"dependTaskList":[],"relation":"AND"},"conditionResult":{"successNode":[],"failedNode":[]},"waitStartTimeout":{},"switchResult":{}}
[INFO] 2023-12-06 15:43:50.444 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - raw script : sleep 61s
[INFO] 2023-12-06 15:43:50.444 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task execute path :
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304
[INFO] 2023-12-06 15:43:50.444 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - tenantCode user:root,
task dir:2508353_10894304 [INFO] 2023-12-06 15:43:50.444 [TaskLogInfo- -
[taskAppId=TASK-11829436
092416_1-2508353-10894304]] - create command
file:/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:43:50.444 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - command : #!/bin/sh
BASEDIR=$(cd dirname $0; pwd) cd $BASEDIR source
/opt/apache-dolphinscheduler-2.0.9-bin/conf/env/dolphinscheduler_env.sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304_node.sh
[INFO] 2023-12-06 15:43:50.447 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - task run command: sudo
-u root sh
/tmp/dolphinscheduler/exec/process/7149515337600/11829436092416_1/2508353/10894304/2508353_10894304.command
[INFO] 2023-12-06 15:43:50.447 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - process start, process
id is: 113359 [ERROR] 2023-12-06 15:43:50.447 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - s
hell task error java.lang.RuntimeException: task execution time out at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.getRemainTime(AbstractCommandExecutor.java:431)
at
org.apache.dolphinscheduler.plugin.task.api.AbstractCommandExecutor.run(AbstractCommandExecutor.java:181)
at
org.apache.dolphinscheduler.plugin.task.shell.ShellTask.handle(ShellTask.java:97)
at
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:198)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.
concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at
java.lang.Thread.run(Thread.java:748) [INFO] 2023-12-06 15:43:50.448
[TaskLogInfo- - [taskAppId=TASK-11829436092416_1-2508353-10894304]] - ->
welcome to use bigdata scheduling system... [INFO] 2023-12-06 15:43:50.448
[TaskLogInfo- - [taskAppId=TASK-11829436092416_1-2508353-10894304]] -
FINALIZE_SESSION [INFO] 2023-12-06 15:43:50.448 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - cancel process: 113359
[INFO] 2023-12-06 15:43:50.448 [TaskLogInfo- -
[taskAppId=TASK-11829436092416_1-2508353-10894304]] - soft kill
task:2508353_10894304, process id:113359, cmd:sudo -u root kill 113359
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]