fishhami opened a new issue #6250: URL: https://github.com/apache/dolphinscheduler/issues/6250
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.

### What happened

**After I upgraded to 1.3.8, a workflow that ran normally on the previous version (1.3.6) failed to execute. The following log appeared:**

> [INFO] 2021-09-17 15:02:39.717 org.apache.dolphinscheduler.server.worker.processor.TaskExecuteProcessor:[110] - received command : TaskExecuteRequestCommand{taskExecutionContext='{"cmdTypeIfComplement":7,"dataxTaskExecutionContext":{"dataSourceId":0,"dataTargetId":0,"sourcetype":0,"targetType":0},"executorId":1,"procedureTaskExecutionContext":{},"processDefineId":610,"processId":0,"processInstanceId":715,"projectId":2,"queue":"batch","resources":{"/shell/dws/dws_execute_agg.sh":"root"},"sqlTaskExecutionContext":{"warningGroupId":0},"sqoopTaskExecutionContext":{"dataSourceId":0,"dataTargetId":0,"sourcetype":0,"targetType":0},"taskInstanceId":920,"taskJson":"{\"conditionResult\":\"{\\\"successNode\\\":[\\\"\\\"],\\\"failedNode\\\":[\\\"\\\"]}\",\"conditionsTask\":false,\"depList\":[],\"dependence\":\"{}\",\"forbidden\":false,\"id\":\"tasks-85378\",\"maxRetryTimes\":0,\"name\":\"targetHistoryAgg\",\"params\":\"{\\\"rawScript\\\":\\\"sh shell/dws/dws_execute_agg.sh ${day} ${yesterday}\\\",\\\"localParams\\\":[{\\\"prop\\\":\\\"day\\\",\\\"direct\\\":\\\"IN\\\",\\\"type\\\":\\\"VARCHAR\\\",\\\"value\\\":\\\"$[yyyy-MM-dd-1]\\\"},{\\\"prop\\\":\\\"yesterday\\\",\\\"direct\\\":\\\"IN\\\",\\\"type\\\":\\\"VARCHAR\\\",\\\"value\\\":\\\"$[yyyy-MM-dd-2]\\\"}],\\\"resourceList\\\":[{\\\"res\\\":\\\"shell/dws/dws_execute_agg.sh\\\",\\\"name\\\":\\\"dws_execute_agg.sh\\\",\\\"id\\\":451}]}\",\"preTasks\":\"[]\",\"retryInterval\":1,\"runFlag\":\"NORMAL\",\"taskInstancePriority\":\"MEDIUM\",\"taskTimeoutParameter\":{\"enable\":false,\"interval\":0},\"timeout\":\"{\\\"enable\\\":false,\\\"strategy\\\":\\\"\\\"}\",\"type\":\"SHELL\",\"workerGroup\":\"default\"}","taskName":"targetHistoryAgg","taskTimeout":0,"taskTimeoutStrategy":0,"taskType":"SHELL","tenantCode":"root","workerGroup":"default"}'}
> [INFO] 2021-09-17 15:02:39.745 org.apache.dolphinscheduler.server.worker.processor.TaskExecuteProcessor:[137] - task instance local execute path : /exec/process/2/610/715/920
> [ERROR] 2021-09-17 15:02:39.750 org.apache.dolphinscheduler.server.worker.processor.TaskExecuteProcessor:[110] - create execLocalPath : /exec/process/2/610/715/920
> java.io.IOException: Unable to create directory /exec/process/2/610/715/920
> at org.apache.commons.io.FileUtils.forceMkdir(FileUtils.java:2384)
> at org.apache.dolphinscheduler.common.utils.FileUtils.createWorkDirAndUserIfAbsent(FileUtils.java:170)
> at org.apache.dolphinscheduler.server.worker.processor.TaskExecuteProcessor.process(TaskExecuteProcessor.java:142)
> at org.apache.dolphinscheduler.remote.handler.NettyServerHandler$1.run(NettyServerHandler.java:134)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> [ERROR] 2021-09-17 15:02:39.751 - [taskAppId=TASK-610-715-920]:[110] - create execLocalPath : /exec/process/2/610/715/920
> java.io.IOException: Unable to create directory /exec/process/2/610/715/920
> at org.apache.commons.io.FileUtils.forceMkdir(FileUtils.java:2384)
> at org.apache.dolphinscheduler.common.utils.FileUtils.createWorkDirAndUserIfAbsent(FileUtils.java:170)
> at org.apache.dolphinscheduler.server.worker.processor.TaskExecuteProcessor.process(TaskExecuteProcessor.java:142)
> at org.apache.dolphinscheduler.remote.handler.NettyServerHandler$1.run(NettyServerHandler.java:134)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> [INFO] 2021-09-17 15:02:39.777 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[112] - script path : /exec/process/2/610/715/920
> [INFO] 2021-09-17 15:02:39.808 org.apache.dolphinscheduler.common.utils.PropertyUtils:[140] - For input string: ""
> java.lang.NumberFormatException: For input string: ""
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Integer.parseInt(Integer.java:592)
> at java.lang.Integer.parseInt(Integer.java:615)
> at org.apache.dolphinscheduler.common.utils.PropertyUtils.getInt(PropertyUtils.java:138)
> at org.apache.dolphinscheduler.common.utils.HadoopUtils.<clinit>(HadoopUtils.java:79)
> at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.downloadResource(TaskExecuteThread.java:291)
> at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:117)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> [INFO] 2021-09-17 15:02:39.824 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[293] - get resource file from hdfs :/dolphinscheduler/root/resources/shell/dws/dws_execute_agg.sh
> [ERROR] 2021-09-17 15:02:40.658 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[296] - /exec/process/2/610/715/920/shell/dws/dws_execute_agg.sh (没有那个文件或目录)
> java.io.FileNotFoundException: /exec/process/2/610/715/920/shell/dws/dws_execute_agg.sh (没有那个文件或目录)
> at java.io.FileOutputStream.open0(Native Method)
> at java.io.FileOutputStream.open(FileOutputStream.java:270)
> at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
> at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:485)
> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:465)
> at org.apache.dolphinscheduler.common.utils.HadoopUtils.copyHdfsToLocal(HadoopUtils.java:341)
> at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.downloadResource(TaskExecuteThread.java:294)
> at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:117)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> [ERROR] 2021-09-17 15:02:40.658 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[149] - task scheduler failure
> java.lang.RuntimeException: /exec/process/2/610/715/920/shell/dws/dws_execute_agg.sh (没有那个文件或目录)
> at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.downloadResource(TaskExecuteThread.java:297)
> at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:117)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> [INFO] 2021-09-17 15:02:40.664 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[185] - develop mode is: false
> [INFO] 2021-09-17 15:02:40.665 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[203] - exec local path: /exec/process/2/610/715/920 cleared.
> [INFO] 2021-09-17 15:09:31.929 org.apache.dolphinscheduler.server.worker.WorkerServer:[151] - worker server is stopping ..., cause : shutdownHook
> [INFO] 2021-09-17 15:09:31.936 org.apache.dolphinscheduler.remote.NettyRemotingClient:[403] - netty client closed
> [INFO] 2021-09-17 15:09:31.936 org.apache.dolphinscheduler.service.log.LogClientService:[59] - logger client closed
> [INFO] 2021-09-17 15:09:34.934 org.apache.dolphinscheduler.remote.NettyRemotingServer:[237] - netty server closed
> [INFO] 2021-09-17 15:09:34.943 org.apache.dolphinscheduler.server.worker.registry.WorkerRegistry:[141] - worker node : 10.100.0.225:1234 unRegistry from ZK /dolphinscheduler/nodes/worker/default/10.100.0.225:1234.
> [INFO] 2021-09-17 15:09:34.944 org.apache.dolphinscheduler.server.worker.registry.WorkerRegistry:[144] - heartbeat executor shutdown

**When I roll back to version 1.3.6, the task executes normally.**

### What you expected to happen

> [INFO] 2021-09-17 15:16:36.585 org.apache.dolphinscheduler.server.worker.processor.TaskExecuteProcessor:[110] - received command : TaskExecuteRequestCommand{taskExecutionContext='{"cmdTypeIfComplement":7,"dataxTaskExecutionContext":{"dataSourceId":0,"dataTargetId":0,"sourcetype":0,"targetType":0},"executorId":1,"procedureTaskExecutionContext":{},"processDefineId":610,"processId":0,"processInstanceId":715,"projectId":2,"queue":"batch","resources":{"/shell/dws/dws_execute_agg.sh":"root"},"sqlTaskExecutionContext":{"warningGroupId":0},"sqoopTaskExecutionContext":{"dataSourceId":0,"dataTargetId":0,"sourcetype":0,"targetType":0},"taskInstanceId":922,"taskJson":"{\"conditionResult\":\"{\\\"successNode\\\":[\\\"\\\"],\\\"failedNode\\\":[\\\"\\\"]}\",\"conditionsTask\":false,\"depList\":[],\"dependence\":\"{}\",\"forbidden\":false,\"id\":\"tasks-85378\",\"maxRetryTimes\":0,\"name\":\"targetHistoryAgg\",\"params\":\"{\\\"rawScript\\\":\\\"sh shell/dws/dws_execute_agg.sh ${day} ${yesterday}\\\",\\\"localParams\\\":[{\\\"prop\\\":\\\"day\\\",\\\"direct\\\":\\\"IN\\\",\\\"type\\\":\\\"VARCHAR\\\",\\\"value\\\":\\\"$[yyyy-MM-dd-1]\\\"},{\\\"prop\\\":\\\"yesterday\\\",\\\"direct\\\":\\\"IN\\\",\\\"type\\\":\\\"VARCHAR\\\",\\\"value\\\":\\\"$[yyyy-MM-dd-2]\\\"}],\\\"resourceList\\\":[{\\\"res\\\":\\\"shell/dws/dws_execute_agg.sh\\\",\\\"name\\\":\\\"dws_execute_agg.sh\\\",\\\"id\\\":451}]}\",\"preTasks\":\"[]\",\"retryInterval\":1,\"runFlag\":\"NORMAL\",\"taskInstancePriority\":\"MEDIUM\",\"taskTimeoutParameter\":{\"enable\":false,\"interval\":0},\"timeout\":\"{\\\"enable\\\":false,\\\"strategy\\\":\\\"\\\"}\",\"type\":\"SHELL\",\"workerGroup\":\"default\"}","taskName":"targetHistoryAgg","taskTimeout":0,"taskTimeoutStrategy":0,"taskType":"SHELL","tenantCode":"root","workerGroup":"default"}'}
> [INFO] 2021-09-17 15:16:36.606 org.apache.dolphinscheduler.server.worker.processor.TaskExecuteProcessor:[137] - task instance local execute path : /tmp/dolphinscheduler/exec/process/2/610/715/922
> [INFO] 2021-09-17 15:16:36.611 org.apache.dolphinscheduler.common.utils.FileUtils:[115] - create dir success /tmp/dolphinscheduler/exec/process/2/610/715/922
> [INFO] 2021-09-17 15:16:36.611 - [taskAppId=TASK-610-715-922]:[115] - create dir success /tmp/dolphinscheduler/exec/process/2/610/715/922
> [INFO] 2021-09-17 15:16:36.632 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[106] - script path : /tmp/dolphinscheduler/exec/process/2/610/715/922
> [INFO] 2021-09-17 15:16:36.691 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[269] - get resource file from hdfs :/dolphinscheduler/root/resources/shell/dws/dws_execute_agg.sh

**The above is the log of a normal execution. I observed that the two logs differ in the local execute path: the working run uses "/tmp/dolphinscheduler/exec/process/2/610/715/922", while the failing run uses "/exec/process/2/610/715/920". I am not sure whether this is related to the failure.**

### How to reproduce

Upgrade from 1.3.6 to 1.3.8.

### Anything else

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
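
A note on the path difference reported in this issue: it can be checked with a small script. The sketch below is not DolphinScheduler code. It assumes the worker builds the local execute path as `<base dir>/exec/process/<projectId>/<processDefineId>/<processInstanceId>/<taskInstanceId>`, which matches both logs, and that the base directory is read from a property named `data.basedir.path` in the worker's `common.properties` (an assumption; the logs do not name the property). All function names are hypothetical. It shows that an empty base directory would produce exactly the failing path, and it lists properties whose value is empty, which would also be consistent with the `NumberFormatException: For input string: ""` raised from `PropertyUtils.getInt`.

```python
# Hypothetical diagnostic sketch; not part of DolphinScheduler.
# Assumption: the base directory comes from this property (not confirmed by the logs).
BASEDIR_KEY = "data.basedir.path"


def load_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def empty_keys(props):
    """Return the sorted keys whose value is an empty string."""
    return sorted(key for key, value in props.items() if value == "")


def exec_path(basedir, project_id, define_id, instance_id, task_id):
    """Join a base dir with the exec/process/... suffix seen in both logs."""
    return f"{basedir}/exec/process/{project_id}/{define_id}/{instance_id}/{task_id}"


if __name__ == "__main__":
    # An empty base dir reproduces the failing path; the 1.3.6 log's base dir
    # reproduces the working one.
    print(exec_path("", 2, 610, 715, 920))
    print(exec_path("/tmp/dolphinscheduler", 2, 610, 715, 922))

    # Sample file content (made up for illustration), with one empty value.
    sample = "# sample\ndata.basedir.path=\nresource.storage.type=HDFS\n"
    print(empty_keys(load_properties(sample)))
```

If `empty_keys` reports entries for the upgraded installation, comparing the 1.3.8 configuration with the working 1.3.6 one may show which settings were not carried over during the upgrade.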
