gjzhappy opened a new issue, #14203: URL: https://github.com/apache/dolphinscheduler/issues/14203
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.

### What happened

Hi, I am trying to run a YARN job from a shell script. The task is submitted as the user dolphinscheduler, which already has the required sudo permission. Strangely, the job sometimes succeeds and sometimes fails, and I would like to know why. Here is my log:

[LOG-PATH]: /opt/software/dolphinscheduler_server/worker-server/logs/20230525/9670932225760_4-161-310.log, [HOST]: Host{address='192.168.16.21:1234', ip='192.168.16.21', port=1234}
[INFO] 2023-05-25 05:50:30.772 +0000 - Begin to pulling task
[INFO] 2023-05-25 05:50:30.774 +0000 - Begin to initialize task
[INFO] 2023-05-25 05:50:30.774 +0000 - Set task startTime: Thu May 25 05:50:30 UTC 2023
[INFO] 2023-05-25 05:50:30.774 +0000 - Set task envFile: /opt/software/dolphinscheduler_server/worker-server/conf/dolphinscheduler_env.sh
[INFO] 2023-05-25 05:50:30.774 +0000 - Set task appId: 161_310
[INFO] 2023-05-25 05:50:30.775 +0000 - End initialize task
[INFO] 2023-05-25 05:50:30.776 +0000 - Set task status to TaskExecutionStatus{code=1, desc='running'}
[INFO] 2023-05-25 05:50:30.777 +0000 - TenantCode:root check success
[ERROR] 2023-05-25 05:50:30.777 +0000 - Task execute failed, due to meet an exception
org.apache.dolphinscheduler.plugin.task.api.TaskException: Cannot create process execute dir
    at org.apache.dolphinscheduler.server.worker.utils.TaskExecutionCheckerUtils.createProcessLocalPathIfAbsent(TaskExecutionCheckerUtils.java:93)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecuteRunnable.beforeExecute(WorkerTaskExecuteRunnable.java:213)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecuteRunnable.run(WorkerTaskExecuteRunnable.java:170)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.dolphinscheduler.plugin.task.api.TaskException: Set tenant directory permission failed, tenant: root
    at org.apache.dolphinscheduler.server.worker.utils.TaskExecutionCheckerUtils.createDirectoryWithOwner(TaskExecutionCheckerUtils.java:157)
    at org.apache.dolphinscheduler.server.worker.utils.TaskExecutionCheckerUtils.createProcessLocalPathIfAbsent(TaskExecutionCheckerUtils.java:91)
    ... 9 common frames omitted
Caused by: java.nio.file.AccessDeniedException: /tmp/dolphinscheduler/exec/process/root/9253872458752/9670932225760_4
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
    at java.nio.file.Files.createDirectory(Files.java:674)
    at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
    at java.nio.file.Files.createDirectories(Files.java:767)
    at org.apache.dolphinscheduler.server.worker.utils.TaskExecutionCheckerUtils.createDirectoryWithOwner(TaskExecutionCheckerUtils.java:147)
    ... 10 common frames omitted
[INFO] 2023-05-25 05:50:30.778 +0000 - Get a exception when execute the task, will send the task execute result to master, the current task execute result is TaskExecutionStatus{code=6, desc='failure'}

### What you expected to happen

I expected this job to succeed.

### How to reproduce

Submit a shell workflow with a script like the following: `yarn jar shellcommand.jar -jar shellcommand.jar -shell_command "touch /usr/tmp/test0525.txt; sleep 300;" -container_resources memory-mb=1024,vcores=1 -num_containers 1 -queue "root.dev"`. It fails intermittently.

### Anything else

_No response_

### Version

3.1.x

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
