caishunfeng opened a new issue #6829: URL: https://github.com/apache/dolphinscheduler/issues/6829
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.

### What happened

version: 2.0

```
[ERROR] 2021-11-12 14:13:35.611 org.apache.dolphinscheduler.common.utils.OSUtils:[175] - /etc/passwd (Too many open files)
java.io.FileNotFoundException: /etc/passwd (Too many open files)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileInputStream.<init>(FileInputStream.java:93)
    at org.apache.dolphinscheduler.common.utils.OSUtils.getUserListFromLinux(OSUtils.java:189)
    at org.apache.dolphinscheduler.common.utils.OSUtils.getUserList(OSUtils.java:172)
    at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:139)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
[ERROR] 2021-11-12 14:13:35.611 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[141] - tenantCode: root does not exist
[INFO] 2021-11-12 14:13:35.611 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[232] - develop mode is: false
[ERROR] 2021-11-12 14:13:35.611 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[252] - delete exec dir failed : Failed to list contents of /tmp/dolphinscheduler/exec/process/851650632065024/851651851886592_1/3524626/4979296
java.io.IOException: Failed to list contents of /tmp/dolphinscheduler/exec/process/851650632065024/851651851886592_1/3524626/4979296
    at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1647)
    at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
    at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.clearTaskExecPath(TaskExecuteThread.java:249)
    at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:220)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```

```
[root@ds8 apache-dolphinscheduler-2.0.1-alpha-SNAPSHOT-bin]# ulimit -n
65535
[root@ds8 apache-dolphinscheduler-2.0.1-alpha-SNAPSHOT-bin]# jps
3767833 Jps
3767487 WorkerServer
[root@ds8 apache-dolphinscheduler-2.0.1-alpha-SNAPSHOT-bin]# lsof -p 3767487 | wc -l
66016
```

When I ran more than 60,000 (6w+) tasks in dryRun mode, the worker logged many `Too many open files` errors. It seems the worker is not closing files: the number of open files keeps growing even after tasks have failed or finished.

### What you expected to happen

The worker should close the files it opens.

### How to reproduce

Run more than 60,000 tasks in dryRun mode.

### Anything else

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
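
For reference, the stack trace shows the descriptor limit being hit while `OSUtils.getUserListFromLinux` opens `/etc/passwd` through a `FileInputStream`. That is only where the limit was reached, not necessarily where descriptors leak. The sketch below illustrates the general try-with-resources pattern that guarantees such a file is closed on every path. `PasswdUsers` and `readUserNames` are hypothetical names chosen for illustration; this is not the project's actual `OSUtils` code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not DolphinScheduler's OSUtils.
final class PasswdUsers {

    private PasswdUsers() {
    }

    // Returns the first colon-separated field (the user name) of every
    // non-empty, non-comment line in a passwd-style file.
    static List<String> readUserNames(Path passwdFile) throws IOException {
        List<String> users = new ArrayList<>();
        // try-with-resources closes the reader, and with it the underlying
        // file descriptor, even when readLine() throws.
        try (BufferedReader reader =
                 Files.newBufferedReader(passwdFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty() || line.startsWith("#")) {
                    continue;
                }
                int colon = line.indexOf(':');
                users.add(colon >= 0 ? line.substring(0, colon) : line);
            }
        }
        return users;
    }

    public static void main(String[] args) throws IOException {
        Path passwd = Paths.get("/etc/passwd");
        // Skip quietly on systems that have no /etc/passwd.
        if (Files.exists(passwd)) {
            System.out.println(readUserNames(passwd));
        }
    }
}
```

If the worker still accumulates descriptors with this pattern in place, the `lsof -p <pid>` output grouped by file name may help show which paths are being left open.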
