[jira] [Updated] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children
[ https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated YARN-4731:
-----------------------------
    Fix Version/s: 2.8.2

I committed this to branch-2.8 and branch-2.8.2 as well.

> container-executor should not follow symlinks in recursive_unlink_children
> --------------------------------------------------------------------------
>
>                 Key: YARN-4731
>                 URL: https://issues.apache.org/jira/browse/YARN-4731
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.0
>            Reporter: Bibin A Chundatt
>            Assignee: Colin P. McCabe
>            Priority: Blocker
>             Fix For: 2.9.0, 3.0.0-alpha1, 2.8.2
>
>         Attachments: YARN-4731.001.patch, YARN-4731.002.patch
>
>
> Enable LCE and CGroups
> Submit a mapreduce job
> {noformat}
> 2016-02-24 18:56:46,889 INFO org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting absolute path : /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01
> 2016-02-24 18:56:46,894 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 255. Privileged Execution Operation Output:
> main : command provided 3
> main : run as user is dsperf
> main : requested yarn user is dsperf
> failed to rmdir job.jar: Not a directory
> Error while deleting /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01: 20 (Not a directory)
> Full command array for failed execution:
> [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, dsperf, dsperf, 3, /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01]
> 2016-02-24 18:56:46,894 ERROR org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: DeleteAsUser for /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01 returned with exit code: 255
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=255:
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:199)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.deleteAsUser(LinuxContainerExecutor.java:569)
>     at org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:265)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: ExitCodeException exitCode=255:
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:927)
>     at org.apache.hadoop.util.Shell.run(Shell.java:838)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150)
>     ... 10 more
> {noformat}
> As a result, the nodemanager-local directories are not deleted for each application:
> {noformat}
> total 36
> drwxr-s--- 4 hdfs hadoop 4096 Feb 25 08:25 ./
> drwxr-s--- 7 hdfs hadoop 4096 Feb 25 08:25 ../
> -rw------- 1 hdfs hadoop  340 Feb 25 08:25 container_tokens
> lrwxrwxrwx 1 hdfs hadoop  111 Feb 25 08:25 job.jar -> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/11/job.jar/
> lrwxrwxrwx 1 hdfs hadoop  111 Feb 25 08:25 job.xml -> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/13/job.xml*
> drwxr-s--- 2 hdfs hadoop 4096 Feb 25 08:25 jobSubmitDir/
> -rwx------ 1 hdfs hadoop 5348 Feb 25 08:25 launch_container.sh*
> drwxr-s--- 2 hdfs hadoop 4096 Feb 25 08:25 tmp/
> {noformat}
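The `failed to rmdir job.jar: Not a directory` line in the log above is consistent with POSIX `rmdir()` semantics: called on a symbolic link, it fails with `ENOTDIR` (errno 20 on Linux) even when the link points at a directory, and `job.jar` in the listing is such a link. The following is a minimal sketch that reproduces only that behaviour, assuming a POSIX system with a writable `/tmp`. The function name and the paths are hypothetical and are not taken from container-executor.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo, not container-executor code.
 * Creates <tmp>/filecache (a real directory) and <tmp>/job.jar (a symlink
 * to it), then calls rmdir() on the symlink.
 * Returns the errno from rmdir(), 0 if rmdir() unexpectedly succeeded,
 * or -1 if the setup failed. */
int rmdir_on_symlink_errno(void)
{
    char root[] = "/tmp/rmdir-link-XXXXXX";
    if (mkdtemp(root) == NULL) return -1;

    char target[64], link_path[64];
    int n1 = snprintf(target, sizeof target, "%s/filecache", root);
    int n2 = snprintf(link_path, sizeof link_path, "%s/job.jar", root);
    assert(n1 > 0 && n1 < (int)sizeof target);
    assert(n2 > 0 && n2 < (int)sizeof link_path);

    int result = -1;
    if (mkdir(target, 0700) == 0 && symlink(target, link_path) == 0) {
        /* rmdir() acts on the link itself, and a link is not a directory. */
        result = (rmdir(link_path) == 0) ? 0 : errno;
    }

    /* Clean up: unlink() removes the link, then the real directories go. */
    unlink(link_path);
    rmdir(target);
    rmdir(root);
    return result;
}
```

Calling `rmdir_on_symlink_errno()` should return `ENOTDIR`; removing the link itself takes `unlink()`, which is what the clean-up step uses.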
[jira] [Updated] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children
[ https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Vasudev updated YARN-4731:
--------------------------------
    Assignee: Colin Patrick McCabe  (was: Varun Vasudev)

> container-executor should not follow symlinks in recursive_unlink_children
> --------------------------------------------------------------------------
>
>                 Key: YARN-4731
>                 URL: https://issues.apache.org/jira/browse/YARN-4731
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.0
>            Reporter: Bibin A Chundatt
>            Assignee: Colin Patrick McCabe
>            Priority: Blocker
>         Attachments: YARN-4731.001.patch, YARN-4731.002.patch
[jira] [Updated] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children
[ https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated YARN-4731:
---------------------------------------
    Attachment: YARN-4731.002.patch

> container-executor should not follow symlinks in recursive_unlink_children
> --------------------------------------------------------------------------
>
>                 Key: YARN-4731
>                 URL: https://issues.apache.org/jira/browse/YARN-4731
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.0
>            Reporter: Bibin A Chundatt
>            Assignee: Varun Vasudev
>            Priority: Blocker
>         Attachments: YARN-4731.001.patch, YARN-4731.002.patch
[jira] [Updated] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children
[ https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated YARN-4731:
---------------------------------------
    Attachment:     (was: YARN-4731.001.patch)

> container-executor should not follow symlinks in recursive_unlink_children
> --------------------------------------------------------------------------
>
>                 Key: YARN-4731
>                 URL: https://issues.apache.org/jira/browse/YARN-4731
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.0
>            Reporter: Bibin A Chundatt
>            Assignee: Varun Vasudev
>            Priority: Blocker
>         Attachments: YARN-4731.001.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children
[ https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated YARN-4731:
---------------------------------------
    Attachment: YARN-4731.001.patch

Here is a patch which changes {{recursive_unlink_children}} to skip removing symlinks. It doesn't open up a TOCTOU security issue, since it opens files with {{O_NOFOLLOW}} after doing the symlink check. I added a unit test case for {{recursive_unlink_children}}.

> container-executor should not follow symlinks in recursive_unlink_children
> --------------------------------------------------------------------------
>
>                 Key: YARN-4731
>                 URL: https://issues.apache.org/jira/browse/YARN-4731
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.0
>            Reporter: Bibin A Chundatt
>            Assignee: Varun Vasudev
>            Priority: Blocker
>         Attachments: YARN-4731.001.patch, YARN-4731.001.patch
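The approach described in the comment above (check each entry for a symlink, then open with `O_NOFOLLOW` so that a swap between check and open cannot redirect the traversal) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from YARN-4731.001.patch: the names `unlink_children_nofollow` and `run_demo` are hypothetical, and the sketch removes a symlink entry itself without touching its target, which may differ from how the patch treats symlinks.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, NOT the code from YARN-4731.001.patch.
 * Deletes everything inside directory `name` (resolved relative to
 * `parent_fd`) without ever following a symlink.
 * Returns 0 on success, otherwise an errno value. */
int unlink_children_nofollow(int parent_fd, const char *name)
{
    assert(name != NULL);
    /* O_NOFOLLOW: if `name` was swapped for a symlink after the caller
     * checked it, this open fails instead of entering the link target. */
    int fd = openat(parent_fd, name,
                    O_RDONLY | O_DIRECTORY | O_NOFOLLOW | O_CLOEXEC);
    if (fd < 0) return errno;

    DIR *dir = fdopendir(fd);
    if (dir == NULL) { int err = errno; close(fd); return err; }

    int result = 0;
    while (result == 0) {
        errno = 0;
        struct dirent *de = readdir(dir);
        if (de == NULL) { result = errno; break; } /* end of dir, or error */
        if (!strcmp(de->d_name, ".") || !strcmp(de->d_name, "..")) continue;

        struct stat st;
        /* The symlink check: lstat-style, relative to the open directory. */
        if (fstatat(fd, de->d_name, &st, AT_SYMLINK_NOFOLLOW) != 0) {
            result = errno;
        } else if (S_ISDIR(st.st_mode)) {
            result = unlink_children_nofollow(fd, de->d_name);
            if (result == 0 && unlinkat(fd, de->d_name, AT_REMOVEDIR) != 0)
                result = errno;
        } else {
            /* Regular files and symlinks: only the entry itself is removed. */
            if (unlinkat(fd, de->d_name, 0) != 0) result = errno;
        }
    }
    closedir(dir); /* also closes fd */
    return result;
}

/* Demo on a throwaway tree under /tmp: work/ holds a subdirectory with a
 * file and a symlink to ../keep. Returns 0 if work/ was emptied and
 * keep/data survived, -1 otherwise. */
int run_demo(void)
{
    char root[] = "/tmp/yarn4731-demo-XXXXXX";
    if (mkdtemp(root) == NULL) return -1;
    int rfd = open(root, O_RDONLY | O_DIRECTORY);
    if (rfd < 0) return -1;

    int ok = mkdirat(rfd, "keep", 0700) == 0
          && mkdirat(rfd, "work", 0700) == 0
          && mkdirat(rfd, "work/sub", 0700) == 0
          && symlinkat("../keep", rfd, "work/link") == 0;
    int f1 = openat(rfd, "keep/data", O_CREAT | O_WRONLY, 0600);
    int f2 = openat(rfd, "work/sub/inner", O_CREAT | O_WRONLY, 0600);
    ok = ok && f1 >= 0 && f2 >= 0;
    if (f1 >= 0) close(f1);
    if (f2 >= 0) close(f2);

    ok = ok && unlink_children_nofollow(rfd, "work") == 0
            && unlinkat(rfd, "work", AT_REMOVEDIR) == 0    /* only if empty */
            && faccessat(rfd, "keep/data", F_OK, 0) == 0;  /* target intact */

    /* Clean up the parts that were deliberately left alone. */
    unlinkat(rfd, "keep/data", 0);
    unlinkat(rfd, "keep", AT_REMOVEDIR);
    close(rfd);
    rmdir(root);
    return ok ? 0 : -1;
}
```

Working relative to directory file descriptors (`openat`, `fstatat`, `unlinkat`) rather than rebuilding path strings is a design choice of this sketch: every step is resolved against the directory that was actually opened, so renaming or replacing a parent directory mid-traversal cannot steer the deletion elsewhere.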
[jira] [Updated] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children
[ https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated YARN-4731:
---------------------------------------
    Summary: container-executor should not follow symlinks in recursive_unlink_children  (was: Linux container executor fails to delete nmlocal folders)

> container-executor should not follow symlinks in recursive_unlink_children
> --------------------------------------------------------------------------
>
>                 Key: YARN-4731
>                 URL: https://issues.apache.org/jira/browse/YARN-4731
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.0
>            Reporter: Bibin A Chundatt
>            Assignee: Varun Vasudev
>            Priority: Blocker
>         Attachments: YARN-4731.001.patch