[jira] [Updated] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children

2017-07-21 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-4731:
-
Fix Version/s: 2.8.2

I committed this to branch-2.8 and branch-2.8.2 as well.

> container-executor should not follow symlinks in recursive_unlink_children
> --
>
> Key: YARN-4731
> URL: https://issues.apache.org/jira/browse/YARN-4731
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.9.0
> Reporter: Bibin A Chundatt
> Assignee: Colin P. McCabe
> Priority: Blocker
> Fix For: 2.9.0, 3.0.0-alpha1, 2.8.2
>
> Attachments: YARN-4731.001.patch, YARN-4731.002.patch
>
>
> Enable LCE (LinuxContainerExecutor) and CGroups
> Submit a MapReduce job
> {noformat}
> 2016-02-24 18:56:46,889 INFO 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting 
> absolute path : 
> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01
> 2016-02-24 18:56:46,894 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
>  Shell execution returned exit code: 255. Privileged Execution Operation 
> Output:
> main : command provided 3
> main : run as user is dsperf
> main : requested yarn user is dsperf
> failed to rmdir job.jar: Not a directory
> Error while deleting 
> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01:
>  20 (Not a directory)
> Full command array for failed execution:
> [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor,
>  dsperf, dsperf, 3, 
> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01]
> 2016-02-24 18:56:46,894 ERROR 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: 
> DeleteAsUser for 
> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01
>  returned with exit code: 255
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException:
>  ExitCodeException exitCode=255:
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:199)
> at 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.deleteAsUser(LinuxContainerExecutor.java:569)
> at 
> org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:265)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: ExitCodeException exitCode=255:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:927)
> at org.apache.hadoop.util.Shell.run(Shell.java:838)
> at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150)
> ... 10 more
> {noformat}
> As a result, the nodemanager-local directories are not getting deleted for
> each application:
> {noformat}
> total 36
> drwxr-s--- 4 hdfs hadoop 4096 Feb 25 08:25 ./
> drwxr-s--- 7 hdfs hadoop 4096 Feb 25 08:25 ../
> -rw------- 1 hdfs hadoop  340 Feb 25 08:25 container_tokens
> lrwxrwxrwx 1 hdfs hadoop  111 Feb 25 08:25 job.jar -> 
> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/11/job.jar/
> lrwxrwxrwx 1 hdfs hadoop  111 Feb 25 08:25 job.xml -> 
> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/13/job.xml*
> drwxr-s--- 2 hdfs hadoop 4096 Feb 25 08:25 jobSubmitDir/
> -rwx------ 1 hdfs hadoop 5348 Feb 25 08:25 launch_container.sh*
> drwxr-s--- 2 hdfs hadoop 4096 Feb 25 08:25 tmp/
> {noformat}

[jira] [Updated] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children

2016-02-28 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev updated YARN-4731:

Assignee: Colin Patrick McCabe  (was: Varun Vasudev)

> container-executor should not follow symlinks in recursive_unlink_children
> --
>
> Key: YARN-4731
> URL: https://issues.apache.org/jira/browse/YARN-4731
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.9.0
> Reporter: Bibin A Chundatt
> Assignee: Colin Patrick McCabe
> Priority: Blocker
> Attachments: YARN-4731.001.patch, YARN-4731.002.patch
>
>
> Enable LCE (LinuxContainerExecutor) and CGroups
> Submit a MapReduce job
> {noformat}
> 2016-02-24 18:56:46,889 INFO 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting 
> absolute path : 
> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01
> 2016-02-24 18:56:46,894 WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
>  Shell execution returned exit code: 255. Privileged Execution Operation 
> Output:
> main : command provided 3
> main : run as user is dsperf
> main : requested yarn user is dsperf
> failed to rmdir job.jar: Not a directory
> Error while deleting 
> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01:
>  20 (Not a directory)
> Full command array for failed execution:
> [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor,
>  dsperf, dsperf, 3, 
> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01]
> 2016-02-24 18:56:46,894 ERROR 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: 
> DeleteAsUser for 
> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01
>  returned with exit code: 255
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException:
>  ExitCodeException exitCode=255:
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:199)
> at 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.deleteAsUser(LinuxContainerExecutor.java:569)
> at 
> org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:265)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: ExitCodeException exitCode=255:
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:927)
> at org.apache.hadoop.util.Shell.run(Shell.java:838)
> at 
> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150)
> ... 10 more
> {noformat}
> As a result, the nodemanager-local directories are not getting deleted for
> each application:
> {noformat}
> total 36
> drwxr-s--- 4 hdfs hadoop 4096 Feb 25 08:25 ./
> drwxr-s--- 7 hdfs hadoop 4096 Feb 25 08:25 ../
> -rw------- 1 hdfs hadoop  340 Feb 25 08:25 container_tokens
> lrwxrwxrwx 1 hdfs hadoop  111 Feb 25 08:25 job.jar -> 
> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/11/job.jar/
> lrwxrwxrwx 1 hdfs hadoop  111 Feb 25 08:25 job.xml -> 
> /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/13/job.xml*
> drwxr-s--- 2 hdfs hadoop 4096 Feb 25 08:25 jobSubmitDir/
> -rwx------ 1 hdfs hadoop 5348 Feb 25 08:25 launch_container.sh*
> drwxr-s--- 2 hdfs hadoop 4096 Feb 25 08:25 tmp/
> {noformat}
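
The "Not a directory" failure quoted above is what rmdir reports when it is handed a symlink, even one that points at a directory, and job.jar in the listing is such a link. The Python sketch below only reproduces that errno in a scratch directory. It is not the container-executor code (which is C), the helper name and the file names are made up for illustration, and whether container-executor reached rmdir this way is an assumption.

```python
import os
import tempfile


def rmdir_symlink_errno():
    """Return the errno raised by rmdir() on a symlink to a directory.

    Hypothetical helper for illustration only; returns 0 if rmdir
    unexpectedly succeeds.
    """
    with tempfile.TemporaryDirectory() as base:
        target = os.path.join(base, "filecache")  # stand-in for the real target
        link = os.path.join(base, "job.jar")      # symlink, as in the listing
        os.mkdir(target)
        os.symlink(target, link)
        try:
            # rmdir does not follow the link, and a symlink is not a directory.
            os.rmdir(link)
        except OSError as exc:
            return exc.errno
        return 0
```

Comparing the return value with errno.ENOTDIR shows the same error class as in the log. Removing the link itself would take unlink rather than rmdir.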



--
This message was sent by Atlassian JIRA

[jira] [Updated] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children

2016-02-26 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated YARN-4731:
---
Attachment: YARN-4731.002.patch

> container-executor should not follow symlinks in recursive_unlink_children
> --
>
> Key: YARN-4731
> URL: https://issues.apache.org/jira/browse/YARN-4731
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.9.0
> Reporter: Bibin A Chundatt
> Assignee: Varun Vasudev
> Priority: Blocker
> Attachments: YARN-4731.001.patch, YARN-4731.002.patch
>
>




[jira] [Updated] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children

2016-02-26 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated YARN-4731:
---
Attachment: (was: YARN-4731.001.patch)

> container-executor should not follow symlinks in recursive_unlink_children
> --
>
> Key: YARN-4731
> URL: https://issues.apache.org/jira/browse/YARN-4731
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.9.0
> Reporter: Bibin A Chundatt
> Assignee: Varun Vasudev
> Priority: Blocker
> Attachments: YARN-4731.001.patch
>
>





[jira] [Updated] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children

2016-02-26 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated YARN-4731:
---
Attachment: YARN-4731.001.patch

Here is a patch which changes {{recursive_unlink_children}} to skip removing 
symlinks.  It doesn't open up a TOCTOU security issue, since it opens files 
with {{O_NOFOLLOW}} after doing the symlink check.  I added a unit test case 
for {{recursive_unlink_children}}.
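
To illustrate the general technique (this is not the attached patch, which is C code in container-executor), here is a minimal Python sketch of a recursive delete that never follows symlinks. The function name unlink_tree_nofollow is hypothetical. It assumes a Linux-like system where os.open accepts O_NOFOLLOW and O_DIRECTORY and where the os functions support dir_fd. In this sketch a symlink entry is removed as a link and its target is left alone; that behaviour is an assumption of the example.

```python
import os
import stat


def unlink_tree_nofollow(name, dir_fd=None):
    """Delete `name` and everything under it without following symlinks.

    Hypothetical helper: `name` is resolved relative to `dir_fd` when given.
    """
    # lstat inspects the entry itself, so a symlink is reported as a symlink.
    st = os.lstat(name, dir_fd=dir_fd)
    if not stat.S_ISDIR(st.st_mode):
        # Files and symlinks: remove the directory entry, never the target.
        os.unlink(name, dir_fd=dir_fd)
        return

    # O_NOFOLLOW makes the open fail if the entry was swapped for a symlink
    # after the lstat check, which closes the check-then-use window.
    fd = os.open(name, os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW,
                 dir_fd=dir_fd)
    try:
        # Children are addressed relative to the open descriptor, so later
        # renames of parent directories cannot redirect the walk.
        for child in os.listdir(fd):
            unlink_tree_nofollow(child, dir_fd=fd)
    finally:
        os.close(fd)
    os.rmdir(name, dir_fd=dir_fd)
```

Calling unlink_tree_nofollow on a container directory that holds links like job.jar and job.xml would remove the links and the directory while leaving the filecache targets untouched.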

> container-executor should not follow symlinks in recursive_unlink_children
> --
>
> Key: YARN-4731
> URL: https://issues.apache.org/jira/browse/YARN-4731
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.9.0
> Reporter: Bibin A Chundatt
> Assignee: Varun Vasudev
> Priority: Blocker
> Attachments: YARN-4731.001.patch, YARN-4731.001.patch
>
>

[jira] [Updated] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children

2016-02-26 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated YARN-4731:
---
Summary: container-executor should not follow symlinks in 
recursive_unlink_children  (was: Linux container executor fails to delete 
nmlocal folders)

> container-executor should not follow symlinks in recursive_unlink_children
> --
>
> Key: YARN-4731
> URL: https://issues.apache.org/jira/browse/YARN-4731
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.9.0
> Reporter: Bibin A Chundatt
> Assignee: Varun Vasudev
> Priority: Blocker
> Attachments: YARN-4731.001.patch
>
>