[ https://issues.apache.org/jira/browse/MAPREDUCE-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hemanth Yamijala updated MAPREDUCE-1435:
----------------------------------------

    Attachment: 1435.v4.patch

Patch incorporates review comments from Amarsri and Ravi. Changes are:
- I am now using ClusterWithLinuxTaskController.taskTrackerSpecialGroup as the expected group for private distributed cache files.
- Added ownership and group ownership checks for public distributed cache files. The group owner for public distributed cache files is the primary group of the tasktracker. I added ClusterWithLinuxTaskController.taskTrackerPrimaryGroup along the same lines as ClusterWithLinuxTaskController.taskTrackerSpecialGroup.

However,
bq. Once we add that also to the checks of public distributed cache files, then ClusterWithLinuxTaskController.checkPermissionsOnDir() can be reused for these checks also and can avoid TestTrackerDistributedCacheManager.checkPublicFilePermissions() possibly.

I have not done the above. The checks for private distributed cache files require an exact match of all the permission bits for owner, group and others. For public distributed cache files, the code only adds the 'read' and 'execute' bits for all users; it does not modify the 'write' bits. This means the write permissions are indeterminate (e.g. they could depend on the permissions of files in an archive that is unarchived in the distributed cache). Hence, instead of reusing the private-file model for checking permissions, I have retained the original model for checking permissions of the public cache files.

I ran all task-controller tests on this, and they passed.

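To make the difference between the two checking models concrete, here is a minimal sketch in plain Java. It is not the code from the patch: the class name CachePermissionChecks, both method names, and the mode strings are hypothetical, and it uses java.nio permission sets rather than Hadoop's own permission classes. It only illustrates why an exact-match check cannot be applied to files whose write bits are left untouched.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of Hadoop.
public class CachePermissionChecks {

  // The bits the public-cache code is described as adding for all users.
  static final Set<PosixFilePermission> READ_EXECUTE_ALL = EnumSet.of(
      PosixFilePermission.OWNER_READ, PosixFilePermission.OWNER_EXECUTE,
      PosixFilePermission.GROUP_READ, PosixFilePermission.GROUP_EXECUTE,
      PosixFilePermission.OTHERS_READ, PosixFilePermission.OTHERS_EXECUTE);

  // Private-file model: every owner/group/others bit must match the expected mode.
  static boolean matchesExactly(Set<PosixFilePermission> actual, String expectedMode) {
    return actual.equals(PosixFilePermissions.fromString(expectedMode));
  }

  // Public-file model: read and execute must be set for everyone;
  // the write bits are deliberately ignored because they are indeterminate.
  static boolean hasReadExecuteForAll(Set<PosixFilePermission> actual) {
    return actual.containsAll(READ_EXECUTE_ALL);
  }

  public static void main(String[] args) {
    // Two files that could both come out of an unpacked archive (assumed modes).
    Set<PosixFilePermission> ownerWritable = PosixFilePermissions.fromString("rwxr-xr-x");
    Set<PosixFilePermission> readOnly = PosixFilePermissions.fromString("r-xr-xr-x");

    // Both satisfy the public-file check.
    System.out.println(hasReadExecuteForAll(ownerWritable));
    System.out.println(hasReadExecuteForAll(readOnly));

    // An exact match against a single expected mode accepts only one of them.
    System.out.println(matchesExactly(ownerWritable, "r-xr-xr-x"));
    System.out.println(matchesExactly(readOnly, "r-xr-xr-x"));
  }
}
```

Under these assumptions, both sample modes pass the read/execute check, while the exact-match check accepts only the one whose write bits happen to equal the expected value.
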
> symlinks in cwd of the task are not handled properly after MAPREDUCE-896
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1435
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1435
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.22.0
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Ravi Gummadi
>             Fix For: 0.22.0
>
>         Attachments: 1435.patch, 1435.v1.patch, 1435.v2.patch, 1435.v3.patch,
>                      1435.v4.patch, MR-1435-y20s.patch
>
>
> With JVM reuse, TaskRunner.setupWorkDir() lists the contents of workDir and
> does a fs.delete on each path listed. If the listed file is a symlink to a
> directory, it will delete the contents of the linked directory. This
> would delete files from the distributed cache and jars directory, if
> mapred.create.symlink is true.
> Changing ownership/permissions of symlinks through ENABLE_TASK_FOR_CLEANUP
> would change ownership/permissions of the underlying files.
> This was observed by Karam while running streaming jobs with DistributedCache
> and JVM reuse.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.