[
https://issues.apache.org/jira/browse/HADOOP-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585876#action_12585876
]
Tsz Wo (Nicholas), SZE commented on HADOOP-3182:
------------------------------------------------
After changing the job-dir permission to 777, I tried the following:
* Start NN, DN as nn_account (superuser by definition)
* Start JT, TT as jt_account (non-superuser)
* Run wordcount as user_account (non-superuser)
Below are my observations on where files and directories are created and which permissions are set on them.
h4. Step 0: During JobTracker startup
mapred-sys-dir (delete-mkdirs-setPerm as jt_account)
733
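For reference, the octal modes quoted in this comment (733, 777, 644) can be decoded into the symbolic form used in the issue title. The sketch below is plain Java and does not use any Hadoop class; the `ModeBits` class and its method names are hypothetical and exist only for illustration.

```java
// Illustrative helper, not part of Hadoop: decodes a 9-bit octal permission mode.
final class ModeBits {
    /** Renders the low 9 bits of mode symbolically, e.g. 0644 becomes "rw-r--r--". */
    static String toSymbolic(int mode) {
        String flags = "rwxrwxrwx";
        StringBuilder sb = new StringBuilder(9);
        for (int i = 0; i < 9; i++) {
            int bit = 1 << (8 - i); // the highest of the 9 bits is owner-read
            sb.append((mode & bit) != 0 ? flags.charAt(i) : '-');
        }
        return sb.toString();
    }

    /** True if a user who is neither the owner nor in the group may read. */
    static boolean otherCanRead(int mode) {
        return (mode & 0004) != 0;
    }

    public static void main(String[] args) {
        // The three modes that appear in this comment.
        for (int mode : new int[] {0733, 0777, 0644}) {
            System.out.println(Integer.toOctalString(mode) + " " + toSymbolic(mode)
                + " otherCanRead=" + otherCanRead(mode));
        }
    }
}
```

With 733 the "other" class has write and execute but no read bit, which is why a directory created with that mode cannot be listed by a different account.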
h4. Step 1: Client side job submission (JobClient.submitJob)
job-dir (aka submitJobDir or mapred-sys-dir/jobId, mkdirs-setPerm as
user_account)
(was 733) 777
job-dir/job.jar (may not exist, create-setPerm by JobClient)
job-dir/job.split (create-setPerm by JobClient)
644
job-dir/job.xml (aka jobconf, create-setPerm by JobClient; Should it be
visible in JobTracker webpage??? It is visible now.)
644
Miscellaneous dirs: Currently created with the default permission, i.e. whatever the
umask allows. What is the correct permission for them???
- filesDir = new Path(submitJobDir, "files");
- archivesDir = new Path(submitJobDir, "archives");
- libjarsDir = new Path(submitJobDir, "libjars");
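To make "default permission, i.e. umask" concrete: a directory created without an explicit setPerm ends up with the requested mode minus the bits masked off by the umask. The sketch below is plain Java, not Hadoop code; the `UmaskDemo` class is hypothetical, and the umask values 022 and 077 are assumptions chosen for illustration, since the comment does not say which umask the cluster uses.

```java
// Illustrative only: shows how a umask narrows a requested permission mode.
final class UmaskDemo {
    /** Effective mode = requested mode with every umask bit cleared. */
    static int applyUmask(int requested, int umask) {
        return requested & ~umask & 0777;
    }

    public static void main(String[] args) {
        int requested = 0777;               // mode asked for when creating a directory
        int[] umasks = {0022, 0077};        // assumed example umask values
        for (int umask : umasks) {
            System.out.println("umask " + Integer.toOctalString(umask) + " -> "
                + Integer.toOctalString(applyUmask(requested, umask)));
        }
    }
}
```

Under these assumed values, the three directories would come out as 755 or 700 depending on the umask, so their visibility to other accounts depends on configuration rather than on an explicit decision in JobClient.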
h4. Step 2: JobTracker side job submission (JobTracker.submitJob)
job-output-dir (aka mapred.output.dir)
job-history-dir
What are the correct permissions???
They may be created by auto-mkdir in JobTracker as user_account
- For example the first file created under job-output-dir in wordcount is
job-output-dir/_logs/history/hostname_1207356629390_job_200804041750_0001_username_wordcount.
Note that job-output-dir/_logs/history is the job-history-dir.
- This file is created as user_account
at
org.apache.hadoop.mapred.JobHistory$JobInfo.logSubmitted(JobHistory.java:413)
at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:194)
at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1751)
Task does mkdirs job-output-dir as user_account
at org.apache.hadoop.mapred.Task.moveTaskOutputs(Task.java:594)
at org.apache.hadoop.mapred.Task.saveTaskOutput(Task.java:554)
at
org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2208)
- In Task.moveTaskOutputs(...), is finalOutputPath a constant? If yes, it
should not be computed again and again in Task.moveTaskOutputs(...).
job-temp-dir (aka _temporary, created in JobInProgress.<init> as user_account)
What is the correct permission???
I still need to look at directory clean-up and at the LocalJobRunner case.
> JobClient creates submitJobDir with SYSTEM_DIR_PERMISSION ( rwx-wx-wx)
> ----------------------------------------------------------------------
>
> Key: HADOOP-3182
> URL: https://issues.apache.org/jira/browse/HADOOP-3182
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.16.2
> Reporter: lohit vijayarenu
> Assignee: Tsz Wo (Nicholas), SZE
> Priority: Blocker
> Fix For: 0.16.3, 0.17.0
>
>
> JobClient creates submitJobDir with SYSTEM_DIR_PERMISSION ( rwx-wx-wx ) which
> causes problem while sharing a cluster.
> Consider the case where userA starts jobtracker/tasktrackers and userB
> submits a job to this cluster. When userB creates submitJobDir it is created
> with rwx-wx-wx which cannot be read by tasktracker started by userA