[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478887#comment-13478887
 ] 

Luke Lu commented on MAPREDUCE-4398:
------------------------------------

The "magic" number of 4 is the default number of job init threads 
(mapred.jobinit.threads). You have to submit 4 (or, more precisely, 
mapred.jobinit.threads) or more jobs as the jobtracker user at the same time to 
make sure the job init threads are initialized as the system user so they can 
access the mapred.system.dir (for security reasons, its mode must be 700). 
Otherwise, some of the job init threads will be initialized as whichever user 
first submits a job. This can lead to seemingly even more bizarre behavior: 
sometimes it works (the job is initialized by one of the system threads) and 
sometimes it doesn't (the job is initialized by one of the user threads). Once 
you know the root cause, it's pretty trivial to come up with a patch. The 
default FIFO scheduler and the capacity scheduler do not have this bug.
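
To make the mechanism concrete, here is a rough plain-Java analogy. It is not 
Hadoop code and not the patch: it assumes the init threads live in a fixed-size 
pool that creates its threads lazily, and it uses an InheritableThreadLocal 
(CURRENT_USER, a hypothetical stand-in) in place of the security context a new 
thread picks up from the thread that creates it. The class and method names are 
made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class InitThreadIdentityDemo {
    // Hypothetical stand-in for the identity a new thread inherits from
    // the thread that creates it.
    static final InheritableThreadLocal<String> CURRENT_USER =
        new InheritableThreadLocal<>();

    // The "job init" task: report the identity of the pool thread running it.
    static final Callable<String> WHO_AM_I = CURRENT_USER::get;

    // Submits one task per submitter, in order, waiting for each to finish,
    // and returns the identity each task observed inside its pool thread.
    static List<String> identitiesSeenBy(ExecutorService pool, String... submitters) {
        List<String> seen = new ArrayList<>();
        for (String submitter : submitters) {
            // The calling thread acts as this submitter for the submission.
            CURRENT_USER.set(submitter);
            try {
                seen.add(pool.submit(WHO_AM_I).get());
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // 4 mirrors the default number of job init threads.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // A regular user gets in first, so the first pool thread is
            // created under that user; the other three are created as mapred.
            System.out.println(identitiesSeenBy(pool, "alice", "mapred", "mapred", "mapred"));
            // All four threads now exist. Later jobs see the identity of
            // whichever existing thread happens to pick them up.
            System.out.println(identitiesSeenBy(pool, "mapred", "mapred", "mapred", "mapred"));
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch the pool creates one new thread per submission until it has 
four, so each of the first four jobs pins its submitter's identity to a thread. 
After that the identity a job sees depends only on which thread picks it up, 
which is the intermittent behavior described above. Submitting the first four 
jobs as the system user leaves every thread with the system identity.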
                
> Fix mapred.system.dir permission error with FairScheduler
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-4398
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4398
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/fair-share
>    Affects Versions: 1.0.3
>            Reporter: Luke Lu
>            Assignee: Yu Gao
>
> Incorrect job initialization logic in FairScheduler causes mysterious 
> intermittent mapred.system.dir permission errors.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
