[ https://issues.apache.org/jira/browse/MAPREDUCE-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565722#comment-13565722 ]
Karthik Kambatla commented on MAPREDUCE-4964:
---------------------------------------------

To add more context: Before this patch, we were seeing a bunch of DiskErrorExceptions on a large cluster under heavy load. The jobconf file was missing because it had been written to the wrong user's dir. After deploying this patch, we haven't seen any of those errors so far. More importantly, we haven't noticed any regressions.

> JobLocalizer#localizeJobFiles can potentially write job.xml to the wrong user's directory
> -----------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4964
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4964
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1
>    Affects Versions: 1.1.1
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>         Attachments: MR-4964.patch, MR-4964.patch
>
>
> In the following code, if jobs corresponding to different users (X and Y) are
> localized simultaneously, it is possible that jobconf can be written to the
> wrong user's directory. (X's job.xml can be written to Y's directory)
> {code}
>   public void localizeJobFiles(JobID jobid, JobConf jConf,
>       Path localJobTokenFile, TaskUmbilicalProtocol taskTracker)
>       throws IOException, InterruptedException {
>     localizeJobFiles(jobid, jConf,
>         lDirAlloc.getLocalPathForWrite(JOBCONF, ttConf), localJobTokenFile,
>         taskTracker);
>   }
> {code}
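
The issue text does not show where the cross-user interference comes from, so the following is only a toy model of the general hazard, not the Hadoop code and not the attached patch. It assumes the write path is resolved from mutable state shared between two concurrent localizations. Every name in it (SharedAllocatorSketch, SharedAllocator, configureFor, pathForWrite, pathFor) and the /local/taskTracker/ prefix are hypothetical. The interleaving is replayed by hand rather than with threads so that the outcome is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these names are Hadoop APIs. */
class SharedAllocatorSketch {

    /** Hypothetical stand-in for an allocator whose target directory is shared, mutable state. */
    static final class SharedAllocator {
        private String currentUserDir; // overwritten by whichever user configured last

        void configureFor(String user) {
            currentUserDir = "/local/taskTracker/" + user; // assumed layout, for illustration
        }

        String pathForWrite(String file) {
            return currentUserDir + "/" + file;
        }
    }

    /**
     * Replays one problematic ordering: both users configure the shared
     * allocator before either of them resolves the path for job.xml.
     */
    static Map<String, String> interleaved() {
        SharedAllocator alloc = new SharedAllocator();
        Map<String, String> resolved = new HashMap<>();
        alloc.configureFor("X");
        alloc.configureFor("Y"); // Y's localization starts before X resolves its path
        resolved.put("X", alloc.pathForWrite("job.xml")); // resolves under Y's directory
        resolved.put("Y", alloc.pathForWrite("job.xml"));
        return resolved;
    }

    /** One possible remedy in this toy model: derive the path from the user explicitly. */
    static String pathFor(String user, String file) {
        return "/local/taskTracker/" + user + "/" + file;
    }

    public static void main(String[] args) {
        System.out.println("shared state:  X -> " + interleaved().get("X"));
        System.out.println("explicit user: X -> " + pathFor("X", "job.xml"));
    }
}
```

In this model, X's job.xml resolves under Y's directory whenever Y reconfigures the shared state between X's configure and resolve steps. Passing the user into the path computation removes the dependency on call ordering. Whether the actual patch takes that approach is not stated in this message.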