[ https://issues.apache.org/jira/browse/MAPREDUCE-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Roman Shaposhnik updated MAPREDUCE-4820:
----------------------------------------
         Priority: Blocker  (was: Critical)
    Fix Version/s: 2.0.3-alpha

> MRApps distributed-cache duplicate checks are incorrect
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-4820
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4820
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.0.2-alpha
>            Reporter: Alejandro Abdelnur
>            Priority: Blocker
>             Fix For: 2.0.3-alpha
>
>
> This seems to be a combination of issues that are being exposed in
> 2.0.2-alpha by MAPREDUCE-4549.
> MAPREDUCE-4549 introduces a check to ensure there are no duplicate JARs in
> the distributed-cache (using the JAR name as identity).
> In Hadoop 2 (unlike Hadoop 1), all JARs in the distributed-cache are
> symlinked into the current directory of the task.
> MRApps, when setting up the DistributedCache
> (MRApps#setupDistributedCache->parseDistributedCacheArtifacts), assumes that
> the local resources (this includes files in CURRENT_DIR/,
> CURRENT_DIR/classes/ and CURRENT_DIR/lib/) are already part of the
> distributed-cache.
> For systems like Oozie, which use a launcher job to submit the real job,
> this poses a problem because MRApps is run from the launcher job to submit
> the real job. The configuration of the real job has the correct
> distributed-cache entries (no duplicates), but because the current dir has
> the same files, the submission fails.
> It seems that MRApps should not be checking distributed-cache entries for
> duplicates against JARs in CURRENT_DIR/ or CURRENT_DIR/lib/. The duplicate
> check should be done among distributed-cache entries only.
> It seems YARNRunner is symlinking all files in the distributed-cache into
> the current directory. In Hadoop 1 this was done only for files added to the
> distributed-cache using a fragment (i.e. "#FOO") to trigger symlink creation.
> Marking as a blocker because without a fix for this, Oozie cannot submit
> jobs to Hadoop 2. I've debugged Oozie in a live cluster being used by BigTop
> (thanks Roman) to test their release work, and I've verified that Oozie 3.3
> does not create duplicate entries in the distributed-cache.
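
For illustration only, the behaviour the report asks for (a duplicate check done among distributed-cache entries only, using the JAR name as identity) could be sketched as below. This is not the MRApps code and not the eventual fix: the class CacheDupCheck and the methods linkName and findDuplicates are hypothetical names, and the rule that a "#FOO" fragment overrides the file name as the link name is an assumption drawn from the fragment behaviour described above. The point of the sketch is that only the configured cache URIs are compared; files already present in CURRENT_DIR/ or CURRENT_DIR/lib/ are never consulted.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop.
class CacheDupCheck {

    // Name an entry would be linked under: the URI fragment if one is
    // given (assumption), otherwise the last segment of the path.
    static String linkName(URI uri) {
        String fragment = uri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = uri.getPath();
        if (path == null) {
            return uri.toString();
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Returns the link names that occur more than once among the given
    // distributed-cache entries. Local files are deliberately ignored.
    static List<String> findDuplicates(List<URI> cacheEntries) {
        Map<String, URI> seen = new HashMap<>();
        List<String> duplicates = new ArrayList<>();
        for (URI entry : cacheEntries) {
            String name = linkName(entry);
            URI previous = seen.putIfAbsent(name, entry);
            if (previous != null && !duplicates.contains(name)) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<URI> entries = Arrays.asList(
                URI.create("hdfs://nn/app/lib/a.jar"),
                URI.create("hdfs://nn/app/lib/b.jar"),
                URI.create("hdfs://nn/other/a.jar"));
        System.out.println(findDuplicates(entries)); // prints [a.jar]
    }
}
```

With a check shaped like this, a launcher job whose working directory already contains a.jar would not cause a submission failure, because only the entries in the real job's configuration are compared with each other.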