Alejandro Abdelnur created MAPREDUCE-4820:
---------------------------------------------
Summary: MRApps distributed-cache duplicate checks are incorrect
Key: MAPREDUCE-4820
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4820
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am
Affects Versions: 2.0.2-alpha
Reporter: Alejandro Abdelnur
Priority: Blocker
Fix For: 2.0.3-alpha
This seems to be a combination of issues that are being exposed in 2.0.2-alpha
by MAPREDUCE-4549.
MAPREDUCE-4549 introduces a check to ensure there are no duplicate JARs in
the distributed-cache (using the JAR name as identity).
In Hadoop 2 (unlike Hadoop 1), all JARs in the distributed-cache are
symlink-ed into the current directory of the task.
MRApps, when setting up the DistributedCache
(MRApps#setupDistributedCache->parseDistributedCacheArtifacts) assumes that the
local resources (this includes files in the CURRENT_DIR/, CURRENT_DIR/classes/
and files in CURRENT_DIR/lib/) are part of the distributed-cache already.
For systems like Oozie, which use a launcher job to submit the real job, this
poses a problem because MRApps runs inside the launcher job when submitting the
real job. The configuration of the real job has the correct distributed-cache
entries (no duplicates), but because the launcher's current dir contains the
same files, the submission fails.
It seems that MRApps should not be checking for dups in the distributed-cache
against JARs in the CURRENT_DIR/ or CURRENT_DIR/lib/. The dup check should be
done among distributed-cache entries only.
It also seems YARNRunner is symlink-ing all files in the distributed-cache into
the current directory. In Hadoop 1 this was done only for files added to the
distributed-cache using a fragment (i.e. "#FOO") to trigger a symlink creation.
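As a rough illustration of the proposed behaviour, here is a minimal sketch of
a dup check that looks only at the distributed-cache entries of the job being
submitted and never at the files in the current directory. This is not the
MRApps code: the class CacheDupCheck and its methods are hypothetical, and
taking the URI fragment (or, if absent, the file name) as the entry's identity
is an assumption made for the example.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, not part of Hadoop: dup check over cache entries only. */
final class CacheDupCheck {

    /**
     * Identity of a cache entry (assumption for this sketch): the URI fragment
     * if one is given (e.g. "#FOO"), otherwise the last path component.
     */
    static String linkName(URI uri) {
        String fragment = uri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = uri.getPath() == null ? "" : uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /**
     * Returns the names that occur more than once among the given cache
     * entries. Files already present in the current directory are never
     * consulted, so a launcher job's own JARs cannot cause a false positive.
     */
    static List<String> findDuplicates(List<URI> cacheEntries) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (URI entry : cacheEntries) {
            String name = linkName(entry);
            // Set.add returns false when the name was already recorded.
            if (!seen.add(name) && !duplicates.contains(name)) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }
}
```

With a check of this shape, a job whose configuration lists lib/a.jar and
lib/b.jar passes even if the submitting process has a.jar in its own
CURRENT_DIR/lib/, while two cache entries that both resolve to a.jar are
still reported.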
Marking as a blocker because without a fix for this, Oozie cannot submit jobs
to Hadoop 2. (I've debugged Oozie in a live cluster being used by BigTop, thanks
Roman, to test their release work, and I've verified that Oozie 3.3 does not
create duplicate entries in the distributed-cache.)