[
https://issues.apache.org/jira/browse/MAPREDUCE-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609277#comment-13609277
]
Robert Joseph Evans commented on MAPREDUCE-4820:
------------------------------------------------
Alejandro,
MAPREDUCE-4549 is the change that undid MAPREDUCE-4503. MAPREDUCE-4503 made
distributed-cache collisions throw an exception; MAPREDUCE-4549, which is
fixed in 2.0.4-alpha, turned that error into a warning. I suspect you just
need to move to 2.0.4-alpha and your issue is "fixed".
We still need to decide if we want to put the warning into trunk as well, or
if there is a viable long-term solution where Oozie and others can be sure not
to have duplicate entries.
> MRApps distributed-cache duplicate checks are incorrect
> -------------------------------------------------------
>
> Key: MAPREDUCE-4820
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4820
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am
> Affects Versions: 2.0.2-alpha
> Reporter: Alejandro Abdelnur
> Priority: Blocker
> Fix For: 2.0.4-alpha
>
> Attachments: launcher-job.conf.xml, launcher-job.logs.txt,
> mr-job.conf.xml, mr-job.logs.txt
>
>
> This seems to be a combination of issues that are being exposed in
> 2.0.2-alpha by MAPREDUCE-4549.
> MAPREDUCE-4549 introduces a check to ensure there are no duplicate JARs
> in the distributed-cache (using the JAR name as identity).
> In Hadoop 2 (unlike Hadoop 1), all JARs in the distributed-cache are
> symlinked into the current directory of the task.
> MRApps, when setting up the DistributedCache
> (MRApps#setupDistributedCache->parseDistributedCacheArtifacts) assumes that
> the local resources (this includes files in the CURRENT_DIR/,
> CURRENT_DIR/classes/ and files in CURRENT_DIR/lib/) are part of the
> distributed-cache already.
> For systems like Oozie, which use a launcher job to submit the real job, this
> poses a problem because MRApps is run from the launcher job to submit the
> real job. The configuration of the real job has the correct distributed-cache
> entries (no duplicates), but because the current dir contains the same files,
> the submission fails.
> It seems that MRApps should not be checking for duplicates in the
> distributed-cache against JARs in CURRENT_DIR/ or CURRENT_DIR/lib/. The
> duplicate check should be done among distributed-cache entries only.
> It seems YARNRunner is symlinking all files in the distributed-cache into the
> current directory. In Hadoop 1 this was done only for files added to the
> distributed-cache using a fragment (i.e. "#FOO") to trigger symlink creation.
> Marking as a blocker because without a fix for this, Oozie cannot submit jobs
> to Hadoop 2. (I've debugged Oozie in a live cluster being used by BigTop
> -thanks Roman- to test their release work, and I've verified that Oozie 3.3
> does not create duplicate entries in the distributed-cache.)
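To make the proposed fix concrete, here is a minimal sketch of a duplicate check that compares distributed-cache entries only against each other, using the file name as identity and deliberately ignoring whatever already sits in CURRENT_DIR/ or CURRENT_DIR/lib/. The class and method names are illustrative, not the actual MRApps code:

```java
import java.net.URI;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CacheDupCheck {
    // Returns the file names that occur more than once among the
    // distributed-cache URIs themselves. Files already present in the
    // task's current directory are intentionally not consulted, so a
    // launcher job whose own classpath mirrors the cache does not
    // trigger false positives.
    static Set<String> duplicateNames(List<URI> cacheUris) {
        Set<String> seen = new HashSet<>();
        Set<String> dups = new HashSet<>();
        for (URI uri : cacheUris) {
            String path = uri.getPath();
            String name = path.substring(path.lastIndexOf('/') + 1);
            if (!seen.add(name)) {
                dups.add(name);
            }
        }
        return dups;
    }
}
```

Under this scheme, two cache entries named foo.jar from different directories would still be flagged, but a cache entry matching a JAR in the launcher's current dir would not.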
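The Hadoop 1 vs Hadoop 2 symlink semantics described above can be sketched as follows; these are illustrative helper methods, not the real DistributedCache or YARNRunner code:

```java
import java.net.URI;

class SymlinkName {
    // Hadoop 1 semantics as described above: a symlink is created only
    // when the cache URI carries a fragment ("#FOO"), and the fragment
    // names the link. A null return means no symlink is requested.
    static String hadoop1LinkName(URI uri) {
        return uri.getFragment();
    }

    // Hadoop 2 semantics as described above: every cache file is linked
    // into the task's current directory; the fragment, if present,
    // overrides the plain file name.
    static String hadoop2LinkName(URI uri) {
        String frag = uri.getFragment();
        if (frag != null) {
            return frag;
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }
}
```

So in Hadoop 2, hdfs://nn/lib/foo.jar lands in the current dir as foo.jar even without a fragment, which is exactly why it can collide with the launcher job's own copy.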