[
https://issues.apache.org/jira/browse/MAPREDUCE-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614482#comment-13614482
]
Roman Shaposhnik commented on MAPREDUCE-4820:
---------------------------------------------
[~revans2] I think you may be absolutely right. This could be a residual effect
of enabling OOZIE-1089 (which we probably shouldn't do now that the actual
issue seems to be fixed). Let me poke on the Oozie side and report back. Thanks
for your help so far!
> MRApps distributed-cache duplicate checks are incorrect
> -------------------------------------------------------
>
> Key: MAPREDUCE-4820
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4820
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am
> Affects Versions: 2.0.2-alpha
> Reporter: Alejandro Abdelnur
> Priority: Blocker
> Fix For: 2.0.4-alpha
>
> Attachments: launcher-job.conf.xml, launcher-job.logs.txt,
> mr-job.conf.xml, mr-job.logs.txt
>
>
> This seems to be a combination of issues that are being exposed in
> 2.0.2-alpha by MAPREDUCE-4549.
> MAPREDUCE-4549 introduces a check to ensure there are no duplicate JARs
> in the distributed-cache (using the JAR name as identity).
> In Hadoop 2 (unlike Hadoop 1), all JARs in the distributed-cache are
> symlinked to the current directory of the task.
> MRApps, when setting up the DistributedCache
> (MRApps#setupDistributedCache->parseDistributedCacheArtifacts), assumes that
> the local resources (this includes files in CURRENT_DIR/,
> CURRENT_DIR/classes/ and CURRENT_DIR/lib/) are already part of the
> distributed-cache.
> For systems like Oozie, which use a launcher job to submit the real job, this
> poses a problem because MRApps is run from the launcher job to submit the
> real job. The configuration of the real job has the correct distributed-cache
> entries (no duplicates), but because the current dir has the same files, the
> submission fails.
> It seems that MRApps should not be checking dups in the distributed-cache
> against JARs in the CURRENT_DIR/ or CURRENT_DIR/lib/. The dup check should be
> done among distributed-cache entries only.
> It seems YARNRunner is symlinking all files in the distributed-cache into the
> current directory. In Hadoop 1 this was done only for files added to the
> distributed-cache using a fragment (i.e. "#FOO") to trigger symlink creation.
> Marking as a blocker because without a fix for this, Oozie cannot submit jobs
> to Hadoop 2. (I've debugged Oozie in a live cluster being used by BigTop to
> test their release work, thanks Roman, and I've verified that Oozie 3.3 does
> not create duplicated entries in the distributed-cache.)
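
For illustration only, the behaviour the description asks for (a duplicate check
done among distributed-cache entries only, with the JAR name as identity) could
look roughly like the sketch below. This is not the MRApps code and not the
committed fix. The class DistCacheDupCheckSketch and the methods linkName and
findDuplicateNames are hypothetical names, and the sketch assumes that a
"#FOO" fragment, when present, overrides the file name as the link name. Files
already sitting in CURRENT_DIR/ or CURRENT_DIR/lib/ are deliberately never
consulted.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not part of Hadoop.
public class DistCacheDupCheckSketch {

    // Name a cache entry would be linked under: the URI fragment if one is
    // given (assumption), otherwise the last segment of the path.
    static String linkName(URI uri) {
        String fragment = uri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = uri.getPath() == null ? "" : uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Returns the link names that occur more than once among the given
    // distributed-cache entries. Only the entries themselves are compared;
    // the contents of the task's working directory play no role.
    static List<String> findDuplicateNames(List<URI> cacheEntries) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (URI entry : cacheEntries) {
            String name = linkName(entry);
            if (!seen.add(name) && !duplicates.contains(name)) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // Made-up example URIs.
        List<URI> entries = Arrays.asList(
            URI.create("hdfs://nn/apps/lib/a.jar"),
            URI.create("hdfs://nn/apps/lib/b.jar"),
            URI.create("hdfs://nn/other/a.jar"));
        System.out.println(findDuplicateNames(entries));
    }
}
```

With the made-up entries in main, the two URIs ending in a.jar collide on the
name a.jar. Giving one of them a distinct fragment (for example "#a2.jar")
would remove the collision under the assumption stated above.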