[
https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated MAPREDUCE-4421:
----------------------------------
Attachment: MAPREDUCE-4421-3.patch
Thanks for taking another look, Hitesh.
bq. Regarding addMRFrameworkToDistributedCache() - one minor question: the code
allows for a non-qualified URI. Should we enforce provision of a
fully-qualified path always?
I thought it would be easier to let the cluster's configured defaults qualify
the URI when it is not already fully qualified. Otherwise users/admins would
have to specify not just "hdfs:/path/to/archive" but
"hdfs://namenode:port/path/to/archive", and that breaks if/when the name or
port of the filesystem changes. If we let it be qualified by cluster defaults,
then admins can update the default filesystem in core-site.xml and the simpler
forms continue to work unmodified.
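To illustrate the qualification behavior I have in mind, here is a rough sketch using only java.net.URI. It is not the code from the patch: the class name QualifySketch, the helper qualify(), and the namenode:8020 authority are all made up for the example, and the real implementation would go through Hadoop's filesystem APIs rather than hand-rolled URI handling.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch, not the patch's code: fills in a missing scheme and/or
// authority from the cluster's default filesystem URI.
class QualifySketch {

    static URI qualify(URI uri, URI defaultFs) {
        String scheme = uri.getScheme();
        String authority = uri.getAuthority();

        if (scheme == null) {
            // No scheme at all: take the scheme from the default filesystem,
            // and the authority too if the URI did not supply one.
            scheme = defaultFs.getScheme();
            if (authority == null) {
                authority = defaultFs.getAuthority();
            }
        } else if (authority == null && scheme.equals(defaultFs.getScheme())) {
            // Scheme matches the default filesystem but no host/port was
            // given, so borrow the default's authority.
            authority = defaultFs.getAuthority();
        }

        try {
            // Keep the fragment untouched, drop nothing else.
            return new URI(scheme, authority, uri.getPath(), null, uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot qualify " + uri, e);
        }
    }

    public static void main(String[] args) {
        // Assumed default filesystem for the example only.
        URI defaultFs = URI.create("hdfs://namenode:8020");

        System.out.println(qualify(URI.create("/path/to/archive"), defaultFs));
        System.out.println(qualify(URI.create("hdfs:/path/to/archive"), defaultFs));
        System.out.println(qualify(URI.create("hdfs://other:9000/path/to/archive"), defaultFs));
    }
}
```

With this approach the first two forms both resolve against whatever the default filesystem currently is, while a URI that already names a host and port is left alone.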
bq. Minor nit: I believe there should be nothing in the implementation that
requires HDFS as the storage for the MR tarball?
Good point. I updated the documentation to refer to a distributed cache deploy
rather than an HDFS deploy. However, I did call out in the docs the performance
ramifications of not using the cluster's default filesystem and a
publicly readable path for the archive. Without both, the job submitter could
end up re-uploading the framework, and the nodes re-localizing it, for each job
or each user. It will work, but it will be slower than necessary.
> Remove dependency on deployed MR jars
> -------------------------------------
>
> Key: MAPREDUCE-4421
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 2.0.0-alpha
> Reporter: Arun C Murthy
> Assignee: Jason Lowe
> Attachments: MAPREDUCE-4421-2.patch, MAPREDUCE-4421-3.patch,
> MAPREDUCE-4421.patch, MAPREDUCE-4421.patch
>
>
> Currently MR AM depends on MR jars being deployed on all nodes via implicit
> dependency on YARN_APPLICATION_CLASSPATH.
> We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and,
> probably, just rely on adding a shaded MR jar along with job.jar to the
> dist-cache.