[jira] [Commented] (MAPREDUCE-4421) Remove dependency on deployed MR jars

Vinod Kumar Vavilapalli (JIRA) Thu, 13 Sep 2012 14:11:11 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455292#comment-13455292
 ]


Vinod Kumar Vavilapalli commented on MAPREDUCE-4421:
----------------------------------------------------

bq. This means the cluster setup should not have AMs site.xml files deployed in 
it.
bq. Then the 'yarn' client script should support a '-ampath=' (HDFS path to AM 
resources) option and/or a '-amname=' (logical name of the AM resources, a new 
config file am-site.xml in the cluster would have this mapping for blessed AMs, 
as suggested in my prev comment). 
That is already possible today with a separate YARN_CONF_DIR. I haven't tested 
with separate conf-dirs but can check right away. Essentially, MR has its own 
conf file (mapred-default.xml), and dist-shell could have its own. We can argue 
either way about creating a new config file am-site.xml for 'blessed' AMs. 

bq. This means that the cluster setup should not have AMs JARs deployed in it.
This is already the case. I have a test cluster with HADOOP_YARN_HOME and 
HADOOP_MAPRED_HOME separate. So yeah, YARN doesn't have any mapred jars (except 
the shuffle-related ones, which are not for the AMs).

bq. For one off custom AMs, JARs and config would be all provided by the client 
on submission.
bq. For AMs like MapReduce, DistributedShell and widely used AMs in a given 
cluster, their JARs and config site.xml files would be in HDFS.
YARN doesn't care how the AM-related jars are managed. All it needs to know at 
the end of the day is an FS location for all the jars needed by the app. So the 
jars can be managed in three ways. The framework-specific clients can pick up 
the AM jars
 - from a public location on DFS and populate dist-cache
 - or a local installation on the client and upload it to a private location on 
DFS and populate the dist-cache
 - or if the AM jars happen to be installed on every node, construct the 
classpath referring to those jars.

Today the MR AM implements the third option above; this JIRA is to enable the 
first two options.
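
The three options can be sketched as a small client-side planning function. 
This is only an illustration in plain Python and does not call any Hadoop or 
YARN API: the names JarSource, LaunchPlan and plan_am_launch, and all the paths 
in the usage note, are hypothetical. It shows what each option contributes to 
the dist-cache, to client uploads, and to the container classpath.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class JarSource(Enum):
    PUBLIC_DFS = "public_dfs"        # option 1: jars already at a public DFS location
    CLIENT_UPLOAD = "client_upload"  # option 2: client uploads a local install to private DFS
    NODE_LOCAL = "node_local"        # option 3: jars installed on every node


@dataclass
class LaunchPlan:
    dist_cache: List[str] = field(default_factory=list)           # DFS paths to localize
    uploads: List[Tuple[str, str]] = field(default_factory=list)  # (local path, DFS destination)
    classpath: List[str] = field(default_factory=list)            # entries for the AM container


def plan_am_launch(source: JarSource, jar_names: List[str], jar_dir: str,
                   staging_dir: Optional[str] = None) -> LaunchPlan:
    """Describe how a framework client would make the AM jars available.

    jar_dir is the directory holding the jars: a DFS directory for
    PUBLIC_DFS, a client-local directory for CLIENT_UPLOAD, and a path
    that exists on every node for NODE_LOCAL.
    """
    if source is JarSource.CLIENT_UPLOAD and staging_dir is None:
        raise ValueError("CLIENT_UPLOAD needs a private DFS staging_dir")

    plan = LaunchPlan()
    for name in jar_names:
        if source is JarSource.PUBLIC_DFS:
            # Jars are already on DFS; only register them in the dist-cache.
            plan.dist_cache.append(f"{jar_dir}/{name}")
            plan.classpath.append(f"./{name}")
        elif source is JarSource.CLIENT_UPLOAD:
            # Copy from the client to a private DFS location, then register.
            dest = f"{staging_dir}/{name}"
            plan.uploads.append((f"{jar_dir}/{name}", dest))
            plan.dist_cache.append(dest)
            plan.classpath.append(f"./{name}")
        else:
            # Nothing to localize; refer to the jars installed on each node.
            plan.classpath.append(f"{jar_dir}/{name}")
    return plan
```

For example, plan_am_launch(JarSource.NODE_LOCAL, ["am.jar"], "/opt/am/lib") 
yields an empty dist-cache and a classpath pointing at the node-local jar, 
which corresponds to the third option; the other two options populate the 
dist-cache instead and need nothing installed on the nodes.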

                
> Remove dependency on deployed MR jars
> -------------------------------------
>
>                 Key: MAPREDUCE-4421
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4421
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 2.0.0-alpha
>            Reporter: Arun C Murthy
>            Assignee: Vinod Kumar Vavilapalli
>
> Currently MR AM depends on MR jars being deployed on all nodes via implicit 
> dependency on YARN_APPLICATION_CLASSPATH. 
> We should stop adding mapreduce jars to YARN_APPLICATION_CLASSPATH and, 
> probably, just rely on adding a shaded MR jar along with job.jar to the 
> dist-cache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
