[jira] [Commented] (YARN-2625) Problems with CLASSPATH in Job Submission REST API

2015-01-05 Thread Doug Haigh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264577#comment-14264577
 ] 

Doug Haigh commented on YARN-2625:
--

I am not using MR. I am writing my own AppMaster which requires knowing the 
path to the Hadoop ecosystem's CLASSPATH. Unless you expect Hadoop to be 
rewritten in something other than Java, most AppMaster jars written in Java 
will be required to know the CLASSPATH of the Hadoop ecosystem they are 
expected to run under. If you want to add a REST API to specifically get the 
CLASSPATH of the Hadoop ecosystem a Java AppMaster will run under, that is fine 
with me.

> Problems with CLASSPATH in Job Submission REST API
> --
>
> Key: YARN-2625
> URL: https://issues.apache.org/jira/browse/YARN-2625
> Project: Hadoop YARN
> Issue Type: Bug
> Components: api
> Affects Versions: 2.5.1
> Reporter: Doug Haigh
>
> There are a couple of issues I have found specifying the CLASSPATH 
> environment variable using the REST API.
> 1) In the Java client, the CLASSPATH environment is usually made up of either 
> the value of yarn.application.classpath in yarn-site.xml or the 
> default YARN classpath value as defined by 
> YarnConfiguration.DEFAULT_YARN_CROSS_PLATFORM_APPLICATION_CLASSPATH. REST API 
> consumers have no method of telling the resource manager to use the default 
> unless they hardcode the default value themselves. If the default ever 
> changes, the code would need to change. 
> 2) If any environment variables are used in the CLASSPATH environment 'value' 
> field, they are evaluated when the values are NULL resulting in bad values in 
> the CLASSPATH. For example, if I had hardcoded the CLASSPATH value to the 
> default of "$HADOOP_CONF_DIR, $HADOOP_COMMON_HOME/share/hadoop/common/*, 
> $HADOOP_COMMON_HOME/share/hadoop/common/lib/*, 
> $HADOOP_HDFS_HOME/share/hadoop/hdfs/*, 
> $HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*, 
> $HADOOP_YARN_HOME/share/hadoop/yarn/*, 
> $HADOOP_YARN_HOME/share/hadoop/yarn/lib/*" the classpath passed to the 
> application master is 
> ":/share/hadoop/common/*:/share/hadoop/common/lib/*:/share/hadoop/hdfs/*:/share/hadoop/hdfs/lib/*:/share/hadoop/yarn/*:/share/hadoop/yarn/lib/*"
> These two problems require REST API consumers to always have the fully 
> resolved path defined in the yarn.application.classpath value. If the 
> property is missing or contains environment variables, the application 
> created by the REST API will fail due to the CLASSPATH being incorrect.
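
Problem 2 in the description above can be reproduced with a minimal Python sketch. It assumes a shell-style expansion in which an unset variable becomes an empty string; that is a guess at the mechanism, not YARN code. The function name `expand` and the install paths in `resolved_env` are hypothetical and chosen only for illustration.

```python
import re

# Default classpath entries as quoted in the issue description.
DEFAULT_ENTRIES = [
    "$HADOOP_CONF_DIR",
    "$HADOOP_COMMON_HOME/share/hadoop/common/*",
    "$HADOOP_COMMON_HOME/share/hadoop/common/lib/*",
    "$HADOOP_HDFS_HOME/share/hadoop/hdfs/*",
    "$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*",
    "$HADOOP_YARN_HOME/share/hadoop/yarn/*",
    "$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*",
]

VAR_PATTERN = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def expand(entries, env):
    """Join entries with ':' and substitute $VAR from env.

    Unset variables expand to the empty string, as a shell would do.
    """
    def lookup(match):
        return env.get(match.group(1), "")

    return ":".join(VAR_PATTERN.sub(lookup, entry) for entry in entries)


# No Hadoop variables defined: every $VAR collapses to "".
broken = expand(DEFAULT_ENTRIES, {})

# Hypothetical install locations, for illustration only.
resolved_env = {
    "HADOOP_CONF_DIR": "/etc/hadoop/conf",
    "HADOOP_COMMON_HOME": "/opt/hadoop",
    "HADOOP_HDFS_HOME": "/opt/hadoop",
    "HADOOP_YARN_HOME": "/opt/hadoop",
}
resolved = expand(DEFAULT_ENTRIES, resolved_env)

print(broken)
print(resolved)
```

With an empty environment the result starts with a bare ":" followed by paths rooted at "/share/hadoop/...", which is the same shape as the classpath reported in the description.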



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2625) Problems with CLASSPATH in Job Submission REST API

2014-12-23 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257618#comment-14257618
 ] 

Vinod Kumar Vavilapalli commented on YARN-2625:
---

bq.  If the default ever changes, the code would need to change.
The way the layering works, YARN doesn't know whether the app is JVM based or 
not (though the predominant apps that exist today are all JVM based). Given 
this, we chose to introduce yarn.application.classpath as a _simplifying_ 
configuration property that apps can use. So far, it hasn't had any semantic 
meaning to the platform, and it is important to keep it so. I think the right 
solution is for us to introduce some sort of Java universe as an API; we can 
auto-expand it the way you are asking only if the app specifically says it is 
a Java universe.

bq. For example, if I had hardcoded the CLASSPATH value to the default of 
"$HADOOP_CONF_DIR, $HADOOP_COMMON_HOME/share/hadoop/common/*, 
$HADOOP_COMMON_HOME/share/hadoop/common/lib/*, 
$HADOOP_HDFS_HOME/share/hadoop/hdfs/*, $HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*, 
$HADOOP_YARN_HOME/share/hadoop/yarn/*, $HADOOP_YARN_HOME/share/hadoop/yarn/lib/*" 
the classpath passed to the application master is 
":/share/hadoop/common/*:/share/hadoop/common/lib/*:/share/hadoop/hdfs/*:/share/hadoop/hdfs/lib/*:/share/hadoop/yarn/*:/share/hadoop/yarn/lib/*"
This is a bug. NodeManager by default exports these environment variables and 
so the expansion should happen automatically.

Are you using MR jobs? If so, once your clusters move to using MR tarballs 
through the distributed-cache scheme (MAPREDUCE-4421), the classpath dictated 
by mapreduce.application.classpath will simply reflect the directory hierarchy 
in the tarball and be completely decoupled from the cluster install and the 
environment variables.



[jira] [Commented] (YARN-2625) Problems with CLASSPATH in Job Submission REST API

2014-10-06 Thread Doug Haigh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160455#comment-14160455
 ] 

Doug Haigh commented on YARN-2625:
--

To be honest, I never knew about that option - but that does not get me 
anything more than what I can read from the yarn-site.xml file (although I like 
not having to have those files around). 

It still has the two problems described above because

1) If the {{yarn.application.classpath}} value is not specified, I still have 
no way to know the default classpath
2) If the {{yarn.application.classpath}} value has environment variables in it, 
they still need to be resolved somehow.

If the value returned by that URL was the *resolved* classpath, that would 
work.



[jira] [Commented] (YARN-2625) Problems with CLASSPATH in Job Submission REST API

2014-10-06 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160442#comment-14160442
 ] 

Varun Vasudev commented on YARN-2625:
-

[~cdhaigh], you can fetch the config via http://<RM host>:<port>/conf. 
That gives you the XML configuration for the RM. Can you extract 
yarn.application.classpath from there and use it when submitting your app?
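
A minimal Python sketch of that extraction step, assuming the /conf response uses the usual Hadoop `<configuration>/<property>/<name>/<value>` layout. The sample XML, the host name in it, and the helper name `get_property` are illustrative assumptions, and the document is parsed from a string so the sketch runs without a cluster.

```python
import xml.etree.ElementTree as ET

# Stand-in for the body returned by the RM /conf endpoint (hypothetical values).
SAMPLE_CONF_XML = """
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>rm.example.com</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*</value>
  </property>
</configuration>
"""


def get_property(conf_xml, name):
    """Return the value of the named property, or None if it is absent."""
    root = ET.fromstring(conf_xml.strip())
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


raw = get_property(SAMPLE_CONF_XML, "yarn.application.classpath")

# The property is a comma-separated list; split it into individual entries.
entries = [e.strip() for e in raw.split(",") if e.strip()] if raw else []

print(entries)
```

Note that the entries come back exactly as configured: any `$VAR` references are still unexpanded, and a property that is not set yields `None`.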



[jira] [Commented] (YARN-2625) Problems with CLASSPATH in Job Submission REST API

2014-10-06 Thread Doug Haigh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160255#comment-14160255
 ] 

Doug Haigh commented on YARN-2625:
--

Yes, I agree the classpath should be set by the client, but the REST client 
should not have to know the default classpath just as the Java client does not 
need to know it. Just as the REST API resolves {{<CPS>}} to either {{:}} or 
{{;}} based on the underlying operating system, the REST API could look for 
{{}} and resolve it to 
{{YarnConfiguration.DEFAULT_YARN_CROSS_PLATFORM_APPLICATION_CLASSPATH}}.


As for environment variables being resolved, when running a Java client against 
a CDH 5.0.0 cluster, I am able to set the environment to 

{{./*:$HADOOP_CLIENT_CONF_DIR:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_HDFS_HOME/*:$HADOOP_HDFS_HOME/lib/*:$HADOOP_YARN_HOME/*:$HADOOP_YARN_HOME/lib/*}}

and it works - the environment variables are resolved. Maybe it is the way CDH 
is setting things up, but the path is not fully resolved when the client 
specifies it.



[jira] [Commented] (YARN-2625) Problems with CLASSPATH in Job Submission REST API

2014-10-06 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160170#comment-14160170
 ] 

Rohith commented on YARN-2625:
--

While submitting an application from the REST API, the CLASSPATH is expected 
to be added by the application client. To support this, 
ContainerLaunchContextInfo#setEnvironment() is provided for adding classpath 
variables. I think adding default values on the server side may not be feasible.

bq. 2) If any environment variables are used in the CLASSPATH environment 
'value' field, they are evaluated when the values are NULL resulting in bad 
values in the CLASSPATH
This is because the client JVM does not resolve the $HADOOP_COMMON_HOME 
environment variable. The client is expected to provide the full path when 
submitting.
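
As a rough illustration of what "provide the full path" means for a REST client, here is a Python sketch that builds a submission body with an already-resolved CLASSPATH. The field names (`am-container-spec`, `environment`, `entry`) follow the submit-application example in the ResourceManager REST documentation as best I recall it, so verify them against your Hadoop version. The application id, command, and paths are hypothetical, and `build_submission` is a made-up helper name.

```python
import json


def build_submission(app_id, app_name, command, classpath_entries):
    """Build a submit-application body with a fully resolved CLASSPATH.

    classpath_entries must already be absolute paths; nothing is expanded here.
    """
    classpath = ":".join(classpath_entries)
    return {
        "application-id": app_id,
        "application-name": app_name,
        "am-container-spec": {
            "commands": {"command": command},
            "environment": {
                "entry": [{"key": "CLASSPATH", "value": classpath}],
            },
        },
    }


# Hypothetical values, for illustration only.
body = build_submission(
    app_id="application_1410870995658_0001",
    app_name="my-app-master",
    command="java -Xmx256m com.example.MyAppMaster",
    classpath_entries=[
        "./*",
        "/etc/hadoop/conf",
        "/opt/hadoop/share/hadoop/common/*",
        "/opt/hadoop/share/hadoop/yarn/*",
    ],
)

payload = json.dumps(body, indent=2)
print(payload)
```

The sketch only builds the JSON; posting it to the RM and obtaining a real application id are left out on purpose.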
