[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838908#comment-13838908
 ] 

Attila Pados commented on MAPREDUCE-5655:
-----------------------------------------

I checked the patches for MAPREDUCE-4052. The patch for ContainerLaunch.java 
expects the code to modify at line 136, but in the 2.2.0 version the same code 
begins at line 173. So I think the patch for 0.23.x is not compatible with 
2.2.0, and vice versa. The two bugs originate from the same problem, but they 
are not identical: the first problem I faced was that launch_container.sh 
tried to start java with %JAVA_HOME%, and I got a "task returned with -1" or 
similar error because the shell script could not be executed. 

My patch fixes this issue too.
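
To illustrate the difference (this is only a sketch, not the attached patch; 
the class, enum and method names are hypothetical and not part of Hadoop), the 
variable syntax and the classpath delimiter could be chosen from the OS of the 
target cluster instead of the OS of the submitting client:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: picks shell syntax by target OS.
final class TargetOsSyntax {

    // Assumption for this sketch: the submitter knows the OS of the cluster.
    enum TargetOs { LINUX, WINDOWS }

    // Returns a shell variable reference such as $JAVA_HOME or %JAVA_HOME%.
    static String envRef(String name, TargetOs os) {
        return os == TargetOs.WINDOWS ? "%" + name + "%" : "$" + name;
    }

    // Returns the CLASSPATH delimiter used by the target OS.
    static String classpathSeparator(TargetOs os) {
        return os == TargetOs.WINDOWS ? ";" : ":";
    }

    // Joins classpath entries with the delimiter of the target OS.
    static String joinClasspath(List<String> entries, TargetOs os) {
        return String.join(classpathSeparator(os), entries);
    }

    public static void main(String[] args) {
        for (TargetOs os : TargetOs.values()) {
            String home = envRef("HADOOP_MAPRED_HOME", os);
            List<String> entries = Arrays.asList(
                    home + "/share/hadoop/mapreduce/*",
                    home + "/share/hadoop/mapreduce/lib/*");
            System.out.println(os + ": " + envRef("JAVA_HOME", os)
                    + "/bin/java -cp " + joinClasspath(entries, os));
        }
    }
}
```

With LINUX as the target, the generated line uses $JAVA_HOME and ":" even when 
the code itself runs on a Windows client.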

There is a config entry in mapred-default.xml:

<property>
  <description>...</description>
  <name>mapreduce.application.classpath</name>
  <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,
  $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>

This probably needs to be altered as well, because of the $VAR versus %VAR% 
difference between Linux and Windows.
If you set up your environment to run the job in local mode on Windows, you 
need to set it to the % form, but when the job runs on a Linux cluster, it 
must be set back to the $ form, which I think is the default.
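
As a rough sketch of that switch (the class and method names are hypothetical, 
and the regular expressions assume that variable names consist only of 
letters, digits and underscores), the value could be converted between the 
two styles like this:

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: rewrites variable references
// in a classpath value between $VAR and %VAR% style.
final class ClasspathVarStyle {

    // Matches $NAME (Unix style); group 1 is the variable name.
    static final Pattern UNIX_VAR =
            Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

    // Matches %NAME% (Windows style); group 1 is the variable name.
    static final Pattern WINDOWS_VAR =
            Pattern.compile("%([A-Za-z_][A-Za-z0-9_]*)%");

    // $HADOOP_MAPRED_HOME/... becomes %HADOOP_MAPRED_HOME%/...
    static String toWindows(String value) {
        return UNIX_VAR.matcher(value).replaceAll("%$1%");
    }

    // %HADOOP_MAPRED_HOME%/... becomes $HADOOP_MAPRED_HOME/...
    static String toUnix(String value) {
        // "\\$" is a literal dollar sign in the replacement, "$1" is group 1.
        return WINDOWS_VAR.matcher(value).replaceAll("\\$$1");
    }

    public static void main(String[] args) {
        String unix = "$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,"
                + "$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*";
        String windows = toWindows(unix);
        System.out.println(windows);
        System.out.println(toUnix(windows));
    }
}
```

This only rewrites the variable markers. The comma between the entries of the 
property value is left untouched.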

The patch may cause a unit test to fail if mapred.remote.os is set and the 
test runs a job in local mode. I have not tested this, because I had several 
other issues with running the unit tests, so I skipped it.

I have to repeat that this is merely a workaround. A deeper change is probably 
needed, so that launch_container.sh is created by a Java component running on 
the cluster side instead of the client side, but I don't feel capable of doing 
that. 
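
One possible shape of such a deeper change, purely as a sketch: the client 
writes OS-neutral placeholders and the cluster side expands them when it 
creates the script. The {{VAR}} and <CPS> markers, the class name and the 
method are illustrative choices for this example, not taken from the attached 
patches.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: the client emits neutral placeholders, and the
// side that launches the container expands them for its own OS.
final class DeferredExpansion {

    // Illustrative marker for the classpath separator.
    static final String CPS = "<CPS>";

    // Illustrative marker for a variable reference: {{NAME}}.
    static final Pattern VAR =
            Pattern.compile("\\{\\{([A-Za-z_][A-Za-z0-9_]*)\\}\\}");

    // Called on the cluster side, where the real OS is known.
    static String expand(String template, boolean windowsTarget) {
        // "%$1%" gives %NAME%; "\\$$1" gives a literal $ followed by NAME.
        String replacement = windowsTarget ? "%$1%" : "\\$$1";
        String withVars = VAR.matcher(template).replaceAll(replacement);
        return withVars.replace(CPS, windowsTarget ? ";" : ":");
    }

    public static void main(String[] args) {
        // What the client would send, independent of its own OS.
        String template = "{{JAVA_HOME}}/bin/java -cp "
                + "{{HADOOP_MAPRED_HOME}}/share/hadoop/mapreduce/*"
                + CPS + "job.jar";
        System.out.println(expand(template, false));
        System.out.println(expand(template, true));
    }
}
```

The point of the design is that the client never has to know or guess the OS 
of the cluster, so a setting like mapred.remote.os would not be needed.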

> Remote job submit from windows to a linux hadoop cluster fails due to wrong 
> classpath
> -------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5655
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5655
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client, job submission
>    Affects Versions: 2.2.0
>         Environment: Client machine is a Windows 7 box, with Eclipse
> Remote: there is a multi node hadoop cluster, installed on Ubuntu boxes (any 
> linux)
>            Reporter: Attila Pados
>         Attachments: MRApps.patch, YARNRunner.patch
>
>
> I was trying to run a Java class on my client, a Windows 7 developer 
> environment, which submits a job to the remote Hadoop cluster, initiates a 
> MapReduce job there, and then downloads the results back to the local 
> machine.
> The general use case is to use Hadoop services from a web application 
> installed on a non-cluster computer, or as part of a developer environment.
> The problem was that the ApplicationMaster's startup shell script 
> (launch_container.sh) was generated with a wrong CLASSPATH entry. Together 
> with the java process call at the bottom of the file, these entries were 
> generated in Windows style, using % as the shell variable marker and ; as the 
> CLASSPATH delimiter.
> I tracked down the root cause and found that the MRApps.java and 
> YARNRunner.java classes create these entries and pass them on to the 
> ApplicationMaster, assuming that the OS that runs these classes matches the 
> one running the ApplicationMaster. That is not the case: they run in two 
> different JVMs, possibly on different operating systems, and the strings are 
> generated based on the OS of the client/submitter side.
> I made some workaround changes to these two files so that I could launch my 
> job; however, there may be more problems ahead.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
