[jira] [Commented] (MAPREDUCE-5655) Remote job submit from windows to a linux hadoop cluster fails due to wrong classpath

Jian He (JIRA) Tue, 04 Mar 2014 23:33:07 -0800

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920609#comment-13920609
 ]


Jian He commented on MAPREDUCE-5655:
------------------------------------

Hi [~padisah], are you still working on this?  I'd like to take it over.
We should provide a more generic solution. Instead of forcing the client to set 
this configuration, the client can pass a platform-agnostic syntax to the NM, 
and the NM itself can replace it with the appropriate syntax for the OS it is 
running on, just as we already do for nm-log-dir.
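
A minimal sketch of that idea, not the actual patch: the client writes 
neutral tokens into the launch command and classpath, and the node expands 
them for its own OS. The token spellings ({{VAR}} and <CPS>) and the class and 
method names below are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper; names and token spellings are illustrative assumptions.
final class PlatformTokens {
    // Neutral classpath separator written by the client.
    static final String CLASSPATH_SEPARATOR = "<CPS>";
    // Neutral environment variable reference, e.g. {{JAVA_HOME}}.
    private static final Pattern VAR =
            Pattern.compile("\\{\\{([A-Za-z_][A-Za-z0-9_]*)\\}\\}");

    // Rewrites neutral tokens into the syntax of the OS the node runs on.
    static String expand(String text, boolean nodeIsWindows) {
        // "%$1%" yields %VAR%; "\\$$1" yields a literal '$' followed by VAR.
        String varReplacement = nodeIsWindows ? "%$1%" : "\\$$1";
        String separator = nodeIsWindows ? ";" : ":";
        return VAR.matcher(text)
                .replaceAll(varReplacement)
                .replace(CLASSPATH_SEPARATOR, separator);
    }

    public static void main(String[] args) {
        // The node, not the submitting client, decides which syntax applies.
        boolean nodeIsWindows = System.getProperty("os.name").startsWith("Windows");
        String neutral = "{{PWD}}<CPS>{{HADOOP_CONF_DIR}}";
        System.out.println(expand(neutral, nodeIsWindows));
    }
}
```

With this split the client never needs to know the cluster's OS, so no 
client-side property is required.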

> Remote job submit from windows to a linux hadoop cluster fails due to wrong 
> classpath
> -------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5655
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5655
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client, job submission
>    Affects Versions: 2.2.0
>         Environment: Client machine is a Windows 7 box, with Eclipse
> Remote: there is a multi node hadoop cluster, installed on Ubuntu boxes (any 
> linux)
>            Reporter: Attila Pados
>         Attachments: MRApps.patch, YARNRunner.patch
>
>
> I was trying to run a Java class on my client, a Windows 7 developer 
> environment, that submits a job to the remote Hadoop cluster, starts a 
> MapReduce job there, and then downloads the results back to the local machine.
> The general use case is using Hadoop services from a web application installed 
> on a non-cluster computer, or as part of a developer environment.
> The problem was that the ApplicationMaster's startup shell script 
> (launch_container.sh) was generated with a wrong CLASSPATH entry. Together with 
> the java process call at the bottom of the file, these entries were generated 
> in Windows style, using % as the shell variable marker and ; as the CLASSPATH 
> delimiter.
> I tracked down the root cause and found that the MRApps.java and 
> YARNRunner.java classes create these entries, which are then passed on to the 
> ApplicationMaster, on the assumption that the OS running these classes matches 
> the one running the ApplicationMaster. That is not the case: the two run in 
> different JVMs, possibly on different OSes, and the strings are generated 
> based on the client/submitter side's OS.
> I made some workaround changes to these two files so I could launch my job; 
> however, there may be more problems ahead.
> update
>  error message:
> 13/12/04 16:33:15 INFO mapreduce.Job: Job job_1386170530016_0001 failed with 
> state FAILED due to: Application application_1386170530016_0001 failed 2 
> times due to AM Container for appattempt_1386170530016_0001_000002 exited 
> with  exitCode: 1 due to: Exception from container-launch: 
> org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job 
> control
>       at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
>       at org.apache.hadoop.util.Shell.run(Shell.java:379)
>       at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>       at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>       at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>       at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:724)
> update2: 
>  It also requires adding the following property to 
>  mapred-site.xml (or mapred-default.xml) on the Windows box, so that the job 
> launcher knows that the job runner will be Linux:
>   <property>
>   <name>mapred.remote.os</name>
>   <value>Linux</value>
>   <description>Remote MapReduce framework's OS, can be either Linux or Windows</description>
>  </property>
> Without this entry, the patched jar behaves the same as the unpatched one, so 
> it is required for the fix to work.
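
For comparison, the client-side workaround quoted above can be pictured 
roughly as follows. This is only a guess at the shape of the logic, not the 
contents of MRApps.patch or YARNRunner.patch: java.util.Properties stands in 
for the Hadoop configuration, and the class and method names are hypothetical. 
Only the property name mapred.remote.os comes from the report.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical stand-in for the patched client logic; not the real patch.
final class RemoteOsClasspath {
    // Falls back to the client's own OS when mapred.remote.os is unset,
    // which reproduces the unpatched behaviour described in the report.
    static boolean targetIsWindows(Properties conf) {
        String os = conf.getProperty("mapred.remote.os",
                System.getProperty("os.name"));
        return os.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    // Builds a classpath of environment variable references in the
    // syntax of the target OS rather than the submitting OS.
    static String classpath(Properties conf, String... variables) {
        boolean windows = targetIsWindows(conf);
        String separator = windows ? ";" : ":";
        StringBuilder sb = new StringBuilder();
        for (String name : variables) {
            if (sb.length() > 0) {
                sb.append(separator);
            }
            sb.append(windows ? "%" + name + "%" : "$" + name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("mapred.remote.os", "Linux");
        System.out.println(classpath(conf, "PWD", "HADOOP_CONF_DIR"));
    }
}
```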



--
This message was sent by Atlassian JIRA
(v6.2#6252)
