Run job not from namenode

Andrey Pankov Mon, 31 Mar 2008 04:58:25 -0700

Hi all,

Currently I'm able to run map-reduce jobs from the box where the NameNode and JobTracker are running. But I'd like to run my jobs from a separate box, from which I have access to HDFS. I have updated the params fs.default.name and mapred.job.tracker in the local hadoop dir to point to the cluster's master. Now Hadoop returns the following error:

[EMAIL PROTECTED]:/usr/local/hadoop-0.16.0$ bin/hadoop jar hadoop-0.16.0-examples.jar wordcount /user/username/gutenberg /user/username/gutenberg-output
08/03/31 10:21:46 INFO mapred.FileInputFormat: Total input paths to process : 3
org.apache.hadoop.ipc.RemoteException: java.io.IOException: /mnt/hadoop/mapred/system/job_200803210640_0852/job.xml: No such file or directory

        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:159)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:133)

        at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1083)

...

Here the account 'username' has passwordless access to the master box. The cluster runs on EC2.


As an alternative, I can run jobs via ssh, e.g.

ssh master /usr/local/hadoop-0.16.0/bin/hadoop jar /home/username/jobs/hadoop-0.16.0-examples.jar wordcount /user/username/gutenberg /user/username/gutenberg-output


But then the jar file has to be copied to the NameNode box before each run.
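
A rough sketch of that two-step workaround as a shell helper. The host alias `master`, the Hadoop directory and the jobs directory are assumptions taken from the command above; the function name `run_remote_job` is made up for illustration.

```shell
#!/bin/sh
# Sketch: copy a job jar to the master, then launch it there over ssh.
# MASTER, HADOOP_HOME and REMOTE_JOBS_DIR are assumptions based on the
# commands in this message; adjust them for your cluster.
MASTER="${MASTER:-master}"
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop-0.16.0}"
REMOTE_JOBS_DIR="${REMOTE_JOBS_DIR:-/home/username/jobs}"

# run_remote_job JAR ARGS...
# Copies JAR into $REMOTE_JOBS_DIR on $MASTER, then runs
# "bin/hadoop jar" on the copied file with the remaining arguments.
run_remote_job() {
    jar="$1"
    shift
    # Stop if the copy fails, so a stale jar is not run by accident.
    scp "$jar" "$MASTER:$REMOTE_JOBS_DIR/" || return 1
    ssh "$MASTER" "$HADOOP_HOME/bin/hadoop" jar \
        "$REMOTE_JOBS_DIR/$(basename "$jar")" "$@"
}
```

After sourcing it, the call would look like `run_remote_job hadoop-0.16.0-examples.jar wordcount /user/username/gutenberg /user/username/gutenberg-output`. Note that ssh joins its arguments into one remote command line, so arguments containing spaces would need extra quoting. This still goes through the master box; it only automates the copy step.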

Thanks in advance.

--
Andrey Pankov
