On Aug 23, 2007, at 7:58 AM, Thomas Friol wrote:

Important point: the client submitting jobs is on a completely different
machine from the master and the slaves, and it also runs as a completely
different user.

The main problem is the parameter 'hadoop.tmp.dir', whose default value
is '/tmp/hadoop-${user.name}', which means it is based on the user name.

Step 1: the client (user A) submits a job using the JobClient
class, so the job jar and job files are uploaded to the DFS under the
directory /tmp/hadoop-A/mapred/system/job-id
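The mismatch described above can be sketched as follows. This is an illustration of the path expansion only, not actual Hadoop code, and `default_system_dir` is a hypothetical helper: Hadoop substitutes ${user.name} from the JVM's user.name property, so the client and the JobTracker resolve different paths when they run as different users.

```python
# Illustration only (not Hadoop source): how the per-user default of
# hadoop.tmp.dir makes the client and the JobTracker disagree on the
# map/reduce system directory when they run as different users.

DEFAULT_TMP_DIR = "/tmp/hadoop-${user.name}"

def default_system_dir(user: str) -> str:
    # Each user resolves a different base path from the same default.
    base = DEFAULT_TMP_DIR.replace("${user.name}", user)
    return base + "/mapred/system"

# The client running as user A uploads the job files here...
print(default_system_dir("A"))  # /tmp/hadoop-A/mapred/system
# ...while a JobTracker running as user B looks here:
print(default_system_dir("B"))  # /tmp/hadoop-B/mapred/system
```

Setting mapred.system.dir to a fixed, shared path (as shown below) removes the ${user.name} component, so both sides agree on the directory.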

The problem is that you haven't configured your map/reduce system directory. The default works for single node systems, but not for "real" clusters. I like to use:

<property>
  <name>mapred.system.dir</name>
  <value>/hadoop/mapred/system</value>
  <description>The shared directory where MapReduce stores control files.
  </description>
</property>

Note that this directory lives in your default file system, must be accessible from both the client and the server machines, and is typically in HDFS. I've added a slight extension on HADOOP-1100 to have the system directory passed back from the JobTracker to the client.

-- Owen
