On Aug 23, 2007, at 7:58 AM, Thomas Friol wrote:
Important point: the client submitting jobs is on a completely different
machine from the master and the slaves, and it also runs as a completely
different user.
The main problem is the parameter 'hadoop.tmp.dir', whose default value
is '/tmp/hadoop-${user.name}', which means it is based on the user name.
Step 1: the client (user A) submits a job using the JobClient class,
so the job jar and job files are uploaded to the DFS into the
directory /tmp/hadoop-A/mapred/system/job-id
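To see why this breaks, here is a minimal sketch (plain Java, not actual Hadoop code) of how the ${user.name} placeholder in the default hadoop.tmp.dir makes the derived system directory depend on whoever runs the process; the systemDir helper is hypothetical and only illustrates the substitution:

```java
public class SystemDirPath {
    // Hypothetical sketch: with mapred.system.dir unset, the system
    // directory falls back under hadoop.tmp.dir, which defaults to
    // /tmp/hadoop-${user.name} -- so the resulting path depends on
    // which user evaluates it.
    static String systemDir(String tmpDirTemplate, String userName) {
        // Substitute the ${user.name} placeholder, then append the
        // fixed mapred/system suffix.
        return tmpDirTemplate.replace("${user.name}", userName)
                + "/mapred/system";
    }

    public static void main(String[] args) {
        String template = "/tmp/hadoop-${user.name}";
        // Client user A resolves one directory...
        System.out.println(systemDir(template, "A"));
        // ...while a daemon running as a different user resolves another,
        // so the JobTracker never looks where the client uploaded the job.
        System.out.println(systemDir(template, "hadoop"));
    }
}
```

Because the client and the cluster daemons resolve different directories, the JobTracker never sees the files the client uploaded, which is exactly what a fixed, shared mapred.system.dir avoids.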
The problem is that you haven't configured your map/reduce system
directory. The default works for single node systems, but not for
"real" clusters. I like to use:
<property>
  <name>mapred.system.dir</name>
  <value>/hadoop/mapred/system</value>
  <description>The shared directory where MapReduce stores control
  files.</description>
</property>
Note that this directory is in your default file system, must be
accessible from both the client and server machines, and is typically
in HDFS. I've added a slight extension on HADOOP-1100 to have the
system directory passed back from the job tracker to the client.
-- Owen