Re: dfs.replication factor for MR jobs

2012-05-17 Thread Aishwarya Venkataraman
The MR job that I'm running has zero reducers (sorry, I should have mentioned this earlier). It's a mapper-only job. Thanks, On Thu, May 17, 2012 at 2:31 PM, Abhishek Pratap Singh wrote: > Hi Aishwarya, > > Temporary output of mapper is used for reducer. And number of Reduce jobs > are based on th

Re: dfs.replication factor for MR jobs

2012-05-17 Thread Aishwarya Venkataraman
Apologies, this works now if I set dfs.replication=1 when I launch the job, i.e. hadoop jar foo.jar com.foo -D dfs.replication=1 input output On Thu, May 17, 2012 at 2:06 PM, Aishwarya Venkataraman wrote: > Hello, > > I have a 4-node cluster. One namenode and 3 other datanodes. I want to > exp

Re: dfs.replication factor for MR jobs

2012-05-17 Thread Abhishek Pratap Singh
Hi Aishwarya, The temporary output of the mapper is used by the reducer, and the number of Reduce jobs is based on the output keys of the Mapper. It has nothing to do with the replication factor. It is writing to three nodes because at least three keys have been generated by the mapper and assigned reducer to three diffe

dfs.replication factor for MR jobs

2012-05-17 Thread Aishwarya Venkataraman
Hello, I have a 4-node cluster: one namenode and 3 other datanodes. I want to explicitly set the dfs.replication factor to 1 in order to run some experiments. I tried setting this via the hdfs-site.xml file and via the command line as well (hadoop dfs -setrep -R -w 1 /). But I have a feeling that t
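
For readers following the thread, the hdfs-site.xml setting mentioned in the question would typically look like the fragment below. This is only a sketch of the standard dfs.replication property, not the poster's actual file. The replication factor is applied by the HDFS client at the time a file is created, so the value has to be visible in the configuration used by the process that submits the job. That is consistent with the -D dfs.replication=1 launch option reported as working in the reply above, which assumes the job driver parses generic options (for example via ToolRunner).

```xml
<!-- hdfs-site.xml (sketch; the value 1 is taken from the question above) -->
<configuration>
  <property>
    <!-- Default number of replicas for newly created files -->
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```

Files that already exist keep the replication factor they were written with, which is why the question also mentions hadoop dfs -setrep for changing existing paths.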