The MR job that I'm running has zero reducers (sorry, I should have mentioned this earlier). It's a map-only job.
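
For reference, here is a minimal sketch of the two settings involved. It assumes Hadoop 1.x-era property names: `dfs.replication` is the property named in the original question, while `mapred.reduce.tasks` is an assumption about how the zero-reducer setup is expressed (it could equally be set in the job driver). The values are illustrative and not copied from the actual cluster configuration.

```xml
<!-- hdfs-site.xml: default replication factor for newly created files -->
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

<!-- mapred-site.xml or per-job configuration (assumed): zero reducers, i.e. a map-only job -->
<property>
  <name>mapred.reduce.tasks</name>
  <value>0</value>
</property>
```

As far as I understand, `dfs.replication` is applied by the client that creates a file, so the value has to be visible in the configuration the job itself loads, not only on the datanodes.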
Thanks,

On Thu, May 17, 2012 at 2:31 PM, Abhishek Pratap Singh <manu.i...@gmail.com> wrote:
> Hi Aishwarya,
>
> Temporary output of mapper is used for reducer. And number of Reduce jobs
> are based on the output keys of Mapper. It has nothing to do with
> replication factor. It is writing to three nodes because at least three
> keys have been generated from mapper and assigned reducer to three different
> nodes.
>
> Regards,
> Abhishek
>
> On Thu, May 17, 2012 at 2:06 PM, Aishwarya Venkataraman <avenk...@cs.ucsd.edu> wrote:
>
>> Hello,
>>
>> I have a 4-node cluster. One namenode and 3 other datanodes. I want to
>> explicitly set the dfs.replication factor to 1 in order to run some
>> experiments. I tried setting this via the hdfs-site.xml file and via
>> the command line as well (hadoop dfs -setrep -R -w 1 /). But I have a
>> feeling that the replication factor that hdfs is seeing is 3. It seems
>> to be writing the temporary mapper outputs to all the 3 datanodes. Is
>> this the default configuration for MR jobs? If no, how can I set this
>> to 1?
>>
>> Thanks,
>> Aishwarya
>>

--
Thanks,
Aishwarya Venkataraman
avenkata[at]cs[dot]ucsd[dot]edu
Graduate Student | Department of Computer Science
University of California, San Diego