Apologies, this works now if I set dfs.replication=1 when I launch the job, i.e.

hadoop jar foo.jar com.foo -D dfs.replication=1 input output

On Thu, May 17, 2012 at 2:06 PM, Aishwarya Venkataraman <avenk...@cs.ucsd.edu> wrote:
> Hello,
>
> I have a 4-node cluster: one namenode and 3 datanodes. I want to
> explicitly set the dfs.replication factor to 1 in order to run some
> experiments. I tried setting this via the hdfs-site.xml file and via
> the command line as well (hadoop dfs -setrep -R -w 1 /). But I have a
> feeling that the replication factor that HDFS is seeing is 3. It seems
> to be writing the temporary mapper outputs to all 3 datanodes. Is
> this the default configuration for MR jobs? If not, how can I set this
> to 1?
>
> Thanks,
> Aishwarya

--
Thanks,
Aishwarya Venkataraman
avenkata[at]cs[dot]ucsd[dot]edu
Graduate Student | Department of Computer Science
University of California, San Diego
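
For comparison with the command-line override above, the same setting in hdfs-site.xml would look roughly like the fragment below. This is only a sketch, not a confirmed fix: the thread reports that the file-based setting did not seem to take effect. One likely explanation, stated here as an assumption about this cluster, is that the replication factor is chosen by the HDFS client when a file is created, so the property has to be in the configuration read on the machine that submits the job, not only on the namenode. Note also that -setrep changes the replication of files that already exist and does not affect files written later.

```xml
<!-- hdfs-site.xml on the node that submits the job (assumed location) -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <!-- default replication for newly created files -->
    <value>1</value>
  </property>
</configuration>
```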