Hi, I'm having some trouble controlling the output replication level with PigRunner. In particular, on a single-box dev environment I want to set replication to 1. Otherwise, with replication at 3x, the name node marks all blocks as under-replicated and eventually starts freaking out.
Here are some details:

- I set replication to 1 in hdfs-site.xml. I set all relevant environment variables like HADOOP_CONF_DIR and PIG_HOME.
- When I run pig on the command line I get my desired output replication of 1.
- When I run pig through PigRunner I get output replication of 3.
- I checked on all ENV variables within my process using PigRunner. They match what I see in the shell. (Not sure if PigRunner would pick these up anyway.)
- I pass a properties file to PigRunner. The only relevant property there is 'mapred.submit.replication=1'.

My best guess is I'm not passing in the correct properties, but I am not sure. Thanks in advance for any suggestions here.

Thanks,
Adam
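
For reference, the setup described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual code in use: the class name, the temp properties file, and `script.pig` are hypothetical. It also adds `dfs.replication=1` next to `mapred.submit.replication=1`, on the assumption that it may be the missing property: `mapred.submit.replication` governs the replication of submitted job files, while `dfs.replication` is the client-side default for files written to HDFS. Whether that resolves the PigRunner behaviour is untested here.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class ReplicationProps {

    // Properties for a single-box dev setup. dfs.replication is an assumed
    // addition; the original file only contained mapred.submit.replication.
    static Properties devProperties() {
        Properties props = new Properties();
        props.setProperty("mapred.submit.replication", "1");
        props.setProperty("dfs.replication", "1");
        return props;
    }

    // Write the properties in standard .properties format.
    static void write(Properties props, Path target) throws IOException {
        try (Writer out = Files.newBufferedWriter(target)) {
            props.store(out, "single-box dev settings");
        }
    }

    // Build the argument array: -P <property file> followed by the script.
    // The script name is a placeholder.
    static String[] pigArgs(Path propertyFile, String script) {
        return new String[] {"-P", propertyFile.toString(), script};
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("dev", ".properties");
        write(devProperties(), file);
        String[] pigArgs = pigArgs(file, "script.pig");
        System.out.println(String.join(" ", pigArgs));
        // With Pig on the classpath, the array would be handed to PigRunner:
        //   org.apache.pig.PigRunner.run(pigArgs, null);
    }
}
```

The PigRunner call is left as a comment so the sketch compiles without Pig on the classpath.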
