Alan Gates
Fri, 11 Apr 2008 15:21:30 -0700
Alan.

mickey hsieh wrote:
I tried to figure out how Pig sets the number of tasks for Map and Reduce jobs. The number of Map tasks always seems to be tied to the number of input files. Since there is one input file, the number of Map tasks is 1, even though I had a 5.4 GB file with more than 1000 blocks. Setting mapred.map.tasks has no effect whatsoever:

<property>
  <name>mapred.map.tasks</name>
  <value>7</value>
  <description>The default number of reduce tasks per job. Typically set to a prime close to the number of available hosts. Ignored when mapred.job.tracker is "local".</description>
</property>

The number of Reduce tasks can be set in hadoop-site.xml:

<property>
  <name>mapred.reduce.tasks</name>
  <value>2</value>
  <description>The default number of reduce tasks per job. Typically set to a prime close to the number of available hosts. Ignored when mapred.job.tracker is "local".</description>
</property>

Please advise,
Mickey Hsieh
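For reference, here is a minimal sketch of how the two properties quoted above might sit together in a hadoop-site.xml. The enclosing <configuration> element and the comments are assumptions added for illustration and are not taken from the poster's file; the values 7 and 2 are the ones quoted. The map-task description in the quoted file reads as if it was copied from the reduce-task entry, so it is replaced here by a comment.

```xml
<?xml version="1.0"?>
<!-- Illustrative hadoop-site.xml fragment; the layout is an assumption, not the poster's actual file. -->
<configuration>

  <!-- Requested number of map tasks. Hadoop treats this value as a hint:
       the job's input format decides the real number of splits. -->
  <property>
    <name>mapred.map.tasks</name>
    <value>7</value>
  </property>

  <!-- Number of reduce tasks per job; unlike the map setting, this value is used as given. -->
  <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
  </property>

</configuration>
```

Because the map setting is only a hint, it would be consistent with the behaviour described above for a job to honour mapred.reduce.tasks while ignoring mapred.map.tasks. Whether that is what Pig does here is the open question of this thread.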