Hi all,
I have set "mapred.map.tasks" and "mapred.reduce.tasks" to "32" in
mapreduce.xml (see the properties below), but Hadoop uses only 2 or 3 tasks
when running a job (see the job list below). How can I make Hadoop use more
tasks so that jobs run faster?

<property>
  <name>mapred.map.tasks</name>
  <value>32</value>
  <description>define mapred.map tasks to be number of slave hosts</description>
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>32</value>
  <description>define mapred.reduce tasks to be number of slave hosts</description>
</property>

Jobid | Priority | User | Name | Map % Complete | Map Total | Maps Completed | Reduce % Complete | Reduce Total | Reduces Completed | Job Scheduling Information
job_201010051740_0001 | NORMAL | bill | inject dmoz | 100.00% | 2 | 2 | 100.00% | 1 | 1 | NA
job_201010051740_0002 | NORMAL | bill | crawldb crawl/crawldb | 100.00% | 3 | 3 | 100.00% | 1 | 1 | NA
job_201010051740_0003 | NORMAL | bill | generate: select from crawl/crawldb | 100.00% | 2 | 2 | 100.00% | 1 | 1 | NA
job_201010051740_0004 | NORMAL | bill | generate: partition crawl/segments/20101005180708 | 100.00% | 2 | 2 | 100.00% | 2 | 2 | NA
job_201010051740_0007 | NORMAL | bill | inject dmoz | 100.00% | 2 | 2 | 100.00% | 1 | 1 |

Dennis