Re: need a larger map task number

Dennis Tue, 05 Oct 2010 17:46:56 -0700

Thanks, Steve.
I'm using Nutch 1.1, which I installed by following
this: http://wiki.apache.org/nutch/NutchHadoopTutorial. But I did not see any
hadoop-site.xml file. I used grep to look for anything related to 'task' (see
below). Also, the "crawldb crawl/crawldb" job uses more MapReduce tasks,
usually 4, while the other jobs use only 2. Any idea?
b...@nutch03:~/nutch/search$ grep task conf/*
conf/capacity-scheduler.xml:  <!-- The default configuration settings for the capacity task scheduler -->
conf/domain-suffixes.xml:    <!--  ke : http://www.kenic.or.ke/index.php?option=com_content&task=view&id=117&Itemid=145-->
conf/domain-suffixes.xml:    <!--  TASK geographical domains (www.task.gda.pl/uslugi/dns)-->
conf/hadoop-policy.xml:    <description>ACL for InterTrackerProtocol, used by the tasktrackers to
conf/hadoop-policy.xml:    <name>security.task.umbilical.protocol.acl</name>
conf/hadoop-policy.xml:    tasks to communicate with the parent tasktracker.
conf/mapred-site.xml:    reduce task.
conf/mapred-site.xml:  <name>mapred.map.tasks</name>
conf/mapred-site.xml:    define mapred.map tasks to be number of slave hosts
conf/mapred-site.xml:  <name>mapred.reduce.tasks</name>
conf/mapred-site.xml:    define mapred.reduce tasks to be number of slave hosts
Dennis

--- On Tue, 10/5/10, Steve Cohen <[email protected]> wrote:

From: Steve Cohen <[email protected]>
Subject: Re: need a larger map task number
To: [email protected]
Date: Tuesday, October 5, 2010, 9:40 PM

For nutch, I found that updating the values in hadoop-site.xml was enough,
though I also set values for mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum.
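
For illustration, the two tasktracker properties named above could be set
roughly as follows. This is only a sketch: it assumes a Hadoop 0.20-style
layout in which these settings go into conf/mapred-site.xml (the file where
the grep output shows mapred.map.tasks already defined) rather than
hadoop-site.xml, and the value 4 is an arbitrary placeholder, not a
recommendation from this thread.

```xml
<!-- Sketch only: these elements go inside the existing <configuration>
     element of conf/mapred-site.xml (assumed location). -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <!-- Assumed value: maximum map tasks one tasktracker runs at the same time. -->
  <value>4</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <!-- Assumed value: maximum reduce tasks one tasktracker runs at the same time. -->
  <value>4</value>
</property>
```

These are per-node slot limits, so the file would need to be updated on every
slave, and the tasktrackers would probably need a restart before the new
limits apply.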

On Tue, Oct 5, 2010 at 9:24 AM, Dennis <[email protected]> wrote:

> Hi, all
> My "fetch" job uses only 2 map tasks and 2 reduce tasks, although I
> set "mapred.map.tasks" and "mapred.reduce.tasks" in "mapreduce.xml"
> to "32", and I need it to run faster. How can I make Nutch use more map and
> reduce tasks when it's fetching?
> Dennis
>
>
>
