Slides 17 & 18 give a glimpse into this scheduler:
http://wiki.apache.org/hadoop-data/attachments/HadoopPresentations/attachments/dhruba_apachecon2008.pdf


Oh, and I see the JIRA issue contains a patch for 0.18.1 (possibly applicable 
to 0.18.2 as well).

But I'm really curious if others think this would work for and help with Nutch 
generate/fetch/parse/etc. operations.

Otis 
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch




________________________________
From: Otis Gospodnetic <[EMAIL PROTECTED]>
To: Nutch User List <[email protected]>
Sent: Thursday, November 20, 2008 3:51:31 PM
Subject: Hadoop's new fair sharing job scheduler

Hi,

Just noticed Hadoop's new fair sharing job scheduler 
(https://issues.apache.org/jira/browse/HADOOP-3746). It seems to be in 0.19, 
which I think Nutch is not on yet... but still:

- is this something that would benefit Nutch?

The last time I used Nutch, I remember having to be careful about mostly 
sequential job runs and having to pay close attention to the maximum number of 
map/reduce tasks, etc. in order to maximize cluster utilization. I wonder if 
the above would make that easier, less manual, or more efficient?


Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
