Slides 17 and 18 give a glimpse into this scheduler: http://wiki.apache.org/hadoop-data/attachments/HadoopPresentations/attachments/dhruba_apachecon2008.pdf
Oh, and I see the JIRA issue contains a patch for 0.18.1 (possibly applicable to 0.18.2 as well). But I'm really curious whether others think this would work for, and help with, Nutch generate/fetch/parse/etc. operations.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

________________________________
From: Otis Gospodnetic <[EMAIL PROTECTED]>
To: Nutch User List <[email protected]>
Sent: Thursday, November 20, 2008 3:51:31 PM
Subject: Hadoop's new fair sharing job scheduler

Hi,

Just noticed Hadoop's new fair sharing job scheduler ( https://issues.apache.org/jira/browse/HADOOP-3746 ). It seems to be in 0.19, which I think Nutch is not on yet... but still: is this something that would benefit Nutch? The last time I used Nutch I remember having to be careful about mostly sequential job runs and having to pay close attention to the maximum number of map/reduce tasks, etc., in order to maximize cluster utilization, and I wonder whether the above would make that easier, less manual, or more efficient.

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
