Author: ab
Date: Tue Oct 31 13:46:26 2006
New Revision: 469667

URL: http://svn.apache.org/viewvc?view=rev&rev=469667
Log:
NUTCH-361, NUTCH-136 - When jobtracker is 'local' generate only one partition.

Modified:
    lucene/nutch/branches/branch-0.8/CHANGES.txt
    lucene/nutch/branches/branch-0.8/src/java/org/apache/nutch/crawl/Generator.java

Modified: lucene/nutch/branches/branch-0.8/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/nutch/branches/branch-0.8/CHANGES.txt?view=diff&rev=469667&r1=469666&r2=469667
==============================================================================
--- lucene/nutch/branches/branch-0.8/CHANGES.txt (original)
+++ lucene/nutch/branches/branch-0.8/CHANGES.txt Tue Oct 31 13:46:26 2006
@@ -8,6 +8,9 @@
  2. NUTCH-379 - ParseUtil does not pass through the content's URL to the
     ParserFactory (Chris A. Mattmann via siren)
 
+ 3. NUTCH-361, NUTCH-136 - When jobtracker is 'local' generate only one
+    partition. (ab)
+
 Release 0.8.1 - 2006-09-24
 
  1. Changed log4j confiquration to log to stdout on commandline

Modified: lucene/nutch/branches/branch-0.8/src/java/org/apache/nutch/crawl/Generator.java
URL: http://svn.apache.org/viewvc/lucene/nutch/branches/branch-0.8/src/java/org/apache/nutch/crawl/Generator.java?view=diff&rev=469667&r1=469666&r2=469667
==============================================================================
--- lucene/nutch/branches/branch-0.8/src/java/org/apache/nutch/crawl/Generator.java (original)
+++ lucene/nutch/branches/branch-0.8/src/java/org/apache/nutch/crawl/Generator.java Tue Oct 31 13:46:26 2006
@@ -299,6 +299,12 @@
       numLists = job.getNumMapTasks();          // a partition per fetch task
     }
 
+    if ("local".equals(job.get("mapred.job.tracker")) && numLists != 1) {
+      // override
+      LOG.info("Generator: jobtracker is 'local', generating exactly one partition.");
+      numLists = 1;
+    }
+
     job.setLong("crawl.gen.curTime", curTime);
     job.setLong("crawl.topN", topN);
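
For readers who want to try the override outside of Nutch, the rule added to Generator.java in this revision can be sketched as a standalone method. This is only an illustration under stated assumptions: the class name PartitionOverride, the method resolveNumLists, and the sample tracker address "host:9001" are hypothetical and do not appear in the commit; the jobTracker argument stands in for the value returned by job.get("mapred.job.tracker").

```java
/** Hypothetical stand-in for the override in Generator.java; not part of Nutch. */
final class PartitionOverride {

  /**
   * Returns the number of fetch lists (partitions) to generate.
   * jobTracker stands in for job.get("mapred.job.tracker") and may be null.
   */
  static int resolveNumLists(String jobTracker, int numLists) {
    // "local".equals(...) is null-safe, mirroring the comparison in the patch.
    if ("local".equals(jobTracker) && numLists != 1) {
      return 1; // local job tracker: force exactly one partition
    }
    return numLists; // any other tracker: keep the requested count
  }

  public static void main(String[] args) {
    System.out.println(resolveNumLists("local", 4));     // prints 1
    System.out.println(resolveNumLists("host:9001", 4)); // prints 4
    System.out.println(resolveNumLists(null, 4));        // prints 4
  }
}
```

Putting the string literal on the left of equals() means an unset mapred.job.tracker property simply leaves numLists unchanged instead of throwing a NullPointerException.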