Hello everyone.

I have been trying to use a nightly build of Nutch to run some internet crawl tests.

I have two boxes that do the fetching and use shared storage to keep everything together.

Is it just me or has the numFetchers option been removed from org.apache.nutch.crawl.Generator?

I've browsed over the code (downloaded from SVN) and noticed this:

[from org.apache.nutch.crawl.Generator.generate()]
   if (numLists == -1) {                 // for politeness make
     numLists = job.getNumMapTasks();    // a partition per fetch task
   }

   job.setLong("crawl.gen.curTime", curTime);
   job.setLong("crawl.topN", topN);
[-]

numLists starts out equal to numFetchers,
but I don't see where it is used other than in the check above.

Would it work to add a call like this?
job.setLong("crawl.numFetchers",numLists);
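
To make the idea concrete, here is a minimal, self-contained sketch of what I have in mind. It does not use the real Nutch or Hadoop classes: the Map is a hypothetical stand-in for the job configuration, NumListsSketch and its methods are names made up for illustration, and "crawl.numFetchers" is only the key proposed above. I have not confirmed that anything in Nutch reads that key.

```java
import java.util.HashMap;
import java.util.Map;

public class NumListsSketch {

    // Mirrors the check quoted from Generator.generate():
    // -1 means "no explicit value, use one list per map task".
    static int resolveNumLists(int numLists, int numMapTasks) {
        return numLists == -1 ? numMapTasks : numLists;
    }

    // Stores the resolved value under the proposed key. The Map is a
    // hypothetical stand-in for the job configuration, and the key name
    // is an assumption, not something known to be read by Nutch.
    static void configure(Map<String, Long> conf, int numLists, int numMapTasks) {
        int resolved = resolveNumLists(numLists, numMapTasks);
        conf.put("crawl.numFetchers", (long) resolved);
    }

    public static void main(String[] args) {
        Map<String, Long> conf = new HashMap<String, Long>();

        // No explicit numFetchers: falls back to the number of map tasks.
        configure(conf, -1, 2);
        System.out.println("default:  " + conf.get("crawl.numFetchers"));

        // Explicit numFetchers: the given value is kept.
        configure(conf, 5, 2);
        System.out.println("explicit: " + conf.get("crawl.numFetchers"));
    }
}
```

Setting the value would only help if the partitioning step later reads the same key back, which is the part I cannot find in the code.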

I'm hoping that someone knows what's up.

Jeff.


