Hello everyone.
I was attempting to use a nightly build of Nutch to run some internet
crawl tests.
I have 2 boxes that do fetching and use shared storage to keep things
all together.
Is it just me or has the numFetchers option been removed from
org.apache.nutch.crawl.Generator?
I've browsed over the code (downloaded from SVN) and noticed this:
[from org.apache.nutch.crawl.Generator.generate()]
if (numLists == -1) {                 // for politeness make
  numLists = job.getNumMapTasks();    // a partition per fetch task
}
job.setLong("crawl.gen.curTime", curTime);
job.setLong("crawl.topN", topN);
[-]
numLists starts out equal to numFetchers,
but I don't see where it is used other than in the check above.
Would it work to add a call like this?
job.setLong("crawl.numFetchers",numLists);
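To make the idea concrete, here is a rough stand-alone sketch of what I mean. It does not touch Nutch or Hadoop at all: a plain Map stands in for the job configuration, the class and method names are made up for illustration, and "crawl.numFetchers" is a property name I invented, so I don't know whether anything downstream would actually read it.

```java
import java.util.HashMap;
import java.util.Map;

public class NumFetchersSketch {
    // Hypothetical property name; an assumption, not a known Nutch setting.
    static final String NUM_FETCHERS_KEY = "crawl.numFetchers";

    // Mirrors the check in Generator.generate(): -1 means "not specified",
    // so fall back to the number of map tasks.
    static int resolveNumLists(int numLists, int numMapTasks) {
        return numLists == -1 ? numMapTasks : numLists;
    }

    // Stand-in for job.setLong(...): record the resolved value in the config
    // so that later stages could look it up.
    static void storeNumFetchers(Map<String, Long> conf, int numLists, int numMapTasks) {
        conf.put(NUM_FETCHERS_KEY, (long) resolveNumLists(numLists, numMapTasks));
    }

    public static void main(String[] args) {
        Map<String, Long> conf = new HashMap<>();
        storeNumFetchers(conf, 2, 4); // 2 fetchers requested, 4 map tasks available
        System.out.println(NUM_FETCHERS_KEY + " = " + conf.get(NUM_FETCHERS_KEY));
    }
}
```

The point is only that the value gets resolved once and then stored where a later job could read it; whether the partitioning step would honor it is exactly what I'm unsure about.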
I'm hoping that someone knows what's up.
Jeff.