Allow me to add a related question: Fetching is faster if you have more machines.
Is the same true for generate and update steps? In other words, is it faster to generate a fetchlist on a 100-node cluster than on a 10-node cluster (assuming the same crawldb, etc.)?

Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Tomislav Poljak <[EMAIL PROTECTED]>
> To: "[email protected]" <[email protected]>
> Sent: Tuesday, December 9, 2008 2:46:27 PM
> Subject: Fetching vs. generate and updatedb time ratio
>
> Hi,
> what is the relative ratio of how long fetching takes vs. other steps
> (generate and updatedb) in standard generate/fetch/update cycle?
>
> Thanks,
> Tomislav
