The default tuning parameters are specified in nutch/conf/nutch-default.xml and can be overridden in nutch/conf/nutch-site.xml. (Or on the crawl command line, but I believe the 'best practice' is to configure settings in nutch-site.xml.)
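As a rough sketch, an override in nutch/conf/nutch-site.xml could look something like the fragment below. The two property names are the fetcher thread settings I discuss next; the values are placeholders I made up for illustration, not recommendations, so check nutch-default.xml for the defaults and descriptions that ship with your version.

```xml
<?xml version="1.0"?>
<!-- nutch/conf/nutch-site.xml: anything set here overrides nutch-default.xml -->
<configuration>
  <property>
    <name>fetcher.threads.fetch</name>
    <!-- illustrative value only: total number of fetcher threads -->
    <value>20</value>
  </property>
  <property>
    <name>fetcher.threads.per.host</name>
    <!-- illustrative value only: threads allowed to hit one host at a time -->
    <value>2</value>
  </property>
</configuration>
```

Be careful with the per-host setting: raising it makes the crawl harder on each individual site.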
My personal belief is that the two most valuable parameters for tuning the crawler are 'fetcher.threads.fetch' and 'fetcher.threads.per.host'. However, there are lots of other tuning parameters, and you might find more value in some of the timeout parameters. (You might also want to look at tuning your JVM heap space, but I've never seen a real need to tweak it.)
As for resuming a failed crawl, I don't know of any way to do so. I always discard the crawl and restart.
--
View this message in context: http://old.nabble.com/What-are-the-configuration-parameters-to-fine-tune-Nutch-performance-tp26125943p26250181.html
Sent from the Nutch - User mailing list archive at Nabble.com.