I am running Nutch on a powerful server with 1 GB of RAM and a 3 GHz
Intel processor. I want to know what the optimum number of threads
would be for crawling an intranet with around 100 sites.

If I use too many threads (say -threads 100) while crawling, won't the
context-switching overhead hamper performance?

Please share your experience: what number of threads has worked well
for you?

You may also share other settings, such as your "-depth" and "-topN" values.
