Hi All, I am crawling several large websites, each listed by its homepage URL in the seed file. The problem I am facing is that one of the websites is being crawled much faster than the rest, and as a result the indexed data contains a disproportionate number of entries for that one website.
I suspect this is happening because the homepage of the website in question has the largest number of outlinks. My question is: how can I control the behaviour of Nutch so that it crawls every host/domain in a balanced way? I am using Nutch 1.7. Thanks.

