Hi,

As I understand it, the Nutch crawler uses a crawl-and-stop-with-threshold strategy, with the threshold set by the -topN parameter. Please correct me if I'm wrong. This also means that some sites will be crawled to a different depth than others.

Is there a way to control the crawl depth per domain and the number of URLs per domain, as well as the total number of domains crawled (in this case, -topN)?

Thanks,
Daniel
