I have some questions about Nutch:

1. What ranking algorithm does Nutch use?
2. Does this ranking algorithm come into play when, at a certain crawl depth, Nutch is confronted with 10,000 candidate URLs but the topN value is only 100?
3. When exactly is the topN selection done: during generation, during fetching, or in some other phase?
