Hi All,

*Background*

I would like to crawl 10-20 domains and all the pages underneath them. I have a Nutch crawler that runs continuously.

*Problem*

I am trying to investigate why some URLs are still not in the index, even though they were created or updated a month ago. During the investigation, I found that many URLs have a score of "Infinity". I am using "scoring-opic" in my nutch-default configuration. The missing URL in question, by contrast, has a very low score (1.5514997E-4). I am afraid that it never gets picked for fetching.

*Questions*

1. Is OPIC scoring the best scoring to use for my use case (10-20 domains)? If not, can you recommend another solution that has worked for you?

2. Is the score "Infinity" a bug, or a feature meant to indicate that these are very important pages? When I look at those URLs, they don't look that important to me, and I don't understand how they got such a high score.

Thanks,
Srini

