I installed Nutch 2.3.1 with MongoDB and have successfully crawled thousands of pages with it.
But suddenly the Nutch 2.3.1 Generator is not generating any URLs. The seed list URLs are accepted (InjectorJob: total number of urls injected after normalization and filtering: 3), and ./bin/nutch parsechecker -dumpText http://xxx.com shows hundreds of URLs.

The Generator output is as follows:

GeneratorJob: starting at 2016-06-09 07:26:15
GeneratorJob: Selecting best-scoring urls due for fetch.
GeneratorJob: starting
GeneratorJob: filtering: false
GeneratorJob: normalizing: false
GeneratorJob: topN: 50000
GeneratorJob: finished at 2016-06-09 07:26:28, time elapsed: 00:00:13
GeneratorJob: generated batch id: 1465471572-2463 containing 0 URLs

What is interesting is that if I delete the webpage collection in the MongoDB nutch database, the crawler works fine, so I'm assuming there's a record in the collection that is causing the issue.

Can anyone recommend how to fix this problem? (I tried deleting every record that doesn't have a status field, but that did not help.)

Many thanks,
Jean

