Hi all, I'm running Nutch 1.6 and Solr 3.6.2, crawling with depth 1, topN 1000000, and 'db.update.additions.allowed' set to false. The idea is to fetch, parse, and index only the URLs in the seed list.
I seed ~120K URLs, but in Solr I see only ~20K indexed. The fetch job counters show:

  moved                         49,937
  robots_denied                  1,149
  robots_denied_maxcrawldelay      267
  hitByTimeLimit                 6,072
  exception                      4,479
  notmodified                        2
  access_denied                      4
  temp_moved                     4,658
  success                       23,033
  notfound                       1,658

and the ParserStatus success count is 22,844.

What happened to the rest of the URLs? They are all active URLs, not some old list.

Thanks,
Amit.
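
For reference, here is a quick tally of the fetch counters above, as a minimal Python sketch. The dictionary only restates the numbers from the job output; the seed count of 120,000 is an approximation of the "~120K" figure, and the variable names are just for illustration.

```python
# Fetch job counters, copied from the job output above.
counters = {
    "moved": 49_937,
    "robots_denied": 1_149,
    "robots_denied_maxcrawldelay": 267,
    "hitByTimeLimit": 6_072,
    "exception": 4_479,
    "notmodified": 2,
    "access_denied": 4,
    "temp_moved": 4_658,
    "success": 23_033,
    "notfound": 1_658,
}

# Assumption: "~120K" seeded URLs is taken as exactly 120,000.
seeded = 120_000

# Total number of URLs the fetcher reported a status for.
total = sum(counters.values())

# URLs that were seeded but do not show up in any fetch counter.
unaccounted = seeded - total

# URLs that answered with a permanent or temporary redirect.
redirected = counters["moved"] + counters["temp_moved"]

print(f"reported by fetcher: {total}")
print(f"not in any counter:  {unaccounted}")
print(f"redirected:          {redirected}")
print(f"fetched OK:          {counters['success']}")
```

The counters add up to 91,259, so roughly 28,700 of the seeded URLs do not appear in any fetch counter at all, and 54,595 of the reported ones are redirects (moved plus temp_moved) rather than successful fetches.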

