Hey, thanks for the response. I had lost hope of getting one. :-)

3. Firstly, yes, I am using Solr for indexing, so what you said makes a lot of sense. For 404 pages, which are not picked up in the crawl, I am doing a manual delete for now, but it is a pain. I am looking for a way to automate this.
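
A rough sketch of how the manual delete might be automated, in Python. It assumes the Solr document id is the page URL (as in the usual Nutch Solr schema) and that the list of dead URLs has already been collected some other way, for example from crawldb entries marked db_gone. The function names and the example update URL are hypothetical; only the `<delete><id>...</id></delete>` message format and the `commit=true` parameter come from Solr's XML update handler.

```python
import urllib.request
from xml.sax.saxutils import escape


def build_delete_payload(urls):
    """Build a Solr XML delete-by-id message for the given URLs.

    Assumes the Solr unique key is the page URL; adjust if your schema differs.
    """
    ids = "".join(f"<id>{escape(u)}</id>" for u in urls)
    return f"<delete>{ids}</delete>"


def delete_from_solr(solr_update_url, urls):
    """POST the delete message to Solr's update handler and commit.

    solr_update_url is hypothetical, e.g. "http://localhost:8983/solr/update".
    Returns the HTTP status code, or None if there was nothing to delete.
    """
    if not urls:
        return None
    req = urllib.request.Request(
        solr_update_url + "?commit=true",
        data=build_delete_payload(urls).encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Calling `delete_from_solr(update_url, dead_urls)` after each crawl would replace the manual step; building the payload separately makes it easy to inspect what would be deleted before anything is sent.
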
2. Re-indexing is not the issue. The issue is that during a re-crawl, even when Nutch picks up a change, I don't see my index getting updated. I don't know what is missing. This only happens in a few cases, however.

1. db.update.additions.allowed is set to true, so I guess things are in place there.

--
View this message in context: http://lucene.472066.n3.nabble.com/Crawling-basic-questions-tp3057896p3084813.html
Sent from the Nutch - User mailing list archive at Nabble.com.

