I was thinking of using the Last-Modified header, but it may be absent. In that case we could fall back to the signature of each URL's content at indexing time. I took a look at the code, and it seems this is implemented but not working. I tested nutch-1.4 with a single URL: solrindexer always sends the same number of documents to Solr, even though none of the URLs has changed.
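To make the idea concrete, here is a rough sketch of the check I have in mind. It is not Nutch's actual code: the function names, the in-memory `seen` dictionary, and the choice of MD5 are all assumptions for illustration only.

```python
import hashlib
from typing import Dict, Optional, Tuple

# Hypothetical store: url -> (last_modified, signature) from the previous run.
Seen = Dict[str, Tuple[Optional[str], str]]


def content_signature(content: bytes) -> str:
    """Digest of the fetched content (MD5 is only an illustrative choice)."""
    return hashlib.md5(content).hexdigest()


def needs_reindex(url: str, content: bytes,
                  last_modified: Optional[str], seen: Seen) -> bool:
    """Return True if the document should be sent to the index again."""
    sig = content_signature(content)
    prev = seen.get(url)
    seen[url] = (last_modified, sig)  # remember state for the next run

    if prev is None:
        return True  # never indexed before

    prev_modified, prev_sig = prev
    # Prefer Last-Modified when both runs have it.
    if last_modified is not None and prev_modified is not None:
        return last_modified != prev_modified
    # Header absent: fall back to comparing content signatures.
    return sig != prev_sig
```

With something like this, a second run over an unchanged URL would skip the document instead of sending it to Solr again.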
Thanks.
Alex.

