Thanks for the quick response, Markus! How would that fit into this continuous crawling scenario? (I am trying to get the updates into Solr as quickly as possible :-)
If I am doing the generate --> fetch $SEGMENT --> parse $SEGMENT --> updatedb crawldb $SEGMENT --> solrindex --> solrdedup cycle, and I happen to be generating an "on the fly" segment (not yet done) at the moment the updatedb command runs (after changing it to the -dir option), isn't that bad? Has anyone tested the mergedb command with potentially hundreds and hundreds of crawldbs to merge (one per changed URL)?
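For reference, here is a rough sketch of the cycle I mean; the crawl/ paths, the Solr URL, and the exact argument order (which varies a bit between Nutch versions) are just my assumptions:

  #!/bin/bash
  # One pass of the continuous crawl cycle (Nutch 1.x command line, assumed layout).
  CRAWLDB=crawl/crawldb
  SEGMENTS=crawl/segments
  SOLR=http://localhost:8983/solr

  bin/nutch generate $CRAWLDB $SEGMENTS -topN 1000
  SEGMENT=$SEGMENTS/$(ls $SEGMENTS | sort | tail -1)   # segment dirs are timestamps, so this is the one just generated

  bin/nutch fetch $SEGMENT
  bin/nutch parse $SEGMENT

  # This is the step in question: if another "on the fly" generate is still
  # writing a new segment at this moment, is switching to
  #   bin/nutch updatedb $CRAWLDB -dir $SEGMENTS
  # safe, or will it pick up the half-written segment?
  bin/nutch updatedb $CRAWLDB $SEGMENT

  bin/nutch solrindex $SOLR $CRAWLDB crawl/linkdb $SEGMENT
  bin/nutch solrdedup $SOLR

  # And the mergedb question would be something along the lines of:
  #   bin/nutch mergedb merged_crawldb crawldb_1 crawldb_2 ...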