Thanks for the quick response, Markus! How would that fit into this continuous crawling scenario? (I am trying to get the updates into Solr as quickly as possible :-)
If I am doing the generate --> fetch $SEGMENT --> parse $SEGMENT --> updatedb crawldb $SEGMENT --> solrindex --> solrdedup cycle, and I happen to be generating an "on the fly" segment (not yet done) at the moment the updatedb command runs (after changing it to the -dir option), isn't that bad? Has anyone tested the mergedb command with potentially hundreds and hundreds of crawldbs to merge (one per changed URL)?
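For reference, here is a rough sketch of the cycle I mean; the crawl/ paths, the Solr URL, and the exact argument order (which varies a bit between Nutch versions) are just my assumptions:

  #!/bin/bash
  # One pass of the continuous crawl cycle (Nutch 1.x command line, assumed layout).
  CRAWLDB=crawl/crawldb
  SEGMENTS=crawl/segments
  SOLR=http://localhost:8983/solr

  bin/nutch generate $CRAWLDB $SEGMENTS -topN 1000
  SEGMENT=$SEGMENTS/$(ls $SEGMENTS | sort | tail -1)   # segment dirs are timestamps, so this is the one just generated

  bin/nutch fetch $SEGMENT
  bin/nutch parse $SEGMENT

  # This is the step in question: if another "on the fly" generate is still
  # writing a new segment at this moment, is switching to
  #   bin/nutch updatedb $CRAWLDB -dir $SEGMENTS
  # safe, or will it pick up the half-written segment?
  bin/nutch updatedb $CRAWLDB $SEGMENT

  bin/nutch solrindex $SOLR $CRAWLDB crawl/linkdb $SEGMENT
  bin/nutch solrdedup $SOLR

  # And the mergedb question would be something along the lines of:
  #   bin/nutch mergedb merged_crawldb crawldb_1 crawldb_2 ...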