re-indexing Nutch data (Best Practice?)

BlackIce Sat, 25 Apr 2015 05:55:46 -0700

Hi,

Our search engine is now live as Beta 0.1 at www.enlle.com


We are using Nutch 1.9 to crawl the web and index data to Solr.

Currently we are at over 4 million records, which will increase
dramatically every day!

It has occurred to me that we will be tweaking Solr frequently in order to
improve search results, performance, etc.

Each tweak to Solr's schema.xml will require re-indexing the data into Solr.

Currently I've been using solrindexer to re-add the already crawled data to
Solr, with mixed results: on small data sets it finishes, but on larger ones
it dies halfway through after a few hours of operation.

So the question is: how are other Nutch users dealing with
rebuilding their Solr indexes after a change to Solr?


Thank you!
