Hi Tugcem,

Great! Could you attach a patch to NUTCH-945? See https://wiki.apache.org/nutch/HowToContribute for instructions.
Thanks

Julien

On 5 July 2013 15:23, Tuğcem Oral <[email protected]> wrote:

> Hi all,
>
> There are several issues and patches about indexing Nutch segments to
> multiple Solr servers. However,
> NUTCH-1377 <https://issues.apache.org/jira/browse/NUTCH-1377> and
> NUTCH-1480 <https://issues.apache.org/jira/browse/NUTCH-1480> only
> cover indexing the *same* document set to different Solr servers. Also,
> NUTCH-945 <https://issues.apache.org/jira/browse/NUTCH-945> provides a
> patch for partitioning documents with a murmur hash partitioner and
> indexing them to different Solr servers, but for Nutch version 2.x. As
> we're still using Nutch 1.6 (w/o Gora), I couldn't find a valid patch,
> so I wrote my own. I tested it by indexing ~8M crawled documents to a
> Solr cluster with 4 shards (each with 1 replica, 8 nodes in total), and
> it scaled quite well. I'd love to share this patch with you guys.
>
> Best,
>
> Tugcem Oral
>
> --
> TO
>

--
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
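For readers following the thread, the partitioning idea mentioned in the quoted message (hash each document and send it to one of several Solr servers) can be sketched roughly as below. This is only an illustration and not the code from the NUTCH-945 patch or from Tugcem's patch: the class name `ShardPartitioner`, its methods, and the server URLs are hypothetical, and CRC32 from the JDK stands in for the murmur hash so that the example needs no extra libraries.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical helper: maps a document id (e.g. its URL) to one Solr server.
class ShardPartitioner {
    private final List<String> solrUrls;

    ShardPartitioner(List<String> solrUrls) {
        if (solrUrls.isEmpty()) {
            throw new IllegalArgumentException("at least one Solr URL is required");
        }
        this.solrUrls = new ArrayList<>(solrUrls);
    }

    // Returns a shard index in [0, numShards). CRC32 is an assumption made
    // for this sketch; the patches discussed above use a murmur hash.
    static int shardFor(String docId, int numShards) {
        CRC32 crc = new CRC32();
        crc.update(docId.getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative long, so the remainder is non-negative.
        return (int) (crc.getValue() % numShards);
    }

    // The same id always maps to the same server, so re-indexing a document
    // overwrites it on the shard that already holds it.
    String urlFor(String docId) {
        return solrUrls.get(shardFor(docId, solrUrls.size()));
    }

    public static void main(String[] args) {
        // Placeholder server URLs, not real hosts.
        ShardPartitioner partitioner = new ShardPartitioner(Arrays.asList(
                "http://solr1.example.org:8983/solr",
                "http://solr2.example.org:8983/solr",
                "http://solr3.example.org:8983/solr",
                "http://solr4.example.org:8983/solr"));
        for (String id : new String[] {"http://example.com/a", "http://example.com/b"}) {
            System.out.println(id + " -> " + partitioner.urlFor(id));
        }
    }
}
```

Note that with a plain modulo scheme like this, changing the number of servers moves most documents to a different shard, so the server list has to stay fixed between indexing runs.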

