Hi all,

There are several issues and patches about indexing Nutch segments to
multiple Solr servers. However,
NUTCH-1377 <https://issues.apache.org/jira/browse/NUTCH-1377>
and NUTCH-1480 <https://issues.apache.org/jira/browse/NUTCH-1480> cover
indexing the *same* document set to different Solr servers. Also,
NUTCH-945 <https://issues.apache.org/jira/browse/NUTCH-945> provides a
patch that partitions documents with a murmur hash partitioner and indexes
them to different Solr servers, but for Nutch version 2.x. As we're still
using Nutch 1.6 (w/o Gora), I couldn't find a suitable patch, so I wrote my
own. I tested it by indexing ~8M crawled documents to a Solr cluster with 4
shards (each with 1 replica, 8 nodes in total), and it scaled quite well.
I'd love to share this patch with you.
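
For anyone unfamiliar with the idea, hash partitioning of documents across
shards can be sketched roughly as below. This is only an illustration, not
the patch itself: it uses MD5 from the Python standard library in place of
murmur hash, and the names shard_for, partition and SHARD_URLS (with its
example.org hosts) are hypothetical.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard index in [0, num_shards).

    MD5 stands in for murmur hash here (an assumption for illustration);
    any stable hash gives the same id the same shard on every run.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(doc_ids, shard_urls):
    """Group document ids by the shard URL they hash to."""
    buckets = {url: [] for url in shard_urls}
    for doc_id in doc_ids:
        url = shard_urls[shard_for(doc_id, len(shard_urls))]
        buckets[url].append(doc_id)
    return buckets


# Hypothetical shard endpoints, one per shard of a 4-shard cluster.
SHARD_URLS = [f"http://solr{i}.example.org:8983/solr" for i in range(4)]
```

Each bucket would then be sent to its own Solr server, so every document
ends up on exactly one shard rather than on all of them.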

Best,

Tugcem Oral

-- 
TO
