Hi,

Well, I see that 1377 is closely related to 1480, and both deal with indexing
the "same" documents to multiple Solr servers within a cluster for Nutch v1.6.
Actually, the crucial point is partitioning the documents over different
Solr instances, which is discussed in 945. But the MurmurHashPartitioner
patch provided in 945 is for Nutch v2.2. What I wrote is a kind of
combination of these two patches: partitioning and indexing for v1.6.
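
As a rough illustration of the partitioning idea only (this is not the patch
itself, and not the MurmurHashPartitioner from 945), the sketch below maps a
document id such as its URL to one of N Solr instances. The class and method
names are hypothetical, and FNV-1a is used as a simple stand-in for the murmur
hash; any stable hash of the id would serve the same purpose.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Nutch or of any patch in this thread.
final class ShardPartitioner {

    // Returns a shard index in [0, numShards) for the given document id.
    // Uses 32-bit FNV-1a as a stand-in hash (assumption: NUTCH-945 uses murmur).
    static int partition(String docId, int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        int h = 0x811c9dc5;                    // FNV-1a 32-bit offset basis
        for (byte b : docId.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x01000193;                   // FNV-1a 32-bit prime
        }
        return Math.floorMod(h, numShards);    // floorMod keeps the result non-negative
    }

    // Groups document ids into one bucket per shard, so that each bucket
    // could then be sent to its own Solr instance.
    static List<List<String>> group(List<String> docIds, int numShards) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < numShards; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String id : docIds) {
            buckets.get(partition(id, numShards)).add(id);
        }
        return buckets;
    }
}
```

Because the shard index depends only on the id, the same document always
lands on the same instance, so re-indexing it updates one copy instead of
creating duplicates on other instances.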

I will probably attach the patch on Monday; the original source code is on
my other computer.

Best,

Tugcem Oral



On Fri, Jul 5, 2013 at 5:48 PM, Markus Jelsma <[email protected]> wrote:

> Hi,
>
> 1480 and 1377 are different. We already use CloudSolrServer (I haven't
> added the patch yet) but also use 1480 to write to multiple Solr clusters!
> Both still need patches and I haven't had time yet to provide them,
> although we already use both features in our Nutch.
>
> I'll try to find some time next week, should be easy.
>
> Cheers
>
>
>
> -----Original message-----
> > From:Tuğcem Oral <[email protected]>
> > Sent: Friday 5th July 2013 16:24
> > To: [email protected]
> > Subject: Indexing from nutch 1.6 to solr 4.3.1 cloud
> >
> > Hi all,
> >
> > There are several issues and patches about indexing Nutch segments to
> > multiple Solr servers. However,
> > NUTCH-1377 <https://issues.apache.org/jira/browse/NUTCH-1377>
> > and NUTCH-1480 <https://issues.apache.org/jira/browse/NUTCH-1480> consider
> > indexing the *same* document set to different Solr servers. Also,
> > NUTCH-945 <https://issues.apache.org/jira/browse/NUTCH-945> provides a
> > patch for partitioning documents with a murmur hash partitioner and indexing
> > them to different Solr servers, but for Nutch version 2.x. As we're still
> > using Nutch 1.6 (w/o Gora), I couldn't find a valid patch, thus I wrote my
> > own. I tested indexing ~8M crawled documents to a Solr cluster with 4 shards
> > (each with 1 replica, total 8 nodes), and it scaled quite well. I'd love to
> > share this patch with you guys.
> >
> > Best,
> >
> > Tugcem Oral
> >
> > --
> > TO
>



-- 
TO
