RE: Change leader in SolrCloud

Gian Maria Ricci - aka Alkampfer Wed, 13 Jan 2016 05:57:43 -0800

Thanks.

--
Gian Maria Ricci
Cell: +39 320 0136949



-----Original Message-----
From: Alessandro Benedetti [mailto:abenede...@apache.org] 
Sent: Tuesday, 12 January 2016 10:52
To: solr-user@lucene.apache.org
Subject: Re: Change leader in SolrCloud

I would like to make a special mention of how the update request processor
chain works in SolrCloud [1]. Quoting the documentation:

> In a distributed SolrCloud setup, all processors in the chain
> *before* the DistributedUpdateProcessor are run on the first node that
> receives an update from the client, regardless of this node's status as
> a leader or replica. The DistributedUpdateProcessor then forwards the
> update to the appropriate shard leader for the update (or to multiple
> leaders in the event of an update that affects multiple documents,
> such as a delete by query, or commit). The shard leader uses a
> transaction log to apply Atomic Updates & Optimistic Concurrency
> <https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents>
> and then forwards the update to all of the shard replicas.
> The leader and each replica run all of the processors in the chain
> that are listed *after* the DistributedUpdateProcessor.
>

This means you need to be careful if you have a heavy update
processor chain running before the DistributedUpdateProcessor.
In that case, the first node that receives the document to be indexed is going
to do much more work (running all the update request processors that come
before the distribution step).
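
To make the split concrete, here is a minimal Python sketch of the behaviour
quoted above. It is only a simulation, not Solr code: the function name, the
node names and the "HeavyEnrichment" processor are made up for illustration,
the processor names are plain string labels, and it assumes that the node
receiving the update hosts a replica of the target shard.

```python
DISTRIB = "DistributedUpdateProcessor"


def processors_run_per_node(chain, shard_nodes, receiving_node):
    """Return {node: [processors it runs]} for one update (simulation only).

    chain          -- ordered processor labels, must contain DISTRIB
    shard_nodes    -- all replicas of the target shard, leader included
    receiving_node -- the node the client sent the update to (assumed here
                      to be one of shard_nodes)
    """
    if receiving_node not in shard_nodes:
        raise ValueError("this sketch assumes the receiving node hosts a replica")
    split = chain.index(DISTRIB)
    pre, post = chain[:split], chain[split + 1:]

    ran = {node: [] for node in shard_nodes}
    # Everything before the distributed step runs once, on the first node
    # that receives the update, whether it is the leader or not.
    ran[receiving_node].extend(pre)
    # Everything after the distributed step runs on the leader and on
    # every other replica of the shard.
    for node in shard_nodes:
        ran[node].extend(post)
    return ran


# "HeavyEnrichment" is a hypothetical expensive custom processor.
chain = ["HeavyEnrichment", DISTRIB, "RunUpdateProcessor"]
for node, procs in processors_run_per_node(
        chain, ["node1", "node2", "node3"], "node2").items():
    print(node, procs)
```

With three replicas and the update arriving at node2, only node2 pays for the
heavy processor, while all three nodes run the part of the chain that comes
after the distribution step.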

All the considerations already mentioned are, of course, still valid.

Cheers


[1]
https://cwiki.apache.org/confluence/display/solr/Update+Request+Processors

On 12 January 2016 at 08:19, Gian Maria Ricci - aka Alkampfer < 
alkamp...@nablasoft.com> wrote:

> Understood, thanks. I thought that the leader sent data to the other
> shards after indexing and autocommit took place, but I now understand
> that this is not the optimal approach. By sending all documents to all
> shards, Solr can guarantee consistency of data.
>
> Now everything is clearer. Thanks for the explanation.
>
> --
> Gian Maria Ricci
> Cell: +39 320 0136949
>
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Tuesday, 12 January 2016 02:27
> To: solr-user <solr-user@lucene.apache.org>
> Subject: Re: Change leader in SolrCloud
>
> bq: It seems to me a huge waste of resources.
>
> How else would you guarantee consistency? Especially taking into
> account Lucene's write-once segments? Master/Slave sidesteps the 
> problem by moving entire, closed segments to the slave, but as Shawn 
> says if the master goes down the slaves don't have _any_ docs from the 
> not-closed segments.
>
> Best,
> Erick
>
> On Mon, Jan 11, 2016 at 1:42 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> > On 1/11/2016 1:23 PM, Gian Maria Ricci - aka Alkampfer wrote:
> >> OK, does this imply that if I have X replicas of a shard, the document
> >> is indexed X+1 times, once for each replica plus the leader shard? It
> >> seems to me a huge waste of resources.
> >>
> >> In a master/slave scenario indexing takes place only on the master
> >> node, then the slaves replicate the analyzed data.
> >
> > The leader *is* a replica.  So if you have a replicationFactor of 
> > three, you have three replicas for each shard.  For each shard, one 
> > of those replicas gets elected to be the leader.  You do not have a 
> > leader and two replicas.
> >
> > The above is perhaps extremely pedantic, but understanding how SolrCloud
> > works requires understanding that being temporarily assigned the 
> > leader role does not change how the replica works, it just adds some 
> > additional coordination responsibilities.
> >
> > To answer your question, let's assume you build an index with 
> > replicationFactor=3.  No new replicas are added, and all machines 
> > are up.  In that situation, each document gets indexed a total of 
> > three times.
> >
> > In return for this additional complexity and resource usage, you 
> > don't have a single point of failure for indexing.  With 
> > master/slave replication, if your master goes down for any length of 
> > time, you must reconfigure all of your remaining Solr nodes to change the 
> > master.
> > Chances are very good that you will experience downtime.
> >
> > Thanks,
> > Shawn
> >
>



--
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
