Re: Separating Search and Indexing in SolrCloud

Jaroslaw Rozanski Fri, 16 Dec 2016 13:06:12 -0800

Thanks,


On 16/12/16 20:56, Shawn Heisey wrote:
> On 12/16/2016 5:43 AM, Jaroslaw Rozanski wrote:
>> The leader is responsible for distributing update requests to the
>> replicas, so eventually all replicas have the same state as the leader.
>> That is not a problem; my concern is the performance cost. If I
>> understand correctly, normal replication happens via standard update
>> requests, not by, say, copying segments.
> 
> For SolrCloud, yes.  The master/slave replication that existed before
> SolrCloud does work by copying segment files, but SolrCloud does not
> work that way.  The old master/slave replication feature IS used by
> SolrCloud, but ONLY for index recovery -- copying the entire index from
> the leader to another replica in the event that the replica gets so far
> behind that it cannot be brought current by regular updates and/or the
> transaction log.  This is also used to make new replicas.
> 
>> Hence, if my understanding is correct, sending search requests only to
>> replicas in an indexing-heavy environment would bring no benefit.
> 
> Correct, it would have no benefit.  There's something else: when you
> send queries to SolrCloud, they do not necessarily stay on the node
> where you sent them.  By default, multiple query requests are load
> balanced across the cloud, so they'll hit the leader anyway, even if you
> never send them to the leader.

With a custom Solr client, the above logic no longer applies in my case. I
can easily control which replica/core in a shard my query is directed to
(by also passing distrib=false).
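
A minimal sketch of such a direct query, for illustration only. The host
name, core name, and helper function are hypothetical; only the
distrib=false parameter comes from the discussion above. It just builds
the request URL for one specific core:

```python
from urllib.parse import urlencode


def replica_query_url(replica_base_url: str, core: str, query: str) -> str:
    """Build a /select URL aimed at one specific core.

    distrib=false asks that core to answer from its own index only,
    instead of fanning the request out across the cloud.
    """
    params = urlencode({"q": query, "distrib": "false", "wt": "json"})
    return f"{replica_base_url.rstrip('/')}/{core}/select?{params}"


# Hypothetical node and core names, chosen only for the example.
url = replica_query_url(
    "http://solr-node2:8983/solr",
    "collection1_shard1_replica2",
    "title:solr",
)
print(url)
```

The client still has to decide which core to target, for example by
reading the cluster state and skipping the current leader.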

>> So the question is: is there a mechanism in SolrCloud (not a legacy
>> master/slave set-up) to make one node take the indexing load while the
>> other nodes focus on searching?
> 
> Indexing will always be done by all replicas, including the leader.
> 
> Something to mention, although it doesn't accomplish what you're after: 
> There is a preferLocalShards parameter that you can send with your query
> to keep SolrCloud from doing its load balancing *if* the query can be
> satisfied from local indexes.  This parameter should only be used in one
> of the following situations:
> 
> * Your query rate is very low.
> * You are already load balancing the requests yourself.
> 
> If the preferLocalShards parameter is used in other situations, it can
> end up concentrating a large number of requests onto some replicas and
> leaving the other replicas idle.
> 
> https://cwiki.apache.org/confluence/display/solr/Distributed+Requests#DistributedRequests-PreferLocalShards
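
For reference, a rough sketch of a request that sets the parameter
described above. The node name, collection name, and helper function are
assumptions made up for the example; only preferLocalShards itself comes
from the quoted text:

```python
from urllib.parse import urlencode


def prefer_local_query_url(node_base_url: str, collection: str, query: str) -> str:
    """Build a /select URL that asks the receiving node to prefer its
    own local replicas when they can satisfy the query."""
    params = urlencode({"q": query, "preferLocalShards": "true"})
    return f"{node_base_url.rstrip('/')}/{collection}/select?{params}"


# Hypothetical node and collection names.
local_url = prefer_local_query_url(
    "http://solr-node1:8983/solr", "collection1", "title:solr"
)
print(local_url)
```

As noted in the quoted advice, this only makes sense when the caller
already spreads requests across nodes or the query rate is very low.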


Yep, already solved. I am more concerned that the memory requirements of
high-volume indexing will affect search performance and/or cluster
stability.

> Thanks,
> Shawn
> 



-- 
Jaroslaw Rozanski | e: m...@jarekrozanski.com
695E 436F A176 4961 7793  5C70 AFDF FB5E 682C 4D3D
