[ 
https://issues.apache.org/jira/browse/SOLR-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14135171#comment-14135171
 ] 

Ramkumar Aiyengar commented on SOLR-6491:
-----------------------------------------

These are my concerns with the leadership mechanism as it currently stands, 
with no balancing (which can result in leaders all ganging up on one set of 
machines). They are based on experience with an NRT system with a fairly 
high indexing rate, a very low commit interval, and hundreds of shards (50+ on 
each machine).

 * The biggest performance issue arises not during normal indexing but when 
some replicas are recovering. In that case, the machines hosting the leaders 
have to service 50+ IO-intensive recovery operations, and indexing can really 
take a hit during this time (we have seen indexing latency increase several 
times over).
 ** SOLR-6485 improves this situation somewhat, but it is really a compromise. 
It increases the time taken for recovery when the IO load could instead be 
spread across different machines. It doesn't prevent "spikiness" (you hit IO 
hard for a few hundred ms, then stay quiet for a few hundred ms more). And it 
is risky in a cloud environment because recovery can be spontaneous (say, 
after a ZK disconnect): in such a case the system is already vulnerable due to 
unplanned, reduced capacity, and this prolongs that situation.
 * The Overseer is hit harder when a machine with leaders dies or goes down, 
or when there's a ZK expiry on a Solr instance whose cores are all leaders. 
Many more elections happen at the same time and, despite the various recent 
improvements to the Overseer, it is ultimately bound by how fast ZK can 
respond. This in turn lengthens the time replicas spend without seeing a 
leader, and hence ingestion slows down considerably.
 ** A lesser case of this is when a single instance encounters a ZK expiry: if 
all the leaders gang up in one place, every one of its cores needs a 
re-election.
 * If the machine containing the leaders dies, there's an ephemeral node 
timeout that affects indexing in general even before elections kick in. 
This is a lot worse (affects a lot more documents) if leadership is 
concentrated on one machine.
 * Even if instances on a 'leader' machine shut down in an orderly way, 
there's a delay between the instance shutting down and the instance losing its 
leadership because of the servlet model we are currently tied to (the container 
first refuses connections, then gets the servlet to deal with it). Having 
leaders in one place leads to more documents being affected by this. I agree, 
however, that this could potentially be solved by other mechanisms, for 
example by having a separate handler, called by a script prior to shutdown, 
that forces cores to let go of leadership, or ideally by getting rid of the 
servlet model, as is the long-term plan.
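
To make the recovery point concrete, here is a rough sketch of how leader 
placement decides where recovery IO lands. It assumes each recovering replica 
pulls from its shard's leader; the shard and node names are made up for 
illustration, and the cluster is shrunk to 6 shards on 3 nodes.

```python
from collections import Counter


def recovery_sources(shard_leaders, recovering_shards):
    """Count, per node, how many recovering replicas would pull from a leader on it.

    shard_leaders maps shard name -> node hosting that shard's leader.
    Assumption: a recovering replica always recovers from its shard's leader.
    """
    return Counter(shard_leaders[shard] for shard in recovering_shards)


# Hypothetical cluster: 6 shards, 3 nodes.
shards = [f"shard{i}" for i in range(1, 7)]

# All leaders ganged up on node1 (e.g. because node1 was started first).
ganged = {shard: "node1" for shard in shards}

# Leaders spread round-robin across node1..node3.
balanced = {shard: f"node{i % 3 + 1}" for i, shard in enumerate(shards)}

# One replica of every shard goes into recovery at the same time.
print("ganged:  ", dict(recovery_sources(ganged, shards)))
print("balanced:", dict(recovery_sources(balanced, shards)))
```

With the leaders ganged up, node1 is the source for all six recoveries; with 
the leaders balanced, each node serves two.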

> Add preferredLeader as a ROLE and a collections API command to respect this 
> role
> --------------------------------------------------------------------------------
>
>                 Key: SOLR-6491
>                 URL: https://issues.apache.org/jira/browse/SOLR-6491
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 4.11, 5.0
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>
> Leaders can currently get out of balance due to the sequence of how nodes are 
> brought up in a cluster. For very good reasons shard leadership cannot be 
> permanently assigned.
> However, it seems reasonable that a sys admin could optionally specify that a 
> particular node be the _preferred_ leader for a particular collection/shard. 
> During leader election, preference would be given to any node so marked when 
> electing any leader.
> So the proposal here is to add another role for preferredLeader to the 
> collections API, something like
> ADDROLE?role=preferredLeader&collection=collection_name&shard=shardId
> Second, it would be good to have a new collections API call like 
> ELECTPREFERREDLEADERS?collection=collection_name
> (I really hate that name so far, but you see the idea). That command would 
> (asynchronously?) make an attempt to transfer leadership for each shard in a 
> collection to the leader labeled as the preferred leader by the new ADDROLE 
> role.
> I'm going to start working on this, any suggestions welcome!
> This will subsume several other JIRAs, I'll link them momentarily.
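
For illustration only, the two proposed calls could be built as request URLs 
along these lines. This is a sketch under assumptions: the host and port are 
placeholders, the command names are assumed to be passed through the 
Collections API "action" parameter, and neither the preferredLeader role nor 
ELECTPREFERREDLEADERS exists yet (the name of the latter is explicitly 
provisional). The proposal does not say how the preferred node itself is 
identified, so no such parameter is shown.

```python
from urllib.parse import urlencode

# Assumed endpoint; adjust host, port and context path for a real cluster.
COLLECTIONS_API = "http://localhost:8983/solr/admin/collections"


def add_preferred_leader_url(collection, shard, base=COLLECTIONS_API):
    """Build the proposed ADDROLE request for the (not yet existing) preferredLeader role."""
    params = {
        "action": "ADDROLE",
        "role": "preferredLeader",  # proposed role name from this issue
        "collection": collection,
        "shard": shard,
    }
    return f"{base}?{urlencode(params)}"


def elect_preferred_leaders_url(collection, base=COLLECTIONS_API):
    """Build the proposed request asking each shard to move leadership to its preferred leader."""
    params = {
        "action": "ELECTPREFERREDLEADERS",  # provisional command name
        "collection": collection,
    }
    return f"{base}?{urlencode(params)}"


print(add_preferred_leader_url("collection_name", "shardId"))
print(elect_preferred_leaders_url("collection_name"))
```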



