[
https://issues.apache.org/jira/browse/SOLR-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15367410#comment-15367410
]
Noble Paul commented on SOLR-7280:
----------------------------------
Had a chat with [~shalinmangar] and came up with the following design.
h4. Objectives
* Move away from the current design of an unbounded number of threads for core
loads, which leads to OOM or other issues
* Avoid the leaderVoteWait problem, which leaves shards with no leader for a
long time, or even leaves shards down
Blindly sorting cores based on replica names is not foolproof. It can lead to
deadlocks depending on how the replicas are distributed. The sorting logic
could be as follows.
h5. Core Sorting logic
When a node comes up, it reads the list of live nodes and the state of each
collection it hosts. It constructs a list of the shards ({{collectionName+shardName}})
it hosts, sorted by (number of replicas for that shard on other started nodes +
number of replicas for that shard present on the current node). Ties are broken
by sorting on {{collectionName+shardName}} in alphabetical order. This
ensures that no other node is waiting for some replica on this node to come up.
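A minimal sketch of the ordering described above, to make the sort key and tie-break concrete. The class and field names ({{ShardLoadOrder}}, {{ShardEntry}}) are hypothetical and not part of Solr, and the replica counts are assumed to have been computed already from the live nodes and cluster state. The comment does not say whether the count is sorted ascending or descending; ascending is assumed here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not Solr code: orders the shards hosted on this node.
final class ShardLoadOrder {

    // One shard hosted on this node, with the two replica counts from the design.
    static final class ShardEntry {
        final String collectionName;
        final String shardName;
        final int replicasOnOtherStartedNodes;
        final int replicasOnThisNode;

        ShardEntry(String collectionName, String shardName,
                   int replicasOnOtherStartedNodes, int replicasOnThisNode) {
            this.collectionName = collectionName;
            this.shardName = shardName;
            this.replicasOnOtherStartedNodes = replicasOnOtherStartedNodes;
            this.replicasOnThisNode = replicasOnThisNode;
        }

        // Tie-break key: collectionName+shardName.
        String key() {
            return collectionName + shardName;
        }

        // Primary sort key: replicas on other started nodes + replicas on this node.
        int replicaCount() {
            return replicasOnOtherStartedNodes + replicasOnThisNode;
        }
    }

    // Returns shard keys sorted by replica count (ascending is an assumption),
    // then alphabetically by collectionName+shardName.
    static List<String> loadOrder(List<ShardEntry> hosted) {
        List<ShardEntry> sorted = new ArrayList<>(hosted);
        sorted.sort(Comparator.comparingInt(ShardEntry::replicaCount)
                .thenComparing(ShardEntry::key));
        List<String> keys = new ArrayList<>();
        for (ShardEntry e : sorted) {
            keys.add(e.key());
        }
        return keys;
    }
}
```

If the intended order is descending, calling {{reversed()}} on the first comparator before {{thenComparing}} flips the primary key while keeping the alphabetical tie-break.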
h5. Thread count
The default number of {{coreLoadThreads}} should be much higher for SolrCloud
(maybe 50?). The user should be able to override the value by explicitly
configuring it.
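As a rough illustration of a bounded pool with an overridable default, here is a small sketch. The class {{CoreLoadThreads}} and its methods are hypothetical, not Solr's actual implementation, and the default of 50 is only the tentative figure floated above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper, not Solr code: picks the core-load thread count.
final class CoreLoadThreads {

    // Tentative SolrCloud default from the proposal ("maybe 50"); an assumption.
    static final int ASSUMED_CLOUD_DEFAULT = 50;

    // Uses the explicitly configured value when present, else the assumed default.
    static int resolve(Integer configured) {
        if (configured == null) {
            return ASSUMED_CLOUD_DEFAULT;
        }
        if (configured < 1) {
            throw new IllegalArgumentException("coreLoadThreads must be >= 1");
        }
        return configured;
    }

    // A fixed-size pool caps the number of concurrent core loads,
    // instead of creating one thread per core.
    static ExecutorService newCoreLoadExecutor(Integer configured) {
        return Executors.newFixedThreadPool(resolve(configured));
    }
}
```

With a fixed pool, cores beyond the thread count wait in the executor's queue, so the order in which load tasks are submitted determines which cores start loading first.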
> Load cores in sorted order and tweak coreLoadThread counts to improve cluster
> stability on restarts
> ---------------------------------------------------------------------------------------------------
>
> Key: SOLR-7280
> URL: https://issues.apache.org/jira/browse/SOLR-7280
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Reporter: Shalin Shekhar Mangar
> Assignee: Noble Paul
> Fix For: 5.2, 6.0
>
> Attachments: SOLR-7280.patch
>
>
> In SOLR-7191, Damien mentioned that by loading solr cores in a sorted order
> and tweaking some of the coreLoadThread counts, he was able to improve the
> stability of a cluster with thousands of collections. We should explore some
> of these changes and fold them into Solr.