[
https://issues.apache.org/jira/browse/SOLR-10524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998613#comment-15998613
]
Erick Erickson commented on SOLR-10524:
---------------------------------------
So if I were preparing an "executive summary", there would be several
take-aways:
1> The number of update state operations, i.e. the number of times state is
actually written to ZK, is drastically lower under heavy load: by a factor of
almost 400!
2> One implication here is that the number of state change notifications that
ZK has to send out, and thus the number of times the state gets read by Solr
nodes is _also_ decreased by that same factor. So the fact that the state-read
operations throughput is the same should be evaluated in light of the fact that
there will be many fewer of them.
3> One thing not captured by the numbers is that the size of the Overseer queue
is much less likely to spin out of control due to both <2> and the fact that
we're reading/ordering/processing batches of up to 10,000 messages at once.
4> Even though some of the throughput numbers haven't changed (am_i_leader for
instance), they'll spend much less time waiting to be carried out due to 1-3.
Plus, three points may be enough to make a circle, but they aren't enough data
to make a good generalization from ;)
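To make <1> and <3> concrete, here is a minimal sketch of the batching idea as
I understand it. It is not the code in the patch: the message shape
(collection/replica/state), the function names and the write_state callback
are all made up for illustration, and the only number taken from this issue is
the 10,000-message batch limit. The point is just that a batch is applied in
memory and each touched collection's state is written once per batch rather
than once per message.

```python
from collections import OrderedDict, deque

BATCH_SIZE = 10_000  # batch limit mentioned in this issue


def drain_batch(queue, limit=BATCH_SIZE):
    """Pop up to `limit` messages off the front of the queue, in order."""
    batch = []
    while queue and len(batch) < limit:
        batch.append(queue.popleft())
    return batch


def partition_by_collection(batch):
    """Bucket messages per collection, keeping arrival order in each bucket."""
    buckets = OrderedDict()
    for msg in batch:
        buckets.setdefault(msg["collection"], []).append(msg)
    return buckets


def process_batch(queue, state, write_state):
    """Apply one batch in memory, then write each touched collection once.

    `write_state` is a hypothetical stand-in for the write to ZK.
    Returns the number of state writes issued for this batch.
    """
    buckets = partition_by_collection(drain_batch(queue))
    for collection, msgs in buckets.items():
        coll_state = state.setdefault(collection, {})
        for msg in msgs:
            # Later messages for the same replica overwrite earlier ones.
            coll_state[msg["replica"]] = msg["state"]
        write_state(collection, dict(coll_state))
    return len(buckets)


# Illustrative load only, not the benchmark from this issue:
# 900 messages spread over 3 collections.
queue = deque(
    {"collection": f"coll{i % 3}", "replica": f"r{i % 50}", "state": "active"}
    for i in range(900)
)
writes = []
state = {}
num_writes = process_batch(queue, state, lambda coll, s: writes.append(coll))
```

With this made-up load the 900 queued messages collapse into one state write
per collection, and each write triggers one round of notifications instead of
hundreds. Real ratios will depend on how the messages are spread across
collections.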
Is this fair? Accurate? Complete? I'm looking for something to present to those
users who have seen the Overseer queue grow into the hundreds of thousands of
messages, effectively making their cluster unusable.
Thanks for this work! As collections get larger and larger this has become a
very significant pain-point.
> Explore in-memory partitioning for processing Overseer queue messages
> ---------------------------------------------------------------------
>
> Key: SOLR-10524
> URL: https://issues.apache.org/jira/browse/SOLR-10524
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Erick Erickson
> Attachments: SOLR-10524.patch, SOLR-10524.patch, SOLR-10524.patch,
> SOLR-10524.patch
>
>
> There are several JIRAs (I'll link in a second) about trying to be more
> efficient about processing overseer messages as the overseer can become a
> bottleneck, especially with very large numbers of replicas in a cluster. One
> of the approaches mentioned near the end of SOLR-5872 (15-Mar) was to "read
> large no:of items say 10000. put them into in memory buckets and feed them
> into overseer....".
> This JIRA is to break out that part of the discussion as it might be an easy
> win whereas "eliminating the Overseer queue" would be quite an undertaking.