Here is my POC to add a queue to CoreAdminHandler:
https://github.com/apache/solr/pull/1761

It does the following:
- Add a flag so core admin operations can be marked as expensive. For now,
only backup and restore are marked expensive; this may be extended later.
- In CoreAdminHandler, we count the number of in-flight expensive
operations. Once the limit (currently 5 by default) is reached, we don't
submit new ones to the thread pool; instead we add them to a queue (see
the sketch after this list).
- Each time an expensive operation completes, it starts the next one from
the queue, if any.
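
To make the queueing concrete, here is a minimal sketch of the gating
logic. The names are illustrative, not the actual code from the PR:

import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ExecutorService;

// Minimal sketch of the gating logic (illustrative names; the real code
// is in the PR).
class ExpensiveOpGate {
  private final ExecutorService executor; // the CoreAdminHandler pool
  private final int maxInFlight;          // the limit, 5 by default
  private final Queue<Runnable> pending = new ArrayDeque<>();
  private int inFlight = 0;

  ExpensiveOpGate(ExecutorService executor, int maxInFlight) {
    this.executor = executor;
    this.maxInFlight = maxInFlight;
  }

  // Called for operations flagged as expensive (backup/restore for now).
  synchronized void submit(Runnable op) {
    if (inFlight < maxInFlight) {
      inFlight++;
      executor.submit(() -> runThenStartNext(op));
    } else {
      pending.add(op); // over the limit: queue instead of submitting
    }
  }

  private void runThenStartNext(Runnable op) {
    try {
      op.run();
    } finally {
      startNext();
    }
  }

  // On completion, hand the slot to the next queued operation, if any.
  private synchronized void startNext() {
    Runnable next = pending.poll();
    if (next != null) {
      executor.submit(() -> runThenStartNext(next)); // slot stays taken
    } else {
      inFlight--;
    }
  }
}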

Let me know what you think
Thanks

On Thu, Jun 29, 2023 at 3:37 PM Pierre Salagnac <pierre.salag...@gmail.com>
wrote:

> Jason, I haven't done much scalability testing, so it's hard to give
> accurate numbers on when we start having issues.
> For the environment I looked at in detail, we run a 16-node cluster, and
> the collection I wasn't able to back up has about 1500 shards, ~1.5 GB
> each.
>
> Core backups/restores are expensive calls compared to other admin calls
> like an empty core creation. I'm not sure of the full list of expensive
> and non-expensive operations, but we may also have a fairness issue where
> expensive operations block the non-expensive ones for a while.
>
> I had a great discussion with David.
> I see two options for a long-term fix:
>
> 1. Throttle core backups/restores at the top level (in the overseer).
> My approach was to avoid sending too many concurrent requests from
> BackupCmd. I have a decent POC for backups, but it should be refactored
> to share this mechanism across all admin operations.
> This will be harder to achieve in distributed mode (no overseer), since
> we'll need a central place to count how many backups are in-flight. We
> may somehow lock in ZooKeeper for this, so it may end up overly complex.
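>
> As an illustration, a ZooKeeper-backed semaphore could count in-flight
> backups across the cluster. A hypothetical sketch using Apache Curator's
> recipe (Solr itself doesn't use Curator; the path and limit are made up):
>
> import org.apache.curator.framework.CuratorFramework;
> import org.apache.curator.framework.CuratorFrameworkFactory;
> import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreV2;
> import org.apache.curator.framework.recipes.locks.Lease;
> import org.apache.curator.retry.ExponentialBackoffRetry;
>
> // Hypothetical sketch: a cluster-wide cap on concurrent core backups via
> // a ZooKeeper-backed semaphore. In real code the client would be shared
> // and closed properly.
> class ClusterWideBackupCap {
>   void backupWithCap(Runnable coreBackup) throws Exception {
>     CuratorFramework client = CuratorFrameworkFactory.newClient(
>         "zkhost:2181", new ExponentialBackoffRetry(1000, 3));
>     client.start();
>     InterProcessSemaphoreV2 permits =
>         new InterProcessSemaphoreV2(client, "/solr/backup-permits", 10);
>     Lease lease = permits.acquire(); // blocks until a slot frees up cluster-wide
>     try {
>       coreBackup.run();             // the actual core snapshot work
>     } finally {
>       permits.returnLease(lease);   // free the slot for another node
>     }
>   }
> }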
>
>
> 2. Throttle in each node.
> - Currently, all async admin operations are handled by a
> ThreadPoolExecutor defined in the CoreAdminHandler class. This pool is
> hardcoded with a size of 50, so if we receive more than 50 concurrent
> tasks, we add them to the queue of the executor. For a large collection
> backup, each node immediately starts 50 concurrent core snapshots.
> - Instead of immediately submitting to the executor, I think we should
> manage our own queue here for expensive operations. By counting the
> number of in-flight backups/restores, we don't submit more than (let's
> say) 10 concurrent backups to the executor. Each time one completes, we
> poll the queue and start the next one if appropriate (see the sketch
> after this list).
> - This ensures fairness between expensive and non-expensive operations.
> Non-expensive ones will always have at least 40 threads available, so
> they are handled quickly. And this works well in distributed mode since
> the overseer is not involved.
> - This could be extended to define more than one queue, but I'm not sure
> it is worth it.
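>
> A rough sketch of this per-node gate, with illustrative names (not
> actual Solr code):
>
> import java.util.Queue;
> import java.util.concurrent.ConcurrentLinkedQueue;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Semaphore;
>
> // Rough sketch: park expensive operations beyond a permit count, and
> // drain the parked queue as permits free up (illustrative names).
> class ExpensiveOpThrottle {
>   private final ExecutorService pool;                  // the admin pool
>   private final Semaphore permits = new Semaphore(10); // max in-flight
>   private final Queue<Runnable> parked = new ConcurrentLinkedQueue<>();
>
>   ExpensiveOpThrottle(ExecutorService pool) { this.pool = pool; }
>
>   void submitExpensive(Runnable op) {
>     parked.add(op);
>     drain();
>   }
>
>   // Start parked operations while permits are available.
>   private void drain() {
>     while (permits.tryAcquire()) {
>       Runnable next = parked.poll();
>       if (next == null) { permits.release(); return; } // nothing parked
>       pool.submit(() -> {
>         try { next.run(); }
>         finally { permits.release(); drain(); } // free the slot, pull next
>       });
>     }
>   }
> }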
>
> Pierre
>
>
> On Tue, Jun 27, 2023 at 7:16 PM David Smiley <dsmi...@apache.org> wrote:
>
>> Here's a POC: https://github.com/apache/solr/pull/1729
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Mon, Jun 19, 2023 at 3:36 PM David Smiley <dsmi...@apache.org> wrote:
>>
>> > Has anyone mitigated the potentially large IO impact of doing a backup
>> > of a large collection, or just in general?  If the collection is large
>> > enough, there very well could be many shards on one host and it could
>> > saturate the IO.  I wonder if there should be a rate limit mechanism or
>> > some other mechanism.
>> >
>> > Not the same, but I know that at the segment level, merges are rate
>> > limited -- ConcurrentMergeScheduler doesn't quite let you set it but
>> > adjusts itself automatically (the "ioThrottle" boolean).
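>> >
>> > For illustration, something like Lucene's SimpleRateLimiter could cap
>> > a backup's copy throughput. A hypothetical sketch, not how Solr's
>> > backup code is wired today:
>> >
>> > import java.io.IOException;
>> > import java.io.InputStream;
>> > import java.io.OutputStream;
>> > import org.apache.lucene.store.RateLimiter;
>> >
>> > // Hypothetical sketch: cap copy throughput at mbPerSec using Lucene's
>> > // SimpleRateLimiter.
>> > class ThrottledCopy {
>> >   static void copy(InputStream in, OutputStream out, double mbPerSec)
>> >       throws IOException {
>> >     RateLimiter limiter = new RateLimiter.SimpleRateLimiter(mbPerSec);
>> >     byte[] buf = new byte[8192];
>> >     long sinceLastPause = 0;
>> >     int n;
>> >     while ((n = in.read(buf)) != -1) {
>> >       out.write(buf, 0, n);
>> >       sinceLastPause += n;
>> >       if (sinceLastPause >= limiter.getMinPauseCheckBytes()) {
>> >         limiter.pause(sinceLastPause); // sleep if ahead of the rate
>> >         sinceLastPause = 0;
>> >       }
>> >     }
>> >   }
>> > }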
>> >
>> > ~ David Smiley
>> > Apache Lucene/Solr Search Developer
>> > http://www.linkedin.com/in/davidwsmiley
>> >
>>
>
