[
https://issues.apache.org/jira/browse/SOLR-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964796#comment-13964796
]
Anshum Gupta commented on SOLR-5681:
------------------------------------
It's supposed to work in the following cases:
* Calls are async
* Only one call per collection would be processed at a given time (keeping the
mutual exclusion logic simple for now).
I'll also buffer the top-X submitted tasks from the zk queue in memory and
track the in-flight tasks in internal, in-memory maps so that they can be
processed in parallel.
Here's the flow that I'm working on now:
# Read top-X tasks from the zk queue (as opposed to a single task now)
# Process the tasks: hand each one to a new thread or a ThreadPoolExecutor if
it's an ASYNC request, otherwise process it inline (still debating).
Synchronize on the task buffer and the internal maps, and remove the task from
them when it's done.
# Delete the task from zk directly (not using typical queue mechanism).
# Also store more information about the completed ASYNC task in the zk map entry.
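
To make the "one call per collection" rule concrete, here is a minimal
in-memory sketch of how the buffered top-X tasks could be filtered before being
handed to worker threads. It is only an illustration of the idea, not the
patch: QueuedTask, TaskSelector, selectRunnable and markDone are hypothetical
names, and ZooKeeper is left out entirely.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical task record; the real tasks are messages read from the zk queue.
final class QueuedTask {
    final String id;          // assumed unique id, e.g. the queue entry name
    final String collection;  // collection the task operates on

    QueuedTask(String id, String collection) {
        this.id = id;
        this.collection = collection;
    }
}

// Hypothetical in-memory tracker: at most one running task per collection.
final class TaskSelector {
    private final Set<String> busyCollections = new HashSet<>();
    private final Set<String> runningTaskIds = new HashSet<>();

    /** Picks the buffered tasks whose collection is idle and marks them as running. */
    synchronized List<QueuedTask> selectRunnable(List<QueuedTask> topX) {
        List<QueuedTask> runnable = new ArrayList<>();
        for (QueuedTask t : topX) {
            if (runningTaskIds.contains(t.id)) {
                continue; // still in the queue, but already being processed
            }
            if (busyCollections.add(t.collection)) { // true only if the collection was idle
                runningTaskIds.add(t.id);
                runnable.add(t);
            }
        }
        return runnable;
    }

    /** Called by the worker once the task has finished (successfully or not). */
    synchronized void markDone(QueuedTask t) {
        runningTaskIds.remove(t.id);
        busyCollections.remove(t.collection);
    }
}
```

Each task returned by selectRunnable could then be submitted to an
ExecutorService, with markDone called in a finally block before the entry is
deleted from zk.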
> Make the OverseerCollectionProcessor multi-threaded
> ---------------------------------------------------
>
> Key: SOLR-5681
> URL: https://issues.apache.org/jira/browse/SOLR-5681
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Reporter: Anshum Gupta
> Assignee: Anshum Gupta
>
> Right now, the OverseerCollectionProcessor is single-threaded, i.e. submitting
> anything long-running blocks the processing of other, mutually
> exclusive tasks.
> When OCP tasks become optionally async (SOLR-5477), it'd be good to have
> truly non-blocking behavior by multi-threading the OCP itself.
> For example, a ShardSplit call on Collection1 would block the thread and
> thereby prevent a create collection task from being processed (it would stay
> queued in zk), even though the two tasks are mutually exclusive.
> Here are a few of the challenges:
> * Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An
> easy way to handle that is to only let 1 task per collection run at a time.
> * ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue.
> The task from the workQueue is only removed on completion so that in case of
> a failure, the new Overseer can re-consume the same task and retry. A queue
> is not the right data structure in the first place to look ahead i.e. get the
> 2nd task from the queue when the 1st one is in process. Also, deleting tasks
> which are not at the head of a queue is not really an 'intuitive' thing.
> Proposed solutions for task management:
> * Task funnel and peekAfter(): The parent thread is responsible for getting
> and passing the request to a new thread (or one from the pool). The parent
> method uses a peekAfter(last element) instead of a peek(). The peekAfter
> returns the task after the 'last element'. Maintain this request information
> and use it for deleting/cleaning up the workQueue.
> * Another (almost duplicate) queue: While offering tasks to workQueue, also
> offer them to a new queue (call it volatileWorkQueue?). The difference is, as
> soon as a task from this is picked up for processing by the thread, it's
> removed from the queue. At the end, the cleanup is done from the workQueue.
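
As a rough illustration of the peekAfter() proposal quoted above, the sketch
below uses a sorted in-memory map as a stand-in for the zk workQueue.
LookAheadQueue and its methods are hypothetical, and the "qn-" plus
zero-padded sequence naming is an assumption made only so that entries sort in
submission order.

```java
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the zk workQueue; not a real Solr class.
final class LookAheadQueue {
    // Entry name -> payload, kept sorted by name.
    private final TreeMap<String, String> entries = new TreeMap<>();
    private long nextSeq = 0;

    /** Appends a task and returns the name assigned to its queue entry. */
    synchronized String offer(String payload) {
        String name = String.format("qn-%010d", nextSeq++); // assumed naming scheme
        entries.put(name, payload);
        return name;
    }

    /** Returns the entry after lastName, or the head if lastName is null; null if there is none. */
    synchronized String peekAfter(String lastName) {
        if (lastName == null) {
            return entries.isEmpty() ? null : entries.firstKey();
        }
        return entries.higherKey(lastName);
    }

    /** Returns the payload stored under the given entry name, or null. */
    synchronized String payload(String name) {
        return entries.get(name);
    }

    /** Deletes a specific entry, wherever it sits in the queue. */
    synchronized boolean remove(String name) {
        return entries.remove(name) != null;
    }
}
```

The parent thread would remember the last entry name it handed out and call
peekAfter with it, while each worker removes its own entry on completion, so
the head of the queue can stay in place while later tasks are picked up.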