Anshum Gupta created SOLR-5681:
----------------------------------

             Summary: Make the OverseerCollectionProcessor multi-threaded
                 Key: SOLR-5681
                 URL: https://issues.apache.org/jira/browse/SOLR-5681
             Project: Solr
          Issue Type: Improvement
          Components: SolrCloud
            Reporter: Anshum Gupta
            Assignee: Anshum Gupta


Right now, the OverseerCollectionProcessor (OCP) is single-threaded, i.e. 
submitting anything long-running blocks the processing of other, mutually 
exclusive (i.e. non-conflicting) tasks.
With OCP tasks becoming async, it'd be good to have truly non-blocking behavior 
by multi-threading the OCP itself.

For example, a ShardSplit call on Collection1 would block the thread, so a 
create collection task would not be processed (it would stay queued in ZK) 
even though the two tasks are mutually exclusive.

Here are a few of the challenges:
* Mutual exclusivity: Only let mutually exclusive tasks run in parallel. An 
easy way to handle that is to only let 1 task per collection run at a time.
* ZK Distributed Queue to feed tasks: The OCP consumes tasks from a queue. A 
task is only removed from the workQueue on completion so that, in case of a 
failure, the new Overseer can re-consume the same task and retry. A queue is 
not the right data structure for looking ahead in the first place, i.e. for 
getting the 2nd task from the queue while the 1st one is in progress. Also, 
deleting tasks that are not at the head of a queue is not really an 
'intuitive' thing.
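
To make the "one task per collection" rule concrete, here is a minimal sketch 
of how the dispatching thread could gate tasks. CollectionTaskGate and its 
methods are hypothetical names made up for illustration; they are not 
existing Solr classes, and the real change may well look different.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not part of Solr): remembers which collections have a
// task in flight so that at most one task per collection runs at a time.
class CollectionTaskGate {
    private final Set<String> busyCollections = ConcurrentHashMap.newKeySet();

    // Returns true if no task is running for this collection; the caller may
    // then start one. Returns false if the collection is already busy.
    public boolean tryAcquire(String collection) {
        return busyCollections.add(collection);
    }

    // Must be called when the task finishes, whether it succeeded or failed,
    // otherwise the collection stays blocked.
    public void release(String collection) {
        busyCollections.remove(collection);
    }
}
```

With something like this, a ShardSplit on Collection1 would hold the gate for 
"Collection1" only, and a create task for a different collection could be 
handed to another thread right away.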

Proposed solutions for task management:
* Task funnel and peekAfter(): The parent thread is responsible for getting and 
passing the request to a new thread (or one from the pool). The parent method 
uses a peekAfter(last element) instead of a peek(). The peekAfter returns the 
task after the 'last element'. Maintain this request information and use it for 
deleting/cleaning up the workQueue.
* Another (almost duplicate) queue: While offering tasks to workQueue, also 
offer them to a new queue (call it volatileWorkQueue?). The difference is that, 
as soon as a task from this new queue is picked up for processing by a thread, 
it's removed from that queue. At the end, the cleanup is done from the 
workQueue.
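
The peekAfter() idea can be sketched roughly as follows. This is only an 
in-memory model: LookAheadQueue, the "qn-" id format and all method names 
other than peekAfter are assumptions for illustration, and a real version 
would sit on top of the ZK-backed queue rather than a local map.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical in-memory stand-in for the ZK work queue. Keys play the role
// of sequentially named queue nodes; values are the task payloads.
class LookAheadQueue {
    private final ConcurrentSkipListMap<String, String> tasks =
            new ConcurrentSkipListMap<>();
    private long nextSeq = 0;

    // Appends a task and returns its id (zero-padded so ids sort in order).
    public synchronized String offer(String payload) {
        String id = String.format(Locale.ROOT, "qn-%010d", nextSeq++);
        tasks.put(id, payload);
        return id;
    }

    // Returns the id of the first task strictly after lastId, or the head of
    // the queue when lastId is null. Returns null if there is no such task.
    public String peekAfter(String lastId) {
        Map.Entry<String, String> entry =
                (lastId == null) ? tasks.firstEntry() : tasks.higherEntry(lastId);
        return entry == null ? null : entry.getKey();
    }

    // Returns the payload for an id, or null if it was already removed.
    public String get(String id) {
        return tasks.get(id);
    }

    // Removes a completed task, wherever it sits in the queue.
    public void remove(String id) {
        tasks.remove(id);
    }
}
```

The parent thread would call peekAfter() with the id of the last task it 
dispatched, hand the result to a worker, and keep that id around; the worker 
calls remove() only once the task has completed, so an unfinished task is 
still in the queue for a new Overseer to retry.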



