[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700144#comment-14700144
 ] 

Scott Blum commented on SOLR-5872:
----------------------------------

At the risk of creating two code paths, here's an idea.

1) We could improve batching *significantly* at the Overseer level, to be able 
to batch even when the same collection isn't updated twice in a row.  We just 
need something like a dirty list instead of only tracking the last one and the 
shared clusterStateModified.  This could be an independent improvement.

2) When performing updates on format=2, we could use a size heuristic to decide 
whether or not to go through the queue.  For collections with less than N 
shards, we could just do a local CAS loop for state update ops.  For 
collections with more than N shares we'd just always go through the queue.

> Eliminate overseer queue 
> -------------------------
>
>                 Key: SOLR-5872
>                 URL: https://issues.apache.org/jira/browse/SOLR-5872
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>
> The overseer queue is one of the busiest points in the entire system. The 
> raison d'ĂȘtre of the queue is
>  * Provide batching of operations for the main clusterstate,json so that 
> state updates are minimized 
> * Avoid race conditions and ensure order
> Now , as we move the individual collection states out of the main 
> clusterstate.json, the batching is not useful anymore.
> Race conditions can easily be solved by using a compare and set in Zookeeper. 
> The proposed solution  is , whenever an operation is required to be performed 
> on the clusterstate, the same thread (and of course the same JVM)
>  # read the fresh state and version of zk node  
>  # construct the new state 
>  # perform a compare and set
>  # if compare and set fails go to step 1
> This should be limited to all operations performed on external collections 
> because batching would be required for others 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to