[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938775#comment-13938775
 ] 

Mark Miller commented on SOLR-5872:
-----------------------------------

It's obviously just a name :) I didn't know that it existed - that's all I was 
saying - I figured it meant something else. To me it doesn't make much sense. I 
think if we decide to split out the clusterstate.json per collection, that is 
the direction we should take: we should only support one clusterstate.json, for 
back compat at most, and no such special name should exist. Solr 5.0 would no 
longer support the single clusterstate.json. Or, we might even decide to have 
the Overseer upgrade the format for you or something before 5.0.

Other thoughts on Overseer performance:

* Because only one process should be reading and removing items from the 
distributed queue at a time, it seems there are many cases where we could read 
multiple nodes in one call.

* Perhaps 1500ms is not a great batch time - would be interesting if we made it 
configurable as well.

* Seems there might be a lot of room for parallelism - we probably only need to 
order within a collection if not simply per SolrCore. 
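
To make the batching and parallelism ideas above concrete, here is a rough
sketch. It is not Overseer code: drain_batch, group_by_collection,
process_batch, max_batch and the {"collection": ..., "op": ...} message shape
are all hypothetical names chosen for illustration. It takes several queue
items in one call, then processes collections in parallel while keeping the
original order within each collection.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def drain_batch(queue_items, max_batch):
    """Remove and return up to max_batch items from the head of the queue."""
    batch = queue_items[:max_batch]
    del queue_items[:max_batch]
    return batch


def group_by_collection(batch):
    """Group items by collection, preserving queue order inside each group."""
    groups = defaultdict(list)
    for item in batch:
        groups[item["collection"]].append(item)
    return groups


def process_batch(batch, apply_ops, workers=4):
    """Apply each collection's operations in order, collections in parallel.

    apply_ops(name, ops) is a caller-supplied (hypothetical) function that
    applies the ordered list of operations for one collection.
    """
    groups = group_by_collection(batch)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            name: pool.submit(apply_ops, name, ops)
            for name, ops in groups.items()
        }
        return {name: future.result() for name, future in futures.items()}
```

Ordering per SolrCore instead of per collection would only change the
grouping key.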

> Eliminate overseer queue 
> -------------------------
>
>                 Key: SOLR-5872
>                 URL: https://issues.apache.org/jira/browse/SOLR-5872
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>
> The overseer queue is one of the busiest points in the entire system. The 
> raison d'être of the queue is
>  * Provide batching of operations for the main clusterstate.json so that 
> state updates are minimized 
> * Avoid race conditions and ensure order
> Now, as we move the individual collection states out of the main 
> clusterstate.json, the batching is not useful anymore.
> Race conditions can easily be solved by using a compare and set in Zookeeper. 
> The proposed solution is: whenever an operation needs to be performed 
> on the clusterstate, the same thread (and of course the same JVM) will
>  # read the fresh state and version of zk node  
>  # construct the new state 
>  # perform a compare and set
>  # if compare and set fails go to step 1
> This should be limited to all operations performed on external collections 
> because batching would be required for others 
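
The four steps quoted above amount to an optimistic retry loop. The sketch
below illustrates it under stated assumptions: VersionedNode is an in-memory
stand-in for a versioned ZooKeeper node, not the ZooKeeper client API, and
update_state, BadVersionError and the transform callback are hypothetical
names. The transform must be a pure function, because it may run more than
once when a write loses the race.

```python
import threading


class BadVersionError(Exception):
    """Raised when the expected version does not match the stored version."""


class VersionedNode:
    """In-memory stand-in (assumption) for a znode holding one state."""

    def __init__(self, data):
        self._lock = threading.Lock()
        self._data = data
        self._version = 0

    def get(self):
        """Return the current (data, version) pair."""
        with self._lock:
            return self._data, self._version

    def set(self, data, expected_version):
        """Write data only if the node is still at expected_version."""
        with self._lock:
            if expected_version != self._version:
                raise BadVersionError(expected_version)
            self._data = data
            self._version += 1
            return self._version


def update_state(node, transform):
    """Apply transform to the node's state using compare and set."""
    while True:
        state, version = node.get()        # step 1: read fresh state and version
        new_state = transform(state)       # step 2: construct the new state
        try:
            node.set(new_state, version)   # step 3: compare and set
            return new_state
        except BadVersionError:
            continue                       # step 4: lost the race, go to step 1
```

For example, update_state(node, lambda s: {**s, "replica1": "active"}) would
retry until its write lands on an unchanged version. Under heavy contention
on a single node the loop can spin, so a real implementation would probably
want a retry limit or backoff.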



