[ 
https://issues.apache.org/jira/browse/SOLR-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927301#comment-15927301
 ] 

Joshua Humphries commented on SOLR-5872:
----------------------------------------

Right, but when a node comes up and changes replica states to active, it is 
highly likely that the number of events for a single collection will be ~1. So 
breaking batches at collection boundaries results in effectively no batching.

With the current code, there's no benefit to combining writes for multiple 
collections into the same batch. But if the code pipelined all of the writes 
for a batch (instead of issuing each one synchronously, blocking for each 
result) then combining writes across collections would reduce latency.
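
To illustrate the difference, here is a minimal sketch of pipelining a batch: every write is issued before any result is awaited, so total latency is closer to one round trip than to the sum of all of them. The class name PipelinedBatch, the writeAll helper, and the fake write function are hypothetical and are not taken from the Overseer code. The ZooKeeper client does offer callback-based asynchronous calls, but this sketch uses a plain CompletableFuture stand-in so that it runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

final class PipelinedBatch {
    /**
     * Issues every write in the batch first, then waits for the results.
     * Results are returned in submission order.
     */
    static <W, R> List<R> writeAll(List<W> writes,
                                   Function<W, CompletableFuture<R>> asyncWrite) {
        List<CompletableFuture<R>> inFlight = new ArrayList<>();
        for (W write : writes) {
            inFlight.add(asyncWrite.apply(write)); // start the write, do not block
        }
        List<R> results = new ArrayList<>();
        for (CompletableFuture<R> future : inFlight) {
            results.add(future.join()); // block only once all writes are outstanding
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Hypothetical stand-in for an asynchronous per-collection state write.
            Function<String, CompletableFuture<String>> fakeWrite =
                path -> CompletableFuture.supplyAsync(() -> "ok:" + path, pool);
            List<String> paths = Arrays.asList(
                "/collections/a/state.json", "/collections/b/state.json");
            System.out.println(writeAll(paths, fakeWrite));
        } finally {
            pool.shutdown();
        }
    }
}
```

With writes issued this way, a batch that spans several collections costs little more than a batch for one, which is where combining across collections would start to pay off.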

When you suggest partitioning the queue, do you mean multiple ZK queues? Seems 
simpler to just partition in memory: ingest the whole queue (or up to some 
limit) and push into in-memory queues (one per partition; could even explode a 
'downnode' message into the multiple updates it implies and scatter those 
updates across partitions). After one of the in-memory partitions completes an 
item, it can delete the corresponding entry from ZK. So, from ZK's point of 
view, operations can complete out of order instead of always being polled from 
the head of the queue. When partitions quiesce (or when some other policy allows 
more items to be polled -- so we don't necessarily have to wait on all 
partitions to complete before grabbing more items), ingest another batch of 
items from ZK.
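
A rough sketch of that in-memory partitioning, under some loud assumptions: the ZK queue is modeled as a plain Set of node names, items are routed by a hash of the collection name, and the partitions are drained round-robin on one thread rather than by real worker threads. PartitionedDrain, Item, and the node names are hypothetical and exist only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class PartitionedDrain {
    /** One ingested queue entry: its ZK node name and the collection it touches. */
    static final class Item {
        final String zkNode;
        final String collection;

        Item(String zkNode, String collection) {
            this.zkNode = zkNode;
            this.collection = collection;
        }
    }

    /** Splits an ingested batch into FIFO queues, one per partition, by collection hash. */
    static List<Deque<Item>> partition(List<Item> batch, int partitions) {
        List<Deque<Item>> queues = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            queues.add(new ArrayDeque<>());
        }
        for (Item item : batch) {
            int p = Math.floorMod(item.collection.hashCode(), partitions);
            queues.get(p).add(item); // same collection always lands in the same partition
        }
        return queues;
    }

    /**
     * Takes one item per partition per pass (a single-threaded stand-in for
     * concurrent workers) and deletes each item from the simulated ZK queue as
     * soon as it completes. Returns the order in which entries were deleted.
     */
    static List<String> drain(List<Deque<Item>> queues, Set<String> zkQueue) {
        List<String> deleted = new ArrayList<>();
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            for (Deque<Item> queue : queues) {
                Item item = queue.poll();
                if (item == null) {
                    continue; // this partition has quiesced
                }
                // The actual state update for item.collection would happen here.
                zkQueue.remove(item.zkNode);
                deleted.add(item.zkNode);
                progressed = true;
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        List<Item> batch = Arrays.asList(
            new Item("qn-0", "a"), new Item("qn-1", "a"), new Item("qn-2", "b"));
        Set<String> zkQueue = new LinkedHashSet<>(Arrays.asList("qn-0", "qn-1", "qn-2"));
        System.out.println(drain(partition(batch, 2), zkQueue));
    }
}
```

Deletions can happen out of order relative to the ZK queue, but updates for any one collection still complete in their original order because they share a partition.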

> Eliminate overseer queue 
> -------------------------
>
>                 Key: SOLR-5872
>                 URL: https://issues.apache.org/jira/browse/SOLR-5872
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>
> The overseer queue is one of the busiest points in the entire system. The 
> raison d'être of the queue is
>  * Provide batching of operations for the main clusterstate.json so that 
> state updates are minimized 
>  * Avoid race conditions and ensure order
> Now, as we move the individual collection states out of the main 
> clusterstate.json, the batching is not useful anymore.
> Race conditions can easily be solved by using a compare and set in Zookeeper. 
> The proposed solution is: whenever an operation is required to be performed 
> on the clusterstate, the same thread (and of course the same JVM) will
>  # read the fresh state and version of zk node  
>  # construct the new state 
>  # perform a compare and set
>  # if compare and set fails go to step 1
> This should be limited to all operations performed on external collections 
> because batching would be required for others.
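
The four quoted steps amount to an optimistic retry loop, which could be sketched as follows. This is only an illustration under stated assumptions: VersionedNode is a hypothetical in-memory stand-in for a ZK node, and its compareAndSet plays the role of a versioned write. With the real client, the corresponding call would be ZooKeeper.setData(path, data, version), which fails with KeeperException.BadVersionException when the version does not match.

```java
import java.util.function.UnaryOperator;

final class CasUpdate {
    /** Immutable view of a node's data together with the version it was read at. */
    static final class Snapshot {
        final String data;
        final int version;

        Snapshot(String data, int version) {
            this.data = data;
            this.version = version;
        }
    }

    /** Hypothetical in-memory stand-in for a ZK node carrying data and a version. */
    static final class VersionedNode {
        private String data;
        private int version;

        VersionedNode(String data) {
            this.data = data;
        }

        synchronized Snapshot read() {
            return new Snapshot(data, version);
        }

        /** Writes only if the caller saw the latest version; returns false otherwise. */
        synchronized boolean compareAndSet(int expectedVersion, String newData) {
            if (version != expectedVersion) {
                return false;
            }
            data = newData;
            version++;
            return true;
        }
    }

    /** Applies the change with optimistic retries and returns the number of attempts. */
    static int update(VersionedNode node, UnaryOperator<String> change) {
        int attempts = 0;
        while (true) {
            attempts++;
            Snapshot current = node.read();                  // step 1: fresh state and version
            String next = change.apply(current.data);        // step 2: construct the new state
            if (node.compareAndSet(current.version, next)) { // step 3: compare and set
                return attempts;
            }
            // step 4: the version moved underneath us, so start again from step 1
        }
    }

    public static void main(String[] args) {
        VersionedNode node = new VersionedNode("down");
        int attempts = update(node, state -> "active");
        System.out.println(node.read().data + " after " + attempts + " attempt(s)");
    }
}
```

A production loop would likely also bound the number of retries or back off under contention; the sketch retries immediately to stay close to the quoted steps.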



