[ 
https://issues.apache.org/jira/browse/SOLR-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14002020#comment-14002020
 ] 

Mark Miller commented on SOLR-5232:
-----------------------------------

Some more real-world experience - the old system of internally sending around 
batches of 10 docs was horribly inefficient and a major performance limiter. 
The only case where this might not have applied is client-side hashing with no 
replicas. Batching with multiple threads is the key to performance with 
SolrCloud, and the internal batches of 10 would decimate performance no matter 
what batch size the user sent - even with no replicas and just internal 
forwarding. This change removed that bottleneck and is many times faster in 
some cases.
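
As a rough illustration of "batching with multiple threads" on the client 
side, here is a minimal Java sketch. It is not taken from the patch: the 
class name BatchedIndexer, the partition/indexAll helpers, and the `sender` 
callback are all hypothetical, and `sender` only stands in for whatever 
thread-safe client call actually ships a batch to Solr. The batch size and 
thread count are arbitrary and would need tuning for a real cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class BatchedIndexer {

    // Split docs into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> docs, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            int end = Math.min(i + batchSize, docs.size());
            batches.add(new ArrayList<>(docs.subList(i, end)));
        }
        return batches;
    }

    // Send every batch through `sender` on a fixed pool of worker threads and
    // return the number of batches sent. `sender` is a hypothetical stand-in
    // for a real client call and is assumed to be thread-safe.
    static <T> int indexAll(List<T> docs, int batchSize, int threads,
                            Consumer<List<T>> sender) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<T> batch : partition(docs, batchSize)) {
                pending.add(pool.submit(() -> sender.accept(batch)));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the batch and surface any worker failure
            }
            return pending.size();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            docs.add("doc-" + i);
        }
        // Count documents instead of talking to a server (assumption: demo only).
        AtomicInteger sent = new AtomicInteger();
        int batches = indexAll(docs, 1000, 4, b -> sent.addAndGet(b.size()));
        System.out.println(batches + " batches, " + sent.get() + " docs");
    }
}
```

The point of the sketch is only the shape: large batches, several sender 
threads in flight at once, and failures surfaced to the caller rather than 
dropped.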

> SolrCloud should distribute updates via streaming rather than buffering.
> ------------------------------------------------------------------------
>
>                 Key: SOLR-5232
>                 URL: https://issues.apache.org/jira/browse/SOLR-5232
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>            Priority: Critical
>             Fix For: 4.6, 5.0
>
>         Attachments: SOLR-5232.patch, SOLR-5232.patch, SOLR-5232.patch, 
> SOLR-5232.patch, SOLR-5232.patch, SOLR-5232.patch
>
>
> The current approach was never the best for SolrCloud - it was designed for a 
> pre-SolrCloud Solr. It also uses too many connections and threads. Nailing 
> that down is likely wasted effort when we should really move away from 
> explicitly buffering docs and sending small batches per thread as we have 
> been doing.


