[ 
https://issues.apache.org/jira/browse/SOLR-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233700#comment-14233700
 ] 

Timothy Potter commented on SOLR-6816:
--------------------------------------

Cool - been looking into this as well, nothing definitive yet but here's one 
thing I've noticed:

CPU load is considerably higher on replicas than on leaders when doing 
high-volume indexing with batched documents coming from the client. Basically 
the batch gets broken up on the leader and then sent in mini-batches to 
replicas. Thus, replicas are having to process many more update requests than 
leaders to index the same documents. Check out these two graphs from one of my 
tests:

replica - http://www.dropmocks.com/mHoPUx
leader - http://www.dropmocks.com/mHoWpX

Pretty clear that there is considerably higher load on the replica than on the 
leader. This was done with a 1x2 collection (1 shard, 2 replicas), with each 
replica on a separate node. Without replication, I indexed a 10M doc collection 
(my synthetic docs are ~1K each) at 7,225 docs per second. With replication, I 
got 4,626 docs per second, which is ~36% slower.
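
As a quick sanity check on the ~36% figure, the arithmetic can be sketched in 
Python. Only the two throughput numbers and the 10M doc count come from the 
test above; the variable names and the derived wall-clock times are 
illustrative.

```python
# Figures reported above: total docs and throughput in docs per second.
docs_total = 10_000_000
rate_no_replication = 7225
rate_with_replication = 4626

# Relative drop in throughput once a replica is added.
slowdown = 1 - rate_with_replication / rate_no_replication

# Derived (not measured) wall-clock time to index all docs at each rate.
secs_no_replication = docs_total / rate_no_replication
secs_with_replication = docs_total / rate_with_replication

print(f"slowdown: {slowdown:.1%}")
print(f"indexing time: {secs_no_replication:.0f}s vs {secs_with_replication:.0f}s")
```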

Behind the scenes, CUSS (ConcurrentUpdateSolrServer) does only minimal 
buffering of the docs, so there are many, many more requests sent from the 
leader to the replica. The updateHandler stats tell a good story (basically the 
replica received about 5x as many update requests as the leader for the same 
10M docs).

Leader:
requests:40,022
avgRequestsPerSecond:9.830068574096831
5minRateReqsPerSecond:0.09800683344335624
15minRateReqsPerSecond:2.9134044254494302
avgTimePerRequest:628.6526956285293
medianRequestTime:379.48604750000004
75thPcRequestTime:568.784846
95thPcRequestTime:1365.1776681499978
99thPcRequestTime:6501.922041030025

Replica:
requests:206,367
avgRequestsPerSecond:51.13560879471209
5minRateReqsPerSecond:0.514584592882959
15minRateReqsPerSecond:14.541814273402418
avgTimePerRequest:104.61283714253733
medianRequestTime:35.7488105
75thPcRequestTime:96.46166525
95thPcRequestTime:272.08549294999995
99thPcRequestTime:718.7258438000003
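
The request counts above can be turned into a rough docs-per-request figure. 
This is a back-of-the-envelope sketch: it assumes all 10M docs passed through 
both nodes and that every counted request carried docs (commits or other 
update requests would skew it). The variable names are illustrative.

```python
# Request counts taken from the updateHandler stats above.
docs_total = 10_000_000
leader_requests = 40_022
replica_requests = 206_367

# How many more update requests the replica handled than the leader.
request_ratio = replica_requests / leader_requests

# Average docs per update request on each node, under the assumptions above.
docs_per_leader_request = docs_total / leader_requests
docs_per_replica_request = docs_total / replica_requests

print(f"replica/leader request ratio: {request_ratio:.2f}")
print(f"avg docs per request: leader {docs_per_leader_request:.0f}, "
      f"replica {docs_per_replica_request:.0f}")
```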

I've been experimenting with tweaking things like the pollQueueTime, queueSize, 
and runner count set up by StreamingSolrServers but haven't come up with a 
definitive recipe for improving things ... still digging ;-)
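
To illustrate why a setting like pollQueueTime could matter, here is a toy 
Python model. It is not Solr's code: it assumes a single runner that keeps one 
request open as long as the next doc shows up within the poll wait, and opens a 
new request otherwise. The function name count_requests and the arrival times 
are hypothetical.

```python
def count_requests(arrival_times_ms, poll_wait_ms):
    """Count the requests a single runner opens in this toy model.

    A doc joins the currently open request if it arrives no later than
    poll_wait_ms after the previous doc; otherwise the runner is assumed
    to have closed the request, so a new one is opened.
    """
    requests = 0
    previous = None
    for t in sorted(arrival_times_ms):
        if previous is None or t - previous > poll_wait_ms:
            requests += 1
        previous = t
    return requests


# Hypothetical doc arrival times (ms) with short and long gaps between docs.
arrivals = [0, 1, 2, 10, 11, 12, 20]
for wait in (0, 2, 10):
    print(f"poll wait {wait}ms -> {count_requests(arrivals, wait)} requests")
```

In this model a longer poll wait folds more docs into each request, which is 
the kind of effect the tuning above is probing for; whether it holds up in a 
real cluster is exactly what still needs measuring.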

> Review SolrCloud Indexing Performance.
> --------------------------------------
>
>                 Key: SOLR-6816
>                 URL: https://issues.apache.org/jira/browse/SOLR-6816
>             Project: Solr
>          Issue Type: Task
>          Components: SolrCloud
>            Reporter: Mark Miller
>            Priority: Critical
>         Attachments: SolrBench.pdf
>
>
> We have never really focused on indexing performance, just correctness and 
> low hanging fruit. We need to vet the performance and try to address any 
> holes.
> Note: A common report is that adding any replication is very slow.



