These are general guidelines, I've done loads of networking, but may be less 
familiar with SolrCloud  and CDCR architecture.  However, I know it's all TCP 
sockets, so general guidelines do apply.

Check the round-trip time between the data centers using ping or TCP ping.   
Throughput tests may be high, but if Solr has to wait for a response to a 
request before sending the next action, then just like any network protocol 
that does that, it will get slow.

I'm pretty sure CDCR uses HTTP/HTTPS rather than just TCP, so also check 
whether some proxy/load balancer between data centers is causing it to be a 
single connection per operation.   That will *kill* performance.   Some proxies 
default to HTTP/1.0 (open, send request, server send response, close), and that 
will hurt.

Why you should listen to me even without SolrCloud knowledge - checkout paper 
"Latency performance of SOAP Implementations".   Same distribution of skills - 
I knew TCP well, but Apache Axis 1.1 not so well.   I still improved response 
time of Apache Axis 1.1 by 250ms per call with 1-line of code.

-----Original Message-----
From: Tom Peters [mailto:tpet...@synacor.com] 
Sent: Wednesday, March 7, 2018 6:19 PM
To: solr-user@lucene.apache.org
Subject: CDCR performance issues

I'm having issues with the target collection staying up-to-date with indexing 
from the source collection using CDCR.
 
This is what I'm getting back in terms of OPS:

    curl -s 'solr2-a:8080/solr/mycollection/cdcr?action=OPS' | jq .
    {
      "responseHeader": {
        "status": 0,
        "QTime": 0
      },
      "operationsPerSecond": [
        "zook01,zook02,zook03/solr",
        [
          "mycollection",
          [
            "all",
            49.10140553500938,
            "adds",
            10.27612635309587,
            "deletes",
            38.82527896994054
          ]
        ]
      ]
    }

The source and target collections are in separate data centers.

Doing a network test between the leader node in the source data center and the 
ZooKeeper nodes in the target data center show decent enough network 
performance: ~181 Mbit/s

I've tried playing around with the "batchSize" value (128, 512, 728, 1000, 
2000, 2500) and they've haven't made much of a difference.

Any suggestions on potential settings to tune to improve the performance?

Thanks

--

Here's some relevant log lines from the source data center's leader:

    2018-03-07 23:16:11.984 INFO  
(cdcr-replicator-207-thread-3-processing-n:solr2-a:8080_solr 
x:mycollection_shard1_replica_n6 s:shard1 c:mycollection r:core_node9) 
[c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] 
o.a.s.h.CdcrReplicator Forwarded 511 updates to target mycollection
    2018-03-07 23:16:23.062 INFO  
(cdcr-replicator-207-thread-4-processing-n:solr2-a:8080_solr 
x:mycollection_shard1_replica_n6 s:shard1 c:mycollection r:core_node9) 
[c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] 
o.a.s.h.CdcrReplicator Forwarded 510 updates to target mycollection
    2018-03-07 23:16:32.063 INFO  
(cdcr-replicator-207-thread-5-processing-n:solr2-a:8080_solr 
x:mycollection_shard1_replica_n6 s:shard1 c:mycollection r:core_node9) 
[c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] 
o.a.s.h.CdcrReplicator Forwarded 511 updates to target mycollection
    2018-03-07 23:16:36.209 INFO  
(cdcr-replicator-207-thread-1-processing-n:solr2-a:8080_solr 
x:mycollection_shard1_replica_n6 s:shard1 c:mycollection r:core_node9) 
[c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] 
o.a.s.h.CdcrReplicator Forwarded 512 updates to target mycollection
    2018-03-07 23:16:42.091 INFO  
(cdcr-replicator-207-thread-2-processing-n:solr2-a:8080_solr 
x:mycollection_shard1_replica_n6 s:shard1 c:mycollection r:core_node9) 
[c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] 
o.a.s.h.CdcrReplicator Forwarded 512 updates to target mycollection
    2018-03-07 23:16:46.790 INFO  
(cdcr-replicator-207-thread-3-processing-n:solr2-a:8080_solr 
x:mycollection_shard1_replica_n6 s:shard1 c:mycollection r:core_node9) 
[c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] 
o.a.s.h.CdcrReplicator Forwarded 511 updates to target mycollection
    2018-03-07 23:16:50.004 INFO  
(cdcr-replicator-207-thread-4-processing-n:solr2-a:8080_solr 
x:mycollection_shard1_replica_n6 s:shard1 c:mycollection r:core_node9) 
[c:mycollection s:shard1 r:core_node9 x:mycollection_shard1_replica_n6] 
o.a.s.h.CdcrReplicator Forwarded 512 updates to target mycollection


And what the log looks like in the target:

    2018-03-07 23:18:46.475 INFO  (qtp1595212853-26) [c:mycollection s:shard1 
r:core_node2 x:mycollection_shard1_replica_n1] o.a.s.c.S.Request 
[mycollection_shard1_replica_n1]  webapp=/solr path=/update 
params={_stateVer_=mycollection:30&_version_=-1594317067896487950&cdcr.update=&wt=javabin&version=2}
 status=0 QTime=0
    2018-03-07 23:18:46.500 INFO  (qtp1595212853-25) [c:mycollection s:shard1 
r:core_node2 x:mycollection_shard1_replica_n1] o.a.s.c.S.Request 
[mycollection_shard1_replica_n1]  webapp=/solr path=/update 
params={_stateVer_=mycollection:30&_version_=-1594317067896487951&cdcr.update=&wt=javabin&version=2}
 status=0 QTime=0
    2018-03-07 23:18:46.525 INFO  (qtp1595212853-24) [c:mycollection s:shard1 
r:core_node2 x:mycollection_shard1_replica_n1] o.a.s.c.S.Request 
[mycollection_shard1_replica_n1]  webapp=/solr path=/update 
params={_stateVer_=mycollection:30&_version_=-1594317067897536512&cdcr.update=&wt=javabin&version=2}
 status=0 QTime=0
    2018-03-07 23:18:46.550 INFO  (qtp1595212853-3793) [c:mycollection s:shard1 
r:core_node2 x:mycollection_shard1_replica_n1] o.a.s.c.S.Request 
[mycollection_shard1_replica_n1]  webapp=/solr path=/update 
params={_stateVer_=mycollection:30&_version_=-1594317067897536513&cdcr.update=&wt=javabin&version=2}
 status=0 QTime=0
    2018-03-07 23:18:46.575 INFO  (qtp1595212853-30) [c:mycollection s:shard1 
r:core_node2 x:mycollection_shard1_replica_n1] o.a.s.c.S.Request 
[mycollection_shard1_replica_n1]  webapp=/solr path=/update 
params={_stateVer_=mycollection:30&_version_=-1594317067897536514&cdcr.update=&wt=javabin&version=2}
 status=0 QTime=0
    2018-03-07 23:18:46.600 INFO  (qtp1595212853-26) [c:mycollection s:shard1 
r:core_node2 x:mycollection_shard1_replica_n1] o.a.s.c.S.Request 
[mycollection_shard1_replica_n1]  webapp=/solr path=/update 
params={_stateVer_=mycollection:30&_version_=-1594317067897536515&cdcr.update=&wt=javabin&version=2}
 status=0 QTime=0
    2018-03-07 23:18:46.625 INFO  (qtp1595212853-25) [c:mycollection s:shard1 
r:core_node2 x:mycollection_shard1_replica_n1] o.a.s.c.S.Request 
[mycollection_shard1_replica_n1]  webapp=/solr path=/update 
params={_stateVer_=mycollection:30&_version_=-1594317067897536516&cdcr.update=&wt=javabin&version=2}
 status=0 QTime=0
    2018-03-07 23:18:46.651 INFO  (qtp1595212853-24) [c:mycollection s:shard1 
r:core_node2 x:mycollection_shard1_replica_n1] o.a.s.c.S.Request 
[mycollection_shard1_replica_n1]  webapp=/solr path=/update 
params={_stateVer_=mycollection:30&_version_=-1594317067897536517&cdcr.update=&wt=javabin&version=2}
 status=0 QTime=0
    2018-03-07 23:18:46.676 INFO  (qtp1595212853-3793) [c:mycollection s:shard1 
r:core_node2 x:mycollection_shard1_replica_n1] o.a.s.c.S.Request 
[mycollection_shard1_replica_n1]  webapp=/solr path=/update 
params={_stateVer_=mycollection:30&_version_=-1594317067897536518&cdcr.update=&wt=javabin&version=2}
 status=0 QTime=0
    2018-03-07 23:18:46.701 INFO  (qtp1595212853-30) [c:mycollection s:shard1 
r:core_node2 x:mycollection_shard1_replica_n1] o.a.s.c.S.Request 
[mycollection_shard1_replica_n1]  webapp=/solr path=/update 
params={_stateVer_=mycollection:30&_version_=-1594317067897536519&cdcr.update=&wt=javabin&version=2}
 status=0 QTime=0



This message and any attachment may contain information that is confidential 
and/or proprietary. Any use, disclosure, copying, storing, or distribution of 
this e-mail or any attached file by anyone other than the intended recipient is 
strictly prohibited. If you have received this message in error, please notify 
the sender by reply email and delete the message and any attachments. Thank you.

Reply via email to