Webster Homer created SOLR-12057:
------------------------------------

             Summary: CDCR does not replicate to Collections with TLOG Replicas
                 Key: SOLR-12057
                 URL: https://issues.apache.org/jira/browse/SOLR-12057
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: CDCR
    Affects Versions: 7.2
            Reporter: Webster Homer


We created a collection using TLOG replicas in our QA clouds.
We have a locally hosted SolrCloud with 2 nodes; all our collections have 2 
shards. We use CDCR to replicate the collections from this environment to 2 
data centers hosted in Google Cloud. This seems to work fairly well for our 
collections with NRT replicas. However, the new TLOG collection has problems.
 
The Google Cloud Solr clusters have 4 nodes each (3 separate ZooKeepers), with 2 
shards per collection and 2 replicas per shard.
 
We never see data show up in the cloud collections, but we do see tlog files 
show up on the cloud servers. I can see that all of the servers have CDCR 
started and buffers disabled.
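
For anyone reproducing this, the CDCR state and queue sizes can be checked through the /cdcr request handler's STATUS, QUEUES and ERRORS actions. Below is a minimal sketch of building those URLs; the base URL and the helper names are hypothetical, not something from our setup.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Monitoring actions of the CDCR API used here.
MONITORING_ACTIONS = {"STATUS", "QUEUES", "ERRORS"}


def cdcr_url(base_url, collection, action):
    """Build a /cdcr URL for one monitoring action (hypothetical helper)."""
    if action not in MONITORING_ACTIONS:
        raise ValueError(f"unsupported CDCR action: {action}")
    query = urlencode({"action": action, "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}/cdcr?{query}"


def fetch_json(url):
    """Fetch a URL and decode the JSON response body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    # Assumed base URL; replace with a real Solr node address.
    base = "http://localhost:8983/solr"
    for action in sorted(MONITORING_ACTIONS):
        print(cdcr_url(base, "b2b-catalog-material-180124T", action))
```

Running the QUEUES action against the source collection should show whether updates are piling up for each target.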
The cdcr source configuration is:
 
"requestHandler":{"/cdcr":{
      "name":"/cdcr",
      "class":"solr.CdcrRequestHandler",
      "replica":[
        {
          "zkHost":"xxx-mzk01.sial.com:2181,xxx-mzk02.sial.com:2181,xxx-mzk03.sial.com:2181/solr",
          "source":"b2b-catalog-material-180124T",
          "target":"b2b-catalog-material-180124T"},
        {
          "zkHost":"yyyy-mzk01.sial.com:2181,yyyy-mzk02.sial.com:2181,yyyy-mzk03.sial.com:2181/solr",
          "source":"b2b-catalog-material-180124T",
          "target":"b2b-catalog-material-180124T"}],
      "replicator":{
        "threadPoolSize":4,
        "schedule":500,
        "batchSize":250},
      "updateLogSynchronizer":{"schedule":60000}}}}
 
The target configurations in the 2 clouds are the same:
"requestHandler":{"/cdcr":{ "name":"/cdcr", "class":"solr.CdcrRequestHandler", 
"buffer":{"defaultState":"disabled"}}} 
 
All of our collections have a timestamp field, index_date. In the source 
collection all the records have a date of 2/28/2018, but the target collections 
have a latest date of 1/26/2018.
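
The latest index_date can be read by sorting on that field in descending order and asking for a single row. This is a rough sketch of such a query URL; the base URL and function name are assumptions for illustration.

```python
from urllib.parse import urlencode


def latest_date_query(base_url, collection, date_field="index_date"):
    """Build a /select URL that returns only the newest value of date_field."""
    params = {
        "q": "*:*",                    # match every document
        "sort": f"{date_field} desc",  # newest first
        "rows": 1,                     # only the top document is needed
        "fl": date_field,              # return just the date field
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed base URL; run the same query on source and target to compare.
    print(latest_date_query("http://localhost:8983/solr",
                            "b2b-catalog-material-180124T"))
```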
 
I don't see cdcr errors in the logs, but we use logstash to search them, and 
we're still perfecting that. 
 
We have a number of similar collections that behave correctly. This is the only 
collection that is a TLOG collection. It appears that CDCR doesn't support TLOG 
collections.
 
It looks like the data is getting to the target servers: I see tlog files with 
the right timestamps. But looking at the timestamps on the documents in the 
collection, none of the data appears to have been loaded. In the solr.log I see 
lots of /cdcr messages with action=LASTPROCESSEDVERSION, 
action=COLLECTIONCHECKPOINT, and action=SHARDCHECKPOINT, but no errors.
 
The target collections' autoCommit is set to 60000. I tried sending a commit 
explicitly, but it made no difference. CDCR is uploading data, but no new data 
appears in the collection.
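
An explicit commit of the kind mentioned above can be sent as a JSON commit command to the collection's /update handler. The sketch below only builds the request; the base URL and helper name are hypothetical.

```python
import json
import urllib.request


def build_commit_request(base_url, collection):
    """Build (but do not send) a POST request carrying a JSON commit command."""
    url = f"{base_url.rstrip('/')}/{collection}/update"
    body = json.dumps({"commit": {}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Assumed base URL; pass the request to urllib.request.urlopen to send it.
    req = build_commit_request("http://localhost:8983/solr",
                               "b2b-catalog-material-180124T")
    print(req.get_method(), req.full_url)
```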
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
