[jira] [Comment Edited] (SOLR-12057) CDCR does not replicate to Collections with TLOG Replicas

2018-10-24 Thread Amrit Sarkar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662648#comment-16662648
 ] 

Amrit Sarkar edited comment on SOLR-12057 at 10/24/18 6:24 PM:
---

Thanks, Varun, for the detailed feedback.

The entire test in this patch is a copy of *CdcrBidirectionalTest*, which in 
turn gets its framework from *CdcrBootstrapTest*, to keep the tests uniform. 
All the points mentioned above are essentially framework snippets from 
*CdcrBootstrapTest*.

I strongly agree with consolidating CdcrBidirectionalTest with the test in this 
patch, and potentially with the fix for CDCR support for pull replicas too. I am 
seeking advice on whether we should do this under this Jira or create a new one.

Other points:
bq. After CdcrTestsUtil.cdcrStart(cluster1SolrClient); do we need to sleep for 
2 seconds? 
Not really; we can safely remove this from all tests. The 2-second sleep gives 
the CDCR components time to load and potentially avoids a few retries.
bq. I really like how this test checks for all operations to make sure they 
work correctly. perhaps we could expand it to add a parent-child document and 
an in-place update as well?
Sure, I will include the parent-child doc, though in-place updates are not 
supported for forwarding in CDCR. I can look into how much effort that would 
require. Jira: SOLR-12105.
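
On the sleep question, here is a minimal sketch of what a bounded poll could look like in place of a fixed 2-second sleep. This is only an illustration: `PollUntil` and `await` are hypothetical names and are not part of CdcrTestsUtil or the patch; the condition passed in would be whatever check the test already retries.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not part of CdcrTestsUtil. */
final class PollUntil {
    private PollUntil() {}

    /**
     * Re-evaluates the condition until it holds or the timeout expires.
     * Returns true if the condition held before the deadline, false on
     * timeout or if the waiting thread was interrupted.
     */
    static boolean await(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

With something like this, a test returns as soon as the condition holds instead of always paying the full 2 seconds, and it still fails within a bounded time if the condition never holds.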




> CDCR does not replicate to Collections with TLOG Replicas
> -
>
> Key: SOLR-12057
> URL: https://issues.apache.org/jira/browse/SOLR-12057
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: CDCR
>Affects Versions: 7.2
>Reporter: Webster Homer
>Assignee: Varun Thacker
>Priority: Major
> Attachments: SOLR-12057.patch, SOLR-12057.patch, SOLR-12057.patch, 
> SOLR-12057.patch, cdcr-fail-with-tlog-pull.patch, 
> cdcr-fail-with-tlog-pull.patch
>
>
> We created a collection using TLOG replicas in our QA clouds.
> We have a locally hosted solrcloud with 2 nodes, all our collections have 2 
> shards. We use CDCR to replicate the collections from this environment to 2 
> data centers hosted in Google cloud. This seems to work fairly well for our 
> collections with NRT replicas. However the new TLOG collection has problems.
>  
> The google cloud solrclusters have 4 nodes each (3 separate Zookeepers). 2 
> shards per collection with 2 replicas per shard.
>  
> We never see data show up in the cloud collections, but we do see tlog files 
> show up on the cloud servers. I can see that all of the servers have cdcr 
> started, buffers are disabled.
> The cdcr source configuration is:
>  
> "requestHandler":{"/cdcr":{
>       "name":"/cdcr",
>       "class":"solr.CdcrRequestHandler",
>       "replica":[
>         {
>           "zkHost":"xxx-mzk01.sial.com:2181,xxx-mzk02.sial.com:2181,xxx-mzk03.sial.com:2181/solr",
>           "source":"b2b-catalog-material-180124T",
>           "target":"b2b-catalog-material-180124T"},
>         {
>           "zkHost":"-mzk01.sial.com:2181,-mzk02.sial.com:2181,-mzk03.sial.com:2181/solr",
>           "source":"b2b-catalog-material-180124T",
>           "target":"b2b-catalog-material-180124T"}],
>       "replicator":{
>         "threadPoolSize":4,
>         "schedule":500,
>         "batchSize":250},
>       "updateLogSynchronizer":{"schedule":6
>  

[jira] [Comment Edited] (SOLR-12057) CDCR does not replicate to Collections with TLOG Replicas

2018-10-24 Thread Amrit Sarkar (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662648#comment-16662648
 ] 

Amrit Sarkar edited comment on SOLR-12057 at 10/24/18 6:23 PM:
---

Thanks, Varun, for the detailed feedback.

The entire test in this patch is a copy of {CdcrBidirectionalTest}, which in 
turn gets its framework from {CdcrBootstrapTest}, to keep the tests uniform. 
All the points mentioned above are essentially framework snippets from 
{CdcrBootstrapTest}.

I strongly agree with consolidating CdcrBidirectionalTest with the test in this 
patch, and potentially with the fix for CDCR support for pull replicas too. I am 
seeking advice on whether we should do this under this Jira or create a new one.

Other points:
bq. After CdcrTestsUtil.cdcrStart(cluster1SolrClient); do we need to sleep for 
2 seconds? 
Not really; we can safely remove this from all tests. The 2-second sleep gives 
the CDCR components time to load and potentially avoids a few retries.
bq. I really like how this test checks for all operations to make sure they 
work correctly. perhaps we could expand it to add a parent-child document and 
an in-place update as well?
Sure, I will include the parent-child doc, though in-place updates are not 
supported for forwarding in CDCR. I can look into how much effort that would 
require. Jira: SOLR-12105.





[jira] [Comment Edited] (SOLR-12057) CDCR does not replicate to Collections with TLOG Replicas

2018-03-07 Thread Amrit Sarkar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389301#comment-16389301
 ] 

Amrit Sarkar edited comment on SOLR-12057 at 3/7/18 10:10 AM:
--

I will post regular updates here about my observations to get feedback and 
input:

1. CDCR replication itself is succeeding: all documents are forwarded to the 
target collection, but they are not visible despite explicit commits.
2. As Webster stated above, the tlogs on the target keep filling up (fat 
tlogs), but no segments are created.
3. I verified the behavior on 7.1 and on the master branch, so the anomaly was 
not introduced by the CDCR bidirectional approach.
4. With hard commits, soft commits, or explicit commits, the documents do not 
become visible on a target whose replicas are all TLOG replicas.
5. A normal update from SolrJ / a client, by contrast, generates segments and 
behaves as expected.

Will report when I have more.



[jira] [Comment Edited] (SOLR-12057) CDCR does not replicate to Collections with TLOG Replicas

2018-03-06 Thread Webster Homer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388058#comment-16388058
 ] 

Webster Homer edited comment on SOLR-12057 at 3/6/18 4:35 PM:
--

Our setup would likely have either TLOG or NRT replicas. Any setup would have 
at least one of those; I wouldn't expect that you'd want to send to a PULL 
replica, since they cannot be masters.

We would like to favor TLOG replicas, since they always give consistent search 
results while NRT replicas do not.


> CDCR does not replicate to Collections with TLOG Replicas
> -
>
> Key: SOLR-12057
> URL: https://issues.apache.org/jira/browse/SOLR-12057
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: CDCR
>Affects Versions: 7.2
>Reporter: Webster Homer
>Priority: Major
> Attachments: cdcr-fail-with-tlog-pull.patch
>
>
> We created a collection using TLOG replicas in our QA clouds.
> We have a locally hosted solrcloud with 2 nodes, all our collections have 2 
> shards. We use CDCR to replicate the collections from this environment to 2 
> data centers hosted in Google cloud. This seems to work fairly well for our 
> collections with NRT replicas. However the new TLOG collection has problems.
>  
> The google cloud solrclusters have 4 nodes each (3 separate Zookeepers). 2 
> shards per collection with 2 replicas per shard.
>  
> We never see data show up in the cloud collections, but we do see tlog files 
> show up on the cloud servers. I can see that all of the servers have cdcr 
> started, buffers are disabled.
> The cdcr source configuration is:
>  
> "requestHandler":{"/cdcr":{
>       "name":"/cdcr",
>       "class":"solr.CdcrRequestHandler",
>       "replica":[
>         {
>           "zkHost":"xxx-mzk01.sial.com:2181,xxx-mzk02.sial.com:2181,xxx-mzk03.sial.com:2181/solr",
>           "source":"b2b-catalog-material-180124T",
>           "target":"b2b-catalog-material-180124T"},
>         {
>           "zkHost":"-mzk01.sial.com:2181,-mzk02.sial.com:2181,-mzk03.sial.com:2181/solr",
>           "source":"b2b-catalog-material-180124T",
>           "target":"b2b-catalog-material-180124T"}],
>       "replicator":{
>         "threadPoolSize":4,
>         "schedule":500,
>         "batchSize":250},
>       "updateLogSynchronizer":{"schedule":6
>  
> The target configurations in the 2 clouds are the same:
> "requestHandler":{"/cdcr":{ "name":"/cdcr", 
> "class":"solr.CdcrRequestHandler", "buffer":{"defaultState":"disabled"}}} 
>  
> All of our collections have a timestamp field, index_date. In the source 
> collection all the records have a date of 2/28/2018 but the target 
> collections have a latest date of 1/26/2018
>  
> I don't see cdcr errors in the logs, but we use logstash to search them, and 
> we're still perfecting that. 
>  
> We have a number of similar collections that behave correctly. This is the 
> only collection that is a TLOG collection. It appears that CDCR doesn't 
> support TLOG collections.
>  
> It looks like the data is getting to the target servers. I see tlog files 
> with the right timestamps. Looking at the timestamps on the documents in the 
> collection none of the data appears to have been loaded. In the solr.log I see 
> lots of /cdcr messages: action=LASTPROCESSEDVERSION, 
> action=COLLECTIONCHECKPOINT, and action=SHARDCHECKPOINT 
>  
> no errors
>  
> The target collection's autoCommit is set to  6; I tried sending a commit 
> explicitly, with no difference. CDCR is uploading data, but no new data 
> appears in the collection.
>  
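
The index_date comparison described in the report can be sketched as a small staleness check. This is an illustrative sketch only: `IndexDateLag` and `lagDays` are hypothetical names, and the code assumes the index_date values have already been fetched from each collection and parsed into `LocalDate` (the report gives them in month/day/year form).

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Collection;
import java.util.Collections;

/** Hypothetical diagnostic for illustration; not part of Solr or CDCR. */
final class IndexDateLag {
    private IndexDateLag() {}

    /**
     * Returns how many days the newest target document trails the newest
     * source document, based on already-fetched index_date values.
     * Both collections must be non-empty.
     */
    static long lagDays(Collection<LocalDate> sourceDates,
                        Collection<LocalDate> targetDates) {
        LocalDate newestSource = Collections.max(sourceDates);
        LocalDate newestTarget = Collections.max(targetDates);
        return ChronoUnit.DAYS.between(newestTarget, newestSource);
    }
}
```

For the dates in the report (source 2/28/2018, latest target 1/26/2018) this gives a lag of 33 days, which is the kind of gap that indicates forwarded documents are not becoming searchable on the target.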



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org