[
https://issues.apache.org/jira/browse/SOLR-11003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174051#comment-16174051
]
Amrit Sarkar commented on SOLR-11003:
-------------------------------------
Ok!
{{CdcrBidirectionalTest}} fails intermittently when we beast the test.
I see:
{code}
o.a.s.h.CdcrReplicator Forwarded 496 updates to target cdcr-cluster1
[beaster] 2> 19147 ERROR
(cdcr-replicator-31-thread-1-processing-n:127.0.0.1:46505_solr
x:cdcr-cluster1_shard1_replica_n1 s:shard1 c:cdcr-cluster1 r:core_node2)
[n:127.0.0.1:46505_solr c:cdcr-cluster1 s:shard1 r:core_node2
x:cdcr-cluster1_shard1_replica_n1] o.a.s.c.u.ExecutorUtil Uncaught exception
java.lang.AssertionError thrown by thread:
cdcr-replicator-31-thread-1-processing-n:127.0.0.1:46505_solr
x:cdcr-cluster1_shard1_replica_n1 s:shard1 c:cdcr-cluster1 r:core_node2
[beaster] 2> java.lang.Exception: Submitter stack trace
[beaster] 2> at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.execute(ExecutorUtil.java:163)
[beaster] 2> at
org.apache.solr.handler.CdcrReplicatorScheduler.lambda$start$1(CdcrReplicatorScheduler.java:76)
[beaster] 2> at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[beaster] 2> at
java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
[beaster] 2> at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
[beaster] 2> at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
[beaster] 2> at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[beaster] 2> at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[beaster] 2> at java.lang.Thread.run(Thread.java:748)
[beaster] 2> 19155 INFO (qtp620825517-65) [n:127.0.0.1:46505_solr
c:cdcr-cluster1 s:shard1 r:core_node2 x:cdcr-cluster1_shard1_replica_n1]
o.a.s.c.S.Request [cdcr-cluster1_shard1_replica_n1] webapp=/solr path=/update
params={_stateVer_=cdcr-cluster1:5&cdcr.update=&wt=javabin&version=2} status=0
QTime=23
[beaster] 2> 19156 INFO
(cdcr-replicator-35-thread-1-processing-n:127.0.0.1:46044_solr
x:cdcr-cluster2_shard1_replica_n1 s:shard1 c:cdcr-cluster2 r:core_node2)
[n:127.0.0.1:46044_solr c:cdcr-cluster2 s:shard1 r:core_node2
x:cdcr-cluster2_shard1_replica_n1] o.a.s.h.CdcrReplicator Forwarded 495 updates
to target cdcr-cluster1
[beaster] 2> Sht 21, 2017 6:02:10 PD
com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler
uncaughtException
[beaster] 2> WARNING: Uncaught exception in thread:
Thread[cdcr-replicator-31-thread-1,5,TGRP-CdcrBidirectionalTest]
[beaster] 2> java.lang.AssertionError
[beaster] 2> at
__randomizedtesting.SeedInfo.seed([AE4E9FB83368594B]:0)
[beaster] 2> at
org.apache.solr.update.TransactionLog$LogReader.next(TransactionLog.java:588)
[beaster] 2> at
org.apache.solr.update.CdcrTransactionLog$CdcrLogReader.next(CdcrTransactionLog.java:143)
[beaster] 2> at
org.apache.solr.update.CdcrUpdateLog$CdcrLogReader.next(CdcrUpdateLog.java:633)
[beaster] 2> at
org.apache.solr.handler.CdcrReplicator.run(CdcrReplicator.java:77)
[beaster] 2> at
org.apache.solr.handler.CdcrReplicatorScheduler.lambda$null$0(CdcrReplicatorScheduler.java:81)
[beaster] 2> at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
[beaster] 2> at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[beaster] 2> at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[beaster] 2> at java.lang.Thread.run(Thread.java:748)
[beaster] 2>
{code}
This looks like a concurrency issue in the tlogs, most likely with the tlog
reader positions.
This results in:
{code}
[beaster] 2> NOTE: reproduce with: ant test
-Dtestcase=CdcrBidirectionalTest -Dtests.method=testBiDir
-Dtests.seed=AE4E9FB83368594B -Dtests.slow=true -Dtests.locale=sq-AL
-Dtests.timezone=Asia/Thimphu -Dtests.asserts=true
-Dtests.file.encoding=ANSI_X3.4-1968
[beaster] [00:01:51.287] ERROR 24.8s | CdcrBidirectionalTest.testBiDir <<<
[beaster] > Throwable #1: java.lang.AssertionError: cluster 2 docs
mismatch expected:<0> but was:<2>
[beaster] > at org.junit.Assert.fail(Assert.java:93)
[beaster] > at org.junit.Assert.failNotEquals(Assert.java:647)
[beaster] > at org.junit.Assert.assertEquals(Assert.java:128)
[beaster] > at org.junit.Assert.assertEquals(Assert.java:472)
[beaster] > at
org.apache.solr.cloud.CdcrBidirectionalTest.testBiDir(CdcrBidirectionalTest.java:199)
[beaster] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
Method)
[beaster] > at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[beaster] > at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[beaster] > at java.lang.reflect.Method.invoke(Method.java:498)
[beaster] > at
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
[beaster] > at
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
[beaster] > at
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
[beaster] > at
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
[beaster] > at
{code}
There may be other failures under other conditions too.
I am looking into the cause; the failure occurs when indexing and
forwarding happen simultaneously.
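
To illustrate the kind of position problem suspected here, below is a toy sketch, not Solr's actual {{TransactionLog}} code. The record layout (length, payload, trailing size) and the names {{TlogPositionSketch}}, {{buildLog}} and {{readAt}} are assumptions made up for this example. It only shows that a reader which checks each record against its stored size succeeds at a correct offset and fails when it starts from a stale or misaligned one, which is one way a reader racing with concurrent appends could misbehave.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Toy length-suffixed record log. Hypothetical layout, NOT Solr's tlog format. */
public class TlogPositionSketch {

  /** Builds a log where each record is [int payloadLength][payload][int recordSize]. */
  public static byte[] buildLog(String... docs) {
    int total = 0;
    for (String d : docs) {
      total += 8 + d.getBytes(StandardCharsets.UTF_8).length;
    }
    ByteBuffer buf = ByteBuffer.allocate(total);
    for (String d : docs) {
      byte[] payload = d.getBytes(StandardCharsets.UTF_8);
      buf.putInt(payload.length);
      buf.put(payload);
      // Trailing size covers the length field plus the payload; readers use it as a check.
      buf.putInt(4 + payload.length);
    }
    return buf.array();
  }

  /** Reads one record at pos and verifies the trailing size against the bytes consumed. */
  public static String readAt(byte[] log, int pos) {
    ByteBuffer buf = ByteBuffer.wrap(log);
    buf.position(pos);
    int len = buf.getInt();
    // Reject lengths that cannot fit in the rest of the log (payload plus trailing size).
    if (len < 0 || len > buf.remaining() - 4) {
      throw new IllegalStateException("implausible length " + len + " at position " + pos);
    }
    byte[] payload = new byte[len];
    buf.get(payload);
    int consumed = buf.position() - pos;
    int size = buf.getInt();
    if (size != consumed) {
      throw new IllegalStateException(
          "size mismatch at position " + pos + ": stored " + size + ", consumed " + consumed);
    }
    return new String(payload, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] log = buildLog("doc-1", "doc-2");
    System.out.println(readAt(log, 0));   // first record, correct offset
    System.out.println(readAt(log, 13));  // second record, correct offset
    try {
      // Offset 9 points at the first record's trailing size, i.e. a stale position.
      readAt(log, 9);
    } catch (IllegalStateException e) {
      System.out.println("reader failed: " + e.getMessage());
    }
  }
}
```

In this sketch the fix would be to make sure the reader only ever resumes from an offset that was a record boundary at the time it was recorded; whether that maps onto the actual failure in {{CdcrLogReader}} still needs to be confirmed.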
> Enabling bi-directional CDCR active-active clusters
> ---------------------------------------------------
>
> Key: SOLR-11003
> URL: https://issues.apache.org/jira/browse/SOLR-11003
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: CDCR
> Reporter: Amrit Sarkar
> Assignee: Varun Thacker
> Attachments: sample-configs.zip, SOLR-11003.patch, SOLR-11003.patch,
> SOLR-11003.patch, SOLR-11003.patch, SOLR-11003.patch,
> SOLR-11003-tlogutils.patch
>
>
> The latest version of Solr CDCR across collections / clusters is in
> active-passive format, where we can index into the source collection and the
> updates get forwarded to the passive one; the reverse is not supported.
> https://lucene.apache.org/solr/guide/6_6/cross-data-center-replication-cdcr.html
> https://issues.apache.org/jira/browse/SOLR-6273
> We are trying to get a design ready to index into both collections and have
> the updates reflected across the collections in real time.
> ClusterACollectionA => ClusterBCollectionB | ClusterBCollectionB =>
> ClusterACollectionA.
> The best use case would be to keep indexing into ClusterACollectionA, which
> forwards the updates to ClusterBCollectionB. If ClusterACollectionA goes
> down, we point the indexer and searcher application to ClusterBCollectionB.
> Once ClusterACollectionA is back up, depending on the update count, the
> updates will be bootstrapped or forwarded to ClusterACollectionA from
> ClusterBCollectionB, and we keep indexing on ClusterBCollectionB.