[
https://issues.apache.org/jira/browse/SOLR-11293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151453#comment-16151453
]
Cao Manh Dat edited comment on SOLR-11293 at 9/2/17 9:02 AM:
-------------------------------------------------------------
This patch removes the {{forceReplication}} flag check in IndexFetcher. To
explain the case in detail, here is the relevant code block in
IndexFetcher.doFetch(..):
{code}
if (latestVersion == 0L) {
  if (forceReplication && commit.getGeneration() != 0) {
    solrCore.getIndexWriter().deleteAll();
    solrCore.getUpdateHandler().commit(new CommitUpdateCommand(req, false));
  }
  return success;
}
{code}
So when the master's index is empty (latestVersion == 0) and the slave's index
is not empty, we clear the slave's index. That sounds reasonable, except for
the {{forceReplication}} flag: if forceReplication == false (which is the
default for the first run), we won't clear the slave's index.
Therefore the Tlog replica's index stays non-empty while the master's index is
empty ==> inconsistency between the Tlog replica and its master.
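To make the branch above concrete, here is a minimal standalone sketch of the decision it encodes. It is not Solr code: the class and method names are hypothetical, and the "after patch" method only assumes that the patch drops the {{forceReplication}} condition and leaves the rest of the check unchanged.

```java
// Simplified, hypothetical model of the empty-master branch in
// IndexFetcher.doFetch(..). None of these names exist in Solr.
final class EmptyMasterBranch {

    // Mirrors the quoted snippet: the replica's index is cleared only when
    // the master is empty, forceReplication is set, and the replica has a
    // non-zero commit generation (i.e. its index is not empty).
    static boolean clearsReplicaBeforePatch(long latestVersion,
                                            boolean forceReplication,
                                            long localGeneration) {
        return latestVersion == 0L && forceReplication && localGeneration != 0L;
    }

    // Assumed behaviour with the forceReplication check removed: an empty
    // master always causes a non-empty replica to be cleared.
    static boolean clearsReplicaAfterPatch(long latestVersion,
                                           long localGeneration) {
        return latestVersion == 0L && localGeneration != 0L;
    }

    public static void main(String[] args) {
        long emptyMasterVersion = 0L;
        long nonEmptyReplicaGeneration = 3L; // arbitrary non-zero value

        // First run: forceReplication defaults to false, so the replica
        // keeps its documents even though the master is empty.
        System.out.println("before patch clears replica: "
            + clearsReplicaBeforePatch(emptyMasterVersion, false, nonEmptyReplicaGeneration));

        // With the flag check removed, the replica is cleared.
        System.out.println("after patch clears replica: "
            + clearsReplicaAfterPatch(emptyMasterVersion, nonEmptyReplicaGeneration));
    }
}
```

The first call models the inconsistent state described above (empty master, non-empty replica, nothing cleared); the second shows the same inputs once the flag no longer gates the clear.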
But I think we can let Solr 7.0 continue without a respin, because:
- Firstly, for a data loss event to happen, the master's index must have been
emptied and the IndexFetcher must have failed to fetch the index from the
master on the first try. This is very rare (I think).
- Secondly, for an inconsistency event to happen (no data gets lost in this
case), the master's index must have been emptied while the replica's index is
not empty. Furthermore, the inconsistent state will go away when the master
does its next commit.
I think both cases are very rare in production, so we can let the Solr 7.0
vote continue if we are not confident in the patch.
> HttpPartitionTest fails often
> -----------------------------
>
> Key: SOLR-11293
> URL: https://issues.apache.org/jira/browse/SOLR-11293
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Noble Paul
> Assignee: Noble Paul
> Fix For: 7.1
>
> Attachments: SOLR-11293.patch, SOLR-11293.patch, SOLR-11293.patch
>
>
> https://jenkins.thetaphi.de/job/Lucene-Solr-master-MacOSX/4140/testReport/org.apache.solr.cloud/HttpPartitionTest/test/
> {code}
> Error Message
> Doc with id=1 not found in http://127.0.0.1:60897/b/xj/collMinRf_1x3 due to: Path not found: /id; rsp={doc=null}
> Stacktrace
> java.lang.AssertionError: Doc with id=1 not found in http://127.0.0.1:60897/b/xj/collMinRf_1x3 due to: Path not found: /id; rsp={doc=null}
>     at __randomizedtesting.SeedInfo.seed([ACF841744A332569:24AC7EAEE4CF4891]:0)
>     at org.junit.Assert.fail(Assert.java:93)
>     at org.junit.Assert.assertTrue(Assert.java:43)
>     at org.apache.solr.cloud.HttpPartitionTest.assertDocExists(HttpPartitionTest.java:603)
>     at org.apache.solr.cloud.HttpPartitionTest.assertDocsExistInAllReplicas(HttpPartitionTest.java:558)
>     at org.apache.solr.cloud.HttpPartitionTest.testMinRf(HttpPartitionTest.java:249)
>     at org.apache.solr.cloud.HttpPartitionTest.test(HttpPartitionTest.java:127)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> {code}