Chris M. Hostetter created SOLR-14262:
-----------------------------------------
Summary: local commit is (silently - no rf support) ignored during
replay
Key: SOLR-14262
URL: https://issues.apache.org/jira/browse/SOLR-14262
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Reporter: Chris M. Hostetter
Summarizing an issue discovered by Michael Frank and reported to the solr-user
mailing list in this thread...
[https://lists.apache.org/thread.html/%3ccaggv7soucsbhm4+cnhvvtrjxtzbbvpnaxsy-7vsksfpar_a...@mail.gmail.com%3E]
Situation:
* chaos testing of add+commit while randomly bringing nodes up/down
* test client checks rf of every add
** commit does not support rf
* after adding a doc (and confirming expected rf) + committing, it's possible to
issue a search that gets back a "stale" version of the doc
Analysis by Michael...
{quote}
We traced the problem down to DistributedUpdateProcessor.doLocalCommit()
which is *silently* dropping all commits while the replica is currently
inactive and replaying; it immediately returns and still reports status=0.
...
The issue we have is the "silent" part. If upon receiving a commit request
the replica
* would either wait to become healthy and then commit and return,
honoring waitSearcher=true (which is what we expected from reading the
documentation)
* or at least behave consistently the same way as all other
UpdateRequests and report back the achieved replication factor with the
"rf" response parameter
we could easily detect the degraded cluster state in the client and keep
re-trying the commit till "rf" matches the number of replicas.
{quote}
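The client-side retry Michael describes could look roughly like the sketch below. It assumes a commit response that carries an achieved "rf" value, which is exactly what commits do not provide today according to this issue. CommitRetry, commitUntilReplicated, and the IntSupplier standing in for "send a commit and return the reported rf" are hypothetical names for illustration, not SolrJ API.

```java
import java.util.function.IntSupplier;

/** Hypothetical client-side helper; not part of Solr or SolrJ. */
final class CommitRetry {

    /**
     * Re-issues a commit until the reported rf reaches the expected replica count.
     *
     * @param sendCommit       hypothetical callback: sends one commit and returns the
     *                         achieved replication factor reported in the response
     * @param expectedReplicas number of replicas that must acknowledge the commit
     * @param maxAttempts      upper bound on commit attempts
     * @param backoffMillis    pause between attempts
     * @return the attempt number that succeeded, or -1 if the caller should give up
     */
    static int commitUntilReplicated(IntSupplier sendCommit, int expectedReplicas,
                                     int maxAttempts, long backoffMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            int achievedRf = sendCommit.getAsInt();
            if (achievedRf >= expectedReplicas) {
                return attempt; // every replica applied the commit
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoffMillis); // give recovering replicas time to finish replay
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                    return -1;
                }
            }
        }
        return -1; // cluster stayed degraded for all attempts
    }

    public static void main(String[] args) {
        // Simulated responses: one replica commits first, the others catch up later.
        int[] reportedRf = {1, 2, 3};
        int[] call = {0};
        int attempts = commitUntilReplicated(
                () -> reportedRf[Math.min(call[0]++, reportedRf.length - 1)], 3, 5, 10L);
        System.out.println("attempts needed: " + attempts);
    }
}
```

With the current behavior this loop cannot be written against a real cluster, since a replica that drops the commit during replay still answers status=0 and no rf is reported.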