[
https://issues.apache.org/jira/browse/SOLR-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982983#comment-15982983
]
Yonik Seeley commented on SOLR-7427:
------------------------------------
Yes you need a step 2.5...
after the leader starts forwarding updates, and if it's guaranteed that any
"new" updates will be forwarded to the recovering replica, then grab a list of
the current updates in-flight. Then one could wait until all those updates
make it into the index before doing the hard commit.
We don't currently have a mechanism for knowing all those in-flight updates
though. To make it concurrent and air-tight, an update would need to be added
to that list *before* any check to see if it should be forwarded. And we would
need guarantees that all updates after a certain point would be forwarded.
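
To make the ordering concrete, here is a minimal sketch of such an in-flight
list. It is not existing Solr code: the class InFlightUpdates and its methods
begin, shouldForward, end and startForwardingAndAwait are hypothetical names
made up for illustration. The point it shows is the ordering: an update
registers itself *before* it checks the forwarding flag, and recovery sets the
flag *before* it snapshots the list, so every update is either forwarded or
waited for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not part of Solr.
class InFlightUpdates {
    // One latch per update that has started but not yet reached the index.
    private final ConcurrentHashMap<Long, CountDownLatch> inFlight = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();
    private volatile boolean forwarding = false;

    // Update thread, first step: register BEFORE checking shouldForward().
    long begin() {
        long id = nextId.incrementAndGet();
        inFlight.put(id, new CountDownLatch(1));
        return id;
    }

    // Update thread, second step: must only be called after begin().
    boolean shouldForward() {
        return forwarding;
    }

    // Update thread, last step: the update is now in the index.
    void end(long id) {
        CountDownLatch latch = inFlight.remove(id);
        if (latch != null) {
            latch.countDown();
        }
    }

    // Recovery thread ("step 2.5"): turn forwarding on, snapshot the updates
    // still in flight, and wait for them. Returns true if all of them
    // finished in time, i.e. the hard commit is safe to issue.
    boolean startForwardingAndAwait(long timeoutMillis) {
        forwarding = true;
        List<CountDownLatch> snapshot = new ArrayList<>(inFlight.values());
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            for (CountDownLatch latch : snapshot) {
                long remaining = deadline - System.nanoTime();
                if (!latch.await(remaining, TimeUnit.NANOSECONDS)) {
                    return false; // an older update is still running
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

An update that registered after the snapshot is not waited for, but it is
guaranteed to see forwarding == true, so it reaches the replica through
forwarding instead. An update that saw forwarding == false must have registered
before the flag was set, so it is in the snapshot.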
> Recovery can miss some updates when they're neither forwarded nor present in
> replicated index
> ---------------------------------------------------------------------------------------------
>
> Key: SOLR-7427
> URL: https://issues.apache.org/jira/browse/SOLR-7427
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Affects Versions: 4.10.4, 5.1
> Reporter: Shalin Shekhar Mangar
> Labels: consistency, difficulty-hard, impact-high
>
> According to discussion in SOLR-7141. See [[email protected]]'s comment at
> https://issues.apache.org/jira/browse/SOLR-7141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501622#comment-14501622
> {quote}
> From memory, here's how it's supposed to work:
> 1. replica tells leader it wants to recover
> 2. leader starts forwarding updates to replica (which the replica buffers
> since it's in recovery)
> 3. leader executes a hard commit (so replica can replicate the current index)
> 4. replica starts replicating index from the last leader commit point
> Note that the ordering of #2 and #3 is very important. If we did #3 first
> and then #2 after, some updates won't make it into the commit and also won't
> be forwarded to the replica (and that leads to data loss).
> Now the issue: even though we do #2 first and #3 after... it's possible to
> have an unfortunately scheduled update in a different thread that started
> before we did #2, and doesn't complete until after #3, so that update was not
> forwarded, and it's also not in the replicated index. The sleep (which
> should be between steps #2 and #3) is to try and give time for this update to
> complete and make it into the index.
> It occurs to me that the lucene IndexWriter thread stealing (same issue that
> caused this: SOLR-6820) could make this much more likely than we would have
> thought.
> One possible alternative is to block updates for a commit of this type
> (replication commit). Any blocked updates would need to see that they need
> to be forwarded to the replica too (once they are unblocked) - I don't know
> if the code is currently written that way.
> {quote}
> So there is some protection against such a situation but it is based on two
> timeout values:
> # The replica stalls recovery until the leader acknowledges that it has
> indeed seen the replica in 'recovery' (via the prep recovery core admin API)
> # The replica sleeps for 7 seconds by default (configured via the
> hidden-switch "solr.cloud.wait-for-updates-with-stale-state-pause" system
> property) after prep recovery completes to give additional time for such
> updates to complete.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)