[
https://issues.apache.org/jira/browse/SOLR-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15982947#comment-15982947
]
Mano Kovacs commented on SOLR-7427:
-----------------------------------
We recently ran into this issue through test timeouts, and I understand the
workaround is to wait 7 seconds before submitting the commit command.
If I understand correctly, the wait exists to handle updates that start
before the forwarding begins but finish after the commit. As far as I
understand, those updates are not forwarded to the recovering replica; they
are written into the next open segment, created after the hard commit, and
are therefore not included in the full replication.
If that is the case, would a custom commit command help? This delayed-commit
would wait until every update that was already in progress when the command
was issued has been written out, and only then perform the commit.
What I am thinking of is based on [[email protected]]'s flow described above:
1. replica tells leader it wants to recover
2. leader starts forwarding updates to replica (which the replica buffers since
it's in recovery)
3. delayed-commit command
3.1 the leader collects the updates that are already running and blocks until
each one finishes (updates that start later are ignored)
3.2 leader executes a hard commit (so replica can replicate the current index)
4. replica starts replicating index from the last leader commit point
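
A minimal sketch of steps 3.1 and 3.2, to make the idea concrete. Everything
here is an assumption for illustration: DelayedCommitTracker, beginUpdate,
endUpdate and delayedCommit are hypothetical names, not existing Solr APIs, and
the sketch assumes every update calls beginUpdate before it touches the index.
The timeout is also my addition, so that a stuck update cannot block recovery
forever.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of Solr): tracks in-flight updates on the leader. */
final class DelayedCommitTracker {
    // One latch per update that has started but not yet finished.
    private final Set<CountDownLatch> inFlight = ConcurrentHashMap.newKeySet();

    /** Call at the very start of an update; pass the returned handle to endUpdate. */
    CountDownLatch beginUpdate() {
        CountDownLatch done = new CountDownLatch(1);
        inFlight.add(done);
        return done;
    }

    /** Call once the update has been fully written to the index. */
    void endUpdate(CountDownLatch done) {
        inFlight.remove(done);
        done.countDown();
    }

    /**
     * Step 3.1: wait for the updates that were running when this was called.
     * Step 3.2: run the hard commit.
     * Returns false without committing if the timeout elapses or the thread
     * is interrupted first.
     */
    boolean delayedCommit(Runnable hardCommit, long timeoutMillis) {
        // Snapshot only the updates already running; later ones are ignored
        // because they are forwarded to the replica anyway (step 2).
        List<CountDownLatch> snapshot = new ArrayList<>(inFlight);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            for (CountDownLatch done : snapshot) {
                long remaining = deadline - System.nanoTime();
                if (!done.await(remaining, TimeUnit.NANOSECONDS)) {
                    return false;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        hardCommit.run();
        return true;
    }
}
```

With something like this, the fixed 7 second sleep would be replaced by a wait
that lasts only as long as the slowest update that was already running, with
the timeout as an upper bound.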
[[email protected]], [[email protected]], could you help me verify this?
> Recovery can miss some updates when they're neither forwarded nor present in
> replicated index
> ---------------------------------------------------------------------------------------------
>
> Key: SOLR-7427
> URL: https://issues.apache.org/jira/browse/SOLR-7427
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Affects Versions: 4.10.4, 5.1
> Reporter: Shalin Shekhar Mangar
> Labels: consistency, difficulty-hard, impact-high
>
> According to discussion in SOLR-7141. See [[email protected]]'s comment at
> https://issues.apache.org/jira/browse/SOLR-7141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501622#comment-14501622
> {quote}
> From memory, here's how it's supposed to work:
> 1. replica tells leader it wants to recover
> 2. leader starts forwarding updates to replica (which the replica buffers
> since it's in recovery)
> 3. leader executes a hard commit (so replica can replicate the current index)
> 4. replica starts replicating index from the last leader commit point
> Note that the ordering of #2 and #3 is very important. If we did #3 first
> and then #2 after, some updates won't make it into the commit and also won't
> be forwarded to the replica (and that leads to data loss).
> Now the issue: even though we do #2 first and #3 after... it's possible to
> have an unfortunately scheduled update in a different thread that started
> before we did #2, and doesn't complete until after #3, so that update was not
> forwarded, and it's also not in the replicated index. The sleep (which
> should be between steps #2 and #3) is to try and give time for this update to
> complete and make it into the index.
> It occurs to me that the lucene IndexWriter thread stealing (same issue that
> caused this: SOLR-6820) could make this much more likely than we would have
> thought.
> One possible alternative is to block updates for a commit of this type
> (replication commit). Any blocked updates would need to see that they need
> to be forwarded to the replica too (once they are unblocked) - I don't know
> if the code is currently written that way.
> {quote}
> So there is some protection against such a situation but it is based on two
> timeout values:
> # The replica stalls recovery until the leader acknowledges that it has
> indeed seen the replica in 'recovery' (via the prep recovery core admin API)
> # The replica sleeps for 7 seconds by default (configured via the
> hidden-switch "solr.cloud.wait-for-updates-with-stale-state-pause" system
> property) after prep recovery completes to give additional time for such
> updates to complete.