[jira] [Commented] (SOLR-8372) Canceled recovery can lead to data loss

Shalin Shekhar Mangar (JIRA) Thu, 10 Dec 2015 09:15:42 -0800

    [ https://issues.apache.org/jira/browse/SOLR-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051267#comment-15051267 ]


Shalin Shekhar Mangar commented on SOLR-8372:
---------------------------------------------

Thanks for explaining that, Yonik. Yes, resetting the start position does seem 
like the right thing to do.

bq. Shard splitting does use it.

Yes, the sub-shard leader sets its update log to buffering mode during core 
creation. The sub-shard leader has no one to recover from, so recovery is a 
no-op and shouldn't be impacted by this change.

bq. There is also a general Solr admin API that start buffering. Looks like 
migrate uses that.

Yes, that is used by migrate to enable buffering on the target collection's 
leader. After this call succeeds, a routing rule is added to the source 
collection which starts sending all relevant updates to the target collection.

bq. For both of them, I'm curious where the replay or drop happens. Seems a 
little hairy to rely on the RecoveryStrategy replay or drop. (Though I guess it 
seems a whole lot safer now that we always buffer docs in recovery, even if 
peer sync works.)

Both use the REQUESTAPPLYUPDATES core admin action to apply buffered updates. 
RecoveryStrategy is not used here at all. If a migrate action fails, the target 
collection is still left in the buffering state, which shouldn't pose a problem 
if migrate is retried.
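
For illustration, here is a rough sketch of what such a core admin request 
could look like. It only builds the URL and does not send anything. The base 
URL and core name are made-up placeholders, and the use of the "name" parameter 
to select the core is my assumption about the core admin API, so please check 
it against the Solr version in use.

```python
from urllib.parse import urlencode


def core_admin_url(base_url, action, core_name):
    """Build a core admin request URL (nothing is sent over the network).

    Assumption: the core is selected with the "name" parameter.
    """
    query = urlencode({"action": action, "name": core_name})
    return f"{base_url}/admin/cores?{query}"


# Hypothetical host and core name, for illustration only.
url = core_admin_url(
    "http://localhost:8983/solr",
    "REQUESTAPPLYUPDATES",
    "target_shard1_replica1",
)
print(url)
```

Since the caller issues this request explicitly, the replay does not depend on 
RecoveryStrategy deciding to replay or drop.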

> Canceled recovery can lead to data loss
> ---------------------------------------
>
>                 Key: SOLR-8372
>                 URL: https://issues.apache.org/jira/browse/SOLR-8372
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Yonik Seeley
>
> A recovery via index replication tells the update log to start buffering 
> updates.  If that recovery is canceled for whatever reason by the replica, 
> the RecoveryStrategy calls ulog.dropBufferedUpdates() which stops buffering 
> and places the UpdateLog back in active mode.  If updates come from the 
> leader after this point (and before RecoveryStrategy retries recovery), 
> the update will be processed as normal and added to the transaction log. If 
> the server is bounced, those last updates to the transaction log look normal 
> (no FLAG_GAP) and can be used to determine who is more up to date. 
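
The sequence in the description can be sketched with a toy model. This is not 
Solr code: ToyUpdateLog, its methods, and the way FLAG_GAP is represented are 
all hypothetical simplifications (in particular, dropping buffered updates is 
modeled as simply removing the flagged entries). It only shows why the tail of 
the transaction log looks normal after a canceled recovery.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    BUFFERING = "buffering"


FLAG_GAP = "FLAG_GAP"  # stand-in for the real transaction log flag


class ToyUpdateLog:
    """Hypothetical, heavily simplified model of an update log."""

    def __init__(self):
        self.state = State.ACTIVE
        self.tlog = []  # list of (version, flags) entries

    def buffer_updates(self):
        self.state = State.BUFFERING

    def drop_buffered_updates(self):
        # Simplification: discard flagged entries, then return to active mode.
        self.tlog = [entry for entry in self.tlog if FLAG_GAP not in entry[1]]
        self.state = State.ACTIVE

    def add(self, version):
        # Updates that arrive while buffering are marked with FLAG_GAP.
        flags = {FLAG_GAP} if self.state is State.BUFFERING else set()
        self.tlog.append((version, flags))

    def tail_looks_normal(self):
        # True when the newest entry carries no FLAG_GAP.
        return bool(self.tlog) and FLAG_GAP not in self.tlog[-1][1]


def canceled_recovery_scenario():
    ulog = ToyUpdateLog()
    ulog.add(1)                   # normal update before recovery
    ulog.buffer_updates()         # replication recovery starts
    ulog.add(2)                   # buffered, flagged with FLAG_GAP
    ulog.drop_buffered_updates()  # recovery canceled, log is active again
    ulog.add(3)                   # arrives before the retry, logged as normal
    return ulog


ulog = canceled_recovery_scenario()
print([version for version, _ in ulog.tlog], ulog.tail_looks_normal())
```

In this model the replica never finished recovering, yet its newest log entry 
carries no FLAG_GAP, so after a restart nothing in the log distinguishes it 
from a replica that is fully up to date.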


