[jira] [Created] (COUCHDB-2965) Race condition in replicator rescan logic

Nick Vatamaniuc (JIRA) Mon, 07 Mar 2016 08:32:59 -0800

Nick Vatamaniuc created COUCHDB-2965:
----------------------------------------

Summary: Race condition in replicator rescan logic
Key: COUCHDB-2965
URL: https://issues.apache.org/jira/browse/COUCHDB-2965
Project: CouchDB
Issue Type: Bug
Components: Replication
Reporter: Nick Vatamaniuc

There is a race condition between the full rescan and regular change feed
processing in the couch_replicator_manager code.

This race condition can leave replication docs in an untriggered state
when a rescan of all the docs is performed. The rescan might happen when nodes
connect and disconnect. The likelihood of hitting this race condition goes up if
a lot of documents are updated and there is a backlog of messages in the
replicator manager's mailbox.

The race condition happens in the following way:

* A full rescan is initiated here:

https://github.com/apache/couchdb-couch-replicator/blob/master/src/couch_replicator_manager.erl#L424

It clears the db_to_seq ets table, which holds the latest change sequence for
each replicator database, then launches a scan_all_dbs process.

* scan_all_dbs will find all databases that look like replicator databases and,
for each one, send a {resume_scan, DbName} message to the main
couch_replicator_manager process.

* {resume_scan, DbName} message is handled here:

https://github.com/apache/couchdb-couch-replicator/blob/master/src/couch_replicator_manager.erl#L233

The expectation is that, because db_to_seq was reset, the handler does not find
a sequence checkpoint in db_to_seq, so it starts from 0 and spawns a new change
feed, which will rescan all documents (since we need to determine ownership for
them).
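
To make the intended path concrete, here is a minimal sketch in Python. It is
only a toy model, not the Erlang code: a dict stands in for the db_to_seq ets
table, and the function names start_rescan and resume_scan are made up for
illustration.

```python
# Toy model of the intended rescan path (NOT the real Erlang code).
# The dict stands in for the db_to_seq ets table.
db_to_seq = {"_replicator": 1042}


def start_rescan(table):
    """Hypothetical stand-in for the full rescan: drop every checkpoint."""
    table.clear()


def resume_scan(table, db_name):
    """Hypothetical stand-in for the resume_scan handler.

    Returns the sequence the new change feed would start from.
    A missing entry means "no checkpoint", so the feed starts from 0.
    """
    return table.get(db_name, 0)


start_rescan(db_to_seq)
since = resume_scan(db_to_seq, "_replicator")
print(since)  # 0: every document in the database is rescanned
```

As long as nothing writes to db_to_seq between the reset and the resume_scan
handling, the feed starts from 0 as expected.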

But the race condition occurs because when change feeds stop, they call the
replicator manager with a {rep_db_checkpoint, DbName} message. That updates the
db_to_seq ets table with the latest change sequence.

https://github.com/apache/couchdb-couch-replicator/blob/master/src/couch_replicator_manager.erl#L225

Which means this sequence of operations could happen:

* db_to_seq is reset to 0, scan_all_dbs is spawned

* a change feed stops at sequence 1042 and calls {rep_db_checkpoint,
<<"_replicator">>}

* the {rep_db_checkpoint, <<"_replicator">>} call is handled, so the latest
db_to_seq entry for _replicator is now 1042

* {resume_scan, <<"_replicator">>} is sent from the scan_all_dbs process

* {resume_scan, <<"_replicator">>} is received by the replicator manager. It
sees that db_to_seq has _replicator with latest sequence 1042, so it starts
from that instead of 0, thus skipping updates from 0 to 1042.
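
The interleaving above can be replayed deterministically with a small Python
model. This is a sketch under assumptions, not the actual implementation: the
ManagerModel class and simulate function are hypothetical, the mailbox is a
plain queue handled in arrival order, and the checkpoint message carries the
sequence number directly so the model stays self-contained.

```python
from collections import deque


class ManagerModel:
    """Hypothetical toy stand-in for the replicator manager process."""

    def __init__(self):
        self.db_to_seq = {}     # models the db_to_seq ets table
        self.mailbox = deque()  # messages are handled in arrival order
        self.feed_starts = []   # (db_name, since) for each feed spawned

    def begin_rescan(self):
        # The full rescan clears all checkpoints right away.
        self.db_to_seq.clear()

    def send(self, message):
        self.mailbox.append(message)

    def drain(self):
        # Handle every queued message, oldest first.
        while self.mailbox:
            message = self.mailbox.popleft()
            if message[0] == "rep_db_checkpoint":
                _, db_name, seq = message
                self.db_to_seq[db_name] = seq
            elif message[0] == "resume_scan":
                _, db_name = message
                since = self.db_to_seq.get(db_name, 0)
                self.feed_starts.append((db_name, since))


def simulate(checkpoint_arrives_first):
    """Return the sequence the rescan feed starts from for _replicator."""
    manager = ManagerModel()
    manager.db_to_seq["_replicator"] = 1042  # state before the rescan
    manager.begin_rescan()
    checkpoint = ("rep_db_checkpoint", "_replicator", 1042)
    resume = ("resume_scan", "_replicator")
    if checkpoint_arrives_first:
        order = [checkpoint, resume]
    else:
        order = [resume, checkpoint]
    for message in order:
        manager.send(message)
    manager.drain()
    return manager.feed_starts[0][1]


print(simulate(checkpoint_arrives_first=False))  # 0: full rescan, as intended
print(simulate(checkpoint_arrives_first=True))   # 1042: updates 0..1042 skipped
```

The only difference between the two runs is which message reaches the mailbox
first, which is why a backed-up mailbox makes the bad ordering more likely.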

This was seen by running an experiment in which 1000 replication documents were
being updated. Around document 700 or so, node1 was killed (pkill -f node1).
node2 experienced the race condition on rescan and never picked up a number of
documents that should have belonged to it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
