GitHub user nickva opened a pull request:
https://github.com/apache/couchdb-couch-replicator/pull/52
Use mem3 to discover all _replicator shards in replicator manager
Previously this was done via recursive db directory traversal, looking for
shard names ending in `_replicator`. However, if there are orphaned shard
files (not associated with a clustered db), the replicator manager crashes.
It restarts eventually, but as long as an orphaned shard file without an
entry in the dbs db is present on the file system, the replicator manager
will keep crashing and never reach replication documents in shards that
would be traversed after the problematic shard. The user-visible effect of
this is that some replication documents are never triggered.
To fix this, use mem3 to traverse and discover `_replicator` shards. This
approach was used in Cloudant's production code for many years; it is
battle-tested and does not suffer from file system vs. mem3 inconsistency.
The local `_replicator` db is a special case. Since it is not clustered, it
will not appear in the clustered db list. However, it is already handled as
a special case in `init(_)`, so that behavior is not affected by this change.
COUCHDB-3277
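For readers unfamiliar with the naming convention involved: CouchDB treats the default `_replicator` database, and any database whose name ends in `/_replicator`, as a replication-document store. A minimal sketch of that rule follows (Python here for brevity; the actual manager is Erlang, and the helper name is hypothetical, not part of this patch):

```python
def is_replicator_db(db_name: str) -> bool:
    """Return True for the default _replicator db or any clustered db
    whose name ends in /_replicator (e.g. "other/_replicator")."""
    return db_name == "_replicator" or db_name.endswith("/_replicator")

# With this fix, the candidate names come from mem3's clustered db list
# rather than a file-system walk, so orphaned shard files on disk that
# have no entry in the dbs db are simply never visited.
```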
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cloudant/couchdb-couch-replicator couchdb-3277
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/couchdb-couch-replicator/pull/52.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #52
----
commit 8205420d4249cea98ec5568344c43ccf11bbc9b1
Author: Nick Vatamaniuc <[email protected]>
Date: 2017-01-24T05:35:32Z
Use mem3 to discover all _replicator shards in replicator manager
Previously this was done via recursive db directory traversal, looking for
shard names ending in `_replicator`. However, if there are orphaned shard
files (not associated with a clustered db), the replicator manager crashes.
It restarts eventually, but as long as an orphaned shard file without an
entry in the dbs db is present on the file system, the replicator manager
will keep crashing and never reach replication documents in shards that
would be traversed after the problematic shard. The user-visible effect of
this is that some replication documents are never triggered.
To fix this, use mem3 to traverse and discover `_replicator` shards. This
approach was used in Cloudant's production code for many years; it is
battle-tested and does not suffer from file system vs. mem3 inconsistency.
The local `_replicator` db is a special case. Since it is not clustered, it
will not appear in the clustered db list. However, it is already handled as
a special case in `init(_)`, so that behavior is not affected by this change.
COUCHDB-3277
----