nickva commented on issue #4385:
URL: https://github.com/apache/couchdb/issues/4385#issuecomment-1397565505
In dock.zip I noticed local.ini wasn't a text file but some kind of binary.
```
% cat local.ini
�E�c'K�cnO�cuxUT
```
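For comparison, a healthy local.ini should read as plain-text ini. Something roughly like this (the settings below are just illustrative placeholders, not what your file needs to contain):
```
[chttpd]
port = 5984
bind_address = 0.0.0.0
```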
One thing to pay attention to is whether the replication ID changes. Based on your
start_log, I see that the checkpoint was *not* found for the catalogues endpoint:
```
[notice] 2023-01-19T13:30:16.056403Z nonode@nohost <0.1150.0> 34af135719
couch1:5984 192.168.100.227 admin GET /catalogues/ 200 ok 2
[notice] 2023-01-19T13:30:16.131009Z nonode@nohost <0.1150.0> fb7b32c9cd
couch1:5984 192.168.100.227 admin GET
/catalogues/_local/475a01ff4762aae18390232479a85acd 404 ok 16
[notice] 2023-01-19T13:30:16.133231Z nonode@nohost <0.1150.0> ca52dcf06f
couch1:5984 192.168.100.227 admin GET
/catalogues/_local/21d33a0db3bd438bc2fe58ef3e64a1a7 404 ok 2
[notice] 2023-01-19T13:30:16.171877Z nonode@nohost <0.1150.0> 9adcacce61
couch1:5984 192.168.100.227 admin GET
/catalogues/_local/9dea40c7f506f19799311d927b321449 404 ok 38
[notice] 2023-01-19T13:30:16.173819Z nonode@nohost <0.1150.0> 0ab11d84ae
couch1:5984 192.168.100.227 admin GET
/catalogues/_local/294224beca58e21bde9e0a5676df7d05 404 ok 1
```
Notice the 404s on the `_local/$replicationid` docs.
I'm not sure about lotimages_new, as the logs start after "Starting
replication..." already.
So what may be happening is that your replication IDs change inadvertently when
you update docker configs. If replication IDs change, previous checkpoints won't
be found, and replication will rewind from 0. Now, if your source, target and
other replication parameters stay the same, it's most likely the server uuid
`[couchdb] uuid = ...` value that isn't consistent. The setting is described
[here](https://docs.couchdb.org/en/stable/config/couchdb.html#couchdb/uuid),
and here is the description of the replication ID generation algorithm:
https://docs.couchdb.org/en/stable/replication/protocol.html#generate-replication-id
If the uuid is not specified, a random value is generated, and that would
cause your replication IDs to be random every time you spin up docker
containers, unless you persist your config or explicitly set `[couchdb] uuid = ...`.
That value doesn't have to be a proper UUID. You could use a hostname or
some other identifier that uniquely identifies the same "cluster". In addition,
make sure it's set to the same value on all the nodes in the cluster. If you
have 3 nodes (couch1, couch2, couch3), ensure the uuid is the same on all of them.
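As a sketch, in local.ini on every node it could look something like this (the value itself is just a placeholder; any stable identifier works):
```
[couchdb]
; Use the same value on couch1, couch2 and couch3, and make sure this file
; is persisted across container rebuilds.
uuid = my-couch-cluster-1
```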
Checkpoints are persisted on both source and target endpoints (databases) in
`_local/$base_replication_id` docs. A replication will only resume from a
checkpoint if it can find that checkpoint on *both* the source and the target. So
in your logs you could monitor those 404s; eventually one of the checkpoint
lookups should succeed with a 200 response. A few 404s are usually expected,
as we try to load older versions of the replication ID since the algorithm used
to generate it has evolved at least 4 times. Then monitor whether the replication
ID values stay the same or change.
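One way to check this by hand is to ask both endpoints for the checkpoint doc directly. A rough sketch (the hosts, credentials and replication ID below are placeholders taken from your logs; substitute your own, and use the actual other endpoint in place of couch2):
```
# 404 = no checkpoint on that endpoint (replication will rewind), 200 = checkpoint found
curl -u admin:password http://couch1:5984/catalogues/_local/475a01ff4762aae18390232479a85acd
curl -u admin:password http://couch2:5984/catalogues/_local/475a01ff4762aae18390232479a85acd
```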