I'll double-check my situation since I have not thoroughly verified it. This particular issue occurs between restarts of the server, where I make no changes to the continuous replications in the _replicator DB, but it may also be related to the issue of too many continuous replications causing replications to stall out from lack of resources. It's possible that I assumed they were starting over from seq 1 when in fact they were never able to complete a full replication in the first place.
-- 
Paul Okstad

> On May 26, 2016, at 2:51 AM, Robert Newson <[email protected]> wrote:
> 
> There must be something else wrong. Filtered replications definitely make and
> resume from checkpoints, same as unfiltered.
> 
> We mix the filter code and parameters into the replication checkpoint id to
> ensure we start from 0 for a potentially different filtering. Perhaps you are
> changing those? Or maybe supplying since_seq as well (which overrides the
> checkpoint)?
> 
> Sent from my iPhone
> 
>> On 25 May 2016, at 16:39, Paul Okstad <[email protected]> wrote:
>> 
>> This isn’t just a problem of filtered replication, it’s a major issue in the
>> database-per-user strategy (at least in the v1.6.1 I’m using). I’m also
>> using a database-per-user design with thousands of users and a single global
>> database. If a small fraction of the users (hundreds) have continuously
>> ongoing replications from the user DB to the global DB, it will cause
>> extremely high CPU utilization. This is without any JavaScript replication
>> filter function.
>> 
>> Another huge issue with filtered replications is that they lose their place
>> when replications are restarted. In other words, they don’t keep track of
>> the sequence ID between restarts of the server or stopping and starting the
>> same replication. So for example, if I want to perform filtered replication
>> of public documents from the global DB to the public DB, and I have a ton of
>> documents in global, then each time I restart the filtered replication it
>> will begin from sequence #1. I’m guessing this is due to the fact that
>> CouchDB does not know if the filter function has been modified between
>> replications, but this behavior is still very disappointing.
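[Editor's note: the checkpoint behaviour described above can be sketched as follows. This is an illustrative model, not CouchDB's actual algorithm; the function name, the use of MD5, and the key layout are assumptions made for the example. The point it demonstrates is the one Robert states: the checkpoint id is derived from the endpoints plus the filter code and its parameters, so changing either produces a new checkpoint and the replication restarts from sequence 0.]

```python
import hashlib
import json

def checkpoint_id(source, target, filter_code=None, query_params=None):
    # Illustrative only: derive a stable id from everything that defines
    # the replication, including the filter and its parameters. Any change
    # to these inputs yields a different id, i.e. a fresh checkpoint.
    key = json.dumps([source, target, filter_code, query_params],
                     sort_keys=True)
    return hashlib.md5(key.encode("utf-8")).hexdigest()

same_a = checkpoint_id("global", "public", "function(doc){ return doc.public; }")
same_b = checkpoint_id("global", "public", "function(doc){ return doc.public; }")
changed = checkpoint_id("global", "public", "function(doc){ return doc.public; }",
                        {"tag": "news"})

assert same_a == same_b   # identical filter -> same checkpoint id, resume works
assert same_a != changed  # changed params -> new checkpoint id, restart from 0
```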
>> 
>> — 
>> Paul Okstad
>> http://pokstad.com <http://pokstad.com/>
>> 
>> 
>> 
>>> On May 25, 2016, at 4:25 AM, Stefan Klein <[email protected]> wrote:
>>> 
>>> 2016-05-25 12:48 GMT+02:00 Stefan du Fresne <[email protected]>:
>>> 
>>>> So to be clear, this is effectively replacing replication— where the
>>>> client negotiates with the server for a collection of changes to download—
>>>> with a daemon that builds up a collection of documents that each client
>>>> should get (and also presumably delete), which clients can then query for
>>>> when they’re able?
>>> 
>>> Sorry, I didn't describe it well enough.
>>> 
>>> On the server side we have one big database containing all documents and
>>> one DB for each user.
>>> The clients always replicate to and from their individual user DB,
>>> unfiltered. So the DB for a user is a 1:1 copy of their pouchdb/... on
>>> their client.
>>> 
>>> Initially we set up a filtered replication for each user from the server's
>>> main database to the server copy of the user's database.
>>> With this we ran into performance problems, and sooner or later we probably
>>> would have run into issues with open file descriptors.
>>> 
>>> So what we do instead is listen to the changes feed of the main database
>>> and distribute the documents to the per-user DBs on the server, which are
>>> then synced with the clients.
>>> 
>>> Note: this is only for documents the users actually work with (as in
>>> possibly modify); for queries on the data we query views on the main
>>> database.
>>> 
>>> For the way back, we listen to _db_updates, so we get an event for changes
>>> on the user DBs, fetch that change from the user's DB, and determine what
>>> to do with it.
>>> We do not replicate users' changes back to the main database but rather
>>> have an internal API to evaluate all kinds of constraints on user input.
>>> If you do not have to check user input, you could certainly listen to
>>> _db_updates and "blindly" one-shot replicate from the changed DB to your
>>> main DB.
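[Editor's note: the fan-out approach Stefan describes can be sketched as below. The helper name `route_change` and the `users_for_doc` routing rule are hypothetical; the "databases" are modeled as in-memory dicts so the sketch is self-contained. The real version would consume the main DB's changes feed (with docs included) and write into the per-user CouchDB databases instead of one filtered replication per user.]

```python
def route_change(change, user_dbs, users_for_doc):
    """Copy a changed document into each interested user's server-side DB.

    change        -- one entry from the main DB's changes feed, with the
                     document included (e.g. via include_docs=true)
    user_dbs      -- user id -> that user's database, modeled here as a
                     dict of doc id -> doc
    users_for_doc -- application-specific rule deciding which users
                     should receive the document
    """
    doc = change["doc"]
    for user in users_for_doc(doc):
        user_dbs.setdefault(user, {})[doc["_id"]] = doc

# Toy run against in-memory "databases":
user_dbs = {}
route_change({"doc": {"_id": "task:1", "owner": "alice", "text": "hi"}},
             user_dbs,
             lambda doc: [doc["owner"]])
assert "task:1" in user_dbs["alice"]
```

One daemon reading a single changes feed replaces hundreds of concurrent filtered replications, which is where the CPU and file-descriptor pressure came from.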
>>> 
>>> -- 
>>> Stefan
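[Editor's note: Stefan's closing suggestion, reacting to a database-update event by triggering a one-shot replication back to the main DB, can be sketched as follows. The helper name and event shape are illustrative; the request body matches what CouchDB's /_replicate endpoint accepts, and the actual HTTP POST is left out so the sketch stays self-contained.]

```python
import json

def one_shot_replication_body(event, main_db="main"):
    # `event` models one entry from CouchDB's /_db_updates feed,
    # e.g. {"db_name": "userdb-alice", "type": "updated"}.
    return json.dumps({
        "source": event["db_name"],
        "target": main_db,
        # Omitting "continuous": true makes this a one-shot replication:
        # it runs to the current sequence and exits, instead of holding
        # resources open for every user indefinitely.
    })

body = one_shot_replication_body({"db_name": "userdb-alice", "type": "updated"})
parsed = json.loads(body)
assert parsed == {"source": "userdb-alice", "target": "main"}
```

POSTing such a body to /_replicate on each update event keeps the reverse direction cheap: replications only run when a user DB actually changes.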
