Anyway, I think the broader point is that compaction is for compacting databases (removing old document revisions), and replication is for making a copy of a database (or a subset of one). If compaction is causing downtime, that is a separate bug to talk about; compaction should be totally transparent.
Jens (incidentally, it's nice to talk with you again): the compactor will notice that it has not caught up yet, and it will run again from the old "end" to the real end. Of course, there may be changes during that run too, so it will repeat. Usually each iteration has a much, much smaller window. In practice, you tend to see one "not caught up" message in the logs, and then it's done.

However, there is a pathological situation where you are updating faster than the compactor can run, and you will get an infinite loop (plus very heavy I/O and filesystem waste, as the compactor is basically duplicating your .couch into a .couch.compact forever).

On Sat, Feb 1, 2014 at 12:59 AM, Jens Alfke wrote:

> On Jan 31, 2014, at 9:46 AM, Mark Hahn wrote:
>
> > It wouldn't matter if it did. Within the same server linux short-circuits
> > http to make it the same as unix sockets, i.e. very little overhead.
>
> I think you mean it short-circuits TCP :)
> There's extra work involved in HTTP generation & parsing no matter what
> transport you're sending it over. And then the replicator is doing a bunch
> of JSON and multipart generation/parsing.
> Whereas the compactor, I would imagine, is mostly just making raw
> read/write calls while walking the b-tree.
>
> Anyway; this makes me wonder what happens when changes are made to a
> database during compaction. The compaction processes working off of a
> snapshot of the database from the point that it started, so it's not going
> to copy over new changes. Does that mean they get lost, or does the
> compactor have extra smarts to run a second phase where it copies over all
> revs created since the snapshot?
>
> —Jens
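
For what it's worth, the catch-up behaviour described at the top of this message can be illustrated with a toy model. This is only a sketch, not CouchDB code: the function name `simulate_compaction` and its parameters (`initial_updates`, `write_rate`, `copy_rate`, `max_passes`) are all hypothetical, and it assumes constant write and copy rates, which a real server will not have. Each pass copies the updates that existed when the pass started, and the next pass has to cover whatever arrived in the meantime.

```python
def simulate_compaction(initial_updates, write_rate, copy_rate, max_passes=50):
    """Toy model of the compactor's catch-up loop (not CouchDB code).

    initial_updates: updates to copy on the first pass
    write_rate:      new updates arriving per second (assumed constant)
    copy_rate:       updates the compactor copies per second (assumed constant)

    Returns (windows, caught_up), where windows lists the number of
    updates copied on each pass.
    """
    windows = []
    window = initial_updates
    for _ in range(max_passes):
        if window == 0:
            # Nothing arrived during the previous pass: compaction is done.
            return windows, True
        windows.append(window)
        # A pass takes window / copy_rate seconds; the updates written in
        # that time form the next window (integer math, rounded down).
        window = window * write_rate // copy_rate
    return windows, window == 0


# Normal case: writes are much slower than the compactor.
print(simulate_compaction(1_000_000, write_rate=100, copy_rate=10_000))

# Pathological case: writes outpace the compactor, so the window grows.
print(simulate_compaction(1_000, write_rate=12_000, copy_rate=10_000, max_passes=5))
```

In this model the window shrinks by a factor of write_rate / copy_rate on every pass, so it collapses quickly when writes are slower than the compactor and never closes when they are faster.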
