So the way I implemented this in PouchDB, following Paul Davis's advice, is to stem the trees by turning the revision tree into a set of revision lists, one for each individual path from head to root, then stemming each list to the rev limit, then merging the lists back into a single tree.
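
To make those three steps concrete, here is a minimal sketch in Python. It is not the actual PouchDB or CouchDB code. The representation (a dict mapping each revision id to its parent) and the names `RevTree`, `leaf_paths` and `stem` are assumptions made up for illustration, and the sketch assumes every parent id is itself a key in the dict.

```python
from typing import Dict, List, Optional

# Hypothetical representation: revision id -> parent revision id (None for a root).
RevTree = Dict[str, Optional[str]]


def leaf_paths(tree: RevTree) -> List[List[str]]:
    """Return one path per leaf, ordered from the leaf (head) back to its root."""
    parents = {p for p in tree.values() if p is not None}
    paths = []
    for rev in tree:
        if rev in parents:
            continue  # something descends from this revision, so it is not a leaf
        path = []
        node: Optional[str] = rev
        while node is not None:
            path.append(node)
            node = tree[node]
        paths.append(path)
    return paths


def stem(tree: RevTree, limit: int) -> RevTree:
    """Cut every head-to-root path to `limit` revisions, then merge the paths."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    merged: RevTree = {}
    for path in leaf_paths(tree):
        kept = path[:limit]  # the `limit` revisions nearest the head
        for child, parent in zip(kept, kept[1:]):
            merged[child] = parent
        # The oldest kept revision becomes a root, unless another path
        # still retains its parent.
        merged.setdefault(kept[-1], None)
    return merged


tree = {
    "1-a": None,
    "2-b": "1-a",
    "3-c": "2-b",
    "4-d": "3-c",
    "5-e": "4-d",
    "4-x": "3-c",  # conflicting branch off 3-c
}
stemmed = stem(tree, 3)
```

With a limit of 3, only `1-a` is dropped here: `2-b` survives because it is still within three revisions of the `4-x` head. In this sketch a branch shorter than the limit is never shortened at all, whatever happens on the other branches.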

I worked off a refactor branch of couch that never got merged, but from a glance it looks like this is how it is done in https://github.com/apache/couchdb/blob/master/src/couchdb/couch_key_tree.erl

I keep meaning to write a test to reproduce this, but I am fairly certain this has a problem: a document that generates a lot of conflicts (by being deleted and recreated continuously) can DoS CouchDB, as shallow branches never get pruned. I may be missing something, though.

On 31 August 2013 19:05, Robert Newson <[email protected]> wrote:
> The best I can find right now is from couch_key_tree where the
> truncation occurs;
>
> %% What makes this a bit more complicated is that there is a limit to the
> %% number of revisions kept, specified in couch_db.hrl (default is 1000). When
> %% this limit is exceeded only the last 1000 are kept. This comes in to play
> %% when branches are merged. The comparison has to begin at the same place in
> %% the branches. A revision id is of the form N-XXXXXXX where N is the current
> %% revision. So each path will have a start number, calculated in
> %% couch_doc:to_path using the formula N - length(RevIds) + 1 So, .eg. if a doc
> %% was edit 1003 times this start number would be 4, indicating that 3
> %% revisions were truncated.
> %%
> %% This comes into play in @see merge_at/3 which recursively walks down one
> %% tree or the other until they begin at the same revision.
>
>
> On 31 August 2013 19:02, Jens Alfke <[email protected]> wrote:
> > The only description I can find about revs_limit is "the maximum number
> > of document revisions that will be tracked by CouchDB, even after
> > compaction has occurred." Nothing I've been able to find online says which
> > revisions are thrown out to reach this limit — it could be the oldest ones,
> > or the ones most deeply buried, for example.
> >
> > I’m guessing it’s most likely the oldest [earliest added] revisions, but
> > it’s not always clear what those are. For example, if a document with a big
> > rev tree gets replicated into this database, all of its revisions are the
> > same age as far as the local db is concerned, because they all got added in
> > the same PUT operation.
> >
> > Anyone know for sure?
> >
> > —Jens
