Robert Newson wrote: >> CouchDB *must* write an updated btree and an updated header to point to the root of that btree every time you update a document, or it will be lost if couch crashed right then. <<
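To check that I am reading the quoted statement correctly, here is a toy sketch in Python of an append-only file where every update is followed by a fresh header. This is not CouchDB's actual on-disk format. The record layout, the `HDR1` marker and the function names are all made up for illustration. The header points straight at the document instead of at a btree root, and everything goes into a single in-memory "file".

```python
import json
import struct
import zlib

MAGIC = b"HDR1"                   # hypothetical header marker
HEADER = struct.Struct(">4sQI")   # marker, root offset, CRC32 of the offset


def append_update(log: bytearray, doc: dict) -> int:
    """Append a document, then a header pointing at it. Returns the doc offset."""
    body = json.dumps(doc).encode("utf-8")
    offset = len(log)
    log += struct.pack(">I", len(body)) + body
    # A real store would append the changed btree nodes here and point the
    # header at the new root; this toy points at the document itself.
    root = struct.pack(">Q", offset)
    log += MAGIC + root + struct.pack(">I", zlib.crc32(root))
    return offset


def find_last_header(log: bytearray):
    """Scan backwards for the last intact header. Returns its offset or None."""
    pos = len(log) - HEADER.size
    while pos >= 0:
        magic, root, crc = HEADER.unpack_from(log, pos)
        if magic == MAGIC and zlib.crc32(struct.pack(">Q", root)) == crc:
            return root
        pos -= 1
    return None


def read_doc(log: bytearray, offset: int) -> dict:
    """Read the length-prefixed JSON document stored at the given offset."""
    (length,) = struct.unpack_from(">I", log, offset)
    start = offset + 4
    return json.loads(bytes(log[start:start + length]).decode("utf-8"))
```

In this toy version only the last intact header is ever consulted on recovery. Every earlier header stays in the file as dead weight, and a header that was only partly written before a crash is simply skipped by the backwards scan.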
So we have these three pieces of information that need to be written with every update of a document:

1) the btree
2) the updated header that points to the root of the btree
3) the actual JSON document itself

If all three of these pieces are written to the same physical disk file, then I will respectfully bail out, as the rest of my question would not make much sense, or at least not without major restructuring.

However, if (1) the btree is in a file of its own, and (2) the updated header and (3) the actual JSON document are written to the same file, then:

a) How many of the updated headers are actually useful? Is it just the last successfully written one, or perhaps the last few?

b) If only the last or last few headers are actually useful, could those updated headers not be kept in a separate (perhaps preformatted) file, where the header records themselves are reused (perhaps in a ring or some other fashion)?

c) If (a) and (b) make any sense, would one not end up with a perfectly compacted DB, at least for logging-type use cases, where only new records are created and existing ones are never updated or deleted?

d) While (c) might sound like a contrived "use case", I am asking mostly to determine what, in addition to dead old revisions and deleted docs, is adding to the "bulkiness" of disk usage. In other words, are those "updated headers" one of the major contributing factors (if not the only one), and could that be remedied?

Thanks again and regards to everyone,
teslan
