On Sat, Dec 3, 2011 at 11:02 PM, Robert Newson <[email protected]> wrote:
> I can't mention _purge without reminding everyone that it exists only
> for removal of data that should not have been stored in the first
> place (like sensitive passwords, etc). It is not a mechanism to use
> lightly as it breaks eventual consistency, is only lightly tested, and
> will often cause full view rebuilds.

Hi, Bob. Since this is the user list, may I pull this thread into a tangent?

tl;dr = There is one exception, purging very old deleted docs; link to an awesome write-up at the end.

Where I agree:

* Purge is not a mechanism to use lightly
* Purge breaks eventual consistency
* Purge is only lightly tested

Where I sort-of agree:

* Purge will "often" cause full view rebuilds. This is true in the most general case, however it can be virtually eliminated in production by writing careful code (or using a carefully-written library).

Where I disagree:

* Purge exists only for removal of data which should not have been stored in the first place (like sensitive passwords)

Let's break this down.

The easy part is what to do with sensitive passwords. Purge is not delete; purge removes a document from ever having existed in the first place. Like Marty McFly in "The Dating Game":

http://www.youtube.com/watch?v=CC73uxAVfVY

Since purged documents cannot be replicated away, the best thing to do about a sensitive password is to *change it*, so the change can propagate. (Subsequent compaction on the Couch(es) will remove the password from the disk--or at least from the filesystem.)

And speaking of compaction, your "never purge" advice is good; except for deleted documents. Deleted documents never, ever, exit the .couch file. CouchDB is relaxed. I should be able to create and delete documents and expect reasonable post-compaction disk usage. If purge is off-limits, certain usage styles of CouchDB produce ever-growing .couch files, and ever-slowing _all_docs and _changes queries.

One might say, "just make a new database, with filtered replication." Well, that is basically exactly what a purge does:

* It is to be done only rarely, and with care
* Some documents "never existed at all"
* All views are rebuilt

For this reason, I have written a procedure for purging very old deleted documents in a production setting.
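
To make the idea concrete, here is a minimal Python sketch of selecting old tombstones and building a _purge request body. It is not the procedure from the write-up linked below, only an illustration under stated assumptions: the rows come from a _changes query (ideally with style=all_docs, so each row lists all leaf revisions), sequence numbers are plain integers as in CouchDB 1.x, and cutoff_seq is a hypothetical value chosen so that every replica has already replicated past it. The names build_purge_body, post_purge, and cutoff_seq are mine, not CouchDB's.

```python
import json
import urllib.request


def build_purge_body(changes_rows, cutoff_seq):
    """Return {doc_id: [revs]} for deleted docs with seq <= cutoff_seq.

    Assumes integer sequence numbers (CouchDB 1.x style). Choosing
    cutoff_seq safely is up to the caller and is not shown here.
    """
    body = {}
    for row in changes_rows:
        if not row.get("deleted"):
            continue  # live document: never purge
        if row["seq"] > cutoff_seq:
            continue  # tombstone too recent: replicas may not have seen it
        body[row["id"]] = [change["rev"] for change in row["changes"]]
    return body


def post_purge(db_url, body):
    """POST the body to <db_url>/_purge. Defined for illustration, not called."""
    request = urllib.request.Request(
        db_url.rstrip("/") + "/_purge",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Made-up rows in the shape of a _changes "results" array.
SAMPLE_ROWS = [
    {"seq": 5, "id": "old-gone", "changes": [{"rev": "2-aaa"}], "deleted": True},
    {"seq": 9, "id": "alive", "changes": [{"rev": "1-bbb"}]},
    {"seq": 42, "id": "new-gone", "changes": [{"rev": "3-ccc"}], "deleted": True},
]

print(json.dumps(build_purge_body(SAMPLE_ROWS, cutoff_seq=10)))
# prints {"old-gone": ["2-aaa"]}
```

The cutoff is the part that needs care: purging a tombstone that some replica has not yet received means the delete can never reach that replica, which is exactly the eventual-consistency breakage mentioned above.
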
I hope to actually implement that procedure one day, but you are right that it is needed rarely, if at all, in the real world.

https://github.com/iriscouch/cqs#purging

-- 
Iris Couch
