[
https://issues.apache.org/jira/browse/COUCHDB-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081508#comment-13081508
]
Robert Newson commented on COUCHDB-1243:
----------------------------------------
_purge is really for the "oops, I just put my admin password in a document"
scenario. It's not well tested, has known and unresolved bugs, and obviously
ruins eventual consistency. I'd rather see it removed than encouraged, but I
think it's important for the narrow use case I just mentioned.
We only remember the _revs for the last 1000 updates to a document, so there
is a cap (albeit a generous one) on how much is retained. When you say '6+
million changes', are these updates to existing documents, or are you deleting
documents and making new ones?
If the latter, then you could consider the temporal database idea, which is
often suggested when using CouchDB as a message queue: use a database per time
interval (say, weekly). When the database is empty (i.e., only has deleted
documents), you can delete the db entirely.
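A minimal sketch of the per-interval idea, in Python. The "queue" prefix, the
naming scheme, and both helper names are assumptions made for illustration;
nothing here is part of CouchDB itself. It only relies on the database info
document returned by GET /{db}, whose doc_count field counts non-deleted
documents.

```python
from datetime import date


def weekly_db_name(day: date, prefix: str = "queue") -> str:
    """Return the (hypothetical) name of the database for the ISO week of `day`."""
    iso_year, iso_week, _ = day.isocalendar()
    # CouchDB database names must be lowercase; digits and '-' are allowed.
    return f"{prefix}-{iso_year}-w{iso_week:02d}"


def is_empty(db_info: dict) -> bool:
    """True when the info document from GET /{db} reports no live documents.

    doc_count excludes deleted documents, so a database holding only
    tombstones reports doc_count == 0. Design documents do count, so a
    database that still has a design document is not considered empty here.
    """
    return db_info["doc_count"] == 0
```

Writers would always target weekly_db_name(date.today()). A periodic job could
then fetch the info document for each older weekly database and issue
DELETE /{db} for those where is_empty() is true, which frees the space held by
old revisions and deleted documents in one step.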
I'll finish by saying that CouchDB's retention of information about deleted
documents and old revisions is central to CouchDB. If it's working so strongly
against you, then I don't think it's the right database solution for your
problem.
> Compact and copy feature that resets changes
> --------------------------------------------
>
> Key: COUCHDB-1243
> URL: https://issues.apache.org/jira/browse/COUCHDB-1243
> Project: CouchDB
> Issue Type: New Feature
> Components: Database Core
> Affects Versions: 1.0.1, 1.1
> Environment: Ubuntu, but not important
> Reporter: Henrik Hofmeister
> Labels: cleanup, compaction
> Attachments: dump_load.php
>
>
> After running db and view compaction on a 70K doc db with 6+ million changes,
> it takes up 0.8 GB. If copying the same documents to a new db (get and bulk
> insert), the same data with 70K changes (only the inserts) takes up 40 MB.
> That is a huge difference. It has been verified on 2 dbs that the difference
> is more than 65 times the size of the data.
> A "Compact and copy" feature that copies only documents, and resets the
> changes for a db, would be very nice to try and limit the disk usage a little
> bit. (Our current test environment takes up nearly 100 GB... )
> I've attached the dump load PHP script for your convenience.