Hi,

Sorry to jump in with a question, but why don't you work with a buffer database? I mean, replicate to another database through a filter that leaves out the deleted documents. That way you can clean all your databases, and you only need some extra space temporarily (during the "cleaning" process). Another idea would be to use two databases, one active and one inactive at any given time. You move the data from the active one to the other, filtering out the deleted documents, and when that is done you switch to the newly built database, while the old one gets emptied (deleted and re-created). Just my 2c.
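
To make the first idea a bit more concrete, here is a rough sketch of what the filtered replication could look like. It only builds the two JSON documents involved and does not talk to a server. The names "inbox", "inbox_clean", "cleanup" and "live_only" are made up for illustration; the design document would be saved in the source database and the request body POSTed to /_replicate.

```python
import json

# JavaScript filter stored in a design document on the source database.
# It returns false for deletion stubs, so they are not replicated.
FILTER_JS = "function(doc, req) { return !doc._deleted; }"


def filter_design_doc(ddoc_name="cleanup", filter_name="live_only"):
    """Build the design document holding the filter (names are hypothetical)."""
    return {
        "_id": "_design/" + ddoc_name,
        "filters": {filter_name: FILTER_JS},
    }


def replication_request(source, target, ddoc_name="cleanup", filter_name="live_only"):
    """Build the body for a POST to /_replicate using the filter above."""
    return {
        "source": source,
        "target": target,
        "create_target": True,  # create the clean database if it is missing
        "filter": ddoc_name + "/" + filter_name,
    }


if __name__ == "__main__":
    print(json.dumps(filter_design_doc(), indent=2))
    print(json.dumps(replication_request("inbox", "inbox_clean"), indent=2))
```

Once the replication has finished, the application would be pointed at the new database and the old one deleted, which is the step that actually frees the disk space.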

CGS

On 12/23/2011 01:20 PM, Henrik Lundgren wrote:
OK, so how do I prevent the database from consuming all disk space in
the long run?

I'm developing an application that is quite insert-heavy (about 6 Gb
/ day); the database is essentially a message inbox.

I plan to delete obsolete messages in a housekeeping job, but if
CouchDB retains the latest revision of all documents I might have
to reconsider using CouchDB, which is a pity :-(

Henrik

On Fri, Dec 23, 2011 at 12:36 PM, Marcello Nuccio
<[email protected]>  wrote:
OK, I've added the replies from Robert and Paul to
http://wiki.apache.org/couchdb/FUQ

So is it right to say that there is information that can't be
deleted from a database, for example the _id of documents?

Thanks for the clarifications, since this behaviour was not at all
obvious to me.

Marcello

2011/12/23 Robert Newson<[email protected]>:
An update to the wiki would be very helpful.

It's worth saying again that compaction does *not* remove "deleted
documents’ contents". We keep the latest revision of every document
ever seen, even if that revision has _deleted:true in it. This is so
that replication can ensure eventual consistency between replicas. Not
only will all replicas agree on which documents are present and which
are not, but also on the contents of both.

B.

On 23 December 2011 08:11, Marcello Nuccio<[email protected]>  wrote:
2011/12/23 Paul Davis<[email protected]>:
On Thu, Dec 22, 2011 at 7:00 PM, Jens Alfke<[email protected]>  wrote:
On Dec 22, 2011, at 1:44 PM, Chris Stockton wrote:

Okay, so this catches me a bit off guard, always thought compaction
cleaned those up.

Compaction removes old revisions’ and deleted documents’ contents, but their 
revision histories are still there. Those should be pretty small, though, since 
they’re just trees of revision IDs.

(Unless you did delete the docs by just setting a “_deleted” attribute? I don’t 
know what the behavior of that would be; sounds like it doesn’t actually delete 
the document from the database, in which case maybe the last revision data does 
get left behind.)

—Jens
Deleted documents specifically allow for a body to be set in the
deleted revision. The intention for this is to have a "who deleted
this" type of meta data for the doc. Some client libraries delete docs
by grabbing the current object blob, adding a '"_deleted": true'
member, and then sending it back which inadvertently (in most cases)
keeps the last doc body around after compaction.
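
The difference between the two ways of deleting can be sketched as follows. This is only an illustration of the document shapes, not code from any particular client library; the function names and the "deleted_by" field are hypothetical. Either document would then be written back with a PUT to the document's URL or through _bulk_docs.

```python
def copy_and_flag(doc):
    """What some client libraries do: keep the whole body, add _deleted."""
    flagged = dict(doc)
    flagged["_deleted"] = True
    return flagged


def minimal_tombstone(doc, deleted_by=None):
    """Keep only what the deletion needs, plus optional 'who deleted' data."""
    stub = {"_id": doc["_id"], "_rev": doc["_rev"], "_deleted": True}
    if deleted_by is not None:
        stub["deleted_by"] = deleted_by  # hypothetical metadata field
    return stub


if __name__ == "__main__":
    message = {"_id": "msg-1", "_rev": "1-abc", "subject": "hi", "body": "x" * 1000}
    print(sorted(copy_and_flag(message)))      # body and subject are still there
    print(sorted(minimal_tombstone(message)))  # only _deleted, _id, _rev
```

With the first form the old body stays in the deleted revision and survives compaction; with the second only the small stub is kept.
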
Can I add this information to the wiki?
I think it would be very useful in
http://wiki.apache.org/couchdb/Compaction
and in http://wiki.apache.org/couchdb/FUQ

Marcello
