[ 
https://issues.apache.org/jira/browse/COUCHDB-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081324#comment-13081324
 ] 

Damien Katz commented on COUCHDB-1243:
--------------------------------------

I mostly agree with Robert Newsom that what you are asking for is dangerous 
for CouchDB replication. However, there is the purge option, which 
"forgets" documents, deleted or otherwise, by completely removing them from the 
internal indexes. Once documents are purged, compaction will permanently 
remove them from the file. Unfortunately, I couldn't find any actual 
documentation on the purge functionality, so the best way to figure out how 
to use purge is to look at the purge test in the browser test suite, which 
can be found here:

http://svn.apache.org/viewvc/couchdb/trunk/share/www/script/test/purge.js?view=co&revision=1086241&content-type=text%2Fplain

I've often thought it would be useful to purge docs during compaction, by 
providing a user-defined function that signals which unwanted docs/stubs to 
remove. But no such thing exists. In the meantime, you can accomplish the 
same result with a purge followed by compaction.
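
For illustration only, here is a rough Python sketch of that purge + 
compaction sequence over the HTTP API. It assumes the `_changes`, `_purge`, 
and `_compact` endpoints behave as in the purge test linked above: `_purge` 
takes a JSON object mapping each document ID to a list of revisions. The 
helper names (`build_purge_body`, `post_json`, `purge_deleted_and_compact`) 
and the base URL are made up for this example, and authentication (these 
calls normally need admin rights) is left out. Note that purges are not 
replicated, so this would have to be run against each copy of the database.

```python
import json
import urllib.request


def build_purge_body(change_rows):
    """Map each deleted doc ID to the leaf revisions listed in its _changes row."""
    body = {}
    for row in change_rows:
        if row.get("deleted"):
            body[row["id"]] = [change["rev"] for change in row["changes"]]
    return body


def post_json(url, payload=None):
    """POST a JSON payload (an empty object by default) and decode the JSON reply."""
    data = json.dumps(payload if payload is not None else {}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def purge_deleted_and_compact(base_url, db):
    """Purge every deleted document found in the changes feed, then compact.

    base_url is a hypothetical server address such as "http://localhost:5984".
    """
    with urllib.request.urlopen(f"{base_url}/{db}/_changes") as response:
        rows = json.loads(response.read().decode("utf-8"))["results"]
    body = build_purge_body(rows)
    if body:
        post_json(f"{base_url}/{db}/_purge", body)
    # Compaction runs in the background; this call only starts it.
    return post_json(f"{base_url}/{db}/_compact")
```

Calling `purge_deleted_and_compact("http://localhost:5984", "mydb")` (both 
values are placeholders) would purge the deleted stubs and then start 
compaction. Try it on a copy of the database first, since purged documents 
cannot be recovered.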

> Compact and copy feature that resets changes
> --------------------------------------------
>
>                 Key: COUCHDB-1243
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1243
>             Project: CouchDB
>          Issue Type: New Feature
>          Components: Database Core
>    Affects Versions: 1.0.1, 1.1
>         Environment: Ubuntu, but not important
>            Reporter: Henrik Hofmeister
>              Labels: cleanup, compaction
>         Attachments: dump_load.php
>
>
> After running db and view compaction on a 70K doc db with 6+ million changes, 
> it takes up 0.8 GB. If copying the same documents to a new db (get and bulk 
> insert), the same data with 70K changes (only the inserts) takes up 40 MB. 
> That is a huge difference. It has been verified on 2 dbs that the difference 
> is more than 65 times the size of the data.
> A "Compact and copy" feature that copies only documents, and resets the 
> changes for a db, would be very nice to try and limit the disk usage a little 
> bit. (Our current test environment takes up nearly 100 GB... )
> I've attached the dump load php script for your convenience.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
