On 26 Jan 2010, at 15:15, Markus Jelsma wrote:

> Hello Paul and others,
> 
> 
> Now that we're on the subject of compaction, let me ask a question. I
> have an importer that fills a clean db with about 3500 records; Futon
> now tells me its size is 4.2 MB. However, if I compact that fresh and
> clean database (presumably without extraneous information such as old
> revisions), it is suddenly just 2.4 MB!
> 
> Can you, or someone else, explain this? It smells like an unwanted
> feature, but I could be wrong :)
> 
> To be clear, this doesn't happen with just two documents that have
> only a UUID as ID and a revision number.

Besides pruning old revisions, compaction also rebuilds the underlying
B-tree into a more compact form than the one that single inserts create
in the original database file.
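
To see the effect on your own database, you can compare the file size
reported before and after compaction. A minimal sketch in Python, assuming
a server at http://localhost:5984 and a database called "importtest" (both
made up for illustration). The size is reported as `disk_size` in the
database info document; newer releases report it under `sizes.file`
instead, so the helper checks both:

```python
import json
import urllib.request

BASE = "http://localhost:5984"   # assumed server address
DB = "importtest"                # hypothetical database name


def compact_request(base=BASE, db=DB):
    """Build (but do not send) the POST that starts compaction."""
    return urllib.request.Request(
        f"{base}/{db}/_compact",
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_info(base=BASE, db=DB):
    """GET the database info document from a running server."""
    with urllib.request.urlopen(f"{base}/{db}") as resp:
        return json.load(resp)


def file_size(info):
    """Pull the on-disk size in bytes out of a database info document."""
    if "disk_size" in info:
        return info["disk_size"]
    return info["sizes"]["file"]


def shrink_ratio(before, after):
    """Fraction of the file that compaction reclaimed."""
    return 1 - file_size(after) / file_size(before)
```

Against a live server you would call fetch_info(), send the request with
urllib.request.urlopen(compact_request()), wait until fetch_info() no
longer reports "compact_running" as true, and call fetch_info() again.
With Markus's numbers (4.2 MB down to 2.4 MB), shrink_ratio comes out at
roughly 0.43.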

Cheers
Jan
--


> 
> 
> Cheers,
> 
> 
> Paul Davis said:
>> On Tue, Jan 26, 2010 at 4:56 PM, Sean Clark Hess <seanh...@gmail.com>
>> wrote:
>>> I'm wondering if old versions of documents ever expire. On servers
>>> that are disk-bound (like my tiny VPS slices will be) this could be
>>> something I have to design around.
>>> 
>>> For example, when importing data (millions of rows) from a relational
>>> database, I want to be able to build a document a piece at a time. The
>>> relational schema is wacked - it has information about a given
>>> document in like 10 different tables, and I don't want to have to try
>>> to hold everything in memory just so I only have to write the document
>>> once.
>>> 
>>> Any way to control it, or turn versioning off?  Is it even a concern?
>>> Thanks!
>>> 
>> 
>> Sean,
>> 
>> Compaction removes the bodies of old document revisions. The only
>> information that remains is some revision history that allows for
>> proper merging during replication. The number of history entries kept
>> is configurable, so even this information can be pruned during
>> compaction.
>> 
>> The closest you could get to purging all historical information is to
>> set the rev_stemming parameter low and compact to get rid of the
>> extra data. I personally wouldn't worry too much about the
>> rev_stemming parameter and would instead just compact as often as
>> needed during the import.
>> 
>> HTH,
>> Paul Davis
> 
> 
> 
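
On the revision history Paul mentions in the quoted message: as far as I
know, what he calls rev_stemming is exposed per database over HTTP as
`_revs_limit`. A sketch of building that request, again assuming a local
server and a hypothetical database name; the value you pass is up to you,
and nothing here is a recommended setting:

```python
import urllib.request


def revs_limit_request(limit, base="http://localhost:5984", db="importtest"):
    """Build (but do not send) a PUT that sets the revision limit.

    `base` and `db` are illustrative defaults, not values from this thread.
    The request body is the bare integer, which is valid JSON.
    """
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return urllib.request.Request(
        f"{base}/{db}/_revs_limit",
        data=str(limit).encode("ascii"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Send it with urllib.request.urlopen(revs_limit_request(10)) and then
compact; the trimmed history is only dropped from disk at compaction.
Paul's point about replication still applies: that history is what lets
replicas merge properly, so a very low limit is a trade-off.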
