+1 as well. This should help folks who are in a tight spot with an 
uncompacted database work out, roughly, how much free space they need for a 
successful compaction.
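
As a rough sketch of that arithmetic, assuming Adam's definition below holds
(sizes.active approximates the file size after compaction) and that compaction
writes a fresh copy of the live data before the old file is removed, something
like the following could be used. The function name and the returned keys are
made up for illustration; only the "file" and "active" fields come from the
database info.

```python
def compaction_estimate(sizes):
    """Estimate compaction numbers from the "sizes" object of a database info.

    Assumes sizes["active"] approximates the file size after compaction.
    The function name and the result keys are hypothetical.
    """
    file_size = sizes["file"]
    active = sizes["active"]
    # Bytes that compaction could give back; clamped at zero in case the
    # reported active size is slightly larger than the file size.
    reclaimable = max(file_size - active, 0)
    # The compacted copy has to fit on disk next to the original file,
    # so roughly "active" bytes of free space are needed (assumption).
    return {
        "reclaimable_bytes": reclaimable,
        "free_space_needed_bytes": active,
    }


# Sample numbers, invented for the example.
sample = {"file": 1_000_000, "external": 2_500_000, "active": 400_000}
estimate = compaction_estimate(sample)
```

With the sample numbers, about 600,000 bytes would be reclaimable and about
400,000 bytes of free space would be needed while the compaction runs.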

Thanks for working on clarifying this!

Best
Jan
—

> On 22. Oct 2018, at 23:47, Joan Touzet <woh...@apache.org> wrote:
> 
> +1 to Adam's definition, which I think is closest to the "former" definition 
> in Eric's first post.
> 
> -Joan
> 
> ----- Original Message -----
> From: "Adam Kocoloski" <kocol...@apache.org>
> To: "dev@couchdb.apache.org Developers" <dev@couchdb.apache.org>
> Sent: Monday, October 22, 2018 5:13:05 PM
> Subject: Re: Exact definition of a database "active size"
> 
> I think sizes.active should be a close approximation of the size of the 
> database after compaction; i.e. it should be possible to use (sizes.file - 
> sizes.active) as a way to estimate the number of bytes that can be reclaimed 
> by compacting that database shard.
> 
> Adam
> 
>> On Oct 22, 2018, at 4:32 PM, Eiri <e...@eiri.ca> wrote:
>> 
>> Dear all,
>> 
>> I’d like to hear your opinion on how we should interpret the database 
>> attribute “active size”.
>> 
>> As you surely know, we use three different size attributes in the 
>> database info: file, the size of the database file on disk; external, the 
>> uncompressed size of the database contents; and active, defined as “the size 
>> of live data inside the database” or “active bytes in the current MVCC 
>> snapshot”.
>> 
>> Some time ago I had a discussion with Paul Davis, and he pointed out the 
>> ambiguity of that definition: is it the live data before a compaction or 
>> after a compaction? In other words, should we treat as “active” only the 
>> documents and attachments on the btree’s leaves, or also include the 
>> previous document revisions while they can still be accessed? Codewise it is 
>> the latter, both in the current version of CouchDB and in 1.x, where active 
>> size was named data_size, but intuitively it feels like it should be the 
>> former.
>> 
>> Although this sounds academic, it is a practical question: the difference 
>> in active size before and after compaction can be quite noticeable, and 
>> since active size is used as a trigger by the compaction daemon, it could 
>> skew the disk usage pattern.
>> 
>> Please share your thoughts. If we conclude that we want to change how 
>> active size is calculated, I’m willing to take on the implementation, as I 
>> have a recent PR in the same area of the code.
>> 
>> 
>> Regards,
>> Eric
>> 
> 

-- 
Professional Support for Apache CouchDB:
https://neighbourhood.ie/couchdb-support/
