IIRC we're not exactly right on the free space calculation, but more importantly
we also generate garbage while compacting. Specifically, the id_tree updates
cause a lot of fragmentation when docs are updated in a random order.

The compactor on the Nebraska-merge branch was rewritten to avoid this and was 
a significant improvement in many cases. 
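
A rough sketch of the check being discussed, in case it helps reason about the
thresholds. It assumes the daemon computes fragmentation as
(disk_size - data_size) / disk_size and compacts when that meets or exceeds
db_fragmentation. The function names and the sample sizes are made up for
illustration; they are not taken from the CouchDB source or from Calle's
databases.

```python
def fragmentation_percent(disk_size: int, data_size: int) -> int:
    """Share of the file (in %) not accounted for by live data.

    Assumed formula: (disk_size - data_size) / disk_size * 100, rounded.
    """
    if disk_size <= 0:
        return 0
    return round((disk_size - data_size) / disk_size * 100)


def should_compact(disk_size: int, data_size: int, threshold_percent: int) -> bool:
    """Assumed trigger: compact once fragmentation reaches the threshold."""
    return fragmentation_percent(disk_size, data_size) >= threshold_percent


if __name__ == "__main__":
    # Calle's example: a 10GB file holding 8GB of live data is 20% fragmented.
    print(fragmentation_percent(10_000_000_000, 8_000_000_000))

    # Hypothetical file measured right after a compaction: if footers and
    # garbage written during compaction are not counted in data_size, the
    # freshly compacted file can still look heavily fragmented.
    disk_size, data_size = 5_000_000_000, 1_900_000_000
    for threshold in (20, 60, 70):
        print(threshold, should_compact(disk_size, data_size, threshold))
```

With those hypothetical sizes the file looks 62% fragmented immediately after a
compaction, so any threshold of 60% or lower retriggers straight away while 70%
does not, which is the pattern reported below.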

> On Oct 5, 2013, at 9:33 AM, Calle Arnesten <[email protected]> 
> wrote:
> 
> Robert, thanks for your reply. 
> 
> I wasn't aware of the database footers, and I can understand that an
> endless compaction could happen if the value is set too low. But I get these
> endless loops even if I raise it as high as 60%. To me that's not intuitive.
> 
> Before, I had it set to 70% and didn't get these endless compaction
> loops, but in general I consumed a lot more disk space than I do now.
> 
> To me, at least, it would be more intuitive if the number stood for how much
> unnecessary space is allowed before compaction takes place. So for
> example, if I had a 10GB database file and it was 20% fragmented, it would
> be 8GB and 0% fragmented after compaction. It might (?) be harder to
> calculate the numbers that way, but it would be much easier to reason about
> when configuring your database server.
> 
> /Calle
> 
>> On Sat, Oct 5, 2013, at 10:26, Robert Newson wrote:
>> 
>> It makes intuitive sense that setting that % too low will cause endless (and 
>> pointless) compactions (the ratio of disk_size to data_size exceeding your % 
>> immediately after compaction). I'm fairly sure, for example, that the 
>> data_size value does not include the space consumed by the many database 
>> footers in the file.
>> 
>> B.
>> 
>>> On 5 Oct 2013, at 07:43, Calle Arnesten <[email protected]> wrote:
>>> 
>>> I tried changing db_fragmentation to different levels. If I raise it
>>> to 70% the compaction stops, but for 60% and lower it keeps running all the
>>> time.
>>> 
>>> So there seems to be something weird with how CouchDB calculates the 
>>> fragmentation level. As I said, I have a large percentage of deleted 
>>> documents in the database, so perhaps it is not including them correctly in 
>>> the calculation? It could definitely be near 70% of the database size that 
>>> is deleted documents.
>>> 
>>>> On Fri, Oct 4, 2013, at 10:17, Calle Arnesten wrote:
>>>> Hi,
>>>> 
>>>> I recently upgraded from CouchDB 1.2 to 1.4. I have noticed that the
>>>> database compaction is running more or less all the time during the
>>>> allowed compaction time. Is this a known issue with 1.4?
>>>> 
>>>> The compaction is completed on each run and the reported database size is 
>>>> smaller on the first run during the compaction time. But then it starts 
>>>> again for the same database, and when completed, starts again, etc. It's 
>>>> like it thinks that the database is still fragmented even if it's not.
>>>> 
>>>> The databases are quite large (~5GB), so it's not the case that many 
>>>> documents have had time to change during the compaction time.
>>>> 
>>>> These are my settings:
>>>> [{db_fragmentation, "20%"}, {view_fragmentation, "20%"}, {from, "03:00"}, 
>>>> {to, "11:00"}]
>>>> 
>>>> The hard drive is not full; it has about 70GB of free space.
>>>> 
>>>> I have a large percentage of deleted documents, if that might be a reason 
>>>> for the issue/bug. 
>>>> 
>>>> I don't have the same problem for view compaction.
>>>> 
>>>> Best regards
>>>> Calle Arnesten
>> 
