IIRC our free space calculation isn't exactly right, but more importantly we also generate garbage while compacting. Specifically, the id_tree updates cause a lot of fragmentation when docs are updated in random order.
The compactor on the Nebraska-merge branch was rewritten to avoid this and was a significant improvement in many cases.

> On Oct 5, 2013, at 9:33 AM, Calle Arnesten <[email protected]> wrote:
>
> Robert, thanks for your reply.
>
> I wasn't aware of the database footers, and then I can understand that an
> endless compaction could happen if the value is set too low. But I get these
> endless loop even if I raise to as high as 60%. To me that's not intuitive.
>
> Before, I had it set to 70% and then I didn't get these endless compaction
> loops, but then I in general consumed a lot more disk space than I do now.
>
> To me, at least, it would be more intuitive if the number stood for how much
> unnecessary space that was allowed before compaction takes place. So for
> example if I had a 10GB database file and it was 20% fragmented, it would
> after compaction be 8GB and 0% fragmented. It might (?) be harder to
> calculate the numbers that way, but it would be much easier to reason about
> when configuring your database server.
>
> /Calle
>
>> On Sat, Oct 5, 2013, at 10:26, Robert Newson wrote:
>>
>> It makes intuitive sense that setting that % too low will cause endless (and
>> pointless) compactions (the ratio of disk_size to data_size exceeding your %
>> immediately after compaction). I'm fairly sure, for example, that the
>> data_size value does not include the space consumed by the many database
>> footers in the file.
>>
>> B.
>>
>>> On 5 Oct 2013, at 07:43, Calle Arnesten <[email protected]> wrote:
>>>
>>> I tested to change the db_fragmentation to different levels. If I raise it
>>> to 70% the compaction stops, but for 60% and lower it keeps running all the
>>> time.
>>>
>>> So there seems to be something weird with how CouchDB calculates the
>>> fragmentation level. As I said, I have a large percentage of deleted
>>> documents in the database, so perhaps it is not including them correctly in
>>> the calculation? It could definitely be near 70% of the database size that
>>> is deleted documents.
>>>
>>>> On Fri, Oct 4, 2013, at 10:17, Calle Arnesten wrote:
>>>> Hi,
>>>>
>>>> I recently upgraded from CouchDB 1.2 to 1.4. I have noticed that the
>>>> database compaction is running more or less all the time during the
>>>> allowed compaction time. Is there a known issue for this with 1.4?
>>>>
>>>> The compaction is completed on each run and the reported database size is
>>>> smaller on the first run during the compaction time. But then it starts
>>>> again for the same database, and when completed, starts again, etc. It's
>>>> like it thinks that the database is still fragmented even if it's not.
>>>>
>>>> The databases are quite large (~5GB), so it's not the case that many
>>>> documents have had time to change during the compaction time.
>>>>
>>>> These are my settings:
>>>> [{db_fragmentation, "20%"}, {view_fragmentation, "20%"}, {from, "03:00"},
>>>> {to, "11:00"}]
>>>>
>>>> The harddrive is not full, it has about 70GB of free space.
>>>>
>>>> I have a large percentage of deleted documents, if that might be a reason
>>>> for the issue/bug.
>>>>
>>>> I don't have the same problem for view compaction.
>>>>
>>>> Best regards
>>>> Calle Arnesten
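
For anyone following the thread, here is a rough sketch of the threshold check being discussed. It assumes the fragmentation figure is computed as (disk_size - data_size) / disk_size * 100, which is how I understand the compaction daemon documentation; the function names and the sample sizes are made up for illustration and are not CouchDB code.

```python
def fragmentation_pct(disk_size: int, data_size: int) -> float:
    """Share of the file not accounted for by data_size, in percent.

    Assumed formula: (disk_size - data_size) / disk_size * 100.
    Both sizes are in bytes, as reported in the database info.
    """
    if disk_size <= 0:
        return 0.0
    # Multiply first so round numbers stay exact in floating point.
    return 100 * (disk_size - data_size) / disk_size


def should_compact(disk_size: int, data_size: int, threshold_pct: float) -> bool:
    """True when the fragmentation reaches the configured threshold."""
    return fragmentation_pct(disk_size, data_size) >= threshold_pct


GB = 1_000_000_000

# Calle's example: a 10GB file holding 8GB of live data.
example = fragmentation_pct(10 * GB, 8 * GB)

# Hypothetical post-compaction numbers: if data_size under-reports what the
# file really needs (footers, garbage created while compacting), a freshly
# compacted file can still sit above a 60% threshold but below a 70% one.
still_over_60 = should_compact(5 * GB, 19 * GB // 10, 60)
still_over_70 = should_compact(5 * GB, 19 * GB // 10, 70)
```

Under that assumption Calle's 10GB/8GB case already comes out at 20%, so the definition itself matches the intuition he describes. The endless loop would then come from data_size being lower than what the file can actually shrink to, so the check passes again right after a compaction finishes.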
