We run a script that compacts every database every night.
Compaction of our biggest (0.6 TB) database took about 10 hours today.
Granted, the hardware has poor I/O bandwidth, but even if we improve the
hardware, a change in strategy could be good. Along with splitting that
database into more manageable pieces, I hope to write a compaction
script that only compacts a database sometimes (a la PostgreSQL's
autovacuum). To do that, I want some way to estimate whether there's
anything to gain from compacting any given database.
I thought I could use the doc_del_count returned by GET /<database-name>
as a gauge of whether to compact or not, but in my tests doc_del_count
remained the same after compaction. Are there any statistics, however
imperfect, that could help my code guess when compaction ought to be done?
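
A minimal sketch of the kind of check such a script might make, offered as an illustration rather than a known-good answer. It assumes the document returned by GET /<database-name> reports both the on-disk file size and the size of the live data; some CouchDB releases expose these as "disk_size" and "data_size", and newer ones as "sizes.file" and "sizes.active", but whether a given server reports them is an assumption. The function names and the 0.3 threshold are hypothetical.

```python
from typing import Any, Mapping, Optional


def wasted_fraction(info: Mapping[str, Any]) -> Optional[float]:
    """Estimate the share of the database file that holds no live data.

    `info` is the parsed JSON body of GET /<database-name>.
    Returns None when the size fields assumed here are absent.
    """
    sizes = info.get("sizes") or {}
    # Prefer the nested "sizes" object, fall back to the flat field names.
    file_size = sizes.get("file", info.get("disk_size"))
    live_size = sizes.get("active", info.get("data_size"))
    if not file_size or live_size is None:
        return None
    return max(0.0, 1.0 - live_size / file_size)


def should_compact(info: Mapping[str, Any], threshold: float = 0.3) -> bool:
    """Guess whether compaction is worthwhile (threshold is arbitrary)."""
    fraction = wasted_fraction(info)
    return fraction is not None and fraction >= threshold
```

A nightly script could fetch the info document for each database, call should_compact on it, and only trigger compaction where it returns True. When the size fields are missing the sketch declines to compact, so the script would need some fallback policy for those databases.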
Best Regards,
Wayne Conrad