On 02/03/2011 19:33, Wayne Conrad wrote:
We run a compaction script that compacts every database every night.
Compaction of our biggest (0.6 TB) database took about 10 hours today.
Granted, the hardware has poor I/O bandwidth, but even if we improve
the hardware, a change in strategy could be good. Along with
splitting that database into more manageable pieces, I hope to write a
compaction script that only compacts a database when it needs it (a la
PostgreSQL's autovacuum). To do that, I want some way to estimate
whether there's anything to gain from compacting any given database.
I thought I could use the doc_del_count returned by GET
/<database-name> as a gauge of whether to compact or not, but in my
tests doc_del_count remained the same after compaction. Are there any
statistics, however imperfect, that could help my code guess when
compaction ought to be done?
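For reference, the check I tried looked roughly like this (a minimal
sketch in Python; the server URL and database name are placeholders
for our real ones):

import json
import urllib.request

COUCH = "http://localhost:5984"  # placeholder server URL

def db_info(name):
    """Return the JSON body of GET /<database-name>."""
    with urllib.request.urlopen("%s/%s" % (COUCH, name)) as resp:
        return json.load(resp)

info = db_info("mydb")  # placeholder database name
# doc_del_count stays the same across compaction because compaction
# keeps the deletion tombstones; it only reclaims old revisions and
# wasted file space.
print(info["doc_del_count"], info["disk_size"])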
Just a thought.
After compacting, the database will have a given size on disk. Would it
be possible to record that size, then compact again only once the file
has grown by (say) 15%?
It's not perfect - but it might be better than a fixed nightly schedule.
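Something along these lines, as a rough sketch (the URL, database name,
and threshold are placeholders; note that compaction runs in the
background, so the new baseline would be recorded once compact_running
goes false again):

import json
import urllib.request

COUCH = "http://localhost:5984"   # placeholder server URL
GROWTH = 1.15                     # compact once 15% above the baseline

def disk_size(db):
    """Read disk_size from GET /<database-name>."""
    with urllib.request.urlopen("%s/%s" % (COUCH, db)) as resp:
        return json.load(resp)["disk_size"]

def start_compaction(db):
    """POST /<db>/_compact; CouchDB expects a JSON Content-Type."""
    req = urllib.request.Request(
        "%s/%s/_compact" % (COUCH, db),
        data=b"",
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req).close()

def maybe_compact(db, baseline):
    """Trigger compaction if the file grew past the threshold.

    Returns True when compaction was started; the caller should poll
    compact_running and record a fresh baseline when it finishes.
    """
    if disk_size(db) >= baseline * GROWTH:
        start_compaction(db)
        return True
    return False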
Regards
Ian