On 02/03/2011 19:33, Wayne Conrad wrote:
We run a compaction script that compacts every database every night. Compaction of our biggest (0.6 TB) database took about 10 hours today. Granted, the hardware has poor I/O bandwidth, but even if we improve the hardware, a change in strategy could be good. Along with splitting that database into more manageable pieces, I hope to write a compaction script that compacts a database only when it looks worthwhile (à la PostgreSQL's autovacuum). To do that, I want some way to estimate whether there's anything to gain from compacting any given database.

I thought I could use the doc_del_count returned by GET /<database-name> as a gauge of whether to compact or not, but in my tests doc_del_count remained the same after compaction. Are there any statistics, however imperfect, that could help my code guess when compaction ought to be done?
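(For what it's worth, compaction keeps a stub for every deleted document, which would explain why doc_del_count never drops.) The same GET /<database-name> call also returns disk_size, which may be more useful here. A minimal sketch of reading it, assuming CouchDB on localhost:5984 and Python 3 (the database name is just a placeholder):

    import json
    import urllib.request

    def db_info(name, server="http://localhost:5984"):
        # GET /<database-name> returns the database info document,
        # including doc_count, doc_del_count, disk_size and
        # compact_running.
        with urllib.request.urlopen("%s/%s" % (server, name)) as resp:
            return json.loads(resp.read())

    info = db_info("my_database")
    print(info["disk_size"], info["doc_del_count"])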

Just a thought.

After compacting, the database has a known size on disk. Would it be possible to record that size, and compact again only once the file has grown by (say) 15%?

It's not perfect, but it might be better than compacting on a fixed schedule.
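A rough, untested sketch of that heuristic in Python (the 15% threshold, the state-file name, and the server URL are placeholders; note that _compact returns immediately, so the script polls compact_running before recording the new baseline size):

    import json
    import time
    import urllib.request

    SERVER = "http://localhost:5984"       # assumed CouchDB location
    GROWTH_THRESHOLD = 1.15                # compact after 15% growth
    STATE_FILE = "compaction_sizes.json"   # db name -> size after last compaction

    def db_info(name):
        with urllib.request.urlopen("%s/%s" % (SERVER, name)) as resp:
            return json.loads(resp.read())

    def start_compaction(name):
        # POST /<database-name>/_compact; CouchDB requires the
        # application/json content type on this request.
        req = urllib.request.Request(
            "%s/%s/_compact" % (SERVER, name), data=b"",
            headers={"Content-Type": "application/json"})
        urllib.request.urlopen(req).close()

    def maybe_compact(name, sizes):
        baseline = sizes.get(name)
        current = db_info(name)["disk_size"]
        if baseline is not None and current <= baseline * GROWTH_THRESHOLD:
            return                          # hasn't grown enough; skip it
        start_compaction(name)
        while db_info(name)["compact_running"]:
            time.sleep(10)                  # wait for compaction to finish
        sizes[name] = db_info(name)["disk_size"]  # remember new baseline

    def run(db_names):
        try:
            with open(STATE_FILE) as f:
                sizes = json.load(f)
        except IOError:
            sizes = {}                      # first run: compact everything
        for name in db_names:
            maybe_compact(name, sizes)
        with open(STATE_FILE, "w") as f:
            json.dump(sizes, f)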

Regards

Ian
