Well, what if your real data size grows by 15%? :) Bob Dionne is working on a patch to reveal the true database size (the amount of the file that contains live data). Once we have that, compaction priority can be more accurately determined.
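
In the meantime, here is a very rough sketch of what a nightly script could do once something like that is exposed: compare the live-data figure against disk_size and only compact when the waste crosses a threshold. The "data_size" field name, the thresholds and the host below are my assumptions, not necessarily what the patch will ship:

# Sketch only: assumes GET /<db> grows a "data_size" field next to
# "disk_size". Field name, thresholds and host are illustrative.
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

COUCH = "http://localhost:5984"
WASTE_THRESHOLD = 0.5             # compact when less than half the file is live data
MIN_DISK_SIZE = 64 * 1024 * 1024  # skip small databases; not worth the I/O

def db_info(name):
    # GET /<db> returns the database info document (disk_size, doc_del_count, ...)
    with urlopen("%s/%s" % (COUCH, quote(name, safe=""))) as resp:
        return json.load(resp)

def should_compact(info):
    data_size = info.get("data_size")   # hypothetical field from the patch
    disk_size = info["disk_size"]
    if data_size is None or disk_size < MIN_DISK_SIZE:
        return False
    return float(data_size) / disk_size < WASTE_THRESHOLD

def compact(name):
    # POST /<db>/_compact kicks off compaction
    req = Request("%s/%s/_compact" % (COUCH, quote(name, safe="")),
                  data=b"", headers={"Content-Type": "application/json"})
    urlopen(req).close()

if __name__ == "__main__":
    with urlopen("%s/_all_dbs" % COUCH) as resp:
        for name in json.load(resp):
            if should_compact(db_info(name)):
                compact(name)

Until then, the same loop could fall back to Ian's idea and compare disk_size against a value recorded right after the previous compaction.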
B.

On 8 March 2011 09:09, Ian Hobson <[email protected]> wrote:
> On 02/03/2011 19:33, Wayne Conrad wrote:
>>
>> We run a compaction script that compacts every database every night.
>> Compaction of our biggest (0.6 TB) database took about 10 hours today.
>> Granted, the hardware has poor I/O bandwidth, but even if we improve the
>> hardware, a change in strategy could be good. Along with splitting that
>> database into more manageable pieces, I hope to write a compaction script
>> that only compacts a database sometimes (a la PostgreSQL's autovacuum). To
>> do that, I want some way to estimate whether there's anything to gain from
>> compacting any given database.
>>
>> I thought I could use the doc_del_count returned by GET /<database-name>
>> as a gauge of whether to compact or not, but in my tests doc_del_count
>> remained the same after compaction. Are there any statistics, however
>> imperfect, that could help my code guess when compaction ought to be done?
>>
> Just a thought.
>
> After compacting, the database will have a given size on disk. Would it be
> possible to test, and compact if this grew by (say) 15%?
>
> It's not perfect - but it might be better than time.
>
> Regards
>
> Ian
