Shridhar Daithankar wrote:

Josh Berkus wrote:

Shridhar,

>>However I do not agree with this logic entirely. It pegs the next vacuum


w.r.t current table size which is not always a good thing.

Ok, what do you recommend? The point of two separate variables allows you to specify if you want vacuum based on a fixed number, based on table size or something inbetween.


No, I think the logic's fine, it's the numbers which are wrong. We want to vacuum when updates reach between 5% and 15% of total rows. NOT when updates reach 110% of total rows ... that's much too late.

For small tables, you don't need to vacuum too often. In the testing I did a small table ~100 rows, didn't really show significant performance degredation until it had close to 1000 updates. For large tables, vacuum is so expensive, that you don't want to do it very often, and scanning the whole table when there is only 5% wasted space is not very helpful.

Hmmm ... I also think the threshold level needs to be lowered; I guess the purpose was to prevent continuous re-vacuuuming of small tables? Unfortunately, in the current implementation, the result is tha small tables never get vacuumed at all.

So for defaults, I would peg -V at 0.1 and -v at 100, so our default calculation for a table with 10,000 rows is:

100 + ( 0.1 * 10,000 ) = 1100 rows.

Yes, the I set the defaults a little high perhaps so as to err on the side of caution. I didn't want people to say pg_autovacuum kills the performance of my server. A small table will get vacuumed, just not until it has reached the threshold. So a table with 100 rows, will get vacuumed after 1200 updates / deletes. In my testing it showed that there was no major performance problems until you reached several thousand updates / deletes.

Furthermore analyze threshold depends upon inserts+updates. I think it
should also depends upon deletes for obvious reasons.

Yes. Vacuum threshold is counting deletes, I hope?

It does.


My comment about the frequency of vacuums vs. analyze is that currently the *default* is to analyze twice as often as you vacuum. Based on my experiece as a PG admin on a variety of databases, I believe that the default should be to analyze half as often as you vacuum.

HUH? analyze is very very cheap compared to vacuum. Why not do it more often?

I am all for experimentation. If you have real life data to play with, I
can give you some patches to play around.

I will have real data very soon .....

I will submit a patch that would account deletes in analyze threshold. Since you want to delay the analyze, I would calculate analyze count as

deletes are already accounted for in the analyze threshold.


I am still wary of inverting vacuum analyze frequency. You think it is better to set inverted default rather than documenting it?

I think inverting the vacuum and analyze frequency is wrong.


What I think I am hearing is that people would like very much to be able to tweak the settings of pg_autovacuum for individual tables / databases etc. So that you could set certain tables to be vacuumed more agressivly than others. I agree this would be a good and welcome addition. I hope have time to work on this at some point, but in the near future I won't.

Matthew



---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Reply via email to