To add some medium-hard data to the discussion, I hacked a PG 7.3.4 a little. The system I am talking about below runs an artificial application that closely resembles the behaviour of a TPC-C benchmark implementation. Without vacuuming the database, it can just barely sustain a factor 5 scaled database running with 50 simulated terminals. To free up some bandwidth, the system is configured with scaling factor 4 and runs with 40 simulated terminals. In this configuration it satisfies the response-time requirements for 100% of all transactions when not vacuuming ... no surprise.

The test driver measures in 10-minute intervals and reports the percentage of transactions that meet the response-time requirement.
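Just to make the reporting scheme concrete, here is a minimal sketch of that kind of bucketing. This is my own illustration, not the actual test driver; the interval length, the 2-second limit, and the function names are assumptions.

/* Illustrative sketch only (not the real driver): bucket each finished
 * transaction into 10-minute intervals and report the percentage that
 * completed within a hypothetical response-time limit. */
#include <stdio.h>

#define INTERVAL_SECS   600     /* 10-minute reporting interval */
#define MAX_INTERVALS   64
#define LIMIT_MS        2000.0  /* hypothetical response-time limit */

static long total[MAX_INTERVALS];
static long within[MAX_INTERVALS];

/* Record one transaction: completion time (seconds since test start)
 * and its response time in milliseconds. */
static void
record_txn(long end_sec, double response_ms)
{
    int slot = (int) (end_sec / INTERVAL_SECS);

    if (slot < 0 || slot >= MAX_INTERVALS)
        return;
    total[slot]++;
    if (response_ms <= LIMIT_MS)
        within[slot]++;
}

static void
report(void)
{
    for (int i = 0; i < MAX_INTERVALS; i++)
        if (total[i] > 0)
            printf("interval %2d: %5.1f%% within limit (%ld txns)\n",
                   i, 100.0 * within[i] / total[i], total[i]);
}

int
main(void)
{
    record_txn(100, 850.0);     /* fast transaction in interval 0 */
    record_txn(700, 3200.0);    /* slow one in interval 1, e.g. hit by vacuum */
    record_txn(710, 1200.0);
    report();
    return 0;
}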

If the database is now vacuumed concurrently, transaction response times change dramatically. A 10-minute interval hit by a vacuum drops from 100% to anything below 90%; I've seen it as low as 75%. The system load reported by the Linux 2.4 kernel jumps from under 1.0 to anything between 5 and 8.

So far, that is exactly what most DBAs are complaining about: a system that otherwise runs smoothly gets bogged down by any vacuum.

Now I changed the cache policy. While a backend is running vacuum, a global flag is set. If this flag is set and a block is not found in the cache but must be read, its buffer is marked BM_READ_BY_VACUUM. When the global flag is set, AddBufferToFreelist() inserts buffers so marked at the head of the freelist instead of adding them to the tail. In any case, the buffer's BM_READ_BY_VACUUM flag is then cleared.
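For those who want to see the policy rather than the patch, here is a minimal, self-contained sketch of the idea. This is not the actual 7.3 buffer manager code; the BufDesc struct, the sentinel freelist, and the demo in main() are simplified and hypothetical, and only the BM_READ_BY_VACUUM flag plus the head-versus-tail insertion follow the description above.

/* Sketch of the modified freelist policy; NOT real PostgreSQL code. */
#include <stdio.h>
#include <stdbool.h>

#define BM_READ_BY_VACUUM 0x80          /* hypothetical flag bit */

typedef struct BufDesc
{
    int             buf_id;
    unsigned int    flags;
    struct BufDesc *freeNext;
    struct BufDesc *freePrev;
} BufDesc;

/* Sentinel node; the freelist is a circular doubly linked list. */
static BufDesc  freelist_head;
static bool     vacuum_active = false;  /* the "global flag" */

static void
freelist_init(void)
{
    freelist_head.freeNext = &freelist_head;
    freelist_head.freePrev = &freelist_head;
}

/* Buffers read on behalf of VACUUM go to the head of the freelist
 * (so they are reused first); everything else goes to the tail. */
static void
AddBufferToFreelist(BufDesc *buf)
{
    bool at_head = vacuum_active && (buf->flags & BM_READ_BY_VACUUM);

    if (at_head)
    {
        buf->freeNext = freelist_head.freeNext;
        buf->freePrev = &freelist_head;
    }
    else
    {
        buf->freeNext = &freelist_head;
        buf->freePrev = freelist_head.freePrev;
    }
    buf->freeNext->freePrev = buf;
    buf->freePrev->freeNext = buf;

    /* In any case, clear the flag again. */
    buf->flags &= ~BM_READ_BY_VACUUM;
}

int
main(void)
{
    BufDesc a = {1, 0, NULL, NULL};
    BufDesc b = {2, BM_READ_BY_VACUUM, NULL, NULL};

    freelist_init();
    vacuum_active = true;

    AddBufferToFreelist(&a);    /* normal buffer -> tail */
    AddBufferToFreelist(&b);    /* vacuum-read buffer -> head, reused first */

    for (BufDesc *p = freelist_head.freeNext; p != &freelist_head; p = p->freeNext)
        printf("buf %d\n", p->buf_id);  /* prints 2 then 1 */
    return 0;
}

The net effect is that vacuum keeps recycling the same small set of buffers for blocks it had to read in, instead of flushing the rest of the cache.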

The effect of this simple hack is somewhat surprising. The system keeps satisfying 97% or more of all transactions within the time limits, and the system load stays well below 2.0 (I've seen 1.6 once at most); very surprisingly, VACUUM also finishes about 20% faster.

I'm not a fan of jumping to conclusions; OTOH I have to try to make some sense out of it. So I would like the following to be taken with a reasonable grain of salt.

I think the common theory that vacuum behaves like a sequential scan simply does not hold true for any table that is actually updated randomly. What happens instead is that vacuum not only evicts the entire buffer cache by forcing all blocks of said table and its indexes in, it also dirties a substantial fraction of those buffers and leaves the dirt to be written back by all the other backends.

The change described above causes vacuum to work with as few shared buffers as possible for the data not already found in the cache. This avoids imposing unnecessary additional write overhead on the regular backends, and it keeps the vacuum process inside a few virtual memory pages instead of running all over the place. I don't know how much the latter impacts the efficiency of the MMU; it might not be significant here.

It is quite possible that there is some other side effect in the buffer cache that affects the behaviour of many backends each doing a few writes, compared to one backend doing them in bulk.

However, the test indicates that there is some low-hanging fruit in the cache algorithm, and that it's not just a few little raspberries.


Jan



Tom Lane wrote:


Shridhar Daithankar <[EMAIL PROTECTED]> writes:
I was thinking about it. How about vacuuming a page when it is being
pushed out of the postgresql buffer cache? It is in memory so not much IO
is involved.

You keep ignoring the problem of removing index entries. To vacuum an individual page, you need to be willing to read in (and update) all index pages that reference the tuples-to-be-deleted. This is hardly tenable when the reason for pushing the page out of buffer cache was so that you could read in something else instead --- you don't have spare buffer slots, and you don't want to do all that I/O (and the associated WAL log entries) before you can read in the page you originally wanted.

The latter point is really the crux of the problem.  The point of having
the VACUUM process is to keep maintenance work out of the critical path
of foreground queries.  Anything that moves even part of that
maintenance work into the critical path is going to be a net loss.

regards, tom lane



--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #


