Zeugswetter Andreas ADI SD wrote:
Also, that patch doesn't address the VACUUM issue at all. And
using a small fixed size ring with scans that do updates can
be devastating. I'm experimenting with different ring sizes
for COPY at the moment. Too small ring leads to a lot of WAL
flushes, it's basically the same problem we have with VACUUM
in CVS HEAD.
My first take on that would be to simply abandon any dirty (and actually
also any still pinned) buffer from the ring and replace the ring slot
with a buffer from the freelist.
If the freelist is empty and LSN allows writing the buffer, write it
(and maybe try to group these).
If the LSN does not allow the write, replace the slot with a buffer from
That would effectively disable the ring for COPY and the 2nd phase of
One problem with looking at the LSN is that you need the content lock to
read it, and I wouldn't want to add any new locking. It could be done
inside FlushBuffer when we hold the lock anyway, but I'm afraid the
changes would be pretty invasive.
I'm struggling to get a grip of what the optimal ring size is under
various circumstances. Some thoughts I have this far:
- a small ring gives better L2 cache behavior
- for read-only queries, and for queries that just hint bits, 1 buffer
- small ring with query that writes WAL (COPY, mass updates, 2nd phase
of VACUUM) leads to a lot of WAL flushes, which can become bottleneck.
But all these assumptions need to be validated. I'm setting up tests
with different ring sizes and queries to get a clear picture of this:
- VACUUM on a clean table
- VACUUM on a table with 1 dead tuple per page
- read-only scan, large table
- read-only scan, table fits in OS cache
In addition, I'm going to run VACUUM in a DBT-2 test to see the affect
on other queries running concurrently.
I think a ring that grows when WAL flushes occur covers all the use
cases reasonably well, but I need to do the testing...
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster