On 16.12.2011 15:42, Heikki Linnakangas wrote:
On 16.12.2011 15:03, Simon Riggs wrote:
On Fri, Dec 16, 2011 at 12:50 PM, Heikki Linnakangas
On 16.12.2011 14:37, Simon Riggs wrote:
I already proposed a design for that using page-level share locks any
reason not to go with that?
Sorry, I must've missed that. Got a link?
From nearly 4 years ago.
Ah, thanks. That is similar to what I'm experimenting, but a second
lwlock is still fairly heavy-weight. I think with many backends, you
will be beaten badly by contention on the spinlocks alone.
I'll polish up and post what I've been experimenting with, so we can compare.
So, here's a WIP patch of what I've been working on. WAL insertion is
split into two stages:
1. Reserve the space from the WAL stream. This is done while holding a
spinlock. The page holding the reserved space doesn't necessarily need to
be in cache yet; the reservation can run ahead of the WAL buffer cache.
(Quick testing suggests that an lwlock is too heavy-weight for this.)
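Stage 1 boils down to a tiny critical section that just bumps an insert
position. A minimal sketch of that idea, with illustrative names
(`InsertState`, `ReserveXLogSpace` are stand-ins, not the patch's actual
identifiers, and page-header overhead in the WAL stream is ignored here):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

typedef struct
{
    /* in the real patch a spinlock in shared memory protects this */
    XLogRecPtr  CurrPos;        /* next free byte in the WAL stream */
} InsertState;

/*
 * Reserve 'size' bytes of WAL and return the start of the reserved
 * region. The caller copies the record there later (stage 2), possibly
 * after the target page has been initialized in the WAL buffer cache.
 */
static XLogRecPtr
ReserveXLogSpace(InsertState *state, uint32_t size)
{
    /* SpinLockAcquire(&state->lock); -- in the real thing */
    XLogRecPtr start = state->CurrPos;
    state->CurrPos += size;
    /* SpinLockRelease(&state->lock); */
    return start;
}
```

The point is that the serialized portion is only a read and an add, which
is why a spinlock beats an lwlock here.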
2. Ensure the page is in the WAL buffer cache. If not, initialize it,
evicting old pages if needed. Then finish the CRC calculation of the
header and memcpy the record in place. (if the record spans multiple
pages, it operates on one page at a time, to avoid problems with running
out of WAL buffers)
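The page-at-a-time copy in stage 2 can be sketched as a loop that clamps
each memcpy to the current page boundary. `GetXLogBuffer` here is a
hypothetical stand-in for looking up (or initializing, evicting as needed)
the cached page holding a given WAL position; `XLOG_BLCKSZ` matches
PostgreSQL's WAL page size:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XLOG_BLCKSZ 8192

/*
 * Copy a record into the WAL buffer cache one page at a time, so a
 * record spanning pages never needs more than one WAL buffer at once.
 * Returns the number of pages touched.
 */
static int
CopyRecordByPage(uint64_t pos, const char *rec, uint32_t len,
                 char *(*GetXLogBuffer)(uint64_t pagestart))
{
    int pages = 0;

    while (len > 0)
    {
        uint32_t offset = (uint32_t) (pos % XLOG_BLCKSZ);
        uint32_t chunk = XLOG_BLCKSZ - offset;   /* room left on page */

        if (chunk > len)
            chunk = len;
        /* may need to initialize the page, evicting an old one */
        char *page = GetXLogBuffer(pos - offset);
        memcpy(page + offset, rec, chunk);
        pos += chunk;
        rec += chunk;
        len -= chunk;
        pages++;
    }
    return pages;
}

/* trivial fake "buffer cache" for demonstration only */
static char fake_cache[4 * XLOG_BLCKSZ];

static char *
FakeGetXLogBuffer(uint64_t pagestart)
{
    return fake_cache + (pagestart % sizeof(fake_cache));
}
```

A record of 10000 bytes starting 100 bytes before a page boundary would
touch three pages: the tail of the first page, one full page, and the
head of a third.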
As long as wal_buffers is high enough, and the I/O can keep up, stage 2
can happen in parallel in many backends. The WAL writer process
pre-initializes new pages ahead of the insertions, so regular backends
rarely need to do that.
When a page is written out with XLogWrite(), you first need to wait for
any in-progress insertions to the pages you're about to write out to
finish. For that, every backend has a slot with an XLogRecPtr in shared
memory. It's set to the position where that backend is currently
inserting. If there's no insertion in progress, it's invalid, but when
it's valid it acts as a barrier, so that no one is allowed to XLogWrite()
beyond that position. That's very lightweight for the backends, but at
the moment I'm using busy-waiting to wait for an insertion to finish.
That should be replaced with something smarter; that's the biggest
missing part of the patch.
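Computing how far XLogWrite() may safely go is then just a minimum over
the valid slots. A sketch, with illustrative names (`GetSafeWritePos` is
not the patch's actual function; 0 stands in for an invalid position):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/*
 * Each backend publishes the WAL position it is currently inserting to;
 * InvalidXLogRecPtr means no insertion in progress. Return the position
 * up to which XLogWrite() may safely flush: the minimum of all valid
 * in-progress slots, or 'upto' if none are in progress. The writer
 * would (busy-)wait until this reaches its target position.
 */
static XLogRecPtr
GetSafeWritePos(const XLogRecPtr *slots, int nslots, XLogRecPtr upto)
{
    XLogRecPtr safe = upto;

    for (int i = 0; i < nslots; i++)
    {
        if (slots[i] != InvalidXLogRecPtr && slots[i] < safe)
            safe = slots[i];
    }
    return safe;
}
```

Reading the slots costs the writer one pass over shared memory; the
inserters only ever touch their own slot, which is what keeps the common
path so cheap.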
One simple way to test the performance impact of this is:
psql -c "DROP TABLE IF EXISTS foo; CREATE TABLE foo (id int4);"
echo "BEGIN; INSERT INTO foo SELECT i FROM generate_series(1, 10000) i;
ROLLBACK;" > parallel-insert-test.sql
pgbench -n -T 10 -c4 -f parallel-insert-test.sql postgres
On my dual-core laptop, this patch increases the tps on that from about
60 to 110.
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)