I've posted another version of the buffer rewrite patch.

Another thing that might be interesting on a multi-CPU Opteron is to try
to make the shared memory layout more friendly to the CPU cache, which I
believe uses 128-byte cache lines.  (Simon was planning to try some of
these things but I haven't heard back about results.)  Things to try
here include:

1. Change ALIGNOF_BUFFER in src/include/pg_config_manual.h to 128.
This will require a full recompile, I think.  Steps 2 and 3 don't make
any sense until after you do this.

2. Pad the BufferDesc struct (in src/include/storage/buf_internals.h)
out to be exactly 64 or 128 bytes.  (64 would make it exactly 2 buffer
headers per cache line, so two CPUs would contend only when working on
a pair of adjacent headers.  128 would mean no cross-header cache
contention but of course it wastes a lot more storage.)  You need only
recompile the files in src/backend/storage/buffer/ after changing this.

3. Pad the LWLock struct (in src/backend/storage/lmgr/lwlock.c) to some
power of 2 up to 128 bytes --- same issue of space wasted versus
cross-lock contention.
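
For anyone who wants to see the padding idea in 2 and 3 concretely, here
is a rough standalone sketch, not the actual patch.  DemoDesc is a made-up
stand-in (the real BufferDesc fields are different), and CACHE_LINE_SIZE,
DESC_PAD_SIZE, cache_line_of() and align_up() are names invented for this
illustration.  It assumes the 128-byte line size guessed at above; adjust
if the hardware says otherwise.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uintptr_t */

/* Assumption: 128-byte cache lines, per the guess in the text above. */
#define CACHE_LINE_SIZE 128
/* 64 = two headers per cache line, 128 = one header per cache line. */
#define DESC_PAD_SIZE   64

/* Hypothetical stand-in for a buffer header; NOT the real BufferDesc. */
typedef struct DemoDesc
{
    int      tag[4];
    unsigned flags;
    unsigned refcount;
    int      buf_id;
} DemoDesc;

/* A union with a char array forces sizeof() up to the padded size. */
typedef union DemoDescPadded
{
    DemoDesc desc;
    char     pad[DESC_PAD_SIZE];
} DemoDescPadded;

/* Fail the build if the struct ever outgrows its padding. */
static_assert(sizeof(DemoDesc) <= DESC_PAD_SIZE,
              "DemoDesc no longer fits in DESC_PAD_SIZE");
static_assert(sizeof(DemoDescPadded) == DESC_PAD_SIZE,
              "padded size is not exactly DESC_PAD_SIZE");

/*
 * Which cache line does array element 'index' start on, assuming the
 * array itself begins on a cache-line boundary (that is what step 1 buys)?
 */
static size_t
cache_line_of(size_t index)
{
    return (index * sizeof(DemoDescPadded)) / CACHE_LINE_SIZE;
}

/* Round a pointer up to the next multiple of 'alignment' (a power of 2). */
static void *
align_up(void *ptr, size_t alignment)
{
    uintptr_t p = (uintptr_t) ptr;

    return (void *) ((p + alignment - 1) & ~((uintptr_t) alignment - 1));
}
```

With DESC_PAD_SIZE at 64, elements 0 and 1 share a line and element 2
starts the next one; setting it to 128 gives every header its own line at
twice the storage cost.  The same union trick would apply to a lock
struct.  Note that the padding only helps if the start of the array is
itself aligned, which is the point of doing step 1 first.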

