Attached is a patch that aligns large shared memory allocations to a stricter boundary than MAXIMUM_ALIGNOF. The reason is that Intel's CPUs have a fast path for bulk memory copies that only works with aligned addresses. It's possible that other CPUs have similar restrictions.
With 7.3.4, it achieves a 5% performance gain with pgbench. It has no effect with 7.3.3, because the buffers there are already aligned by chance. I haven't properly tested 7.4cvs yet.
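
For illustration, here is a minimal sketch of the rounding step the patch relies on. The helper name align_up is hypothetical and is only a local stand-in for what TYPEALIGN(32, newStart) is used for in the patch; it assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the rounding done via TYPEALIGN in the patch:
 * round "offset" up to the next multiple of "alignment".
 * Assumption: "alignment" is a non-zero power of two.
 */
static uint32_t
align_up(uint32_t alignment, uint32_t offset)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

	/* Adding (alignment - 1) and masking the low bits rounds upwards. */
	return (offset + (alignment - 1)) & ~(alignment - 1);
}
```

With a 32-byte alignment, an offset of 100 is rounded up to 128, so an aligned allocation wastes at most 31 bytes of shared memory.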


One problem is the "32": it's arbitrary, and it probably belongs in an architecture-dependent header file. But where?
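
As a rough sketch of what moving the constant out of shmem.c could look like: the macro name SHMEM_BUFFER_ALIGN and the helper aligned_start below are hypothetical and not part of the patch. The idea is only that a platform header could override the default of 32.

```c
#include <stdint.h>

/*
 * Hypothetical name, not part of the patch. A platform-specific header
 * could define this before this point to override the default.
 * Assumption: the value is a power of two.
 */
#ifndef SHMEM_BUFFER_ALIGN
#define SHMEM_BUFFER_ALIGN 32
#endif

/*
 * Hypothetical helper mirroring the patched logic in ShmemAlloc:
 * only requests of at least block_size bytes get the stricter alignment;
 * smaller requests keep the current free offset unchanged.
 */
static uint32_t
aligned_start(uint32_t free_offset, uint32_t size, uint32_t block_size)
{
	uint32_t	start = free_offset;

	if (size >= block_size)
		start = (start + (SHMEM_BUFFER_ALIGN - 1)) &
			~((uint32_t) SHMEM_BUFFER_ALIGN - 1);

	return start;
}
```

For example, with an 8192-byte block size, a request for 8192 bytes at free offset 100 would start at 128, while a 64-byte request would still start at 100.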

--
   Manfred
diff -u pgsql.orig/src/backend/storage/ipc/shmem.c pgsql/src/backend/storage/ipc/shmem.c
--- pgsql.orig/src/backend/storage/ipc/shmem.c  2003-09-20 20:17:08.000000000 +0200
+++ pgsql/src/backend/storage/ipc/shmem.c       2003-09-20 20:34:21.000000000 +0200
@@ -131,6 +131,7 @@
 void *
 ShmemAlloc(Size size)
 {
+       uint32          newStart;
        uint32          newFree;
        void       *newSpace;
 
@@ -146,10 +147,21 @@
 
        SpinLockAcquire(ShmemLock);
 
-       newFree = shmemseghdr->freeoffset + size;
+       newStart = shmemseghdr->freeoffset;
+       if (size >= BLCKSZ)
+       {
+               /* Align BLCKSZ sized buffers even further:
+                * - the costs are small
+                * - some cpus (most notably Intel Pentium III)
+                *   prefer well-aligned addresses for memory copies
+                */
+               newStart = TYPEALIGN(32, newStart);
+       }
+
+       newFree = newStart + size;
        if (newFree <= shmemseghdr->totalsize)
        {
-               newSpace = (void *) MAKE_PTR(shmemseghdr->freeoffset);
+               newSpace = (void *) MAKE_PTR(newStart);
                shmemseghdr->freeoffset = newFree;
        }
        else