Hi list,

PostgreSQL's default settings change when built with Linux kernel
headers 2.6.33 or newer. As discussed on the pgsql-performance list,
this causes a significant performance regression:
http://archives.postgresql.org/pgsql-performance/2010-10/msg00602.php

NB! I am not proposing to change the default -- to the contrary --
this patch restores old behavior. Users might be in for a nasty
performance surprise when re-building their Postgres with newer Linux
headers (as was I), so I propose that this change should be made in
all supported releases.

-- commit message --
Revert default wal_sync_method to fdatasync on Linux 2.6.33+

Linux kernel headers from 2.6.33 (and later) change the behavior of the
O_SYNC flag. Previously O_SYNC was aliased to O_DSYNC, which caused
PostgreSQL to use fdatasync as the default instead.

Starting with kernels 2.6.33 and later, the definitions of O_DSYNC and
O_SYNC differ. When built with headers from these newer kernels,
PostgreSQL will default to using open_datasync. This patch reverts the
Linux default to fdatasync, which has had much more testing over time
and also significantly better performance.
-- end commit message --

Earlier kernel headers defined O_SYNC and O_DSYNC to 0x1000
2.6.33 and later define O_SYNC=0x101000 and O_DSYNC=0x1000 (since old
behavior on most FS-es was always equivalent to POSIX O_DSYNC)

More details at:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6b2f3d1f769be5779b479c37800229d9a4809fc3

Currently PostgreSQL's include/access/xlogdefs.h defaults to using
open_datasync when O_SYNC != O_DSYNC, otherwise fdatasync is used.

Since other platforms might want to default to fdatasync in the
future, too, I defined a new PLATFORM_DEFAULT_SYNC_METHOD constant in
include/port/linux.h. I don't know if this is the best way to do it.

Regards,
Marti
From fbf61c6536b7060e5c6745c8221a5a4fb9a53c92 Mon Sep 17 00:00:00 2001
From: Marti Raudsepp <ma...@juffo.org>
Date: Fri, 5 Nov 2010 17:40:22 +0200
Subject: [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+

Linux kernel headers from 2.6.33 (and later) change the behavior of the
O_SYNC flag. Previously O_SYNC was aliased to O_DSYNC, which caused
PostgreSQL to use fdatasync as the default instead.

Starting with kernels 2.6.33 and later, the definitions of O_DSYNC and
O_SYNC differ. When built with headers from these newer kernels,
PostgreSQL will default to using open_datasync. This patch reverts the
Linux default to fdatasync, which has had much more testing over time
and also significantly better performance.
---
 src/include/access/xlogdefs.h |    5 +++++
 src/include/port/linux.h      |    9 +++++++++
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/src/include/access/xlogdefs.h b/src/include/access/xlogdefs.h
index 18b214e..7ed1bde 100644
--- a/src/include/access/xlogdefs.h
+++ b/src/include/access/xlogdefs.h
@@ -13,6 +13,7 @@
 #define XLOG_DEFS_H
 
 #include <fcntl.h>				/* need open() flags */
+#include "pg_config_os.h"
 
 /*
  * Pointer to a location in the XLOG.  These pointers are 64 bits wide,
@@ -123,6 +124,9 @@ typedef uint32 TimeLineID;
 #endif
 #endif
 
+#if defined(PLATFORM_DEFAULT_SYNC_METHOD)
+#define DEFAULT_SYNC_METHOD PLATFORM_DEFAULT_SYNC_METHOD
+#else							/* !defined(PLATFORM_DEFAULT_SYNC_METHOD) */
 #if defined(OPEN_DATASYNC_FLAG)
 #define DEFAULT_SYNC_METHOD		SYNC_METHOD_OPEN_DSYNC
 #elif defined(HAVE_FDATASYNC)
@@ -132,6 +136,7 @@ typedef uint32 TimeLineID;
 #else
 #define DEFAULT_SYNC_METHOD		SYNC_METHOD_FSYNC
 #endif
+#endif
 
 /*
  * Limitation of buffer-alignment for direct IO depends on OS and filesystem,
diff --git a/src/include/port/linux.h b/src/include/port/linux.h
index b9498b2..5a88590 100644
--- a/src/include/port/linux.h
+++ b/src/include/port/linux.h
@@ -12,3 +12,12 @@
  * to have a kernel version test here.
  */
 #define HAVE_LINUX_EIDRM_BUG
+
+/*
+ * The definition of O_SYNC changed in Linux kernel headers starting with
+ * 2.6.33.  This caused PostgreSQL to change the wal_sync_method default from
+ * fdatasync to open_datasync.  Since fdatasync is more tested on Linux and has
+ * better performance, default to fdatasync on Linux.
+ */
+#define PLATFORM_DEFAULT_SYNC_METHOD SYNC_METHOD_FDATASYNC
+
-- 
1.7.3.2

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to