[HACKERS] Re: [HACKERS] Wal sync odirect

2013-07-22 Thread Миша Тюрин

I am talking about the case where wal_level is higher than minimal.


wal_level != minimal
http://doxygen.postgresql.org/xlogdefs_8h_source.html

 * Because O_DIRECT bypasses the kernel buffers, and because we never
 * read those buffers except during crash recovery or if wal_level != minimal

 Hi, list. Here is my proposal. I would like to talk about using O_DIRECT for
 WAL sync when wal_level is higher than minimal. In my case WAL traffic is up
 to 1 GB every 2-3 minutes, and the disk hardware has a 2 GB BBU cache (or
 there may be an SSD under the WAL), so I think it would be better if WAL
 traffic could not drive OS memory eviction. I do not use streaming, and my
 archive command can read WAL directly, without the OS cache. This is just an
 opinion; I have not done any tests yet. But I am still seeing a memory
 eviction anomaly.

PostgreSQL already uses O_DIRECT for WAL writes if you use O_SYNC mode
for WAL writes. See comments in src/include/access/xlogdefs.h (search
for O_DIRECT). You should also examine
src/backend/access/transam/xlog.c, particularly the function
get_sync_bit(...)
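
To make the idea concrete, here is a minimal sketch of the kind of decision such a function makes. It is not copied from xlog.c: the enum names, the function name sketch_sync_bit and the force_direct switch are all hypothetical, and force_direct stands in for the change being proposed in this thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#endif
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real settings; names are illustrative only. */
typedef enum { SYNC_FSYNC, SYNC_FDATASYNC, SYNC_OPEN_SYNC, SYNC_OPEN_DATASYNC } sketch_sync_method;
typedef enum { WAL_MINIMAL, WAL_ARCHIVE, WAL_HOT_STANDBY } sketch_wal_level;

#ifndef O_DIRECT
#define O_DIRECT 0              /* no O_DIRECT on this platform: flag is a no-op */
#endif
#ifndef O_DSYNC
#define O_DSYNC O_SYNC          /* fall back to the stronger guarantee */
#endif

/* Return the extra open() flags to use for a WAL segment file. */
static int
sketch_sync_bit(sketch_sync_method method, sketch_wal_level level, bool force_direct)
{
    int direct = 0;

    /*
     * Bypassing the page cache only helps if nobody re-reads the WAL soon
     * after it is written. force_direct is the assumed new knob that would
     * allow it for higher wal_level settings as well.
     */
    if (level == WAL_MINIMAL || force_direct)
        direct = O_DIRECT;

    switch (method)
    {
        case SYNC_OPEN_SYNC:
            return O_SYNC | direct;
        case SYNC_OPEN_DATASYNC:
            return O_DSYNC | direct;
        default:
            /* fsync/fdatasync flush with an explicit call, so no open flag */
            return 0;
    }
}
```

With this shape, O_DIRECT is only ever combined with one of the open-time sync flags, which matches the statement above that direct I/O applies when WAL is written in O_SYNC mode.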

Try doing some tests with pg_test_fsync and see how performance looks. If
your theory is right and WAL traffic is putting pressure on kernel write
buffers, setting wal_sync_method = open_datasync may help. Check which
method is the default on your platform first; on Linux it is fdatasync.

I'd recommend doing some detailed tracing and performance measurements
before trying to proceed further.

-- 
 Craig Ringer  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] Re: [HACKERS] Wal sync odirect

2013-07-22 Thread Craig Ringer
On 07/22/2013 03:30 PM, Миша Тюрин wrote:
 
 I am talking about the case where wal_level is higher than minimal.

OK, so you want to be able to force O_DIRECT for wal_level = archive?

I guess that makes sense if you expect the archive_command to read the
file out of the RAID controller's write cache before it gets flushed and
your archive_command can also use direct I/O to avoid pulling it into cache.
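
As an illustration of what "archive_command can also use direct I/O" could look like, here is a rough sketch of a copy helper that reads the source file with O_DIRECT where the platform and filesystem allow it, and falls back to a buffered read otherwise. The function name sketch_copy_uncached and the alignment and chunk sizes are assumptions for the example, not anything PostgreSQL ships.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

#define SKETCH_ALIGN 4096           /* assumed to satisfy O_DIRECT alignment */
#define SKETCH_CHUNK (64 * 1024)    /* a multiple of SKETCH_ALIGN */

/*
 * Hypothetical helper: copy src to dst while trying to keep src out of the
 * page cache. Returns the number of bytes copied, or -1 on error.
 */
static long
sketch_copy_uncached(const char *src, const char *dst)
{
    struct stat st;
    void   *buf = NULL;
    long    total = 0;
    int     in = -1;
    int     out = -1;

#ifdef O_DIRECT
    in = open(src, O_RDONLY | O_DIRECT);
#endif
    if (in < 0)
        in = open(src, O_RDONLY);   /* no direct I/O here: buffered read */
    if (in < 0)
        return -1;

    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0 || fstat(in, &st) != 0 ||
        posix_memalign(&buf, SKETCH_ALIGN, SKETCH_CHUNK) != 0)
    {
        total = -1;
        goto done;
    }

    /* Stop at the known size so we never issue a read at an unaligned EOF. */
    while (total < (long) st.st_size)
    {
        ssize_t n = read(in, buf, SKETCH_CHUNK);

        if (n < 0 && errno == EINTR)
            continue;
#ifdef O_DIRECT
        if (n < 0 && errno == EINVAL)
        {
            /* Filesystem rejected the direct read: drop the flag and retry. */
            int fl = fcntl(in, F_GETFL);

            if (fl >= 0 && (fl & O_DIRECT) &&
                fcntl(in, F_SETFL, fl & ~O_DIRECT) == 0)
                continue;
        }
#endif
        if (n <= 0)
        {
            total = -1;             /* read error or unexpected EOF */
            break;
        }

        for (ssize_t off = 0; off < n;)
        {
            ssize_t w = write(out, (char *) buf + off, (size_t) (n - off));

            if (w < 0 && errno == EINTR)
                continue;
            if (w < 0)
            {
                total = -1;
                goto done;
            }
            off += w;
        }
        total += n;
    }

done:
    if (in >= 0)
    {
        /* No effect after direct reads; drops pages if we read buffered. */
        (void) posix_fadvise(in, 0, 0, POSIX_FADV_DONTNEED);
        close(in);
    }
    if (out >= 0)
        close(out);
    free(buf);
    return total;
}
```

A real archive_command would also need to fsync the destination and report failure through its exit status; both are left out to keep the sketch short.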

You already know where to change to start experimenting with this. What
exactly are you trying to ask? I don't see any risk in forcing O_DIRECT
for higher wal_level, but I'm not an expert in WAL and recovery. I'd
recommend testing on a non-critical PostgreSQL instance.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: [HACKERS] Wal sync odirect

2013-07-22 Thread Cédric Villemain
On Monday 22 July 2013 09:39:50, Craig Ringer wrote:
 On 07/22/2013 03:30 PM, Миша Тюрин wrote:
  
  I am talking about the case where wal_level is higher than minimal.
 
 OK, so you want to be able to force O_DIRECT for wal_level = archive?
 
 I guess that makes sense if you expect the archive_command to read the
 file out of the RAID controller's write cache before it gets flushed and
 your archive_command can also use direct I/O to avoid pulling it into cache.
 
 You already know where to change to start experimenting with this. What
 exactly are you trying to ask? I don't see any risk in forcing O_DIRECT
 for higher wal_level, but I'm not an expert in WAL and recovery. I'd
 recommend testing on a non-critical PostgreSQL instance.

IIRC there is also a posix_fadvise() call to drop the WAL file from the buffer
cache when using wal_level = minimal, but not when WAL archiving is in use.
I'm unsure how this has been tuned since streaming replication was added.
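
The behaviour described above can be sketched roughly as follows. This is an illustration only, not the code from xlog.c; the function name sketch_close_wal_file and the wal_will_be_reread flag are hypothetical and stand in for whatever test the server actually applies.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper: close a finished WAL segment, first hinting to the
 * kernel that its cached pages are not needed when nobody (archiver, WAL
 * sender) is expected to read the file again.
 *
 * Returns 1 if the hint was issued, 0 if it was skipped or refused,
 * and -1 if close() failed.
 */
static int
sketch_close_wal_file(int fd, bool wal_will_be_reread)
{
    int advised = 0;

#ifdef POSIX_FADV_DONTNEED
    if (!wal_will_be_reread)
    {
        /*
         * offset 0 with len 0 means "the whole file". Only clean pages can
         * be dropped, so the caller should have synced the file already.
         */
        if (posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED) == 0)
            advised = 1;
    }
#endif

    if (close(fd) != 0)
        return -1;
    return advised;
}
```

With archiving or streaming in use the flag would be true and the pages stay cached, which is the case the original poster is asking about.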

see xlog.c|h

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: 24x7 Support - Development, Expertise and Training