Re: [HACKERS] Re: [HACKERS] Wal sync odirect

2013-07-22 Thread Cédric Villemain
On Monday 22 July 2013 09:39:50, Craig Ringer wrote:
> On 07/22/2013 03:30 PM, Миша Тюрин wrote:
> > 
> > I am talking about the case where wal_level is higher than MINIMAL
> 
> OK, so you want to be able to force O_DIRECT for wal_level = archive ?
> 
> I guess that makes sense if you expect the archive_command to read the
> file out of the RAID controller's write cache before it gets flushed and
> your archive_command can also use direct I/O to avoid pulling it into cache.
> 
> You already know where to change to start experimenting with this. What
> exactly are you trying to ask? I don't see any risk in forcing O_DIRECT
> for higher wal_level, but I'm not an expert in WAL and recovery. I'd
> recommend testing on a non-critical PostgreSQL instance.

IIRC there is also a posix_fadvise() call that drops the WAL file from the
buffer cache when wal_level is 'minimal', but not when WAL archiving is in use.
I'm unsure how this has been tuned since streaming replication was added.

see xlog.c|h
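
To make the fadvise() remark concrete, here is a minimal sketch of the idea: when a finished WAL segment will not be read again by an archiver or a WAL sender, the closing code can hint the kernel to drop its cached pages. The helper name and the wal_needed_later parameter are hypothetical and only illustrate the pattern; this is not the code from xlog.c.

```c
#define _XOPEN_SOURCE 600      /* for posix_fadvise() */
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Closes a WAL segment file descriptor. If nobody is expected to read the
 * segment again soon (no archiving, no streaming), ask the kernel to drop
 * its cached pages first. Returns the result of close().
 */
int demo_close_wal_segment(int fd, bool wal_needed_later)
{
    assert(fd >= 0);

    if (!wal_needed_later)
    {
#if defined(POSIX_FADV_DONTNEED)
        /* offset 0, length 0 means "the whole file"; the call is only a hint */
        (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
#endif
    }
    return close(fd);
}
```

The hint is advisory, so its return value is deliberately ignored here; dirty pages that have not been written back yet are not discarded by it.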

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: 24x7 Support - Development, Expertise and Training


Re: [HACKERS] Re: [HACKERS] Wal sync odirect

2013-07-22 Thread Craig Ringer
On 07/22/2013 03:30 PM, Миша Тюрин wrote:
> 
> I am talking about the case where wal_level is higher than MINIMAL

OK, so you want to be able to force O_DIRECT for wal_level = archive ?

I guess that makes sense if you expect the archive_command to read the
file out of the RAID controller's write cache before it gets flushed and
your archive_command can also use direct I/O to avoid pulling it into cache.

You already know where to change to start experimenting with this. What
exactly are you trying to ask? I don't see any risk in forcing O_DIRECT
for higher wal_level, but I'm not an expert in WAL and recovery. I'd
recommend testing on a non-critical PostgreSQL instance.
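
As a rough sketch of what "your archive_command can also use direct I/O" could look like on the reading side, the function below opens a file with O_DIRECT and reads it through an aligned buffer, so the data is not pulled into the OS page cache. The function name, the 4096-byte alignment and the 64 kB chunk size are assumptions made for this example; a real archiver would also write the data somewhere instead of only counting bytes.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0             /* platform without O_DIRECT: plain read */
#endif

#define DEMO_ALIGN 4096        /* assumed to satisfy the device's alignment */
#define DEMO_CHUNK (64 * 1024) /* multiple of DEMO_ALIGN */

/*
 * Hypothetical helper, for illustration only.
 * Reads the whole file, bypassing the page cache where the filesystem
 * allows it. Returns the number of bytes read, or -1 on error.
 */
long long demo_read_bypassing_cache(const char *path)
{
    void       *buf = NULL;
    long long   total = 0;
    ssize_t     n;
    int         fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0 && errno == EINVAL)
        fd = open(path, O_RDONLY);      /* filesystem rejects O_DIRECT */
    if (fd < 0)
        return -1;

    /* O_DIRECT needs an aligned buffer and block-multiple read sizes */
    if (posix_memalign(&buf, DEMO_ALIGN, DEMO_CHUNK) != 0)
    {
        close(fd);
        return -1;
    }

    while ((n = read(fd, buf, DEMO_CHUNK)) > 0)
        total += n;     /* a real archiver would copy buf out here */

    free(buf);
    close(fd);
    return (n < 0) ? -1 : total;
}
```

Whether this actually avoids memory pressure depends on the filesystem and kernel, so it is something to measure on a test instance, not to assume.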

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [HACKERS] Wal sync odirect

2013-07-22 Thread Миша Тюрин

I am talking about the case where wal_level is higher than MINIMAL


wal_level != minimal
http://doxygen.postgresql.org/xlogdefs_8h_source.html
"
 * Because O_DIRECT bypasses the kernel buffers, and because we never
 * read those buffers except during crash recovery or if wal_level != minimal
"
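
The rule described in that comment can be sketched roughly as follows. This is a simplified illustration of the decision, not the actual get_sync_bit() code; all names are hypothetical, and the force_direct parameter stands for the experiment discussed in this thread (using O_DIRECT even when wal_level is above minimal).

```c
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

#ifndef O_DIRECT
#define O_DIRECT 0             /* platform without O_DIRECT: no-op flag */
#endif

/* Hypothetical names for illustration; not PostgreSQL's own definitions. */
typedef enum { DEMO_WAL_MINIMAL, DEMO_WAL_ARCHIVE } demo_wal_level;
typedef enum
{
    DEMO_SYNC_FSYNC,
    DEMO_SYNC_FDATASYNC,
    DEMO_SYNC_OPEN_SYNC,
    DEMO_SYNC_OPEN_DSYNC
} demo_sync_method;

/*
 * Returns the extra open() flags for a WAL file.
 * O_DIRECT is only combined with the open-time sync methods, and by
 * default only when nobody re-reads the WAL (wal_level = minimal).
 */
int demo_wal_open_flags(demo_wal_level level, demo_sync_method method,
                        bool force_direct)
{
    int flags;

    assert(method <= DEMO_SYNC_OPEN_DSYNC);

    switch (method)
    {
        case DEMO_SYNC_OPEN_SYNC:
            flags = O_SYNC;
            break;
        case DEMO_SYNC_OPEN_DSYNC:
            flags = O_DSYNC;
            break;
        default:
            return 0;   /* fsync()/fdatasync(): no open-time flag at all */
    }

    if (level == DEMO_WAL_MINIMAL || force_direct)
        flags |= O_DIRECT;

    return flags;
}
```

With these assumptions, archive-level WAL written with an open-time sync method gets O_DSYNC or O_SYNC but not O_DIRECT, unless force_direct is set.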

>> Hi, list. Here is my proposal. I would like to discuss using O_DIRECT for
>> WAL sync when wal_level is higher than minimal. In my case WAL traffic is
>> up to 1 GB per 2-3 minutes, but the disk hardware has a 2 GB BBU cache (or
>> maybe an SSD under the WAL), so I think it would be better if WAL traffic
>> could not cause OS memory eviction. I do not use streaming, and my archive
>> command can read the WAL directly without the OS cache. This is just an
>> opinion; I have not done any tests yet, but I am still seeing a memory
>> eviction anomaly.
>
>PostgreSQL already uses O_DIRECT for WAL writes if you use O_SYNC mode
>for WAL writes. See comments in src/include/access/xlogdefs.h (search
>for O_DIRECT). You should also examine
>src/backend/access/transam/xlog.c, particularly the function
>get_sync_bit(...)
>
>Try doing some tests with pg_test_fsync and see how performance looks. If
>your theory is right and WAL traffic is putting pressure on kernel write
>buffers, using wal_sync_method=open_datasync may help. (Note that on Linux
>the default is fdatasync, so this has to be set explicitly.)
>
>I'd recommend doing some detailed tracing and performance measurements
>before trying to proceed further.
>
>-- 
> Craig Ringer  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services