On 1/31/16 3:26 PM, Jan Wieck wrote:
On 01/27/2016 08:30 AM, Amit Kapila wrote:
operation.  Now why OS couldn't find the corresponding block in
memory is that, while closing the WAL file, we use
POSIX_FADV_DONTNEED if wal_level is less than 'archive' which
lead to this problem.  So with this experiment, the conclusion is that
though we can avoid re-write of WAL data by doing exact writes, but
it could lead to significant reduction in TPS.

POSIX_FADV_DONTNEED isn't the only way how those blocks would vanish
from OS buffers. If I am not mistaken we recycle WAL segments in a round
robin fashion. In a properly configured system, where the reason for a
checkpoint is usually "time" rather than "xlog", a recycled WAL file
written to had been closed and not touched for about a complete
checkpoint_timeout or longer. You must have a really big amount of spare
RAM in the machine to still find those blocks in memory. Basically we
are talking about the active portion of your database, shared buffers,
the sum of all process local memory and the complete pg_xlog directory
content fitting into RAM.

But that's only going to matter when the segment is newly recycled. My impression from Amit's email is that the OS was repeatedly reading even in the same segment?

Either way, I would think it wouldn't be hard to work around this by spewing out a bunch of zeros to the OS in advance of where we actually need to write, preventing the need for reading back from disk.

Amit, did you do performance testing with archiving enabled an a no-op archive_command?
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to