[HACKERS] PANIC caused by open_sync on Linux

ITAGAKI Takahiro Thu, 25 Oct 2007 21:25:32 -0700

I encountered PANICs on CentOS 5.0 when I ran a write-mostly workload.
They occur only if wal_sync_method is set to open_sync; there was
no problem with fdatasync. They occurred on both Postgres 8.2.5 and 8.3dev.


  PANIC:  could not write to log file 0, segment 212 at offset 3399680,
          length 737280: Input/output error
  STATEMENT:  COMMIT;

My nearby Linux guy says mixed usage of buffered I/O and direct I/O
could cause errors (EIO) on many versions of Linux kernels. If we use
buffered I/O before direct I/O, Linux could fail to discard the kernel buffer
cache of the region and report EIO -- yes, it's a bug in Linux.

We use buffered I/O on WAL segments even if wal_sync_method is open_sync.
We initialize segments with zeros using buffered I/O, and after that,
we re-open them with the specified sync options.
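
The sequence above can be sketched in C roughly as follows. This is only a
minimal illustration of the pattern, not the actual PostgreSQL code: the
function names, the 8192-byte block size, the 4096-byte alignment and the
WAL_DIRECT_FLAG macro are all assumptions made for the example. It zero-fills
a file with ordinary buffered writes, re-opens it with caller-supplied sync
flags, and then writes one aligned block through the new descriptor.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define ZERO_BLOCK 8192         /* illustrative block size (assumption) */

/* Use O_DIRECT only where the platform headers expose it (assumption). */
#ifdef O_DIRECT
#define WAL_DIRECT_FLAG O_DIRECT
#else
#define WAL_DIRECT_FLAG 0
#endif

/* Step 1: create the segment and zero-fill it with buffered writes. */
int zero_fill_segment(const char *path, size_t seg_size)
{
    char zeros[ZERO_BLOCK];
    int  fd;

    assert(seg_size % ZERO_BLOCK == 0);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    memset(zeros, 0, sizeof(zeros));
    for (size_t done = 0; done < seg_size; done += sizeof(zeros)) {
        if (write(fd, zeros, sizeof(zeros)) != (ssize_t) sizeof(zeros)) {
            close(fd);
            return -1;
        }
    }
    if (fsync(fd) != 0) {       /* flush the buffered zeros to disk */
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Step 2: re-open the same file with the sync flags chosen by the caller. */
int reopen_for_wal(const char *path, int sync_flags)
{
    return open(path, O_RDWR | sync_flags);
}

/* Step 3: write one block from an aligned buffer, as direct I/O requires. */
ssize_t write_aligned_block(int fd, off_t offset)
{
    void   *buf = NULL;
    ssize_t n;

    if (posix_memalign(&buf, 4096, ZERO_BLOCK) != 0)
        return -1;
    memset(buf, 'x', ZERO_BLOCK);
    n = pwrite(fd, buf, ZERO_BLOCK, offset);
    free(buf);
    return n;
}
```

Calling reopen_for_wal(path, O_SYNC | WAL_DIRECT_FLAG) after
zero_fill_segment() reproduces the buffered-then-direct mix described above;
on an affected kernel, the EIO would be expected to surface from the pwrite()
in step 3.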

The behavior of the bug differs between RHEL 4 and 5.
  RHEL 4 -> No error is reported even though the kernel cache is inconsistent.
  RHEL 5 -> write() fails with EIO (Input/output error)
PANIC occurs only on RHEL 5, but RHEL 4 also has a problem. If a WAL archiver
reads the inconsistent cache of WAL segments, it could archive wrong contents
and PITR might fail at the corrupted archived file.


I recommend that users on Linux not use open_sync until the bug is
fixed. However, are there any ideas for avoiding the bug while still using
direct I/O? Mixed usage of buffered and direct I/O is legal, but it forces
complexity on kernels. If we simplify it, things would be easier. For example,
we could drop zero-filling and use only direct I/O. Is it possible?
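
One possible reading of that idea, sketched in C under stated assumptions:
the segment is created and filled through a single descriptor opened with the
final sync flags, so the file is never touched in a second I/O mode. The
function name, the DIRECT_FLAG macro, the block size and the alignment are
hypothetical choices for the example, not PostgreSQL code, and whether this
actually avoids the kernel bug is untested.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define ALIGNMENT  4096         /* assumed alignment for direct I/O */
#define BLOCK_SIZE 8192         /* illustrative block size (assumption) */

/* Use O_DIRECT only where the platform headers expose it (assumption). */
#ifdef O_DIRECT
#define DIRECT_FLAG O_DIRECT
#else
#define DIRECT_FLAG 0
#endif

/*
 * Create a segment and fill it with zeros through one descriptor that is
 * opened with the final sync flags from the start.
 * Returns 0 on success, -1 on failure.
 */
int create_segment_single_mode(const char *path, size_t seg_size,
                               int sync_flags)
{
    void *zeros = NULL;
    int   rc = -1;
    int   fd;

    assert(seg_size % BLOCK_SIZE == 0);
    if (posix_memalign(&zeros, ALIGNMENT, BLOCK_SIZE) != 0)
        return -1;
    memset(zeros, 0, BLOCK_SIZE);

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC | sync_flags, 0600);
    if (fd >= 0) {
        size_t done = 0;

        /* Every write goes through the same descriptor and flags. */
        while (done < seg_size &&
               write(fd, zeros, BLOCK_SIZE) == (ssize_t) BLOCK_SIZE)
            done += BLOCK_SIZE;
        if (close(fd) == 0 && done == seg_size)
            rc = 0;
    }
    free(zeros);
    return rc;
}
```

Passing O_SYNC | DIRECT_FLAG as sync_flags would request direct I/O where it
is available. Note that open() with O_DIRECT can fail on file systems that do
not support it, so a caller would need a fallback.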

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

