From: Alvaro Herrera
> Tsunakawa, Takayuki wrote:
> > (Although unrelated to this, I've also been wondering why PostgreSQL
> > flushes WAL to disk when writing a page in the shared buffer, because
> > PostgreSQL doesn't use WAL for undo.)
> The reason is that if the system crashes after writing the data page to
> disk, but before writing the WAL, the data page would be inconsistent with
> data in pages that weren't flushed, since there is no WAL to update those
> other pages.  Also, if the system crashes after partially writing the page
> (say it writes the first 4kB) then the page is downright corrupted with
> no way to fix it.
> So there has to be a barrier that ensures that the WAL is flushed up to
> the last position that modified a page (i.e. that page's LSN) before actually
> writing that page to disk.  And this is why we can't use mmap() for shared
> buffers -- there is no mechanism to force the WAL down if the operating
> system has the liberty to flush pages whenever it likes.

I see.  The latter is a torn page problem, which is solved by a full page image
WAL record.  I understand that an example of the former problem is the
inconsistency between a table page and an index page -- if an index page is
flushed to disk without flushing the WAL and the corresponding table page, an
index entry would point to a wrong table record after recovery.
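
To check my understanding of the WAL-before-data rule, here is a toy Python
model of it.  This is only a sketch under my own assumptions, not PostgreSQL's
actual buffer manager code: the names ToyStorage, log_change, flush_wal and
write_page are hypothetical, the "disk" is a dict, and the LSN is simply the
1-based position of a record in an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    data: str = ""
    lsn: int = 0  # LSN of the last WAL record that modified this page


@dataclass
class ToyStorage:
    wal: list = field(default_factory=list)   # WAL records, held in memory
    flushed_lsn: int = 0                      # WAL is "durable" up to this LSN
    disk: dict = field(default_factory=dict)  # page_id -> (data, lsn)

    def log_change(self, page: Page, new_data: str) -> None:
        """Append a WAL record, then apply the change to the in-memory page."""
        self.wal.append(new_data)
        page.data = new_data
        page.lsn = len(self.wal)  # toy LSN: 1-based position in the WAL

    def flush_wal(self, upto_lsn: int) -> None:
        """Mark the WAL durable up to upto_lsn (no-op if already flushed)."""
        self.flushed_lsn = max(self.flushed_lsn, upto_lsn)

    def write_page(self, page_id: int, page: Page) -> None:
        """Write a page out, flushing the WAL up to the page's LSN first."""
        if page.lsn > self.flushed_lsn:
            self.flush_wal(page.lsn)  # the barrier discussed above
        self.disk[page_id] = (page.data, page.lsn)


store = ToyStorage()
table_page, index_page = Page(), Page()
store.log_change(table_page, "row v1")        # WAL record at LSN 1
store.log_change(index_page, "entry -> row")  # WAL record at LSN 2
store.write_page(7, index_page)               # forces WAL flush up to LSN 2
```

In this model the index page reaches disk while the table page does not, but
the WAL is durable up to LSN 2, so the record for the table page (LSN 1) would
be available for replay after a crash.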

Thanks, my long-standing question has been solved.

Takayuki Tsunakawa
