On Tue, May 24, 2011 at 10:52 PM, Jeff Davis <pg...@j-davis.com> wrote:
> On Tue, 2011-05-24 at 16:34 -0400, Robert Haas wrote:
>> As I think about it a bit more, we'd
>> need to XLOG not only the parts of the page we actually modifying, but
>> any that the WAL record would need to be correct on replay.
>
> I don't understand that statement. Can you clarify?

I'll try. Suppose we have two WAL records A and B, with no intervening checkpoint, that both modify the same page. A reads chunk 1 of that page and then modifies chunk 2. B modifies chunk 1. Now, suppose we make A do a "partial page write" on chunk 2 only, and B do the same for chunk 1.

At the point the system crashes, A and B are both on disk, and the page has already been written to disk as well. Replay begins from a checkpoint preceding A. Now, when we get to the record for A, what are we to do? If it were a full page image, we could just restore it, and everything would be fine after that. But if we replay the partial page write, we've got trouble. A will now see the state of chunk 1 as it existed after the action protected by B occurred, and will presumably do the wrong thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
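
To make the A/B scenario above concrete, here is a toy sketch in Python. It is not PostgreSQL's WAL format or redo code. Everything in it is an assumption made for illustration: the page is a dict with two chunks, chunk 1 holds a slot index, chunk 2 holds two slots, each WAL record carries a pre-change image of the chunks it protects, and replay restores that image and then re-runs the action. The names action_a, action_b, run_and_log, replay, and simulate are hypothetical.

```python
import copy

def action_a(page):
    # A reads chunk 1 (a slot index) and then modifies chunk 2 at that slot.
    slot = page["chunk1"]
    page["chunk2"][slot] = "A"

def action_b(page):
    # B modifies chunk 1 only.
    page["chunk1"] = 1

def run_and_log(page, plan):
    # Apply each action to the page and log (image, action) for it.
    # The image holds the pre-change contents of the chunks named in the plan
    # (assumption of this toy model, not a statement about real WAL records).
    wal = []
    for action, imaged_chunks in plan:
        image = {name: copy.deepcopy(page[name]) for name in imaged_chunks}
        action(page)
        wal.append((image, action))
    return wal

def replay(disk_page, wal):
    # Start from the page as found on disk, then for each record restore
    # whatever chunks it imaged and re-run its action.
    page = copy.deepcopy(disk_page)
    for image, action in wal:
        page.update(copy.deepcopy(image))
        action(page)
    return page

def simulate(plan):
    # State at the checkpoint preceding A.
    page = {"chunk1": 0, "chunk2": ["-", "-"]}
    wal = run_and_log(page, plan)
    # The page was flushed after both A and B, so "disk" is the final page.
    return page, replay(page, wal)

# Full page image on the first touch after the checkpoint (record A).
FULL = [(action_a, ["chunk1", "chunk2"]), (action_b, [])]
# Partial page writes: A images only chunk 2, B images only chunk 1.
PARTIAL = [(action_a, ["chunk2"]), (action_b, ["chunk1"])]

expected, recovered_full = simulate(FULL)
_, recovered_partial = simulate(PARTIAL)

print("before crash:  ", expected)
print("full image:    ", recovered_full)
print("partial images:", recovered_partial)
```

In this model the full-image plan recovers the pre-crash page, while the partial plan replays A against the post-B value of chunk 1 and puts "A" in the wrong slot of chunk 2. Under these assumptions, A's record would also have to cover chunk 1, which is the point of the statement being clarified.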