On Tue, 2010-07-27 at 15:23 -0700, Jeff Davis wrote: > On Tue, 2010-07-27 at 17:18 -0400, Robert Haas wrote: > > > My first concern with that idea was that it may create an inconsistency > > > between the primary and the standby. The primary could have a bunch of > > > zero pages that never make it to the standby. > > > > Maybe I'm slow on the uptake here, but don't the pages start out > > all-zeroes on the standby just as they do on the primary? The only way > > it seems like this would be a problem is if a page that previously > > contained data on the primary was subsequently zeroed without writing > > a WAL record - or am I confused? > > The case I was concerned about is when you have a table on the primary > with a bunch of zero pages at the end. Then you SET TABLESPACE, and none > of the copied pages (or even the fact that they exist) would be sent to > the standby, but they would exist on the primary. And later pages may > have data, so the standby may see page N but not N-1. > > Generally, most of the code is not expecting to read or write past the > end of the file, unless it's doing an extension. > > However, I think everything is fine during recovery, because it looks > like it's designed to create zero pages as needed. So your idea seems > safe to me, although I do still have some doubts because of my lack of > knowledge in this area; particularly hot standby conflict > detection/resolution. > > My idea was different: still log the zero page, just don't set LSN or > TLI for a zero page in log_newpage() or heap_xlog_newpage(). This isn't > as clean as your idea, but I'm a little more confident that it is > correct. >
Both potential fixes attached and both appear to work. fix1 -- Only call PageSetLSN/TLI inside log_newpage() and heap_xlog_newpage() if the page is not zeroed. fix2 -- Don't call log_newpage() at all if the page is not zeroed. Please review. I don't have a strong opinion about which one should be applied. Regards, Jeff Davis
*** a/src/backend/access/heap/heapam.c --- b/src/backend/access/heap/heapam.c *************** *** 4079,4086 **** log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno, recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_NEWPAGE, rdata); ! PageSetLSN(page, recptr); ! PageSetTLI(page, ThisTimeLineID); END_CRIT_SECTION(); --- 4079,4093 ---- recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_NEWPAGE, rdata); ! /* ! * The new page may be uninitialized. If so, we can't set the LSN ! * and TLI because that would corrupt the page. ! */ ! if (!PageIsNew(page)) ! { ! PageSetLSN(page, recptr); ! PageSetTLI(page, ThisTimeLineID); ! } END_CRIT_SECTION(); *************** *** 4266,4273 **** heap_xlog_newpage(XLogRecPtr lsn, XLogRecord *record) Assert(record->xl_len == SizeOfHeapNewpage + BLCKSZ); memcpy(page, (char *) xlrec + SizeOfHeapNewpage, BLCKSZ); ! PageSetLSN(page, lsn); ! PageSetTLI(page, ThisTimeLineID); MarkBufferDirty(buffer); UnlockReleaseBuffer(buffer); } --- 4273,4288 ---- Assert(record->xl_len == SizeOfHeapNewpage + BLCKSZ); memcpy(page, (char *) xlrec + SizeOfHeapNewpage, BLCKSZ); ! /* ! * The new page may be uninitialized. If so, we can't set the LSN ! * and TLI because that would corrupt the page. ! */ ! if (!PageIsNew(page)) ! { ! PageSetLSN(page, lsn); ! PageSetTLI(page, ThisTimeLineID); ! } ! MarkBufferDirty(buffer); UnlockReleaseBuffer(buffer); }
*** a/src/backend/commands/tablecmds.c --- b/src/backend/commands/tablecmds.c *************** *** 7082,7089 **** copy_relation_data(SMgrRelation src, SMgrRelation dst, smgrread(src, forkNum, blkno, buf); ! /* XLOG stuff */ ! if (use_wal) log_newpage(&dst->smgr_rnode, forkNum, blkno, page); /* --- 7082,7094 ---- smgrread(src, forkNum, blkno, buf); ! /* ! * XLOG stuff ! * ! * Don't log page if it's new, because that will set the LSN ! * and TLI, corrupting the page. ! */ ! if (use_wal && !PageIsNew(page)) log_newpage(&dst->smgr_rnode, forkNum, blkno, page); /*
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers