On 2017/09/14 16:00, Michael Paquier wrote: > On Wed, Sep 13, 2017 at 4:43 PM, Amit Langote > <langote_amit...@lab.ntt.co.jp> wrote: >> Sure, no problem. > > OK, I took a closer look at all patches, but did not run any manual > tests to test the compression except some stuff with > wal_consistency_checking.
Thanks for the review. > + if (opaque->flags & GIN_DELETED) > + mask_page_content(page); > + else if (pagehdr->pd_lower != 0) > + mask_unused_space(page); > [...] > + /* Mask the unused space, provided the page's pd_lower is set. */ > + if (pagehdr->pd_lower != 0) > mask_unused_space(page); > > For the masking functions, shouldn't those check use (pd_lower > > SizeOfPageHeaderData) instead of 0? pd_lower gets set to a non-zero > value on HEAD, so you would apply the masking even if the meta page is > upgraded from an instance that did not enforce the value of pd_lower > later on. Those conditions also definitely need comments. That will be > a good reminder so as why it needs to be kept. Agreed, done. > + * > + * This won't be of any help unless we use option like REGBUF_STANDARD > + * while registering the buffer for metapage as otherwise, it won't try to > + * remove the hole while logging the full page image. > */ > This comment is in the btree code. But you actually add > REGBUF_STANDARD. So I think that this could be just removed. > > * Set pd_lower just past the end of the metadata. This is not essential > - * but it makes the page look compressible to xlog.c. > + * but it makes the page look compressible to xlog.c. See > + * _bt_initmetapage. > This reference could be removed as well as _bt_initmetapage does not > provide any information, the existing comment is wrong without your > patch, and then becomes right with this patch. Amit K's reply may have addressed these comments. > After that I have spotted a couple of places for btree, hash and > SpGist where the updates of pd_lower are not happening. Let's keep in > mind that updated meta pages could come from an upgraded version, so > we need to be careful here about all code paths updating meta pages, > and WAL-logging them. > > It seems to me that an update of pd_lower is missing in _bt_getroot(), > just before marking the buffer as dirty I think. And there is a second > in _bt_unlink_halfdead_page(), a third in _bt_insertonpg() and finally > one in _bt_newroot(). Amit K's reply about btree and hash should've resolved any doubts for those index types. About SP-Gist, see the comment below. > For SpGist, I think that there are two missing: spgbuild() and spgGetCache(). spgbuild() calls SpGistInitMetapage() before marking the metapage buffer dirty. The latter already sets pd_lower correctly, so we don't need to do it explicitly in spgbuild() itself. spgGetCache() doesn't write the metapage, only reads it: /* Last, get the lastUsedPages data from the metapage */ metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO); LockBuffer(metabuffer, BUFFER_LOCK_SHARE); metadata = SpGistPageGetMeta(BufferGetPage(metabuffer)); if (metadata->magicNumber != SPGIST_MAGIC_NUMBER) elog(ERROR, "index \"%s\" is not an SP-GiST index", RelationGetRelationName(index)); cache->lastUsedPages = metadata->lastUsedPages; UnlockReleaseBuffer(metabuffer); So, I think it won't be correct to set pd_lower here, no? > For hash, hashbulkdelete(), _hash_vacuum_one_page(), > _hash_addovflpage(), _hash_freeovflpage() and _hash_doinsert() are > missing the shot, no? We could have a meta page of a hash index > upgraded from PG10. Amit K's reply. :) Updated patch attached, which implements your suggested changes to the masking functions. By the way, as I noted on another unrelated thread, I will not be able to respond to emails from tomorrow until 9/23. Thanks, Amit
From 6741334ce5f3261b5e5caffa3914d4e1485fe5d8 Mon Sep 17 00:00:00 2001 From: amit <amitlangot...@gmail.com> Date: Fri, 23 Jun 2017 11:20:41 +0900 Subject: [PATCH 1/3] Set pd_lower correctly in the GIN metapage. Also tell xlog.c to treat the metapage like a standard page, so any hole in it is compressed. --- src/backend/access/gin/ginfast.c | 22 ++++++++++++++++++++-- src/backend/access/gin/gininsert.c | 4 ++-- src/backend/access/gin/ginutil.c | 19 ++++++++++++++++++- src/backend/access/gin/ginxlog.c | 25 ++++++++++--------------- 4 files changed, 50 insertions(+), 20 deletions(-) diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c index 59e435465a..d96529cf72 100644 --- a/src/backend/access/gin/ginfast.c +++ b/src/backend/access/gin/ginfast.c @@ -399,6 +399,15 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector) /* * Write metabuffer, make xlog entry */ + + /* + * Set pd_lower just past the end of the metadata. This is not essential + * but it makes the page look compressible to xlog.c, because we pass the + * buffer containing this page to XLogRegisterBuffer() as a page with + * standard layout. + */ + ((PageHeader) metapage)->pd_lower = + ((char *) metadata + sizeof(GinMetaPageData)) - (char *) metapage; MarkBufferDirty(metabuffer); if (needWal) @@ -407,7 +416,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector) memcpy(&data.metadata, metadata, sizeof(GinMetaPageData)); - XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT); + XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT | REGBUF_STANDARD); XLogRegisterData((char *) &data, sizeof(ginxlogUpdateMeta)); recptr = XLogInsert(RM_GIN_ID, XLOG_GIN_UPDATE_META_PAGE); @@ -572,6 +581,14 @@ shiftList(Relation index, Buffer metabuffer, BlockNumber newHead, metadata->nPendingHeapTuples = 0; } + /* + * Set pd_lower just past the end of the metadata. This is not + * essential but it makes the page look compressible to xlog.c, + * because we pass the buffer containing this page to + * XLogRegisterBuffer() as page with standard layout. + */ + ((PageHeader) metapage)->pd_lower = + ((char *) metadata + sizeof(GinMetaPageData)) - (char *) metapage; MarkBufferDirty(metabuffer); for (i = 0; i < data.ndeleted; i++) @@ -586,7 +603,8 @@ shiftList(Relation index, Buffer metabuffer, BlockNumber newHead, XLogRecPtr recptr; XLogBeginInsert(); - XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT); + XLogRegisterBuffer(0, metabuffer, + REGBUF_WILL_INIT | REGBUF_STANDARD); for (i = 0; i < data.ndeleted; i++) XLogRegisterBuffer(i + 1, buffers[i], REGBUF_WILL_INIT); diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c index 5378011f50..c9aa4ee147 100644 --- a/src/backend/access/gin/gininsert.c +++ b/src/backend/access/gin/gininsert.c @@ -348,7 +348,7 @@ ginbuild(Relation heap, Relation index, IndexInfo *indexInfo) Page page; XLogBeginInsert(); - XLogRegisterBuffer(0, MetaBuffer, REGBUF_WILL_INIT); + XLogRegisterBuffer(0, MetaBuffer, REGBUF_WILL_INIT | REGBUF_STANDARD); XLogRegisterBuffer(1, RootBuffer, REGBUF_WILL_INIT); recptr = XLogInsert(RM_GIN_ID, XLOG_GIN_CREATE_INDEX); @@ -447,7 +447,7 @@ ginbuildempty(Relation index) START_CRIT_SECTION(); GinInitMetabuffer(MetaBuffer); MarkBufferDirty(MetaBuffer); - log_newpage_buffer(MetaBuffer, false); + log_newpage_buffer(MetaBuffer, true); GinInitBuffer(RootBuffer, GIN_LEAF); MarkBufferDirty(RootBuffer); log_newpage_buffer(RootBuffer, false); diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c index 136ea27718..d680849e9d 100644 --- a/src/backend/access/gin/ginutil.c +++ b/src/backend/access/gin/ginutil.c @@ -374,6 +374,15 @@ GinInitMetabuffer(Buffer b) metadata->nDataPages = 0; metadata->nEntries = 0; metadata->ginVersion = GIN_CURRENT_VERSION; + + /* + * Set pd_lower just past the end of the metadata. This is not essential + * but it makes the page look compressible to xlog.c, because we pass the + * buffer containing this page to XLogRegisterBuffer() as a page with + * standard layout. + */ + ((PageHeader) page)->pd_lower = + ((char *) metadata + sizeof(GinMetaPageData)) - (char *) page; } /* @@ -676,6 +685,14 @@ ginUpdateStats(Relation index, const GinStatsData *stats) metadata->nDataPages = stats->nDataPages; metadata->nEntries = stats->nEntries; + /* + * Set pd_lower just past the end of the metadata. This is not essential + * but it makes the page look compressible to xlog.c, because we pass the + * buffer containing this page to XLogRegisterBuffer() as page with + * standard layout. + */ + ((PageHeader) metapage)->pd_lower = + ((char *) metadata + sizeof(GinMetaPageData)) - (char *) metapage; MarkBufferDirty(metabuffer); if (RelationNeedsWAL(index)) @@ -690,7 +707,7 @@ ginUpdateStats(Relation index, const GinStatsData *stats) XLogBeginInsert(); XLogRegisterData((char *) &data, sizeof(ginxlogUpdateMeta)); - XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT); + XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT | REGBUF_STANDARD); recptr = XLogInsert(RM_GIN_ID, XLOG_GIN_UPDATE_META_PAGE); PageSetLSN(metapage, recptr); diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c index 7ba04e324f..52c14ce42e 100644 --- a/src/backend/access/gin/ginxlog.c +++ b/src/backend/access/gin/ginxlog.c @@ -514,7 +514,7 @@ ginRedoUpdateMetapage(XLogReaderState *record) Assert(BufferGetBlockNumber(metabuffer) == GIN_METAPAGE_BLKNO); metapage = BufferGetPage(metabuffer); - GinInitPage(metapage, GIN_META, BufferGetPageSize(metabuffer)); + GinInitMetabuffer(metabuffer); memcpy(GinPageGetMeta(metapage), &data->metadata, sizeof(GinMetaPageData)); PageSetLSN(metapage, lsn); MarkBufferDirty(metabuffer); @@ -656,7 +656,7 @@ ginRedoDeleteListPages(XLogReaderState *record) Assert(BufferGetBlockNumber(metabuffer) == GIN_METAPAGE_BLKNO); metapage = BufferGetPage(metabuffer); - GinInitPage(metapage, GIN_META, BufferGetPageSize(metabuffer)); + GinInitMetabuffer(metabuffer); memcpy(GinPageGetMeta(metapage), &data->metadata, sizeof(GinMetaPageData)); PageSetLSN(metapage, lsn); @@ -768,6 +768,7 @@ void gin_mask(char *pagedata, BlockNumber blkno) { Page page = (Page) pagedata; + PageHeader pagehdr = (PageHeader) page; GinPageOpaque opaque; mask_page_lsn(page); @@ -776,18 +777,12 @@ gin_mask(char *pagedata, BlockNumber blkno) mask_page_hint_bits(page); /* - * GIN metapage doesn't use pd_lower/pd_upper. Other page types do. Hence, - * we need to apply masking for those pages. + * For GIN_DELETED page, the page is initialized to empty. Hence, mask + * the page content. For other pages, mask the hole if the pd_lower + * appears to have been set correctly. */ - if (opaque->flags != GIN_META) - { - /* - * For GIN_DELETED page, the page is initialized to empty. Hence, mask - * the page content. - */ - if (opaque->flags & GIN_DELETED) - mask_page_content(page); - else - mask_unused_space(page); - } + if (opaque->flags & GIN_DELETED) + mask_page_content(page); + else if (pagehdr->pd_lower > SizeOfPageHeaderData) + mask_unused_space(page); } -- 2.11.0
From c51791a4e0effe9c5b31af11ed07f8c49cd0014e Mon Sep 17 00:00:00 2001 From: amit <amitlangot...@gmail.com> Date: Mon, 26 Jun 2017 15:13:32 +0900 Subject: [PATCH 2/3] Set pd_lower correctly in the BRIN index metapage Also tell xlog.c to treat the metapage like a standard page, so any hole in it is compressed. --- src/backend/access/brin/brin.c | 4 ++-- src/backend/access/brin/brin_pageops.c | 9 +++++++++ src/backend/access/brin/brin_revmap.c | 11 ++++++++++- src/backend/access/brin/brin_xlog.c | 19 +++++++++++++++++-- 4 files changed, 38 insertions(+), 5 deletions(-) diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c index b3aa6d1ced..e6909d7aea 100644 --- a/src/backend/access/brin/brin.c +++ b/src/backend/access/brin/brin.c @@ -685,7 +685,7 @@ brinbuild(Relation heap, Relation index, IndexInfo *indexInfo) XLogBeginInsert(); XLogRegisterData((char *) &xlrec, SizeOfBrinCreateIdx); - XLogRegisterBuffer(0, meta, REGBUF_WILL_INIT); + XLogRegisterBuffer(0, meta, REGBUF_WILL_INIT | REGBUF_STANDARD); recptr = XLogInsert(RM_BRIN_ID, XLOG_BRIN_CREATE_INDEX); @@ -742,7 +742,7 @@ brinbuildempty(Relation index) brin_metapage_init(BufferGetPage(metabuf), BrinGetPagesPerRange(index), BRIN_CURRENT_VERSION); MarkBufferDirty(metabuf); - log_newpage_buffer(metabuf, false); + log_newpage_buffer(metabuf, true); END_CRIT_SECTION(); UnlockReleaseBuffer(metabuf); diff --git a/src/backend/access/brin/brin_pageops.c b/src/backend/access/brin/brin_pageops.c index 80f803e438..92903f38c7 100644 --- a/src/backend/access/brin/brin_pageops.c +++ b/src/backend/access/brin/brin_pageops.c @@ -491,6 +491,15 @@ brin_metapage_init(Page page, BlockNumber pagesPerRange, uint16 version) * revmap page to be created when the index is. */ metadata->lastRevmapPage = 0; + + /* + * Set pd_lower just past the end of the metadata. This is not essential + * but it makes the page look compressible to xlog.c, because we pass the + * buffer containing this page to XLogRegisterBuffer() as a page with + * standard layout. + */ + ((PageHeader) page)->pd_lower = + ((char *) metadata + sizeof(BrinMetaPageData)) - (char *) page; } /* diff --git a/src/backend/access/brin/brin_revmap.c b/src/backend/access/brin/brin_revmap.c index 22f2076887..4b056c68a2 100644 --- a/src/backend/access/brin/brin_revmap.c +++ b/src/backend/access/brin/brin_revmap.c @@ -624,6 +624,15 @@ revmap_physical_extend(BrinRevmap *revmap) MarkBufferDirty(buf); metadata->lastRevmapPage = mapBlk; + + /* + * Set pd_lower just past the end of the metadata. This is not essential + * but it makes the page look compressible to xlog.c, because we pass the + * buffer containing this page to XLogRegisterBuffer() as a page with + * standard layout. + */ + ((PageHeader) metapage)->pd_lower = + ((char *) metadata + sizeof(BrinMetaPageData)) - (char *) metapage; MarkBufferDirty(revmap->rm_metaBuf); if (RelationNeedsWAL(revmap->rm_irel)) @@ -635,7 +644,7 @@ revmap_physical_extend(BrinRevmap *revmap) XLogBeginInsert(); XLogRegisterData((char *) &xlrec, SizeOfBrinRevmapExtend); - XLogRegisterBuffer(0, revmap->rm_metaBuf, 0); + XLogRegisterBuffer(0, revmap->rm_metaBuf, REGBUF_STANDARD); XLogRegisterBuffer(1, buf, REGBUF_WILL_INIT); diff --git a/src/backend/access/brin/brin_xlog.c b/src/backend/access/brin/brin_xlog.c index dff7198a39..a9c7d909f6 100644 --- a/src/backend/access/brin/brin_xlog.c +++ b/src/backend/access/brin/brin_xlog.c @@ -234,6 +234,15 @@ brin_xlog_revmap_extend(XLogReaderState *record) metadata->lastRevmapPage = xlrec->targetBlk; PageSetLSN(metapg, lsn); + + /* + * Set pd_lower just past the end of the metadata. This is not + * essential but it makes the page look compressible to xlog.c, because + * we pass the buffer containing this page to XLogRegisterBuffer() as a + * page with standard layout. + */ + ((PageHeader) metapg)->pd_lower = + ((char *) metadata + sizeof(BrinMetaPageData)) - (char *) metapg; MarkBufferDirty(metabuf); } @@ -331,14 +340,20 @@ void brin_mask(char *pagedata, BlockNumber blkno) { Page page = (Page) pagedata; + PageHeader pagehdr = (PageHeader) page; mask_page_lsn(page); mask_page_hint_bits(page); - if (BRIN_IS_REGULAR_PAGE(page)) + /* + * Regular brin pages contain unused space which needs to be masked. + * Similarly for meta pages, but mask it only if pd_lower appears to have + * been set correctly. + */ + if (BRIN_IS_REGULAR_PAGE(page) || + (BRIN_IS_META_PAGE(page) && pagehdr->pd_lower > SizeOfPageHeaderData)) { - /* Regular brin pages contain unused space which needs to be masked. */ mask_unused_space(page); } } -- 2.11.0
From 478a9306e64d821d2fafd28906690ad2e00e5cd0 Mon Sep 17 00:00:00 2001 From: amit <amitlangot...@gmail.com> Date: Mon, 26 Jun 2017 15:23:34 +0900 Subject: [PATCH 3/3] Set pd_lower correctly in the SP-GiST index metapage Also tell xlog.c to treat the metapage like a standard page, so any hole in it is compressed. --- src/backend/access/spgist/spginsert.c | 4 ++-- src/backend/access/spgist/spgutils.c | 19 +++++++++++++++++++ src/backend/access/spgist/spgxlog.c | 7 ++++--- 3 files changed, 25 insertions(+), 5 deletions(-) diff --git a/src/backend/access/spgist/spginsert.c b/src/backend/access/spgist/spginsert.c index e4b2c29b0e..80b82e1602 100644 --- a/src/backend/access/spgist/spginsert.c +++ b/src/backend/access/spgist/spginsert.c @@ -110,7 +110,7 @@ spgbuild(Relation heap, Relation index, IndexInfo *indexInfo) * Replay will re-initialize the pages, so don't take full pages * images. No other data to log. */ - XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT); + XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT | REGBUF_STANDARD); XLogRegisterBuffer(1, rootbuffer, REGBUF_WILL_INIT | REGBUF_STANDARD); XLogRegisterBuffer(2, nullbuffer, REGBUF_WILL_INIT | REGBUF_STANDARD); @@ -173,7 +173,7 @@ spgbuildempty(Relation index) smgrwrite(index->rd_smgr, INIT_FORKNUM, SPGIST_METAPAGE_BLKNO, (char *) page, true); log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM, - SPGIST_METAPAGE_BLKNO, page, false); + SPGIST_METAPAGE_BLKNO, page, true); /* Likewise for the root page. */ SpGistInitPage(page, SPGIST_LEAF); diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c index 22f64b0103..9048c08f1c 100644 --- a/src/backend/access/spgist/spgutils.c +++ b/src/backend/access/spgist/spgutils.c @@ -256,15 +256,25 @@ SpGistUpdateMetaPage(Relation index) if (cache != NULL) { Buffer metabuffer; + Page metapage; SpGistMetaPageData *metadata; metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO); + metapage = BufferGetPage(metabuffer); if (ConditionalLockBuffer(metabuffer)) { metadata = SpGistPageGetMeta(BufferGetPage(metabuffer)); metadata->lastUsedPages = cache->lastUsedPages; + /* + * Set pd_lower just past the end of the metadata. This is not + * essential but it makes the page look compressible to xlog.c, + * because we pass the buffer containing this page to + * XLogRegisterBuffer() as page with standard layout. + */ + ((PageHeader) metapage)->pd_lower = ((char *) metadata + + sizeof(SpGistMetaPageData)) - (char *) metapage; MarkBufferDirty(metabuffer); UnlockReleaseBuffer(metabuffer); } @@ -534,6 +544,15 @@ SpGistInitMetapage(Page page) /* initialize last-used-page cache to empty */ for (i = 0; i < SPGIST_CACHED_PAGES; i++) metadata->lastUsedPages.cachedPage[i].blkno = InvalidBlockNumber; + + /* + * Set pd_lower just past the end of the metadata. This is not essential + * but it makes the page look compressible to xlog.c, because we pass the + * buffer containing this page to XLogRegisterBuffer() as page with + * standard layout. + */ + ((PageHeader) page)->pd_lower = + ((char *) metadata + sizeof(SpGistMetaPageData)) - (char *) page; } /* diff --git a/src/backend/access/spgist/spgxlog.c b/src/backend/access/spgist/spgxlog.c index c440d21715..84acd5a30f 100644 --- a/src/backend/access/spgist/spgxlog.c +++ b/src/backend/access/spgist/spgxlog.c @@ -1033,15 +1033,16 @@ void spg_mask(char *pagedata, BlockNumber blkno) { Page page = (Page) pagedata; + PageHeader pagehdr = (PageHeader) page; mask_page_lsn(page); mask_page_hint_bits(page); /* - * Any SpGist page other than meta contains unused space which needs to be - * masked. + * Mask the unused space, only if the page's pd_lower appears to have been + * set correctly. */ - if (!SpGistPageIsMeta(page)) + if (pagehdr->pd_lower > SizeOfPageHeaderData) mask_unused_space(page); } -- 2.11.0
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers