On 2017/09/14 16:00, Michael Paquier wrote:
> On Wed, Sep 13, 2017 at 4:43 PM, Amit Langote
> <[email protected]> wrote:
>> Sure, no problem.
>
> OK, I took a closer look at all patches, but did not run any manual
> tests to test the compression except some stuff with
> wal_consistency_checking.
Thanks for the review.
> + if (opaque->flags & GIN_DELETED)
> + mask_page_content(page);
> + else if (pagehdr->pd_lower != 0)
> + mask_unused_space(page);
> [...]
> + /* Mask the unused space, provided the page's pd_lower is set. */
> + if (pagehdr->pd_lower != 0)
> mask_unused_space(page);
>
> For the masking functions, shouldn't those check use (pd_lower >
> SizeOfPageHeaderData) instead of 0? pd_lower gets set to a non-zero
> value on HEAD, so you would apply the masking even if the meta page is
> upgraded from an instance that did not enforce the value of pd_lower
> later on. Those conditions also definitely need comments. That will be
> a good reminder so as why it needs to be kept.
Agreed, done.
> + *
> + * This won't be of any help unless we use option like REGBUF_STANDARD
> + * while registering the buffer for metapage as otherwise, it won't try to
> + * remove the hole while logging the full page image.
> */
> This comment is in the btree code. But you actually add
> REGBUF_STANDARD. So I think that this could be just removed.
>
> * Set pd_lower just past the end of the metadata. This is not essential
> - * but it makes the page look compressible to xlog.c.
> + * but it makes the page look compressible to xlog.c. See
> + * _bt_initmetapage.
> This reference could be removed as well as _bt_initmetapage does not
> provide any information, the existing comment is wrong without your
> patch, and then becomes right with this patch.
Amit K's reply may have addressed these comments.
> After that I have spotted a couple of places for btree, hash and
> SpGist where the updates of pd_lower are not happening. Let's keep in
> mind that updated meta pages could come from an upgraded version, so
> we need to be careful here about all code paths updating meta pages,
> and WAL-logging them.
>
> It seems to me that an update of pd_lower is missing in _bt_getroot(),
> just before marking the buffer as dirty I think. And there is a second
> in _bt_unlink_halfdead_page(), a third in _bt_insertonpg() and finally
> one in _bt_newroot().
Amit K's reply about btree and hash should've resolved any doubts for
those index types. About SP-Gist, see the comment below.
> For SpGist, I think that there are two missing: spgbuild() and spgGetCache().
spgbuild() calls SpGistInitMetapage() before marking the metapage buffer
dirty. The latter already sets pd_lower correctly, so we don't need to do
it explicitly in spgbuild() itself.
spgGetCache() doesn't write the metapage, only reads it:
/* Last, get the lastUsedPages data from the metapage */
metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
LockBuffer(metabuffer, BUFFER_LOCK_SHARE);
metadata = SpGistPageGetMeta(BufferGetPage(metabuffer));
if (metadata->magicNumber != SPGIST_MAGIC_NUMBER)
elog(ERROR, "index \"%s\" is not an SP-GiST index",
RelationGetRelationName(index));
cache->lastUsedPages = metadata->lastUsedPages;
UnlockReleaseBuffer(metabuffer);
So, I think it won't be correct to set pd_lower here, no?
> For hash, hashbulkdelete(), _hash_vacuum_one_page(),
> _hash_addovflpage(), _hash_freeovflpage() and _hash_doinsert() are
> missing the shot, no? We could have a meta page of a hash index
> upgraded from PG10.
Amit K's reply. :)
Updated patch attached, which implements your suggested changes to the
masking functions.
By the way, as I noted on another unrelated thread, I will not be able to
respond to emails from tomorrow until 9/23.
Thanks,
Amit
From 6741334ce5f3261b5e5caffa3914d4e1485fe5d8 Mon Sep 17 00:00:00 2001
From: amit <[email protected]>
Date: Fri, 23 Jun 2017 11:20:41 +0900
Subject: [PATCH 1/3] Set pd_lower correctly in the GIN metapage.
Also tell xlog.c to treat the metapage like a standard page, so any
hole in it is compressed.
---
src/backend/access/gin/ginfast.c | 22 ++++++++++++++++++++--
src/backend/access/gin/gininsert.c | 4 ++--
src/backend/access/gin/ginutil.c | 19 ++++++++++++++++++-
src/backend/access/gin/ginxlog.c | 25 ++++++++++---------------
4 files changed, 50 insertions(+), 20 deletions(-)
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 59e435465a..d96529cf72 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -399,6 +399,15 @@ ginHeapTupleFastInsert(GinState *ginstate,
GinTupleCollector *collector)
/*
* Write metabuffer, make xlog entry
*/
+
+ /*
+ * Set pd_lower just past the end of the metadata. This is not
essential
+ * but it makes the page look compressible to xlog.c, because we pass
the
+ * buffer containing this page to XLogRegisterBuffer() as a page with
+ * standard layout.
+ */
+ ((PageHeader) metapage)->pd_lower =
+ ((char *) metadata + sizeof(GinMetaPageData)) - (char
*) metapage;
MarkBufferDirty(metabuffer);
if (needWal)
@@ -407,7 +416,7 @@ ginHeapTupleFastInsert(GinState *ginstate,
GinTupleCollector *collector)
memcpy(&data.metadata, metadata, sizeof(GinMetaPageData));
- XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT);
+ XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT |
REGBUF_STANDARD);
XLogRegisterData((char *) &data, sizeof(ginxlogUpdateMeta));
recptr = XLogInsert(RM_GIN_ID, XLOG_GIN_UPDATE_META_PAGE);
@@ -572,6 +581,14 @@ shiftList(Relation index, Buffer metabuffer, BlockNumber
newHead,
metadata->nPendingHeapTuples = 0;
}
+ /*
+ * Set pd_lower just past the end of the metadata. This is not
+ * essential but it makes the page look compressible to xlog.c,
+ * because we pass the buffer containing this page to
+ * XLogRegisterBuffer() as page with standard layout.
+ */
+ ((PageHeader) metapage)->pd_lower =
+ ((char *) metadata + sizeof(GinMetaPageData)) - (char
*) metapage;
MarkBufferDirty(metabuffer);
for (i = 0; i < data.ndeleted; i++)
@@ -586,7 +603,8 @@ shiftList(Relation index, Buffer metabuffer, BlockNumber
newHead,
XLogRecPtr recptr;
XLogBeginInsert();
- XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT);
+ XLogRegisterBuffer(0, metabuffer,
+ REGBUF_WILL_INIT |
REGBUF_STANDARD);
for (i = 0; i < data.ndeleted; i++)
XLogRegisterBuffer(i + 1, buffers[i],
REGBUF_WILL_INIT);
diff --git a/src/backend/access/gin/gininsert.c
b/src/backend/access/gin/gininsert.c
index 5378011f50..c9aa4ee147 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -348,7 +348,7 @@ ginbuild(Relation heap, Relation index, IndexInfo
*indexInfo)
Page page;
XLogBeginInsert();
- XLogRegisterBuffer(0, MetaBuffer, REGBUF_WILL_INIT);
+ XLogRegisterBuffer(0, MetaBuffer, REGBUF_WILL_INIT |
REGBUF_STANDARD);
XLogRegisterBuffer(1, RootBuffer, REGBUF_WILL_INIT);
recptr = XLogInsert(RM_GIN_ID, XLOG_GIN_CREATE_INDEX);
@@ -447,7 +447,7 @@ ginbuildempty(Relation index)
START_CRIT_SECTION();
GinInitMetabuffer(MetaBuffer);
MarkBufferDirty(MetaBuffer);
- log_newpage_buffer(MetaBuffer, false);
+ log_newpage_buffer(MetaBuffer, true);
GinInitBuffer(RootBuffer, GIN_LEAF);
MarkBufferDirty(RootBuffer);
log_newpage_buffer(RootBuffer, false);
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 136ea27718..d680849e9d 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -374,6 +374,15 @@ GinInitMetabuffer(Buffer b)
metadata->nDataPages = 0;
metadata->nEntries = 0;
metadata->ginVersion = GIN_CURRENT_VERSION;
+
+ /*
+ * Set pd_lower just past the end of the metadata. This is not
essential
+ * but it makes the page look compressible to xlog.c, because we pass
the
+ * buffer containing this page to XLogRegisterBuffer() as a page with
+ * standard layout.
+ */
+ ((PageHeader) page)->pd_lower =
+ ((char *) metadata + sizeof(GinMetaPageData)) - (char
*) page;
}
/*
@@ -676,6 +685,14 @@ ginUpdateStats(Relation index, const GinStatsData *stats)
metadata->nDataPages = stats->nDataPages;
metadata->nEntries = stats->nEntries;
+ /*
+ * Set pd_lower just past the end of the metadata. This is not
essential
+ * but it makes the page look compressible to xlog.c, because we pass
the
+ * buffer containing this page to XLogRegisterBuffer() as page with
+ * standard layout.
+ */
+ ((PageHeader) metapage)->pd_lower =
+ ((char *) metadata + sizeof(GinMetaPageData)) - (char *)
metapage;
MarkBufferDirty(metabuffer);
if (RelationNeedsWAL(index))
@@ -690,7 +707,7 @@ ginUpdateStats(Relation index, const GinStatsData *stats)
XLogBeginInsert();
XLogRegisterData((char *) &data, sizeof(ginxlogUpdateMeta));
- XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT);
+ XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT |
REGBUF_STANDARD);
recptr = XLogInsert(RM_GIN_ID, XLOG_GIN_UPDATE_META_PAGE);
PageSetLSN(metapage, recptr);
diff --git a/src/backend/access/gin/ginxlog.c b/src/backend/access/gin/ginxlog.c
index 7ba04e324f..52c14ce42e 100644
--- a/src/backend/access/gin/ginxlog.c
+++ b/src/backend/access/gin/ginxlog.c
@@ -514,7 +514,7 @@ ginRedoUpdateMetapage(XLogReaderState *record)
Assert(BufferGetBlockNumber(metabuffer) == GIN_METAPAGE_BLKNO);
metapage = BufferGetPage(metabuffer);
- GinInitPage(metapage, GIN_META, BufferGetPageSize(metabuffer));
+ GinInitMetabuffer(metabuffer);
memcpy(GinPageGetMeta(metapage), &data->metadata,
sizeof(GinMetaPageData));
PageSetLSN(metapage, lsn);
MarkBufferDirty(metabuffer);
@@ -656,7 +656,7 @@ ginRedoDeleteListPages(XLogReaderState *record)
Assert(BufferGetBlockNumber(metabuffer) == GIN_METAPAGE_BLKNO);
metapage = BufferGetPage(metabuffer);
- GinInitPage(metapage, GIN_META, BufferGetPageSize(metabuffer));
+ GinInitMetabuffer(metabuffer);
memcpy(GinPageGetMeta(metapage), &data->metadata,
sizeof(GinMetaPageData));
PageSetLSN(metapage, lsn);
@@ -768,6 +768,7 @@ void
gin_mask(char *pagedata, BlockNumber blkno)
{
Page page = (Page) pagedata;
+ PageHeader pagehdr = (PageHeader) page;
GinPageOpaque opaque;
mask_page_lsn(page);
@@ -776,18 +777,12 @@ gin_mask(char *pagedata, BlockNumber blkno)
mask_page_hint_bits(page);
/*
- * GIN metapage doesn't use pd_lower/pd_upper. Other page types do.
Hence,
- * we need to apply masking for those pages.
+ * For GIN_DELETED page, the page is initialized to empty. Hence, mask
+ * the page content. For other pages, mask the hole if the pd_lower
+ * appears to have been set correctly.
*/
- if (opaque->flags != GIN_META)
- {
- /*
- * For GIN_DELETED page, the page is initialized to empty.
Hence, mask
- * the page content.
- */
- if (opaque->flags & GIN_DELETED)
- mask_page_content(page);
- else
- mask_unused_space(page);
- }
+ if (opaque->flags & GIN_DELETED)
+ mask_page_content(page);
+ else if (pagehdr->pd_lower > SizeOfPageHeaderData)
+ mask_unused_space(page);
}
--
2.11.0
From c51791a4e0effe9c5b31af11ed07f8c49cd0014e Mon Sep 17 00:00:00 2001
From: amit <[email protected]>
Date: Mon, 26 Jun 2017 15:13:32 +0900
Subject: [PATCH 2/3] Set pd_lower correctly in the BRIN index metapage
Also tell xlog.c to treat the metapage like a standard page, so any
hole in it is compressed.
---
src/backend/access/brin/brin.c | 4 ++--
src/backend/access/brin/brin_pageops.c | 9 +++++++++
src/backend/access/brin/brin_revmap.c | 11 ++++++++++-
src/backend/access/brin/brin_xlog.c | 19 +++++++++++++++++--
4 files changed, 38 insertions(+), 5 deletions(-)
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index b3aa6d1ced..e6909d7aea 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -685,7 +685,7 @@ brinbuild(Relation heap, Relation index, IndexInfo
*indexInfo)
XLogBeginInsert();
XLogRegisterData((char *) &xlrec, SizeOfBrinCreateIdx);
- XLogRegisterBuffer(0, meta, REGBUF_WILL_INIT);
+ XLogRegisterBuffer(0, meta, REGBUF_WILL_INIT | REGBUF_STANDARD);
recptr = XLogInsert(RM_BRIN_ID, XLOG_BRIN_CREATE_INDEX);
@@ -742,7 +742,7 @@ brinbuildempty(Relation index)
brin_metapage_init(BufferGetPage(metabuf), BrinGetPagesPerRange(index),
BRIN_CURRENT_VERSION);
MarkBufferDirty(metabuf);
- log_newpage_buffer(metabuf, false);
+ log_newpage_buffer(metabuf, true);
END_CRIT_SECTION();
UnlockReleaseBuffer(metabuf);
diff --git a/src/backend/access/brin/brin_pageops.c
b/src/backend/access/brin/brin_pageops.c
index 80f803e438..92903f38c7 100644
--- a/src/backend/access/brin/brin_pageops.c
+++ b/src/backend/access/brin/brin_pageops.c
@@ -491,6 +491,15 @@ brin_metapage_init(Page page, BlockNumber pagesPerRange,
uint16 version)
* revmap page to be created when the index is.
*/
metadata->lastRevmapPage = 0;
+
+ /*
+ * Set pd_lower just past the end of the metadata. This is not
essential
+ * but it makes the page look compressible to xlog.c, because we pass
the
+ * buffer containing this page to XLogRegisterBuffer() as a page with
+ * standard layout.
+ */
+ ((PageHeader) page)->pd_lower =
+ ((char *) metadata + sizeof(BrinMetaPageData)) - (char *) page;
}
/*
diff --git a/src/backend/access/brin/brin_revmap.c
b/src/backend/access/brin/brin_revmap.c
index 22f2076887..4b056c68a2 100644
--- a/src/backend/access/brin/brin_revmap.c
+++ b/src/backend/access/brin/brin_revmap.c
@@ -624,6 +624,15 @@ revmap_physical_extend(BrinRevmap *revmap)
MarkBufferDirty(buf);
metadata->lastRevmapPage = mapBlk;
+
+ /*
+ * Set pd_lower just past the end of the metadata. This is not
essential
+ * but it makes the page look compressible to xlog.c, because we pass
the
+ * buffer containing this page to XLogRegisterBuffer() as a page with
+ * standard layout.
+ */
+ ((PageHeader) metapage)->pd_lower =
+ ((char *) metadata + sizeof(BrinMetaPageData)) - (char *)
metapage;
MarkBufferDirty(revmap->rm_metaBuf);
if (RelationNeedsWAL(revmap->rm_irel))
@@ -635,7 +644,7 @@ revmap_physical_extend(BrinRevmap *revmap)
XLogBeginInsert();
XLogRegisterData((char *) &xlrec, SizeOfBrinRevmapExtend);
- XLogRegisterBuffer(0, revmap->rm_metaBuf, 0);
+ XLogRegisterBuffer(0, revmap->rm_metaBuf, REGBUF_STANDARD);
XLogRegisterBuffer(1, buf, REGBUF_WILL_INIT);
diff --git a/src/backend/access/brin/brin_xlog.c
b/src/backend/access/brin/brin_xlog.c
index dff7198a39..a9c7d909f6 100644
--- a/src/backend/access/brin/brin_xlog.c
+++ b/src/backend/access/brin/brin_xlog.c
@@ -234,6 +234,15 @@ brin_xlog_revmap_extend(XLogReaderState *record)
metadata->lastRevmapPage = xlrec->targetBlk;
PageSetLSN(metapg, lsn);
+
+ /*
+ * Set pd_lower just past the end of the metadata. This is not
+ * essential but it makes the page look compressible to xlog.c,
because
+ * we pass the buffer containing this page to
XLogRegisterBuffer() as a
+ * page with standard layout.
+ */
+ ((PageHeader) metapg)->pd_lower =
+ ((char *) metadata + sizeof(BrinMetaPageData)) - (char
*) metapg;
MarkBufferDirty(metabuf);
}
@@ -331,14 +340,20 @@ void
brin_mask(char *pagedata, BlockNumber blkno)
{
Page page = (Page) pagedata;
+ PageHeader pagehdr = (PageHeader) page;
mask_page_lsn(page);
mask_page_hint_bits(page);
- if (BRIN_IS_REGULAR_PAGE(page))
+ /*
+ * Regular brin pages contain unused space which needs to be masked.
+ * Similarly for meta pages, but mask it only if pd_lower appears to
have
+ * been set correctly.
+ */
+ if (BRIN_IS_REGULAR_PAGE(page) ||
+ (BRIN_IS_META_PAGE(page) && pagehdr->pd_lower >
SizeOfPageHeaderData))
{
- /* Regular brin pages contain unused space which needs to be
masked. */
mask_unused_space(page);
}
}
--
2.11.0
From 478a9306e64d821d2fafd28906690ad2e00e5cd0 Mon Sep 17 00:00:00 2001
From: amit <[email protected]>
Date: Mon, 26 Jun 2017 15:23:34 +0900
Subject: [PATCH 3/3] Set pd_lower correctly in the SP-GiST index metapage
Also tell xlog.c to treat the metapage like a standard page, so any
hole in it is compressed.
---
src/backend/access/spgist/spginsert.c | 4 ++--
src/backend/access/spgist/spgutils.c | 19 +++++++++++++++++++
src/backend/access/spgist/spgxlog.c | 7 ++++---
3 files changed, 25 insertions(+), 5 deletions(-)
diff --git a/src/backend/access/spgist/spginsert.c
b/src/backend/access/spgist/spginsert.c
index e4b2c29b0e..80b82e1602 100644
--- a/src/backend/access/spgist/spginsert.c
+++ b/src/backend/access/spgist/spginsert.c
@@ -110,7 +110,7 @@ spgbuild(Relation heap, Relation index, IndexInfo
*indexInfo)
* Replay will re-initialize the pages, so don't take full pages
* images. No other data to log.
*/
- XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT);
+ XLogRegisterBuffer(0, metabuffer, REGBUF_WILL_INIT |
REGBUF_STANDARD);
XLogRegisterBuffer(1, rootbuffer, REGBUF_WILL_INIT |
REGBUF_STANDARD);
XLogRegisterBuffer(2, nullbuffer, REGBUF_WILL_INIT |
REGBUF_STANDARD);
@@ -173,7 +173,7 @@ spgbuildempty(Relation index)
smgrwrite(index->rd_smgr, INIT_FORKNUM, SPGIST_METAPAGE_BLKNO,
(char *) page, true);
log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
- SPGIST_METAPAGE_BLKNO, page, false);
+ SPGIST_METAPAGE_BLKNO, page, true);
/* Likewise for the root page. */
SpGistInitPage(page, SPGIST_LEAF);
diff --git a/src/backend/access/spgist/spgutils.c
b/src/backend/access/spgist/spgutils.c
index 22f64b0103..9048c08f1c 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -256,15 +256,25 @@ SpGistUpdateMetaPage(Relation index)
if (cache != NULL)
{
Buffer metabuffer;
+ Page metapage;
SpGistMetaPageData *metadata;
metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO);
+ metapage = BufferGetPage(metabuffer);
if (ConditionalLockBuffer(metabuffer))
{
metadata = SpGistPageGetMeta(BufferGetPage(metabuffer));
metadata->lastUsedPages = cache->lastUsedPages;
+ /*
+ * Set pd_lower just past the end of the metadata.
This is not
+ * essential but it makes the page look compressible to
xlog.c,
+ * because we pass the buffer containing this page to
+ * XLogRegisterBuffer() as page with standard layout.
+ */
+ ((PageHeader) metapage)->pd_lower = ((char *) metadata +
+
sizeof(SpGistMetaPageData)) - (char *) metapage;
MarkBufferDirty(metabuffer);
UnlockReleaseBuffer(metabuffer);
}
@@ -534,6 +544,15 @@ SpGistInitMetapage(Page page)
/* initialize last-used-page cache to empty */
for (i = 0; i < SPGIST_CACHED_PAGES; i++)
metadata->lastUsedPages.cachedPage[i].blkno =
InvalidBlockNumber;
+
+ /*
+ * Set pd_lower just past the end of the metadata. This is not
essential
+ * but it makes the page look compressible to xlog.c, because we pass
the
+ * buffer containing this page to XLogRegisterBuffer() as page with
+ * standard layout.
+ */
+ ((PageHeader) page)->pd_lower =
+ ((char *) metadata + sizeof(SpGistMetaPageData)) - (char *)
page;
}
/*
diff --git a/src/backend/access/spgist/spgxlog.c
b/src/backend/access/spgist/spgxlog.c
index c440d21715..84acd5a30f 100644
--- a/src/backend/access/spgist/spgxlog.c
+++ b/src/backend/access/spgist/spgxlog.c
@@ -1033,15 +1033,16 @@ void
spg_mask(char *pagedata, BlockNumber blkno)
{
Page page = (Page) pagedata;
+ PageHeader pagehdr = (PageHeader) page;
mask_page_lsn(page);
mask_page_hint_bits(page);
/*
- * Any SpGist page other than meta contains unused space which needs to
be
- * masked.
+ * Mask the unused space, only if the page's pd_lower appears to have
been
+ * set correctly.
*/
- if (!SpGistPageIsMeta(page))
+ if (pagehdr->pd_lower > SizeOfPageHeaderData)
mask_unused_space(page);
}
--
2.11.0
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers