On 30.10.2020 03:42, Tomas Vondra wrote:
Hi,
I might be somewhat late to the party, but I've done a bit of
benchmarking too ;-) I used TPC-H data from a 100GB test, and tried
different combinations of COPY [FREEZE] and VACUUM [FREEZE], both on
current master and with the patch.
So I don't really observe any measurable slowdowns in the COPY part (in
fact there seems to be a tiny speedup, but it might be just noise). In
the VACUUM part, there's clear speedup when the data was already frozen
by COPY (Yes, those are zeroes, because it took less than 1 second.)
So that looks pretty awesome, I guess.
For the record, these tests were run on a server with NVMe SSD, so
hopefully reliable and representative numbers.
Thank you for the review.
A couple minor comments about the code:
2) I find the "if (all_frozen_set)" block a bit strange. It's a matter
of personal preference, but I'd just use a single level of nesting, i.e.
something like this:
/* if everything frozen, the whole page has to be visible */
Assert(!(all_frozen_set && !PageIsAllVisible(page)));
/*
* If we've frozen everything on the page, and if we're already
* holding pin on the vmbuffer, record that in the visibilitymap.
* If we're not holding the pin, it's OK to skip this - it's just
* an optimization.
*
* It's fine to use InvalidTransactionId here - this is only used
* when HEAP_INSERT_FROZEN is specified, which intentionally
* violates visibility rules.
*/
if (all_frozen_set &&
visibilitymap_pin_ok(BufferGetBlockNumber(buffer), vmbuffer))
visibilitymap_set(...);
IMHO it's easier to read, but YMMV. I've also reworded the comment a bit
to say what we're doing etc. And I've moved the comment from inside the
if block into the main comment - that was making it harder to read too.
I agree that it's a matter of taste. I've updated comments and left
nesting unchanged to keep assertions simple.
3) I see RelationGetBufferForTuple does this:
/*
* This is for COPY FREEZE needs. If page is empty,
* pin vmbuffer to set all_frozen bit
*/
if ((options & HEAP_INSERT_FROZEN) &&
(PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0))
visibilitymap_pin(relation, BufferGetBlockNumber(buffer),
vmbuffer);
so is it actually possible to get into the (all_frozen_set) without
holding a pin on the visibilitymap? I haven't investigated this so
maybe there are other ways to get into that code block. But the new
additions to hio.c get the pin too.
I was thinking that GetVisibilityMapPins() can somehow unset the pin. I
gave it a second look. And now I don't think it's possible to get into
this code block without a pin. So, I converted this check into an
assertion.
4) In heap_xlog_multi_insert we now have this:
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
PageClearAllVisible(page);
if (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET)
PageSetAllVisible(page);
IIUC it makes no sense to have both flags at the same time, right? So
what about adding
Assert(!(XLH_INSERT_ALL_FROZEN_SET &&
XLH_INSERT_ALL_VISIBLE_CLEARED));
to check that?
Agree.
I placed this assertion to the very beginning of the function. It also
helped to simplify the code a bit.
I also noticed, that we were not updating visibility map for all_frozen
from heap_xlog_multi_insert. Fixed.
I wonder what to do about the heap_insert - I know there are concerns it
would negatively impact regular insert, but is it really true? I suppose
this optimization would be valuable even for cases where multi-insert is
not possible.
Do we have something like INSERT .. FREEZE? I only see
TABLE_INSERT_FROZEN set for COPY FREEZE and for matview operations. Can
you explain, what use-case are we trying to optimize by extending this
patch to heap_insert()?
The new version is attached.
I've also fixed a typo in the comment by Tatsuo Ishii suggestion.
Also, I tested this patch with replication and found no issues.
--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
commit 395716e277ac4be22b9b61311c301a69db0f9101
Author: anastasia <a.lubennik...@postgrespro.ru>
Date: Mon Nov 2 16:27:48 2020 +0300
Teach COPY FREEZE to set PD_ALL_VISIBLE and visibility map bits.
diff --git a/contrib/pg_visibility/expected/pg_visibility.out b/contrib/pg_visibility/expected/pg_visibility.out
index ca4b6e186b..0017e3415c 100644
--- a/contrib/pg_visibility/expected/pg_visibility.out
+++ b/contrib/pg_visibility/expected/pg_visibility.out
@@ -179,6 +179,69 @@ select pg_truncate_visibility_map('test_partition');
(1 row)
+-- test copy freeze
+create table copyfreeze (a int, b char(1500));
+-- load all rows via COPY FREEZE and ensure that all pages are set all-visible
+-- and all-frozen.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin freeze;
+commit;
+select * from pg_visibility_map('copyfreeze');
+ blkno | all_visible | all_frozen
+-------+-------------+------------
+ 0 | t | t
+ 1 | t | t
+ 2 | t | t
+(3 rows)
+
+select * from pg_check_frozen('copyfreeze');
+ t_ctid
+--------
+(0 rows)
+
+-- load half the rows via regular COPY and rest via COPY FREEZE. The pages
+-- which are touched by regular COPY must not be set all-visible/all-frozen. On
+-- the other hand, pages allocated by COPY FREEZE should be marked
+-- all-frozen/all-visible.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin;
+copy copyfreeze from stdin freeze;
+commit;
+select * from pg_visibility_map('copyfreeze');
+ blkno | all_visible | all_frozen
+-------+-------------+------------
+ 0 | f | f
+ 1 | f | f
+ 2 | t | t
+(3 rows)
+
+select * from pg_check_frozen('copyfreeze');
+ t_ctid
+--------
+(0 rows)
+
+-- Try a mix of regular COPY and COPY FREEZE.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin freeze;
+copy copyfreeze from stdin;
+copy copyfreeze from stdin freeze;
+commit;
+select * from pg_visibility_map('copyfreeze');
+ blkno | all_visible | all_frozen
+-------+-------------+------------
+ 0 | t | t
+ 1 | f | f
+ 2 | t | t
+(3 rows)
+
+select * from pg_check_frozen('copyfreeze');
+ t_ctid
+--------
+(0 rows)
+
-- cleanup
drop table test_partitioned;
drop view test_view;
@@ -188,3 +251,4 @@ drop server dummy_server;
drop foreign data wrapper dummy;
drop materialized view matview_visibility_test;
drop table regular_table;
+drop table copyfreeze;
diff --git a/contrib/pg_visibility/sql/pg_visibility.sql b/contrib/pg_visibility/sql/pg_visibility.sql
index f79b54480b..ec1afd4906 100644
--- a/contrib/pg_visibility/sql/pg_visibility.sql
+++ b/contrib/pg_visibility/sql/pg_visibility.sql
@@ -94,6 +94,82 @@ select count(*) > 0 from pg_visibility_map_summary('test_partition');
select * from pg_check_frozen('test_partition'); -- hopefully none
select pg_truncate_visibility_map('test_partition');
+-- test copy freeze
+create table copyfreeze (a int, b char(1500));
+
+-- load all rows via COPY FREEZE and ensure that all pages are set all-visible
+-- and all-frozen.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin freeze;
+1 '1'
+2 '2'
+3 '3'
+4 '4'
+5 '5'
+6 '6'
+7 '7'
+8 '8'
+9 '9'
+10 '10'
+11 '11'
+12 '12'
+\.
+commit;
+select * from pg_visibility_map('copyfreeze');
+select * from pg_check_frozen('copyfreeze');
+
+-- load half the rows via regular COPY and rest via COPY FREEZE. The pages
+-- which are touched by regular COPY must not be set all-visible/all-frozen. On
+-- the other hand, pages allocated by COPY FREEZE should be marked
+-- all-frozen/all-visible.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin;
+1 '1'
+2 '2'
+3 '3'
+4 '4'
+5 '5'
+6 '6'
+\.
+copy copyfreeze from stdin freeze;
+7 '7'
+8 '8'
+9 '9'
+10 '10'
+11 '11'
+12 '12'
+\.
+commit;
+select * from pg_visibility_map('copyfreeze');
+select * from pg_check_frozen('copyfreeze');
+
+-- Try a mix of regular COPY and COPY FREEZE.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin freeze;
+1 '1'
+2 '2'
+3 '3'
+4 '4'
+5 '5'
+\.
+copy copyfreeze from stdin;
+6 '6'
+\.
+copy copyfreeze from stdin freeze;
+7 '7'
+8 '8'
+9 '9'
+10 '10'
+11 '11'
+12 '12'
+\.
+commit;
+select * from pg_visibility_map('copyfreeze');
+select * from pg_check_frozen('copyfreeze');
+
-- cleanup
drop table test_partitioned;
drop view test_view;
@@ -103,3 +179,4 @@ drop server dummy_server;
drop foreign data wrapper dummy;
drop materialized view matview_visibility_test;
drop table regular_table;
+drop table copyfreeze;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 1585861a02..e1664c33e1 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2103,6 +2103,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
int ndone;
PGAlignedBlock scratch;
Page page;
+ Buffer vmbuffer = InvalidBuffer;
bool needwal;
Size saveFreeSpace;
bool need_tuple_data = RelationIsLogicallyLogged(relation);
@@ -2157,8 +2158,9 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
while (ndone < ntuples)
{
Buffer buffer;
- Buffer vmbuffer = InvalidBuffer;
+ bool starting_with_empty_page;
bool all_visible_cleared = false;
+ bool all_frozen_set = false;
int nthispage;
CHECK_FOR_INTERRUPTS();
@@ -2166,12 +2168,20 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
/*
* Find buffer where at least the next tuple will fit. If the page is
* all-visible, this will also pin the requisite visibility map page.
+ *
+ * Also pin visibility map page if COPY FREEZE inserts tuples into an
+ * empty page. See all_frozen_set below.
*/
buffer = RelationGetBufferForTuple(relation, heaptuples[ndone]->t_len,
InvalidBuffer, options, bistate,
&vmbuffer, NULL);
page = BufferGetPage(buffer);
+ starting_with_empty_page = PageGetMaxOffsetNumber(page) == 0;
+
+ if (starting_with_empty_page && (options & HEAP_INSERT_FROZEN))
+ all_frozen_set = true;
+
/* NO EREPORT(ERROR) from here till changes are logged */
START_CRIT_SECTION();
@@ -2205,7 +2215,11 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
log_heap_new_cid(relation, heaptup);
}
- if (PageIsAllVisible(page))
+ /*
+ * If the page is all visible, need to clear that, unless we're only
+ * going to add further frozen rows to it.
+ */
+ if (PageIsAllVisible(page) && !(options & HEAP_INSERT_FROZEN))
{
all_visible_cleared = true;
PageClearAllVisible(page);
@@ -2213,6 +2227,12 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
BufferGetBlockNumber(buffer),
vmbuffer, VISIBILITYMAP_VALID_BITS);
}
+ /*
+ * If we're only adding already frozen rows, and the page was
+ * previously empty, mark it as all-visible.
+ */
+ else if (all_frozen_set)
+ PageSetAllVisible(page);
/*
* XXX Should we set PageSetPrunable on this page ? See heap_insert()
@@ -2236,8 +2256,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
* If the page was previously empty, we can reinit the page
* instead of restoring the whole thing.
*/
- init = (ItemPointerGetOffsetNumber(&(heaptuples[ndone]->t_self)) == FirstOffsetNumber &&
- PageGetMaxOffsetNumber(page) == FirstOffsetNumber + nthispage - 1);
+ init = starting_with_empty_page;
/* allocate xl_heap_multi_insert struct from the scratch area */
xlrec = (xl_heap_multi_insert *) scratchptr;
@@ -2255,7 +2274,14 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
/* the rest of the scratch space is used for tuple data */
tupledata = scratchptr;
- xlrec->flags = all_visible_cleared ? XLH_INSERT_ALL_VISIBLE_CLEARED : 0;
+ xlrec->flags = 0;
+ /* check mutually exclusive flags */
+ Assert (!(all_visible_cleared && all_frozen_set));
+ if (all_visible_cleared)
+ xlrec->flags |= XLH_INSERT_ALL_VISIBLE_CLEARED;
+ if (all_frozen_set)
+ xlrec->flags |= XLH_INSERT_ALL_FROZEN_SET;
+
xlrec->ntuples = nthispage;
/*
@@ -2329,13 +2355,39 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
END_CRIT_SECTION();
- UnlockReleaseBuffer(buffer);
- if (vmbuffer != InvalidBuffer)
- ReleaseBuffer(vmbuffer);
+ if (all_frozen_set)
+ {
+ /* if everything frozen, the whole page has to be visible */
+ Assert(PageIsAllVisible(page));
+ /*
+ * If we've frozen everything on the page, and if we're already
+ * holding pin on the vmbuffer, record that in the visibilitymap.
+ *
+ * It's fine to use InvalidTransactionId here - this is only used
+ * when HEAP_INSERT_FROZEN is specified, which intentionally
+ * violates visibility rules.
+ */
+ Assert(visibilitymap_pin_ok(BufferGetBlockNumber(buffer), vmbuffer));
+ visibilitymap_set(relation, BufferGetBlockNumber(buffer), buffer,
+ InvalidXLogRecPtr, vmbuffer,
+ InvalidTransactionId,
+ VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
+ }
+
+ UnlockReleaseBuffer(buffer);
ndone += nthispage;
+
+ /*
+ * NB: Only release vmbuffer after inserting all tuples - it's fairly
+ * likely that we'll insert into subsequent heap pages that are likely
+ * to use the same vm page.
+ */
}
+ if (vmbuffer != InvalidBuffer)
+ ReleaseBuffer(vmbuffer);
+
/*
* We're done with the actual inserts. Check for conflicts again, to
* ensure that all rw-conflicts in to these inserts are detected. Without
@@ -8240,6 +8292,10 @@ heap_xlog_multi_insert(XLogReaderState *record)
XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
+ /* check mutually exclusive flags */
+ Assert (!((xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
+ && (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET)));
+
/*
* The visibility map may need to be fixed even if the heap page is
* already up-to-date.
@@ -8328,6 +8384,9 @@ heap_xlog_multi_insert(XLogReaderState *record)
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
PageClearAllVisible(page);
+ /* XLH_INSERT_ALL_FROZEN_SET implies that all tuples are visible */
+ if (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET)
+ PageSetAllVisible(page);
MarkBufferDirty(buffer);
}
diff --git a/src/backend/access/heap/hio.c b/src/backend/access/heap/hio.c
index ca357410a2..349bb43249 100644
--- a/src/backend/access/heap/hio.c
+++ b/src/backend/access/heap/hio.c
@@ -433,6 +433,13 @@ loop:
buffer = ReadBufferBI(relation, targetBlock, RBM_NORMAL, bistate);
if (PageIsAllVisible(BufferGetPage(buffer)))
visibilitymap_pin(relation, targetBlock, vmbuffer);
+ /*
+ * If the page is empty, pin vmbuffer to set all_frozen bit.
+ */
+ if ((options & HEAP_INSERT_FROZEN) &&
+ (PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0))
+ visibilitymap_pin(relation, targetBlock, vmbuffer);
+
LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
}
else if (otherBlock == targetBlock)
@@ -619,6 +626,15 @@ loop:
PageInit(page, BufferGetPageSize(buffer), 0);
MarkBufferDirty(buffer);
+ /*
+ * The page is empty, pin vmbuffer to set all_frozen bit.
+ */
+ if (options & HEAP_INSERT_FROZEN)
+ {
+ Assert(PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0);
+ visibilitymap_pin(relation, BufferGetBlockNumber(buffer), vmbuffer);
+ }
+
/*
* Release the file-extension lock; it's now OK for someone else to extend
* the relation some more.
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index 1525194112..41185b3977 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -69,6 +69,9 @@
#define XLH_INSERT_CONTAINS_NEW_TUPLE (1<<3)
#define XLH_INSERT_ON_TOAST_RELATION (1<<4)
+/* all_frozen_set always implies all_visible_set */
+#define XLH_INSERT_ALL_FROZEN_SET (1<<5)
+
/*
* xl_heap_update flag values, 8 bits are available.
*/