On Sat, 28 May 2022 at 22:51, Tomas Vondra <tomas.von...@enterprisedb.com> wrote: > > > > On 5/28/22 21:24, Matthias van de Meent wrote: > > On Sat, 28 May 2022 at 16:51, Tomas Vondra > > <tomas.von...@enterprisedb.com> wrote: > >> > >> Hi, > >> > >> while working on some BRIN stuff, I realized (my) commit 5753d4ee320b > >> ignoring BRIN indexes for HOT is likely broken. Consider this example: > >> > >> ---------------------------------------------------------------------- > >> CREATE TABLE t (a INT) WITH (fillfactor = 10); > >> > >> INSERT INTO t SELECT i > >> FROM generate_series(0,100000) s(i); > >> > >> CREATE INDEX ON t USING BRIN (a); > >> > >> UPDATE t SET a = 0 WHERE random() < 0.01; > >> > >> SET enable_seqscan = off; > >> EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0; > >> > >> SET enable_seqscan = on; > >> EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) SELECT * FROM t WHERE a = 0; > >> ---------------------------------------------------------------------- > >> > >> which unfortunately produces this: > >> > >> QUERY PLAN > >> --------------------------------------------------------------- > >> Bitmap Heap Scan on t (actual rows=23 loops=1) > >> Recheck Cond: (a = 0) > >> Rows Removed by Index Recheck: 2793 > >> Heap Blocks: lossy=128 > >> -> Bitmap Index Scan on t_a_idx (actual rows=1280 loops=1) > >> Index Cond: (a = 0) > >> Planning Time: 0.049 ms > >> Execution Time: 0.424 ms > >> (8 rows) > >> > >> SET > >> QUERY PLAN > >> ----------------------------------------- > >> Seq Scan on t (actual rows=995 loops=1) > >> Filter: (a = 0) > >> Rows Removed by Filter: 99006 > >> Planning Time: 0.027 ms > >> Execution Time: 7.670 ms > >> (5 rows) > >> > >> That is, the index fails to return some of the rows :-( > >> > >> I don't remember the exact reasoning behind the commit, but the commit > >> message justifies the change like this: > >> > >> There are no index pointers to individual tuples in BRIN, and the > >> page range summary will be updated anyway as it relies on visibility > >> info. > >> > >> AFAICS that's a misunderstanding of how BRIN uses visibility map, or > >> rather does not use. In particular, bringetbitmap() does not look at the > >> vm at all, so it'll produce incomplete bitmap. > >> > >> So it seems I made a boo boo here. Luckily, this is a PG15 commit, not a > >> live issue. I don't quite see if this can be salvaged - I'll think about > >> this a bit more, but it'll probably end with a revert. > > > > The principle of 'amhotblocking' for only blocking HOT updates seems > > correct, except for the fact that the HOT flag bit is also used as a > > way to block the propagation of new values to existing indexes. > > > > A better abstraction would be "amSummarizes[Block]', in which updates > > that only modify columns that are only included in summarizing indexes > > still allow HOT, but still will see an update call to all (relevant?) > > summarizing indexes. That should still improve performance > > significantly for the relevant cases. > > > > Yeah, I think that might/should work. We could still create the HOT > chain, but we'd have to update the BRIN indexes. But that seems like a > fairly complicated change to be done this late for PG15.
Here's an example patch for that (based on a branch derived from master @ 5bb2b6ab). A nod to the authors of the pHOT patch, as that is a related patch and was informative in how this could/should impact AM APIs -- this is doing things similar (but not exactly the same) to that by only updating select indexes. Note that this is an ABI change in some critical places -- I'm not sure it's OK to commit a fix like this into PG15 unless we really don't want to revert 5753d4ee320b. Also of note is that this still updates _all_ summarizing indexes, not only those involved in the tuple update. Better performance is up to a different implementation. The patch includes a new regression test based on your example, which fails on master but succeeds after applying the patch. -Matthias
From fdcb789a52d6f63c035984da7006a276c7c946e1 Mon Sep 17 00:00:00 2001 From: Matthias van de Meent <boekewurm+postgres@gmail.com> Date: Mon, 30 May 2022 15:55:02 +0200 Subject: [PATCH v1] Rework 5753d4ee's amhotblocking infrastructure -- replaced with amsummarizing Pain points of 'hot blocking': it is a strange name for an indexam-specific API, as HOT is a heap-specific optimization that is not necessarily shared across table AMs. The specific feature that allows HOT to still be applied when there is an index on an updated column would be that the index provides a 'summary of indexed values on page X'. The specific feature for those indexes thus becomes whether the index is 'summarizing' the data of the block in a way that doesn't allow point tuple lookups, such that the values returned by the AM are either all tuples of a page, or no tuples of that page. The new code in 5753d4ee320b did not take into account that even summarizing indexes do need to receive the new value of their indexed columns, even if the specific tuple would never be indexed as is -- the summary of the index must still be updated with the new value, lest the summary become invalid. This patch adds that path -- summarizing indexes will now receive an update even if HOT is applied, through an ABI-breaking change in table_tuple_update. Reported-By: https://www.postgresql.org/message-id/05ebcb44-f383-86e3-4f31-0a97a55634cf%40enterprisedb.com --- doc/src/sgml/indexam.sgml | 12 ++++---- src/backend/access/brin/brin.c | 2 +- src/backend/access/gin/ginutil.c | 2 +- src/backend/access/gist/gist.c | 2 +- src/backend/access/hash/hash.c | 2 +- src/backend/access/heap/heapam.c | 13 +++++++++ src/backend/access/heap/heapam_handler.c | 19 ++++++++++-- src/backend/access/nbtree/nbtree.c | 2 +- src/backend/access/spgist/spgutils.c | 2 +- src/backend/access/table/tableam.c | 2 +- src/backend/catalog/index.c | 9 ++++-- src/backend/catalog/indexing.c | 18 ++++++++++-- src/backend/commands/copyfrom.c | 5 ++-- src/backend/commands/indexcmds.c | 10 +++++-- src/backend/executor/execIndexing.c | 6 +++- src/backend/executor/execReplication.c | 9 +++--- src/backend/executor/nodeModifyTable.c | 13 +++++---- src/backend/nodes/makefuncs.c | 4 ++- src/backend/utils/cache/relcache.c | 24 ++++++++++++--- src/include/access/amapi.h | 4 +-- src/include/access/htup_details.h | 29 +++++++++++++++++++ src/include/access/tableam.h | 19 ++++++++++-- src/include/executor/executor.h | 3 +- src/include/nodes/execnodes.h | 2 ++ src/include/nodes/makefuncs.h | 4 ++- src/include/utils/rel.h | 1 + src/include/utils/relcache.h | 3 +- .../modules/dummy_index_am/dummy_index_am.c | 2 +- src/test/regress/expected/brin.out | 21 ++++++++++++-- src/test/regress/sql/brin.sql | 10 +++++-- 30 files changed, 202 insertions(+), 52 deletions(-) diff --git a/doc/src/sgml/indexam.sgml b/doc/src/sgml/indexam.sgml index d4163c96e9..f815da817f 100644 --- a/doc/src/sgml/indexam.sgml +++ b/doc/src/sgml/indexam.sgml @@ -126,8 +126,9 @@ typedef struct IndexAmRoutine bool amcaninclude; /* does AM use maintenance_work_mem? */ bool amusemaintenanceworkmem; - /* does AM block HOT update? */ - bool amhotblocking; + /* does AM summarize tuples, with at least all tuples in the block + * summarized in one summary */ + bool amsummarizing; /* OR of parallel vacuum flags */ uint8 amparallelvacuumoptions; /* type of data stored in index, or InvalidOid if variable */ @@ -249,9 +250,10 @@ typedef struct IndexAmRoutine </para> <para> - The <structfield>amhotblocking</structfield> flag indicates whether the - access method blocks <acronym>HOT</acronym> when an indexed attribute is - updated. Access methods without pointers to individual tuples (like + The <structfield>amsummarizing</structfield> flag indicates whether the + access method summarizes the indexed tuples, with summarizing granularity + of at least per block. + Access methods without pointers to individual tuples (like <acronym>BRIN</acronym>) may allow <acronym>HOT</acronym> even in this case. This does not apply to attributes referenced in index predicates; an update of such attribute always disables <acronym>HOT</acronym>. diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c index 4366010768..550314e6a7 100644 --- a/src/backend/access/brin/brin.c +++ b/src/backend/access/brin/brin.c @@ -108,7 +108,7 @@ brinhandler(PG_FUNCTION_ARGS) amroutine->amcanparallel = false; amroutine->amcaninclude = false; amroutine->amusemaintenanceworkmem = false; - amroutine->amhotblocking = false; + amroutine->amsummarizing = true; amroutine->amparallelvacuumoptions = VACUUM_OPTION_PARALLEL_CLEANUP; amroutine->amkeytype = InvalidOid; diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c index 3d15701a01..52b5928b58 100644 --- a/src/backend/access/gin/ginutil.c +++ b/src/backend/access/gin/ginutil.c @@ -56,7 +56,7 @@ ginhandler(PG_FUNCTION_ARGS) amroutine->amcanparallel = false; amroutine->amcaninclude = false; amroutine->amusemaintenanceworkmem = true; - amroutine->amhotblocking = true; + amroutine->amsummarizing = false; amroutine->amparallelvacuumoptions = VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP; amroutine->amkeytype = InvalidOid; diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c index 8c6c744ab7..a7d0a3c21e 100644 --- a/src/backend/access/gist/gist.c +++ b/src/backend/access/gist/gist.c @@ -78,7 +78,7 @@ gisthandler(PG_FUNCTION_ARGS) amroutine->amcanparallel = false; amroutine->amcaninclude = true; amroutine->amusemaintenanceworkmem = false; - amroutine->amhotblocking = true; + amroutine->amsummarizing = false; amroutine->amparallelvacuumoptions = VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP; amroutine->amkeytype = InvalidOid; diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c index fd1a7119b6..2afa658b0e 100644 --- a/src/backend/access/hash/hash.c +++ b/src/backend/access/hash/hash.c @@ -75,7 +75,7 @@ hashhandler(PG_FUNCTION_ARGS) amroutine->amcanparallel = false; amroutine->amcaninclude = false; amroutine->amusemaintenanceworkmem = false; - amroutine->amhotblocking = true; + amroutine->amsummarizing = false; amroutine->amparallelvacuumoptions = VACUUM_OPTION_PARALLEL_BULKDEL; amroutine->amkeytype = INT4OID; diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 7421851027..d15eaf85de 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -3123,6 +3123,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup, TM_Result result; TransactionId xid = GetCurrentTransactionId(); Bitmapset *hot_attrs; + Bitmapset *sum_attrs; Bitmapset *key_attrs; Bitmapset *id_attrs; Bitmapset *interesting_attrs; @@ -3145,6 +3146,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup, bool have_tuple_lock = false; bool iscombo; bool use_hot_update = false; + bool summarized_update = false; bool key_intact; bool all_visible_cleared = false; bool all_visible_cleared_new = false; @@ -3191,11 +3193,13 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup, * relcache flush happening midway through. */ hot_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_HOT_BLOCKING); + sum_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_SUMMARIZED); key_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_KEY); id_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_IDENTITY_KEY); interesting_attrs = NULL; interesting_attrs = bms_add_members(interesting_attrs, hot_attrs); + interesting_attrs = bms_add_members(interesting_attrs, sum_attrs); interesting_attrs = bms_add_members(interesting_attrs, key_attrs); interesting_attrs = bms_add_members(interesting_attrs, id_attrs); @@ -3506,6 +3510,7 @@ l2: if (vmbuffer != InvalidBuffer) ReleaseBuffer(vmbuffer); bms_free(hot_attrs); + bms_free(sum_attrs); bms_free(key_attrs); bms_free(id_attrs); bms_free(modified_attrs); @@ -3827,7 +3832,11 @@ l2: * changed. */ if (!bms_overlap(modified_attrs, hot_attrs)) + { use_hot_update = true; + if (bms_overlap(modified_attrs, sum_attrs)) + summarized_update = true; + } } else { @@ -3987,10 +3996,14 @@ l2: heap_freetuple(heaptup); } + if (summarized_update) + HeapTupleHeaderSetSummaryUpdate(newtup->t_data); + if (old_key_tuple != NULL && old_key_copied) heap_freetuple(old_key_tuple); bms_free(hot_attrs); + bms_free(sum_attrs); bms_free(key_attrs); bms_free(id_attrs); bms_free(modified_attrs); diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c index 444f027149..0bfce50b35 100644 --- a/src/backend/access/heap/heapam_handler.c +++ b/src/backend/access/heap/heapam_handler.c @@ -314,7 +314,7 @@ static TM_Result heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot, CommandId cid, Snapshot snapshot, Snapshot crosscheck, bool wait, TM_FailureData *tmfd, - LockTupleMode *lockmode, bool *update_indexes) + LockTupleMode *lockmode, TU_UpdateIndexes *update_indexes) { bool shouldFree = true; HeapTuple tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree); @@ -334,9 +334,22 @@ heapam_tuple_update(Relation relation, ItemPointer otid, TupleTableSlot *slot, * Note: heap_update returns the tid (location) of the new tuple in the * t_self field. * - * If it's a HOT update, we mustn't insert new index entries. + * If it's a HOT update, we mustn't insert new index entries. If the update + * was summarized, we must update only those summarizing indexes. */ - *update_indexes = result == TM_Ok && !HeapTupleIsHeapOnly(tuple); + if (result != TM_Ok) + *update_indexes = TUUI_None; + else if (!HeapTupleIsHeapOnly(tuple)) + *update_indexes = TUUI_All; + else if (HeapTupleHeaderIsHOTWithSummaryUpdate(tuple->t_data)) + { + *update_indexes = TUUI_Summarizing; + + /* Clear temporary bits */ + HeapTupleHeaderClearSummaryUpdate(tuple->t_data); + } + else + *update_indexes = TUUI_None; if (shouldFree) pfree(tuple); diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c index 95da2c46bf..2f32ee720f 100644 --- a/src/backend/access/nbtree/nbtree.c +++ b/src/backend/access/nbtree/nbtree.c @@ -118,7 +118,7 @@ bthandler(PG_FUNCTION_ARGS) amroutine->amcanparallel = true; amroutine->amcaninclude = true; amroutine->amusemaintenanceworkmem = false; - amroutine->amhotblocking = true; + amroutine->amsummarizing = false; amroutine->amparallelvacuumoptions = VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP; amroutine->amkeytype = InvalidOid; diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c index a171ca8a08..cf2a11ef70 100644 --- a/src/backend/access/spgist/spgutils.c +++ b/src/backend/access/spgist/spgutils.c @@ -62,7 +62,7 @@ spghandler(PG_FUNCTION_ARGS) amroutine->amcanparallel = false; amroutine->amcaninclude = true; amroutine->amusemaintenanceworkmem = false; - amroutine->amhotblocking = true; + amroutine->amsummarizing = false; amroutine->amparallelvacuumoptions = VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_COND_CLEANUP; amroutine->amkeytype = InvalidOid; diff --git a/src/backend/access/table/tableam.c b/src/backend/access/table/tableam.c index b3d1a6c3f8..b6a403aa16 100644 --- a/src/backend/access/table/tableam.c +++ b/src/backend/access/table/tableam.c @@ -346,7 +346,7 @@ void simple_table_tuple_update(Relation rel, ItemPointer otid, TupleTableSlot *slot, Snapshot snapshot, - bool *update_indexes) + TU_UpdateIndexes *update_indexes) { TM_Result result; TM_FailureData tmfd; diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index fd389c28d8..85fe9f715d 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -1371,7 +1371,8 @@ index_concurrently_create_copy(Relation heapRelation, Oid oldIndexId, oldInfo->ii_Unique, oldInfo->ii_NullsNotDistinct, false, /* not ready for inserts */ - true); + true, + indexRelation->rd_indam->amsummarizing); /* * Extract the list of column names and the column numbers for the new @@ -2422,7 +2423,8 @@ BuildIndexInfo(Relation index) indexStruct->indisunique, indexStruct->indnullsnotdistinct, indexStruct->indisready, - false); + false, + index->rd_indam->amsummarizing); /* fill in attribute numbers */ for (i = 0; i < numAtts; i++) @@ -2482,7 +2484,8 @@ BuildDummyIndexInfo(Relation index) indexStruct->indisunique, indexStruct->indnullsnotdistinct, indexStruct->indisready, - false); + false, + index->rd_indam->amsummarizing); /* fill in attribute numbers */ for (i = 0; i < numAtts; i++) diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c index 0b92093322..08efbf458e 100644 --- a/src/backend/catalog/indexing.c +++ b/src/backend/catalog/indexing.c @@ -82,6 +82,7 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple) IndexInfo **indexInfoArray; Datum values[INDEX_MAX_KEYS]; bool isnull[INDEX_MAX_KEYS]; + bool onlySummarized = false; /* * HOT update does not require index inserts. But with asserts enabled we @@ -90,7 +91,17 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple) */ #ifndef USE_ASSERT_CHECKING if (HeapTupleIsHeapOnly(heapTuple)) - return; + { + if (HeapTupleHeaderIsHOTWithSummaryUpdate(heapTuple->t_data)) + { + HeapTupleHeaderClearSummaryUpdate(heapTuple->t_data); + onlySummarized = true; + } + else + { + return; + } + } #endif /* @@ -135,13 +146,16 @@ CatalogIndexInsert(CatalogIndexState indstate, HeapTuple heapTuple) /* see earlier check above */ #ifdef USE_ASSERT_CHECKING - if (HeapTupleIsHeapOnly(heapTuple)) + if (HeapTupleIsHeapOnly(heapTuple) && !onlySummarized) { Assert(!ReindexIsProcessingIndex(RelationGetRelid(index))); continue; } #endif /* USE_ASSERT_CHECKING */ + if (onlySummarized && !indexInfo->ii_Summarizing) + continue; + /* * FormIndexDatum fills in its values and isnull parameters with the * appropriate values for the column(s) of the index. diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c index 35a1d3a774..c79711b0d5 100644 --- a/src/backend/commands/copyfrom.c +++ b/src/backend/commands/copyfrom.c @@ -347,7 +347,7 @@ CopyMultiInsertBufferFlush(CopyMultiInsertInfo *miinfo, recheckIndexes = ExecInsertIndexTuples(resultRelInfo, buffer->slots[i], estate, false, false, - NULL, NIL); + NULL, NIL, false); ExecARInsertTriggers(estate, resultRelInfo, slots[i], recheckIndexes, cstate->transition_capture); @@ -1104,7 +1104,8 @@ CopyFrom(CopyFromState cstate) false, false, NULL, - NIL); + NIL, + false); } /* AFTER ROW INSERT Triggers */ diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index cd30f15eba..af1a979760 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -177,6 +177,7 @@ CheckIndexCompatible(Oid oldId, Form_pg_am accessMethodForm; IndexAmRoutine *amRoutine; bool amcanorder; + bool amsummarizing; int16 *coloptions; IndexInfo *indexInfo; int numberOfAttributes; @@ -215,6 +216,7 @@ CheckIndexCompatible(Oid oldId, ReleaseSysCache(tuple); amcanorder = amRoutine->amcanorder; + amsummarizing = amRoutine->amsummarizing; /* * Compute the operator classes, collations, and exclusion operators for @@ -226,7 +228,8 @@ CheckIndexCompatible(Oid oldId, * ii_NumIndexKeyAttrs with same value. */ indexInfo = makeIndexInfo(numberOfAttributes, numberOfAttributes, - accessMethodId, NIL, NIL, false, false, false, false); + accessMethodId, NIL, NIL, false, false, + false, false, amsummarizing); typeObjectId = (Oid *) palloc(numberOfAttributes * sizeof(Oid)); collationObjectId = (Oid *) palloc(numberOfAttributes * sizeof(Oid)); classObjectId = (Oid *) palloc(numberOfAttributes * sizeof(Oid)); @@ -531,6 +534,7 @@ DefineIndex(Oid relationId, Form_pg_am accessMethodForm; IndexAmRoutine *amRoutine; bool amcanorder; + bool amissummarizing; amoptions_function amoptions; bool partitioned; bool safe_index; @@ -838,6 +842,7 @@ DefineIndex(Oid relationId, amcanorder = amRoutine->amcanorder; amoptions = amRoutine->amoptions; + amissummarizing = amRoutine->amsummarizing; pfree(amRoutine); ReleaseSysCache(tuple); @@ -869,7 +874,8 @@ DefineIndex(Oid relationId, stmt->unique, stmt->nulls_not_distinct, !concurrent, - concurrent); + concurrent, + amissummarizing); typeObjectId = (Oid *) palloc(numberOfAttributes * sizeof(Oid)); collationObjectId = (Oid *) palloc(numberOfAttributes * sizeof(Oid)); diff --git a/src/backend/executor/execIndexing.c b/src/backend/executor/execIndexing.c index 0cb0b8f111..f81274bf89 100644 --- a/src/backend/executor/execIndexing.c +++ b/src/backend/executor/execIndexing.c @@ -287,7 +287,8 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo, bool update, bool noDupErr, bool *specConflict, - List *arbiterIndexes) + List *arbiterIndexes, + bool onlySummarizing) { ItemPointer tupleid = &slot->tts_tid; List *result = NIL; @@ -343,6 +344,9 @@ ExecInsertIndexTuples(ResultRelInfo *resultRelInfo, if (!indexInfo->ii_ReadyForInserts) continue; + if (onlySummarizing && !indexInfo->ii_Summarizing) + continue; + /* Check for partial index */ if (indexInfo->ii_Predicate != NIL) { diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c index b000645d48..1a7c928e1e 100644 --- a/src/backend/executor/execReplication.c +++ b/src/backend/executor/execReplication.c @@ -445,7 +445,7 @@ ExecSimpleRelationInsert(ResultRelInfo *resultRelInfo, if (resultRelInfo->ri_NumIndices > 0) recheckIndexes = ExecInsertIndexTuples(resultRelInfo, slot, estate, false, false, - NULL, NIL); + NULL, NIL, false); /* AFTER ROW INSERT Triggers */ ExecARInsertTriggers(estate, resultRelInfo, slot, @@ -493,7 +493,7 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo, if (!skip_tuple) { List *recheckIndexes = NIL; - bool update_indexes; + TU_UpdateIndexes update_indexes; /* Compute stored generated columns */ if (rel->rd_att->constr && @@ -510,10 +510,11 @@ ExecSimpleRelationUpdate(ResultRelInfo *resultRelInfo, simple_table_tuple_update(rel, tid, slot, estate->es_snapshot, &update_indexes); - if (resultRelInfo->ri_NumIndices > 0 && update_indexes) + if (resultRelInfo->ri_NumIndices > 0 && update_indexes != TUUI_None) recheckIndexes = ExecInsertIndexTuples(resultRelInfo, slot, estate, true, false, - NULL, NIL); + NULL, NIL, + update_indexes == TUUI_Summarizing); /* AFTER ROW UPDATE Triggers */ ExecARUpdateTriggers(estate, resultRelInfo, diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c index 171575cd73..a981479ae0 100644 --- a/src/backend/executor/nodeModifyTable.c +++ b/src/backend/executor/nodeModifyTable.c @@ -130,8 +130,8 @@ typedef struct ModifyTableContext typedef struct UpdateContext { bool updated; /* did UPDATE actually occur? */ - bool updateIndexes; /* index update required? */ bool crossPartUpdate; /* was it a cross-partition update? */ + TU_UpdateIndexes updateIndexes; /* Which index updates are required? */ } UpdateContext; @@ -1024,7 +1024,8 @@ ExecInsert(ModifyTableContext *context, recheckIndexes = ExecInsertIndexTuples(resultRelInfo, slot, estate, false, true, &specConflict, - arbiterIndexes); + arbiterIndexes, + false); /* adjust the tuple's state accordingly */ table_tuple_complete_speculative(resultRelationDesc, slot, @@ -1063,7 +1064,8 @@ ExecInsert(ModifyTableContext *context, if (resultRelInfo->ri_NumIndices > 0) recheckIndexes = ExecInsertIndexTuples(resultRelInfo, slot, estate, false, - false, NULL, NIL); + false, NULL, NIL, + false); } } @@ -1994,11 +1996,12 @@ ExecUpdateEpilogue(ModifyTableContext *context, UpdateContext *updateCxt, ModifyTableState *mtstate = context->mtstate; /* insert index entries for tuple if necessary */ - if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes) + if (resultRelInfo->ri_NumIndices > 0 && updateCxt->updateIndexes != TUUI_None) recheckIndexes = ExecInsertIndexTuples(resultRelInfo, slot, context->estate, true, false, - NULL, NIL); + NULL, NIL, + updateCxt->updateIndexes == TUUI_Summarizing); /* AFTER ROW UPDATE Triggers */ ExecARUpdateTriggers(context->estate, resultRelInfo, diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c index 41e26a0fe6..29d7517067 100644 --- a/src/backend/nodes/makefuncs.c +++ b/src/backend/nodes/makefuncs.c @@ -742,7 +742,8 @@ make_ands_implicit(Expr *clause) */ IndexInfo * makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions, - List *predicates, bool unique, bool nulls_not_distinct, bool isready, bool concurrent) + List *predicates, bool unique, bool nulls_not_distinct, + bool isready, bool concurrent, bool summarizing) { IndexInfo *n = makeNode(IndexInfo); @@ -756,6 +757,7 @@ makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions, n->ii_CheckedUnchanged = false; n->ii_IndexUnchanged = false; n->ii_Concurrent = concurrent; + n->ii_Summarizing = summarizing; /* expressions */ n->ii_Expressions = expressions; diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c index 43f14c233d..bbc01994d8 100644 --- a/src/backend/utils/cache/relcache.c +++ b/src/backend/utils/cache/relcache.c @@ -2443,6 +2443,7 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc) bms_free(relation->rd_pkattr); bms_free(relation->rd_idattr); bms_free(relation->rd_hotblockingattr); + bms_free(relation->rd_summarizedattr); if (relation->rd_pubdesc) pfree(relation->rd_pubdesc); if (relation->rd_options) @@ -5108,6 +5109,7 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind) Bitmapset *pkindexattrs; /* columns in the primary index */ Bitmapset *idindexattrs; /* columns in the replica identity */ Bitmapset *hotblockingattrs; /* columns with HOT blocking indexes */ + Bitmapset *summarizedattrs; /* columns with summarizing indexes */ List *indexoidlist; List *newindexoidlist; Oid relpkindex; @@ -5128,6 +5130,8 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind) return bms_copy(relation->rd_idattr); case INDEX_ATTR_BITMAP_HOT_BLOCKING: return bms_copy(relation->rd_hotblockingattr); + case INDEX_ATTR_BITMAP_SUMMARIZED: + return bms_copy(relation->rd_summarizedattr); default: elog(ERROR, "unknown attrKind %u", attrKind); } @@ -5171,6 +5175,7 @@ restart: pkindexattrs = NULL; idindexattrs = NULL; hotblockingattrs = NULL; + summarizedattrs = NULL; foreach(l, indexoidlist) { Oid indexOid = lfirst_oid(l); @@ -5235,9 +5240,12 @@ restart: */ if (attrnum != 0) { - if (indexDesc->rd_indam->amhotblocking) + if (indexDesc->rd_indam->amsummarizing) + summarizedattrs = bms_add_member(summarizedattrs, + attrnum - FirstLowInvalidHeapAttributeNumber); + else hotblockingattrs = bms_add_member(hotblockingattrs, - attrnum - FirstLowInvalidHeapAttributeNumber); + attrnum - FirstLowInvalidHeapAttributeNumber); if (isKey && i < indexDesc->rd_index->indnkeyatts) uindexattrs = bms_add_member(uindexattrs, @@ -5254,12 +5262,14 @@ restart: } /* Collect all attributes used in expressions, too */ - if (indexDesc->rd_indam->amhotblocking) + if (indexDesc->rd_indam->amsummarizing) + pull_varattnos(indexExpressions, 1, &summarizedattrs); + else pull_varattnos(indexExpressions, 1, &hotblockingattrs); /* * Collect all attributes in the index predicate, too. We have to ignore - * amhotblocking flag, because the row might become indexable, in which + * amsummarizing flag, because the row might become indexable, in which * case we have to add it to the index. */ pull_varattnos(indexPredicate, 1, &hotblockingattrs); @@ -5291,6 +5301,7 @@ restart: bms_free(pkindexattrs); bms_free(idindexattrs); bms_free(hotblockingattrs); + bms_free(summarizedattrs); goto restart; } @@ -5304,6 +5315,8 @@ restart: relation->rd_idattr = NULL; bms_free(relation->rd_hotblockingattr); relation->rd_hotblockingattr = NULL; + bms_free(relation->rd_summarizedattr); + relation->rd_summarizedattr = NULL; /* * Now save copies of the bitmaps in the relcache entry. We intentionally @@ -5317,6 +5330,7 @@ restart: relation->rd_pkattr = bms_copy(pkindexattrs); relation->rd_idattr = bms_copy(idindexattrs); relation->rd_hotblockingattr = bms_copy(hotblockingattrs); + relation->rd_summarizedattr = bms_copy(summarizedattrs); relation->rd_attrsvalid = true; MemoryContextSwitchTo(oldcxt); @@ -5331,6 +5345,8 @@ restart: return idindexattrs; case INDEX_ATTR_BITMAP_HOT_BLOCKING: return hotblockingattrs; + case INDEX_ATTR_BITMAP_SUMMARIZED: + return summarizedattrs; default: elog(ERROR, "unknown attrKind %u", attrKind); return NULL; diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h index a382551a98..ba357bd34d 100644 --- a/src/include/access/amapi.h +++ b/src/include/access/amapi.h @@ -244,8 +244,8 @@ typedef struct IndexAmRoutine bool amcaninclude; /* does AM use maintenance_work_mem? */ bool amusemaintenanceworkmem; - /* does AM block HOT update? */ - bool amhotblocking; + /* does AM store information only at (at least) block level? */ + bool amsummarizing; /* OR of parallel vacuum flags. See vacuum.h for flags. */ uint8 amparallelvacuumoptions; /* type of data stored in index, or InvalidOid if variable */ diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h index 51a60eda08..80d086076b 100644 --- a/src/include/access/htup_details.h +++ b/src/include/access/htup_details.h @@ -290,6 +290,16 @@ struct HeapTupleHeaderData */ #define HEAP_TUPLE_HAS_MATCH HEAP_ONLY_TUPLE /* tuple has a join match */ +/* + * HEAP_TUPLE_SUMMARIZING_UPDATED is a temporary flag used to signal that + * of the indexed columns, only columns used in summarizing indexes were + * updated. It is only used on the in-memory newly inserted updated tuple, + * which can't have been HOT updated at this point, so this should never + * pose an issue. + */ +#define HEAP_TUPLE_SUMMARIZING_UPDATED HEAP_HOT_UPDATED + + /* * HeapTupleHeader accessor macros * @@ -539,6 +549,25 @@ do { \ (((tup)->t_infomask & HEAP_HASEXTERNAL) != 0) +#define HeapTupleHeaderIsHOTWithSummaryUpdate(tup) \ +( \ + ((tup)->t_infomask2 & HEAP_ONLY_TUPLE) != 0 && \ + ((tup)->t_infomask2 & HEAP_TUPLE_SUMMARIZING_UPDATED) != 0 \ +) + +#define HeapTupleHeaderSetSummaryUpdate(tup) \ +( \ + (tup)->t_infomask2 |= HEAP_TUPLE_SUMMARIZING_UPDATED \ +) + +#define HeapTupleHeaderClearSummaryUpdate(tup) \ +( \ + (tup)->t_infomask2 &= ~HEAP_TUPLE_SUMMARIZING_UPDATED \ +) + + + + /* * BITMAPLEN(NATTS) - * Computes size of null bitmap given number of data columns. diff --git a/src/include/access/tableam.h b/src/include/access/tableam.h index fe869c6c18..5ead2ec0fd 100644 --- a/src/include/access/tableam.h +++ b/src/include/access/tableam.h @@ -102,6 +102,19 @@ typedef enum TM_Result TM_WouldBlock } TM_Result; +/* + * Result codes for table_update(..., update_indexes*..). + * Used to determine which indexes to update. + */ +typedef enum TU_UpdateIndexes { + /* No indexed columns were updated (incl. TID addressing of tuple) */ + TUUI_None = 0, + /* A non-summarizing indexed column was updated, or the TID has changed */ + TUUI_All = 1, + /* Only summarized columns were updated, TID is unchanged */ + TUUI_Summarizing = 2 +} TU_UpdateIndexes; + /* * When table_tuple_update, table_tuple_delete, or table_tuple_lock fail * because the target tuple is already outdated, they fill in this struct to @@ -526,7 +539,7 @@ typedef struct TableAmRoutine bool wait, TM_FailureData *tmfd, LockTupleMode *lockmode, - bool *update_indexes); + TU_UpdateIndexes *update_indexes); /* see table_tuple_lock() for reference about parameters */ TM_Result (*tuple_lock) (Relation rel, @@ -1506,7 +1519,7 @@ static inline TM_Result table_tuple_update(Relation rel, ItemPointer otid, TupleTableSlot *slot, CommandId cid, Snapshot snapshot, Snapshot crosscheck, bool wait, TM_FailureData *tmfd, LockTupleMode *lockmode, - bool *update_indexes) + TU_UpdateIndexes *update_indexes) { return rel->rd_tableam->tuple_update(rel, otid, slot, cid, snapshot, crosscheck, @@ -2029,7 +2042,7 @@ extern void simple_table_tuple_delete(Relation rel, ItemPointer tid, Snapshot snapshot); extern void simple_table_tuple_update(Relation rel, ItemPointer otid, TupleTableSlot *slot, Snapshot snapshot, - bool *update_indexes); + TU_UpdateIndexes *update_indexes); /* ---------------------------------------------------------------------------- diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h index 873772f188..518bbf98d0 100644 --- a/src/include/executor/executor.h +++ b/src/include/executor/executor.h @@ -617,7 +617,8 @@ extern List *ExecInsertIndexTuples(ResultRelInfo *resultRelInfo, TupleTableSlot *slot, EState *estate, bool update, bool noDupErr, - bool *specConflict, List *arbiterIndexes); + bool *specConflict, List *arbiterIndexes, + bool onlySummarizing); extern bool ExecCheckIndexConstraints(ResultRelInfo *resultRelInfo, TupleTableSlot *slot, EState *estate, ItemPointer conflictTid, diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h index 94b191f8ae..801a4356e3 100644 --- a/src/include/nodes/execnodes.h +++ b/src/include/nodes/execnodes.h @@ -146,6 +146,7 @@ typedef struct ExprState * IndexUnchanged aminsert hint, cached for retail inserts * Concurrent are we doing a concurrent index build? * BrokenHotChain did we detect any broken HOT chains? + * Summarizing is it summarizing? * ParallelWorkers # of workers requested (excludes leader) * Am Oid of index AM * AmCache private cache area for index AM @@ -179,6 +180,7 @@ typedef struct IndexInfo bool ii_IndexUnchanged; bool ii_Concurrent; bool ii_BrokenHotChain; + bool ii_Summarizing; int ii_ParallelWorkers; Oid ii_Am; void *ii_AmCache; diff --git a/src/include/nodes/makefuncs.h b/src/include/nodes/makefuncs.h index c717468eb3..b631c363ba 100644 --- a/src/include/nodes/makefuncs.h +++ b/src/include/nodes/makefuncs.h @@ -96,7 +96,9 @@ extern List *make_ands_implicit(Expr *clause); extern IndexInfo *makeIndexInfo(int numattrs, int numkeyattrs, Oid amoid, List *expressions, List *predicates, - bool unique, bool nulls_not_distinct, bool isready, bool concurrent); + bool unique, bool nulls_not_distinct, + bool isready, bool concurrent, + bool summarizing); extern DefElem *makeDefElem(char *name, Node *arg, int location); extern DefElem *makeDefElemExtended(char *nameSpace, char *name, Node *arg, diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h index eadbd00904..d41bb79881 100644 --- a/src/include/utils/rel.h +++ b/src/include/utils/rel.h @@ -160,6 +160,7 @@ typedef struct RelationData Bitmapset *rd_pkattr; /* cols included in primary key */ Bitmapset *rd_idattr; /* included in replica identity index */ Bitmapset *rd_hotblockingattr; /* cols blocking HOT update */ + Bitmapset *rd_summarizedattr; /* cols indexed by block-or-larger summarizing indexes */ PublicationDesc *rd_pubdesc; /* publication descriptor, or NULL */ diff --git a/src/include/utils/relcache.h b/src/include/utils/relcache.h index 86dddbd975..e7cdd11e7b 100644 --- a/src/include/utils/relcache.h +++ b/src/include/utils/relcache.h @@ -58,7 +58,8 @@ typedef enum IndexAttrBitmapKind INDEX_ATTR_BITMAP_KEY, INDEX_ATTR_BITMAP_PRIMARY_KEY, INDEX_ATTR_BITMAP_IDENTITY_KEY, - INDEX_ATTR_BITMAP_HOT_BLOCKING + INDEX_ATTR_BITMAP_HOT_BLOCKING, + INDEX_ATTR_BITMAP_SUMMARIZED } IndexAttrBitmapKind; extern Bitmapset *RelationGetIndexAttrBitmap(Relation relation, diff --git a/src/test/modules/dummy_index_am/dummy_index_am.c b/src/test/modules/dummy_index_am/dummy_index_am.c index 22578b6246..3785c6c9bc 100644 --- a/src/test/modules/dummy_index_am/dummy_index_am.c +++ b/src/test/modules/dummy_index_am/dummy_index_am.c @@ -298,7 +298,7 @@ dihandler(PG_FUNCTION_ARGS) amroutine->amcanparallel = false; amroutine->amcaninclude = false; amroutine->amusemaintenanceworkmem = false; - amroutine->amhotblocking = true; + amroutine->amsummarizing = false; amroutine->amparallelvacuumoptions = VACUUM_OPTION_NO_PARALLEL; amroutine->amkeytype = InvalidOid; diff --git a/src/test/regress/expected/brin.out b/src/test/regress/expected/brin.out index ed7879f583..8f4698aae4 100644 --- a/src/test/regress/expected/brin.out +++ b/src/test/regress/expected/brin.out @@ -607,8 +607,8 @@ SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100; CREATE TABLE brin_hot ( id integer PRIMARY KEY, val integer NOT NULL -) WITH (autovacuum_enabled = off, fillfactor = 70); -INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235); +) WITH (autovacuum_enabled = off, fillfactor = 10); +INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 100000); CREATE INDEX val_brin ON brin_hot using brin(val); UPDATE brin_hot SET val = -3 WHERE id = 42; -- ensure pending stats are flushed @@ -624,4 +624,21 @@ SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid); 1 (1 row) +-- VACUUM and ANALYZE so that we use the BRIN index on the next query +VACUUM FREEZE ANALYZE brin_hot; +EXPLAIN (COSTS OFF) SELECT * FROM brin_hot WHERE val < 0; + QUERY PLAN +------------------------------------- + Bitmap Heap Scan on brin_hot + Recheck Cond: (val < 0) + -> Bitmap Index Scan on val_brin + Index Cond: (val < 0) +(4 rows) + +SELECT * FROM brin_hot WHERE val < 0; + id | val +----+----- + 42 | -3 +(1 row) + DROP TABLE brin_hot; diff --git a/src/test/regress/sql/brin.sql b/src/test/regress/sql/brin.sql index 920e053249..9feb0f47ed 100644 --- a/src/test/regress/sql/brin.sql +++ b/src/test/regress/sql/brin.sql @@ -532,9 +532,9 @@ SELECT COUNT(*) FROM brin_hot_2 WHERE a = 2 AND b = 100; CREATE TABLE brin_hot ( id integer PRIMARY KEY, val integer NOT NULL -) WITH (autovacuum_enabled = off, fillfactor = 70); +) WITH (autovacuum_enabled = off, fillfactor = 10); -INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 235); +INSERT INTO brin_hot SELECT *, 0 FROM generate_series(1, 100000); CREATE INDEX val_brin ON brin_hot using brin(val); UPDATE brin_hot SET val = -3 WHERE id = 42; @@ -544,4 +544,10 @@ SELECT pg_stat_force_next_flush(); SELECT pg_stat_get_tuples_hot_updated('brin_hot'::regclass::oid); +-- VACUUM and ANALYZE so that we use the BRIN index on the next query +VACUUM FREEZE ANALYZE brin_hot; + +EXPLAIN (COSTS OFF) SELECT * FROM brin_hot WHERE val < 0; +SELECT * FROM brin_hot WHERE val < 0; + DROP TABLE brin_hot; -- 2.30.2