Re: partial heap only tuples
On 11/4/21, 3:24 AM, "Daniel Gustafsson" wrote:
> As no update has been posted, the patch still doesn't apply. I'm marking this
> patch Returned with Feedback, feel free to open a new entry for an updated
> patch.

Thanks. I have been working on this intermittently, and I hope to post
a more complete proof-of-concept in the near future. I'll create a
new commitfest entry once that's done.

Nathan
Re: partial heap only tuples
> On 14 Jul 2021, at 13:34, vignesh C wrote:

> The patch does not apply on Head anymore, could you rebase and post a
> patch. I'm changing the status to "Waiting for Author".

As no update has been posted, the patch still doesn't apply. I'm marking this
patch Returned with Feedback, feel free to open a new entry for an updated
patch.

--
Daniel Gustafsson https://vmware.com/
Re: partial heap only tuples
On Tue, Mar 9, 2021 at 12:09 AM Bossart, Nathan wrote:
>
> On 3/8/21, 10:16 AM, "Ibrar Ahmed" wrote:
> > On Wed, Feb 24, 2021 at 3:22 AM Bossart, Nathan wrote:
> >> On 2/10/21, 2:43 PM, "Bruce Momjian" wrote:
> >>> I wonder if you should create a Postgres wiki page to document all of
> >>> this. I agree PG 15 makes sense. I would like to help with this if I
> >>> can. I will need to study this email more later.
> >>
> >> I've started the wiki page for this:
> >>
> >> https://wiki.postgresql.org/wiki/Partial_Heap_Only_Tuples
> >>
> >> Nathan
> >
> > The regression test case (partial-index) is failing
> >
> > https://cirrus-ci.com/task/5310522716323840
>
> This patch is intended as a proof-of-concept of some basic pieces of
> the project. I'm working on a new patch set that should be more
> suitable for community review.

The patch does not apply on Head anymore, could you rebase and post a
patch. I'm changing the status to "Waiting for Author".

Regards,
Vignesh
Re: partial heap only tuples
On Mon, Apr 19, 2021 at 5:09 PM Bruce Momjian wrote:
> > A diversity of strategies with fallback behavior is sometimes the best
> > strategy. Don't underestimate the contribution of rare and seemingly
> > insignificant adverse events. Consider the lifecycle of the data over
>
> That is an interesting point --- we often focus on optimizing frequent
> operations, but preventing rare but expensive-in-aggregate events from
> happening is also useful.

Right. Similarly, we sometimes focus on adding an improvement,
overlooking more promising opportunities to subtract a disimprovement.
Apparently this is a well known tendency:

https://www.scientificamerican.com/article/our-brain-typically-overlooks-this-brilliant-problem-solving-strategy/

I believe that it's particularly important to consider subtractive
approaches with a complex system. This has sometimes worked well for
me as a conscious and deliberate strategy.

--
Peter Geoghegan
Re: partial heap only tuples
On Sun, Apr 18, 2021 at 04:27:15PM -0700, Peter Geoghegan wrote:
> Everybody tends to talk about HOT as if it works perfectly once you
> make some modest assumptions, such as "there are no long-running
> transactions", and "no UPDATEs will logically modify indexed columns".
> But I tend to doubt that that's truly the case -- I think that there
> are still pathological cases where HOT cannot keep the total table
> size stable in the long run due to subtle effects that eventually
> aggregate into significant issues, like heap fragmentation. Ask Jan
> Wieck about the stability of some of the TPC-C/BenchmarkSQL tables to
...
> We might have successfully fit the successor heap tuple version a
> million times before just by HOT pruning, and yet currently we give up
> just because it didn't work on the one millionth and first occasion --
> don't you think that's kind of silly? We may be able to afford having
> a fallback strategy that is relatively expensive, provided it is
> rarely used. And it might be very effective in the aggregate, despite
> being rarely used -- it might provide us just what we were missing
> before. Just try harder when you run into a problem every once in a
> blue moon!
>
> A diversity of strategies with fallback behavior is sometimes the best
> strategy. Don't underestimate the contribution of rare and seemingly
> insignificant adverse events. Consider the lifecycle of the data over

That is an interesting point --- we often focus on optimizing frequent
operations, but preventing rare but expensive-in-aggregate events from
happening is also useful.

--
Bruce Momjian https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.
Re: partial heap only tuples
On Tue, Feb 9, 2021 at 10:48 AM Bossart, Nathan wrote:
> I'm hoping to gather some early feedback on a heap optimization I've
> been working on. In short, I'm hoping to add "partial heap only
> tuple" (PHOT) support, which would allow you to skip updating indexes
> for unchanged columns even when other indexes require updates. Today,
> HOT works wonders when no indexed columns are updated. However, as
> soon as you touch one indexed column, you lose that optimization
> entirely, as you must update every index on the table. The resulting
> performance impact is a pain point for many of our (AWS's) enterprise
> customers, so we'd like to lend a hand for some improvements in this
> area. For workloads involving a lot of columns and a lot of indexes,
> an optimization like PHOT can make a huge difference. I'm aware that
> there was a previous attempt a few years ago to add a similar
> optimization called WARM [0] [1]. However, I only noticed this
> previous effort after coming up with the design for PHOT, so I ended
> up taking a slightly different approach. I am also aware of a couple
> of recent nbtree improvements that may mitigate some of the impact of
> non-HOT updates [2] [3], but I am hoping that PHOT serves as a nice
> complement to those. I've attached a very early proof-of-concept
> patch with the design described below.

I would like to share some thoughts that I have about how I think
about optimizations like PHOT, and how they might fit together with my
own work -- particularly the nbtree bottom-up index deletion feature
you referenced. My remarks could equally well apply to WARM.
Ordinarily this is the kind of thing that would be too hand-wavey for
the mailing list, but we don't have the luxury of in-person
communication right now.

Everybody tends to talk about HOT as if it works perfectly once you
make some modest assumptions, such as "there are no long-running
transactions", and "no UPDATEs will logically modify indexed columns".
But I tend to doubt that that's truly the case -- I think that there
are still pathological cases where HOT cannot keep the total table
size stable in the long run due to subtle effects that eventually
aggregate into significant issues, like heap fragmentation. Ask Jan
Wieck about the stability of some of the TPC-C/BenchmarkSQL tables to
get one example of this. There is no reason to believe that PHOT will
help with that. Maybe that's okay, but I would think carefully about
what that means if I were undertaking this work. Ensuring stability in
the on-disk size of tables in cases where the size of the logical
database is stable should be an important goal of a project like PHOT
or HOT.

If you want to get a better sense of how these inefficiencies might
happen, I suggest looking into using recently added autovacuum logging
to analyze how well HOT works today, using the technique that I go
into here:

https://postgr.es/m/cah2-wzkju+nibskzunbdpz6trse+aqvupae+xgm8zvob4wq...@mail.gmail.com

Small inefficiencies in the on-disk structure have a tendency to
aggregate over time, at least when there is no possible way to reverse
them.

The bottom-up index deletion stuff is very effective as a backstop
against index bloat, because things are generally very non-linear. The
cost of an unnecessary page split is very high, and permanent. But we
can make it cheap to *try* to avoid that using fairly simple
heuristics. We can be reasonably confident that we're about to split
the page unnecessarily, and use cues that ramp up the number of heap
page accesses as needed. We ramp up during a bottom-up index deletion,
as we manage to free some index tuples as a result of previous heap
page accesses.

This works very well because we can intervene very selectively. We
aren't interested in deleting index tuples unless and until we really
have to, and in general there tends to be quite a bit of free space to
temporarily store extra version duplicates -- that's what most index
pages look like, even on the busiest of databases. It's possible for
the bottom-up index deletion mechanism to be invoked very
infrequently, and yet make a huge difference. And when it fails to
free anything, it fails permanently for that particular leaf page
(because it splits) -- so now we have lots of space for future index
tuple insertions that cover the original page's key space. We won't
thrash.

My intuition is that similar principles can be applied inside heapam.
Failing to fit related versions on a heap page (having managed to do
so for hours or days before that point) is more or less the heap page
equivalent of a leaf page split from version churn (this is the
pathology that bottom-up index deletion targets). For example, we
could have a fall back mode that compresses old versions that is used
if and only if heap pruning was attempted but then failed. We should
always try to avoid migrating to a new heap page, because that amounts
to a permanent solution to a temporary problem. We should perhaps make
the
Re: partial heap only tuples
On Tue, Mar 9, 2021 at 09:33:31PM +, Bossart, Nathan wrote:
> I'm cautiously optimistic that index creation and deletion will not
> require too much extra work. For example, if a new index needs to
> point to a partial heap only tuple, it can do so (unlike HOT, which
> would require that the new index point to the root of the chain). The
> modified-columns bitmaps could include the entire set of modified
> columns (not just the indexed ones), so no additional changes would
> need to be made there. Furthermore, I'm anticipating that the
> modified-columns bitmaps will end up only being used with the
> redirected LPs to help reduce heap bloat after single-page vacuuming.
> In that case, new indexes would probably avoid the existing bitmaps
> anyway.

Yes, that would probably work, sure.

--
Bruce Momjian https://momjian.us
EDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee
Re: partial heap only tuples
On 3/9/21, 8:24 AM, "Bruce Momjian" wrote:
> On Mon, Feb 15, 2021 at 08:19:40PM +, Bossart, Nathan wrote:
>> Yeah, this is something I'm concerned about. I think adding a bitmap
>> of modified columns to the header of PHOT-updated tuples improves
>> matters quite a bit, even for single-page vacuuming. Following is a
>> strategy I've been developing (there may still be some gaps). Here's
>> a basic PHOT chain where all tuples are visible and the last one has
>> not been deleted or updated:
>>
>> idx1    0       1               2       3
>> idx2    0       1       2
>> idx3    0               2               3
>> lp      1       2       3       4       5
>> tuple   (0,0,0) (0,1,1) (2,2,1) (2,2,2) (3,2,3)
>> bitmap          -xx     xx-     --x     x-x
>
> First, I want to continue encouraging you to work on this because I
> think it can yield big improvements. Second, I like the wiki you
> created. Third, the diagram above seems to be more meaningful if read
> from the bottom-up. I suggest you reorder it on the wiki so it can be
> read top-down, maybe:
>
>> lp      1       2       3       4       5
>> tuple   (0,0,0) (0,1,1) (2,2,1) (2,2,2) (3,2,3)
>> bitmap          -xx     xx-     --x     x-x
>> idx1    0       1               2       3
>> idx2    0       1       2
>> idx3    0               2               3

I appreciate the feedback and the words of encouragement. I'll go
ahead and flip the diagrams like you suggested. I'm planning on
publishing a larger round of edits to the wiki once the patch set is
ready to share. There are a few changes to the design that I've
picked up along the way.

> Fourth, I know in the wiki you said create/drop index needs more
> research, but I suggest you avoid any design that will be overly complex
> for create/drop index. For example, a per-row bitmap that is based on
> what indexes exist at time of row creation might cause unacceptable
> problems in handling create/drop index. Would you number indexes? I am
> not saying you have to solve all the problems now, but you have to keep
> your eye on obstacles that might block your progress later.

I am agreed on avoiding an overly complex design. This project
introduces a certain amount of inherent complexity, so one of my main
goals is ensuring that it's easy to reason about each piece.

I'm cautiously optimistic that index creation and deletion will not
require too much extra work. For example, if a new index needs to
point to a partial heap only tuple, it can do so (unlike HOT, which
would require that the new index point to the root of the chain). The
modified-columns bitmaps could include the entire set of modified
columns (not just the indexed ones), so no additional changes would
need to be made there. Furthermore, I'm anticipating that the
modified-columns bitmaps will end up only being used with the
redirected LPs to help reduce heap bloat after single-page vacuuming.
In that case, new indexes would probably avoid the existing bitmaps
anyway.

Nathan
Re: partial heap only tuples
On Mon, Feb 15, 2021 at 08:19:40PM +, Bossart, Nathan wrote:
> Yeah, this is something I'm concerned about. I think adding a bitmap
> of modified columns to the header of PHOT-updated tuples improves
> matters quite a bit, even for single-page vacuuming. Following is a
> strategy I've been developing (there may still be some gaps). Here's
> a basic PHOT chain where all tuples are visible and the last one has
> not been deleted or updated:
>
> idx1    0       1               2       3
> idx2    0       1       2
> idx3    0               2               3
> lp      1       2       3       4       5
> tuple   (0,0,0) (0,1,1) (2,2,1) (2,2,2) (3,2,3)
> bitmap          -xx     xx-     --x     x-x

First, I want to continue encouraging you to work on this because I
think it can yield big improvements. Second, I like the wiki you
created. Third, the diagram above seems to be more meaningful if read
from the bottom-up. I suggest you reorder it on the wiki so it can be
read top-down, maybe:

> lp      1       2       3       4       5
> tuple   (0,0,0) (0,1,1) (2,2,1) (2,2,2) (3,2,3)
> bitmap          -xx     xx-     --x     x-x
> idx1    0       1               2       3
> idx2    0       1       2
> idx3    0               2               3

Fourth, I know in the wiki you said create/drop index needs more
research, but I suggest you avoid any design that will be overly complex
for create/drop index. For example, a per-row bitmap that is based on
what indexes exist at time of row creation might cause unacceptable
problems in handling create/drop index. Would you number indexes? I am
not saying you have to solve all the problems now, but you have to keep
your eye on obstacles that might block your progress later.

--
Bruce Momjian https://momjian.us
EDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee
Re: partial heap only tuples
On 3/8/21, 10:16 AM, "Ibrar Ahmed" wrote:
> On Wed, Feb 24, 2021 at 3:22 AM Bossart, Nathan wrote:
>> On 2/10/21, 2:43 PM, "Bruce Momjian" wrote:
>>> I wonder if you should create a Postgres wiki page to document all of
>>> this. I agree PG 15 makes sense. I would like to help with this if I
>>> can. I will need to study this email more later.
>>
>> I've started the wiki page for this:
>>
>> https://wiki.postgresql.org/wiki/Partial_Heap_Only_Tuples
>>
>> Nathan
>
> The regression test case (partial-index) is failing
>
> https://cirrus-ci.com/task/5310522716323840

This patch is intended as a proof-of-concept of some basic pieces of
the project. I'm working on a new patch set that should be more
suitable for community review.

Nathan
Re: partial heap only tuples
On Wed, Feb 24, 2021 at 3:22 AM Bossart, Nathan wrote:
> On 2/10/21, 2:43 PM, "Bruce Momjian" wrote:
> > I wonder if you should create a Postgres wiki page to document all of
> > this. I agree PG 15 makes sense. I would like to help with this if I
> > can. I will need to study this email more later.
>
> I've started the wiki page for this:
>
> https://wiki.postgresql.org/wiki/Partial_Heap_Only_Tuples
>
> Nathan
>

The regression test case (partial-index) is failing

https://cirrus-ci.com/task/5310522716323840

=== ./src/test/isolation/output_iso/regression.diffs ===
diff -U3 /tmp/cirrus-ci-build/src/test/isolation/expected/partial-index.out /tmp/cirrus-ci-build/src/test/isolation/output_iso/results/partial-index.out
--- /tmp/cirrus-ci-build/src/test/isolation/expected/partial-index.out 2021-03-06 23:11:08.018868833 +
+++ /tmp/cirrus-ci-build/src/test/isolation/output_iso/results/partial-index.out 2021-03-06 23:26:15.857027075 +
@@ -30,6 +30,8 @@
 6 a 1
 7 a 1
 8 a 1
+9 a 2
+10 a 2
 step c2: COMMIT;

 starting permutation: rxy1 wx1 wy2 c1 rxy2 c2
@@ -83,6 +85,7 @@
 6 a 1
 7 a 1
 8 a 1
+9 a 2
 10 a 1
 step c1: COMMIT;

Can you please take a look at that?

--
Ibrar Ahmed
Re: partial heap only tuples
On 2/10/21, 2:43 PM, "Bruce Momjian" wrote:
> I wonder if you should create a Postgres wiki page to document all of
> this. I agree PG 15 makes sense. I would like to help with this if I
> can. I will need to study this email more later.

I've started the wiki page for this:

https://wiki.postgresql.org/wiki/Partial_Heap_Only_Tuples

Nathan
Re: partial heap only tuples
On 2/13/21, 8:26 AM, "Andres Freund" wrote:
> On 2021-02-09 18:48:21 +, Bossart, Nathan wrote:
>> In order to be eligible for cleanup, the final tuple in the
>> corresponding PHOT/HOT chain must also be eligible for cleanup, or all
>> indexes must have been updated later in the chain before any visible
>> tuples.
>
> This sounds like it might be prohibitively painful. Adding effectively
> unremovable bloat to remove other bloat is not an uncomplicated
> premise. I think you'd really need a way to fully remove this as part of
> vacuum for this to be viable.

Yeah, this is something I'm concerned about. I think adding a bitmap
of modified columns to the header of PHOT-updated tuples improves
matters quite a bit, even for single-page vacuuming. Following is a
strategy I've been developing (there may still be some gaps). Here's
a basic PHOT chain where all tuples are visible and the last one has
not been deleted or updated:

idx1    0       1               2       3
idx2    0       1       2
idx3    0               2               3
lp      1       2       3       4       5
tuple   (0,0,0) (0,1,1) (2,2,1) (2,2,2) (3,2,3)
bitmap          -xx     xx-     --x     x-x

For single-page vacuum, we take the following actions:

    1. Starting at the root of the PHOT chain, create an OR'd bitmap
       of the chain.
    2. Walk backwards, OR-ing the bitmaps. Stop when the bitmap
       matches the one from step 1. As we walk backwards, identify
       "key" tuples, which are tuples where the OR'd bitmap changes as
       you walk backwards. If the OR'd bitmap does not include all
       columns for the table, also include the root of the PHOT chain
       as a key tuple.
    3. Redirect each key tuple to the next key tuple.
    4. For all but the first key tuple, OR the bitmaps of all pruned
       tuples from each key tuple (exclusive) to the next key tuple
       (inclusive) and store the result in the bitmap of the next key
       tuple.
    5. Mark all line pointers for all non-key tuples as dead. Storage
       can be removed for all tuples except the last one, but we must
       leave around the bitmap for all key tuples except for the first
       one.
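The five steps above can be sketched in a few lines of Python. This is only an illustration of the procedure as described in this message, not code from the patch. The names `parse`, `show`, and `prune_chain` are hypothetical, bitmaps are modelled as plain integers (bit i stands for column i), and the sketch assumes every non-root tuple in the chain modified at least one column.

```python
def parse(bits):
    """Turn a diagram bitmap such as '-xx' into an int mask (bit i = column i)."""
    return sum(1 << i for i, c in enumerate(bits) if c == "x")


def show(mask, ncols):
    """Inverse of parse(), for printing."""
    return "".join("x" if (mask >> i) & 1 else "-" for i in range(ncols))


def prune_chain(bitmaps, ncols):
    """Apply steps 1-5 to one chain (hypothetical helper, not the patch's code).

    bitmaps[i] is the modified-columns mask of the tuple at line pointer
    i + 1; the root tuple has an empty mask. Returns (redirects,
    kept_bitmaps, dead), all expressed in 1-based line pointer numbers.
    """
    all_cols = (1 << ncols) - 1

    # Step 1: OR together every bitmap in the chain.
    chain_or = 0
    for mask in bitmaps:
        chain_or |= mask

    # Step 2: walk backwards; a tuple is "key" when it adds a new column
    # to the running OR. Stop once the running OR equals the chain's OR.
    seen, keys = 0, []
    for i in range(len(bitmaps) - 1, -1, -1):
        if (seen | bitmaps[i]) != seen:
            keys.append(i)
            seen |= bitmaps[i]
        if seen == chain_or:
            break
    if chain_or != all_cols and 0 not in keys:
        keys.append(0)  # root is also a key tuple when some column was never modified
    keys.reverse()

    # Step 3: redirect each key tuple to the next key tuple.
    redirects = {a + 1: b + 1 for a, b in zip(keys, keys[1:])}

    # Step 4: each later key tuple keeps the OR of the bitmaps it absorbs.
    kept_bitmaps = {}
    for a, b in zip(keys, keys[1:]):
        merged = 0
        for i in range(a + 1, b + 1):
            merged |= bitmaps[i]
        kept_bitmaps[b + 1] = merged

    # Step 5: every non-key line pointer becomes dead.
    dead = [i + 1 for i in range(len(bitmaps)) if i not in keys]
    return redirects, kept_bitmaps, dead


chain = [parse(b) for b in ["---", "-xx", "xx-", "--x", "x-x"]]
redirects, kept, dead = prune_chain(chain, ncols=3)
print(redirects, {lp: show(m, 3) for lp, m in kept.items()}, dead)
```

For the chain in the diagram, the sketch picks line pointers 3 and 5 as key tuples, redirects 3 to 5, keeps the bitmap x-x on line pointer 5, and marks line pointers 1, 2, and 4 dead.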
After this, our basic PHOT chain looks like this:

idx1    0       1               2       3
idx2    0       1       2
idx3    0               2               3
lp      X       X       3->5    X       5
tuple                                   (3,2,3)
bitmap                                  x-x

Without PHOT, this intermediate state would have 15 index tuples, 5
line pointers, and 1 heap tuple. With PHOT, we have 10 index tuples,
5 line pointers, 1 heap tuple, and 1 bitmap. When we vacuum the
indexes, we can reclaim the dead line pointers and remove the
associated index tuples:

idx1            3
idx2    2
idx3    2       3
lp      3->5    5
tuple           (3,2,3)
bitmap          x-x

Without PHOT, this final state would have 3 index tuples, 1 line
pointer, and 1 heap tuple. With PHOT, we have 4 index tuples, 2 line
pointers, 1 heap tuple, and 1 bitmap. Overall, we still end up
keeping around more line pointers and tuple headers (for the bitmaps),
but maybe that is good enough. I think the next step here would be to
find a way to remove some of the unnecessary index tuples and adjust
the remaining ones to point to the last line pointer in the PHOT
chain.

Nathan
Re: partial heap only tuples
Hi,

On 2021-02-09 18:48:21 +, Bossart, Nathan wrote:
> In order to be eligible for cleanup, the final tuple in the
> corresponding PHOT/HOT chain must also be eligible for cleanup, or all
> indexes must have been updated later in the chain before any visible
> tuples.

This sounds like it might be prohibitively painful. Adding effectively
unremovable bloat to remove other bloat is not an uncomplicated
premise. I think you'd really need a way to fully remove this as part of
vacuum for this to be viable.

Greetings,

Andres Freund
Re: partial heap only tuples
On 2/10/21, 2:43 PM, "Bruce Momjian" wrote:
> On Tue, Feb 9, 2021 at 06:48:21PM +, Bossart, Nathan wrote:
>> HOT works wonders when no indexed columns are updated. However, as
>> soon as you touch one indexed column, you lose that optimization
>> entirely, as you must update every index on the table. The resulting
>> performance impact is a pain point for many of our (AWS's) enterprise
>> customers, so we'd like to lend a hand for some improvements in this
>> area. For workloads involving a lot of columns and a lot of indexes,
>> an optimization like PHOT can make a huge difference. I'm aware that
>> there was a previous attempt a few years ago to add a similar
>> optimization called WARM [0] [1]. However, I only noticed this
>> previous effort after coming up with the design for PHOT, so I ended
>> up taking a slightly different approach. I am also aware of a couple
>> of recent nbtree improvements that may mitigate some of the impact of
>> non-HOT updates [2] [3], but I am hoping that PHOT serves as a nice
>> complement to those. I've attached a very early proof-of-concept
>> patch with the design described below.
>
> How is your approach different from those of [0] and [1]? It is
> interesting you still see performance benefits even after the btree
> duplication improvements. Did you test with those improvements?

I believe one of the main differences is that index tuples will point
to the corresponding PHOT tuple instead of the root of the HOT/PHOT
chain. I'm sure there are other differences. I plan on giving those
two long threads another read-through in the near future.

I made sure that the btree duplication improvements were applied for
my benchmarking. IIUC those don't alleviate the requirement that you
insert all index tuples for non-HOT updates, so PHOT can still provide
some added benefits there.

>> Next, I'll go into the design a bit. I've commandeered the two
>> remaining bits in t_infomask2 to use as HEAP_PHOT_UPDATED and
>> HEAP_PHOT_TUPLE. These are analogous to the HEAP_HOT_UPDATED and
>> HEAP_ONLY_TUPLE bits. (If there are concerns about exhausting the
>> t_infomask2 bits, I think we could only use one of the remaining bits
>> as a "modifier" bit on the HOT ones. I opted against that for the
>> proof-of-concept patch to keep things simple.) When creating a PHOT
>> tuple, we only create new index tuples for updated columns. These new
>> index tuples point to the PHOT tuple. Following is a simple
>> demonstration with a table with two integer columns, each with its own
>> index:
>
> Whatever solution you have, you have to be able to handle
> adding/removing columns, and adding/removing indexes.

I admittedly have not thought too much about the implications of
adding/removing columns and indexes for PHOT yet, but that's
definitely an important part of this project that I need to look into.
I see that HOT has some special handling for commands like CREATE
INDEX that I can reference.

>> When it is time to scan through a PHOT chain, there are a couple of
>> things to account for. Sequential scans work out-of-the-box thanks to
>> the visibility rules, but other types of scans like index scans
>> require additional checks. If you encounter a PHOT chain when
>> performing an index scan, you should only continue following the chain
>> as long as none of the columns the index indexes are modified. If the
>> scan does encounter such a modification, we stop following the chain
>> and continue with the index scan. Even if there is a tuple in that
>
> I think in patch [0] and [1], if an index column changes, all the
> indexes had to be inserted into, while you seem to require inserts only
> into the index that needs it. Is that correct?

Right, PHOT only requires new index tuples for the modified columns.
However, I was under the impression that WARM aimed to do the same
thing. I might be misunderstanding your question.

> I wonder if you should create a Postgres wiki page to document all of
> this. I agree PG 15 makes sense. I would like to help with this if I
> can. I will need to study this email more later.

Thanks for taking a look. I think a wiki is a good idea for keeping
track of the current state of the design. I'll look into that.

Nathan
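To make the t_infomask2 discussion quoted in this message concrete, here is a small Python sketch of the 16-bit layout. The four existing constants are the values defined in PostgreSQL's htup_details.h. The values given to HEAP_PHOT_UPDATED and HEAP_PHOT_TUPLE are an assumption for illustration only: the thread says they use the two remaining bits but does not say which flag gets which bit. The helper `flags` is hypothetical.

```python
# Existing t_infomask2 bits, as defined in PostgreSQL's htup_details.h.
HEAP_NATTS_MASK = 0x07FF    # low 11 bits: number of attributes
HEAP_KEYS_UPDATED = 0x2000
HEAP_HOT_UPDATED = 0x4000
HEAP_ONLY_TUPLE = 0x8000

# ASSUMPTION: illustrative placement of the two proposed flags in the
# otherwise unused 0x1800 bits; the patch's real values may differ.
HEAP_PHOT_UPDATED = 0x0800
HEAP_PHOT_TUPLE = 0x1000

FLAG_NAMES = {
    "HEAP_PHOT_UPDATED": HEAP_PHOT_UPDATED,
    "HEAP_PHOT_TUPLE": HEAP_PHOT_TUPLE,
    "HEAP_KEYS_UPDATED": HEAP_KEYS_UPDATED,
    "HEAP_HOT_UPDATED": HEAP_HOT_UPDATED,
    "HEAP_ONLY_TUPLE": HEAP_ONLY_TUPLE,
}


def flags(infomask2):
    """Return (attribute count, names of the flag bits that are set)."""
    natts = infomask2 & HEAP_NATTS_MASK
    names = [name for name, bit in FLAG_NAMES.items() if infomask2 & bit]
    return natts, names


# A two-column tuple that was PHOT-updated, and its PHOT successor.
old_version = 2 | HEAP_PHOT_UPDATED
new_version = 2 | HEAP_PHOT_TUPLE
print(flags(old_version))
print(flags(new_version))
```

With both flags defined this way, all 16 bits of t_infomask2 are spoken for, which is the exhaustion concern the quoted text raises. The alternative mentioned there, a single "modifier" bit combined with the existing HOT bits, would leave one bit free at the cost of a slightly less direct test.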
Re: partial heap only tuples
On Tue, Feb 9, 2021 at 06:48:21PM +, Bossart, Nathan wrote:
> Hello,
>
> I'm hoping to gather some early feedback on a heap optimization I've
> been working on. In short, I'm hoping to add "partial heap only
> tuple" (PHOT) support, which would allow you to skip updating indexes
> for unchanged columns even when other indexes require updates. Today,

I think it is great you are working on this. I think it is a major way
to improve performance and I have been disappointed it has not moved
forward since 2016.

> HOT works wonders when no indexed columns are updated. However, as
> soon as you touch one indexed column, you lose that optimization
> entirely, as you must update every index on the table. The resulting
> performance impact is a pain point for many of our (AWS's) enterprise
> customers, so we'd like to lend a hand for some improvements in this
> area. For workloads involving a lot of columns and a lot of indexes,
> an optimization like PHOT can make a huge difference. I'm aware that
> there was a previous attempt a few years ago to add a similar
> optimization called WARM [0] [1]. However, I only noticed this
> previous effort after coming up with the design for PHOT, so I ended
> up taking a slightly different approach. I am also aware of a couple
> of recent nbtree improvements that may mitigate some of the impact of
> non-HOT updates [2] [3], but I am hoping that PHOT serves as a nice
> complement to those. I've attached a very early proof-of-concept
> patch with the design described below.

How is your approach different from those of [0] and [1]? It is
interesting you still see performance benefits even after the btree
duplication improvements. Did you test with those improvements?

> As far as performance is concerned, it is simple enough to show major
> benefits from PHOT by tacking on a large number of indexes and columns
> to a table. For a short pgbench run where each table had 5 additional
> text columns and indexes on every column, I noticed a ~34% bump in
> TPS with PHOT [4]. Theoretically, the TPS bump should be even higher

That's a big improvement.

> Next, I'll go into the design a bit. I've commandeered the two
> remaining bits in t_infomask2 to use as HEAP_PHOT_UPDATED and
> HEAP_PHOT_TUPLE. These are analogous to the HEAP_HOT_UPDATED and
> HEAP_ONLY_TUPLE bits. (If there are concerns about exhausting the
> t_infomask2 bits, I think we could only use one of the remaining bits
> as a "modifier" bit on the HOT ones. I opted against that for the
> proof-of-concept patch to keep things simple.) When creating a PHOT
> tuple, we only create new index tuples for updated columns. These new
> index tuples point to the PHOT tuple. Following is a simple
> demonstration with a table with two integer columns, each with its own
> index:

Whatever solution you have, you have to be able to handle
adding/removing columns, and adding/removing indexes.

> When it is time to scan through a PHOT chain, there are a couple of
> things to account for. Sequential scans work out-of-the-box thanks to
> the visibility rules, but other types of scans like index scans
> require additional checks. If you encounter a PHOT chain when
> performing an index scan, you should only continue following the chain
> as long as none of the columns the index indexes are modified. If the
> scan does encounter such a modification, we stop following the chain
> and continue with the index scan. Even if there is a tuple in that

I think in patch [0] and [1], if an index column changes, all the
indexes had to be inserted into, while you seem to require inserts only
into the index that needs it. Is that correct?

> PHOT chain that should be returned by our index scan, we will still
> find it, as there will be another matching index tuple that points us
> to later in the PHOT chain. My initial idea for determining which
> columns were modified was to add a new bitmap after the "nulls" bitmap
> in the tuple header. However, the attached patch simply uses
> HeapDetermineModifiedColumns(). I've yet to measure the overhead of
> this approach versus the bitmap approach, but I haven't noticed
> anything too detrimental in the testing I've done so far.

A bitmap is an interesting approach, but you are right it will need
benchmarking.

I wonder if you should create a Postgres wiki page to document all of
this. I agree PG 15 makes sense. I would like to help with this if I
can. I will need to study this email more later.

--
Bruce Momjian https://momjian.us
EDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee
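The index-scan rule quoted in this message (follow a PHOT chain only while none of the index's columns are modified) can be illustrated with a short Python sketch. It is a toy model, not the patch's logic: the chain is a list of modified-column masks, visibility checks are ignored, and the names `parse`, `index_entry_points`, and `follow_chain` are hypothetical.

```python
def parse(bits):
    """Turn a bitmap such as '-xx' into an int mask (bit i = column i)."""
    return sum(1 << i for i, c in enumerate(bits) if c == "x")


def index_entry_points(chain, index_cols):
    """Chain positions that this index has an index tuple for.

    Assumes the index points at the root plus every later tuple that
    modified at least one of the index's columns.
    """
    return [i for i, mask in enumerate(chain) if i == 0 or mask & index_cols]


def follow_chain(chain, start, index_cols):
    """Positions an index scan may visit when entering the chain at start.

    The scan stops as soon as the next tuple modified one of the index's
    columns; that tuple is reached through its own index tuple instead.
    """
    visited = [start]
    for i in range(start + 1, len(chain)):
        if chain[i] & index_cols:
            break
        visited.append(i)
    return visited


# A five-tuple chain over three columns; the root modifies nothing.
chain = [parse(b) for b in ["---", "-xx", "xx-", "--x", "x-x"]]
index_on_first_column = parse("x--")

entries = index_entry_points(chain, index_on_first_column)
segments = [follow_chain(chain, e, index_on_first_column) for e in entries]
print(entries, segments)
```

In this model each index tuple covers a contiguous stretch of the chain and the stretches do not overlap, so every tuple version is reachable from exactly one of the index's tuples. That is the property the quoted text relies on when it says a matching tuple later in the chain will still be found through another index tuple.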