On Sat, Nov 26, 2016 at 10:49 PM, Amit Kapila <amit.kapil...@gmail.com> wrote: > On Fri, Nov 25, 2016 at 11:23 PM, Bruce Momjian <br...@momjian.us> wrote: >> On Thu, Nov 24, 2016 at 12:23:28PM -0500, Robert Haas wrote: >>> I agree up to a point. I think we need to design our own system as >>> well as we can, not just copy what others have done. For example, the >>> design I sketched will work with all of PostgreSQL's existing index >>> types. You need to modify each AM in order to support in-place >>> updates when a column indexed by that AM has been modified, and that's >>> probably highly desirable, but it's not a hard requirement. >> >> I feel you are going to get into the problem of finding the index entry >> for the old row --- the same thing that is holding back more aggressive >> WARM updates. >> > > I think I see a problem in that regards when the index column in > updated twice with the same value. For example, > > table - test_undo(c1 int, c2 char(10)); > index - idx_tu on test_undo(c1); > > Step-1 > Insert (2, 'abc') > Heap - 2, abc > Index - 2 > Commit; > > Step -2 > Update (3,def) where c1 = 2; > Heap - 3, def > Index - 3 (another value will be 2) > Undo - 2, abc > Commit; > > At this point a new query which scans the index for value 2 will reach > to the tuple (3,def). As scan started after all commits, tuple > (3,def) appears okay, but it can compare the key value in the tuple to > detect that this is not the right tuple. So, all is well till here. > > Step -3 > Update (2) where c1 = 3; > Heap - 2, def > Index - 2 (another values will be 3,2) > Undo - 3; > Commit > > At this point, index scan for value 2 will find index tuple of step-1 > (2) and will conclude 2,def as a right tuple, but actually, that is > wrong as the step-1 (2) index tuple should not be visible to the user. > Do you also this as a problem or am I missing something? If this a > problem, then I think we need some form of delete marking system for > the index, probably transaction visibility information as well.
Well, my original email did contain a discussion of the need for delete-marking. I said that you couldn't do in-place updates when indexed columns were modified unless the index AMs had support for delete-marking, which is the same point you are making here. However, I agree that the case where the indexed column gets set back to a previous value while the old index tuple for that value still exists is particularly tricky. I think that what we need to do there is avoid inserting an index entry for a given value/TID pair if an index entry for that value/TID pair already exists. That's a little different from your analysis. In your step-3 analysis, you say that an index scan for 2 will find the step-1 tuple, but that's not really right. The index scan will find the index tuple which points to the whole chain of tuples, step-3 => step-2 => step-1, and it will decide which heap tuple from that chain the user can see based on the user's snapshot. That's OK, but we're in real trouble if step-3 inserted an additional index tuple pointing to that chain rather than simply noting that one already exists. If it adds an additional one, then we'll visit two index tuples for that TID. Each time, the visibility information in UNDO will cause us to return the correct tuple, but we've erred by returning it twice (once per index entry) rather than just once. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers