Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2020-01-27 Thread Ashwin Agrawal
On Mon, Jan 27, 2020 at 4:47 PM Thomas Munro  wrote:

> OK, I kept only the small comment change from that little fixup patch,
> and pushed this.
>
> > I had proposed an alternative way in the initial email and also later,
> > but didn't receive comments on it, so I'm re-posting.
>
> > typedef bool (*AMCheckForSerializableConflictOutCallback) (void *arg);
> ...
> > Just due to the void * callback argument I didn't prefer that
> > solution, and felt the AM performing its checks and then calling
> > CheckForSerializableConflictOut() seems better.  Please let me know
> > how you feel about this.
>
> Yeah.  We could always come back to this idea if it looks better once
> we have more experience with new table AMs.
>

Sounds good. Thank You!


Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2020-01-27 Thread Thomas Munro
On Wed, Nov 13, 2019 at 7:27 PM Ashwin Agrawal  wrote:
> On Sun, Nov 10, 2019 at 8:21 PM Thomas Munro  wrote:
>> but on another read-through of the main patch
>> I didn't like the comments for CheckForSerializableConflictOut() or
>> the fact that it checks SerializationNeededForRead() again, after I
>> thought a bit about what the contract for this API is now.  Here's a
>> version with small fixup that I'd like to squash into the patch.
>> Please let me know what you think,
>
> The reasoning behind having SerializationNeededForRead() inside
> CheckForSerializableConflictOut() is to keep that API clean and complete
> by itself. An AM like heap that needs to perform some AM-specific
> checking only when serialization is needed can still do so, but is not
> forced to. So if an AM, for example Zedstore, doesn't need to do any
> specific checking, it can call CheckForSerializableConflictOut()
> directly. With the modified fixup patch, that responsibility is
> unnecessarily pushed onto the caller even though
> CheckForSerializableConflictOut() could do it. I understand the intent
> is to avoid the duplicate check for heap.

OK, I kept only the small comment change from that little fixup patch,
and pushed this.

> I had proposed an alternative way in the initial email and also later,
> but didn't receive comments on it, so I'm re-posting.

> typedef bool (*AMCheckForSerializableConflictOutCallback) (void *arg);
...
> Just due to the void * callback argument I didn't prefer that
> solution, and felt the AM performing its checks and then calling
> CheckForSerializableConflictOut() seems better.  Please let me know
> how you feel about this.

Yeah.  We could always come back to this idea if it looks better once
we have more experience with new table AMs.




Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-11-12 Thread Ashwin Agrawal
On Sun, Nov 10, 2019 at 8:21 PM Thomas Munro  wrote:

> I pushed the first two,


Thank You!

but on another read-through of the main patch
> I didn't like the comments for CheckForSerializableConflictOut() or
> the fact that it checks SerializationNeededForRead() again, after I
> thought a bit about what the contract for this API is now.  Here's a
> version with small fixup that I'd like to squash into the patch.
> Please let me know what you think,


The reasoning behind having SerializationNeededForRead() inside
CheckForSerializableConflictOut() is to keep that API clean and complete
by itself. An AM like heap that needs to perform some AM-specific
checking only when serialization is needed can still do so, but is not
forced to. So if an AM, for example Zedstore, doesn't need to do any
specific checking, it can call CheckForSerializableConflictOut()
directly. With the modified fixup patch, that responsibility is
unnecessarily pushed onto the caller even though
CheckForSerializableConflictOut() could do it. I understand the intent
is to avoid the duplicate check for heap.

>
> or if you see how to improve it
> further.
>

I had proposed an alternative way in the initial email and also later,
but didn't receive comments on it, so I'm re-posting.

An alternative way to refactor: CheckForSerializableConflictOut() could
take a callback function and a (void *) callback argument for the
AM-specific check. The flow would be the AM calling
CheckForSerializableConflictOut() as today, and only if
SerializationNeededForRead() passes would it invoke the callback to ask
the AM whether more work should be performed. Essentially,
HeapCheckForSerializableConflictOut() would become the callback function
instead. Roughly, it would look like:

typedef bool (*AMCheckForSerializableConflictOutCallback) (void *arg);

void
CheckForSerializableConflictOut(Relation relation, TransactionId xid,
                                Snapshot snapshot,
                                AMCheckForSerializableConflictOutCallback callback,
                                void *callback_arg)
{
    if (!SerializationNeededForRead(relation, snapshot))
        return;
    if (callback != NULL && !callback(callback_arg))
        return;

    ...
}

With this, AMs that don't have any specific checks to perform can pass
NULL for the callback. So, the extra function call is involved only if
SerializationNeededForRead() passes, and only for AMs that need it.
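
For illustration only, a heap-side caller under this callback alternative could
look roughly like the sketch below. The HeapCheckArg struct, the callback body,
and the helper function name are assumptions made up for the sketch, not part
of the actual proposal or any patch.

/* Hypothetical argument struct for the heap callback (illustration only). */
typedef struct HeapCheckArg
{
	Relation	relation;
	HeapTuple	tuple;
	Buffer		buffer;
} HeapCheckArg;

/*
 * Hypothetical callback: do the heap-specific checks (HTSV() etc.) and return
 * false if there is nothing further for predicate.c to do.
 */
static bool
HeapConflictOutCallback(void *arg)
{
	HeapCheckArg *harg = (HeapCheckArg *) arg;

	/* ... HeapTupleSatisfiesVacuum()-based checks on harg->tuple ... */
	return harg->tuple != NULL;
}

/* Heap would pass the callback and its argument ... */
static void
heap_conflict_out_example(Relation relation, HeapTuple tuple, Buffer buffer,
						  TransactionId xid, Snapshot snapshot)
{
	HeapCheckArg arg = {relation, tuple, buffer};

	CheckForSerializableConflictOut(relation, xid, snapshot,
									HeapConflictOutCallback, &arg);
}

/*
 * ... while an AM with no AM-specific checks (the Zedstore case above) would
 * simply call CheckForSerializableConflictOut(relation, xid, snapshot, NULL, NULL).
 */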

Just due to the void * callback argument I didn't prefer that
solution, and felt the AM performing its checks and then calling
CheckForSerializableConflictOut() seems better.  Please let me know
how you feel about this.


Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-11-10 Thread Thomas Munro
On Sat, Nov 9, 2019 at 8:41 AM Ashwin Agrawal  wrote:
> On Thu, Nov 7, 2019 at 8:44 PM Thomas Munro  wrote:
>> I'm planning to commit these three patches on Monday.  I've attached
>> versions with whitespace-only changes from pgindent, and commit
>> messages lightly massaged and updated to point to this discussion and
>> reviewers.
>
> Thanks a lot, sounds good.

Hi Ashwin,

I pushed the first two, but on another read-through of the main patch
I didn't like the comments for CheckForSerializableConflictOut() or
the fact that it checks SerializationNeededForRead() again, after I
thought a bit about what the contract for this API is now.  Here's a
version with small fixup that I'd like to squash into the patch.
Please let me know what you think, or if you see how to improve it
further.


v4-0001-Remove-dependency-on-HeapTuple-from-predicate-loc.patch
Description: Binary data


v4-0002-fixup.patch
Description: Binary data


Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-11-08 Thread Ashwin Agrawal
On Thu, Nov 7, 2019 at 8:44 PM Thomas Munro  wrote:

> On Thu, Aug 8, 2019 at 6:53 AM Ashwin Agrawal  wrote:
> >>> - I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
> >>>   portion of it's code as a static inline. In particular, it's a shame
> >>>   that we currently perform external function calls at quite the
> >>>   frequency when serializable isn't even in use.
> >>
> >> I am not sure about the "portion of the code" part?
> >> SerializationNeededForRead() is a static inline function in the C file.
> >> We can't inline CheckForSerializableConflictOutNeeded() without moving
> >> SerializationNeededForRead() and some other variables to a header file.
> >> CheckForSerializableConflictOut() wasn't inline either, so a function call
> >> was performed earlier as well when serializable isn't even in use.
>
> I agree that it's strange that we do these high frequency function
> calls just to figure out that we're not even using this stuff, which
> ultimately comes down to the static global variable MySerializableXact
> being not reachable from (say) an inline function defined in a header.
> That's something to look into in another thread.
>
> > Attaching re-based version of the patches on top of current master,
> which has the fix for HOT serializable predicate locking bug spotted by
> Andres committed now.
>
> I'm planning to commit these three patches on Monday.  I've attached
> versions with whitespace-only changes from pgindent, and commit
> messages lightly massaged and updated to point to this discussion and
> reviewers.
>

Thanks a lot, sounds good.


Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-11-07 Thread Thomas Munro
On Thu, Aug 8, 2019 at 6:53 AM Ashwin Agrawal  wrote:
>>> - I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
>>>   portion of it's code as a static inline. In particular, it's a shame
>>>   that we currently perform external function calls at quite the
>>>   frequency when serializable isn't even in use.
>>
>> I am not sure about the "portion of the code" part?
>> SerializationNeededForRead() is a static inline function in the C file.
>> We can't inline CheckForSerializableConflictOutNeeded() without moving
>> SerializationNeededForRead() and some other variables to a header file.
>> CheckForSerializableConflictOut() wasn't inline either, so a function call
>> was performed earlier as well when serializable isn't even in use.

I agree that it's strange that we do these high frequency function
calls just to figure out that we're not even using this stuff, which
ultimately comes down to the static global variable MySerializableXact
being not reachable from (say) an inline function defined in a header.
That's something to look into in another thread.

> Attaching re-based version of the patches on top of current master, which has 
> the fix for HOT serializable predicate locking bug spotted by Andres 
> committed now.

I'm planning to commit these three patches on Monday.  I've attached
versions with whitespace-only changes from pgindent, and commit
messages lightly massaged and updated to point to this discussion and
reviewers.


v3-0001-Optimize-TransactionIdIsCurrentTransactionId.patch
Description: Binary data


v3-0002-Optimize-PredicateLockTuple.patch
Description: Binary data


v3-0003-Remove-HeapTuple-dependency-for-predicate-locking.patch
Description: Binary data


Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-08-07 Thread Ashwin Agrawal
On Fri, Aug 2, 2019 at 4:56 PM Ashwin Agrawal  wrote:

>
> On Wed, Jul 31, 2019 at 2:06 PM Andres Freund  wrote:
>
>> Looking at the code as of master, we currently have:
>>
>
> Super awesome feedback and insights, thank you!
>
> - PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
> >   out whether the tuple has been locked by the current
>>   transaction. That check afaict just should be
>>   TransactionIdIsCurrentTransactionId(), without all the other
>>   stuff that's done today.
>>
>
> Agreed. The attached v1-0002 patch does that now. Please let me know if that's
> what you meant.
>
>   TransactionIdIsCurrentTransactionId() imo ought to be optimized to
>>   always check for the top level transactionid first - that's a good bet
>>   today, but even moreso for the upcoming AMs that won't have separate
>>   xids for subtransactions.  Alternatively we shouldn't make that a
>>   binary search for each subtrans level, but just have a small
>>   simplehash hashtable for xids.
>>
>
> The v1-0001 patch checks GetTopTransactionIdIfAny() first in
> TransactionIdIsCurrentTransactionId(), which seems better in general and even
> more so for the future. That mostly meets the needs of the current discussion.
>
> The alternative of not using a binary search seems like a bigger refactoring
> and should be handled as a separate optimization exercise outside of this thread.
>
>
>> - CheckForSerializableConflictOut() wants to get the toplevel xid for
>>   the tuple, because that's the one the predicate hashtable stores.
>>
>>   In your patch you've already moved the HTSV() call etc out of
>>   CheckForSerializableConflictOut(). I'm somewhat inclined to think that
>>   the SubTransGetTopmostTransaction() call ought to go along with that.
>>   I don't really think that belongs in predicate.c, especially if
>>   most/all new AMs don't use subtransaction ids.
>>
>>   The only downside is that currently the
>>   TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
>>   avoids the SubTransGetTopmostTransaction() check.
>>
>>   But again, the better fix for that seems to be to improve the generic
>>   code. As written the check won't prevent a subtrans lookup for heap
>>   when subtransactions are in use, and it's IME pretty common for tuples
>>   to get looked at again in the transaction that has created them. So
>>   I'm somewhat inclined to think that SubTransGetTopmostTransaction()
>>   should have a fast-path for the current transaction - probably just
>>   employing TransactionIdIsCurrentTransactionId().
>>
>
> That optimization, as Kuntal also mentioned, seems like something which can
> be done on top of the current patch afterwards.
>
>
>> I don't really see what we gain by having the subtrans handling in the
>> predicate code. Especially given that we've already moved the HTSV()
>> handling out, it seems architecturally the wrong place to me - but I
>> admit that that's a fuzzy argument.  The relevant mapping should be one
>> line in the caller.
>>
>
> Okay, I moved the subtransaction handling out of
> CheckForSerializableConflictOut() and have it alongside HTSV() now.
>
> The reason I felt like leaving the subtransaction handling in a generic place
> was that it might be premature to think no future AM will need it. Plus, it is
> easier if all the serializable function APIs have the same expectations: for
> example, PredicateLockTuple() can be passed a top or subtransaction id and
> handles either, but with the change CheckForSerializableConflictOut() can only
> be fed a top transaction ID. Still, that's fine, and I can see the point that
> an AM needing it can easily get the top transaction ID itself and feed it in,
> as heap does.
>
>
>> I wonder if it'd be worth combining the
>> TransactionIdIsCurrentTransactionId() calls in the heap cases that
>> currently do both, PredicateLockTuple() and
>> HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
>> isn't commonly that hot a path, but heap_hot_search_buffer() is.
>>
>
> Maybe; I will give that some thought separately from the current patch.
>
>
>> Minor notes:
>> - I don't think 'insert_xid' is necessarily great - it could also be the
>>   updating xid etc. And while you can argue that an update is an insert
>>   in the current heap, that's not the case for future AMs.
>> - to me
>> @@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation
>> relation, Buffer buffer,
>> if (valid)
>> {
>> ItemPointerSetOffsetNumber(tid, offnum);
>> -   PredicateLockTuple(relation, heapTuple,
>> snapshot);
>> +   PredicateLockTID(relation,
>> &(heapTuple)->t_self, snapshot,
>> +
>> HeapTupleHeaderGetXmin(heapTuple->t_data));
>> if (all_dead)
>> *all_dead = false;
>> return true;
>>
>>  What are those parens - as placed they can't do anything. Did you
>>  intend to write &(heapTuple->t_self)? Even that is 

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-08-07 Thread Heikki Linnakangas

On 06/08/2019 13:35, Thomas Munro wrote:

On Tue, Aug 6, 2019 at 9:26 PM Heikki Linnakangas  wrote:

Attached is a patch that contains your fix.txt with the changes for
clarity mentioned above, and an isolationtester test case.


LGTM.


Pushed, thanks!

- Heikki




Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-08-06 Thread Thomas Munro
On Tue, Aug 6, 2019 at 9:26 PM Heikki Linnakangas  wrote:
> I had some steam, and wrote a spec that reproduces this bug. It wasn't
> actually that hard to reproduce, fortunately. Or unfortunately; people
> might well be hitting it in production. I used the "freezetest.spec"
> from the 2013 thread as the starting point, and added one UPDATE to the
> initialization, so that the test starts with an already HOT-updated
> tuple. It should throw a serialization error, but on current master, it
> does not. After applying your fix.txt, it does.

Thanks!  Ahh, right, I was expecting it to be harder to see an
undetected anomaly, because of the index page lock, but of course we
never actually write to that page so it's just the heap tuple lock
holding everything together.

> Your fix.txt seems correct. For clarity, I'd prefer moving things around
> a bit, though, so that the t_self is set earlier in the function, at the
> same place where the other HeapTuple fields are set. And set blkno and
> offnum together, in one ItemPointerSet call. With that, I'm not sure we
> need such a verbose comment explaining why t_self needs to be updated
> but I kept it for now.

+1

> Attached is a patch that contains your fix.txt with the changes for
> clarity mentioned above, and an isolationtester test case.

LGTM.

> PS. Because heap_hot_search_buffer() now always sets heapTuple->t_self
> to the returned tuple version, updating *tid is redundant. And the call
> in heapam_index_fetch_tuple() wouldn't need to do
> "bslot->base.tupdata.t_self = *tid".

Right, that sounds like a separate improvement for master only.

-- 
Thomas Munro
https://enterprisedb.com




Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-08-06 Thread Heikki Linnakangas

On 06/08/2019 07:20, Thomas Munro wrote:

On Tue, Aug 6, 2019 at 4:35 AM Andres Freund  wrote:

On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:

1.  Commit dafaa3efb75 "Implement genuine serializable isolation
level." (2011) locked the root tuple, because it set t_self to *tid.
Possibly due to confusion about the effect of the preceding line
ItemPointerSetOffsetNumber(tid, offnum).

2.  Commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
offnum).


Hm. It's not at all sure that it's ok to report the non-root tuple tid
here. I.e. I'm fairly sure there was a reason to not just set it to the
actual tid. I think I might have written that up on the list at some
point. Let me dig in memory and list. Obviously possible that that was
also obsoleted by parallel changes.


Adding Heikki and Kevin.

I haven't found your earlier discussion about that yet, but would be
keen to read it if you can find it.   I wondered if your argument
might have had something to do with heap pruning, but I can't find a
problem there.  It's not as though the TID of any visible tuples
change, it's just that some dead stuff goes away and the root line
pointer is changed to LP_REDIRECT if it wasn't already.

As for the argument for locking the tuple we emit rather than the HOT
root, I think the questions are: (1) How exactly do we get away with
locking only one version in a regular non-HOT update chain?  (2) Is it
OK to do that with a HOT root?

The answer to the first question is given in README-SSI[1].
Unfortunately it doesn't discuss HOT directly, but I suspect the
answer is no, HOT is not special here.  By my reading, it relies on
the version you lock being the version visible to your snapshot, which
is important because later updates have to touch that tuple to write
the next version.  That doesn't apply to some arbitrarily older tuple
that happens to be a HOT root.  Concretely, heap_update() does
CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
only going to produce a rw conflict if T1 took an SIREAD on precisely
the version T2 locks in that path, not some arbitrarily older version
that happens to be a HOT root.  A HOT root might never be considered
again by concurrent writers, no?


Your analysis is spot on. Thanks for the clear write-up!


This must be in want of an isolation test, but I haven't yet tried to
get my head around how to write a test that would show the difference.


Indeed.


One practical problem is that the only way to reach
PredicateLockTuple() is from an index scan, and the index scan locks
the index page (or the whole index, depending on
rd_indam->ampredlocks).  So I think if you want to see a serialization
anomaly you'll need multiple indexes (so that index page locks don't
hide the problem), a long enough HOT chain and then probably several
transactions to be able to miss a cycle that should be picked up by
the logic in [1].  I'm out of steam for this problem today though.


I had some steam, and wrote a spec that reproduces this bug. It wasn't 
actually that hard to reproduce, fortunately. Or unfortunately; people 
might well be hitting it in production. I used the "freezetest.spec" 
from the 2013 thread as the starting point, and added one UPDATE to the 
initialization, so that the test starts with an already HOT-updated 
tuple. It should throw a serialization error, but on current master, it 
does not. After applying your fix.txt, it does.


Your fix.txt seems correct. For clarity, I'd prefer moving things around 
a bit, though, so that the t_self is set earlier in the function, at the 
same place where the other HeapTuple fields are set. And set blkno and 
offnum together, in one ItemPointerSet call. With that, I'm not sure we 
need such a verbose comment explaining why t_self needs to be updated 
but I kept it for now.


Attached is a patch that contains your fix.txt with the changes for 
clarity mentioned above, and an isolationtester test case.


PS. Because heap_hot_search_buffer() now always sets heapTuple->t_self 
to the returned tuple version, updating *tid is redundant. And the call 
in heapam_index_fetch_tuple() wouldn't need to do 
"bslot->base.tupdata.t_self = *tid".


- Heikki
From 23d07e6d5b259e1fd2fe7694cde51212920fbb3a Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Tue, 6 Aug 2019 12:14:14 +0300
Subject: [PATCH 1/1] Fix predicate-locking of HOT updated rows.

In serializable mode, heap_hot_search_buffer() incorrectly acquired a
predicate lock on the root tuple, not the returned tuple that satisfied
the visibility checks. As explained in README-SSI, the predicate lock does
not need to be copied or extended to other tuple versions, but for that to
work, the correct, visible, tuple version must be locked in the first
place.

The original SSI commit had this bug in it, but it was fixed back in 2013,
in commit 81fbbfe335. But unfortunately, it was reintroduced a few months
later in commit b89e151054. 

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-08-06 Thread Kuntal Ghosh
Hello Thomas,

On Tue, Aug 6, 2019 at 9:50 AM Thomas Munro  wrote:
>
> On Tue, Aug 6, 2019 at 4:35 AM Andres Freund  wrote:
> > On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:
> > > 1.  Commit dafaa3efb75 "Implement genuine serializable isolation
> > > level." (2011) locked the root tuple, because it set t_self to *tid.
> > > Possibly due to confusion about the effect of the preceding line
> > > ItemPointerSetOffsetNumber(tid, offnum).
> > >
> > > 2.  Commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
> > > fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
> > > offnum).
> >
> > Hm. It's not at all sure that it's ok to report the non-root tuple tid
> > here. I.e. I'm fairly sure there was a reason to not just set it to the
> > actual tid. I think I might have written that up on the list at some
> > point. Let me dig in memory and list. Obviously possible that that was
> > also obsoleted by parallel changes.
>
> Adding Heikki and Kevin.
>
> I haven't found your earlier discussion about that yet, but would be
> keen to read it if you can find it.   I wondered if your argument
> might have had something to do with heap pruning, but I can't find a
> problem there.  It's not as though the TID of any visible tuples
> change, it's just that some dead stuff goes away and the root line
> pointer is changed to LP_REDIRECT if it wasn't already.
>
> As for the argument for locking the tuple we emit rather than the HOT
> root, I think the questions are: (1) How exactly do we get away with
> locking only one version in a regular non-HOT update chain?  (2) Is it
> OK to do that with a HOT root?
>
> The answer to the first question is given in README-SSI[1].
> Unfortunately it doesn't discuss HOT directly, but I suspect the
> answer is no, HOT is not special here.  By my reading, it relies on
> the version you lock being the version visible to your snapshot, which
> is important because later updates have to touch that tuple to write
> the next version.  That doesn't apply to some arbitrarily older tuple
> that happens to be a HOT root.  Concretely, heap_update() does
> CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
> only going to produce a rw conflict if T1 took an SIREAD on precisely
> the version T2 locks in that path, not some arbitrarily older version
> that happens to be a HOT root.  A HOT root might never be considered
> again by concurrent writers, no?
>
If I understand the problem, this is the same serialization issue as
with in-place updates for zheap. I had a discussion with Kevin
regarding the same in this thread [1]. It seems if we're locking the
hot root id, we may report some false positive serializable errors.


> > > This must be in want of an isolation test, but I haven't yet tried to
> > > get my head around how to write a test that would show the difference.
> >
> > Indeed.
>
> One practical problem is that the only way to reach
> PredicateLockTuple() is from an index scan, and the index scan locks
> the index page (or the whole index, depending on
> rd_indam->ampredlocks).  So I think if you want to see a serialization
> anomaly you'll need multiple indexes (so that index page locks don't
> hide the problem), a long enough HOT chain and then probably several
> transactions to be able to miss a cycle that should be picked up by
> the logic in [1].  I'm out of steam for this problem today though.
>
> The simple test from the report[3] that resulted in commit 81fbbfe3352
> doesn't work for me (ie with permutation "r1" "r2" "w1" "w2" "c1" "c2"
> twice in a row).  The best I've come up with so far is an assertion
> that we predicate-lock the same row version that we emitted to the
> user, when reached via an index lookup that visits a HOT row.  The
> test outputs 'f' for master, but 't' with the change to heapam.c.
>
Here is an example from the multiple-row-versions isolation test which
fails if we perform in-place updates for zheap. I think the same will
be relevant if we lock root tuple id instead of the tuple itself.
Step 1: T1-> BEGIN; Read FROM t where id=100;
Step 2: T2-> BEGIN; UPDATE t where id=100; COMMIT; (creates T1->T2)
Step 3: T3-> BEGIN; UPDATE t where id=100; Read FROM t where id=50;
Step 4: T4-> BEGIN; UPDATE t where id= 50; Read FROM t where id=1;
COMMIT;  (creates T3->T4)
Step 5: T3-> COMMIT;
Step 6: T1-> UPDATE t where id=1; COMMIT;  (creates T4->T1)

At step 6, when the update statement is executed, T1 is rolled back
because of T3->T4->T1.

But for zheap, step 3 also creates a dependency T1->T3 because of
in-place update. When T4 commits in step 4, it marks T3 as doomed
because of T1 --> T3 --> T4. Hence, in step 5, T3 is rolled back.

[1]  Re: In-place updates and serializable transactions:
https://www.postgresql.org/message-id/CAGz5QCJzreUqJqHeXrbEs6xb0zCNKBHhOj6D9Tjd3btJTzydxg%40mail.gmail.com

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-08-05 Thread Thomas Munro
On Tue, Aug 6, 2019 at 4:35 AM Andres Freund  wrote:
> On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:
> > 1.  Commit dafaa3efb75 "Implement genuine serializable isolation
> > level." (2011) locked the root tuple, because it set t_self to *tid.
> > Possibly due to confusion about the effect of the preceding line
> > ItemPointerSetOffsetNumber(tid, offnum).
> >
> > 2.  Commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
> > fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
> > offnum).
>
> Hm. It's not at all sure that it's ok to report the non-root tuple tid
> here. I.e. I'm fairly sure there was a reason to not just set it to the
> actual tid. I think I might have written that up on the list at some
> point. Let me dig in memory and list. Obviously possible that that was
> also obsoleted by parallel changes.

Adding Heikki and Kevin.

I haven't found your earlier discussion about that yet, but would be
keen to read it if you can find it.   I wondered if your argument
might have had something to do with heap pruning, but I can't find a
problem there.  It's not as though the TID of any visible tuples
change, it's just that some dead stuff goes away and the root line
pointer is changed to LP_REDIRECT if it wasn't already.

As for the argument for locking the tuple we emit rather than the HOT
root, I think the questions are: (1) How exactly do we get away with
locking only one version in a regular non-HOT update chain?  (2) Is it
OK to do that with a HOT root?

The answer to the first question is given in README-SSI[1].
Unfortunately it doesn't discuss HOT directly, but I suspect the
answer is no, HOT is not special here.  By my reading, it relies on
the version you lock being the version visible to your snapshot, which
is important because later updates have to touch that tuple to write
the next version.  That doesn't apply to some arbitrarily older tuple
that happens to be a HOT root.  Concretely, heap_update() does
CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
only going to produce a rw conflict if T1 took an SIREAD on precisely
the version T2 locks in that path, not some arbitrarily older version
that happens to be a HOT root.  A HOT root might never be considered
again by concurrent writers, no?

As a minor consequence, the optimisation in
CheckTargetForConflictsIn() assumes that a tuple being updated has the
same tag as we locked when reading the tuple, which isn't the case if
we locked the root while reading but now have the TID for the version
we actually read, so in master we leak a tuple lock unnecessarily
until end-of-transaction when we update a HOT tuple.

> > This must be in want of an isolation test, but I haven't yet tried to
> > get my head around how to write a test that would show the difference.
>
> Indeed.

One practical problem is that the only way to reach
PredicateLockTuple() is from an index scan, and the index scan locks
the index page (or the whole index, depending on
rd_indam->ampredlocks).  So I think if you want to see a serialization
anomaly you'll need multiple indexes (so that index page locks don't
hide the problem), a long enough HOT chain and then probably several
transactions to be able to miss a cycle that should be picked up by
the logic in [1].  I'm out of steam for this problem today though.

The simple test from the report[3] that resulted in commit 81fbbfe3352
doesn't work for me (ie with permutation "r1" "r2" "w1" "w2" "c1" "c2"
twice in a row).  The best I've come up with so far is an assertion
that we predicate-lock the same row version that we emitted to the
user, when reached via an index lookup that visits a HOT row.  The
test outputs 'f' for master, but 't' with the change to heapam.c.

[1] Excerpt from README-SSI:

===
* PostgreSQL does not use "update in place" with a rollback log
for its MVCC implementation.  Where possible it uses "HOT" updates on
the same page (if there is room and no indexed value is changed).
For non-HOT updates the old tuple is expired in place and a new tuple
is inserted at a new location.  Because of this difference, a tuple
lock in PostgreSQL doesn't automatically lock any other versions of a
row.  We don't try to copy or expand a tuple lock to any other
versions of the row, based on the following proof that any additional
serialization failures we would get from that would be false
positives:

  o If transaction T1 reads a row version (thus acquiring a
predicate lock on it) and a second transaction T2 updates that row
version (thus creating a rw-conflict graph edge from T1 to T2), must a
third transaction T3 which re-updates the new version of the row also
have a rw-conflict in from T1 to prevent anomalies?  In other words,
does it matter whether we recognize the edge T1 -> T3?

  o If T1 has a conflict in, it certainly doesn't. Adding the
edge T1 -> T3 would create a dangerous structure, but we already had
one from the edge T1 -> T2, so we would have aborted something 

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-08-05 Thread Andres Freund
Hi,

On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:
> On Sat, Aug 3, 2019 at 11:56 AM Ashwin Agrawal  wrote:
> > On Wed, Jul 31, 2019 at 2:06 PM Andres Freund  wrote:
> >>   I'm also a bit confused why we don't need to pass in the offset of the
> >>  current tuple, rather than the HOT root tuple here. That's not related
> >>  to this patch. But aren't we locking the wrong tuple here, in case of
> >>  HOT?
> >
> > Yes, the root is being locked here instead of the HOT tuple. But I don't
> > have full context on the same. If we wish to fix it though, it can be easily
> > done now with the patch by passing "tid" instead of &(heapTuple->t_self).
> 
> Here are three relevant commits:

Thanks for digging!


> 1.  Commit dafaa3efb75 "Implement genuine serializable isolation
> level." (2011) locked the root tuple, because it set t_self to *tid.
> Possibly due to confusion about the effect of the preceding line
> ItemPointerSetOffsetNumber(tid, offnum).
> 
> 2.  Commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
> fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
> offnum).

Hm. It's not at all sure that it's ok to report the non-root tuple tid
here. I.e. I'm fairly sure there was a reason to not just set it to the
actual tid. I think I might have written that up on the list at some
point. Let me dig in memory and list. Obviously possible that that was
also obsoleted by parallel changes.


> 3.  Commit b89e151054a "Introduce logical decoding." (2014) also did
> ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer),
> offnum), for the benefit of historical MVCC snapshots (unnecessarily,
> considering the change in the commit #2), but then, intending to
> "reset to original, non-redirected, tid", clobbered it, reintroducing
> the bug fixed by #2.

> My first guess is that commit #3 above was developed before commit #2,
> and finished up clobbering it.

Yea, that sounds likely.


> This must be in want of an isolation test, but I haven't yet tried to
> get my head around how to write a test that would show the difference.

Indeed.

Greetings,

Andres Freund




Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-08-05 Thread Thomas Munro
On Sat, Aug 3, 2019 at 11:56 AM Ashwin Agrawal  wrote:
> On Wed, Jul 31, 2019 at 2:06 PM Andres Freund  wrote:
>>   I'm also a bit confused why we don't need to pass in the offset of the
>>  current tuple, rather than the HOT root tuple here. That's not related
>>  to this patch. But aren't we locking the wrong tuple here, in case of
>>  HOT?
>
> Yes, the root is being locked here instead of the HOT tuple. But I don't
> have full context on the same. If we wish to fix it though, it can be easily
> done now with the patch by passing "tid" instead of &(heapTuple->t_self).

Here are three relevant commits:

1.  Commit dafaa3efb75 "Implement genuine serializable isolation
level." (2011) locked the root tuple, because it set t_self to *tid.
Possibly due to confusion about the effect of the preceding line
ItemPointerSetOffsetNumber(tid, offnum).

2.  Commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
offnum).

3.  Commit b89e151054a "Introduce logical decoding." (2014) also did
ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer),
offnum), for the benefit of historical MVCC snapshots (unnecessarily,
considering the change in the commit #2), but then, intending to
"reset to original, non-redirected, tid", clobbered it, reintroducing
the bug fixed by #2.

My first guess is that commit #3 above was developed before commit #2,
and finished up clobbering it.  In fact, both logical decoding and SSI
want offnum, so we should be able to just remove the "reset" bit
(perhaps like in the attached sketch, not really tested, though it
passes).  This must be in want of an isolation test, but I haven't yet
tried to get my head around how to write a test that would show the
difference.

-- 
Thomas Munro
https://enterprisedb.com
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 94309949fa..107605d276 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1576,7 +1576,6 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 		heapTuple->t_data = (HeapTupleHeader) PageGetItem(dp, lp);
 		heapTuple->t_len = ItemIdGetLength(lp);
 		heapTuple->t_tableOid = RelationGetRelid(relation);
-		ItemPointerSetOffsetNumber(&heapTuple->t_self, offnum);
 
 		/*
 		 * Shouldn't see a HEAP_ONLY tuple at chain start.
@@ -1608,15 +1607,14 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 			 * of the root tuple of the HOT chain. This is important because
 			 * the *Satisfies routine for historical mvcc snapshots needs the
 			 * correct tid to decide about the visibility in some cases.
+			 * SSI also needs offnum.
 			 */
-			ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer), offnum);
+			ItemPointerSetOffsetNumber(&heapTuple->t_self, offnum);
 
 			/* If it's visible per the snapshot, we must return it */
 			valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);
 			CheckForSerializableConflictOut(valid, relation, heapTuple,
 											buffer, snapshot);
-			/* reset to original, non-redirected, tid */
-			heapTuple->t_self = *tid;
 
 			if (valid)
 			{


Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-08-02 Thread Ashwin Agrawal
On Wed, Jul 31, 2019 at 2:06 PM Andres Freund  wrote:

> Looking at the code as of master, we currently have:
>

Super awesome feedback and insights, thank you!

- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
>   out whether the tuple has been locked by the current
>   transaction. That check afaict just should be
>   TransactionIdIsCurrentTransactionId(), without all the other
>   stuff that's done today.
>

Agreed. The attached v1-0002 patch does that now. Please let me know if that's
what you meant.
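
Concretely, the simplification agreed to here amounts to roughly the following
before/after, shown only to make the idea concrete (illustrative fragments, not
the literal patch hunks):

/* Roughly what PredicateLockTuple() does today for heap tuples: map the
 * tuple's xmin to its topmost xid and compare against our top-level xid. */
if (TransactionIdEquals(SubTransGetTopmostTransaction(HeapTupleHeaderGetXmin(tuple->t_data)),
						GetTopTransactionIdIfAny()))
	return;					/* we wrote it; we already hold a write lock */

/* Roughly what it can become: ask xact.c directly, with no pg_subtrans
 * lookup involved. */
if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple->t_data)))
	return;					/* we wrote it; we already hold a write lock */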

  TransactionIdIsCurrentTransactionId() imo ought to be optimized to
>   always check for the top level transactionid first - that's a good bet
>   today, but even moreso for the upcoming AMs that won't have separate
>   xids for subtransactions.  Alternatively we shouldn't make that a
>   binary search for each subtrans level, but just have a small
>   simplehash hashtable for xids.
>

The v1-0001 patch checks GetTopTransactionIdIfAny() first in
TransactionIdIsCurrentTransactionId(), which seems better in general and even
more so for the future. That mostly meets the needs of the current discussion.

The alternative of not using a binary search seems like a bigger refactoring
and should be handled as a separate optimization exercise outside of this thread.
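
For concreteness, the check the v1-0001 patch adds first is roughly the
following shape (a sketch only, with the existing subtransaction search elided;
not the literal patch):

bool
TransactionIdIsCurrentTransactionId(TransactionId xid)
{
	/* Permanent/invalid xids are never "ours". */
	if (!TransactionIdIsNormal(xid))
		return false;

	/*
	 * Fast path: the top-level xid, if one has been assigned, is by far the
	 * most likely match, and for AMs that never hand out subtransaction xids
	 * it is the only possible match.
	 */
	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
		return true;

	/*
	 * Slow path: walk the transaction state stack and binary-search each
	 * level's child-xid array, as the existing code does (elided).
	 */
	return false;
}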


> - CheckForSerializableConflictOut() wants to get the toplevel xid for
>   the tuple, because that's the one the predicate hashtable stores.
>
>   In your patch you've already moved the HTSV() call etc out of
>   CheckForSerializableConflictOut(). I'm somewhat inclined to think that
>   the SubTransGetTopmostTransaction() call ought to go along with that.
>   I don't really think that belongs in predicate.c, especially if
>   most/all new AMs don't use subtransaction ids.
>
>   The only downside is that currently the
>   TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
>   avoids the SubTransGetTopmostTransaction() check.
>
>   But again, the better fix for that seems to be to improve the generic
>   code. As written the check won't prevent a subtrans lookup for heap
>   when subtransactions are in use, and it's IME pretty common for tuples
>   to get looked at again in the transaction that has created them. So
>   I'm somewhat inclined to think that SubTransGetTopmostTransaction()
>   should have a fast-path for the current transaction - probably just
>   employing TransactionIdIsCurrentTransactionId().
>

That optimization, as Kuntal also mentioned, seems like something which can
be done on top of the current patch afterwards.


> I don't really see what we gain by having the subtrans handling in the
> predicate code. Especially given that we've already moved the HTSV()
> handling out, it seems architecturally the wrong place to me - but I
> admit that that's a fuzzy argument.  The relevant mapping should be one
> line in the caller.
>

Okay, I moved the subtransaction handling out of
CheckForSerializableConflictOut() and have it alongside HTSV() now.

The reason I felt like leaving the subtransaction handling in a generic place
was that it might be premature to think no future AM will need it. Plus, it is
easier if all the serializable function APIs have the same expectations: for
example, PredicateLockTuple() can be passed a top or subtransaction id and
handles either, but with the change CheckForSerializableConflictOut() can only
be fed a top transaction ID. Still, that's fine, and I can see the point that
an AM needing it can easily get the top transaction ID itself and feed it in,
as heap does.
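
The resulting division of labour is roughly as sketched below; this is a
simplified illustration of the shape described above, not the exact code in
the patch:

/*
 * Heap-specific wrapper (simplified sketch): do the AM-specific work here,
 * then hand a top-level xid to the generic serializable check in predicate.c.
 */
void
HeapCheckForSerializableConflictOut(bool visible, Relation relation,
									HeapTuple tuple, Buffer buffer,
									Snapshot snapshot)
{
	TransactionId xid = InvalidTransactionId;

	if (!CheckForSerializableConflictOutNeeded(relation, snapshot))
		return;

	/*
	 * AM-specific part (elided): use HeapTupleSatisfiesVacuum() and the
	 * visibility result to decide which xid (xmin or xmax), if any, is the
	 * interesting one for conflict-out purposes.
	 */

	if (!TransactionIdIsValid(xid) || TransactionIdIsCurrentTransactionId(xid))
		return;

	/* Heap hands out subtransaction xids, so map to the top-level xid here. */
	xid = SubTransGetTopmostTransaction(xid);

	CheckForSerializableConflictOut(relation, xid, snapshot);
}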


> I wonder if it'd be worth combining the
> TransactionIdIsCurrentTransactionId() calls in the heap cases that
> currently do both, PredicateLockTuple() and
> HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
> isn't commonly that hot a path, but heap_hot_search_buffer() is.
>

Maybe; I will give that some thought separately from the current patch.


> Minor notes:
> - I don't think 'insert_xid' is necessarily great - it could also be the
>   updating xid etc. And while you can argue that an update is an insert
>   in the current heap, that's not the case for future AMs.
> - to me
> @@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
>  			if (valid)
>  			{
>  				ItemPointerSetOffsetNumber(tid, offnum);
> -				PredicateLockTuple(relation, heapTuple, snapshot);
> +				PredicateLockTID(relation, &(heapTuple)->t_self, snapshot,
> +								 HeapTupleHeaderGetXmin(heapTuple->t_data));
>  				if (all_dead)
>  					*all_dead = false;
>  				return true;
>
>  What are those parens - as placed they can't do anything. Did you
>  intend to write &(heapTuple->t_self)? Even that is pretty superfluous,
>  but it at least clarifies the precedence.
>

Fixed. No idea what I was thinking there; most likely I intended to have it
as &(heapTuple->t_self).

  I'm also a bit 

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-08-01 Thread Kuntal Ghosh
On Thu, Aug 1, 2019 at 2:36 AM Andres Freund  wrote:
> > > >
> > > > I think the only part its doing for sub-transaction is
> > > > SubTransGetTopmostTransaction(xid). If xid passed to this function is
> > > > already top most transaction which is case for zheap and zedstore, then
> > > > there is no downside to keeping that code here in common place.
> > >
> > > Well, it's far from a cheap function. It'll do unnecessary on-disk
> > > lookups in many cases. I'd call that quite a downside.
> > >
> >
> > Okay, agreed, it's a costly function and better to avoid the call if possible.
> >
> > Instead of moving the handling out of the function, how do you feel about
> > adding a boolean isTopTransactionId argument to
> > CheckForSerializableConflictOut()? The AMs which implicitly know they only
> > pass a top transaction Id to this function can pass true and avoid the
> > call to SubTransGetTopmostTransaction(xid). With this, the
> > subtransaction code remains in a generic place and AMs intending to use it
> > continue to leverage the common code, plus it explicitly clarifies the
> > behavior as well.
>
> Looking at the code as of master, we currently have:
>
> - PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
> >   out whether the tuple has been locked by the current
>   transaction. That check afaict just should be
>   TransactionIdIsCurrentTransactionId(), without all the other
>   stuff that's done today.
>
Yeah, this is the only part where predicate locking uses the subxids.
Since predicate locking always uses the top xid, IMHO it'll be good
to make this API independent of subxids.

>   TransactionIdIsCurrentTransactionId() imo ought to be optimized to
>   always check for the top level transactionid first - that's a good bet
>   today, but even moreso for the upcoming AMs that won't have separate
>   xids for subtransactions.  Alternatively we shouldn't make that a
>   binary search for each subtrans level, but just have a small
>   simplehash hashtable for xids.
A check for the top transaction id first and usage of simplehash sound like
good optimizations. But I'm not sure whether these changes should be
part of this patch or a separate one.

-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com




Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-07-31 Thread Andres Freund
Hi,

On 2019-07-31 12:37:58 -0700, Ashwin Agrawal wrote:
> On Wed, Jul 31, 2019 at 10:55 AM Andres Freund  wrote:
>
> > Hi,
> >
> > On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:
> > > On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro 
> > wrote:
> > >
> > > > On Tue, Jun 25, 2019 at 6:02 AM Andres Freund 
> > wrote:
> > > > > > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> > > > > >   buffer, instead just takes xid. Push heap specific parts from
> > > > > >   CheckForSerializableConflictOut() into its own function
> > > > > >   HeapCheckForSerializableConflictOut() which calls
> > > > > >   CheckForSerializableConflictOut(). The alternative option could
> > be
> > > > > >   CheckForSerializableConflictOut() take callback function and
> > > > > >   callback arguments, which gets called if required after
> > performing
> > > > >   prechecks. Though currently I feel AM having its own wrapper to
> > > > > >   perform AM specific task and then calling
> > > > > >   CheckForSerializableConflictOut() is fine.
> > > > >
> > > > > I think it's right to move the xid handling out of
> > > > > CheckForSerializableConflictOut(). But I think we also ought to move
> > the
> > > > > subtransaction handling out of the function - e.g. zheap doesn't
> > > > > want/need that.
> > > >
> > > > Thoughts on this Ashwin?
> > > >
> > >
> > > I think the only part it's doing for sub-transactions is
> > > SubTransGetTopmostTransaction(xid). If the xid passed to this function is
> > > already the topmost transaction, which is the case for zheap and zedstore,
> > > then there is no downside to keeping that code here in a common place.
> >
> > Well, it's far from a cheap function. It'll do unnecessary on-disk
> > lookups in many cases. I'd call that quite a downside.
> >
>
> Okay, agreed, it's a costly function and better to avoid the call if possible.
>
> Instead of moving the handling out of the function, how do you feel about
> adding a boolean isTopTransactionId argument to
> CheckForSerializableConflictOut()? The AMs which implicitly know they only
> pass a top transaction Id to this function can pass true and avoid the
> call to SubTransGetTopmostTransaction(xid). With this, the
> subtransaction code remains in a generic place and AMs intending to use it
> continue to leverage the common code, plus it explicitly clarifies the
> behavior as well.

Looking at the code as of master, we currently have:

- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
  out whether the tuple has been locked by the current
  transaction. That check afaict just should be
  TransactionIdIsCurrentTransactionId(), without all the other
  stuff that's done today.

  TransactionIdIsCurrentTransactionId() imo ought to be optimized to
  always check for the top level transactionid first - that's a good bet
  today, but even moreso for the upcoming AMs that won't have separate
  xids for subtransactions.  Alternatively we shouldn't make that a
  binary search for each subtrans level, but just have a small
  simplehash hashtable for xids.

- CheckForSerializableConflictOut() wants to get the toplevel xid for
  the tuple, because that's the one the predicate hashtable stores.

  In your patch you've already moved the HTSV() call etc out of
  CheckForSerializableConflictOut(). I'm somewhat inclined to think that
  the SubTransGetTopmostTransaction() call ought to go along with that.
  I don't really think that belongs in predicate.c, especially if
  most/all new AMs don't use subtransaction ids.

  The only downside is that currently the
  TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
  avoids the SubTransGetTopmostTransaction() check.

  But again, the better fix for that seems to be to improve the generic
  code. As written the check won't prevent a subtrans lookup for heap
  when subtransactions are in use, and it's IME pretty common for tuples
  to get looked at again in the transaction that has created them. So
  I'm somewhat inclined to think that SubTransGetTopmostTransaction()
  should have a fast-path for the current transaction - probably just
  employing TransactionIdIsCurrentTransactionId().
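
A minimal sketch of that last suggestion, purely to make the shape concrete (an
assumption about how such a fast path could look, not an actual patch):

TransactionId
SubTransGetTopmostTransaction(TransactionId xid)
{
	/*
	 * Hypothetical fast path: if the xid belongs to our own (sub)transaction,
	 * xact.c can answer from backend-local state, so no pg_subtrans (and hence
	 * potentially on-disk) lookup is needed.
	 */
	if (TransactionIdIsCurrentTransactionId(xid))
		return GetTopTransactionIdIfAny();

	/*
	 * Otherwise fall through to the existing pg_subtrans parent-chasing loop
	 * (elided here).
	 */
	return xid;
}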

I don't really see what we gain by having the subtrans handling in the
predicate code. Especially given that we've already moved the HTSV()
handling out, it seems architecturally the wrong place to me - but I
admit that that's a fuzzy argument.  The relevant mapping should be one
line in the caller.

I wonder if it'd be worth combining the
TransactionIdIsCurrentTransactionId() calls in the heap cases that
currently do both, PredicateLockTuple() and
HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
isn't commonly that hot a path, but heap_hot_search_buffer() is.


Minor notes:
- I don't think 'insert_xid' is necessarily great - it could also be the
  updating xid etc. And while you can argue that an update is an insert
  in the current heap, that's not the case for future AMs.
- to me
@@ -1621,7 +1622,8 

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-07-31 Thread Ashwin Agrawal
On Wed, Jul 31, 2019 at 12:37 PM Ashwin Agrawal  wrote:

>
> On Wed, Jul 31, 2019 at 10:55 AM Andres Freund  wrote:
>
>> Hi,
>>
>> On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:
>> > On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro 
>> wrote:
>> >
>> > > On Tue, Jun 25, 2019 at 6:02 AM Andres Freund 
>> wrote:
>> > > > > - CheckForSerializableConflictOut() no more takes HeapTuple nor
>> > > > >   buffer, instead just takes xid. Push heap specific parts from
>> > > > >   CheckForSerializableConflictOut() into its own function
>> > > > >   HeapCheckForSerializableConflictOut() which calls
>> > > > >   CheckForSerializableConflictOut(). The alternative option could
>> be
>> > > > >   CheckForSerializableConflictOut() take callback function and
>> > > > >   callback arguments, which gets called if required after
>> performing
>> > > > >   prechecks. Though currently I feel AM having its own wrapper to
>> > > > >   perform AM specific task and then calling
>> > > > >   CheckForSerializableConflictOut() is fine.
>> > > >
>> > > > I think it's right to move the xid handling out of
>> > > > CheckForSerializableConflictOut(). But I think we also ought to
>> move the
>> > > > subtransaction handling out of the function - e.g. zheap doesn't
>> > > > want/need that.
>> > >
>> > > Thoughts on this Ashwin?
>> > >
>> >
>> > I think the only part it's doing for sub-transactions is
>> > SubTransGetTopmostTransaction(xid). If the xid passed to this function is
>> > already the topmost transaction, which is the case for zheap and zedstore,
>> > then there is no downside to keeping that code here in a common place.
>>
>> Well, it's far from a cheap function. It'll do unnecessary on-disk
>> lookups in many cases. I'd call that quite a downside.
>>
>
> Okay, agreed, it's a costly function and better to avoid the call if possible.
>
> Instead of moving the handling out of the function, how do you feel about
> adding a boolean isTopTransactionId argument to
> CheckForSerializableConflictOut()? The AMs which implicitly know they only
> pass a top transaction Id to this function can pass true and avoid the
> call to SubTransGetTopmostTransaction(xid). With this, the
> subtransaction code remains in a generic place and AMs intending to use it
> continue to leverage the common code, plus it explicitly clarifies the
> behavior as well.
>

Added an argument to the function to make the subtransaction handling optional
in the attached v2 of the patch.
From 4ca67592f34b63cf80247068a128407d800c62a6 Mon Sep 17 00:00:00 2001
From: Ashwin Agrawal 
Date: Wed, 31 Jul 2019 13:53:52 -0700
Subject: [PATCH v2] Remove HeapTuple dependency for predicate locking
 functions.

Following changes help to make predicate locking functions generic and
not tied to particular AM.

- PredicateLockTuple() is renamed to PredicateLockTID(). It takes
  ItemPointer and inserting transaction id now instead of HeapTuple.

- CheckForSerializableConflictIn() takes blocknum instead of buffer

- CheckForSerializableConflictOut() no more takes HeapTuple nor buffer
---
 src/backend/access/gin/ginbtree.c|   2 +-
 src/backend/access/gin/ginfast.c |   2 +-
 src/backend/access/gin/gininsert.c   |   4 +-
 src/backend/access/gist/gist.c   |   2 +-
 src/backend/access/hash/hashinsert.c |   2 +-
 src/backend/access/heap/heapam.c | 104 --
 src/backend/access/heap/heapam_handler.c |   7 +-
 src/backend/access/index/indexam.c   |   4 +-
 src/backend/access/nbtree/nbtinsert.c|   4 +-
 src/backend/storage/lmgr/predicate.c | 134 ---
 src/include/access/heapam.h  |   2 +
 src/include/storage/predicate.h  |  10 +-
 12 files changed, 162 insertions(+), 115 deletions(-)

diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index 11a8ed7bbc2..e795375495b 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -89,7 +89,7 @@ ginFindLeafPage(GinBtree btree, bool searchMode,
 	stack->predictNumber = 1;
 
 	if (rootConflictCheck)
-		CheckForSerializableConflictIn(btree->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(btree->index, NULL, btree->rootBlkno);
 
 	for (;;)
 	{
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 439a91b3e61..d7b52476817 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -246,7 +246,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
 	 * tree, so it conflicts with all serializable scans.  All scans acquire a
 	 * predicate lock on the metabuffer to represent that.
 	 */
-	CheckForSerializableConflictIn(index, NULL, metabuffer);
+	CheckForSerializableConflictIn(index, NULL, GIN_METAPAGE_BLKNO);
 
 	if (collector->sumsize + collector->ntuples * sizeof(ItemIdData) > GinListPageSize)
 	{
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 55eab146173..046a20a3d41 100644
--- 

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-07-31 Thread Ashwin Agrawal
On Wed, Jul 31, 2019 at 10:55 AM Andres Freund  wrote:

> Hi,
>
> On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:
> > On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro 
> wrote:
> >
> > > On Tue, Jun 25, 2019 at 6:02 AM Andres Freund 
> wrote:
> > > > > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> > > > >   buffer, instead just takes xid. Push heap specific parts from
> > > > >   CheckForSerializableConflictOut() into its own function
> > > > >   HeapCheckForSerializableConflictOut() which calls
> > > > >   CheckForSerializableConflictOut(). The alternative option could
> be
> > > > >   CheckForSerializableConflictOut() take callback function and
> > > > >   callback arguments, which gets called if required after
> performing
> > > > >   prechecks. Though currently I feel AM having its own wrapper to
> > > > >   perform AM specific task and then calling
> > > > >   CheckForSerializableConflictOut() is fine.
> > > >
> > > > I think it's right to move the xid handling out of
> > > > CheckForSerializableConflictOut(). But I think we also ought to move
> the
> > > > subtransaction handling out of the function - e.g. zheap doesn't
> > > > want/need that.
> > >
> > > Thoughts on this Ashwin?
> > >
> >
> > I think the only part it's doing for sub-transactions is
> > SubTransGetTopmostTransaction(xid). If the xid passed to this function is
> > already the topmost transaction, which is the case for zheap and zedstore,
> > then there is no downside to keeping that code here in a common place.
>
> Well, it's far from a cheap function. It'll do unnecessary on-disk
> lookups in many cases. I'd call that quite a downside.
>

Okay, agree, its costly function and better to avoid the call if possible.

Instead of moving the handling out of the function, how do you feel about
adding a boolean isTopTransactionId argument to
CheckForSerializableConflictOut()? AMs that implicitly know they only pass
top-level transaction ids to this function can pass true and avoid the call
to SubTransGetTopmostTransaction(xid). With this, the subtransaction code
remains in the generic place, AMs intending to use it continue to leverage
the common code, and the expected behavior is made explicit as well.
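
To make the idea concrete, a minimal sketch of how the flag could be wired in
(only an illustration; the exact signature is assumed here, not taken from the
posted patch):

/* sketch of a predicate.c fragment, not the actual patch */
#include "postgres.h"
#include "access/subtrans.h"
#include "access/transam.h"
#include "storage/predicate.h"

void
CheckForSerializableConflictOut(Relation relation, TransactionId xid,
								Snapshot snapshot, bool isTopTransactionId)
{
	/* SerializationNeededForRead() is the existing static helper here */
	if (!SerializationNeededForRead(relation, snapshot))
		return;

	/*
	 * AMs such as zheap or zedstore, which only ever pass top-level xids,
	 * set isTopTransactionId = true and skip the potentially expensive
	 * pg_subtrans lookup; heap passes false and keeps today's behavior.
	 */
	if (!isTopTransactionId)
		xid = SubTransGetTopmostTransaction(xid);

	/* ... rest of the existing conflict-out logic, keyed on the xid ... */
}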


Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-07-31 Thread Andres Freund
Hi,

On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:
> On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro  wrote:
> 
> > On Tue, Jun 25, 2019 at 6:02 AM Andres Freund  wrote:
> > > > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> > > >   buffer, instead just takes xid. Push heap specific parts from
> > > >   CheckForSerializableConflictOut() into its own function
> > > >   HeapCheckForSerializableConflictOut() which calls
> > > >   CheckForSerializableConflictOut(). The alternative option could be
> > > >   CheckForSerializableConflictOut() take callback function and
> > > >   callback arguments, which gets called if required after performing
> > > >   prechecks. Though currently I feel AM having its own wrapper to
> > > >   perform AM specific task and then calling
> > > >   CheckForSerializableConflictOut() is fine.
> > >
> > > I think it's right to move the xid handling out of
> > > CheckForSerializableConflictOut(). But I think we also ought to move the
> > > subtransaction handling out of the function - e.g. zheap doesn't
> > > want/need that.
> >
> > Thoughts on this Ashwin?
> >
> 
> I think the only sub-transaction handling it does is
> SubTransGetTopmostTransaction(xid). If the xid passed to this function is
> already the topmost transaction, which is the case for zheap and zedstore,
> then there is no downside to keeping that code here in the common place.

Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.

Greetings,

Andres Freund
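
For concreteness, one possible shape of the alternative implied above: the
heap wrapper resolves the top-level xid itself, so the generic check can
assume it is handed a top-level xid, and table AMs that never store
subtransaction xids skip the pg_subtrans lookup altogether. The function
names follow the proposal, but the bodies and signatures below are
assumptions, not the posted patch:

/* sketch of a heapam.c wrapper; not the actual patch */
#include "postgres.h"
#include "access/heapam.h"
#include "access/htup_details.h"
#include "access/subtrans.h"
#include "access/transam.h"
#include "storage/predicate.h"

void
HeapCheckForSerializableConflictOut(Relation relation, HeapTuple tuple,
									Buffer buffer, Snapshot snapshot)
{
	TransactionId xid;

	/*
	 * Heap-specific part: decide which xid is "in doubt" for this tuple.
	 * The real thing would use the buffer to check visibility and may pick
	 * xmax rather than xmin; this sketch simply takes xmin.
	 */
	xid = HeapTupleHeaderGetXmin(tuple->t_data);
	if (!TransactionIdIsNormal(xid))
		return;

	/*
	 * Heap tuples can carry subtransaction xids, so resolve to the top-level
	 * xid before calling the generic check.  AMs that only store top-level
	 * xids (zheap, zedstore) would call CheckForSerializableConflictOut()
	 * directly and never pay for this lookup.
	 */
	xid = SubTransGetTopmostTransaction(xid);

	CheckForSerializableConflictOut(relation, xid, snapshot);
}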




Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-07-31 Thread Andres Freund
Hi,

On 2019-07-31 09:57:58 +1200, Thomas Munro wrote:
> On Tue, Jun 25, 2019 at 6:02 AM Andres Freund  wrote:
> > Hm.  I wonder if we somehow ought to generalize the granularity scheme
> > for predicate locks to not be tuple/page/relation.  But even if, that's
> > probably a separate patch.
> 
> What do you have in mind?

My concern is that continuing to infer the granularity levels from the tid
doesn't seem like a great path forward. An AM's use of tids might not
necessarily be very amenable to that, if the mapping isn't actually block
based.


> Perhaps you just want to give those things different labels, "TID
> range" instead of page, for the benefit of "logical" TID users?
> Perhaps you want to permit more levels?  That seems premature as long
> as TIDs are defined in terms of blocks and offsets, so this stuff is
> reflected all over the source tree...

I'm mostly wondering if the different levels shouldn't be computed
outside of predicate.c.

Greetings,

Andres Freund
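
One concrete way to read that idea (purely hypothetical, nothing like this
exists or was posted): let the AM build the predicate lock target itself, so
predicate.c never has to guess what "page" or "tuple" means for that AM's
tids.

/* hypothetical sketch; PredicateLockTarget() does not exist */
#include "postgres.h"
#include "storage/predicate.h"
#include "storage/predicate_internals.h"
#include "utils/rel.h"

static void
am_lock_logical_row(Relation rel, BlockNumber logicalBlock,
					OffsetNumber logicalOffset, Snapshot snapshot)
{
	PREDICATELOCKTARGETTAG tag;

	/* The AM, not predicate.c, decides what tuple granularity means here. */
	SET_PREDICATELOCKTARGETTAG_TUPLE(tag,
									 rel->rd_node.dbNode,
									 rel->rd_id,
									 logicalBlock,
									 logicalOffset);

	/* hand the prebuilt target to a (hypothetical) generic entry point */
	PredicateLockTarget(rel, &tag, snapshot);
}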




Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-07-31 Thread Ashwin Agrawal
On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro  wrote:

> On Tue, Jun 25, 2019 at 6:02 AM Andres Freund  wrote:
> > > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> > >   buffer, instead just takes xid. Push heap specific parts from
> > >   CheckForSerializableConflictOut() into its own function
> > >   HeapCheckForSerializableConflictOut() which calls
> > >   CheckForSerializableConflictOut(). The alternative option could be
> > >   CheckForSerializableConflictOut() take callback function and
> > >   callback arguments, which gets called if required after performing
> > >   prechecks. Though currently I feel AM having its own wrapper to
> > >   perform AM specific task and then calling
> > >   CheckForSerializableConflictOut() is fine.
> >
> > I think it's right to move the xid handling out of
> > CheckForSerializableConflictOut(). But I think we also ought to move the
> > subtransaction handling out of the function - e.g. zheap doesn't
> > want/need that.
>
> Thoughts on this Ashwin?
>

I think the only sub-transaction handling it does is
SubTransGetTopmostTransaction(xid). If the xid passed to this function is
already the topmost transaction, which is the case for zheap and zedstore,
then there is no downside to keeping that code here in the common place.


Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-07-30 Thread Thomas Munro
On Tue, Jun 25, 2019 at 6:02 AM Andres Freund  wrote:
> On 2019-06-24 10:41:06 -0700, Ashwin Agrawal wrote:
> > Proposing following changes to make predicate locking and checking
> > functions generic and remove dependency on HeapTuple and Heap AM. We
> > made these changes to help with Zedstore. I think the changes should
> > help Zheap and other AMs in general.
>
> Indeed.

+1

> > - Change PredicateLockTuple() to PredicateLockTID(). So, instead of
> >   passing HeapTuple to it, just pass ItemPointer and tuple insert
> >   transaction id if known. This was also discussed earlier in [1],
> >   don't think was done in that context but would be helpful in future
> >   if such requirement comes up as well.
>
> Right.

+1

> > - CheckForSerializableConflictIn() take blocknum instead of
> >   buffer. Currently, the function anyways does nothing with the buffer
> >   just needs blocknum. Also, helps to decouple dependency on buffer as
> >   not all AMs may have one to one notion between blocknum and single
> >   buffer. Like for zedstore, tuple is stored across individual column
> >   buffers. So, wish to have way to lock not physical buffer but
> >   logical blocknum.
>
> Hm.  I wonder if we somehow ought to generalize the granularity scheme
> for predicate locks to not be tuple/page/relation.  But even if, that's
> probably a separate patch.

What do you have in mind?  This is certainly the traditional way to do
lock hierarchies (archeological fun: we used to have
src/backend/storage/lock/multi.c, a "standard multi-level lock manager
as per the Gray paper", before commits 3f7fbf85 and e6e9e18e
decommissioned it;  SSI brought the concept back again in a parallel
lock manager), but admittedly it has assumptions about physical
storage baked into the naming.  Perhaps you just want to give those
things different labels, "TID range" instead of page, for the benefit
of "logical" TID users?  Perhaps you want to permit more levels?  That
seems premature as long as TIDs are defined in terms of blocks and
offsets, so this stuff is reflected all over the source tree...

> > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> >   buffer, instead just takes xid. Push heap specific parts from
> >   CheckForSerializableConflictOut() into its own function
> >   HeapCheckForSerializableConflictOut() which calls
> >   CheckForSerializableConflictOut(). The alternative option could be
> >   CheckForSerializableConflictOut() take callback function and
> >   callback arguments, which gets called if required after performing
> >   prechecks. Though currently I feel AM having its own wrapper to
> >   perform AM specific task and then calling
> >   CheckForSerializableConflictOut() is fine.
>
> I think it's right to move the xid handling out of
> CheckForSerializableConflictOut(). But I think we also ought to move the
> subtransaction handling out of the function - e.g. zheap doesn't
> want/need that.

Thoughts on this Ashwin?

> > Attaching patch which makes these changes.
>
> Please make sure that there's a CF entry for this (I'm in a plane with a
> super slow connection, otherwise I'd check).

This all makes sense, and I'd like to be able to commit this soon.  We
came up with something nearly identical for zheap.  I'm subscribing
Kuntal Ghosh to see if he has comments, as he worked on that.  The
main point of difference seems to be the assumption about how
subtransactions work.

-- 
Thomas Munro
https://enterprisedb.com




Re: Remove HeapTuple and Buffer dependency for predicate locking functions

2019-06-24 Thread Andres Freund
Hi,

On 2019-06-24 10:41:06 -0700, Ashwin Agrawal wrote:
> Proposing following changes to make predicate locking and checking
> functions generic and remove dependency on HeapTuple and Heap AM. We
> made these changes to help with Zedstore. I think the changes should
> help Zheap and other AMs in general.

Indeed.


> - Change PredicateLockTuple() to PredicateLockTID(). So, instead of
>   passing HeapTuple to it, just pass ItemPointer and tuple insert
>   transaction id if known. This was also discussed earlier in [1],
>   don't think was done in that context but would be helpful in future
>   if such requirement comes up as well.

Right.


> - CheckForSerializableConflictIn() take blocknum instead of
>   buffer. Currently, the function anyways does nothing with the buffer
>   just needs blocknum. Also, helps to decouple dependency on buffer as
>   not all AMs may have one to one notion between blocknum and single
>   buffer. Like for zedstore, tuple is stored across individual column
>   buffers. So, wish to have way to lock not physical buffer but
>   logical blocknum.

Hm.  I wonder if we somehow ought to generalize the granularity scheme
for predicate locks to not be tuple/page/relation.  But even if, that's
probably a separate patch.


> - CheckForSerializableConflictOut() no more takes HeapTuple nor
>   buffer, instead just takes xid. Push heap specific parts from
>   CheckForSerializableConflictOut() into its own function
>   HeapCheckForSerializableConflictOut() which calls
>   CheckForSerializableConflictOut(). The alternative option could be
>   CheckForSerializableConflictOut() take callback function and
>   callback arguments, which gets called if required after performing
>   prechecks. Though currently I feel AM having its own wrapper to
>   perform AM specific task and then calling
>   CheckForSerializableConflictOut() is fine.

I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to move the
subtransaction handling out of the function - e.g. zheap doesn't
want/need that.


> Attaching patch which makes these changes.

Please make sure that there's a CF entry for this (I'm in a plane with a
super slow connection, otherwise I'd check).

Greetings,

Andres Freund
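
Pulling the proposal together, the reworked entry points could end up looking
roughly like this (a sketch pieced together from the description above and
the posted diff; parameter lists are a best guess, not the committed
signatures):

/* storage/predicate.h -- AM-agnostic entry points (sketch) */
extern void PredicateLockTID(Relation relation, ItemPointer tid,
							 Snapshot snapshot, TransactionId insert_xid);
extern void CheckForSerializableConflictIn(Relation relation, ItemPointer tid,
										   BlockNumber blkno);
extern void CheckForSerializableConflictOut(Relation relation,
											TransactionId xid,
											Snapshot snapshot);

/* access/heapam.h -- heap keeps a tuple/buffer-aware wrapper (sketch) */
extern void HeapCheckForSerializableConflictOut(Relation relation,
												HeapTuple tuple, Buffer buffer,
												Snapshot snapshot);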