Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2017-01-31 Thread Michael Paquier
On Wed, Jan 25, 2017 at 2:53 PM, Michael Paquier
 wrote:
> The latest patch available still applies, one person has added his
> name (John G) in October though there have been no reviews. There have
> been a couple of arguments against this patch, and the thread has had
> no activity for the last month, so I would be incline to mark that as
> returned with feedback. Thoughts?

And done as such. Feel free to update if there is something fresh.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2017-01-24 Thread Michael Paquier
On Mon, Dec 5, 2016 at 11:32 AM, Haribabu Kommi
 wrote:
>
>
> On Tue, Nov 15, 2016 at 1:17 PM, Noah Misch  wrote:
>>
>> On Mon, Nov 14, 2016 at 10:21:53AM -0800, Peter Geoghegan wrote:
>> > BTW, I recently noticed that the latest version of Valgrind, 3.12,
>> > added this new feature:
>> >
>> > * Memcheck:
>> >
>> >   - Added meta mempool support for describing a custom allocator which:
>> >  - Auto-frees all chunks assuming that destroying a pool destroys
>> > all
>> >objects in the pool
>> >  - Uses itself to allocate other memory blocks
>> >
>> > It occurred to me that we might be able to make good use of this.
>>
>> PostgreSQL memory contexts don't acquire blocks from other contexts; they
>> all
>> draw from malloc() directly.  Therefore, I don't see this feature giving
>> us
>> something that the older Memcheck MEMPOOL support does not give us.
>
> Moved to next CF with "needs review" status.
> Please feel free to update the status if the current status doesn't reflect
> the status of the patch.

The latest patch available still applies, one person has added his
name (John G) in October though there have been no reviews. There have
been a couple of arguments against this patch, and the thread has had
no activity for the last month, so I would be incline to mark that as
returned with feedback. Thoughts?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-12-04 Thread Haribabu Kommi
On Tue, Nov 15, 2016 at 1:17 PM, Noah Misch  wrote:

> On Mon, Nov 14, 2016 at 10:21:53AM -0800, Peter Geoghegan wrote:
> > BTW, I recently noticed that the latest version of Valgrind, 3.12,
> > added this new feature:
> >
> > * Memcheck:
> >
> >   - Added meta mempool support for describing a custom allocator which:
> >  - Auto-frees all chunks assuming that destroying a pool destroys all
> >objects in the pool
> >  - Uses itself to allocate other memory blocks
> >
> > It occurred to me that we might be able to make good use of this.
>
> PostgreSQL memory contexts don't acquire blocks from other contexts; they
> all
> draw from malloc() directly.  Therefore, I don't see this feature giving us
> something that the older Memcheck MEMPOOL support does not give us.
>
>
Moved to next CF with "needs review" status.
Please feel free to update the status if the current status doesn't reflect
the status
of the patch.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Noah Misch
On Mon, Nov 14, 2016 at 10:21:53AM -0800, Peter Geoghegan wrote:
> BTW, I recently noticed that the latest version of Valgrind, 3.12,
> added this new feature:
> 
> * Memcheck:
> 
>   - Added meta mempool support for describing a custom allocator which:
>  - Auto-frees all chunks assuming that destroying a pool destroys all
>objects in the pool
>  - Uses itself to allocate other memory blocks
> 
> It occurred to me that we might be able to make good use of this.

PostgreSQL memory contexts don't acquire blocks from other contexts; they all
draw from malloc() directly.  Therefore, I don't see this feature giving us
something that the older Memcheck MEMPOOL support does not give us.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Peter Geoghegan
On Mon, Nov 14, 2016 at 10:17 AM, Andres Freund  wrote:
> I think so, yes. IIRC I discussed it with Noah and Peter G. at a
> conference recently. We'd basically mark the content of shared buffers
> inaccessible at backend startup, and mark it accessible whenever a
> PinBuffer() happens, and then inaccessible during unpinning. We probably
> have to exclude the page header though, as we intentionally access them
> unpinned in some cases IIRC.

BTW, I recently noticed that the latest version of Valgrind, 3.12,
added this new feature:

* Memcheck:

  - Added meta mempool support for describing a custom allocator which:
 - Auto-frees all chunks assuming that destroying a pool destroys all
   objects in the pool
 - Uses itself to allocate other memory blocks

It occurred to me that we might be able to make good use of this. To
be clear, I don't think that there is reason to tie it to adding the
PinBuffer() stuff, which we've been talking about for years now. It
just caught my eye.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Andres Freund
On 2016-11-14 13:12:28 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-11-14 12:32:53 -0500, Tom Lane wrote:
> >> Basically my concern is that this restriction isn't documented anywhere
> >> and I'm not entirely certain it's been adhered to everywhere.  I'd feel
> >> much better about it if there were some way we could verify that.
> 
> > Would support for valgrind complaining about access to unpinned buffers
> > suffice?
> 
> I don't think it directly addresses the issue, but certainly it'd help.

Well, it detects situations where removed pins cause "unprotected
access", but of course that doesn't protect against cases where
independent pins hide that issue.


> Do you think that's easily doable?

I think so, yes. IIRC I discussed it with Noah and Peter G. at a
conference recently. We'd basically mark the content of shared buffers
inaccessible at backend startup, and mark it accessible whenever a
PinBuffer() happens, and then inaccessible during unpinning. We probably
have to exclude the page header though, as we intentionally access them
unpinned in some cases IIRC.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Tom Lane
Andres Freund  writes:
> On 2016-11-14 12:32:53 -0500, Tom Lane wrote:
>> Basically my concern is that this restriction isn't documented anywhere
>> and I'm not entirely certain it's been adhered to everywhere.  I'd feel
>> much better about it if there were some way we could verify that.

> Would support for valgrind complaining about access to unpinned buffers
> suffice?

I don't think it directly addresses the issue, but certainly it'd help.
Do you think that's easily doable?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Andres Freund
On 2016-11-14 12:32:53 -0500, Tom Lane wrote:
> Heikki Linnakangas  writes:
> > On 11/14/2016 06:18 PM, Tom Lane wrote:
> >> You're implicitly assuming that a scan always returns its results in the
> >> same slot, and that no other slot could contain a copy of that data, but
> >> there is no guarantee of either.  See bug #14344 and d8589946d for a
> >> pretty recent example where that failed to be true --- admittedly, for
> >> a tuplesort scan not a table scan.
> 
> > It's the other way round. ExecProcNode might not always return its 
> > result in the same slot, but all the callers must assume that it might.
> 
> Basically my concern is that this restriction isn't documented anywhere
> and I'm not entirely certain it's been adhered to everywhere.  I'd feel
> much better about it if there were some way we could verify that.

Would support for valgrind complaining about access to unpinned buffers
suffice?

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Peter Geoghegan
On Mon, Nov 14, 2016 at 9:22 AM, Heikki Linnakangas  wrote:
> I think that difference in the API is exactly what caught Peter by surprise
> and led to bug #14344. And I didn't see it either, until you two debugged
> it.

That is accurate, of course.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Tom Lane
Heikki Linnakangas  writes:
> On 11/14/2016 06:18 PM, Tom Lane wrote:
>> You're implicitly assuming that a scan always returns its results in the
>> same slot, and that no other slot could contain a copy of that data, but
>> there is no guarantee of either.  See bug #14344 and d8589946d for a
>> pretty recent example where that failed to be true --- admittedly, for
>> a tuplesort scan not a table scan.

> It's the other way round. ExecProcNode might not always return its 
> result in the same slot, but all the callers must assume that it might.

Basically my concern is that this restriction isn't documented anywhere
and I'm not entirely certain it's been adhered to everywhere.  I'd feel
much better about it if there were some way we could verify that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Heikki Linnakangas

On 11/14/2016 06:18 PM, Tom Lane wrote:

Robert Haas  writes:

On Mon, Nov 14, 2016 at 10:23 AM, Tom Lane  wrote:

I don't believe Andres' claim anyway.  There are certainly cases where an
allegedly-valid slot could be pointing at garbage, but table scans aren't
one of them, precisely because of the pin held by the slot.  It would take
a fairly wide-ranging code review to convince me that it's okay to lose
that mechanism.



I don't understand your objection.  It seems to me that the
TupleTableSlot is holding a pin, and the scan is also holding a pin,
so one of them is redundant.  You speculated that the slot could
continue to point at the tuple after the scan has moved on, but how
could such a thing actually happen?


You're implicitly assuming that a scan always returns its results in the
same slot, and that no other slot could contain a copy of that data, but
there is no guarantee of either.  See bug #14344 and d8589946d for a
pretty recent example where that failed to be true --- admittedly, for
a tuplesort scan not a table scan.


It's the other way round. ExecProcNode might not always return its 
result in the same slot, but all the callers must assume that it might.


The tuplesort interface is slightly different: the caller passes a slot 
to tuplesort_gettupleslot() as argument, and tuplesort_gettupleslot() 
places the next tuple in that slot. With executor nodes, a node returns 
a slot that it allocated itself, or it got from a child node. After you 
call ExecProcNode(), you can *not* assume that the old tuple in the slot 
is still valid. The child node can, and in most cases will, reuse the 
same slot, overwriting its contents.


I think that difference in the API is exactly what caught Peter by 
surprise and led to bug #14344. And I didn't see it either, until you 
two debugged it.



Also, there might well be places that are relying on the ability of a
slot to hold a pin for slots that are not simply the return slot of
a plan node.  We could perhaps remove the *use* of this slot feature in
the normal table-scan case, without removing the feature itself.


I didn't see any such use.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Robert Haas
On Mon, Nov 14, 2016 at 11:18 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Nov 14, 2016 at 10:23 AM, Tom Lane  wrote:
>>> I don't believe Andres' claim anyway.  There are certainly cases where an
>>> allegedly-valid slot could be pointing at garbage, but table scans aren't
>>> one of them, precisely because of the pin held by the slot.  It would take
>>> a fairly wide-ranging code review to convince me that it's okay to lose
>>> that mechanism.
>
>> I don't understand your objection.  It seems to me that the
>> TupleTableSlot is holding a pin, and the scan is also holding a pin,
>> so one of them is redundant.  You speculated that the slot could
>> continue to point at the tuple after the scan has moved on, but how
>> could such a thing actually happen?
>
> You're implicitly assuming that a scan always returns its results in the
> same slot, and that no other slot could contain a copy of that data, but
> there is no guarantee of either.  See bug #14344 and d8589946d for a
> pretty recent example where that failed to be true --- admittedly, for
> a tuplesort scan not a table scan.
>
> We could certainly set up some global convention that would make this
> universally true, but I think we'd have to do a lot of code-reading
> to ensure that everything is following that convention.
>
> Also, there might well be places that are relying on the ability of a
> slot to hold a pin for slots that are not simply the return slot of
> a plan node.  We could perhaps remove the *use* of this slot feature in
> the normal table-scan case, without removing the feature itself.

I'm still not getting it.  We don't have to guess who is using this
feature; we can find it out by looking at every place that calls
ExecStoreTuple and looking at whether they are passing InvalidBuffer.
As Heikki's original patch upthread demonstrates, most of them are.
The exceptions are in BitmapHeapNext, IndexNext, ExecOnConflictUpdate,
SampleNext, SeqNext, and TidNext.  If all of those places are or can
be made to hold a buffer pin for as long as the slot would have held
the same pin, we're done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Tom Lane
Robert Haas  writes:
> On Mon, Nov 14, 2016 at 10:23 AM, Tom Lane  wrote:
>> I don't believe Andres' claim anyway.  There are certainly cases where an
>> allegedly-valid slot could be pointing at garbage, but table scans aren't
>> one of them, precisely because of the pin held by the slot.  It would take
>> a fairly wide-ranging code review to convince me that it's okay to lose
>> that mechanism.

> I don't understand your objection.  It seems to me that the
> TupleTableSlot is holding a pin, and the scan is also holding a pin,
> so one of them is redundant.  You speculated that the slot could
> continue to point at the tuple after the scan has moved on, but how
> could such a thing actually happen?

You're implicitly assuming that a scan always returns its results in the
same slot, and that no other slot could contain a copy of that data, but
there is no guarantee of either.  See bug #14344 and d8589946d for a
pretty recent example where that failed to be true --- admittedly, for
a tuplesort scan not a table scan.

We could certainly set up some global convention that would make this
universally true, but I think we'd have to do a lot of code-reading
to ensure that everything is following that convention.

Also, there might well be places that are relying on the ability of a
slot to hold a pin for slots that are not simply the return slot of
a plan node.  We could perhaps remove the *use* of this slot feature in
the normal table-scan case, without removing the feature itself.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Robert Haas
On Mon, Nov 14, 2016 at 10:23 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Sat, Nov 12, 2016 at 10:28 AM, Andres Freund  wrote:
>>> On 2016-08-30 07:38:10 -0400, Tom Lane wrote:
 I think this is probably wrong, or at least very dangerous to remove.
 The reason for the feature is that the slot may continue to point at
 the tuple after the scan has moved on.
>
>>> FWIW, that's not safe to assume in upper layers *anyway*. If you want to
>>> do that, the slot has to be materialized, and that'd make a local
>>> copy. If you don't materialize tts_values/isnull can point into random
>>> old memory (common e.g. for projections and virtual tuples in general).
>
>> So, I think you are arguing in favor of proceeding with this patch?
>
> I don't believe Andres' claim anyway.  There are certainly cases where an
> allegedly-valid slot could be pointing at garbage, but table scans aren't
> one of them, precisely because of the pin held by the slot.  It would take
> a fairly wide-ranging code review to convince me that it's okay to lose
> that mechanism.

I don't understand your objection.  It seems to me that the
TupleTableSlot is holding a pin, and the scan is also holding a pin,
so one of them is redundant.  You speculated that the slot could
continue to point at the tuple after the scan has moved on, but how
could such a thing actually happen?  Once the scan finds a tuple, it's
going to store the tuple in the slot and return.  It won't get control
back to advance the scan until the next time it's asked for a tuple,
and then it has to update the slot before returning.  So I don't see
the problem.  If in the future somebody wants to write an executor
node that does random extra work - like advancing the scan - before
returning the tuple the scan already found, they'll have to take
special precautions, but why should anybody want to do that?

I'm kind of puzzled, here.  Perhaps I am missing something obvious.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Andres Freund
On 2016-11-14 10:09:02 -0500, Robert Haas wrote:
> On Sat, Nov 12, 2016 at 10:28 AM, Andres Freund  wrote:
> > On 2016-08-30 07:38:10 -0400, Tom Lane wrote:
> >> Heikki Linnakangas  writes:
> >> > While profiling some queries and looking at executor overhead, I
> >> > realized that we're not making much use of TupleTableSlot's ability to
> >> > hold a buffer pin. In a SeqScan, the buffer is held pinned by the
> >> > underlying heap-scan anyway. Same with an IndexScan, and the SampleScan.
> >>
> >> I think this is probably wrong, or at least very dangerous to remove.
> >> The reason for the feature is that the slot may continue to point at
> >> the tuple after the scan has moved on.
> >
> > FWIW, that's not safe to assume in upper layers *anyway*. If you want to
> > do that, the slot has to be materialized, and that'd make a local
> > copy. If you don't materialize tts_values/isnull can point into random
> > old memory (common e.g. for projections and virtual tuples in general).
>
> So, I think you are arguing in favor of proceeding with this patch?

Not really, now. I don't buy the argument here against it. I do think
the overhead is quite noticeable. But I also think it has quite the
potential for subtle bugs.  I think I'd feel better if we had some form
of instrumentation trapping buffer accesses without pins present. We've
previously discussed doing that with valgrind...

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Tom Lane
Robert Haas  writes:
> On Sat, Nov 12, 2016 at 10:28 AM, Andres Freund  wrote:
>> On 2016-08-30 07:38:10 -0400, Tom Lane wrote:
>>> I think this is probably wrong, or at least very dangerous to remove.
>>> The reason for the feature is that the slot may continue to point at
>>> the tuple after the scan has moved on.

>> FWIW, that's not safe to assume in upper layers *anyway*. If you want to
>> do that, the slot has to be materialized, and that'd make a local
>> copy. If you don't materialize tts_values/isnull can point into random
>> old memory (common e.g. for projections and virtual tuples in general).

> So, I think you are arguing in favor of proceeding with this patch?

I don't believe Andres' claim anyway.  There are certainly cases where an
allegedly-valid slot could be pointing at garbage, but table scans aren't
one of them, precisely because of the pin held by the slot.  It would take
a fairly wide-ranging code review to convince me that it's okay to lose
that mechanism.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-14 Thread Robert Haas
On Sat, Nov 12, 2016 at 10:28 AM, Andres Freund  wrote:
> On 2016-08-30 07:38:10 -0400, Tom Lane wrote:
>> Heikki Linnakangas  writes:
>> > While profiling some queries and looking at executor overhead, I
>> > realized that we're not making much use of TupleTableSlot's ability to
>> > hold a buffer pin. In a SeqScan, the buffer is held pinned by the
>> > underlying heap-scan anyway. Same with an IndexScan, and the SampleScan.
>>
>> I think this is probably wrong, or at least very dangerous to remove.
>> The reason for the feature is that the slot may continue to point at
>> the tuple after the scan has moved on.
>
> FWIW, that's not safe to assume in upper layers *anyway*. If you want to
> do that, the slot has to be materialized, and that'd make a local
> copy. If you don't materialize tts_values/isnull can point into random
> old memory (common e.g. for projections and virtual tuples in general).

So, I think you are arguing in favor of proceeding with this patch?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-11-12 Thread Andres Freund
On 2016-08-30 07:38:10 -0400, Tom Lane wrote:
> Heikki Linnakangas  writes:
> > While profiling some queries and looking at executor overhead, I
> > realized that we're not making much use of TupleTableSlot's ability to
> > hold a buffer pin. In a SeqScan, the buffer is held pinned by the
> > underlying heap-scan anyway. Same with an IndexScan, and the SampleScan.
>
> I think this is probably wrong, or at least very dangerous to remove.
> The reason for the feature is that the slot may continue to point at
> the tuple after the scan has moved on.

FWIW, that's not safe to assume in upper layers *anyway*. If you want to
do that, the slot has to be materialized, and that'd make a local
copy. If you don't materialize tts_values/isnull can point into random
old memory (common e.g. for projections and virtual tuples in general).

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-10-02 Thread Michael Paquier
On Wed, Aug 31, 2016 at 6:03 AM, Andres Freund  wrote:
> On 2016-08-30 21:59:44 +0100, Greg Stark wrote:
>> On Tue, Aug 30, 2016 at 11:12 AM, Heikki Linnakangas  wrote:
>> > While profiling some queries and looking at executor overhead, I realized
>> > that we're not making much use of TupleTableSlot's ability to hold a buffer
>> > pin. In a SeqScan, the buffer is held pinned by the underlying heap-scan
>> > anyway. Same with an IndexScan, and the SampleScan. The only thing that
>> > relies on that feature is TidScan, but we could easily teach TidScan to 
>> > hold
>> > the buffer pin directly.
>> >
>> > So, how about we remove the ability of a TupleTableSlot to hold a buffer
>> > pin, per the attached patch? It shaves a few cycles from a ExecStoreTuple()
>> > and ExecClearTuple(), which get called a lot. I couldn't measure any actual
>> > difference from that, though, but it seems like a good idea from a
>> > readability point of view, anyway.
>>
>> Out of curiosity why go in this direction and not the other? Why not
>> move those other scans to use the TupleTableSlot API to manage the
>> pins. Offhand it sounds more readable not less to have the
>> TupleTableSlot be a self contained data structure that guarantees the
>> lifetime of the pins matches the slots rather than have a dependency
>> on the code structure in some far-away scan.
>
> At least for heap scans the pins are page level, and thus longer lived
> than the data in a slot. It's important that a scan holds a pin, because
> it needs to rely on the page not being hot pruned etc..

I have moved this patch to next CF. Heikki, are you still planning to
work on it?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-08-30 Thread Andres Freund
On 2016-08-30 21:59:44 +0100, Greg Stark wrote:
> On Tue, Aug 30, 2016 at 11:12 AM, Heikki Linnakangas  wrote:
> > While profiling some queries and looking at executor overhead, I realized
> > that we're not making much use of TupleTableSlot's ability to hold a buffer
> > pin. In a SeqScan, the buffer is held pinned by the underlying heap-scan
> > anyway. Same with an IndexScan, and the SampleScan. The only thing that
> > relies on that feature is TidScan, but we could easily teach TidScan to hold
> > the buffer pin directly.
> >
> > So, how about we remove the ability of a TupleTableSlot to hold a buffer
> > pin, per the attached patch? It shaves a few cycles from a ExecStoreTuple()
> > and ExecClearTuple(), which get called a lot. I couldn't measure any actual
> > difference from that, though, but it seems like a good idea from a
> > readability point of view, anyway.
> 
> Out of curiosity why go in this direction and not the other? Why not
> move those other scans to use the TupleTableSlot API to manage the
> pins. Offhand it sounds more readable not less to have the
> TupleTableSlot be a self contained data structure that guarantees the
> lifetime of the pins matches the slots rather than have a dependency
> on the code structure in some far-away scan.

At least for heap scans the pins are page level, and thus longer lived
than the data in a slot. It's important that a scan holds a pin, because
it needs to rely on the page not being hot pruned etc..


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-08-30 Thread Greg Stark
On Tue, Aug 30, 2016 at 11:12 AM, Heikki Linnakangas  wrote:
> While profiling some queries and looking at executor overhead, I realized
> that we're not making much use of TupleTableSlot's ability to hold a buffer
> pin. In a SeqScan, the buffer is held pinned by the underlying heap-scan
> anyway. Same with an IndexScan, and the SampleScan. The only thing that
> relies on that feature is TidScan, but we could easily teach TidScan to hold
> the buffer pin directly.
>
> So, how about we remove the ability of a TupleTableSlot to hold a buffer
> pin, per the attached patch? It shaves a few cycles from a ExecStoreTuple()
> and ExecClearTuple(), which get called a lot. I couldn't measure any actual
> difference from that, though, but it seems like a good idea from a
> readability point of view, anyway.

Out of curiosity why go in this direction and not the other? Why not
move those other scans to use the TupleTableSlot API to manage the
pins. Offhand it sounds more readable not less to have the
TupleTableSlot be a self contained data structure that guarantees the
lifetime of the pins matches the slots rather than have a dependency
on the code structure in some far-away scan.


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-08-30 Thread Andres Freund
Hi,

On 2016-08-30 13:12:41 +0300, Heikki Linnakangas wrote:
> While profiling some queries and looking at executor overhead, I realized
> that we're not making much use of TupleTableSlot's ability to hold a buffer
> pin.

FWIW, I came to a similar conclusion, while working on passing around
making the executor batched.


> So, how about we remove the ability of a TupleTableSlot to hold a buffer
> pin, per the attached patch? It shaves a few cycles from a ExecStoreTuple()
> and ExecClearTuple(), which get called a lot. I couldn't measure any actual
> difference from that, though, but it seems like a good idea from a
> readability point of view, anyway.

I actually found that rather beneficial from a performance point of
view.


> How much do we need to worry about breaking extensions? I.e. to what extent
> do we consider the TupleTableSlot API to be stable, for extensions to use?
> The FDW API uses TupleTableSlots - this patch had to fix the ExecStoreTuple
> calls in postgres_fdw.

I think we're just going to have to deal with the fallout. Trying to
maintain backward compatibility with the TupleTableSlot API seems to
prevent some rather important improvement in the area.


> We could refrain from changing the signature of ExecStoreTuple(), and throw
> an error if you try to pass a valid buffer to it. But I also have some
> bigger changes to TupleTableSlot in mind.

We should probably coordinate ;)


> I think we could gain some speed
> by making slots "read-only". For example, it would be nice if a SeqScan
> could just replace the tts_tuple pointer in the slot, and not have to worry
> that the upper nodes might have materialized the slot, and freeing the
> copied tuple. Because that happens very rarely in practice. It would be
> better if those few places where we currently materialize an existing slot,
> we would create a copy of the slot, and leave the original slot unmodified.
> I'll start a different thread on that after some more experimentation, but
> if we e.g. get rid of ExecMaterializeSlot() in its current form, would that
> be OK?

Hm.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-08-30 Thread Heikki Linnakangas

On 08/30/2016 02:38 PM, Tom Lane wrote:

Heikki Linnakangas  writes:

While profiling some queries and looking at executor overhead, I
realized that we're not making much use of TupleTableSlot's ability to
hold a buffer pin. In a SeqScan, the buffer is held pinned by the
underlying heap-scan anyway. Same with an IndexScan, and the SampleScan.


I think this is probably wrong, or at least very dangerous to remove.
The reason for the feature is that the slot may continue to point at
the tuple after the scan has moved on.  You've given no evidence at all
that that scenario doesn't arise (or that we wouldn't need it in future).


For a SeqScan, the buffer is pinned, and the tuple stays valid, because 
we have an open HeapScan on it. It will stay valid until the next 
heap_getnext() call.



At the very least I'd want to see this patch clearing the slot before
advancing the scan, so that it doesn't have actual dangling pointers
laying about.


Fair enough, although we're not 100% careful about that currently 
either. For examples, see gather_getnext, CopyFrom, and others. 
Personally I think that's OK, as long as the window between invalidating 
the underlying buffer (or other source of the tuple), and storing a new 
tuple in it, is small enough. In particular, I think this is OK:


tuple = heap_getnext(); /* this leaves a dangling pointer in slot */
slot = ExecStoreTuple(tuple); /* and this makes it valid again */


So, how about we remove the ability of a TupleTableSlot to hold a buffer
pin, per the attached patch? It shaves a few cycles from a
ExecStoreTuple() and ExecClearTuple(), which get called a lot. I
couldn't measure any actual difference from that, though, but it seems
like a good idea from a readability point of view, anyway.


If you can't measure a clear performance improvement, I'm -1 on the
whole thing.  You've got risk and breakage of third-party code, and
what to show for it?


Well, I think simplicity is a virtue all by itself. But ok, let me try a 
bit harder to benchmark this:


I created a test table with:

create table foo as select repeat('x', 500) || g from generate_series(1, 
100) g;

vacuum freeze;

And then ran:

$ cat count.sql
select count(*) from foo;
$ pgbench -n -f count.sql postgres -t1000

This is pretty much a best case for this patch. A "count(*)" doesn't do 
much else than put the tuple in the slot, and the tuples are wide 
enough, that you switch from one buffer to another every 15 tuples, 
which exercises the buffer pinning code.


I ran that pgbench command three times with and without the patch, and 
picked the best times:


Without patch:
tps = 12.413360 (excluding connections establishing)

With the patch:
tps = 13.183164 (excluding connections establishing)

That's about 6% improvement.

This was with the attached version of this patch. Compared to the 
previous, I added ExecClearTuple() calls to clear the slots before 
advancing the heap/index scan, to avoid the dangling pointers. Doing 
that added an extra function call to this hot codepath, however, so I 
compensated for that by adding an inline version of ExecStoreTuple(), 
for those callers that know that the slot is already empty.


BTW, turning ExecStoreVirtualTuple into a static inline function would 
also be an easy way to shave some cycles. But I refrained from including 
that change into this patch.



I'll start a different thread on that after
some more experimentation, but if we e.g. get rid of
ExecMaterializeSlot() in its current form, would that be OK?


INSERT/UPDATE, for one, relies on that.


Sure. I'm thinking of refactoring that so that it doesn't. Or maybe add 
a new kind of a TupleTableSlot that cannot be materialized, which makes 
ExecStore/ClearTuple for that slot simpler, and use that in SeqScan and 
friends. INSERT/UPDATE could continue to use ExecMaterializeSlot(), but 
it would need to have a slot of its own, instead of materializing the 
slot it got from plan tree. Yes, this is still very hand-wavey...


- Heikki

commit 6f047ac81172c7ea559c9992bfbc1e05a4c3bedd
Author: Heikki Linnakangas 
Date:   Tue Aug 30 20:11:25 2016 +0300

Remove support for holding a buffer pin in a TupleTableSlot.

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index daf0438..299fa62 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1387,7 +1387,6 @@ postgresIterateForeignScan(ForeignScanState *node)
 	 */
 	ExecStoreTuple(fsstate->tuples[fsstate->next_tuple++],
    slot,
-   InvalidBuffer,
    false);
 
 	return slot;
@@ -3152,7 +3151,7 @@ store_returning_result(PgFdwModifyState *fmstate,
 			NULL,
 			fmstate->temp_cxt);
 		/* tuple will be deleted when it is cleared from the slot */
-		ExecStoreTuple(newtup, slot, InvalidBuffer, true);
+		ExecStoreTuple(newtup, slot, true);
 	}
 	PG_CATCH();
 	{
@@ -3258,7 +3257,7 @@ 

Re: [HACKERS] Pinning a buffer in TupleTableSlot is unnecessary

2016-08-30 Thread Tom Lane
Heikki Linnakangas  writes:
> While profiling some queries and looking at executor overhead, I 
> realized that we're not making much use of TupleTableSlot's ability to 
> hold a buffer pin. In a SeqScan, the buffer is held pinned by the 
> underlying heap-scan anyway. Same with an IndexScan, and the SampleScan. 

I think this is probably wrong, or at least very dangerous to remove.
The reason for the feature is that the slot may continue to point at
the tuple after the scan has moved on.  You've given no evidence at all
that that scenario doesn't arise (or that we wouldn't need it in future).

At the very least I'd want to see this patch clearing the slot before
advancing the scan, so that it doesn't have actual dangling pointers
laying about.

> So, how about we remove the ability of a TupleTableSlot to hold a buffer 
> pin, per the attached patch? It shaves a few cycles from a 
> ExecStoreTuple() and ExecClearTuple(), which get called a lot. I 
> couldn't measure any actual difference from that, though, but it seems 
> like a good idea from a readability point of view, anyway.

If you can't measure a clear performance improvement, I'm -1 on the
whole thing.  You've got risk and breakage of third-party code, and
what to show for it?

> I'll start a different thread on that after 
> some more experimentation, but if we e.g. get rid of 
> ExecMaterializeSlot() in its current form, would that be OK?

INSERT/UPDATE, for one, relies on that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers