Re: [HACKERS] tuplestore API problem

2009-04-01 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 I don't think advising or documenting such a restriction for future
 programmers is a good idea from an encapsulation standpoint. I've come
 up with an idea: each read pointer remembers its last slot, as you
 suggest, but when a different slot comes in through
 tuplestore_gettupleslot() it materializes the previous slot and then
 forgets it. With this you don't need the restriction above, and the
 only added penalty is paid if and only if more than one tuple slot is
 passed to the tuplestore, which doesn't seem to happen currently.

I think that the problem I found a couple days ago
http://archives.postgresql.org/pgsql-hackers/2009-03/msg01247.php
probably blows a hole in all these schemes.  After-the-fact
materialization of a TupleTableSlot won't protect any pass-by-reference
Datums that have already been fetched from that slot.  Perhaps we
could invent a coding rule that would prevent the situation, but
I think it would be awfully easy to mess up in any case where you
actually had a need to keep track of more than one current tuple.
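
For illustration only, the failure mode is roughly this (a simplified
sketch built on the pre-copy-flag, three-argument form of
tuplestore_gettupleslot(); the function and variable names here are
made up, so treat it as a picture of the problem rather than real code):

    #include "postgres.h"
    #include "executor/tuptable.h"
    #include "utils/tuplestore.h"

    static void
    fetch_two_rows(Tuplestorestate *tstore,
                   TupleTableSlot *slot, TupleTableSlot *otherslot)
    {
        bool    isnull;
        Datum   d;

        /* Row 1: the slot points at a tuple owned by the tuplestore */
        if (!tuplestore_gettupleslot(tstore, true, slot))
            elog(ERROR, "unexpected end of tuplestore");

        /* For a pass-by-reference column, this Datum points into that
         * same tuple */
        d = slot_getattr(slot, 1, &isnull);

        /* Row 2 arrives in a different slot; suppose the scheme notices
         * the slot change and materializes 'slot' here.  The slot itself
         * is now safe, but 'd' still points at the original tuple, which
         * is freed if the tuplestore is forced to dump to disk. */
        if (!tuplestore_gettupleslot(tstore, true, otherslot))
            elog(ERROR, "unexpected end of tuplestore");

        (void) d;           /* potentially a dangling pointer by now */
    }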

I now think that the CVS-HEAD arrangement is about as good as we
should expect to get for 8.4.  The experiments I've been doing
suggest that the extra tuple copying isn't a major bottleneck
anyway...

regards, tom lane



Re: [HACKERS] tuplestore API problem

2009-03-30 Thread Hitoshi Harada
2009/3/30 Tom Lane t...@sss.pgh.pa.us:
 Hitoshi Harada umi.tan...@gmail.com writes:
 2009/3/29 Tom Lane t...@sss.pgh.pa.us:
 ... What might be a bit saner is to remember the slot last passed to
 tuplestore_gettupleslot for each read pointer.  The implication would be
 that we'd be assuming only one slot is used to fetch from any one read
 pointer, but that is probably a reasonable restriction.

 Hm, this choice is better than mine. But if we take it, I suppose we
 need to provide a way to lift the restriction, for the case where we
 are forced to use many TupleTableSlots with one read pointer.

 I don't think we'd ever be forced to do that; and it would be easy to
 add an Assert to tuplestore_gettupleslot to check that it gets the same
 slot on each call.  Someone who needed to save previous tuples would be
 advised to copy older tuples to some other slot after fetching them.

I don't think advising or documenting such a restriction for future
programmers is a good idea from an encapsulation standpoint. I've come
up with an idea: each read pointer remembers its last slot, as you
suggest, but when a different slot comes in through
tuplestore_gettupleslot() it materializes the previous slot and then
forgets it. With this you don't need the restriction above, and the
only added penalty is paid if and only if more than one tuple slot is
passed to the tuplestore, which doesn't seem to happen currently.
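
Roughly, what I have in mind would amount to something like this (purely
a sketch: the helper, the struct, and the per-read-pointer "lastslot"
field are all hypothetical and do not exist in the current tuplestore
code):

    #include "postgres.h"
    #include "executor/tuptable.h"

    /* Hypothetical per-read-pointer bookkeeping */
    typedef struct ReadPointerSketch
    {
        TupleTableSlot *lastslot;   /* last slot handed to this read pointer */
    } ReadPointerSketch;

    /* Would be called from tuplestore_gettupleslot() before filling 'slot' */
    static void
    remember_and_protect_slot(ReadPointerSketch *readptr, TupleTableSlot *slot)
    {
        if (readptr->lastslot != NULL && readptr->lastslot != slot)
        {
            /*
             * A different slot is now being used with this read pointer:
             * let the previously returned slot take its own copy of the
             * tuple before we forget it, so a later dump-to-disk cannot
             * leave it pointing at freed memory.
             */
            ExecMaterializeSlot(readptr->lastslot);
        }
        readptr->lastslot = slot;
    }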

Regards,



-- 
Hitoshi Harada



Re: [HACKERS] tuplestore API problem

2009-03-29 Thread Hitoshi Harada
2009/3/29 Tom Lane t...@sss.pgh.pa.us:
 Hitoshi Harada umi.tan...@gmail.com writes:
 So I tried passing EState.es_tupleTable to tuplestore_begin_heap() so
 it can track those TupleTableSlots. Note that if you pass NULL the
 behavior is unchanged, so nothing is broken. The regression tests
 pass, but I'm not quite sure EState.es_tupleTable is the only place
 that holds TupleTableSlots passed to a tuplestore...

 I've got zero confidence in that, and it seems pretty horrid from a
 system structural point of view even if it worked: neither tuplestore.c
 nor its direct callers have any business knowing where all the slots
 are.  What might be a bit saner is to remember the slot last passed to
 tuplestore_gettupleslot for each read pointer.  The implication would be
 that we'd be assuming only one slot is used to fetch from any one read
 pointer, but that is probably a reasonable restriction.

Hm, this choice is better than mine. But if we take it, I suppose we
need to provide a way to lift the restriction, for the case where we
are forced to use many TupleTableSlots with one read pointer.
Without that, I don't think it's a good idea to go this way.

 I know you propose adding a should_copy boolean parameter to
 tuplestore_gettupleslot().

 I already did it, and concluded that not only were all the
 tuplestore_gettupleslot calls in nodeWindowAgg broken, but so was
 nodeCtescan.  I'm open to reverting that patch if you have a better
 solution though.

For now, yours is the best option on the table. Since we can't stay on
this forever, let's go with the copy approach. I'll keep the issue in
mind and revisit it when I work on improving tuplestore performance,
together with the later window function patch.


Regards,

-- 
Hitoshi Harada



Re: [HACKERS] tuplestore API problem

2009-03-29 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 2009/3/29 Tom Lane t...@sss.pgh.pa.us:
 ... What might be a bit saner is to remember the slot last passed to
 tuplestore_gettupleslot for each read pointer.  The implication would be
 that we'd be assuming only one slot is used to fetch from any one read
 pointer, but that is probably a reasonable restriction.

 Hm, this choice is better than mine. But if we take it, I suppose we
 need to provide a way to lift the restriction, for the case where we
 are forced to use many TupleTableSlots with one read pointer.

I don't think we'd ever be forced to do that; and it would be easy to
add an Assert to tuplestore_gettupleslot to check that it gets the same
slot on each call.  Someone who needed to save previous tuples would be
advised to copy older tuples to some other slot after fetching them.
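
Assuming the tuplestore kept a per-read-pointer lastslot field
(hypothetical; no such field exists today), the check would be a couple
of lines inside tuplestore_gettupleslot():

    /* hypothetical fragment: enforce one-slot-per-read-pointer */
    Assert(readptr->lastslot == NULL || readptr->lastslot == slot);
    readptr->lastslot = slot;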

regards, tom lane



Re: [HACKERS] tuplestore API problem

2009-03-28 Thread Hitoshi Harada
2009/3/28 Tom Lane t...@sss.pgh.pa.us:
 Hitoshi Harada umi.tan...@gmail.com writes:
 2009/3/27 Hitoshi Harada umi.tan...@gmail.com:
 2009/3/27 Tom Lane t...@sss.pgh.pa.us:
 A brute-force solution is to change tuplestore_gettupleslot() so that it
 always copies the tuple, but this would be wasted cycles for most uses
 of tuplestores.  I'm thinking of changing tuplestore_gettupleslot's API
 to add a bool parameter specifying whether the caller wants to force
 a copy.

 Here's the patch. I hope there are no more breakages with the same
 cause. It seems we would need to implement something like a garbage
 collector in tuplestore, marking and tracing each row reference, if a
 complete solution were required.

 I don't like this; I'm planning to go with the aforementioned API
 change instead.  The way you have it guarantees an extra copy cycle
 even when tuplestore is already making a copy internally; and it doesn't
 help if we find similar problems elsewhere.  (While I'm making the
 API change I'll take a close look at each call site to see if it has
 any similar risk.)

You're right. It hurts performance even after dumptuples(). Thinking
about it more, I found that the cause is confined to dumptuples(): if
you can track the TupleTableSlots that point at memtuples inside the
tuplestore, you can materialize them just before WRITETUP() in
dumptuples(). So I tried passing EState.es_tupleTable to
tuplestore_begin_heap() so it can track those TupleTableSlots. Note
that if you pass NULL the behavior is unchanged, so nothing is broken.
The regression tests pass, but I'm not quite sure EState.es_tupleTable
is the only place that holds TupleTableSlots passed to a tuplestore...
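
The shape of the idea, ignoring how the slots actually reach the
tuplestore, is roughly this inside dumptuples() (a sketch only: the
"registered_slots" list is an assumption, not a real Tuplestorestate
field, and the attached patch may do it differently):

    /*
     * Before the in-memory tuples are written out and freed, force any
     * caller-registered slots that might still point at them to take
     * their own copies.
     */
    ListCell   *lc;

    foreach(lc, state->registered_slots)
    {
        TupleTableSlot *slot = (TupleTableSlot *) lfirst(lc);

        if (!TupIsNull(slot))
            ExecMaterializeSlot(slot);
    }
    /* ... then the existing WRITETUP() loop runs unchanged ... */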

I know you propose adding a should_copy boolean parameter to
tuplestore_gettupleslot(). But that always adds overhead, even when the
tuplestore *doesn't* dump its tuples to disk, and in that case no copy
is needed, I think.

Regards,



-- 
Hitoshi Harada


tuplestore_apichange.20090328.patch
Description: Binary data



Re: [HACKERS] tuplestore API problem

2009-03-28 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 So I tried passing EState.es_tupleTable to tuplestore_begin_heap() so
 it can track those TupleTableSlots. Note that if you pass NULL the
 behavior is unchanged, so nothing is broken. The regression tests
 pass, but I'm not quite sure EState.es_tupleTable is the only place
 that holds TupleTableSlots passed to a tuplestore...

I've got zero confidence in that, and it seems pretty horrid from a
system structural point of view even if it worked: neither tuplestore.c
nor its direct callers have any business knowing where all the slots
are.  What might be a bit saner is to remember the slot last passed to
tuplestore_gettupleslot for each read pointer.  The implication would be
that we'd be assuming only one slot is used to fetch from any one read
pointer, but that is probably a reasonable restriction.

 I know you propose adding a should_copy boolean parameter to
 tuplestore_gettupleslot().

I already did it, and concluded that not only were all the
tuplestore_gettupleslot calls in nodeWindowAgg broken, but so was
nodeCtescan.  I'm open to reverting that patch if you have a better
solution though.

regards, tom lane



Re: [HACKERS] tuplestore API problem

2009-03-27 Thread Hitoshi Harada
2009/3/27 Hitoshi Harada umi.tan...@gmail.com:
 2009/3/27 Tom Lane t...@sss.pgh.pa.us:
 By chance I discovered that this query in the regression tests

 SELECT ntile(NULL) OVER (ORDER BY ten, four), ten, four FROM tenk1 LIMIT 2;

 stops working if work_mem is small enough: it either dumps core or
 delivers wrong answers depending on platform.

 After some tracing I found out the reason.  ExecWindowAgg() does this:

    if (!tuplestore_gettupleslot(winstate->buffer, true,
                                 winstate->ss.ss_ScanTupleSlot))
        elog(ERROR, "unexpected end of tuplestore");

 and then goes off and calls the window functions (ntile() here), and
 expects the ScanTupleSlot to still be valid afterwards.  However,
 ntile() forces us to read to the end of the input to find out the number
 of rows.  If work_mem is small enough, that means the tuplestore is
 forced into dump-to-disk mode, which means it releases all its in-memory
 tuples.  And guess what: the ScanTupleSlot is pointing at one of those,
 it doesn't have its own copy of the tuple.  So we wind up trying to read
 from a trashed bit of memory.

 A brute-force solution is to change tuplestore_gettupleslot() so that it
 always copies the tuple, but this would be wasted cycles for most uses
 of tuplestores.  I'm thinking of changing tuplestore_gettupleslot's API
 to add a bool parameter specifying whether the caller wants to force
 a copy.

 Comments, better ideas?

 Is this really a tuplestore API problem? ISTM it is a window function
 problem. I think my early code held a HeapTuple rather than a
 TupleTableSlot for the current row. At a glance, the issue only
 affects the current row in the window function code, which fetches a
 row and then, in some cases, uses it after storing the following rows.
 So a brute-force solution might be for ExecWindowAgg() to copy the
 current row out of the tuplestore instead of pointing directly into
 tuplestore memory, without changing the tuplestore API.
Here's the patch. I hope there are no more breakages with the same
cause. It seems we would need to implement something like a garbage
collector in tuplestore, marking and tracing each row reference, if a
complete solution were required.

Regards,

-- 
Hitoshi Harada


windowagg_tempslot.20090327.patch
Description: Binary data



Re: [HACKERS] tuplestore API problem

2009-03-27 Thread Tom Lane
Hitoshi Harada umi.tan...@gmail.com writes:
 2009/3/27 Hitoshi Harada umi.tan...@gmail.com:
 2009/3/27 Tom Lane t...@sss.pgh.pa.us:
 A brute-force solution is to change tuplestore_gettupleslot() so that it
 always copies the tuple, but this would be wasted cycles for most uses
 of tuplestores.  I'm thinking of changing tuplestore_gettupleslot's API
 to add a bool parameter specifying whether the caller wants to force
 a copy.

 Here's the patch. I hope there are no more breakages with the same
 cause. It seems we would need to implement something like a garbage
 collector in tuplestore, marking and tracing each row reference, if a
 complete solution were required.

I don't like this; I'm planning to go with the aforementioned API
change instead.  The way you have it guarantees an extra copy cycle
even when tuplestore is already making a copy internally; and it doesn't
help if we find similar problems elsewhere.  (While I'm making the
API change I'll take a close look at each call site to see if it has
any similar risk.)
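
Concretely, a call site that needs the protection would just ask for a
copy, roughly like this (a sketch of the intended API with the new bool
"copy" flag as the third argument; the committed patch may differ in
detail):

    if (!tuplestore_gettupleslot(winstate->buffer, true, true,
                                 winstate->ss.ss_ScanTupleSlot))
        elog(ERROR, "unexpected end of tuplestore");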

regards, tom lane



Re: [HACKERS] tuplestore API problem

2009-03-26 Thread Simon Riggs

On Thu, 2009-03-26 at 12:57 -0400, Tom Lane wrote:
 If work_mem is small enough, that means the tuplestore is
 forced into dump-to-disk mode, which means it releases all its
 in-memory tuples.  And guess what: the ScanTupleSlot is pointing at
 one of those, it doesn't have its own copy of the tuple.  So we wind
 up trying to read from a trashed bit of memory.
 
 A brute-force solution is to change tuplestore_gettupleslot() so that
 it always copies the tuple, but this would be wasted cycles for most
 uses of tuplestores.  I'm thinking of changing
 tuplestore_gettupleslot's API
 to add a bool parameter specifying whether the caller wants to force
 a copy.
 
 Comments, better ideas?

Sounds very similar to the solution that you just removed from the hash
join code for performance reasons. Flushing memory when we overflow
sounds like an artifact from the time when tuplestore split from
tuplesort. Can't we keep the appropriate rows in memory and scroll
through them?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] tuplestore API problem

2009-03-26 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 Sounds very similar to the solution that you just removed from the hash
 join code for performance reasons. Flushing memory when we overflow
 sounds like an artifact from the time when tuplestore split from
 tuplesort. Can't we keep the appropriate rows in memory and scroll
 through them?

We're not doing a fundamental redesign of the window function support
right now.

regards, tom lane



Re: [HACKERS] tuplestore API problem

2009-03-26 Thread Hitoshi Harada
2009/3/27 Tom Lane t...@sss.pgh.pa.us:
 By chance I discovered that this query in the regression tests

 SELECT ntile(NULL) OVER (ORDER BY ten, four), ten, four FROM tenk1 LIMIT 2;

 stops working if work_mem is small enough: it either dumps core or
 delivers wrong answers depending on platform.

 After some tracing I found out the reason.  ExecWindowAgg() does this:

    if (!tuplestore_gettupleslot(winstate->buffer, true,
                                 winstate->ss.ss_ScanTupleSlot))
        elog(ERROR, "unexpected end of tuplestore");

 and then goes off and calls the window functions (ntile() here), and
 expects the ScanTupleSlot to still be valid afterwards.  However,
 ntile() forces us to read to the end of the input to find out the number
 of rows.  If work_mem is small enough, that means the tuplestore is
 forced into dump-to-disk mode, which means it releases all its in-memory
 tuples.  And guess what: the ScanTupleSlot is pointing at one of those,
 it doesn't have its own copy of the tuple.  So we wind up trying to read
 from a trashed bit of memory.

 A brute-force solution is to change tuplestore_gettupleslot() so that it
 always copies the tuple, but this would be wasted cycles for most uses
 of tuplestores.  I'm thinking of changing tuplestore_gettupleslot's API
 to add a bool parameter specifying whether the caller wants to force
 a copy.

 Comments, better ideas?

Is this really a tuplestore API problem? ISTM it is a window function
problem. I think my early code held a HeapTuple rather than a
TupleTableSlot for the current row. At a glance, the issue only affects
the current row in the window function code, which fetches a row and
then, in some cases, uses it after storing the following rows. So a
brute-force solution might be for ExecWindowAgg() to copy the current
row out of the tuplestore instead of pointing directly into tuplestore
memory, without changing the tuplestore API.
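
In code, the brute-force version would look roughly like this (a sketch
only; "curslot" is a hypothetical extra slot owned by the WindowAgg
node, not a field in the current source):

    /* Fetch into the scan slot as now, then copy into a slot the node
     * owns, so the row survives even if the tuplestore later dumps its
     * in-memory tuples to disk. */
    if (!tuplestore_gettupleslot(winstate->buffer, true,
                                 winstate->ss.ss_ScanTupleSlot))
        elog(ERROR, "unexpected end of tuplestore");

    ExecCopySlot(winstate->curslot, winstate->ss.ss_ScanTupleSlot);
    /* window functions then read the current row from winstate->curslot */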

Regards,


-- 
Hitoshi Harada
