subject:"\[HACKERS\] Changed SRF in targetlist handling"

Re: [HACKERS] Changed SRF in targetlist handling

2016-12-08 Thread Tom Lane

Robert Haas  writes:
> Tom, it's been about 3.5 months since you wrote this.  I think it
> would be really valuable if you could get to this RSN because the
> large patch set posted on the "Changed SRF in targetlist handling"
> thread is backed up behind this -- and I think that's really valuable
> work which I don't want to see slip out of this release.

Yeah, I was busy with other stuff during the recent commitfest.
I'll try to get back to this.  There's still only 24 hours in a day,
though.  (And no, [1] is not enough to help.)

regards, tom lane

[1] 
https://www.theguardian.com/science/2016/dec/07/earths-day-lengthens-by-two-milliseconds-a-century-astronomers-find


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-12-08 Thread Robert Haas

On Mon, Aug 22, 2016 at 4:20 PM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2016-08-17 17:41:28 -0700, Andres Freund wrote:
>>> Tom, do you think this is roughly going in the right direction?
>
> I've not had time to look at this patch, I'm afraid.  If you still
> want me to, I can make time in a day or so.

Tom, it's been about 3.5 months since you wrote this.  I think it
would be really valuable if you could get to this RSN because the
large patch set posted on the "Changed SRF in targetlist handling"
thread is backed up behind this -- and I think that's really valuable
work which I don't want to see slip out of this release.  At the same
time, both that and this are quite invasive, and I don't want it all
to get committed the day before feature freeze, because that will mess
up the schedule.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-10-02 Thread Michael Paquier

On Wed, Aug 24, 2016 at 3:55 AM, Andres Freund  wrote:
> Comments?

This thread has no activity for some time now and it is linked to this CF entry:
https://commitfest.postgresql.org/10/759/
I am marking it as returned with feedback..
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-08-23 Thread Andres Freund

On 2016-08-17 17:41:28 -0700, Andres Freund wrote:
> a) Allow to avoid using a tuplestore for SRF_PERCALL SRFs in ROWS FROM -
>otherwise our performance would regress noticeably in some cases.

To demonstrate the problem:

master:
=# COPY (SELECT generate_series(1, 5000)) TO '/dev/null';
COPY 5000
Time: 6859.830 ms
=# COPY (SELECT * FROM generate_series(1, 5000)) TO '/dev/null';
COPY 5000
Time: 11314.507 ms

getting rid of the materialization indeed fixes the problem:

dev:
=# COPY (SELECT generate_series(1, 5000)) TO
'/dev/null';
COPY 5000
Time: 5757.547 ms

=# COPY (SELECT * FROM generate_series(1, 5000)) TO
'/dev/null';
COPY 5000
Time: 5842.524 ms

I've currently implemented this by having nodeFunctionscan.c store
enough state in FunctionScanPerFuncState to continue the ValuePerCall
protocol.  That all seems to work well, without big problems.

The open issue here is whether / how we want to deal with
EXEC_FLAG_REWIND and EXEC_FLAG_BACKWARD. Currently that, with some added
complications, is implemented in nodeFunctionscan.c itself. But for
ValuePerCall SRFs that doesn't directly work anymore.

ISTM that the easiest way here is actually to rip out support for
EXEC_FLAG_REWIND/EXEC_FLAG_BACKWARD from nodeFunctionscan.c. If the plan
requires that, the planner will slap a Material node on-top. Which will
even be more efficient when ROWS FROM for multiple SRFs, or WITH
ORDINALITY, are used.  Alternatively we can continue to create a
tuplestore for ValuePerCall when eflags indicates that's required. But
personally I don't see that as an advantageous course.

Comments?

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-08-22 Thread Andres Freund

Hi,

On 2016-08-22 16:20:58 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-08-17 17:41:28 -0700, Andres Freund wrote:
> >> Tom, do you think this is roughly going in the right direction?
> 
> I've not had time to look at this patch, I'm afraid.  If you still
> want me to, I can make time in a day or so.

That'd greatly be appreciated. I think polishing the POC up to
committable patch will be a considerable amount of work, and I'd like
design feedback before that.


> > I'm working on these. Atm ExecMakeTableFunctionResult() resides in
> > execQual.c - I'm inlining it into nodeFunctionscan.c now, because
> > there's no other callers, and having it separate seems to bring no
> > benefit.
> 
> > Please speak soon up if you disagree.
> 
> I think ExecMakeTableFunctionResult was placed in execQual.c because
> it seemed to belong there alongside the support for SRFs in tlists.
> If that's going away then there's no good reason not to move the logic
> to where it's used.

Cool, then we agree.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-08-22 Thread Tom Lane

Andres Freund  writes:
> On 2016-08-17 17:41:28 -0700, Andres Freund wrote:
>> Tom, do you think this is roughly going in the right direction?

I've not had time to look at this patch, I'm afraid.  If you still
want me to, I can make time in a day or so.

> I'm working on these. Atm ExecMakeTableFunctionResult() resides in
> execQual.c - I'm inlining it into nodeFunctionscan.c now, because
> there's no other callers, and having it separate seems to bring no
> benefit.

> Please speak soon up if you disagree.

I think ExecMakeTableFunctionResult was placed in execQual.c because
it seemed to belong there alongside the support for SRFs in tlists.
If that's going away then there's no good reason not to move the logic
to where it's used.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-08-22 Thread Andres Freund

Hi,

On 2016-05-23 09:26:03 +0800, Craig Ringer wrote:
> SRFs-in-tlist are a lot faster for lockstep iteration etc. They're also
> much simpler to write, though if the result result rowcount differs
> unexpectedly between the functions you get exciting and unexpected
> behaviour.
> 
> WITH ORDINALITY provides what I think is the last of the functionality
> needed to replace SRFs-in-from, but at a syntatactic complexity and
> performance cost. The following example demonstrates that, though it
> doesn't do anything that needs LATERAL etc. I'm aware the following aren't
> semantically identical if the rowcounts differ.

I think here you're just missing ROWS FROM (generate_series(..), 
generate_series(...))

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-08-22 Thread Andres Freund

On 2016-08-17 17:41:28 -0700, Andres Freund wrote:
> Tom, do you think this is roughly going in the right direction? My plan
> here is to develop two patches, to come before this:
> 
> a) Allow to avoid using a tuplestore for SRF_PERCALL SRFs in ROWS FROM -
>otherwise our performance would regress noticeably in some cases.
> b) Allow ROWS FROM() to return SETOF RECORD type SRFs as one column,
>instead of expanded. That's important to be able move SETOF RECORD
>returning functions in the targetlist into ROWS FROM, which otherwise
>requires an explicit column list.

I'm working on these. Atm ExecMakeTableFunctionResult() resides in
execQual.c - I'm inlining it into nodeFunctionscan.c now, because
there's no other callers, and having it separate seems to bring no
benefit.

Please speak soon up if you disagree.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-08-17 Thread Andres Freund

On 2016-08-03 20:22:03 -0700, Andres Freund wrote:
> On 2016-08-02 16:30:55 -0700, Andres Freund wrote:
> > > > Besides that I'm structurally wondering whether turning the original
> > > > query into a subquery is the right thing to do. It requires some kind of
> > > > ugly munching of Query->*, and has the above problem.
> > >
> > > It does not seem like it should be that hard, certainly no worse than
> > > subquery pullup.  Want to show code?
> >
> > It's not super hard, there's some stuff like pushing/not-pushing
> > various sortgrouprefs to the subquery. But I think we can live with it.
> >
> > Let me clean up the code some, hope to have something today or
> > tomorrow.
> 
> Here we go.  This *clearly* is a POC, not more.  But it mostly works.
> 
> 
> 0001 - adds some test, some of those change after the later patches
> 0002 - main SRF via ROWS FROM () implementation
> 0003 - Large patch removing now unused code. Most satisfying.
> 
> 
> The interesting bit is obviously 0002. What it basically does is, at the 
> beginning
> of subquery_planner():
> 1) unsrfify:
>move the jointree into a subquery
> 2) unsrfify_reference_subquery_mutator:
>process the old targetlist to reference the new subquery. If a
>TargetEntry doesn't contain a set, it's entirely moved into the
>subquery. Otherwise all Vars/Aggrefs/... it references are moved to
>the subquery, and referenced in the outer query's target list.
> 3) unsrfify_implement_srfs_mutator:
>Replace set returning functions in the targetlist with references to
>a new FUNCTION RTE. All non-nested tSRFs are part of the same RTE
>(i.e. the least common multiple behaviour is gone). all tSRFs in
>arguments are implemented as another FUNCTION RTE.
> 
> I discovered that we allow SRFs in UPDATE target lists. It's not clear
> to me what that's supposed to mean. Nor how exactly to implement that,
> given expand_targetlist(). Right now that fails with the patch, because
> it re-inserts Var's for the relation replaced by the subquery.
> 
> Note that I've not bothered to fix up the regression test output - I'm
> certain that explain output and such will still change.
> 
> Biggest questions / tasks:
> * General approach
> * DML handling
> * Operator implementation
> * SETOF record handling
> * correct handling of lateral dependency from RTE to subquery to force
>   evaluation order, instead of my RangeTblEntry->deps hack.
> * lot of cleanup
> 
> Comments?

Tom, do you think this is roughly going in the right direction? My plan
here is to develop two patches, to come before this:

a) Allow to avoid using a tuplestore for SRF_PERCALL SRFs in ROWS FROM -
   otherwise our performance would regress noticeably in some cases.
b) Allow ROWS FROM() to return SETOF RECORD type SRFs as one column,
   instead of expanded. That's important to be able move SETOF RECORD
   returning functions in the targetlist into ROWS FROM, which otherwise
   requires an explicit column list.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-08-02 Thread Andres Freund

On 2016-08-02 19:02:38 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I've an implementation that
>
> > 1) turns all targetlist SRF (tSRF from now on) into ROWS FROM
> >expressions. If there's tSRFs in the argument of a tSRF those becomes
> >a separate, lateral, ROWS FROM expression.
>
> > 2) If grouping/window functions are present, the entire query is wrapped
> >in a subquery RTE, except for the set-returning function. All
> >referenced Var|Aggref|GroupingFunc|WindowFunc|Param nodes in the
> >original targetlist are made to reference that subquery, which gets a
> >TargetEntry for them.
>
> FWIW, I'd be inclined to do the subquery RTE all the time,

Yea, that's what I ended up doing.

> adding some
> optimization fence to ensure it doesn't get folded back.  That fixes
> your problem here:

> > So far I have one problem without an easy solution: Historically queries
> > like
> > =# SELECT id, generate_series(1,2) FROM (VALUES(1),(2)) few(id);
> > ┌┬─┐
> > │ id │ generate_series │
> > ├┼─┤
> > │  1 │   1 │
> > │  1 │   2 │
> > │  2 │   1 │
> > │  2 │   2 │
> > └┴─┘
> > have preserved the SRF output ordering. But by turning the SRF into a
> > ROWS FROM, there's no guarantee that the cross join between "few" and
> > generate_series(1,3) above is implemented in that order.

But I don't see how that fixes the above problem?  The join, on the
top-level because of aggregates, still can be implemented as
subquery join srf or as srf join subquery, with the different output order
that implies.  I've duct-taped together a solution for that, by forcing
the lateral machinery to always see a dependency from the SRF to the
subquery; but that probably needs a nicer fix than a RangeTblEntry->deps
field which is processed in extract_lateral_references() ;)

> > Besides that I'm structurally wondering whether turning the original
> > query into a subquery is the right thing to do. It requires some kind of
> > ugly munching of Query->*, and has the above problem.
>
> It does not seem like it should be that hard, certainly no worse than
> subquery pullup.  Want to show code?

It's not super hard, there's some stuff like pushing/not-pushing
various sortgrouprefs to the subquery. But I think we can live with it.

Let me clean up the code some, hope to have something today or tomorrow.

> > An alternative approach would be to do this during parse-analysis, but I
> > think that might end up being confusing, because stored rules would
> > suddenly have a noticeably different structure, and it'd tie us a lot
> > more to the details of that transformation than I'd like.
>
> -1 on that; we do not want this transformation visible in stored rules.

Agreed.

> > Besides the removal of the least-common-multiple behaviour of tSRF queries,
> > there's some other consequences that using function scans have:
> > Previously if a tSRF was never evaluated, it didn't cause the number of
> > rows from being increased. E.g.
> > SELECT id, COALESCE(1, generate_series(1,2)) FROM (VALUES(1),(2)) few(id);
> > only produced two rows.  But using joins means that a simple
> > implementation of using ROWS FROM returns four rows.
>
> Hmm.  I don't mind changing behavior in that sort of corner case.
> If we're prepared to discard the LCM behavior, this seems at least
> an order of magnitude less likely to be worth worrying about.

I think it's fine, and potentially less confusing.

> Would it be worth detecting SRFs below CASE/COALESCE/etc and throwing
> an error?  It would be easier to sell throwing an error than silently
> changing behavior, I think.

Hm. We could, but I think the new behaviour would actually make sense in
the long run. Interpreting the coalesce to run on the output of the SRF
doesn't seem bad to me.

I found another edgecase, which we need to make a decision about.
'record' returning SRFs can't be transformed easily into a ROWS
FROM. Consider e.g. the following from the regression tests:

create function array_to_set(anyarray) returns setof record as $$
  select i AS "index", $1[i] AS "value" from generate_subscripts($1, 1)  i
$$ language sql strict immutable;

select array_to_set(array['one', 'two']);
┌──┐
│ array_to_set │
├──┤
│ (1,one)  │
│ (2,two)  │
└──┘
(2 rows)

which currently works. That currently can't be modeled as ROWS FROM()
directly, because that desperately wants to return the columns as
columns, which we can't do for 'record' returning things, because they
don't have defined columns.  For composite returning SRFs I've currently
implemented that by generating a ROWS() expression, but that doesn't
work for record.

So it seems like we need some, not necessarily user exposed, way of
making nodeFunctionscan.c return the return value as one datum.  One
way, as suggested by Andrew G. on IRC, was to interpret empty column
definition in ROWS FROM

Re: [HACKERS] Changed SRF in targetlist handling

2016-08-02 Thread Tom Lane

Andres Freund  writes:
> I've an implementation that

> 1) turns all targetlist SRF (tSRF from now on) into ROWS FROM
>expressions. If there's tSRFs in the argument of a tSRF those becomes
>a separate, lateral, ROWS FROM expression.

> 2) If grouping/window functions are present, the entire query is wrapped
>in a subquery RTE, except for the set-returning function. All
>referenced Var|Aggref|GroupingFunc|WindowFunc|Param nodes in the
>original targetlist are made to reference that subquery, which gets a
>TargetEntry for them.

FWIW, I'd be inclined to do the subquery RTE all the time, adding some
optimization fence to ensure it doesn't get folded back.  That fixes
your problem here:

> So far I have one problem without an easy solution: Historically queries
> like
> =# SELECT id, generate_series(1,2) FROM (VALUES(1),(2)) few(id);
> ââââââ¬ââââââââââââââââââ
> â id â generate_series â
> ââââââ¼ââââââââââââââââââ¤
> â  1 â   1 â
> â  1 â   2 â
> â  2 â   1 â
> â  2 â   2 â
> ââââââ´ââââââââââââââââââ
> have preserved the SRF output ordering. But by turning the SRF into a
> ROWS FROM, there's no guarantee that the cross join between "few" and
> generate_series(1,3) above is implemented in that order.


> Besides that I'm structurally wondering whether turning the original
> query into a subquery is the right thing to do. It requires some kind of
> ugly munching of Query->*, and has the above problem.

It does not seem like it should be that hard, certainly no worse than
subquery pullup.  Want to show code?

> An alternative approach would be to do this during parse-analysis, but I
> think that might end up being confusing, because stored rules would
> suddenly have a noticeably different structure, and it'd tie us a lot
> more to the details of that transformation than I'd like.

-1 on that; we do not want this transformation visible in stored rules.

> Besides the removal of the least-common-multiple behaviour of tSRF queries,
> there's some other consequences that using function scans have:
> Previously if a tSRF was never evaluated, it didn't cause the number of
> rows from being increased. E.g.
> SELECT id, COALESCE(1, generate_series(1,2)) FROM (VALUES(1),(2)) few(id);
> only produced two rows.  But using joins means that a simple
> implementation of using ROWS FROM returns four rows.

Hmm.  I don't mind changing behavior in that sort of corner case.
If we're prepared to discard the LCM behavior, this seems at least
an order of magnitude less likely to be worth worrying about.

Having said that, I do seem to recall a bug report about misbehavior when
a SRF was present in just one arm of a CASE statement.  That would have
the same type of behavior as you describe here, and evidently there's at
least one person out there depending on it.

Would it be worth detecting SRFs below CASE/COALESCE/etc and throwing
an error?  It would be easier to sell throwing an error than silently
changing behavior, I think.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-08-01 Thread Andres Freund

On 2016-05-25 16:55:23 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-05-25 15:20:03 -0400, Tom Lane wrote:
> >> We could certainly make a variant behavior in nodeFunctionscan.c that
> >> emulates that, if we feel that being exactly bug-compatible on the point
> >> is actually what we want.  I'm dubious about that though, not least
> >> because I don't think *anyone* actually believes that that behavior isn't
> >> broken.  Did you read my upthread message suggesting assorted compromise
> >> choices?
>
> > You mean 
> > https://www.postgresql.org/message-id/21076.1464034...@sss.pgh.pa.us ?
> > If so, yes.
>
> > If we go with rewriting this into LATERAL, I'd vote for 2.5 (trailed by
> > option 1), that'd keep most of the functionality, and would break
> > visibly rather than invisibly in the cases where not.
>
> 2.5 would be okay with me.
>
> > I guess you're not planning to work on this?
>
> Well, not right now, as it's clearly too late for 9.6.  I might hack on
> it later if nobody beats me to it.

FWIW, as it's blocking my plans for executor related rework (expression
evaluation, batch processing) I started to hack on this.

I've an implementation that

1) turns all targetlist SRF (tSRF from now on) into ROWS FROM
   expressions. If there's tSRFs in the argument of a tSRF those becomes
   a separate, lateral, ROWS FROM expression.

2) If grouping/window functions are present, the entire query is wrapped
   in a subquery RTE, except for the set-returning function. All
   referenced Var|Aggref|GroupingFunc|WindowFunc|Param nodes in the
   original targetlist are made to reference that subquery, which gets a
   TargetEntry for them.

3) If sortClause does *not* reference any tSRFs the sorting is evaluated
   in a subquery, to preserve the output ordering of SRFs in queries
   like
   SELECT id, generate_series(1,3) FROM (VALUES(1),(2)) d(id) ORDER BY id DESC;
   if in contrast sortClause does reference the tSRF output, it's
   evaluated in the outer SRF.

this seems to generally work, and allows to remove considerable amounts
of code.

So far I have one problem without an easy solution: Historically queries
like
=# SELECT id, generate_series(1,2) FROM (VALUES(1),(2)) few(id);
┌┬─┐
│ id │ generate_series │
├┼─┤
│  1 │   1 │
│  1 │   2 │
│  2 │   1 │
│  2 │   2 │
└┴─┘
have preserved the SRF output ordering. But by turning the SRF into a
ROWS FROM, there's no guarantee that the cross join between "few" and
generate_series(1,3) above is implemented in that order. I.e. we can get
something like
┌┬─┐
│ id │ generate_series │
├┼─┤
│  1 │   1 │
│  2 │   1 │
│  1 │   2 │
│  2 │   2 │
└┴─┘
because it's implemented as
┌──┐
│  QUERY PLAN  │
├──┤
│ Nested Loop  (cost=0.00..35.03 rows=2000 width=8)│
│   ->  Function Scan on generate_series  (cost=0.00..10.00 rows=1000 width=4) │
│   ->  Materialize  (cost=0.00..0.04 rows=2 width=4)  │
│ ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=4)  │
└──┘

I right now see no easy and nice-ish way to constrain that.

Besides that I'm structurally wondering whether turning the original
query into a subquery is the right thing to do. It requires some kind of
ugly munching of Query->*, and has the above problem. One alternative
would be to instead perform the necessary magic in grouping_planner(),
by "manually" adding nestloop joins before/after create_ordered_paths()
(depending on SRFs being referenced in the sort clause).  That'd create
plans we'd not have created so far, by layering NestLoop and
FunctionScan nodes above the normal query - that'd allow us to to easily
force the ordering of SRF evaluation.

If we go the subquery route, I'm wondering about where to tackle the
restructuring. So far I'm doing it very early in subquery_planner() -
otherwise the aggregation/sorting/... behaviour is easier to handle.
Perhaps doing it in standard_planner() itself would be better though.
An alternative approach would be to do this during parse-analysis, but I
think that might end up being confusing, because stored rules would
suddenly have a noticeably different structure, and it'd tie us a lot
more to the details of that transformation than I'd like.

Besides the removal of the least-common-multiple behaviour of tSRF queries,
there's some other consequences that using function scans have:
Previously if a tSRF was never evaluated, it didn't cause the number of
rows from being increased. E.g.
SELECT id, C

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-06 Thread Robert Haas

On Mon, Jun 6, 2016 at 3:26 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> ... I guess I'd prefer #2 to #2.5, #2.5 to #3, and #3 to #1.
>> I really don't like #1 much - I think I'd almost rather do nothing.
>
> FWIW, that's about my evaluation of the alternatives as well.  I fear
> that #1 would get a lot of pushback.  If we think that something like
> "LATERAL ROWS FROM STRICT" is worth having on its own merits, then
> doing #2.5 seems worthwhile to me, but otherwise I'm just as happy
> with #2.  David J. seems to feel that throwing an error (as in #2.5)
> rather than silently behaving incompatibly (as in #2) is important,
> but I'm not convinced.  In a green field I think we'd prefer #2 over
> #2.5, so I'd rather go that direction.

Same here.  That behavior is actually potentially quite useful, right?
 Like, you might want to rely on the NULL-extension thing, if it were
documented as behavior you can count on?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-06 Thread David G. Johnston

On Mon, Jun 6, 2016 at 3:26 PM, Tom Lane  wrote:

> Robert Haas  writes:
> > ... I guess I'd prefer #2 to #2.5, #2.5 to #3, and #3 to #1.
> > I really don't like #1 much - I think I'd almost rather do nothing.
>
> FWIW, that's about my evaluation of the alternatives as well.  I fear
> that #1 would get a lot of pushback.  If we think that something like
> "LATERAL ROWS FROM STRICT" is worth having on its own merits, then
> doing #2.5 seems worthwhile to me, but otherwise I'm just as happy
> with #2.  David J. seems to feel that throwing an error (as in #2.5)
> rather than silently behaving incompatibly (as in #2) is important,
> but I'm not convinced.  In a green field I think we'd prefer #2 over
> #2.5, so I'd rather go that direction.
>

I suspect the decision to error or not is a one or two line change in
whatever form the final patch takes.  It seems like approach #2 is
acceptable on a theoretical level which implies there is no desire to make
the existing LCM behavior available post-patch.

Assuming it is simple then everyone will have a chance to make their
opinion known on whether the 2.0 or 2.5 variation is preferable for the
final commit.  If a decision needs to be made sooner due to a design
decision I'd hope the author of the patch would make that known so we can
bring this to resolution at that point instead.

David J.

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-06 Thread Tom Lane

Robert Haas  writes:
> ... I guess I'd prefer #2 to #2.5, #2.5 to #3, and #3 to #1.
> I really don't like #1 much - I think I'd almost rather do nothing.

FWIW, that's about my evaluation of the alternatives as well.  I fear
that #1 would get a lot of pushback.  If we think that something like
"LATERAL ROWS FROM STRICT" is worth having on its own merits, then
doing #2.5 seems worthwhile to me, but otherwise I'm just as happy
with #2.  David J. seems to feel that throwing an error (as in #2.5)
rather than silently behaving incompatibly (as in #2) is important,
but I'm not convinced.  In a green field I think we'd prefer #2 over
#2.5, so I'd rather go that direction.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-06 Thread Robert Haas

On Mon, Jun 6, 2016 at 2:53 PM, Tom Lane  wrote:
> Now, if we decide to try to rewrite tlist SRFs as LATERAL, it would likely
> behoove us to do that rewrite before expanding * not after, so that we can
> eliminate the multiple evaluation of foo() that happens currently.  (That
> makes it a parser problem not a planner problem.)  And maybe we should
> rewrite non-SRF composite-returning functions this way too, because people
> have definitely complained about the extra evaluations in that context.
> But my point here is that lockstep evaluation does have practical use
> when the SRFs are iterating over matching collections of generated rows.
> And that seems like a pretty common use-case.

Yeah, OK.  I'm not terribly opposed to going that way.  I think the
current behavior sucks badly enough - both because the semantics are
bizarre and because it complicates the whole executor for a niche
feature - that it's worth taking a backward compatibility hit to
change it.  I guess I'd prefer #2 to #2.5, #2.5 to #3, and #3 to #1.
I really don't like #1 much - I think I'd almost rather do nothing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-06 Thread Tom Lane

Alvaro Herrera  writes:
> Robert Haas wrote:
>> On Mon, Jun 6, 2016 at 11:50 AM, Tom Lane  wrote:
>>> No, because then you get the cross-product of multiple SRFs, not the
>>> run-in-lockstep behavior.

>> Oh.  I assumed that was the expected behavior.  But, ah, what do I know?

> Lots, I assume -- but in this case, probably next to nothing, just like
> most of us, because what sane person or application would be really
> relying on the wacko historical behavior, in order to generate some
> collective knowledge?  However, I think that it is possible that
> someone, somewhere has two SRFs-in-targetlist that return the same
> number of rows and that the current implementation works fine for them;

Yes.  Run-in-lockstep is an extremely useful behavior, so much so that
we made a LATERAL variant for it.  I do not see a reason to break such
cases in the targetlist.

> My vote is to raise an error in the case of more than one SRF in targetlist.

Note that that risks breaking cases that the user does not think are "more
than one SRF".  Consider this example using a regression-test table:

regression=# create function foo() returns setof int8_tbl as
regression-# 'select * from int8_tbl' language sql;
CREATE FUNCTION
regression=# select foo();
 foo  
--
 (123,456)
 (123,4567890123456789)
 (4567890123456789,123)
 (4567890123456789,4567890123456789)
 (4567890123456789,-4567890123456789)
(5 rows)

regression=# explain verbose select foo();
  QUERY PLAN  
--
 Result  (cost=0.00..5.25 rows=1000 width=32)
   Output: foo()
(2 rows)

regression=# select (foo()).*;
q1|q2 
--+---
  123 |   456
  123 |  4567890123456789
 4567890123456789 |   123
 4567890123456789 |  4567890123456789
 4567890123456789 | -4567890123456789
(5 rows)

regression=# explain verbose select (foo()).*;
  QUERY PLAN  
--
 Result  (cost=0.00..5.50 rows=1000 width=16)
   Output: (foo()).q1, (foo()).q2
(2 rows)

The reason we can get away with this simplistic treatment of
composite-returning SRFs is precisely the run-in-lockstep behavior.
Otherwise the second query would have returned 25 rows.

Now, if we decide to try to rewrite tlist SRFs as LATERAL, it would likely
behoove us to do that rewrite before expanding * not after, so that we can
eliminate the multiple evaluation of foo() that happens currently.  (That
makes it a parser problem not a planner problem.)  And maybe we should
rewrite non-SRF composite-returning functions this way too, because people
have definitely complained about the extra evaluations in that context.
But my point here is that lockstep evaluation does have practical use
when the SRFs are iterating over matching collections of generated rows.
And that seems like a pretty common use-case.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-06 Thread David G. Johnston

On Mon, Jun 6, 2016 at 2:31 PM, Alvaro Herrera 
wrote:

> Robert Haas wrote:
> > On Mon, Jun 6, 2016 at 11:50 AM, Tom Lane  wrote:
> > > Robert Haas  writes:
> > >> On Mon, May 23, 2016 at 4:15 PM, Tom Lane  wrote:
> > >>> 2. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...).  This would
> > >>> have the same behavior as before if the SRFs all return the same
> number
> > >>> of rows, and otherwise would behave differently.
> > >
> > >> I thought the idea was to rewrite it as LATERAL ROWS FROM (srf1()),
> > >> LATERAL ROWS FROM (srf2()), ...
> > >
> > > No, because then you get the cross-product of multiple SRFs, not the
> > > run-in-lockstep behavior.
> >
> > Oh.  I assumed that was the expected behavior.  But, ah, what do I know?
>
> Lots, I assume -- but in this case, probably next to nothing, just like
> most of us, because what sane person or application would be really
> relying on the wacko historical behavior, in order to generate some
> collective knowledge?  However, I think that it is possible that
> someone, somewhere has two SRFs-in-targetlist that return the same
> number of rows and that the current implementation works fine for them;
> if we redefine it to work differently, we would break their application
> silently, which seems a worse problem than breaking it noisily while
> providing an easy way forward (which is to move SRFs to the FROM list)
>
> My vote is to raise an error in the case of more than one SRF in
> targetlist.
>

As long as someone is willing to put in the effort we can make a subset of
these multiple-SRFs-in-targetlist queries work without any change in the
tabular output, though the processing mechanism might change.  Your vote
is essentially #1 up-thread which seems the most draconian.  Assuming a
viable option 2.5 or 3 solution is presented would you vote against it
being committed?  If so I'd like to understand why.  I see #1 as basically
OK only if their are technical barriers we cannot overcome - including
performance.

Link to the definition of the various options Tom proposed:

https://www.postgresql.org/message-id/21076.1464034513%40sss.pgh.pa.us

David J.

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-06 Thread Alvaro Herrera

Robert Haas wrote:
> On Mon, Jun 6, 2016 at 11:50 AM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> On Mon, May 23, 2016 at 4:15 PM, Tom Lane  wrote:
> >>> 2. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...).  This would
> >>> have the same behavior as before if the SRFs all return the same number
> >>> of rows, and otherwise would behave differently.
> >
> >> I thought the idea was to rewrite it as LATERAL ROWS FROM (srf1()),
> >> LATERAL ROWS FROM (srf2()), ...
> >
> > No, because then you get the cross-product of multiple SRFs, not the
> > run-in-lockstep behavior.
> 
> Oh.  I assumed that was the expected behavior.  But, ah, what do I know?

Lots, I assume -- but in this case, probably next to nothing, just like
most of us, because what sane person or application would be really
relying on the wacko historical behavior, in order to generate some
collective knowledge?  However, I think that it is possible that
someone, somewhere has two SRFs-in-targetlist that return the same
number of rows and that the current implementation works fine for them;
if we redefine it to work differently, we would break their application
silently, which seems a worse problem than breaking it noisily while
providing an easy way forward (which is to move SRFs to the FROM list)

My vote is to raise an error in the case of more than one SRF in targetlist.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-06 Thread Robert Haas

On Mon, Jun 6, 2016 at 11:50 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, May 23, 2016 at 4:15 PM, Tom Lane  wrote:
>>> 2. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...).  This would
>>> have the same behavior as before if the SRFs all return the same number
>>> of rows, and otherwise would behave differently.
>
>> I thought the idea was to rewrite it as LATERAL ROWS FROM (srf1()),
>> LATERAL ROWS FROM (srf2()), ...
>
> No, because then you get the cross-product of multiple SRFs, not the
> run-in-lockstep behavior.

Oh.  I assumed that was the expected behavior.  But, ah, what do I know?

>> The rewrite you propose here seems to NULL-pad rows after the first
>> SRF is exhausted:
>
> Yes.  That's why I said it's not compatible if the SRFs don't all return
> the same number of rows.  It seems like a reasonable definition to me
> though, certainly much more reasonable than the current run-until-LCM
> behavior.

I can't argue with that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-06 Thread Tom Lane

"David G. Johnston"  writes:
> If the SRFs return a different number of rows the LCM behavior kicks in and
> you get Robert's second result.

Only if the periods of the SRFs are relatively prime.  That is, neither of
his examples demonstrate the full weirdness of the current behavior; for
that, you need periods that are multiples of each other.  For instance:

SELECT generate_series(1, 2), generate_series(1, 4); 
 generate_series | generate_series 
-+-
   1 |   1
   2 |   2
   1 |   3
   2 |   4
(4 rows)

That doesn't comport with any behavior available from LATERAL.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-06 Thread Vik Fearing

On 06/06/16 18:30, David G. Johnston wrote:
> To clarify, the present behavior is basically a combination of both of
> Robert's results.
> 
> If the SRFs return the same number of rows the first (zippered) result
> is returned without an NULL padding.
> 
> If the SRFs return a different number of rows the LCM behavior kicks in
> and you get Robert's second result.

No.

> SELECT generate_series(1, 4), generate_series(1, 4) ORDER BY 1, 2;
> is the same as
> SELECT * FROM ROWS FROM ( generate_series(1, 4), generate_series(1, 4) );
> 
> BUT
> 
> SELECT generate_series(1, 3), generate_series(1, 4) ORDER BY 1, 2;
> is the same as
> SELECT * FROM ROWS FROM generate_series(1, 3) a, LATERAL ROWS FROM
> generate_series(1, 4) b;

What would you do with:

SELECT generate_series(1, 3), generate_series(1, 6);

?

> Tom's 2.5 proposal basically says we make the former equivalence succeed
> and have the later one fail.
> 
> The rewrite would be unaware of the cardinality of the SRF and so it
> cannot conditionally rewrite the query.  One of the two must be chosen
> and the incompatible behavior turned into an error.


-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-06 Thread David G. Johnston

On Mon, Jun 6, 2016 at 11:50 AM, Tom Lane  wrote:

> Robert Haas  writes:
> > On Mon, May 23, 2016 at 4:15 PM, Tom Lane  wrote:
> >> 2. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...).  This would
> >> have the same behavior as before if the SRFs all return the same number
> >> of rows, and otherwise would behave differently.
>
> > I thought the idea was to rewrite it as LATERAL ROWS FROM (srf1()),
> > LATERAL ROWS FROM (srf2()), ...
>
> No, because then you get the cross-product of multiple SRFs, not the
> run-in-lockstep behavior.
>
> > The rewrite you propose here seems to NULL-pad rows after the first
> > SRF is exhausted:
>
> Yes.  That's why I said it's not compatible if the SRFs don't all return
> the same number of rows.  It seems like a reasonable definition to me
> though, certainly much more reasonable than the current run-until-LCM
> behavior.
>

IOW, this is why this mode query has to fail.

>
> > The latter is how I'd expect SRF-in-targetlist to work.
>
> That's not even close to how it works now.  It would break *every*
> existing application that has multiple SRFs in the tlist, not just
> the ones whose SRFs return different numbers of rows.  And I'm not
> convinced that it's a more useful behavior.
>

To clarify, the present behavior is basically a combination of both of
Robert's results.

If the SRFs return the same number of rows the first (zippered) result is
returned without an NULL padding.

If the SRFs return a different number of rows the LCM behavior kicks in and
you get Robert's second result.

SELECT generate_series(1, 4), generate_series(1, 4) ORDER BY 1, 2;
is the same as
SELECT * FROM ROWS FROM ( generate_series(1, 4), generate_series(1, 4) );

BUT

SELECT generate_series(1, 3), generate_series(1, 4) ORDER BY 1, 2;
is the same as
SELECT * FROM ROWS FROM generate_series(1, 3) a, LATERAL ROWS FROM
generate_series(1, 4) b;

Tom's 2.5 proposal basically says we make the former equivalence succeed
and have the later one fail.

The rewrite would be unaware of the cardinality of the SRF and so it cannot
conditionally rewrite the query.  One of the two must be chosen and the
incompatible behavior turned into an error.

David J.

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-06 Thread Tom Lane

Robert Haas  writes:
> On Mon, May 23, 2016 at 4:15 PM, Tom Lane  wrote:
>> 2. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...).  This would
>> have the same behavior as before if the SRFs all return the same number
>> of rows, and otherwise would behave differently.

> I thought the idea was to rewrite it as LATERAL ROWS FROM (srf1()),
> LATERAL ROWS FROM (srf2()), ...

No, because then you get the cross-product of multiple SRFs, not the
run-in-lockstep behavior.

> The rewrite you propose here seems to NULL-pad rows after the first
> SRF is exhausted:

Yes.  That's why I said it's not compatible if the SRFs don't all return
the same number of rows.  It seems like a reasonable definition to me
though, certainly much more reasonable than the current run-until-LCM
behavior.

> The latter is how I'd expect SRF-in-targetlist to work.

That's not even close to how it works now.  It would break *every*
existing application that has multiple SRFs in the tlist, not just
the ones whose SRFs return different numbers of rows.  And I'm not
convinced that it's a more useful behavior.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-06 Thread Robert Haas

On Mon, May 23, 2016 at 4:15 PM, Tom Lane  wrote:
> We should consider single and multiple SRFs in a targetlist as distinct
> use-cases; only the latter has got weird properties.
>
> There are several things we could potentially do with multiple SRFs in
> the same targetlist.  In increasing order of backwards compatibility and
> effort required:
>
> 1. Throw error if there's more than one SRF.
>
> 2. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...).  This would
> have the same behavior as before if the SRFs all return the same number
> of rows, and otherwise would behave differently.

I thought the idea was to rewrite it as LATERAL ROWS FROM (srf1()),
LATERAL ROWS FROM (srf2()), ...

The rewrite you propose here seems to NULL-pad rows after the first
SRF is exhausted:

rhaas=# select * from dual, lateral rows from (generate_series(1,3),
generate_series(1,4));
   x   | generate_series | generate_series
---+-+-
 dummy |   1 |   1
 dummy |   2 |   2
 dummy |   3 |   3
 dummy | |   4
(4 rows)

...whereas with a separate LATERAL clause for each row you get this:

rhaas=# select * from dual, lateral rows from (generate_series(1,3))
a, lateral rows from (generate_series(1,4)) b;
   x   | a | b
---+---+---
 dummy | 1 | 1
 dummy | 1 | 2
 dummy | 1 | 3
 dummy | 1 | 4
 dummy | 2 | 1
 dummy | 2 | 2
 dummy | 2 | 3
 dummy | 2 | 4
 dummy | 3 | 1
 dummy | 3 | 2
 dummy | 3 | 3
 dummy | 3 | 4
(12 rows)

The latter is how I'd expect SRF-in-targetlist to work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-03 Thread Tom Lane

"David G. Johnston"  writes:
> On Friday, June 3, 2016, Tom Lane  > wrote:
>> Merlin Moncure  writes:
>>> another interesting case today is:
>>> create sequence s;
>>> select generate_series(1,nextval('s')), generate_series(1,nextval('s'));

> If taking the 2.5 approach this one would fail as opposed to being
> rewritten.

Well, it'd be rewritten and then would fail at runtime because of the SRF
calls not producing the same number of rows.  But even option #3 would not
be strictly bug-compatible because it would (I imagine) evaluate the
arguments of each SRF only once.  The reason this case doesn't terminate
in the current implementation is that it re-evaluates the SRF arguments
each time we start a SRF over.  That's just weird ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Changed SRF in targetlist handling

2016-06-03 Thread David G. Johnston

On Friday, June 3, 2016, Tom Lane > wrote:

> Merlin Moncure  writes:
> > On Wed, May 25, 2016 at 3:55 PM, Tom Lane  wrote:
> >> Andres Freund  writes:
> >>> If we go with rewriting this into LATERAL, I'd vote for 2.5 (trailed by
> >>> option 1), that'd keep most of the functionality, and would break
> >>> visibly rather than invisibly in the cases where not.
> >> 2.5 would be okay with me.
>
> > Curious if this approach will also rewrite:
> > select generate_series(1,generate_series(1,3)) s;
> > ...into
> > select s from generate_series(1,3) x cross join lateral
> generate_series(1,x) s;
>
> Yeah, that would be the idea.
>
>
Ok...  It's only a single srf as far as the outer query is concerned so
while it is odd the behavior is well defined and can be transformed while
giving the same result.


> > another interesting case today is:
> > create sequence s;
> > select generate_series(1,nextval('s')), generate_series(1,nextval('s'));
>
> > this statement never terminates.  a lateral rewrite of this query
> > would always terminate with much better defined and well understood
> > behaviors -- this is good.
>
> Interesting example demonstrating that 100% bug compatibility is not
> possible.  But as you say, most people would probably prefer the other
> behavior anyhow.
>
>
If taking the 2.5 approach this one would fail as opposed to being
rewritten.

This could be an exception to the policy in #3 and would be ok in #2.  It
would fail in #1.

Given the apparent general consensus for 2.5 and the lack of working field
versions of this form the error seems like a no brainer.

David J.

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-03 Thread Tom Lane

Merlin Moncure  writes:
> On Wed, May 25, 2016 at 3:55 PM, Tom Lane  wrote:
>> Andres Freund  writes:
>>> If we go with rewriting this into LATERAL, I'd vote for 2.5 (trailed by
>>> option 1), that'd keep most of the functionality, and would break
>>> visibly rather than invisibly in the cases where not.
>> 2.5 would be okay with me.

> Curious if this approach will also rewrite:
> select generate_series(1,generate_series(1,3)) s;
> ...into
> select s from generate_series(1,3) x cross join lateral generate_series(1,x) 
> s;

Yeah, that would be the idea.

> another interesting case today is:
> create sequence s;
> select generate_series(1,nextval('s')), generate_series(1,nextval('s'));

> this statement never terminates.  a lateral rewrite of this query
> would always terminate with much better defined and well understood
> behaviors -- this is good.

Interesting example demonstrating that 100% bug compatibility is not
possible.  But as you say, most people would probably prefer the other
behavior anyhow.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-06-03 Thread Merlin Moncure

On Wed, May 25, 2016 at 3:55 PM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2016-05-25 15:20:03 -0400, Tom Lane wrote:
>>> We could certainly make a variant behavior in nodeFunctionscan.c that
>>> emulates that, if we feel that being exactly bug-compatible on the point
>>> is actually what we want.  I'm dubious about that though, not least
>>> because I don't think *anyone* actually believes that that behavior isn't
>>> broken.  Did you read my upthread message suggesting assorted compromise
>>> choices?
>
>> You mean 
>> https://www.postgresql.org/message-id/21076.1464034...@sss.pgh.pa.us ?
>> If so, yes.
>
>> If we go with rewriting this into LATERAL, I'd vote for 2.5 (trailed by
>> option 1), that'd keep most of the functionality, and would break
>> visibly rather than invisibly in the cases where not.
>
> 2.5 would be okay with me.
>
>> I guess you're not planning to work on this?
>
> Well, not right now, as it's clearly too late for 9.6.  I might hack on
> it later if nobody beats me to it.

Curious if this approach will also rewrite:
select generate_series(1,generate_series(1,3)) s;

...into
select s from generate_series(1,3) x cross join lateral generate_series(1,x) s;

another interesting case today is:
create sequence s;
select generate_series(1,nextval('s')), generate_series(1,nextval('s'));

this statement never terminates.  a lateral rewrite of this query
would always terminate with much better defined and well understood
behaviors -- this is good.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-25 Thread Tom Lane

Andres Freund  writes:
> On 2016-05-25 15:20:03 -0400, Tom Lane wrote:
>> We could certainly make a variant behavior in nodeFunctionscan.c that
>> emulates that, if we feel that being exactly bug-compatible on the point
>> is actually what we want.  I'm dubious about that though, not least
>> because I don't think *anyone* actually believes that that behavior isn't
>> broken.  Did you read my upthread message suggesting assorted compromise
>> choices?

> You mean https://www.postgresql.org/message-id/21076.1464034...@sss.pgh.pa.us 
> ?
> If so, yes.

> If we go with rewriting this into LATERAL, I'd vote for 2.5 (trailed by
> option 1), that'd keep most of the functionality, and would break
> visibly rather than invisibly in the cases where not.

2.5 would be okay with me.

> I guess you're not planning to work on this?

Well, not right now, as it's clearly too late for 9.6.  I might hack on
it later if nobody beats me to it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-25 Thread Andres Freund

On 2016-05-25 15:20:03 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-05-25 15:02:23 -0400, Tom Lane wrote:
> >> [ shrug... ]  That seems like it's morally equivalent to (but uglier than)
> >> what I wanted to do, which is to teach the planner to rewrite the query to
> >> put the SRFs into a lateral FROM item.  Splitting the tlist into two
> >> levels will work out to be exactly the same rewriting problem.
> 
> > I think that depends on how bug compatible we want to be. It seems
> > harder to get the (rather odd!) lockstep iteration behaviour between two
> > SRFS with the LATERAL approach?
> 
> We could certainly make a variant behavior in nodeFunctionscan.c that
> emulates that, if we feel that being exactly bug-compatible on the point
> is actually what we want.  I'm dubious about that though, not least
> because I don't think *anyone* actually believes that that behavior isn't
> broken.  Did you read my upthread message suggesting assorted compromise
> choices?

You mean https://www.postgresql.org/message-id/21076.1464034...@sss.pgh.pa.us ?
If so, yes.

If we go with rewriting this into LATERAL, I'd vote for 2.5 (trailed by
option 1), that'd keep most of the functionality, and would break
visibly rather than invisibly in the cases where not.

I guess you're not planning to work on this?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-25 Thread Tom Lane

Andres Freund  writes:
> On 2016-05-25 15:02:23 -0400, Tom Lane wrote:
>> [ shrug... ]  That seems like it's morally equivalent to (but uglier than)
>> what I wanted to do, which is to teach the planner to rewrite the query to
>> put the SRFs into a lateral FROM item.  Splitting the tlist into two
>> levels will work out to be exactly the same rewriting problem.

> I think that depends on how bug compatible we want to be. It seems
> harder to get the (rather odd!) lockstep iteration behaviour between two
> SRFS with the LATERAL approach?

We could certainly make a variant behavior in nodeFunctionscan.c that
emulates that, if we feel that being exactly bug-compatible on the point
is actually what we want.  I'm dubious about that though, not least
because I don't think *anyone* actually believes that that behavior isn't
broken.  Did you read my upthread message suggesting assorted compromise
choices?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-25 Thread Andres Freund

On 2016-05-25 15:02:23 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-05-23 13:10:29 -0400, Tom Lane wrote:
> >> Would that not lead to, in effect, duplicating all of execQual.c?  The new
> >> executor node would still have to be prepared to process all expression
> >> node types.
> 
> > I don't think it necessarily has to. ISTM that if we add a version of
> > ExecProject()/ExecTargetList() that continues returning multiple rows,
> > we can make the knowledge about the one type of expression we allow to
> > return multiple rows.  That'd require a bit of uglyness to implement
> > stuff like
> > SELECT generate_series(1, 2)::text, generate_series(1, 2) * 5;
> > etc. It seems we'd basically have to do one projection step for the
> > SRFs, and then another for the rest.  I'm inclined to think that's
> > acceptable to get rid of a lot of the related uglyness.
> 
> [ shrug... ]  That seems like it's morally equivalent to (but uglier than)
> what I wanted to do, which is to teach the planner to rewrite the query to
> put the SRFs into a lateral FROM item.  Splitting the tlist into two
> levels will work out to be exactly the same rewriting problem.

I think that depends on how bug compatible we want to be. It seems
harder to get the (rather odd!) lockstep iteration behaviour between two
SRFS with the LATERAL approach?

tpch[6098][1]=# SELECT generate_series(1, 3), generate_series(1,3);
┌─┬─┐
│ generate_series │ generate_series │
├─┼─┤
│   1 │   1 │
│   2 │   2 │
│   3 │   3 │
└─┴─┘
(3 rows)

tpch[6098][1]=# SELECT generate_series(1, 3), generate_series(1,4);
┌─┬─┐
│ generate_series │ generate_series │
├─┼─┤
│   1 │   1 │
│   2 │   2 │
│   3 │   3 │
│   1 │   4 │
│   2 │   1 │
│   3 │   2 │
│   1 │   3 │
│   2 │   4 │
│   3 │   1 │
│   1 │   2 │
│   2 │   3 │
│   3 │   4 │
└─┴─┘
(12 rows)


Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-25 Thread Tom Lane

Andres Freund  writes:
> On 2016-05-23 13:10:29 -0400, Tom Lane wrote:
>> Would that not lead to, in effect, duplicating all of execQual.c?  The new
>> executor node would still have to be prepared to process all expression
>> node types.

> I don't think it necessarily has to. ISTM that if we add a version of
> ExecProject()/ExecTargetList() that continues returning multiple rows,
> we can make the knowledge about the one type of expression we allow to
> return multiple rows.  That'd require a bit of uglyness to implement
> stuff like
> SELECT generate_series(1, 2)::text, generate_series(1, 2) * 5;
> etc. It seems we'd basically have to do one projection step for the
> SRFs, and then another for the rest.  I'm inclined to think that's
> acceptable to get rid of a lot of the related uglyness.

[ shrug... ]  That seems like it's morally equivalent to (but uglier than)
what I wanted to do, which is to teach the planner to rewrite the query to
put the SRFs into a lateral FROM item.  Splitting the tlist into two
levels will work out to be exactly the same rewriting problem.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-25 Thread Andres Freund

On 2016-05-23 13:10:29 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > One idea I circulated was to fix that by interjecting a special executor
> > node to process SRF containing targetlists (reusing Result possibly?).
> > That'd allow to remove the isDone argument from ExecEval*/ExecProject*
> > and get rid of ps_TupFromTlist which is fairly ugly.
> 
> Would that not lead to, in effect, duplicating all of execQual.c?  The new
> executor node would still have to be prepared to process all expression
> node types.

I don't think it necessarily has to. ISTM that if we add a version of
ExecProject()/ExecTargetList() that continues returning multiple rows,
we can make the knowledge about the one type of expression we allow to
return multiple rows.  That'd require a bit of uglyness to implement
stuff like
SELECT generate_series(1, 2)::text, generate_series(1, 2) * 5;
etc. It seems we'd basically have to do one projection step for the
SRFs, and then another for the rest.  I'm inclined to think that's
acceptable to get rid of a lot of the related uglyness.

> > One issue with removing targetlist SRFs is that they're currently
> > considerably faster than SRFs in FROM:
> 
> I suspect that depends greatly on your test case.  But in any case
> we could put more effort into optimizing nodeFunctionscan.

I doubt you'll find cases where it's significantly the other way round
for percall SRFs. The fundamental issue is that targetlist SRFs don't
have to spill to a tuplestore, whereas nodeFunctionscan ones have to
(even if they're percall).

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread Joe Conway

On 05/23/2016 02:37 PM, David G. Johnston wrote:
> But then I don't get Joe's point - if its an implementation detail why
> should it matter if rewriting the SRF-in-tlist to be laterals changes
> execution from a serial to an interleaved implementation.  Plus, Joe's
> claim: "the capability to pipeline results is still only available in
> the target list", and yours above are at odds since you claim the
> rewritten behavior is the same today.  Is there a disconnect in
> knowledge or are you talking about different things?

Unless there have been recent changes which I missed, ValuePerCall SRFs
are still run to completion in one go, when executed in the FROM clause,
but they project one-row-at-a-time in the target list. If your SRF
returns many-many rows, the problem with the former case is that the
entire thing has to be materialized in memory.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

signature.asc
Description: OpenPGP digital signature

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread David G. Johnston

On Mon, May 23, 2016 at 4:42 PM, Tom Lane  wrote:

> "David G. Johnston"  writes:
> > On Mon, May 23, 2016 at 4:24 PM, Alvaro Herrera <
> alvhe...@2ndquadrant.com>
> > wrote:
> >> Ah, so that's what "pipeline results" mean!  I hadn't gotten that.  I
> >> agree; Abhijit had a patch or a plan for this, a long time ago ...
>
> > Is this sidebar strictly an implementation detail, not user visible?
>
> Hmm.  It could be visible in the sense that the execution of multiple
> functions in one ROWS FROM() construct could be interleaved, while
> (I think) the current implementation runs each one to completion
> serially.  But if you're writing code that assumes that, I think you
> should not be very surprised when we break it.  In any case, that
> would not affect the proposed translation for SRFs-in-tlist, since
> those have that behavior today.
>

Thanks

Sounds like "zipper results" would be a better term for it...but, yes, if
that's the general context it falls into implementation from my
perspective.

But then I don't get Joe's point - if its an implementation detail why
should it matter if rewriting the SRF-in-tlist to be laterals changes
execution from a serial to an interleaved implementation.  Plus, Joe's
claim: "the capability to pipeline results is still only available in the
target list", and yours above are at odds since you claim the rewritten
behavior is the same today.  Is there a disconnect in knowledge or are you
talking about different things?

David J.

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread Tom Lane

"David G. Johnston"  writes:
> On Mon, May 23, 2016 at 4:24 PM, Alvaro Herrera 
> wrote:
>> Ah, so that's what "pipeline results" mean!  I hadn't gotten that.  I
>> agree; Abhijit had a patch or a plan for this, a long time ago ...

> âIs this sidebar strictly an implementation detail, not user visible?

Hmm.  It could be visible in the sense that the execution of multiple
functions in one ROWS FROM() construct could be interleaved, while
(I think) the current implementation runs each one to completion
serially.  But if you're writing code that assumes that, I think you
should not be very surprised when we break it.  In any case, that
would not affect the proposed translation for SRFs-in-tlist, since
those have that behavior today.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread David G. Johnston

On Mon, May 23, 2016 at 4:24 PM, Alvaro Herrera 
wrote:

> Tom Lane wrote:
> > Joe Conway  writes:
>
> > > I'll also note that, unless I missed something, we also have to
> consider
> > > that the capability to pipeline results is still only available in the
> > > target list.
> >
> > Yes, we would definitely want to improve nodeFunctionscan.c to perform
> > better for ValuePerCall SRFs.  But that has value independently of this.
>
> Ah, so that's what "pipeline results" mean!  I hadn't gotten that.  I
> agree; Abhijit had a patch or a plan for this, a long time ago ...
>
>
Is this sidebar strictly an implementation detail, not user visible?

David J.

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread David G. Johnston

On Mon, May 23, 2016 at 4:15 PM, Tom Lane  wrote:

> Merlin Moncure  writes:
> > On Mon, May 23, 2016 at 2:13 PM, David Fetter  wrote:
> >> On Mon, May 23, 2016 at 01:28:11PM -0500, Merlin Moncure wrote:
> >>> +1 on removing LCM.
>
> >> As a green field project, that would make total sense.  As a thing
> >> decades in, it's not clear to me that that would break less stuff or
> >> break it worse than simply disallowing SRFs in the target list, which
> >> has been rejected on bugward-compatibility grounds.  I suspect it
> >> would be even worse because disallowing SRFs in target lists would at
> >> least be obvious and localized when it broke code.
>
> > If I'm reading this correctly, it sounds to me like you are making the
> > case that removing target list SRF completely would somehow cause less
> > breakage than say, rewriting it to a LATERAL based implementation for
> > example.  With more than a little forbearance, let's just say I don't
> > agree.
>
> We should consider single and multiple SRFs in a targetlist as distinct
> use-cases; only the latter has got weird properties.
>
> There are several things we could potentially do with multiple SRFs in
> the same targetlist.  In increasing order of backwards compatibility and
> effort required:
>
> 1. Throw error if there's more than one SRF.
>
> 2. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...).  This would
> have the same behavior as before if the SRFs all return the same number
> of rows, and otherwise would behave differently.
>
> 3. Rewrite into some other construct that still ends up as a FunctionScan
> RTE node, but has the old LCM behavior if the SRFs produce different
> numbers of rows.  (Perhaps we would not need to expose this construct
> as something directly SQL-visible.)
>
> It's certainly arguable that the common use-cases for SRF-in-tlist
> don't have more than one SRF per tlist, and thus that implementing #1
> would be an appropriate amount of effort.  It's worth noting also that
> the LCM behavior has been repeatedly reported as a bug, and therefore
> that if we do #3 we'll be expending very substantial effort to be
> literally bug-compatible with ancient behavior that no one in the
> current development group thinks is well-designed.  As far as #2 goes,
> it would have the advantage that code depending on the same-number-of-
> rows case would continue to work as before.  David has a point that it
> would silently break application code that's actually depending on the
> LCM behavior, but how much of that is there likely to be, really?
>
> [ reflects a bit... ]  I guess there is room for an option 2-and-a-half:
>
> 2.5. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...), but decorate
> the FunctionScan RTE to tell the executor to throw an error if the SRFs
> don't all return the same number of rows, rather than silently
> null-padding.  This would have the same behavior as before for the sane
> case, and would be very not-silent about cases where apps actually invoked
> the LCM behavior.  Again, we wouldn't necessarily have to expose such an
> option at the SQL level.  (Though it strikes me that such a restriction
> could have value in its own right, analogous to the STRICT options that
> we've invented in some other places to allow insisting on the expected
> numbers of rows being returned.  ROWS FROM STRICT (srf1(), srf2(), ...),
> anybody?)
>

I'd let the engineers decide between 1, 2.5, and 3 - but if we accomplish
our goals while implementing #3 I'd say that would be the best outcome for
the community as whole.

We don't have the luxury of providing a safe-usage mode where people
writing new queries get the error but pre-existing queries are considered
OK.  We will have to rely upon education and deal with the occasional bug
report but our long-time customers, even if only a minority would be
affected, will appreciate the effort taken to not break code that has been
working for a long time.

The minority is likely small enough to at least make options 1 and 2.5
viable though I'd think we make an effort to avoid #1.

David J.

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread Alvaro Herrera

Tom Lane wrote:
> Joe Conway  writes:

> > I'll also note that, unless I missed something, we also have to consider
> > that the capability to pipeline results is still only available in the
> > target list.
> 
> Yes, we would definitely want to improve nodeFunctionscan.c to perform
> better for ValuePerCall SRFs.  But that has value independently of this.

Ah, so that's what "pipeline results" mean!  I hadn't gotten that.  I
agree; Abhijit had a patch or a plan for this, a long time ago ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread Tom Lane

Joe Conway  writes:
> I would be in favor of rewriting it to a LATERAL, but that would not be
> backwards compatible entirely either IIUC.

It could be made so, I think, but it may be more trouble than it's worth;
see my previous message.

> I'll also note that, unless I missed something, we also have to consider
> that the capability to pipeline results is still only available in the
> target list.

Yes, we would definitely want to improve nodeFunctionscan.c to perform
better for ValuePerCall SRFs.  But that has value independently of this.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread David G. Johnston

On Mon, May 23, 2016 at 4:05 PM, David Fetter  wrote:

> On Mon, May 23, 2016 at 02:39:54PM -0500, Merlin Moncure wrote:
> > On Mon, May 23, 2016 at 2:13 PM, David Fetter  wrote:
> > > On Mon, May 23, 2016 at 01:28:11PM -0500, Merlin Moncure wrote:
> > >> On Mon, May 23, 2016 at 12:10 PM, Tom Lane  wrote:
> > >> > Andres Freund  writes:
> > >> >> discussing executor performance with a number of people at pgcon,
> > >> >> several hackers - me included - complained about the additional
> > >> >> complexity, both code and runtime, required to handle SRFs in the
> target
> > >> >> list.
> > >> >
> > >> > Yeah, this has been an annoyance for a long time.
> > >> >
> > >> >> One idea I circulated was to fix that by interjecting a special
> executor
> > >> >> node to process SRF containing targetlists (reusing Result
> possibly?).
> > >> >> That'd allow to remove the isDone argument from
> ExecEval*/ExecProject*
> > >> >> and get rid of ps_TupFromTlist which is fairly ugly.
> > >> >
> > >> > Would that not lead to, in effect, duplicating all of execQual.c?
> The new
> > >> > executor node would still have to be prepared to process all
> expression
> > >> > node types.
> > >> >
> > >> >> Robert suggested - IIRC mentioning previous on-list discussion - to
> > >> >> instead rewrite targetlist SRFs into lateral joins. My gut feeling
> is
> > >> >> that that'd be a larger undertaking, with significant semantics
> changes.
> > >> >
> > >> > Yes, this was discussed on-list awhile back (I see David found a
> reference
> > >> > already).  I think it's feasible, although we'd first have to agree
> > >> > whether we want to remain bug-compatible with the old
> > >> > least-common-multiple-of-the-periods behavior.  I would vote for
> not,
> > >> > but it's certainly a debatable thing.
> > >>
> > >> +1 on removing LCM.
> > >
> > > As a green field project, that would make total sense.  As a thing
> > > decades in, it's not clear to me that that would break less stuff or
> > > break it worse than simply disallowing SRFs in the target list, which
> > > has been rejected on bugward-compatibility grounds.  I suspect it
> > > would be even worse because disallowing SRFs in target lists would at
> > > least be obvious and localized when it broke code.
> >
> > If I'm reading this correctly, it sounds to me like you are making the
> > case that removing target list SRF completely would somehow cause less
> > breakage than say, rewriting it to a LATERAL based implementation for
> > example.
>
> Yes.
>
> Making SRFs in target lists throw an error is a thing that will be
> pretty straightforward to deal with in extant code bases, whatever
> size of pain in the neck it might be.  The line of code that caused
> the error would be very clear, and the fix would be very obvious.
>
> Making their behavior different in some way that throws no warnings is
> guaranteed to cause subtle and hard to track bugs in extant code
> bases.


I'm advocating that if a presently allowed SRF-in-target-list is allowed
to remain it executes using the same semantics it has today.  In all other
cases, including LCM, if the present behavior is undesirable to maintain we
throw an error.  I'd hope that such an error can be written in such a way
as to name the offending function or functions.

If the user of a complex query doesn't want to expend the effort to locate
the specific instance of SRF that is in violation they will still have the
option to rewrite all of their uses in that particular query.

David J.

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread Tom Lane

Merlin Moncure  writes:
> On Mon, May 23, 2016 at 2:13 PM, David Fetter  wrote:
>> On Mon, May 23, 2016 at 01:28:11PM -0500, Merlin Moncure wrote:
>>> +1 on removing LCM.

>> As a green field project, that would make total sense.  As a thing
>> decades in, it's not clear to me that that would break less stuff or
>> break it worse than simply disallowing SRFs in the target list, which
>> has been rejected on bugward-compatibility grounds.  I suspect it
>> would be even worse because disallowing SRFs in target lists would at
>> least be obvious and localized when it broke code.

> If I'm reading this correctly, it sounds to me like you are making the
> case that removing target list SRF completely would somehow cause less
> breakage than say, rewriting it to a LATERAL based implementation for
> example.  With more than a little forbearance, let's just say I don't
> agree.

We should consider single and multiple SRFs in a targetlist as distinct
use-cases; only the latter has got weird properties.

There are several things we could potentially do with multiple SRFs in
the same targetlist.  In increasing order of backwards compatibility and
effort required:

1. Throw error if there's more than one SRF.

2. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...).  This would
have the same behavior as before if the SRFs all return the same number
of rows, and otherwise would behave differently.

3. Rewrite into some other construct that still ends up as a FunctionScan
RTE node, but has the old LCM behavior if the SRFs produce different
numbers of rows.  (Perhaps we would not need to expose this construct
as something directly SQL-visible.)

It's certainly arguable that the common use-cases for SRF-in-tlist
don't have more than one SRF per tlist, and thus that implementing #1
would be an appropriate amount of effort.  It's worth noting also that
the LCM behavior has been repeatedly reported as a bug, and therefore
that if we do #3 we'll be expending very substantial effort to be
literally bug-compatible with ancient behavior that no one in the
current development group thinks is well-designed.  As far as #2 goes,
it would have the advantage that code depending on the same-number-of-
rows case would continue to work as before.  David has a point that it
would silently break application code that's actually depending on the
LCM behavior, but how much of that is there likely to be, really?

[ reflects a bit... ]  I guess there is room for an option 2-and-a-half:

2.5. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...), but decorate
the FunctionScan RTE to tell the executor to throw an error if the SRFs
don't all return the same number of rows, rather than silently
null-padding.  This would have the same behavior as before for the sane
case, and would be very not-silent about cases where apps actually invoked
the LCM behavior.  Again, we wouldn't necessarily have to expose such an
option at the SQL level.  (Though it strikes me that such a restriction
could have value in its own right, analogous to the STRICT options that
we've invented in some other places to allow insisting on the expected
numbers of rows being returned.  ROWS FROM STRICT (srf1(), srf2(), ...),
anybody?)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread David Fetter

On Mon, May 23, 2016 at 02:39:54PM -0500, Merlin Moncure wrote:
> On Mon, May 23, 2016 at 2:13 PM, David Fetter  wrote:
> > On Mon, May 23, 2016 at 01:28:11PM -0500, Merlin Moncure wrote:
> >> On Mon, May 23, 2016 at 12:10 PM, Tom Lane  wrote:
> >> > Andres Freund  writes:
> >> >> discussing executor performance with a number of people at pgcon,
> >> >> several hackers - me included - complained about the additional
> >> >> complexity, both code and runtime, required to handle SRFs in the target
> >> >> list.
> >> >
> >> > Yeah, this has been an annoyance for a long time.
> >> >
> >> >> One idea I circulated was to fix that by interjecting a special executor
> >> >> node to process SRF containing targetlists (reusing Result possibly?).
> >> >> That'd allow to remove the isDone argument from ExecEval*/ExecProject*
> >> >> and get rid of ps_TupFromTlist which is fairly ugly.
> >> >
> >> > Would that not lead to, in effect, duplicating all of execQual.c?  The 
> >> > new
> >> > executor node would still have to be prepared to process all expression
> >> > node types.
> >> >
> >> >> Robert suggested - IIRC mentioning previous on-list discussion - to
> >> >> instead rewrite targetlist SRFs into lateral joins. My gut feeling is
> >> >> that that'd be a larger undertaking, with significant semantics changes.
> >> >
> >> > Yes, this was discussed on-list awhile back (I see David found a 
> >> > reference
> >> > already).  I think it's feasible, although we'd first have to agree
> >> > whether we want to remain bug-compatible with the old
> >> > least-common-multiple-of-the-periods behavior.  I would vote for not,
> >> > but it's certainly a debatable thing.
> >>
> >> +1 on removing LCM.
> >
> > As a green field project, that would make total sense.  As a thing
> > decades in, it's not clear to me that that would break less stuff or
> > break it worse than simply disallowing SRFs in the target list, which
> > has been rejected on bugward-compatibility grounds.  I suspect it
> > would be even worse because disallowing SRFs in target lists would at
> > least be obvious and localized when it broke code.
> 
> If I'm reading this correctly, it sounds to me like you are making the
> case that removing target list SRF completely would somehow cause less
> breakage than say, rewriting it to a LATERAL based implementation for
> example.

Yes.

Making SRFs in target lists throw an error is a thing that will be
pretty straightforward to deal with in extant code bases, whatever
size of pain in the neck it might be.  The line of code that caused
the error would be very clear, and the fix would be very obvious.

Making their behavior different in some way that throws no warnings is
guaranteed to cause subtle and hard to track bugs in extant code
bases.  We lost not a few existing users when we caused similar
knock-ons in 8.3 by removing automated casts to text.

I am no longer advocating for removing the functionality.  I am just
pointing out that the knock-on effects of changing the functionality
may well cause more pain than the ones from removing it entirely.

> With more than a little forbearance, let's just say I don't agree.

If you'd be so kind as to explain your reasons, I think we'd all
benefit.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread Joe Conway

On 05/23/2016 12:39 PM, Merlin Moncure wrote:
> On Mon, May 23, 2016 at 2:13 PM, David Fetter  wrote:
>> On Mon, May 23, 2016 at 01:28:11PM -0500, Merlin Moncure wrote:
>>> On Mon, May 23, 2016 at 12:10 PM, Tom Lane  wrote:
 Andres Freund  writes:
> discussing executor performance with a number of people at pgcon,
> several hackers - me included - complained about the additional
> complexity, both code and runtime, required to handle SRFs in the target
> list.

 Yeah, this has been an annoyance for a long time.

> One idea I circulated was to fix that by interjecting a special executor
> node to process SRF containing targetlists (reusing Result possibly?).
> That'd allow to remove the isDone argument from ExecEval*/ExecProject*
> and get rid of ps_TupFromTlist which is fairly ugly.

 Would that not lead to, in effect, duplicating all of execQual.c?  The new
 executor node would still have to be prepared to process all expression
 node types.

> Robert suggested - IIRC mentioning previous on-list discussion - to
> instead rewrite targetlist SRFs into lateral joins. My gut feeling is
> that that'd be a larger undertaking, with significant semantics changes.

 Yes, this was discussed on-list awhile back (I see David found a reference
 already).  I think it's feasible, although we'd first have to agree
 whether we want to remain bug-compatible with the old
 least-common-multiple-of-the-periods behavior.  I would vote for not,
 but it's certainly a debatable thing.
>>>
>>> +1 on removing LCM.
>>
>> As a green field project, that would make total sense.  As a thing
>> decades in, it's not clear to me that that would break less stuff or
>> break it worse than simply disallowing SRFs in the target list, which
>> has been rejected on bugward-compatibility grounds.  I suspect it
>> would be even worse because disallowing SRFs in target lists would at
>> least be obvious and localized when it broke code.
> 
> If I'm reading this correctly, it sounds to me like you are making the
> case that removing target list SRF completely would somehow cause less
> breakage than say, rewriting it to a LATERAL based implementation for
> example.  With more than a little forbearance, let's just say I don't
> agree.

I'm not necessarily saying that we should totally remove target list
SRFs, but I will point out it has been deprecated ever since SRFs were
first introduced:

http://www.postgresql.org/docs/7.3/static/xfunc-sql.html

  "Currently, functions returning sets may also be called in the target
   list of a SELECT query. For each row that the SELECT generates by
   itself, the function returning set is invoked, and an output row is
   generated for each element of the function's result set. Note,
   however, that this capability is deprecated and may be removed in
   future releases."

I would be in favor of rewriting it to a LATERAL, but that would not be
backwards compatible entirely either IIUC.

I'll also note that, unless I missed something, we also have to consider
that the capability to pipeline results is still only available in the
target list.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread Merlin Moncure

On Mon, May 23, 2016 at 2:13 PM, David Fetter  wrote:
> On Mon, May 23, 2016 at 01:28:11PM -0500, Merlin Moncure wrote:
>> On Mon, May 23, 2016 at 12:10 PM, Tom Lane  wrote:
>> > Andres Freund  writes:
>> >> discussing executor performance with a number of people at pgcon,
>> >> several hackers - me included - complained about the additional
>> >> complexity, both code and runtime, required to handle SRFs in the target
>> >> list.
>> >
>> > Yeah, this has been an annoyance for a long time.
>> >
>> >> One idea I circulated was to fix that by interjecting a special executor
>> >> node to process SRF containing targetlists (reusing Result possibly?).
>> >> That'd allow to remove the isDone argument from ExecEval*/ExecProject*
>> >> and get rid of ps_TupFromTlist which is fairly ugly.
>> >
>> > Would that not lead to, in effect, duplicating all of execQual.c?  The new
>> > executor node would still have to be prepared to process all expression
>> > node types.
>> >
>> >> Robert suggested - IIRC mentioning previous on-list discussion - to
>> >> instead rewrite targetlist SRFs into lateral joins. My gut feeling is
>> >> that that'd be a larger undertaking, with significant semantics changes.
>> >
>> > Yes, this was discussed on-list awhile back (I see David found a reference
>> > already).  I think it's feasible, although we'd first have to agree
>> > whether we want to remain bug-compatible with the old
>> > least-common-multiple-of-the-periods behavior.  I would vote for not,
>> > but it's certainly a debatable thing.
>>
>> +1 on removing LCM.
>
> As a green field project, that would make total sense.  As a thing
> decades in, it's not clear to me that that would break less stuff or
> break it worse than simply disallowing SRFs in the target list, which
> has been rejected on bugward-compatibility grounds.  I suspect it
> would be even worse because disallowing SRFs in target lists would at
> least be obvious and localized when it broke code.

If I'm reading this correctly, it sounds to me like you are making the
case that removing target list SRF completely would somehow cause less
breakage than say, rewriting it to a LATERAL based implementation for
example.  With more than a little forbearance, let's just say I don't
agree.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread David Fetter

On Mon, May 23, 2016 at 01:28:11PM -0500, Merlin Moncure wrote:
> On Mon, May 23, 2016 at 12:10 PM, Tom Lane  wrote:
> > Andres Freund  writes:
> >> discussing executor performance with a number of people at pgcon,
> >> several hackers - me included - complained about the additional
> >> complexity, both code and runtime, required to handle SRFs in the target
> >> list.
> >
> > Yeah, this has been an annoyance for a long time.
> >
> >> One idea I circulated was to fix that by interjecting a special executor
> >> node to process SRF containing targetlists (reusing Result possibly?).
> >> That'd allow to remove the isDone argument from ExecEval*/ExecProject*
> >> and get rid of ps_TupFromTlist which is fairly ugly.
> >
> > Would that not lead to, in effect, duplicating all of execQual.c?  The new
> > executor node would still have to be prepared to process all expression
> > node types.
> >
> >> Robert suggested - IIRC mentioning previous on-list discussion - to
> >> instead rewrite targetlist SRFs into lateral joins. My gut feeling is
> >> that that'd be a larger undertaking, with significant semantics changes.
> >
> > Yes, this was discussed on-list awhile back (I see David found a reference
> > already).  I think it's feasible, although we'd first have to agree
> > whether we want to remain bug-compatible with the old
> > least-common-multiple-of-the-periods behavior.  I would vote for not,
> > but it's certainly a debatable thing.
> 
> +1 on removing LCM.

As a green field project, that would make total sense.  As a thing
decades in, it's not clear to me that that would break less stuff or
break it worse than simply disallowing SRFs in the target list, which
has been rejected on bugward-compatibility grounds.  I suspect it
would be even worse because disallowing SRFs in target lists would at
least be obvious and localized when it broke code.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread David G. Johnston

On Mon, May 23, 2016 at 1:44 PM, David Fetter  wrote:

> On Mon, May 23, 2016 at 01:36:57PM -0400, Tom Lane wrote:
> > David Fetter  writes:
> > > On Mon, May 23, 2016 at 01:10:29PM -0400, Tom Lane wrote:
> > >> This seems a bridge too far to me.  It's just way too common to do
> > >> "select generate_series(1,n)".  We could tell people they have to
> > >> rewrite to "select * from generate_series(1,n)", but it would be far
> > >> more polite to do that for them.
> >
> > > How about making "TABLE generate_series(1,n)" work?  It's even
> > > shorter in exchange for some cognitive load.
> >
> > No thanks --- the word after TABLE ought to be a table name, not some
> > arbitrary expression.  That's way too much mess to save one keystroke.
>
> It's not just about saving a keystroke.  This change would go with
> removing the ability to do SRFs in the target list of a SELECT
> query.
>

If you want to make an argument for doing this regardless of the target
list SRF change by all means - but it does absolutely nothing to mitigate
the breakage that would result if we choose this path.

David J.

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread Merlin Moncure

On Mon, May 23, 2016 at 12:10 PM, Tom Lane  wrote:
> Andres Freund  writes:
>> discussing executor performance with a number of people at pgcon,
>> several hackers - me included - complained about the additional
>> complexity, both code and runtime, required to handle SRFs in the target
>> list.
>
> Yeah, this has been an annoyance for a long time.
>
>> One idea I circulated was to fix that by interjecting a special executor
>> node to process SRF containing targetlists (reusing Result possibly?).
>> That'd allow to remove the isDone argument from ExecEval*/ExecProject*
>> and get rid of ps_TupFromTlist which is fairly ugly.
>
> Would that not lead to, in effect, duplicating all of execQual.c?  The new
> executor node would still have to be prepared to process all expression
> node types.
>
>> Robert suggested - IIRC mentioning previous on-list discussion - to
>> instead rewrite targetlist SRFs into lateral joins. My gut feeling is
>> that that'd be a larger undertaking, with significant semantics changes.
>
> Yes, this was discussed on-list awhile back (I see David found a reference
> already).  I think it's feasible, although we'd first have to agree
> whether we want to remain bug-compatible with the old
> least-common-multiple-of-the-periods behavior.  I would vote for not,
> but it's certainly a debatable thing.

+1 on removing LCM.

The behavior of multiple targetlist SRF is so bizarre that it's
incredible to believe anyone would reasonably expect it to work that
way. Agree also that casual sane usage of target list SRF is
incredibly common via generate_series() and unnest() etc is
exceptionally common...better not to break those cases without a
better justification than code simplicity.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread Tom Lane

David Fetter  writes:
> On Mon, May 23, 2016 at 01:36:57PM -0400, Tom Lane wrote:
>> David Fetter  writes:
>>> How about making "TABLE generate_series(1,n)" work?  It's even
>>> shorter in exchange for some cognitive load.

>> No thanks --- the word after TABLE ought to be a table name, not some
>> arbitrary expression.  That's way too much mess to save one keystroke.

> It's not just about saving a keystroke.  This change would go with
> removing the ability to do SRFs in the target list of a SELECT
> query.

I guess you did not understand that I was rejecting doing that.
Telling people they have to modify existing code that does this and
works fine is exactly what I felt we can't do.  We might be able to
blow off complicated cases, but I think simpler cases are too common
in the field.

I'm on board with fixing things so that the *implementation* doesn't
support SRF-in-tlist.  But we can't just remove it from the language.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread David Fetter

On Mon, May 23, 2016 at 01:36:57PM -0400, Tom Lane wrote:
> David Fetter  writes:
> > On Mon, May 23, 2016 at 01:10:29PM -0400, Tom Lane wrote:
> >> This seems a bridge too far to me.  It's just way too common to do
> >> "select generate_series(1,n)".  We could tell people they have to
> >> rewrite to "select * from generate_series(1,n)", but it would be far
> >> more polite to do that for them.
> 
> > How about making "TABLE generate_series(1,n)" work?  It's even
> > shorter in exchange for some cognitive load.
> 
> No thanks --- the word after TABLE ought to be a table name, not some
> arbitrary expression.  That's way too much mess to save one keystroke.

It's not just about saving a keystroke.  This change would go with
removing the ability to do SRFs in the target list of a SELECT
query.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread Tom Lane

David Fetter  writes:
> On Mon, May 23, 2016 at 01:10:29PM -0400, Tom Lane wrote:
>> This seems a bridge too far to me.  It's just way too common to do
>> "select generate_series(1,n)".  We could tell people they have to
>> rewrite to "select * from generate_series(1,n)", but it would be far
>> more polite to do that for them.

> How about making "TABLE generate_series(1,n)" work?  It's even
> shorter in exchange for some cognitive load.

No thanks --- the word after TABLE ought to be a table name, not some
arbitrary expression.  That's way too much mess to save one keystroke.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread David Fetter

On Mon, May 23, 2016 at 01:10:29PM -0400, Tom Lane wrote:
> This seems a bridge too far to me.  It's just way too common to do
> "select generate_series(1,n)".  We could tell people they have to
> rewrite to "select * from generate_series(1,n)", but it would be far
> more polite to do that for them.

How about making "TABLE generate_series(1,n)" work?  It's even
shorter in exchange for some cognitive load.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-23 Thread Tom Lane

Andres Freund  writes:
> discussing executor performance with a number of people at pgcon,
> several hackers - me included - complained about the additional
> complexity, both code and runtime, required to handle SRFs in the target
> list.

Yeah, this has been an annoyance for a long time.

> One idea I circulated was to fix that by interjecting a special executor
> node to process SRF containing targetlists (reusing Result possibly?).
> That'd allow to remove the isDone argument from ExecEval*/ExecProject*
> and get rid of ps_TupFromTlist which is fairly ugly.

Would that not lead to, in effect, duplicating all of execQual.c?  The new
executor node would still have to be prepared to process all expression
node types.

> Robert suggested - IIRC mentioning previous on-list discussion - to
> instead rewrite targetlist SRFs into lateral joins. My gut feeling is
> that that'd be a larger undertaking, with significant semantics changes.

Yes, this was discussed on-list awhile back (I see David found a reference
already).  I think it's feasible, although we'd first have to agree
whether we want to remain bug-compatible with the old
least-common-multiple-of-the-periods behavior.  I would vote for not,
but it's certainly a debatable thing.

> If we accept bigger semantical changes, I'm inclined to instead just get
> rid of targetlist SRFs in total; they're really weird and not needed
> anymore.

This seems a bridge too far to me.  It's just way too common to do
"select generate_series(1,n)".  We could tell people they have to
rewrite to "select * from generate_series(1,n)", but it would be far
more polite to do that for them.

> One issue with removing targetlist SRFs is that they're currently
> considerably faster than SRFs in FROM:

I suspect that depends greatly on your test case.  But in any case
we could put more effort into optimizing nodeFunctionscan.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-22 Thread David G. Johnston

tl;dr

Semantic changes to SRF-in-target-list processing are undesirable when they
are all but deprecated.

I'd accept a refactoring that trades a performance gain for unaffected
queries for a reasonable performance hit of those afflicted.

Preamble...

Most recent thread that I can recall seeing on the topic - and where I
believe the rewrite idea was first presented.

http://www.postgresql.org/message-id/flat/25750.1458767...@sss.pgh.pa.us#25750.1458767...@sss.pgh.pa.us

On Sun, May 22, 2016 at 8:53 PM, Andres Freund  wrote:

> Hi,
>
> discussing executor performance with a number of people at pgcon,
> several hackers - me included - complained about the additional
> complexity, both code and runtime, required to handle SRFs in the target
> list.
>
> One idea I circulated was to fix that by interjecting a special executor
> node to process SRF containing targetlists (reusing Result possibly?).
> That'd allow to remove the isDone argument from ExecEval*/ExecProject*
> and get rid of ps_TupFromTlist which is fairly ugly.
>

Conceptually I'm all for minimizing the impact on queries of this form.
It seems to be the most likely to get written and committed and the least
likely to cause unforeseen issues.

> Robert suggested - IIRC mentioning previous on-list discussion - to
> instead rewrite targetlist SRFs into lateral joins. My gut feeling is
> that that'd be a larger undertaking, with significant semantics changes.
>
[...]

> If we accept bigger semantical changes, I'm inclined to instead just get
> rid of targetlist SRFs in total; they're really weird and not needed
> anymore.
>

I cannot see these, in isolation, being a good option.  Nonetheless, I
don't think any semantic change should happen before 9.2 becomes no longer
supported.  I'd be inclined to take a similar approach as with
standard_conforming_strings (minus the execution guc, just the warning one)
with whatever after-the-fact learning taken into account.

Its worth considering query rewrite and making it forbidden as a joint goal.

For something like a canonical version of this, especially for
composite-returning SRF:

WITH func_call (
SELECT func(tbl.col)
FROM tbl
)
SELECT (func_call.func).*
FROM func_call;

If we can rewrite the CTE portion into a lateral - with the exact same
semantics (specifically, returning the single-column composite) then check
the rewritten query the select list SRF would not longer be present and no
error would be thrown.

For situations where a rewrite cannot be made to behave properly we leave
the construct alone and let the query raise an error.

In considering what I just wrote I'm not particularly enamored with
it...hence my overall conclusion.  Can't say I hate it and after re-reading
the aforementioned thread I'm inclined to like it for cases where, for
instance, we are susceptible to a LCM evaluation.

David J.

Re: [HACKERS] Changed SRF in targetlist handling

2016-05-22 Thread Craig Ringer

On 23 May 2016 at 08:53, Andres Freund  wrote:

> Hi,
>
> discussing executor performance with a number of people at pgcon,
> several hackers - me included - complained about the additional
> complexity, both code and runtime, required to handle SRFs in the target
> list.
>
> One idea I circulated was to fix that by interjecting a special executor
> node to process SRF containing targetlists (reusing Result possibly?).
> That'd allow to remove the isDone argument from ExecEval*/ExecProject*
> and get rid of ps_TupFromTlist which is fairly ugly.
>
>
> Robert suggested - IIRC mentioning previous on-list discussion - to
> instead rewrite targetlist SRFs into lateral joins. My gut feeling is
> that that'd be a larger undertaking, with significant semantics changes.
>
> If we accept bigger semantical changes, I'm inclined to instead just get
> rid of targetlist SRFs in total; they're really weird and not needed
> anymore.
>
> One issue with removing targetlist SRFs is that they're currently
> considerably faster than SRFs in FROM:
> tpch[14693][1]=# COPY (SELECT * FROM generate_series(1, 1000)) TO
> '/dev/null';
> COPY 1000
> Time: 2217.167 ms
> tpch[14693][1]=# COPY (SELECT generate_series(1, 1000)) TO '/dev/null';
> COPY 1000
> Time: 1355.929 ms
> tpch[14693][1]=#
>
> I'm no tto concerned about that, and we could probably fixing by
> removing forced materialization from the relevant code path.
>
> Comments?
>
>
SRFs-in-tlist are a lot faster for lockstep iteration etc. They're also
much simpler to write, though if the result result rowcount differs
unexpectedly between the functions you get exciting and unexpected
behaviour.

WITH ORDINALITY provides what I think is the last of the functionality
needed to replace SRFs-in-from, but at a syntatactic complexity and
performance cost. The following example demonstrates that, though it
doesn't do anything that needs LATERAL etc. I'm aware the following aren't
semantically identical if the rowcounts differ.


craig=> EXPLAIN ANALYZE SELECT generate_series(1,100) x,
generate_series(1,100) y;
  QUERY PLAN

--
 Result  (cost=0.00..5.01 rows=1000 width=0) (actual time=0.024..92.845
rows=100 loops=1)
 Planning time: 0.039 ms
 Execution time: 123.123 ms
(3 rows)

Time: 123.719 ms


craig=> EXPLAIN ANALYZE SELECT x, y FROM generate_series(1,100) WITH
ORDINALITY AS x(i, n) INNER JOIN generate_series(1,100) WITH ORDINALITY
AS y(i, n) ON (x.n = y.n);
QUERY PLAN

--
 Merge Join  (cost=0.01..97.50 rows=5000 width=64) (actual
time=179.863..938.375 rows=100 loops=1)
   Merge Cond: (x.n = y.n)
   ->  Function Scan on generate_series x  (cost=0.00..10.00 rows=1000
width=40) (actual time=108.813..303.690 rows=100 loops=1)
   ->  Materialize  (cost=0.00..12.50 rows=1000 width=40) (actual
time=71.043..372.880 rows=100 loops=1)
 ->  Function Scan on generate_series y  (cost=0.00..10.00
rows=1000 width=40) (actual time=71.039..266.209 rows=100 loops=1)
 Planning time: 0.184 ms
 Execution time: 970.744 ms
(7 rows)

Time: 971.706 ms


I get the impression the with-ordinality case could perform just as well if
the optimiser recognised a join on the ordinality column and iterated the
functions in lockstep to populate the result row directly. Though that
could perform _worse_ if the function is computationally costly and
benefits significantly from the CPU cache, where we're better off
materializing it or at least executing it in chunks/batches...


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

[HACKERS] Changed SRF in targetlist handling

2016-05-22 Thread Andres Freund

Hi,

discussing executor performance with a number of people at pgcon,
several hackers - me included - complained about the additional
complexity, both code and runtime, required to handle SRFs in the target
list.

One idea I circulated was to fix that by interjecting a special executor
node to process SRF containing targetlists (reusing Result possibly?).
That'd allow to remove the isDone argument from ExecEval*/ExecProject*
and get rid of ps_TupFromTlist which is fairly ugly.


Robert suggested - IIRC mentioning previous on-list discussion - to
instead rewrite targetlist SRFs into lateral joins. My gut feeling is
that that'd be a larger undertaking, with significant semantics changes.

If we accept bigger semantical changes, I'm inclined to instead just get
rid of targetlist SRFs in total; they're really weird and not needed
anymore.

One issue with removing targetlist SRFs is that they're currently
considerably faster than SRFs in FROM:
tpch[14693][1]=# COPY (SELECT * FROM generate_series(1, 1000)) TO 
'/dev/null';
COPY 1000
Time: 2217.167 ms
tpch[14693][1]=# COPY (SELECT generate_series(1, 1000)) TO '/dev/null';
COPY 1000
Time: 1355.929 ms
tpch[14693][1]=#

I'm no tto concerned about that, and we could probably fixing by
removing forced materialization from the relevant code path.

Comments?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

58 matches

Mail list logo