Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-24 Thread Merlin Moncure
On Wed, Mar 23, 2016 at 3:18 PM, Stephen Frost  wrote:
> I have to side with what I believe is Tom's position on this one.  I do
> like the notion of throwing an error in cases where someone sent us
> something that we're pretty sure is wrong, but I don't agree that we
> should continue to carry on bug-compatibility with things that are
> already one foot in the grave and really just need to be shoved all the
> way in.

Well, the fact that it turns out to be 2+ SRF, not just 1 as a trigger
has significantly lowered my alarm.  I agree that such usages are tiny
and the LCM way of determining rows is definitely bizarre. Don't
believe me?  Try figuring out when

select generate_series(1,nextval('s')), generate_series(1,nextval('s'));

terminates.  (hint: it doesn't).   Another cute multiple SRF
invocation: 
http://merlinmoncure.blogspot.com/2007/12/12-days-of-christmas-postgresql-style.html
:-)

> This isn't the only break in backwards compatibility we've had over the
> years and is pretty far from the largest (string escaping, anyone?  or
> removing implicit casts?) and I'd argue we're better off for it.

String escaping was an unmitigated disaster. Implict cast removal cost
my company a couple of hundred thousand bucks and came within a hair
of pushing postgres out completely (not that I'm complaining, we're
the better for that but these decisions must not be taken lightly).
Things are different now.

On Wed, Mar 23, 2016 at 5:34 PM, Tom Lane  wrote:
> (The SRF invocations might be buried inside expressions, but we'd find
> them and convert them anyway.  Also, we'd merge textually-identical SRF
> invocations into a single ROWS FROM entry to avoid multiple evaluation,
> at least for SRFs not marked volatile.)  Having done that, the executor's
> support for SRFs in general expressions could be removed, a significant
> savings.

That's just brilliant -- I'd be on board with that FWIW.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-24 Thread Tom Lane
Robert Haas  writes:
> On Wed, Mar 23, 2016 at 6:34 PM, Tom Lane  wrote:
>> A possibly larger problem is that it causes the SRFs to be evaluated
>> before sorting/ordering/limiting.

> I'm not sure I understand quite what the problem is here.

If you have "SELECT srf(x), y FROM ... ORDER BY y LIMIT n", we go
to some lengths as of 9118d03a8 to ensure that the ordering happens
before the SRF is expanded, meaning in particular that the SRF is
expanded only far enough to satisfy the limit.  That is not the case
for SRF-in-FROM, and can't be if for instance there's a GROUP BY
involved.  As-is, the proposed transformation obviously breaks the
semantics if grouping/aggregation/windowing is involved, and my point
is that that's true for ORDER BY/LIMIT as well.  We'd need to fix
things so that the apparent order of performing those steps stays
the same.

I think that's probably doable in most cases by making a sub-select,
but I'm not sure if that works for every case --- in particular, what
do we do with a SRF referenced in ORDER BY or GROUP BY?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-24 Thread Robert Haas
On Wed, Mar 23, 2016 at 6:34 PM, Tom Lane  wrote:
> I wrote:
>> ... I'd love to
>> toss the entire SRF-in-tlist feature overboard one of these years,
>> both because of semantic issues like these and because a fair amount
>> of code and overhead could be ripped out if it were disallowed.
>> But I don't know how we get to that --- as Merlin says, there's a
>> pretty large amount of application code depending on the feature.
>
> BTW, after further thought I seem to recall some discussion about whether
> we could mechanically transform SRF-in-tlist into a LATERAL query.
> That is, we could make the rewriter or planner convert
>
> SELECT srf1(...), srf2(...)
>   FROM ...
>   WHERE ...
>
> into
>
> SELECT u.s1, u.s2
>   FROM ...,  LATERAL ROWS FROM(srf1(...), srf2(...)) AS u(s1,s2)
>   WHERE ...

I think *that* would be grand.  If I'm not wrong, that's the behavior
that anybody would naively expect.

> (The SRF invocations might be buried inside expressions, but we'd find
> them and convert them anyway.  Also, we'd merge textually-identical SRF
> invocations into a single ROWS FROM entry to avoid multiple evaluation,
> at least for SRFs not marked volatile.)  Having done that, the executor's
> support for SRFs in general expressions could be removed, a significant
> savings.

That is a highly appealing fringe benefit.

> One problem with this is that it only duplicates the current behavior
> in cases where the SRFs all have the same period.  But you could argue
> that the current behavior in cases where they don't is widely considered
> a bug anyway.

I would so argue.  Also, wouldn't this fix the problem that
(srf(blah)).* causes multiple evaluation?  That would be *awesome*.

> A possibly larger problem is that it causes the SRFs to be evaluated
> before sorting/ordering/limiting.

I'm not sure I understand quite what the problem is here.  If there's
a LIMIT, then the proposed transformation would presumably run the SRF
only enough times to fill the limit.  I'm not sure you have any
guarantees about ordering anyway - doesn't that depend on whether the
planner can find a way to produce presorted output via an index scan,
merge join, etc.?

> In view of the discussions that led
> up to 9118d03a8, that could be a fatal objection :-(.  Maybe we could
> get around it with a different transformation, into a two-level query?
> The above sketch doesn't really consider what to do with GROUP BY,
> ORDER BY, etc, but maybe we could push those down into a sub-select
> that becomes the first FROM item.

That seems like it might work.

> Anyway, that's food for thought not something that could realistically
> happen right now.

Ah, bummer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Regina Obe

> I'm something of a backwards compatibility zealot, but I've become one for 
> very good reasons.  Personally, I'd rather we'd define precisely the usages 
> that are deprecated (I guess SRF-tlist in the presence of
> FROM) and force them to error out with an appropriate HINT rather than give a 
> different answer than they used to.  The problem here is that LATERAL is 
> still fairly new and there is a huge body of code out there leveraging the 
> 'bad' way, as it was for years > and years the only way to do a number of 
> useful things.

> merlin

FWIW: I prefer Merlin's solution of erroring out rather than returning an 
unexpected answer and if it's a buggy behavior it should be eradicated.

The reason is this.  For many  the (..).* ORDER BY .. looks equivalent to the 
lateral.  
More than a trivial amount of my time has been spent explaining to people why 
their raster queries are so slow because the SRF is called multiple times and 
they should switch to LATERAL usage.

So if the old solution is still going to have the same penalty's I would much 
assume just scrap it and break people's code in a clear and noticeable way they 
can't ignore.

There is nothing more frustrating than code that still works but gives you an 
answer different than what you are expecting.  Those kind of bugs stay buried 
for a while.

I think as long as it's noted in the release notes of the breaking change it's 
fine and called for if it makes your code cleaner and more manageable.

Thanks,
Regina




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Tom Lane
I wrote:
> ... I'd love to
> toss the entire SRF-in-tlist feature overboard one of these years,
> both because of semantic issues like these and because a fair amount
> of code and overhead could be ripped out if it were disallowed.
> But I don't know how we get to that --- as Merlin says, there's a
> pretty large amount of application code depending on the feature.

BTW, after further thought I seem to recall some discussion about whether
we could mechanically transform SRF-in-tlist into a LATERAL query.
That is, we could make the rewriter or planner convert

SELECT srf1(...), srf2(...)
  FROM ...
  WHERE ...

into

SELECT u.s1, u.s2
  FROM ...,  LATERAL ROWS FROM(srf1(...), srf2(...)) AS u(s1,s2)
  WHERE ...

(The SRF invocations might be buried inside expressions, but we'd find
them and convert them anyway.  Also, we'd merge textually-identical SRF
invocations into a single ROWS FROM entry to avoid multiple evaluation,
at least for SRFs not marked volatile.)  Having done that, the executor's
support for SRFs in general expressions could be removed, a significant
savings.

One problem with this is that it only duplicates the current behavior
in cases where the SRFs all have the same period.  But you could argue
that the current behavior in cases where they don't is widely considered
a bug anyway.

A possibly larger problem is that it causes the SRFs to be evaluated
before sorting/ordering/limiting.  In view of the discussions that led
up to 9118d03a8, that could be a fatal objection :-(.  Maybe we could
get around it with a different transformation, into a two-level query?
The above sketch doesn't really consider what to do with GROUP BY,
ORDER BY, etc, but maybe we could push those down into a sub-select
that becomes the first FROM item.

Anyway, that's food for thought not something that could realistically
happen right now.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Stephen Frost
* David G. Johnston (david.g.johns...@gmail.com) wrote:
> On Wed, Mar 23, 2016 at 2:11 PM, Tom Lane  wrote:
> > In the meantime I suppose there's a case to be made for preserving
> > bug compatibility as much as possible.
> >
> > So anyway the question is whether to commit the attached or not.
> 
> +1 for commit - I'll trust Tom on the quality of the patch :)

I'm not going to object to it.  All-in-all, I suppose '+0' from me.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread David G. Johnston
On Wed, Mar 23, 2016 at 2:11 PM, Tom Lane  wrote:

>
> ​​
> In the meantime I suppose there's a case to be made for preserving
> bug compatibility as much as possible.
>
> So anyway the question is whether to commit the attached or not.


​+1 for commit - I'll trust Tom on the quality of the patch :)

David J.


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Tom Lane
Stephen Frost  writes:
> * Merlin Moncure (mmonc...@gmail.com) wrote:
>> I'm something of a backwards compatibility zealot, but I've become one
>> for very good reasons.  Personally, I'd rather we'd define precisely
>> the usages that are deprecated (I guess SRF-tlist in the presence of
>> FROM) and force them to error out with an appropriate HINT rather than
>> give a different answer than they used to.  The problem here is that
>> LATERAL is still fairly new and there is a huge body of code out there
>> leveraging the 'bad' way, as it was for years and years the only way
>> to do a number of useful things.

> I have to side with what I believe is Tom's position on this one.  I do
> like the notion of throwing an error in cases where someone sent us
> something that we're pretty sure is wrong, but I don't agree that we
> should continue to carry on bug-compatibility with things that are
> already one foot in the grave and really just need to be shoved all the
> way in.

FWIW, I think the case that we're looking at only comes up if you have
more than one SRF expression in the tlist (Regina's example has that
after *-expansion, though it might not've looked so to start with),
and some of them are sort/group columns while others are not.

So the question at hand is whether that falls within the scope of "huge
body of code leveraging the old way" or whether it's too much of a corner
case to want to tie ourselves down to.

I wrote a patch that fixes this for the sort-column case, thus getting us
back to the historical behavior; see attached.  If we really wanted to be
consistent, we'd have to look at pushing all SRFs clear back to the
scanjoin_target list if any SRFs appear in grouping columns.  That would
be significantly more code and it would deviate from the historical
behavior, as I illustrated upthread, so I'm not really inclined to do it.
But the historical behavior in this area is pretty unprincipled.

It's also worth noting that we've had multiple complaints about the
ExecTargetList least-common-multiple behavior over the years; it seems
sane enough in examples like these where all the SRFs have the same
period, but it gets way less so as soon as they don't.  I'd love to
toss the entire SRF-in-tlist feature overboard one of these years,
both because of semantic issues like these and because a fair amount
of code and overhead could be ripped out if it were disallowed.
But I don't know how we get to that --- as Merlin says, there's a
pretty large amount of application code depending on the feature.
In the meantime I suppose there's a case to be made for preserving
bug compatibility as much as possible.

So anyway the question is whether to commit the attached or not.

regards, tom lane

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5229c84..db347b8 100644
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*** make_pathkeys_for_window(PlannerInfo *ro
*** 4615,4625 
   *
   * Our current policy is to postpone volatile expressions till after the sort
   * unconditionally (assuming that that's possible, ie they are in plain tlist
!  * columns and not ORDER BY/GROUP BY/DISTINCT columns).  We also postpone
!  * set-returning expressions unconditionally (if possible), because running
!  * them beforehand would bloat the sort dataset, and because it might cause
!  * unexpected output order if the sort isn't stable.  Expensive expressions
!  * are postponed if there is a LIMIT, or if root->tuple_fraction shows that
   * partial evaluation of the query is possible (if neither is true, we expect
   * to have to evaluate the expressions for every row anyway), or if there are
   * any volatile or set-returning expressions (since once we've put in a
--- 4615,4630 
   *
   * Our current policy is to postpone volatile expressions till after the sort
   * unconditionally (assuming that that's possible, ie they are in plain tlist
!  * columns and not ORDER BY/GROUP BY/DISTINCT columns).  We also prefer to
!  * postpone set-returning expressions, because running them beforehand would
!  * bloat the sort dataset, and because it might cause unexpected output order
!  * if the sort isn't stable.  However there's a constraint on that: all SRFs
!  * in the tlist should be evaluated at the same plan step, so that they can
!  * run in sync in ExecTargetList.  So if any SRFs are in sort columns, we
!  * mustn't postpone any SRFs.  (Note that in principle that policy should
!  * probably get applied to the group/window input targetlists too, but we
!  * have not done that historically.)  Lastly, expensive expressions are
!  * postponed if there is a LIMIT, or if root->tuple_fraction shows that
   * partial evaluation of the query is possible (if neither is true, we expect
   * to have to evaluate the expressions for every row anyway), or if there are
   * any volatile or set-returning 

Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Stephen Frost
* Merlin Moncure (mmonc...@gmail.com) wrote:
> On Wed, Mar 23, 2016 at 12:37 PM, Tom Lane  wrote:
> > which is both SQL-standard semantics and much more efficient than
> > SRF-in-tlist.  We've more or less deprecated SRF-in-tlist since we
> > introduced LATERAL in 9.3.  How much are we willing to do to stay
> > bug-compatible with old behaviors here?
> 
> I think we should, and the fact this was caught so early on the
> release cycle underscores that.  One of the problems is that there are
> reasonable cases (note, not impacted by this bug) of this usage that
> are still commonplace, for example:
> 
> ysanalysis=# select unnest(current_schemas(true));
>unnest
> 
>  pg_catalog
>  public
> 
> I'm something of a backwards compatibility zealot, but I've become one
> for very good reasons.  Personally, I'd rather we'd define precisely
> the usages that are deprecated (I guess SRF-tlist in the presence of
> FROM) and force them to error out with an appropriate HINT rather than
> give a different answer than they used to.  The problem here is that
> LATERAL is still fairly new and there is a huge body of code out there
> leveraging the 'bad' way, as it was for years and years the only way
> to do a number of useful things.

I have to side with what I believe is Tom's position on this one.  I do
like the notion of throwing an error in cases where someone sent us
something that we're pretty sure is wrong, but I don't agree that we
should continue to carry on bug-compatibility with things that are
already one foot in the grave and really just need to be shoved all the
way in.

This isn't the only break in backwards compatibility we've had over the
years and is pretty far from the largest (string escaping, anyone?  or
removing implicit casts?) and I'd argue we're better off for it.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Tom Lane
"David G. Johnston"  writes:
> On Wednesday, March 23, 2016, Tom Lane  wrote:
>> ...  We've more or less deprecated SRF-in-tlist since we
>> introduced LATERAL in 9.3.  How much are we willing to do to stay
>> bug-compatible with old behaviors here?

> My gut reaction is that this is an unnecessary regression for the sake of a
> performance optimization that is likely drowned out in the usage presented
> anyway.
> The pivot for me would be how hard would it be to maintain the old behavior
> in this "more or less deprecated" scenario.  I have no way to judge that.

Well, it'd make make_sort_input_target() more complicated and a bit
slower, mainly because it would have to check SRF-ness of sorting columns
that it currently has no need to inspect at all.  My concern here is not
really with that, it's with trying to draw a boundary around what behavior
we're going to promise bug-compatibility with.  To illustrate, you really
can get the multiple-expansions behavior with existing releases, eg in
9.4

regression=# SELECT (dumpset(f.test, 'hello world' || f.test)).*, f.test
FROM generate_series(1,4) As f(test)
GROUP BY junk2, f.test;
 id | junk1 | junk2 | test 
+---+---+--
  1 | 3hello world3 | 31|3
  2 | 3hello world3 | 31|3
  1 | 4hello world4 | 42|4
  2 | 4hello world4 | 42|4
  1 | 4hello world4 | 41|4
  2 | 4hello world4 | 41|4
  1 | 1hello world1 | 11|1
  2 | 1hello world1 | 11|1
  1 | 3hello world3 | 32|3
  2 | 3hello world3 | 32|3
  1 | 2hello world2 | 21|2
  2 | 2hello world2 | 21|2
  1 | 2hello world2 | 22|2
  2 | 2hello world2 | 22|2
  1 | 1hello world1 | 12|1
  2 | 1hello world1 | 12|1
(16 rows)

which is kind of fun to wrap your head around, but after awhile you
realize that dumpset(...).junk2 is being evaluated before the GROUP BY
and dumpset.(...).id after it, which is how come the grouping behavior
is so obviously violated.

AFAICS, the only way to ensure non-crazy behavior for such examples would
be to force all SRFs in the tlist to be evaluated at the same plan step.
Which we've never done in the past, and if we were to start doing so,
it would cause a behavioral change for examples like this one.

Anyway, my concern is just that we're deciding to stay bug-compatible
with some behaviors that were not very well thought out to start with,
and we don't even have a clear understanding of where the limits of
that compatibility will be.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Merlin Moncure
On Wed, Mar 23, 2016 at 12:37 PM, Tom Lane  wrote:
> which is both SQL-standard semantics and much more efficient than
> SRF-in-tlist.  We've more or less deprecated SRF-in-tlist since we
> introduced LATERAL in 9.3.  How much are we willing to do to stay
> bug-compatible with old behaviors here?

I think we should, and the fact this was caught so early on the
release cycle underscores that.  One of the problems is that there are
reasonable cases (note, not impacted by this bug) of this usage that
are still commonplace, for example:

ysanalysis=# select unnest(current_schemas(true));
   unnest

 pg_catalog
 public

I'm something of a backwards compatibility zealot, but I've become one
for very good reasons.  Personally, I'd rather we'd define precisely
the usages that are deprecated (I guess SRF-tlist in the presence of
FROM) and force them to error out with an appropriate HINT rather than
give a different answer than they used to.  The problem here is that
LATERAL is still fairly new and there is a huge body of code out there
leveraging the 'bad' way, as it was for years and years the only way
to do a number of useful things.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread David G. Johnston
On Wednesday, March 23, 2016, Tom Lane  wrote:

> "Regina Obe" > writes:
> > In the past couple of weeks our PostGIS tests against PostgreSQL 9.6 dev
> > started failing.  I traced the issue down to a behavior change in 9.6
> when
> > dealing with output of set returning functions when used with (func).*
> > syntax.
>
> > CREATE OR REPLACE FUNCTION dumpset(param_num integer, param_text text)
> > RETURNS TABLE(id integer, junk1 text, junk2 text)
> > ...
> > -- Get 16 rows in 9.6, Get 8 rows in 9.5
> > SELECT (dumpset(f.test, 'hello world' || f.test)).*
> > FROM generate_series(1,4) As f(test)
> > ORDER BY junk2;
>
> I think this is a side-effect of 9118d03a8 ("When appropriate, postpone
> SELECT output expressions till after ORDER BY").  Previously, although you
> got N evaluations of the SRF which is pretty horrid, they were all in the
> same targetlist and hence ran in sync and produced only the expected
> number of rows (thanks to the otherwise-indefensible ExecTargetList
> behavior by which multiple SRF tlist expressions produce a number of rows
> equal to the least common multiple of their periods, not the product).
> That commit causes the evaluation of dumpset(...).junk1 to be postponed
> till after the Sort step, but the evaluation of dumpset(...).junk2
> necessarily can't be.  So now you get dumpset(...).junk2 inflating
> the original rowcount 2X, and then dumpset(...).junk1 inflating it
> another 2X after the Sort.
>
> We could remain bug-compatible with the old behavior by adding some
> restriction to keep all the tlist SRFs getting evaluated at the same
> plan step, at least to the extent that we can.  I think you could get
> similar strange behaviors in prior releases if you used GROUP BY or
> another syntax that might result in early evaluation of the sort column,
> and we're not going to be able to fix that.  But we could prevent this
> particular optimization from introducing new strangeness.
>
> But I'm not really sure that we should.  The way that you *should*
> write this query is
>
> SELECT ds.*
> FROM generate_series(1,4) AS f(test),
>  dumpset(f.test, 'hello world' || f.test) AS ds
> ORDER BY junk2;
>
> which is both SQL-standard semantics and much more efficient than
> SRF-in-tlist.  We've more or less deprecated SRF-in-tlist since we
> introduced LATERAL in 9.3.  How much are we willing to do to stay
> bug-compatible with old behaviors here?
>
>
My gut reaction is that this is an unnecessary regression for the sake of a
performance optimization that is likely drowned out in the usage presented
anyway.

The pivot for me would be how hard would it be to maintain the old behavior
in this "more or less deprecated" scenario.  I have no way to judge that.

David J.


Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Tom Lane
"Regina Obe"  writes:
> In the past couple of weeks our PostGIS tests against PostgreSQL 9.6 dev
> started failing.  I traced the issue down to a behavior change in 9.6 when
> dealing with output of set returning functions when used with (func).*
> syntax.

> CREATE OR REPLACE FUNCTION dumpset(param_num integer, param_text text)
> RETURNS TABLE(id integer, junk1 text, junk2 text)
> ...
> -- Get 16 rows in 9.6, Get 8 rows in 9.5
> SELECT (dumpset(f.test, 'hello world' || f.test)).*
> FROM generate_series(1,4) As f(test)
> ORDER BY junk2;

I think this is a side-effect of 9118d03a8 ("When appropriate, postpone
SELECT output expressions till after ORDER BY").  Previously, although you
got N evaluations of the SRF which is pretty horrid, they were all in the
same targetlist and hence ran in sync and produced only the expected
number of rows (thanks to the otherwise-indefensible ExecTargetList
behavior by which multiple SRF tlist expressions produce a number of rows
equal to the least common multiple of their periods, not the product).
That commit causes the evaluation of dumpset(...).junk1 to be postponed
till after the Sort step, but the evaluation of dumpset(...).junk2
necessarily can't be.  So now you get dumpset(...).junk2 inflating
the original rowcount 2X, and then dumpset(...).junk1 inflating it
another 2X after the Sort.

We could remain bug-compatible with the old behavior by adding some
restriction to keep all the tlist SRFs getting evaluated at the same
plan step, at least to the extent that we can.  I think you could get
similar strange behaviors in prior releases if you used GROUP BY or
another syntax that might result in early evaluation of the sort column,
and we're not going to be able to fix that.  But we could prevent this
particular optimization from introducing new strangeness.

But I'm not really sure that we should.  The way that you *should*
write this query is

SELECT ds.*
FROM generate_series(1,4) AS f(test),
 dumpset(f.test, 'hello world' || f.test) AS ds
ORDER BY junk2;

which is both SQL-standard semantics and much more efficient than
SRF-in-tlist.  We've more or less deprecated SRF-in-tlist since we
introduced LATERAL in 9.3.  How much are we willing to do to stay
bug-compatible with old behaviors here?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*

2016-03-23 Thread Regina Obe
In the past couple of weeks our PostGIS tests against PostgreSQL 9.6 dev
started failing.  I traced the issue down to a behavior change in 9.6 when
dealing with output of set returning functions when used with (func).*
syntax.

Here is an example not involving PostGIS.  Is this an intentional change in
behavior?

CREATE OR REPLACE FUNCTION dumpset(param_num integer, param_text text)
RETURNS TABLE(id integer, junk1 text, junk2 text)
AS
$$
BEGIN
 RETURN QUERY SELECT id2 As id, $1 || $2::text As junk1, $1 || id2::text AS
junk2
FROM generate_series(1,2) As id2;
END;

$$
language 'plpgsql';

-- Get 16 rows in 9.6, Get 8 rows in 9.5
SELECT (dumpset(f.test, 'hello world' || f.test)).*
FROM generate_series(1,4) As f(test)
ORDER BY junk2;


I know that functions get called multiple times with (..).* and so it's
frowned upon, but before the results would only return once and I suspect
for people who are lazy and also don't mind the penalty cost they might just
use this syntax.
If its intentional I can change the tests to follow the best practice
approach.

I think the tests started failing around March 8th which I thought might
have to do with this commit: 9118d03a8cca3d97327c56bf89a72e328e454e63
(around that time) 
When appropriate, postpone SELECT output expressions till after ORDER
BY.
It is frequently useful for volatile, set-returning, or expensive
functions in a SELECT's targetlist to be postponed till after ORDER BY
and LIMIT are done.

Which involved change in output sort.

I'm not absolutely sure if this has to do with that commit, because we had
another test failing (where the order of the result changed, and putting in
an order by fixed that test). 



Thanks,
Regina






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers