Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-19 Thread Tom Lane
Andres Freund  writes:
> Maybe we ought to remove the paranoia bit about retset
> though - it's not paranoia if it has an effect.

Exactly, and I already did that in my version.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-19 Thread Andres Freund
On 2017-01-19 13:06:20 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2017-01-18 16:56:46 -0500, Tom Lane wrote:
> >> Andres Freund  writes:
> >> I have not actually looked at 0003 at all yet.  So yeah, please post
> >> for review after you're done rebasing.
>
> > Here's a rebased and lightly massaged version.
>
> I've read through this and made some minor improvements, mostly additional
> comment cleanup.

Thanks!


> One thing I wanted to ask about:
>
> @@ -4303,7 +4303,7 @@ inline_function(Oid funcid, Oid result_type, Oid 
> result_collid,
>
>  /*
>   * Forget it if the function is not SQL-language or has other showstopper
> - * properties.  (The nargs check is just paranoia.)
> + * properties.  (The nargs and retset checks are just paranoia.)
>   */
>  if (funcform->prolang != SQLlanguageId ||
>  funcform->prosecdef ||
>
> I thought this change was simply wrong, and removed it;

Hm. I made that change a while ago.  It might have been a holdover from
the old approach, where it'd indeed have been impossible to see any
tSRFs here.  Or it might have been because we check
querytree->hasTargetSRFs below (which should prevent inlining a function
that actually returns multiple rows).  I agree it's better to leave the
check there. Maybe we ought to remove the paranoia bit about retset
though - it's not paranoia if it has an effect.


> Other than that possible point, I think the attached is committable.

Will do so in a bit, after a s/and retset checks are/check is/. And then
fix that big-endian ordering issue.

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-19 Thread Tom Lane
Andres Freund  writes:
> On 2017-01-18 16:56:46 -0500, Tom Lane wrote:
>> Andres Freund  writes:
>> I have not actually looked at 0003 at all yet.  So yeah, please post
>> for review after you're done rebasing.

> Here's a rebased and lightly massaged version.

I've read through this and made some minor improvements, mostly additional
comment cleanup.  One thing I wanted to ask about:

@@ -4303,7 +4303,7 @@ inline_function(Oid funcid, Oid result_type, Oid 
result_collid,
 
 /*
  * Forget it if the function is not SQL-language or has other showstopper
- * properties.  (The nargs check is just paranoia.)
+ * properties.  (The nargs and retset checks are just paranoia.)
  */
 if (funcform->prolang != SQLlanguageId ||
 funcform->prosecdef ||

I thought this change was simply wrong, and removed it; AFAIK it's
perfectly possible to get here for set-returning functions, since
the planner does expression simplification long before it worries
about splitting out SRFs.  Did you have a reason to think differently?

Other than that possible point, I think the attached is committable.

regards, tom lane



no-srfs-in-tlists-cleanup-2.patch.gz
Description: no-srfs-in-tlists-cleanup-2.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
Robert Haas  writes:
> So, one of the big reasons I use CASE is to avoid evaluating
> expressions in cases where they might throw an ERROR.  Like, you know:
> CASE WHEN d != 0 THEN n / d ELSE NULL END
> I guess it's not the end of the world if that only works for
> non-set-returning functions, but it's something to think about.

Well, refusing CASE-containing-SRF at all isn't going to make your
life any better in that regard :-(

It's possibly worth noting that this is also true for aggregates and
window functions: wrapping those in a CASE doesn't stop them from being
evaluated, either.  People seem to be generally used to that, although
I think I've seen one or two complaints about it from folks who seemed
unclear on the concept of aggregates.

In the end I think this is mostly about backwards compatibility:
are we sufficiently worried about that that we'd rather throw an
error than have a silent change of behavior?  TBH I'm not sure.
We've certainly got two other silent changes of behavior in this
same patch.  The argument for treating this one differently,
I think, is that it's changing from a less surprising behavior
to a more surprising one whereas the other changes are the reverse,
or at worst neutral.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Robert Haas
On Wed, Jan 18, 2017 at 7:00 PM, Andres Freund  wrote:
>>So, one of the big reasons I use CASE is to avoid evaluating
>>expressions in cases where they might throw an ERROR.  Like, you know:
>>
>>CASE WHEN d != 0 THEN n / d ELSE NULL END
>>
>>I guess it's not the end of the world if that only works for
>>non-set-returning functions, but it's something to think about.
>
> That's already not reliable in a bunch of cases, particularly evaluation 
> during planning...  Not saying that's good, but it is.

Whee!

:-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund


On January 18, 2017 3:59:00 PM PST, Robert Haas  wrote:
>On Wed, Jan 18, 2017 at 6:19 PM, Andres Freund 
>wrote:
>>>   SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END
>FROM tab;
>>>
>>>   It might seem that this should produce five repetitions of input
>rows
>>>   that have x > 0, and a single repetition of those that do not; but
>>>   actually it will produce five repetitions of every input row. This
>is
>>>   because generate_series() is run first, and then the CASE
>expression is
>>>   applied to its result rows. The behavior is thus comparable to
>>>
>>>   SELECT x, CASE WHEN x > 0 THEN g ELSE 0 END
>>> FROM tab, LATERAL generate_series(1,5) AS g;
>>>
>>>   It would be exactly the same, except that in this specific
>example, the
>>>   planner could choose to put g on the outside of the nestloop join,
>since
>>>   g has no actual lateral dependency on tab. That would result in a
>>>   different output row order. Set-returning functions in the select
>list
>>>   are always evaluated as though they are on the inside of a
>nestloop join
>>>   with the rest of the FROM clause, so that the function(s) are run
>to
>>>   completion before the next row from the FROM clause is considered.
>>>
>>> So is this too ugly to live, or shall we put up with it?
>>
>> I'm very tentatively in favor of living with it.
>
>So, one of the big reasons I use CASE is to avoid evaluating
>expressions in cases where they might throw an ERROR.  Like, you know:
>
>CASE WHEN d != 0 THEN n / d ELSE NULL END
>
>I guess it's not the end of the world if that only works for
>non-set-returning functions, but it's something to think about.

That's already not reliable in a bunch of cases, particularly evaluation during 
planning...  Not saying that's good, but it is.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Robert Haas
On Wed, Jan 18, 2017 at 6:19 PM, Andres Freund  wrote:
>>   SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
>>
>>   It might seem that this should produce five repetitions of input rows
>>   that have x > 0, and a single repetition of those that do not; but
>>   actually it will produce five repetitions of every input row. This is
>>   because generate_series() is run first, and then the CASE expression is
>>   applied to its result rows. The behavior is thus comparable to
>>
>>   SELECT x, CASE WHEN x > 0 THEN g ELSE 0 END
>> FROM tab, LATERAL generate_series(1,5) AS g;
>>
>>   It would be exactly the same, except that in this specific example, the
>>   planner could choose to put g on the outside of the nestloop join, since
>>   g has no actual lateral dependency on tab. That would result in a
>>   different output row order. Set-returning functions in the select list
>>   are always evaluated as though they are on the inside of a nestloop join
>>   with the rest of the FROM clause, so that the function(s) are run to
>>   completion before the next row from the FROM clause is considered.
>>
>> So is this too ugly to live, or shall we put up with it?
>
> I'm very tentatively in favor of living with it.

So, one of the big reasons I use CASE is to avoid evaluating
expressions in cases where they might throw an ERROR.  Like, you know:

CASE WHEN d != 0 THEN n / d ELSE NULL END

I guess it's not the end of the world if that only works for
non-set-returning functions, but it's something to think about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
On 2017-01-18 16:27:53 -0700, David G. Johnston wrote:
> ​I'd rather fail now and allow for the possibility of future implementation
> of the "it might seem that..." behavior.​

That's very unlikely to happen.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread David G. Johnston
On Wed, Jan 18, 2017 at 4:14 PM, Tom Lane  wrote:

> I wrote:
> > I'll try to write something about the SRF-in-CASE issue too.  Seeing
> > whether we can document that adequately seems like an important part
> > of making the decision about whether we need to block it.
>
> Here's what I came up with:
>
>   This behavior also means that set-returning functions will be evaluated
>   even when it might appear that they should be skipped because of a
>   conditional-evaluation construct, such as CASE or COALESCE. For example,
>   consider
>
>   SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
>
>   It might seem that this should produce five repetitions of input rows
>   that have x > 0, and a single repetition of those that do not; but
>   actually it will produce five repetitions of every input row.
>
> So is this too ugly to live, or shall we put up with it?
>
>
​Disallowing such an unlikely, and un-intuitive, corner-case strikes my
sensibilities.

​I'd rather fail now and allow for the possibility of future implementation
of the "it might seem that..." behavior.​

David J.


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
On 2017-01-18 18:14:26 -0500, Tom Lane wrote:
> I wrote:
> > I'll try to write something about the SRF-in-CASE issue too.  Seeing
> > whether we can document that adequately seems like an important part
> > of making the decision about whether we need to block it.
> 
> Here's what I came up with:
> 
>   This behavior also means that set-returning functions will be evaluated
>   even when it might appear that they should be skipped because of a
>   conditional-evaluation construct, such as CASE or COALESCE. For example,
>   consider
> 
>   SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;
> 
>   It might seem that this should produce five repetitions of input rows
>   that have x > 0, and a single repetition of those that do not; but
>   actually it will produce five repetitions of every input row. This is
>   because generate_series() is run first, and then the CASE expression is
>   applied to its result rows. The behavior is thus comparable to
> 
>   SELECT x, CASE WHEN x > 0 THEN g ELSE 0 END
> FROM tab, LATERAL generate_series(1,5) AS g;
> 
>   It would be exactly the same, except that in this specific example, the
>   planner could choose to put g on the outside of the nestloop join, since
>   g has no actual lateral dependency on tab. That would result in a
>   different output row order. Set-returning functions in the select list
>   are always evaluated as though they are on the inside of a nestloop join
>   with the rest of the FROM clause, so that the function(s) are run to
>   completion before the next row from the FROM clause is considered.
> 
> So is this too ugly to live, or shall we put up with it?

I'm very tentatively in favor of living with it.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
I wrote:
> I'll try to write something about the SRF-in-CASE issue too.  Seeing
> whether we can document that adequately seems like an important part
> of making the decision about whether we need to block it.

Here's what I came up with:

  This behavior also means that set-returning functions will be evaluated
  even when it might appear that they should be skipped because of a
  conditional-evaluation construct, such as CASE or COALESCE. For example,
  consider

  SELECT x, CASE WHEN x > 0 THEN generate_series(1, 5) ELSE 0 END FROM tab;

  It might seem that this should produce five repetitions of input rows
  that have x > 0, and a single repetition of those that do not; but
  actually it will produce five repetitions of every input row. This is
  because generate_series() is run first, and then the CASE expression is
  applied to its result rows. The behavior is thus comparable to

  SELECT x, CASE WHEN x > 0 THEN g ELSE 0 END
FROM tab, LATERAL generate_series(1,5) AS g;

  It would be exactly the same, except that in this specific example, the
  planner could choose to put g on the outside of the nestloop join, since
  g has no actual lateral dependency on tab. That would result in a
  different output row order. Set-returning functions in the select list
  are always evaluated as though they are on the inside of a nestloop join
  with the rest of the FROM clause, so that the function(s) are run to
  completion before the next row from the FROM clause is considered.

So is this too ugly to live, or shall we put up with it?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
On 2017-01-18 17:34:56 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > (I also noticed the previous patch should have had a catversion bump :(,
> > will do after the meeting).
> 
> Uh, why?  It isn't touching any on-disk data structure.

Forget what I said - I was rushing to a meeting and not thinking
entirely clearly.  Was thinking about the new node types and that we now
(de)serialize plans for parallelism - but that's guaranteed to be the
same version.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
Andres Freund  writes:
> (I also noticed the previous patch should have had a catversion bump :(,
> will do after the meeting).

Uh, why?  It isn't touching any on-disk data structure.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
On 2017-01-18 16:56:46 -0500, Tom Lane wrote:
> Andres Freund  writes:
> I have not actually looked at 0003 at all yet.  So yeah, please post
> for review after you're done rebasing.

Here's a rebased and lightly massaged version. I'm vanishing in a
meeting for a bit, thought it'd be more useful to get it now rather than
later.

(I also noticed the previous patch should have had a catversion bump :(,
will do after the meeting).

- Andres
>From 5a0bdc9543291c051c2dbab26492f6e0320e8f82 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 18 Jan 2017 13:51:47 -0800
Subject: [PATCH] Remove obsoleted code relating to targetlist SRF evaluation.

Author: Andres Freund
Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypow...@alap3.anarazel.de
---
 src/backend/catalog/index.c   |   3 +-
 src/backend/catalog/partition.c   |   5 +-
 src/backend/commands/copy.c   |   2 +-
 src/backend/commands/prepare.c|   3 +-
 src/backend/commands/tablecmds.c  |   3 +-
 src/backend/commands/typecmds.c   |   2 +-
 src/backend/executor/execAmi.c|  44 +-
 src/backend/executor/execQual.c   | 919 --
 src/backend/executor/execScan.c   |  30 +-
 src/backend/executor/execUtils.c  |   6 -
 src/backend/executor/nodeAgg.c|  52 +-
 src/backend/executor/nodeBitmapHeapscan.c |   2 -
 src/backend/executor/nodeCtescan.c|   2 -
 src/backend/executor/nodeCustom.c |   2 -
 src/backend/executor/nodeForeignscan.c|   2 -
 src/backend/executor/nodeFunctionscan.c   |   2 -
 src/backend/executor/nodeGather.c |  25 +-
 src/backend/executor/nodeGroup.c  |  42 +-
 src/backend/executor/nodeHash.c   |   2 +-
 src/backend/executor/nodeHashjoin.c   |  58 +-
 src/backend/executor/nodeIndexonlyscan.c  |   2 -
 src/backend/executor/nodeIndexscan.c  |  11 +-
 src/backend/executor/nodeLimit.c  |  19 +-
 src/backend/executor/nodeMergejoin.c  |  59 +-
 src/backend/executor/nodeModifyTable.c|   4 +-
 src/backend/executor/nodeNestloop.c   |  41 +-
 src/backend/executor/nodeProjectSet.c |   2 +-
 src/backend/executor/nodeResult.c |  33 +-
 src/backend/executor/nodeSamplescan.c |   8 +-
 src/backend/executor/nodeSeqscan.c|   2 -
 src/backend/executor/nodeSubplan.c|  31 +-
 src/backend/executor/nodeSubqueryscan.c   |   2 -
 src/backend/executor/nodeTidscan.c|   8 +-
 src/backend/executor/nodeValuesscan.c |   5 +-
 src/backend/executor/nodeWindowAgg.c  |  58 +-
 src/backend/executor/nodeWorktablescan.c  |   2 -
 src/backend/optimizer/util/clauses.c  |   4 +-
 src/backend/optimizer/util/predtest.c |   2 +-
 src/backend/utils/adt/domains.c   |   2 +-
 src/backend/utils/adt/xml.c   |   4 +-
 src/include/executor/executor.h   |   9 +-
 src/include/nodes/execnodes.h |  16 +-
 src/pl/plpgsql/src/pl_exec.c  |   5 +-
 43 files changed, 346 insertions(+), 1189 deletions(-)

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index cac0cbf7d4..26cbc0e06a 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1805,8 +1805,7 @@ FormIndexDatum(IndexInfo *indexInfo,
 elog(ERROR, "wrong number of index expressions");
 			iDatum = ExecEvalExprSwitchContext((ExprState *) lfirst(indexpr_item),
 			   GetPerTupleExprContext(estate),
-			   ,
-			   NULL);
+			   );
 			indexpr_item = lnext(indexpr_item);
 		}
 		values[i] = iDatum;
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 874e69d8d6..6dec75b59e 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -1339,7 +1339,7 @@ get_qual_for_range(PartitionKey key, PartitionBoundSpec *spec)
 			test_exprstate = ExecInitExpr(test_expr, NULL);
 			test_result = ExecEvalExprSwitchContext(test_exprstate,
 			  GetPerTupleExprContext(estate),
-	, NULL);
+	);
 			MemoryContextSwitchTo(oldcxt);
 			FreeExecutorState(estate);
 
@@ -1610,8 +1610,7 @@ FormPartitionKeyDatum(PartitionDispatch pd,
 elog(ERROR, "wrong number of partition key expressions");
 			datum = ExecEvalExprSwitchContext((ExprState *) lfirst(partexpr_item),
 			  GetPerTupleExprContext(estate),
-			  ,
-			  NULL);
+			  );
 			partexpr_item = lnext(partexpr_item);
 		}
 		values[i] = datum;
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 1fd2162794..ab666b9bdd 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -3395,7 +3395,7 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 		Assert(CurrentMemoryContext == econtext->ecxt_per_tuple_memory);
 
 		values[defmap[i]] = ExecEvalExpr(defexprs[i], econtext,
-		 [defmap[i]], NULL);
+		 

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
Andres Freund  writes:
> There's one sgml comment you'd added:
> "Furthermore, nested set-returning functions did not work at all."
> I'm not quite sure what you're referring to there - it was previously
> allowed to have one set argument to an SRF:

Ooops ... that was composed too hastily, evidently.  Will fix.

I'll try to write something about the SRF-in-CASE issue too.  Seeing
whether we can document that adequately seems like an important part
of making the decision about whether we need to block it.

> Working on rebasing the cleanup patch now.  Interested in reviewing
> that?  Otherwise I think I'll just push the rebased version of what I'd
> posted before, after making another pass through it.

I have not actually looked at 0003 at all yet.  So yeah, please post
for review after you're done rebasing.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
Hi,

On 2017-01-18 15:24:32 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > Yea, have something lying around.  Let me push it then when I get back from 
> > lunch?
> 
> Sure, no sweat.

Pushed.  Yay!

There's one sgml comment you'd added:
"Furthermore, nested set-returning functions did not work at all."

I'm not quite sure what you're referring to there - it was previously
allowed to have one set argument to an SRF:

postgres[28758][1]=# SELECT generate_series(1,generate_series(1,5));
┌─┐
│ generate_series │
├─┤
│   1 │
│   1 │
│   2 │
│   1 │
│   2 │
│   3 │


Am I misunderstanding what you meant?  I left it in what I committed,
but we probably should clear up the language there.


Working on rebasing the cleanup patch now.  Interested in reviewing
that?  Otherwise I think I'll just push the rebased version of what I'd
posted before, after making another pass through it.


- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
Andres Freund  writes:
> Yea, have something lying around.  Let me push it then when I get back from 
> lunch?

Sure, no sweat.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
Hi,

On January 18, 2017 12:00:12 PM PST, Tom Lane  wrote:
>Andres Freund  writes:
>> On 2017-01-18 14:24:12 -0500, Tom Lane wrote:
>>> So I think we can push this patch now and get on with the downstream
>>> patches.  Do you want to do the honors, or shall I?
>
>> Whatever you prefer - either way, I'll go on to rebasing the cleanup
>> patch afterwards (whose existance should probably be mentioned in the
>> commit message).
>
>OK, I can do it --- I have the revised patch already queued up in git
>stash, so it's easy.  Need to write a commit msg though.  Did you have
>a draft for that?

Yea, have something lying around.  Let me push it then when I get back from 
lunch?

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
Andres Freund  writes:
> On 2017-01-18 14:24:12 -0500, Tom Lane wrote:
>> So I think we can push this patch now and get on with the downstream
>> patches.  Do you want to do the honors, or shall I?

> Whatever you prefer - either way, I'll go on to rebasing the cleanup
> patch afterwards (whose existance should probably be mentioned in the
> commit message).

OK, I can do it --- I have the revised patch already queued up in git
stash, so it's easy.  Need to write a commit msg though.  Did you have
a draft for that?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
Hi,

On 2017-01-18 14:24:12 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2017-01-18 08:43:24 -0500, Tom Lane wrote:
> >> ... except for one thing.  The more I look at it,
> >> the more disturbed I am by the behavioral change shown in rangefuncs.out
> >> --- that's the SRF-in-one-arm-of-CASE issue.
> 
> > I'm fine with leaving it as is in the patch, but I'm also fine with
> > changing things to ERROR.  Personally I don't think it matters much, and
> > we can whack it back and forth as we want later.  Thus I'm inclined to
> > commit it without erroring out; since presumably we'll take some time
> > deciding on what exactly we want to prohibit.
> 
> I agree.  If we do decide to throw an error, it would best be done in
> parse analysis, and thus would be practically independent of this patch
> anyway.

Cool, agreed then.


> >> * This bit in ExecProjectSRF was no good:
> >> + else if (IsA(gstate->arg, FuncExprState) &&
> >> +  ((FuncExpr *) gstate->arg->expr)->funcretset)
> 
> > Argh. That should have been FunExprState->func->fn_retset.
> 
> Nope; that was my first thought as well, but fn_retset isn't valid if
> init_fcache hasn't been run yet, which it won't have been the first time
> through.

Righty-O :(


> So I think we can push this patch now and get on with the downstream
> patches.  Do you want to do the honors, or shall I?

Whatever you prefer - either way, I'll go on to rebasing the cleanup
patch afterwards (whose existance should probably be mentioned in the
commit message).


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
Andres Freund  writes:
> On 2017-01-18 08:43:24 -0500, Tom Lane wrote:
>> ... except for one thing.  The more I look at it,
>> the more disturbed I am by the behavioral change shown in rangefuncs.out
>> --- that's the SRF-in-one-arm-of-CASE issue.

> I'm fine with leaving it as is in the patch, but I'm also fine with
> changing things to ERROR.  Personally I don't think it matters much, and
> we can whack it back and forth as we want later.  Thus I'm inclined to
> commit it without erroring out; since presumably we'll take some time
> deciding on what exactly we want to prohibit.

I agree.  If we do decide to throw an error, it would best be done in
parse analysis, and thus would be practically independent of this patch
anyway.

>> * This bit in ExecProjectSRF was no good:
>> + else if (IsA(gstate->arg, FuncExprState) &&
>> +  ((FuncExpr *) gstate->arg->expr)->funcretset)

> Argh. That should have been FunExprState->func->fn_retset.

Nope; that was my first thought as well, but fn_retset isn't valid if
init_fcache hasn't been run yet, which it won't have been the first time
through.

So I think we can push this patch now and get on with the downstream
patches.  Do you want to do the honors, or shall I?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Andres Freund
Hi,

On 2017-01-18 08:43:24 -0500, Tom Lane wrote:
> I did a review pass over 0001 and 0002.  I think the attached updated
> version is committable

Cool.

> ... except for one thing.  The more I look at it,
> the more disturbed I am by the behavioral change shown in rangefuncs.out
> --- that's the SRF-in-one-arm-of-CASE issue.  (The changes in tsrf.out
> are fine and as per agreement.)  We touched lightly on that point far
> upthread, but didn't really resolve it.  What's bothering me is that
> we're changing, silently, from a reasonably-intuitive behavior to a
> completely-not-intuitive one.  Since we got a bug report for the previous
> less-than-intuitive behavior for such cases, it's inevitable that we'll
> get bug reports for this.  I think it'd be far better to throw error for
> SRF-inside-a-CASE.  If we don't, we certainly need to document this,
> and I'm not very sure how to explain it clearly.

I'm fine with leaving it as is in the patch, but I'm also fine with
changing things to ERROR.  Personally I don't think it matters much, and
we can whack it back and forth as we want later.  Thus I'm inclined to
commit it without erroring out; since presumably we'll take some time
deciding on what exactly we want to prohibit.


> Anyway, I've not done anything about that in the attached.  What I did do:
> 
> * Merge 0001 and 0002.  I appreciate you having separated that for my
> review, but it doesn't make any sense to commit the parts of 0001 that
> you undid in 0002.

Right. I was suggesting upthread that we'd merge them before committing.


> * Obviously, ExecMakeFunctionResultSet can be greatly simplified now
> that it need not deal with hasSetArg cases.

Yea, I've cleaned it up in my 0003; where it would have started to error
out too (without an explicit check), because there's no set evaluating
function anymore besides ExecMakeFunctionResultSet.


> I saw you'd left that
> for later, which is mostly fine, but I did lobotomize it just enough
> to throw an error if it gets a set result from an argument.  Without
> that, we wouldn't really be testing that the planner splits nested
> SRFs correctly.

Ok, that makes sense.


> * This bit in ExecProjectSRF was no good:
> 
> + else if (IsA(gstate->arg, FuncExprState) &&
> +  ((FuncExpr *) gstate->arg->expr)->funcretset)
> 
> because FuncExprState is used for more node types than just FuncExpr;
> in particular this would fail (except perhaps by accident) for a
> set-returning OpExpr.

Argh. That should have been FunExprState->func->fn_retset.  Anyway, your
approach works, too.


> * Update the user documentation (didn't address the CASE issue, though).

Cool.


Greetings,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-18 Thread Tom Lane
I did a review pass over 0001 and 0002.  I think the attached updated
version is committable ... except for one thing.  The more I look at it,
the more disturbed I am by the behavioral change shown in rangefuncs.out
--- that's the SRF-in-one-arm-of-CASE issue.  (The changes in tsrf.out
are fine and as per agreement.)  We touched lightly on that point far
upthread, but didn't really resolve it.  What's bothering me is that
we're changing, silently, from a reasonably-intuitive behavior to a
completely-not-intuitive one.  Since we got a bug report for the previous
less-than-intuitive behavior for such cases, it's inevitable that we'll
get bug reports for this.  I think it'd be far better to throw error for
SRF-inside-a-CASE.  If we don't, we certainly need to document this,
and I'm not very sure how to explain it clearly.

Upthread we had put COALESCE in the same bucket, but I think that's less
of a problem, because in typical usages the SRF would be in the first
argument and so users wouldn't be expecting conditional evaluation.

Anyway, I've not done anything about that in the attached.  What I did do:

* Merge 0001 and 0002.  I appreciate you having separated that for my
review, but it doesn't make any sense to commit the parts of 0001 that
you undid in 0002.

* Rename the node types as per yesterday's discussion.

* Require Project to always have an input plan node.

* Obviously, ExecMakeFunctionResultSet can be greatly simplified now
that it need not deal with hasSetArg cases.  I saw you'd left that
for later, which is mostly fine, but I did lobotomize it just enough
to throw an error if it gets a set result from an argument.  Without
that, we wouldn't really be testing that the planner splits nested
SRFs correctly.

* This bit in ExecProjectSRF was no good:

+ else if (IsA(gstate->arg, FuncExprState) &&
+  ((FuncExpr *) gstate->arg->expr)->funcretset)

because FuncExprState is used for more node types than just FuncExpr;
in particular this would fail (except perhaps by accident) for a
set-returning OpExpr.  I chose to fix it by adding a funcReturnsSet
field to FuncExprState and insisting that ExecInitExpr fill that in
immediately, which it can do easily.

* Minor style and comment improvements; fix a couple small oversights
such as missing outfuncs.c support.

* Update the user documentation (didn't address the CASE issue, though).

regards, tom lane



use-project-set-for-tlist-srfs.patch.gz
Description: use-project-set-for-tlist-srfs.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-17 Thread Tom Lane
Andres Freund  writes:
> On 2017-01-17 13:43:38 -0500, Tom Lane wrote:
>> I'm not convinced that that optimization is worth preserving, but if we
>> keep it then ProjectSet isn't le mot juste here, any more than you'd want
>> to rename Result to Project without changing its existing
>> functionality.

> Right. I'd removed that, and re-added it; primarily because the plans
> looked more complex without it. After all, you'd thought it worth adding
> that hack ;)   I'm happy with removing it again too.

Well, it seemed reasonable to do that as long as the only cost was ten or
so lines in create_projection_plan.  But if we're contorting not only the
behavior but the very name of the SRF-evaluation plan node type, that's
not negligible cost anymore.  So now I'm inclined to take it out.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-17 Thread Andres Freund
On 2017-01-17 13:43:38 -0500, Tom Lane wrote:
> Although ... looking closer at Andres' patch, the new node type *is*
> channeling Result, in the sense that it might or might not have any input
> plan.  This probably traces to what I wrote in September:
>
> + * XXX Possibly-temporary hack: if the subpath is a dummy ResultPath,
> + * don't bother with it, just make a Result with no input.  This avoids 
> an
> + * extra Result plan node when doing "SELECT srf()".  Depending on what 
> we
> + * decide about the desired plan structure for SRF-expanding nodes, this
> + * optimization might have to go away, and in any case it'll probably 
> look
> + * a good bit different.
>
> I'm not convinced that that optimization is worth preserving, but if we
> keep it then ProjectSet isn't le mot juste here, any more than you'd want
> to rename Result to Project without changing its existing
> functionality.

Right. I'd removed that, and re-added it; primarily because the plans
looked more complex without it. After all, you'd thought it worth adding
that hack ;)   I'm happy with removing it again too.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-17 Thread Tom Lane
Robert Haas  writes:
> On Tue, Jan 17, 2017 at 1:18 PM, Tom Lane  wrote:
>> Using Result for two completely different things is a wart though.  If we
>> had it to do over I think we'd define Result as a scan node that produces
>> rows from no input, and create a separate Project node for the case of
>> projecting from input tuples.  People are used to seeing Result in EXPLAIN
>> output, so it's not worth the trouble of changing that IMO, but we don't
>> have to use it as a model for more node types.

> +1, although I think changing the existing node would be fine too if
> somebody wanted to do the work.  It's not worth having that wart
> forever just to avoid whatever minor pain-of-adjustment might be
> involved.

Although ... looking closer at Andres' patch, the new node type *is*
channeling Result, in the sense that it might or might not have any input
plan.  This probably traces to what I wrote in September:

+ * XXX Possibly-temporary hack: if the subpath is a dummy ResultPath,
+ * don't bother with it, just make a Result with no input.  This avoids an
+ * extra Result plan node when doing "SELECT srf()".  Depending on what we
+ * decide about the desired plan structure for SRF-expanding nodes, this
+ * optimization might have to go away, and in any case it'll probably look
+ * a good bit different.

I'm not convinced that that optimization is worth preserving, but if we
keep it then ProjectSet isn't le mot juste here, any more than you'd want
to rename Result to Project without changing its existing functionality.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-17 Thread Robert Haas
On Tue, Jan 17, 2017 at 1:18 PM, Tom Lane  wrote:
> Andres Freund  writes:
>> I'd not have gone for SetResult if we didn't already have Result.  I'm
>> not super happy ending up having Project in ProjectSet but not in the
>> Result that end up doing the majority of the projection.  But eh, we can
>> live with it.
>
> Using Result for two completely different things is a wart though.  If we
> had it to do over I think we'd define Result as a scan node that produces
> rows from no input, and create a separate Project node for the case of
> projecting from input tuples.  People are used to seeing Result in EXPLAIN
> output, so it's not worth the trouble of changing that IMO, but we don't
> have to use it as a model for more node types.

+1, although I think changing the existing node would be fine too if
somebody wanted to do the work.  It's not worth having that wart
forever just to avoid whatever minor pain-of-adjustment might be
involved.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-17 Thread Tom Lane
Andres Freund  writes:
> I'd not have gone for SetResult if we didn't already have Result.  I'm
> not super happy ending up having Project in ProjectSet but not in the
> Result that end up doing the majority of the projection.  But eh, we can
> live with it.

Using Result for two completely different things is a wart though.  If we
had it to do over I think we'd define Result as a scan node that produces
rows from no input, and create a separate Project node for the case of
projecting from input tuples.  People are used to seeing Result in EXPLAIN
output, so it's not worth the trouble of changing that IMO, but we don't
have to use it as a model for more node types.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-17 Thread Andres Freund
Hi,

On 2017-01-17 12:52:20 -0500, Tom Lane wrote:
> Robert Haas  writes:
> > On Mon, Jan 16, 2017 at 2:13 PM, Tom Lane  wrote:
> >> "Srf" is ugly as can be, and unintelligible.  SetResult might be OK.
>
> > The operation we're performing here, IIUC, is projection.  SetResult
> > lacks a verb, although Set could be confused with one; someone might
> > think this is the node that sets a result, whatever that means.
> > Anyway, I suggest working Project in there somehow.  If Project by
> > itself seems like it's too generic, perhaps ProjectSet or
> > ProjectSetResult would be suitable.

I'd not have gone for SetResult if we didn't already have Result.  I'm
not super happy ending up having Project in ProjectSet but not in the
Result that end up doing the majority of the projection.  But eh, we can
live with it.


> Andres' patch is already using "SetProjectionPath" for the path struct
> type.  Maybe make that "ProjectSetPath", giving rise to a "ProjectSet"
> plan node?

WFM.


> I'm happy to do a global-search-and-replace while I'm reviewing the
> patch, but let's decide on names PDQ.

Yes, let's decide soon please.


Greeting,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-17 Thread Robert Haas
On Tue, Jan 17, 2017 at 12:52 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Jan 16, 2017 at 2:13 PM, Tom Lane  wrote:
>>> "Srf" is ugly as can be, and unintelligible.  SetResult might be OK.
>
>> The operation we're performing here, IIUC, is projection.  SetResult
>> lacks a verb, although Set could be confused with one; someone might
>> think this is the node that sets a result, whatever that means.
>> Anyway, I suggest working Project in there somehow.  If Project by
>> itself seems like it's too generic, perhaps ProjectSet or
>> ProjectSetResult would be suitable.
>
> Andres' patch is already using "SetProjectionPath" for the path struct
> type.  Maybe make that "ProjectSetPath", giving rise to a "ProjectSet"
> plan node?

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-17 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jan 16, 2017 at 2:13 PM, Tom Lane  wrote:
>> "Srf" is ugly as can be, and unintelligible.  SetResult might be OK.

> The operation we're performing here, IIUC, is projection.  SetResult
> lacks a verb, although Set could be confused with one; someone might
> think this is the node that sets a result, whatever that means.
> Anyway, I suggest working Project in there somehow.  If Project by
> itself seems like it's too generic, perhaps ProjectSet or
> ProjectSetResult would be suitable.

Andres' patch is already using "SetProjectionPath" for the path struct
type.  Maybe make that "ProjectSetPath", giving rise to a "ProjectSet"
plan node?

I'm happy to do a global-search-and-replace while I'm reviewing the
patch, but let's decide on names PDQ.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-17 Thread Robert Haas
On Mon, Jan 16, 2017 at 2:13 PM, Tom Lane  wrote:
> Andres Freund  writes:
>> That worked quite well.  So we have a few questions, before I clean this
>> up:
>
>> - For now the node is named 'Srf' both internally and in explain - not
>>   sure if we want to make that something longer/easier to understand for
>>   others? Proposals? TargetFunctionScan? SetResult?
>
> "Srf" is ugly as can be, and unintelligible.  SetResult might be OK.

The operation we're performing here, IIUC, is projection.  SetResult
lacks a verb, although Set could be confused with one; someone might
think this is the node that sets a result, whatever that means.
Anyway, I suggest working Project in there somehow.  If Project by
itself seems like it's too generic, perhaps ProjectSet or
ProjectSetResult would be suitable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-16 Thread Alvaro Herrera
Andres Freund wrote:
> On 2017-01-16 16:04:34 -0300, Alvaro Herrera wrote:
> > Andres Freund wrote:
> > > On 2017-01-16 12:17:46 -0300, Alvaro Herrera wrote:
> > > > Hmm.  I wonder if your stuff could be used as support code for
> > > > XMLTABLE[1].
> > > 
> > > I don't immediately see what functionality overlaps, could you expand on
> > > that?
> > 
> > Well, I haven't read any previous patches in this area, but the xmltable
> > patch adds a new way of handling set-returning expressions, so it
> > appears vaguely related.
> 
> Ugh. That's not good - I'm about to remove isDone. Like completely.
> That's why I'm actually working on all this, because random expressions
> returning more rows makes optimizing expression evaluation a lot harder.

Argh.

> > These aren't properly functions in the current sense of the word,
> > though.
> 
> Why aren't they? Looks like it'd be doable to do so, at least below the
> parser level?

Hmm ...

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-16 Thread Andres Freund
On 2017-01-16 16:04:34 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
> > On 2017-01-16 12:17:46 -0300, Alvaro Herrera wrote:
> > > Hmm.  I wonder if your stuff could be used as support code for
> > > XMLTABLE[1].
> > 
> > I don't immediately see what functionality overlaps, could you expand on
> > that?
> 
> Well, I haven't read any previous patches in this area, but the xmltable
> patch adds a new way of handling set-returning expressions, so it
> appears vaguely related.

Ugh. That's not good - I'm about to remove isDone. Like completely.
That's why I'm actually working on all this, because random expressions
returning more rows makes optimizing expression evaluation a lot harder.

> These aren't properly functions in the current sense of the word,
> though.

Why aren't they? Looks like it'd be doable to do so, at least below the
parser level?


Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-16 Thread Andres Freund
Hi,

On 2017-01-16 14:13:18 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > That worked quite well.  So we have a few questions, before I clean this
> > up:
>
> > - For now the node is named 'Srf' both internally and in explain - not
> >   sure if we want to make that something longer/easier to understand for
> >   others? Proposals? TargetFunctionScan? SetResult?
>
> "Srf" is ugly as can be, and unintelligible.  SetResult might be OK.

Named it SetResult - imo looks ok.  I think I do prefer the separate
node type over re-using Result.  The planner integration looks cleaner
to me due to not needing the srfpp special cases and such.


> > Comments?
>
> Hard to comment on your other points without a patch to look at.

Attached the current version. There's a *lot* of pending cleanup needed
(especially in execQual.c) removing outdated code/comments etc, but this
seems good enough for a first review.  I'd want that cleanup done in a
separate patch anyway.


Attached are two patches. The first is just a rebased version (just some
hunk offset changed) of your planner patch, on top of that is my
executor patch.  My patch moves some minor detail in yours around, and I
do think they should eventually be merged; but leaving it split for a
round displays the changes more cleanly.

Additional questions:
- do we care about SRFs that don't actually return a set? If so we need
  to change the error checking code in ExecEvalFunc/Oper and move it to
  the actual invocation.
- the FuncExpr/OpExpr check in ExecMakeFunctionResult is fairly ugly imo
  - but I don't quite see a much better solution.

Greetings,

Andres
>From 2c16e67f46f418239ab90a51611f168508bac66e Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Sun, 15 Jan 2017 19:23:22 -0800
Subject: [PATCH 1/2] Put SRF into a separate node v1.

Author: Tom Lane
Discussion: https://postgr.es/m/557.1473895...@sss.pgh.pa.us
---
 src/backend/nodes/outfuncs.c |   1 +
 src/backend/optimizer/plan/createplan.c  |  33 -
 src/backend/optimizer/plan/planner.c | 219 +--
 src/backend/optimizer/util/clauses.c | 104 ++-
 src/backend/optimizer/util/pathnode.c|  75 +++
 src/backend/optimizer/util/tlist.c   | 199 
 src/include/nodes/relation.h |   1 +
 src/include/optimizer/clauses.h  |   1 -
 src/include/optimizer/pathnode.h |   4 +
 src/include/optimizer/tlist.h|   3 +
 src/test/regress/expected/aggregates.out |   3 +-
 src/test/regress/expected/limit.out  |  10 +-
 src/test/regress/expected/rangefuncs.out |  10 +-
 src/test/regress/expected/subselect.out  |  26 ++--
 src/test/regress/expected/tsrf.out   |  11 +-
 15 files changed, 544 insertions(+), 156 deletions(-)

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index cf0a6059e9..73fdc9706d 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1805,6 +1805,7 @@ _outProjectionPath(StringInfo str, const ProjectionPath *node)
 
 	WRITE_NODE_FIELD(subpath);
 	WRITE_BOOL_FIELD(dummypp);
+	WRITE_BOOL_FIELD(srfpp);
 }
 
 static void
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index c7bcd9b84c..875de739a8 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1421,8 +1421,21 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path)
 	Plan	   *subplan;
 	List	   *tlist;
 
-	/* Since we intend to project, we don't need to constrain child tlist */
-	subplan = create_plan_recurse(root, best_path->subpath, 0);
+	/*
+	 * XXX Possibly-temporary hack: if the subpath is a dummy ResultPath,
+	 * don't bother with it, just make a Result with no input.  This avoids an
+	 * extra Result plan node when doing "SELECT srf()".  Depending on what we
+	 * decide about the desired plan structure for SRF-expanding nodes, this
+	 * optimization might have to go away, and in any case it'll probably look
+	 * a good bit different.
+	 */
+	if (IsA(best_path->subpath, ResultPath) &&
+		((ResultPath *) best_path->subpath)->path.pathtarget->exprs == NIL &&
+		((ResultPath *) best_path->subpath)->quals == NIL)
+		subplan = NULL;
+	else
+		/* Since we intend to project, we don't need to constrain child tlist */
+		subplan = create_plan_recurse(root, best_path->subpath, 0);
 
 	tlist = build_path_tlist(root, _path->path);
 
@@ -1441,8 +1454,9 @@ create_projection_plan(PlannerInfo *root, ProjectionPath *best_path)
 	 * creation, but that would add expense to creating Paths we might end up
 	 * not using.)
 	 */
-	if (is_projection_capable_path(best_path->subpath) ||
-		tlist_same_exprs(tlist, subplan->targetlist))
+	if (!best_path->srfpp &&
+		(is_projection_capable_path(best_path->subpath) ||
+		 tlist_same_exprs(tlist, subplan->targetlist)))
 	{
 		/* Don't need a separate Result, just assign tlist to subplan */
 

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-16 Thread Tom Lane
Andres Freund  writes:
> That worked quite well.  So we have a few questions, before I clean this
> up:

> - For now the node is named 'Srf' both internally and in explain - not
>   sure if we want to make that something longer/easier to understand for
>   others? Proposals? TargetFunctionScan? SetResult?

"Srf" is ugly as can be, and unintelligible.  SetResult might be OK.

> - I continued with the division of Labor that Tom had set up, so we're
>   creating one Srf node for each "nested" set of SRFs.  We'd discussed
>   nearby to change that for one node/path for all nested SRFs, partially
>   because of costing.  But I don't like the idea that much anymore. The
>   implementation seems cleaner (and probably faster) this way, and I
>   don't think nested targetlist SRFs are something worth optimizing
>   for.  Anybody wants to argue differently?

Not me.

> Comments?

Hard to comment on your other points without a patch to look at.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-16 Thread Alvaro Herrera
Andres Freund wrote:
> On 2017-01-16 12:17:46 -0300, Alvaro Herrera wrote:
> > Andres Freund wrote:
> > 
> > > That worked quite well.  So we have a few questions, before I clean this
> > > up:
> > > 
> > > - For now the node is named 'Srf' both internally and in explain - not
> > >   sure if we want to make that something longer/easier to understand for
> > >   others? Proposals? TargetFunctionScan? SetResult?
> > > 
> > > - We could alternatively add all this into the Result node - it's not
> > >   actually a lot of new code, and most of that is boilerplate stuff
> > >   about adding a new node.  I'm ok with both.
> > 
> > Hmm.  I wonder if your stuff could be used as support code for
> > XMLTABLE[1].
> 
> I don't immediately see what functionality overlaps, could you expand on
> that?

Well, I haven't read any previous patches in this area, but the xmltable
patch adds a new way of handling set-returning expressions, so it
appears vaguely related.  These aren't properly functions in the current
sense of the word, though.  There is some parallel to what
ExecMakeFunctionResult does, which I suppose is related.

> > Currently it has a bit of additional code of its own,
> > though admittedly it's very little code executor-side.  Would you mind
> > sharing a patch, or more details on how it works?
> 
> Can do both; cleaning up the patch now. What we're talking about here is
> a way to implement targetlist SRF that is based on:
> 
> 1) a patch by Tom that creates additional Result (or now Srf) executor
>nodes containing SRF evaluation. This guarantees that only Result/Srf
>nodes have to deal with targetlist SRF evaluation.
> 
> 2) new code to evaluate SRFs in the new Result/Srf node, that doesn't
>rely on ExecEvalExpr et al. to have a IsDone argument. Instead
>there's special code to handle that in the new node. That's possible
>because it's now guaranted that all SRFs are "toplevel" in the
>relevant targetlist(s).
> 
> 3) Removal of all nearly tSRF related code execQual.c and other
>executor/ files, including the node->ps.ps_TupFromTlist checks
>everywhere.
> 
> makes sense?

Hmm, okay.  (The ps_TupFromTlist thing has long seemed an ugly
construction.)  I think the current term for this kind of thing is
TableFunction -- are you really naming this "Srf" literally?  It seems
strange, but maybe it's just me.

> > Nested targetlist SRFs make my head spin.  I suppose they may have some
> > use, but where would you really want this:
> 
> I think there's some cases where it can be useful. Targetlist SRFs as a
> whole really are much more about backward compatibility than anything :)

Sure.

> > ?  If supporting the former makes it harder to support/optimize more
> > reasonable cases, it seems fair game to leave them behind.
> 
> I don't want to desupport them, just don't want to restructure (one node
> doing several levels of SRFs, instead of one per level) just to make it
> easier to give good estimates.

No objections.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-16 Thread Andres Freund
On 2017-01-16 12:17:46 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
> 
> > That worked quite well.  So we have a few questions, before I clean this
> > up:
> > 
> > - For now the node is named 'Srf' both internally and in explain - not
> >   sure if we want to make that something longer/easier to understand for
> >   others? Proposals? TargetFunctionScan? SetResult?
> > 
> > - We could alternatively add all this into the Result node - it's not
> >   actually a lot of new code, and most of that is boilerplate stuff
> >   about adding a new node.  I'm ok with both.
> 
> Hmm.  I wonder if your stuff could be used as support code for
> XMLTABLE[1].

I don't immediately see what functionality overlaps, could you expand on
that?


> Currently it has a bit of additional code of its own,
> though admittedly it's very little code executor-side.  Would you mind
> sharing a patch, or more details on how it works?

Can do both; cleaning up the patch now. What we're talking about here is
a way to implement targetlist SRF that is based on:

1) a patch by Tom that creates additional Result (or now Srf) executor
   nodes containing SRF evaluation. This guarantees that only Result/Srf
   nodes have to deal with targetlist SRF evaluation.

2) new code to evaluate SRFs in the new Result/Srf node, that doesn't
   rely on ExecEvalExpr et al. to have a IsDone argument. Instead
   there's special code to handle that in the new node. That's possible
   because it's now guaranted that all SRFs are "toplevel" in the
   relevant targetlist(s).

3) Removal of all nearly tSRF related code execQual.c and other
   executor/ files, including the node->ps.ps_TupFromTlist checks
   everywhere.

makes sense?


> > - I continued with the division of Labor that Tom had set up, so we're
> >   creating one Srf node for each "nested" set of SRFs.  We'd discussed
> >   nearby to change that for one node/path for all nested SRFs, partially
> >   because of costing.  But I don't like the idea that much anymore. The
> >   implementation seems cleaner (and probably faster) this way, and I
> >   don't think nested targetlist SRFs are something worth optimizing
> >   for.  Anybody wants to argue differently?
> 
> Nested targetlist SRFs make my head spin.  I suppose they may have some
> use, but where would you really want this:

I think there's some cases where it can be useful. Targetlist SRFs as a
whole really are much more about backward compatibility than anything :)


> ?  If supporting the former makes it harder to support/optimize more
> reasonable cases, it seems fair game to leave them behind.

I don't want to desupport them, just don't want to restructure (one node
doing several levels of SRFs, instead of one per level) just to make it
easier to give good estimates.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-16 Thread Alvaro Herrera
Andres Freund wrote:

> That worked quite well.  So we have a few questions, before I clean this
> up:
> 
> - For now the node is named 'Srf' both internally and in explain - not
>   sure if we want to make that something longer/easier to understand for
>   others? Proposals? TargetFunctionScan? SetResult?
> 
> - We could alternatively add all this into the Result node - it's not
>   actually a lot of new code, and most of that is boilerplate stuff
>   about adding a new node.  I'm ok with both.

Hmm.  I wonder if your stuff could be used as support code for
XMLTABLE[1].  Currently it has a bit of additional code of its own,
though admittedly it's very little code executor-side.  Would you mind
sharing a patch, or more details on how it works?

[1] 
https://www.postgresql.org/message-id/cafj8pra_keukobxvs4v-imoeesxu0pd0ashv0-mwrfdrwte...@mail.gmail.com

> - I continued with the division of Labor that Tom had set up, so we're
>   creating one Srf node for each "nested" set of SRFs.  We'd discussed
>   nearby to change that for one node/path for all nested SRFs, partially
>   because of costing.  But I don't like the idea that much anymore. The
>   implementation seems cleaner (and probably faster) this way, and I
>   don't think nested targetlist SRFs are something worth optimizing
>   for.  Anybody wants to argue differently?

Nested targetlist SRFs make my head spin.  I suppose they may have some
use, but where would you really want this:

alvherre=# select generate_series(1, generate_series(2, 4));
 generate_series 
─
   1
   2
   1
   2
   3
   1
   2
   3
   4
(9 filas)

instead of the much more sensible

alvherre=# select i, j from generate_series(2, 4) i, generate_series(1, i) j;
 i │ j 
───┼───
 2 │ 1
 2 │ 2
 3 │ 1
 3 │ 2
 3 │ 3
 4 │ 1
 4 │ 2
 4 │ 3
 4 │ 4
(9 filas)

?  If supporting the former makes it harder to support/optimize more
reasonable cases, it seems fair game to leave them behind.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-16 Thread Andres Freund
On 2017-01-15 19:29:52 -0800, Andres Freund wrote:
> On 2016-10-31 09:06:39 -0700, Andres Freund wrote:
> > On 2016-09-14 19:28:25 -0400, Tom Lane wrote:
> > > Andres Freund  writes:
> > > > On 2016-09-12 19:35:22 -0400, Tom Lane wrote:
> > > Here's a draft patch that is just meant to investigate what the planner
> > > changes might look like if we do it in the pipelined-result way.
> > > Accordingly, I didn't touch the executor, but just had it emit regular
> > > Result nodes for SRF-execution steps.  However, the SRFs are all
> > > guaranteed to appear at top level of their respective tlists, so that
> > > those Results could be replaced with something that works like
> > > nodeFunctionscan.
> >
> > > So I think we should continue investigating this way of doing things.
> > > I'll try to take a look at the executor end of it tomorrow.  However
> > > I'm leaving Friday for a week's vacation, and may not have anything to
> > > show before that.
> >
> > Are you planning to work on the execution side of things? I otherwise
> > can take a stab...
>
> Doing so now.

That worked quite well.  So we have a few questions, before I clean this
up:

- For now the node is named 'Srf' both internally and in explain - not
  sure if we want to make that something longer/easier to understand for
  others? Proposals? TargetFunctionScan? SetResult?

- We could alternatively add all this into the Result node - it's not
  actually a lot of new code, and most of that is boilerplate stuff
  about adding a new node.  I'm ok with both.

- I continued with the division of Labor that Tom had set up, so we're
  creating one Srf node for each "nested" set of SRFs.  We'd discussed
  nearby to change that for one node/path for all nested SRFs, partially
  because of costing.  But I don't like the idea that much anymore. The
  implementation seems cleaner (and probably faster) this way, and I
  don't think nested targetlist SRFs are something worth optimizing
  for.  Anybody wants to argue differently?

- I chose to error out if a retset function appears in ExecEvalFunc/Oper
  and make both unconditionally set evalfunc to
  ExecMakeFunctionResultNoSets.  ExecMakeFunctionResult() is now
  externally visible.  That seems like the least noisy way to change
  things over, but I'm open for different proposals.

Comments?

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2017-01-15 Thread Andres Freund
On 2016-10-31 09:06:39 -0700, Andres Freund wrote:
> On 2016-09-14 19:28:25 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > On 2016-09-12 19:35:22 -0400, Tom Lane wrote:
> > Here's a draft patch that is just meant to investigate what the planner
> > changes might look like if we do it in the pipelined-result way.
> > Accordingly, I didn't touch the executor, but just had it emit regular
> > Result nodes for SRF-execution steps.  However, the SRFs are all
> > guaranteed to appear at top level of their respective tlists, so that
> > those Results could be replaced with something that works like
> > nodeFunctionscan.
> 
> > So I think we should continue investigating this way of doing things.
> > I'll try to take a look at the executor end of it tomorrow.  However
> > I'm leaving Friday for a week's vacation, and may not have anything to
> > show before that.
> 
> Are you planning to work on the execution side of things? I otherwise
> can take a stab...

Doing so now.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-10-31 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-14 19:28:25 -0400, Tom Lane wrote:
>> So I think we should continue investigating this way of doing things.
>> I'll try to take a look at the executor end of it tomorrow.  However
>> I'm leaving Friday for a week's vacation, and may not have anything to
>> show before that.

> Are you planning to work on the execution side of things? I otherwise
> can take a stab...

My plan is to start on this when I go back into commitfest mode,
but right now I'm trying to produce a draft patch for RLS changes.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-10-31 Thread Andres Freund
Hi Tom,

On 2016-09-14 19:28:25 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 19:35:22 -0400, Tom Lane wrote:
> >> Anyway I'll draft a prototype and then we can compare.
> 
> > Ok, cool.
> 
> Here's a draft patch that is just meant to investigate what the planner
> changes might look like if we do it in the pipelined-result way.
> Accordingly, I didn't touch the executor, but just had it emit regular
> Result nodes for SRF-execution steps.  However, the SRFs are all
> guaranteed to appear at top level of their respective tlists, so that
> those Results could be replaced with something that works like
> nodeFunctionscan.

> So I think we should continue investigating this way of doing things.
> I'll try to take a look at the executor end of it tomorrow.  However
> I'm leaving Friday for a week's vacation, and may not have anything to
> show before that.

Are you planning to work on the execution side of things? I otherwise
can take a stab...

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-10-02 Thread Michael Paquier
On Fri, Sep 16, 2016 at 6:12 AM, Tom Lane  wrote:
> I'm happy to get rid of the LCM behavior, I just want to have some wiggle
> room to be able to get it back if somebody really needs it.

Er, actually no that's this thread for this CF entry:
https://commitfest.postgresql.org/10/759/
Still there has not been much activity.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-15 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-15 16:48:59 -0400, Tom Lane wrote:
>> The patch that I posted would run both the generate_series(1, 2) and
>> generate_series(2,4) calls in the same SRF node, forcing them to run in
>> lockstep, after which their results would be fed to the SRF node doing
>> the top-level SRFs.  We could probably change it to run them in separate
>> nodes, but I don't see any principled way to decide which one goes first
>> (and in some variants of this example, it would matter).

> I think that's fine. I personally still think we're *much* better off
> getting rid of the non-lockstep variants. You're still on the fence
> about retaining the LCM behaviour (for the same nesting level at least)?

I'm happy to get rid of the LCM behavior, I just want to have some wiggle
room to be able to get it back if somebody really needs it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-15 Thread Andres Freund
On 2016-09-15 16:48:59 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Hm. One thing I wonder about with this approach, is how we're going to
> > handle something absurd like:
> > SELECT generate_series(1, generate_series(1, 2)), generate_series(1, 
> > generate_series(2,4));
> 
> The patch that I posted would run both the generate_series(1, 2) and
> generate_series(2,4) calls in the same SRF node, forcing them to run in
> lockstep, after which their results would be fed to the SRF node doing
> the top-level SRFs.  We could probably change it to run them in separate
> nodes, but I don't see any principled way to decide which one goes first
> (and in some variants of this example, it would matter).

I think that's fine. I personally still think we're *much* better off
getting rid of the non-lockstep variants. You're still on the fence
about retaining the LCM behaviour (for the same nesting level at least)?


> I think the LATERAL approach would face exactly the same issues: how
> many LATERAL nodes do you use, and what's their join order?

I think this specific issue could be handled in a bit easier to grasp
variant. My PoC basically generated one RTE for each "query
level". There'd have been one RTE for generate_series(1,2), one for
gs(2,4) and one for gs(1, var(gs(1,2))), gs(1, var(gs(2,4))). Lateral
machiner would force the join order to have the argument srfs first, and
then the twoi combined srf with lateral arguments after that.

> I think we could get away with defining it like this (ie, SRFs at the same
> SRF nesting level run in lockstep) as long as it's documented.  Whatever
> the current behavior is for such cases would be pretty bizarre too.

Indeed.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-15 Thread Tom Lane
Andres Freund  writes:
> Hm. One thing I wonder about with this approach, is how we're going to
> handle something absurd like:
> SELECT generate_series(1, generate_series(1, 2)), generate_series(1, 
> generate_series(2,4));

The patch that I posted would run both the generate_series(1, 2) and
generate_series(2,4) calls in the same SRF node, forcing them to run in
lockstep, after which their results would be fed to the SRF node doing
the top-level SRFs.  We could probably change it to run them in separate
nodes, but I don't see any principled way to decide which one goes first
(and in some variants of this example, it would matter).  I think the
LATERAL approach would face exactly the same issues: how many LATERAL
nodes do you use, and what's their join order?

I think we could get away with defining it like this (ie, SRFs at the same
SRF nesting level run in lockstep) as long as it's documented.  Whatever
the current behavior is for such cases would be pretty bizarre too.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-15 Thread Andres Freund
Hi,

On 2016-09-14 19:28:25 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 19:35:22 -0400, Tom Lane wrote:
> >> Anyway I'll draft a prototype and then we can compare.
> 
> > Ok, cool.
> 
> Here's a draft patch that is just meant to investigate what the planner
> changes might look like if we do it in the pipelined-result way.

Nice.


> A difficulty with this restriction is that if you have a query like
> "select f1, generate_series(1,2) / 10 from tab" then you end up with both
> a SRF-executing Result and a separate scalar-projection Result above it,
> because the division-by-ten has to happen in a separate plan level.

Makes sense.  I guess we could teach the SRF pipeline node to execute a
series of such steps. Hm. That makes me think of something:

Hm. One thing I wonder about with this approach, is how we're going to
handle something absurd like:
SELECT generate_series(1, generate_series(1, 2)), generate_series(1, 
generate_series(2,4));

I guess we have to do here is
Step: generate_series(1,2), 1, 2, 4
Step: generate_series(1, Var(generate_series(1,2))), 1, 2, 4
Step: Var(generate_series(1, Var(generate_series(1,2, 1, generate_series(2, 
4)
Step: Var(generate_series(1, Var(generate_series(1,2, generate_series(1, 
Var(generate_series(2, 4)))

But that'd still not have the same lockstepping behaviour, right?  I'm
at a conference, and half-ill, so I might just standing on my own brain
here.


> The planner's notions about the cost of Result make it think that this is
> quite expensive --- mainly because the upper Result will be iterated once
> per SRF output row, so that you get hit with cpu_tuple_cost per output row.
> And that in turn leads it to change plans in one or two cases in the
> regression tests.  Maybe that's fine.  I'm worried though that it's right
> that this will be unduly expensive.  So I'm kind of tempted to define the
> SRF-executing node as acting more like, say, Agg or WindowFunc, in that
> it has a list of SRFs to execute and then it has the ability to project a
> scalar tlist on top of those results.

Hah, was thinking the same above ;)


> On the whole I'm pretty pleased with this approach, at least from the
> point of view of the planner.  The net addition of planner code is
> smaller than what you had,

Not by much. But I do agree that there's some advantages here.


> and though I'm no doubt biased, I think this
> version is much cleaner.

Certainly seems a bit easier to extend and adjust behaviour. Not having
to deal with enforcing join order, and having less issues with
determining what to push where is certainly advantageous.  After all,
that was why I initially was thinking of tis approach.

> Also, though this patch doesn't address exactly
> how we might do it, it's fairly clear that it'd be possible to allow
> FDWs and CustomScans to implement SRF execution, eg pushing a SRF down to
> a foreign server, in a reasonably straightforward extension of the
> existing upper-pathification hooks.  If we go with the lateral function
> RTE approach, that's going to be somewhere between hard and impossible.

Hm. Not sure if there's that much point in doing that, but I agree that
the LATERAL approach adds more restrictions.


> So I think we should continue investigating this way of doing things.
> I'll try to take a look at the executor end of it tomorrow.  However
> I'm leaving Friday for a week's vacation, and may not have anything to
> show before that.

If you have something that's halfway recognizable, could you perhaps
post it?

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-15 Thread Andres Freund
On 2016-09-15 15:23:58 -0400, Tom Lane wrote:
> Andres Freund  writes:
> >   In my present patch I've *ripped out* the support for materialization
> >   in nodeFunctionscan.c entirely. That means that rescans referencing
> >   volatile functions can change their behaviour (if a function is
> >   rescanned, without having it's parameters changed), and that native
> >   backward scan support is gone.  I don't think that's actually an issue.
> 
> I think you are wrong on this not being an issue: it is critical that
> rescan deliver the same results as before, else for example having a
> function RTE on the inside of a nestloop will give nonsensical/broken
> results.

I find that quite unconvincing. We quite freely re-evaluate functions in
the targetlist again, even if they're volatile and/or SRFs.

If we implement tSRFs as pipeline nodes, we can "simply" default to the
never materializing behaviour there I guess.


> Moreover, I think we'd all agreed that this effort needs to avoid any
> not-absolutely-necessary semantics changes.

I don't agree with that. Adding pointless complications for a niche
edge cases of niche features isn't worth it.


> Another idea is that we could extend the set of ExecInitNode flags
> (EXEC_FLAG_REWIND etc) to tell child nodes whether they need to implement
> rescan correctly in this sense; if they are not RHS children of nestloops,
> and maybe one or two other cases, they don't.  That would give another
> route by which nodeFunctionscan could decide that it can skip
> materialization in common cases.

That's something I've wondered about too. Materializing if rescans are
required is quite acceptable, and probably rather rare in
practice. Seems not unlikely that that information would be valuable for
other node types too.


Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-15 Thread Tom Lane
Andres Freund  writes:
> 0003-Avoid-materializing-SRFs-in-the-FROM-list.patch
>   To avoid performance regressions from moving SRFM_ValuePerCall SRFs to
>   ROWS FROM, nodeFunctionscan.c needs to support not materializing
>   output.

I looked over this patch a bit.

>   In my present patch I've *ripped out* the support for materialization
>   in nodeFunctionscan.c entirely. That means that rescans referencing
>   volatile functions can change their behaviour (if a function is
>   rescanned, without having it's parameters changed), and that native
>   backward scan support is gone.  I don't think that's actually an issue.

I think you are wrong on this not being an issue: it is critical that
rescan deliver the same results as before, else for example having a
function RTE on the inside of a nestloop will give nonsensical/broken
results.  I think what we'll have to do is allow the optimization of
skipping the tuplestore only when the function is declared nonvolatile.
(If it is, and it nonetheless gives different results on rescan, it's not
our fault if joins give haywire answers.)  I'm okay with not supporting
backward scan, but wrong answers during rescan is a different animal
entirely.

Moreover, I think we'd all agreed that this effort needs to avoid any
not-absolutely-necessary semantics changes.  This one is not only not
necessary, but it would result in subtle hard-to-detect breakage.

It's conceivable that we could allow the executor to be broken this way
and have the planner fix it by inserting a Material node when joining.
But I think it would be messy and would probably not perform as well as
an internal tuplestore --- for one thing, because the planner can't know
whether the function would return a tuplestore, making the external
materialization redundant.

Another idea is that we could extend the set of ExecInitNode flags
(EXEC_FLAG_REWIND etc) to tell child nodes whether they need to implement
rescan correctly in this sense; if they are not RHS children of nestloops,
and maybe one or two other cases, they don't.  That would give another
route by which nodeFunctionscan could decide that it can skip
materialization in common cases.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-14 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-12 19:35:22 -0400, Tom Lane wrote:
>> Anyway I'll draft a prototype and then we can compare.

> Ok, cool.

Here's a draft patch that is just meant to investigate what the planner
changes might look like if we do it in the pipelined-result way.
Accordingly, I didn't touch the executor, but just had it emit regular
Result nodes for SRF-execution steps.  However, the SRFs are all
guaranteed to appear at top level of their respective tlists, so that
those Results could be replaced with something that works like
nodeFunctionscan.

A difficulty with this restriction is that if you have a query like
"select f1, generate_series(1,2) / 10 from tab" then you end up with both
a SRF-executing Result and a separate scalar-projection Result above it,
because the division-by-ten has to happen in a separate plan level.
The planner's notions about the cost of Result make it think that this is
quite expensive --- mainly because the upper Result will be iterated once
per SRF output row, so that you get hit with cpu_tuple_cost per output row.
And that in turn leads it to change plans in one or two cases in the
regression tests.  Maybe that's fine.  I'm worried though that it's right
that this will be unduly expensive.  So I'm kind of tempted to define the
SRF-executing node as acting more like, say, Agg or WindowFunc, in that
it has a list of SRFs to execute and then it has the ability to project a
scalar tlist on top of those results.  That would likely save some cycles
at execution, and it would also eliminate a few of the planner warts seen
below, like the rule about not pushing a new scalar tlist down onto a
SRF-executing Result.  I'd have to rewrite split_pathtarget_at_srfs(),
because it'd be implementing quite different rules about how to refactor
targetlists, but that's not a big problem.

On the whole I'm pretty pleased with this approach, at least from the
point of view of the planner.  The net addition of planner code is
smaller than what you had, and though I'm no doubt biased, I think this
version is much cleaner.  Also, though this patch doesn't address exactly
how we might do it, it's fairly clear that it'd be possible to allow
FDWs and CustomScans to implement SRF execution, eg pushing a SRF down to
a foreign server, in a reasonably straightforward extension of the
existing upper-pathification hooks.  If we go with the lateral function
RTE approach, that's going to be somewhere between hard and impossible.

So I think we should continue investigating this way of doing things.
I'll try to take a look at the executor end of it tomorrow.  However
I'm leaving Friday for a week's vacation, and may not have anything to
show before that.

regards, tom lane

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 7e092d7..9052273 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*** _outProjectionPath(StringInfo str, const
*** 1817,1822 
--- 1817,1823 
  
  	WRITE_NODE_FIELD(subpath);
  	WRITE_BOOL_FIELD(dummypp);
+ 	WRITE_BOOL_FIELD(srfpp);
  }
  
  static void
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 47158f6..7c59c3d 100644
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*** create_projection_plan(PlannerInfo *root
*** 1421,1428 
  	Plan	   *subplan;
  	List	   *tlist;
  
! 	/* Since we intend to project, we don't need to constrain child tlist */
! 	subplan = create_plan_recurse(root, best_path->subpath, 0);
  
  	tlist = build_path_tlist(root, _path->path);
  
--- 1421,1441 
  	Plan	   *subplan;
  	List	   *tlist;
  
! 	/*
! 	 * XXX Possibly-temporary hack: if the subpath is a dummy ResultPath,
! 	 * don't bother with it, just make a Result with no input.  This avoids an
! 	 * extra Result plan node when doing "SELECT srf()".  Depending on what we
! 	 * decide about the desired plan structure for SRF-expanding nodes, this
! 	 * optimization might have to go away, and in any case it'll probably look
! 	 * a good bit different.
! 	 */
! 	if (IsA(best_path->subpath, ResultPath) &&
! 		((ResultPath *) best_path->subpath)->path.pathtarget->exprs == NIL &&
! 		((ResultPath *) best_path->subpath)->quals == NIL)
! 		subplan = NULL;
! 	else
! 		/* Since we intend to project, we don't need to constrain child tlist */
! 		subplan = create_plan_recurse(root, best_path->subpath, 0);
  
  	tlist = build_path_tlist(root, _path->path);
  
*** create_projection_plan(PlannerInfo *root
*** 1441,1448 
  	 * creation, but that would add expense to creating Paths we might end up
  	 * not using.)
  	 */
! 	if (is_projection_capable_path(best_path->subpath) ||
! 		tlist_same_exprs(tlist, subplan->targetlist))
  	{
  		/* Don't need a separate Result, just assign tlist to subplan */
  		plan = subplan;
--- 1454,1462 
  	 * creation, but that would 

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-13 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-13 12:07:35 -0400, Tom Lane wrote:
>> /*
>> - * Don't pull up a subquery that has any set-returning functions in its
>> - * targetlist.  Otherwise we might well wind up inserting set-returning
>> - * functions into places where they mustn't go, such as quals of higher
>> - * queries.  This also ensures deletion of an empty jointree is valid.
>> - */
>> -if (expression_returns_set((Node *) subquery->targetList))
>> -return false;

> I don't quite understand parts of the comment you removed here. What
> does "This also ensures deletion of an empty jointree is valid." mean?

TBH, I don't remember what that was about anymore.  Whatever it was might
not apply now, anyway.  If there was something to it, maybe we'll
rediscover it while we're fooling with tSRFs, and then we can insert a
less cryptic comment.

> Looks good, except that you didn't adopt the hunk adjusting
> src/backend/executor/README, which still seems to read:

Ah, I missed that there was anything to change docs-wise.  Will fix.

Thanks for looking it over!

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-13 Thread Andres Freund
On 2016-09-13 12:07:35 -0400, Tom Lane wrote:
> diff --git a/src/backend/optimizer/plan/analindex e28a8dc..74e4245 100644
> --- a/src/backend/optimizer/plan/analyzejoins.c
> +++ b/src/backend/optimizer/plan/analyzejoins.c
> @@ -650,6 +650,11 @@ rel_is_distinct_for(PlannerInfo *root, R
>  bool
>  query_supports_distinctness(Query *query)
>  {
> + /* we don't cope with SRFs, see comment below */
> + if (query->hasTargetSRFs)
> + return false;
> +
> + /* check for features we can prove distinctness with */
>   if (query->distinctClause != NIL ||
>   query->groupClause != NIL ||
>   query->groupingSets != NIL ||
> @@ -695,7 +700,7 @@ query_is_distinct_for(Query *query, List
>* specified columns, since those must be evaluated before 
> de-duplication;
>* but it doesn't presently seem worth the complication to check that.)
>*/
> - if (expression_returns_set((Node *) query->targetList))
> + if (query->hasTargetSRFs)
>   return false;

Maybe make this hasTargetSRFs && expression_returns_set()? The SRF could
have been optimized away. (Oh, I see you recheck below. Forget that then).

> @@ -1419,8 +1419,8 @@ is_simple_subquery(Query *subquery, Rang
>   return false;
>  
>   /*
> -  * Can't pull up a subquery involving grouping, aggregation, sorting,
> -  * limiting, or WITH.  (XXX WITH could possibly be allowed later)
> +  * Can't pull up a subquery involving grouping, aggregation, SRFs,
> +  * sorting, limiting, or WITH.  (XXX WITH could possibly be allowed 
> later)
>*
>* We also don't pull up a subquery that has explicit FOR UPDATE/SHARE
>* clauses, because pullup would cause the locking to occur semantically
> @@ -1430,6 +1430,7 @@ is_simple_subquery(Query *subquery, Rang
>*/
>   if (subquery->hasAggs ||
>   subquery->hasWindowFuncs ||
> + subquery->hasTargetSRFs ||
>   subquery->groupClause ||
>   subquery->groupingSets ||
>   subquery->havingQual ||
> @@ -1543,15 +1544,6 @@ is_simple_subquery(Query *subquery, Rang
>   }
>  
>   /*
> -  * Don't pull up a subquery that has any set-returning functions in its
> -  * targetlist.  Otherwise we might well wind up inserting set-returning
> -  * functions into places where they mustn't go, such as quals of higher
> -  * queries.  This also ensures deletion of an empty jointree is valid.
> -  */
> - if (expression_returns_set((Node *) subquery->targetList))
> - return false;

I don't quite understand parts of the comment you removed here. What
does "This also ensures deletion of an empty jointree is valid." mean?


Looks good, except that you didn't adopt the hunk adjusting
src/backend/executor/README, which still seems to read:
> We disallow set-returning functions in the targetlist of SELECT FOR UPDATE,
> so as to ensure that at most one tuple can be returned for any particular
> set of scan tuples.  Otherwise we'd get duplicates due to the original
> query returning the same set of scan tuples multiple times.  (Note: there
> is no explicit prohibition on SRFs in UPDATE, but the net effect will be
> that only the first result row of an SRF counts, because all subsequent
> rows will result in attempts to re-update an already updated target row.
> This is historical behavior and seems not worth changing.)

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-13 Thread Tom Lane
Andres Freund  writes:
> On September 13, 2016 9:07:35 AM PDT, Tom Lane  wrote:
>> I'd like to go ahead and push this, since it's a necessary prerequisite
>> for either approach we might adopt for the rest of the patch series,
>> and the improved error reporting and avoidance of expensive
>> expression_returns_set searches make it a win IMO even if we were not
>> planning to do anything more with SRFs.

> Can't look are the code just now, on my way to the airport for pgopen, but 
> the idea sounds good to me.

OK, I went ahead and pushed it.  We can tweak later if needed.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-13 Thread Andres Freund


On September 13, 2016 9:07:35 AM PDT, Tom Lane  wrote:
>Andres Freund  writes:
>> Attached is a significantly updated patch series (see the mail one up
>> for details about what this is, I don't want to quote it in its
>> entirety).
>
>I've reviewed the portions of 0005 that have to do with making the
>parser
>mark queries with hasTargetSRF.  The code as you had it was wrong
>because
>it would set the flag as a consequence of SRFs in function RTEs, which
>we don't want.

I'd taken it more as the possibility that there's an srf, than guarantee so 
far. There might be cases where the planner removes the srf during folding or 
such.  Makes sense to make it more accurate.


>  It seemed to me that the best way to fix that was to
>rely
>on the parser's p_expr_kind mechanism to tell which part of the query
>we're in, whereupon we might as well make the parser act more like it
>does
>for aggregates and window functions and give a suitable error at parse
>time for misplaced SRFs.

That's a nice improvement. The execution time errors are ugly.

>I also renamed the flag to hasTargetSRFs, which is more parallel to
>hasAggs and hasWindowFuncs, and made some effort to use it in place
>of expression_returns_set() searches.
>
>I'd like to go ahead and push this, since it's a necessary prerequisite
>for either approach we might adopt for the rest of the patch series,
>and the improved error reporting and avoidance of expensive
>expression_returns_set searches make it a win IMO even if we were not
>planning to do anything more with SRFs.

Can't look are the code just now, on my way to the airport for pgopen, but the 
idea sounds good to me.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-13 Thread Tom Lane
Andres Freund  writes:
> Attached is a significantly updated patch series (see the mail one up
> for details about what this is, I don't want to quote it in its
> entirety).

I've reviewed the portions of 0005 that have to do with making the parser
mark queries with hasTargetSRF.  The code as you had it was wrong because
it would set the flag as a consequence of SRFs in function RTEs, which
we don't want.  It seemed to me that the best way to fix that was to rely
on the parser's p_expr_kind mechanism to tell which part of the query
we're in, whereupon we might as well make the parser act more like it does
for aggregates and window functions and give a suitable error at parse
time for misplaced SRFs.  The attached isn't perfect, in that it doesn't
know about nesting restrictions (ie that SRFs must be at top level of a
function RTE), but we could improve that later if we wanted, and anyway
it's definitely a good bit nicer than before.

This also incorporates the part of 0002 that I agree with, namely
disallowing SRFs in UPDATE, since check_srf_call_placement() naturally
would do that.

I also renamed the flag to hasTargetSRFs, which is more parallel to
hasAggs and hasWindowFuncs, and made some effort to use it in place
of expression_returns_set() searches.

I'd like to go ahead and push this, since it's a necessary prerequisite
for either approach we might adopt for the rest of the patch series,
and the improved error reporting and avoidance of expensive
expression_returns_set searches make it a win IMO even if we were not
planning to do anything more with SRFs.

regards, tom lane

diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index e997b57..dbd6094 100644
*** a/src/backend/catalog/heap.c
--- b/src/backend/catalog/heap.c
*** cookDefault(ParseState *pstate,
*** 2560,2573 
  
  	/*
  	 * transformExpr() should have already rejected subqueries, aggregates,
! 	 * and window functions, based on the EXPR_KIND_ for a default expression.
! 	 *
! 	 * It can't return a set either.
  	 */
- 	if (expression_returns_set(expr))
- 		ereport(ERROR,
- (errcode(ERRCODE_DATATYPE_MISMATCH),
-  errmsg("default expression must not return a set")));
  
  	/*
  	 * Coerce the expression to the correct type and typmod, if given. This
--- 2560,2568 
  
  	/*
  	 * transformExpr() should have already rejected subqueries, aggregates,
! 	 * window functions, and SRFs, based on the EXPR_KIND_ for a default
! 	 * expression.
  	 */
  
  	/*
  	 * Coerce the expression to the correct type and typmod, if given. This
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 4f39dad..71714bc 100644
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
*** _copyQuery(const Query *from)
*** 2731,2736 
--- 2731,2737 
  	COPY_SCALAR_FIELD(resultRelation);
  	COPY_SCALAR_FIELD(hasAggs);
  	COPY_SCALAR_FIELD(hasWindowFuncs);
+ 	COPY_SCALAR_FIELD(hasTargetSRFs);
  	COPY_SCALAR_FIELD(hasSubLinks);
  	COPY_SCALAR_FIELD(hasDistinctOn);
  	COPY_SCALAR_FIELD(hasRecursive);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 4800165..29a090f 100644
*** a/src/backend/nodes/equalfuncs.c
--- b/src/backend/nodes/equalfuncs.c
*** _equalQuery(const Query *a, const Query 
*** 921,926 
--- 921,927 
  	COMPARE_SCALAR_FIELD(resultRelation);
  	COMPARE_SCALAR_FIELD(hasAggs);
  	COMPARE_SCALAR_FIELD(hasWindowFuncs);
+ 	COMPARE_SCALAR_FIELD(hasTargetSRFs);
  	COMPARE_SCALAR_FIELD(hasSubLinks);
  	COMPARE_SCALAR_FIELD(hasDistinctOn);
  	COMPARE_SCALAR_FIELD(hasRecursive);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 90fecb1..7e092d7 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*** _outQuery(StringInfo str, const Query *n
*** 2683,2688 
--- 2683,2689 
  	WRITE_INT_FIELD(resultRelation);
  	WRITE_BOOL_FIELD(hasAggs);
  	WRITE_BOOL_FIELD(hasWindowFuncs);
+ 	WRITE_BOOL_FIELD(hasTargetSRFs);
  	WRITE_BOOL_FIELD(hasSubLinks);
  	WRITE_BOOL_FIELD(hasDistinctOn);
  	WRITE_BOOL_FIELD(hasRecursive);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 894a48f..917e6c8 100644
*** a/src/backend/nodes/readfuncs.c
--- b/src/backend/nodes/readfuncs.c
*** _readQuery(void)
*** 238,243 
--- 238,244 
  	READ_INT_FIELD(resultRelation);
  	READ_BOOL_FIELD(hasAggs);
  	READ_BOOL_FIELD(hasWindowFuncs);
+ 	READ_BOOL_FIELD(hasTargetSRFs);
  	READ_BOOL_FIELD(hasSubLinks);
  	READ_BOOL_FIELD(hasDistinctOn);
  	READ_BOOL_FIELD(hasRecursive);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 04264b4..99b6bc8 100644
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
*** check_output_expressions(Query *subquery
*** 2422,2428 
  			continue;
  
  		/* Functions 

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund
On 2016-09-12 20:15:46 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 16:56:32 -0700, Andres Freund wrote:
> >> Attached is a noticeably expanded set of tests, with fixes for the stuff
> >> you'd found.  I plan to push that soon-ish.  Independent of the approach
> >> we'll be choosing, increased coverage seems quite useful.
> 
> > And for real.
> 
> Looks good to me, please push.

Done.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-12 16:56:32 -0700, Andres Freund wrote:
>> Attached is a noticeably expanded set of tests, with fixes for the stuff
>> you'd found.  I plan to push that soon-ish.  Independent of the approach
>> we'll be choosing, increased coverage seems quite useful.

> And for real.

Looks good to me, please push.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund
On 2016-09-12 16:56:32 -0700, Andres Freund wrote:
> On 2016-09-12 09:14:47 -0700, Andres Freund wrote:
> > On 2016-09-12 10:19:14 -0400, Tom Lane wrote:
> > > Andres Freund  writes:
> > > 
> > > > 0001-Add-some-more-targetlist-srf-tests.patch
> > > >   Add some test.
> > > 
> > > I think you should go ahead and push this tests-adding patch now, as it
> > > adds documentation of the current behavior that is a good thing to have
> > > independently of what the rest of the patchset does.  Also, I'm worried
> > > that some of the GROUP BY tests might have machine-dependent results
> > > (if they are implemented by hashing) so it would be good to get in a few
> > > buildfarm cycles and let that settle out before we start changing the
> > > answers.
> > 
> > Generally a sound plan - I started to noticeably expand it though,
> > there's some important edge cases it didn't cover.
> 
> Attached is a noticeably expanded set of tests, with fixes for the stuff
> you'd found.  I plan to push that soon-ish.  Independent of the approach
> we'll be choosing, increased coverage seems quite useful.

And for real.
>From 3bdaab7028c0ae7cf9bea666a6e555adbc68640e Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Wed, 3 Aug 2016 18:29:42 -0700
Subject: [PATCH] Add more tests for targetlist SRFs.

We're considering changing the implementation of targetlist SRFs
considerably, and a lot of the current behaviour isn't tested in our
regression tests. Thus it seems useful to increase coverage to avoid
accidental behaviour changes.

It's quite possible that some of the plans here will require adjustments
to avoid falling afoul of ordering differences (e.g. hashed group
bys). The buildfarm will tell us.

Reviewed-By: Tom Lane
Discussion: <20160827214829.zo2dfb5jaikii...@alap3.anarazel.de>
---
 src/test/regress/expected/tsrf.out | 501 +
 src/test/regress/parallel_schedule |   2 +-
 src/test/regress/serial_schedule   |   1 +
 src/test/regress/sql/tsrf.sql  | 124 +
 4 files changed, 627 insertions(+), 1 deletion(-)
 create mode 100644 src/test/regress/expected/tsrf.out
 create mode 100644 src/test/regress/sql/tsrf.sql

diff --git a/src/test/regress/expected/tsrf.out b/src/test/regress/expected/tsrf.out
new file mode 100644
index 000..983ce17
--- /dev/null
+++ b/src/test/regress/expected/tsrf.out
@@ -0,0 +1,501 @@
+--
+-- tsrf - targetlist set returning function tests
+--
+-- simple srf
+SELECT generate_series(1, 3);
+ generate_series 
+-
+   1
+   2
+   3
+(3 rows)
+
+-- parallel iteration
+SELECT generate_series(1, 3), generate_series(3,5);
+ generate_series | generate_series 
+-+-
+   1 |   3
+   2 |   4
+   3 |   5
+(3 rows)
+
+-- parallel iteration, different number of rows
+SELECT generate_series(1, 2), generate_series(1,4);
+ generate_series | generate_series 
+-+-
+   1 |   1
+   2 |   2
+   1 |   3
+   2 |   4
+(4 rows)
+
+-- srf, with SRF argument
+SELECT generate_series(1, generate_series(1, 3));
+ generate_series 
+-
+   1
+   1
+   2
+   1
+   2
+   3
+(6 rows)
+
+-- srf, with two SRF arguments
+SELECT generate_series(generate_series(1,3), generate_series(2, 4));
+ERROR:  functions and operators can take at most one set argument
+CREATE TABLE few(id int, dataa text, datab text);
+INSERT INTO few VALUES(1, 'a', 'foo'),(2, 'a', 'bar'),(3, 'b', 'bar');
+-- SRF output order of sorting is maintained, if SRF is not referenced
+SELECT few.id, generate_series(1,3) g FROM few ORDER BY id DESC;
+ id | g 
++---
+  3 | 1
+  3 | 2
+  3 | 3
+  2 | 1
+  2 | 2
+  2 | 3
+  1 | 1
+  1 | 2
+  1 | 3
+(9 rows)
+
+-- but SRFs can be referenced in sort
+SELECT few.id, generate_series(1,3) g FROM few ORDER BY id, g DESC;
+ id | g 
++---
+  1 | 3
+  1 | 2
+  1 | 1
+  2 | 3
+  2 | 2
+  2 | 1
+  3 | 3
+  3 | 2
+  3 | 1
+(9 rows)
+
+SELECT few.id, generate_series(1,3) g FROM few ORDER BY id, generate_series(1,3) DESC;
+ id | g 
++---
+  1 | 3
+  1 | 2
+  1 | 1
+  2 | 3
+  2 | 2
+  2 | 1
+  3 | 3
+  3 | 2
+  3 | 1
+(9 rows)
+
+-- it's weird to have ORDER BYs that increase the number of results
+SELECT few.id FROM few ORDER BY id, generate_series(1,3) DESC;
+ id 
+
+  1
+  1
+  1
+  2
+  2
+  2
+  3
+  3
+  3
+(9 rows)
+
+-- SRFs are computed after aggregation
+SELECT few.dataa, count(*), min(id), max(id), unnest('{1,1,3}'::int[]) FROM few WHERE few.id = 1 GROUP BY few.dataa;
+ dataa | count | min | max | unnest 
+---+---+-+-+
+ a | 1 |   1 |   1 |  1
+ a | 1 |   1 |   1 |  1
+ a | 1 |   1 |   1 |  3
+(3 rows)
+
+-- unless 

Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund
On 2016-09-12 09:14:47 -0700, Andres Freund wrote:
> On 2016-09-12 10:19:14 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > 
> > > 0001-Add-some-more-targetlist-srf-tests.patch
> > >   Add some test.
> > 
> > I think you should go ahead and push this tests-adding patch now, as it
> > adds documentation of the current behavior that is a good thing to have
> > independently of what the rest of the patchset does.  Also, I'm worried
> > that some of the GROUP BY tests might have machine-dependent results
> > (if they are implemented by hashing) so it would be good to get in a few
> > buildfarm cycles and let that settle out before we start changing the
> > answers.
> 
> Generally a sound plan - I started to noticeably expand it though,
> there's some important edge cases it didn't cover.

Attached is a noticeably expanded set of tests, with fixes for the stuff
you'd found.  I plan to push that soon-ish.  Independent of the approach
we'll be choosing, increased coverage seems quite useful.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund
On 2016-09-12 19:35:22 -0400, Tom Lane wrote:
> >> You're inventing objections.
> 
> > Heh, it's actually your own objection ;)
> > http://archives.postgresql.org/message-id/32673.1464023429%40sss.pgh.pa.us
> 
> I'm changing my opinion in the light of unfavorable evidence.  Is that
> wrong?

Nah, not at all.  I was just referring to "inventing".


> > I wonder how much duplication we'd end up between nodeFunctionscan.c and
> > nodeSRF (or whatever). We'd need the latter node to support ValuePerCall
> > in an non-materializing fashion as well.  Could we combine them somehow?
> 
> Yeah, I was wondering that too.  I doubt that we want to make one node
> type do both things --- the fact that Result comes in two flavors with
> different semantics (with or without an input node) isn't very nice IMO,
> and this would be almost that identical case.

It might not, agreed. That imo has to be taken into acount calculating
the code costs - although the executor stuff usually is pretty boring in
comparison.


> But maybe they could share
> some code at the level of ExecMakeTableFunctionResult.  (I've not looked
> at your executor changes yet, not sure how much of that still exists.)

I'd split ExecMakeTableFunctionResult up, to allow for a percall mode,
and that seems like it'd still be needed to avoid performance
regressions.

It'd be an awfully large amount of code those two nodes would
duplicate. Excepting different rescan logic and ORDINALITY support,
nearly all the nodeFunctionscan.c code would be needed in both nodes.

> Anyway I'll draft a prototype and then we can compare.

Ok, cool.


Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-12 18:35:03 -0400, Tom Lane wrote:
>> You're inventing objections.

> Heh, it's actually your own objection ;)
> http://archives.postgresql.org/message-id/32673.1464023429%40sss.pgh.pa.us

I'm changing my opinion in the light of unfavorable evidence.  Is that
wrong?

>> It won't require that any more than the
>> LATERAL approach does; it's basically the same code as whatever
>> nodeFunctionscan is going to do, but packaged as a pipeline eval node
>> rather than a base scan node.

> That might work.  It gets somewhat nasty though because you also need to
> handle, SRF arguments to SRFs. And those again can contain nearly
> arbitrary expressions inbetween. With the ROWS FROM approach that can be
> fairly easily handled via LATERAL.  I guess what we could do here is to
> use one pipeline node to evaluate all the argument SRFs, and then
> another for the outer expression. Where the outer node would evaluate
> the SRF arguments using normal expression evaluation, with the inner SRF
> output replaced by a var.

Right.  Nested SRFs translate to multiple ROWS-FROM RTEs with lateral
references in the one approach, and nested Result-thingies in the other.
It's pretty much the same thing mutatis mutandis, but I think it will
likely be a lot easier to get there from here with the Result-based
approach --- for example, we don't have to worry about forcing lateral
join order, and the ordering constraints vis-a-vis GROUP BY etc won't take
any great effort either.  Anyway I think it is worth trying.

> I wonder how much duplication we'd end up between nodeFunctionscan.c and
> nodeSRF (or whatever). We'd need the latter node to support ValuePerCall
> in an non-materializing fashion as well.  Could we combine them somehow?

Yeah, I was wondering that too.  I doubt that we want to make one node
type do both things --- the fact that Result comes in two flavors with
different semantics (with or without an input node) isn't very nice IMO,
and this would be almost that identical case.  But maybe they could share
some code at the level of ExecMakeTableFunctionResult.  (I've not looked
at your executor changes yet, not sure how much of that still exists.)

> I don't think the code for adding these intermediate SRF evaluating
> nodes will be noticeably simpler than what's in my prototype. We'll
> still have to do the whole conversion recursively, we'll still need
> complexity of figuring out whether to put those SRFs evaluations
> before/after group by, order by, distinct on and window functions.

I think it will slot into the code that's already there rather more
easily than what you've done, because we already *have* code that makes
decisions in that form.  We just need to teach it to break down what
it now thinks of as a single projection step into N+1 steps when there
are N levels of nested SRF present.  Anyway I'll draft a prototype and
then we can compare.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund
Hi,

On 2016-09-12 18:35:03 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > I don't think it'd be all that hard to add something like the current
> > LCM behaviour into nodeFunctionscan.c if we really wanted. But I think
> > it'll be better to just say no here.
> 
> "Just say no" soon translates to memes about "disasters like the removal
> of implicit casting" (which in fact is not what 8.3 did, but I've grown
> pretty damn tired of the amount of bitching that that cleanup did and
> still does provoke).  In any case, it feels like the LATERAL approach is
> locking us into more and subtler incompatibilities than just that one.

I can't see those being equivalent impact-wise. Note that the person
bitching most loudly about the "implicit casting" thing (Merlin Moncure)
is voting to remove the LCM behaviour ;)

I think we'll continue to get more bitching about the confusing LCM
behaviour than complaints the backward compat break would generate.


> >> If we go with a Result-like tSRF evaluation node, then whether we change
> >> semantics or not becomes mostly a matter of what that node does.  It could
> >> become basically a wrapper around the existing ExecTargetList() logic if
> >> we needed to provide backwards-compatible behavior.
> 
> > As you previously objected: If we keep ExecTargetList() style logic, we
> > need to keep most of execQual.c's handling of ExprMultipleResult et al,
> > and that's going to prevent the stuff I want to work on.
> 
> You're inventing objections.

Heh, it's actually your own objection ;)

http://archives.postgresql.org/message-id/32673.1464023429%40sss.pgh.pa.us


> It won't require that any more than the
> LATERAL approach does; it's basically the same code as whatever
> nodeFunctionscan is going to do, but packaged as a pipeline eval node
> rather than a base scan node.  Or to be clearer: what I'm suggesting it
> would contain is ExecTargetList's logic about restarting individual SRFs.
> That wouldn't propagate into execQual because we would only allow SRFs at
> the top level of the node's tlist, just like nodeFunctionscan does.
> ExecMakeTableFunctionResult doesn't require the generic execQual code
> to support SRFs today, and it still wouldn't.

(it accidentally does (see your cast example), but that that's besides
your point)

That might work.  It gets somewhat nasty though because you also need to
handle, SRF arguments to SRFs. And those again can contain nearly
arbitrary expressions inbetween. With the ROWS FROM approach that can be
fairly easily handled via LATERAL.  I guess what we could do here is to
use one pipeline node to evaluate all the argument SRFs, and then
another for the outer expression. Where the outer node would evaluate
the SRF arguments using normal expression evaluation, with the inner SRF
output replaced by a var.

I wonder how much duplication we'd end up between nodeFunctionscan.c and
nodeSRF (or whatever). We'd need the latter node to support ValuePerCall
in an non-materializing fashion as well.  Could we combine them somehow?


> In larger terms: the whole point here is to fish SRF calls up to the
> top level of the tlist of whatever node is executing them, where they
> can be special-cased by that node.  Their SRF-free argument
> expressions would be evaluated by generic execQual.  AFAICS this goes
> through in the same way from the executor's viewpoint whether we use
> LATERAL as the query restructuring method or a SRF-capable variant of
> Result.  But it's now looking to me like the latter would be a lot
> simpler from the point of view of planner complexity, and in
> particular from the point of view of proving correctness (equivalence
> of the query transformation).

It's nicer not to introduce subqueries from a two angles from my pov:
1) EXPLAINs will look more like the original query
2) The evaluation order of the non-srf part of the query, and the query
   itself, will be easier. That's by the thing I'm least happy with the
   LATERAL approach.

I don't think the code for adding these intermediate SRF evaluating
nodes will be noticeably simpler than what's in my prototype. We'll
still have to do the whole conversion recursively, we'll still need
complexity of figuring out whether to put those SRFs evaluations
before/after group by, order by, distinct on and window functions.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-12 17:36:07 -0400, Tom Lane wrote:
>> Um, I dunno.  You've added half a thousand lines of not-highly-readable-
>> nor-extensively-commented code to the planner; that certainly reaches *my*
>> threshold of pain.

> Well, I certainly plan (and started to) make that code easier to
> understand, and better commented.  It also removes ~1400 LOC of not easy
> to understand code... A good chunk of that'd would also be removed with
> a Result style approach, but far from all.

Hm, I've not studied 0006 yet, but surely that's executor code that would
go away with *any* approach that takes away the need for generic execQual
to support SRFs?  I don't see that it counts while discussing which way
we take to reach that point.

>> I'm also growing rather concerned that the LATERAL
>> approach is going to lock us into some unremovable incompatibilities
>> no matter how much we might regret that later (and in view of how quickly
>> I got my wrist slapped in <001201d18524$f84c4580$e8e4d080$@pcorp.us>,
>> I am afraid there may be more pushback awaiting us than we think).

> I don't think it'd be all that hard to add something like the current
> LCM behaviour into nodeFunctionscan.c if we really wanted. But I think
> it'll be better to just say no here.

"Just say no" soon translates to memes about "disasters like the removal
of implicit casting" (which in fact is not what 8.3 did, but I've grown
pretty damn tired of the amount of bitching that that cleanup did and
still does provoke).  In any case, it feels like the LATERAL approach is
locking us into more and subtler incompatibilities than just that one.

>> If we go with a Result-like tSRF evaluation node, then whether we change
>> semantics or not becomes mostly a matter of what that node does.  It could
>> become basically a wrapper around the existing ExecTargetList() logic if
>> we needed to provide backwards-compatible behavior.

> As you previously objected: If we keep ExecTargetList() style logic, we
> need to keep most of execQual.c's handling of ExprMultipleResult et al,
> and that's going to prevent the stuff I want to work on.

You're inventing objections.  It won't require that any more than the
LATERAL approach does; it's basically the same code as whatever
nodeFunctionscan is going to do, but packaged as a pipeline eval node
rather than a base scan node.  Or to be clearer: what I'm suggesting it
would contain is ExecTargetList's logic about restarting individual SRFs.
That wouldn't propagate into execQual because we would only allow SRFs at
the top level of the node's tlist, just like nodeFunctionscan does.
ExecMakeTableFunctionResult doesn't require the generic execQual code
to support SRFs today, and it still wouldn't.

In larger terms: the whole point here is to fish SRF calls up to the top
level of the tlist of whatever node is executing them, where they can be
special-cased by that node.  Their SRF-free argument expressions would be
evaluated by generic execQual.  AFAICS this goes through in the same way
from the executor's viewpoint whether we use LATERAL as the query
restructuring method or a SRF-capable variant of Result.  But it's now
looking to me like the latter would be a lot simpler from the point of
view of planner complexity, and in particular from the point of view of
proving correctness (equivalence of the query transformation).

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund
On 2016-09-12 17:36:07 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 13:26:20 -0400, Tom Lane wrote:
> >> Stepping back a little bit ... way back at the start of this thread
> >> you muttered about possibly implementing tSRFs as a special pipeline
> >> node type, a la Result.  That would have the same benefits in terms
> >> of being able to take SRF support out of the main execQual code paths.
> >> I, and I think some other people, felt that the LATERAL approach would
> >> be a cleaner answer --- but now that we're seeing some of the messy
> >> details required to make the LATERAL way work, I'm beginning to have
> >> second thoughts.  I wonder if we should do at least a POC implementation
> >> of the other way to get a better fix on which way is really cleaner.
> 
> > I'm not particularly in love in restarting with a different approach. I
> > think fixing the ROWS FROM expansion is the only really painful bit, and
> > that seems like it's independently beneficial to allow for suppression
> > of expansion there.
> 
> Um, I dunno.  You've added half a thousand lines of not-highly-readable-
> nor-extensively-commented code to the planner; that certainly reaches *my*
> threshold of pain.

Well, I certainly plan (and started to) make that code easier to
understand, and better commented.  It also removes ~1400 LOC of not easy
to understand code... A good chunk of that'd would also be removed with
a Result style approach, but far from all.


> I'm also growing rather concerned that the LATERAL
> approach is going to lock us into some unremovable incompatibilities
> no matter how much we might regret that later (and in view of how quickly
> I got my wrist slapped in <001201d18524$f84c4580$e8e4d080$@pcorp.us>,
> I am afraid there may be more pushback awaiting us than we think).

I don't think it'd be all that hard to add something like the current
LCM behaviour into nodeFunctionscan.c if we really wanted. But I think
it'll be better to just say no here.


> If we go with a Result-like tSRF evaluation node, then whether we change
> semantics or not becomes mostly a matter of what that node does.  It could
> become basically a wrapper around the existing ExecTargetList() logic if
> we needed to provide backwards-compatible behavior.

As you previously objected: If we keep ExecTargetList() style logic, we
need to keep most of execQual.c's handling of ExprMultipleResult et al,
and that's going to prevent the stuff I want to work on. Because
handling ExprMultipleResult in all these places is a serious issue
WRT making expression evaluation faster.  If we find a good answer to
that, I'd be more open to pursuing this approach.


> > I actually had started to work on a Result style approach, and I don't
> > think it turned out that nice. But I didn't complete it, so I might just
> > be wrong.
> 
> It's also possible that it's easier now in the post-pathification code
> base than it was before.  After contemplating my navel for awhile, it
> seems like the core of the planner code for a Result-like approach would
> be something very close to make_group_input_target(), plus a thing like
> pull_var_clause() that extracts SRFs rather than Vars.  Those two
> functions, including their lengthy comments, weigh in at ~250 lines.
> Admittedly there'd be some boilerplate on top of that, if we invent a
> new plan node type rather than extending Result, but not all that much.

That's pretty much what I did (or rather started to do), yes.  I had
something that was called from grouping_planner() that added the new
node ontop of the original set of paths, after grouping or after
distinct, depending on where SRFs were referenced.  The biggest benefit
I saw with that is that there's no need to push things into a subquery,
and that the ordering is a lot more explicit.


> If you like, I'll have a go at drafting a patch in that style.  Do you
> happen to still have the executor side of what you did, so I don't have
> to reinvent that?

The executor side is actually what I found harder here. Either we end up
keeping most of ExecQual's handling, or we reinvent a good deal of
separate logic.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-12 13:26:20 -0400, Tom Lane wrote:
>> Stepping back a little bit ... way back at the start of this thread
>> you muttered about possibly implementing tSRFs as a special pipeline
>> node type, a la Result.  That would have the same benefits in terms
>> of being able to take SRF support out of the main execQual code paths.
>> I, and I think some other people, felt that the LATERAL approach would
>> be a cleaner answer --- but now that we're seeing some of the messy
>> details required to make the LATERAL way work, I'm beginning to have
>> second thoughts.  I wonder if we should do at least a POC implementation
>> of the other way to get a better fix on which way is really cleaner.

> I'm not particularly in love in restarting with a different approach. I
> think fixing the ROWS FROM expansion is the only really painful bit, and
> that seems like it's independently beneficial to allow for suppression
> of expansion there.

Um, I dunno.  You've added half a thousand lines of not-highly-readable-
nor-extensively-commented code to the planner; that certainly reaches *my*
threshold of pain.  I'm also growing rather concerned that the LATERAL
approach is going to lock us into some unremovable incompatibilities
no matter how much we might regret that later (and in view of how quickly
I got my wrist slapped in <001201d18524$f84c4580$e8e4d080$@pcorp.us>,
I am afraid there may be more pushback awaiting us than we think).
If we go with a Result-like tSRF evaluation node, then whether we change
semantics or not becomes mostly a matter of what that node does.  It could
become basically a wrapper around the existing ExecTargetList() logic if
we needed to provide backwards-compatible behavior.

> I actually had started to work on a Result style approach, and I don't
> think it turned out that nice. But I didn't complete it, so I might just
> be wrong.

It's also possible that it's easier now in the post-pathification code
base than it was before.  After contemplating my navel for awhile, it
seems like the core of the planner code for a Result-like approach would
be something very close to make_group_input_target(), plus a thing like
pull_var_clause() that extracts SRFs rather than Vars.  Those two
functions, including their lengthy comments, weigh in at ~250 lines.
Admittedly there'd be some boilerplate on top of that, if we invent a
new plan node type rather than extending Result, but not all that much.

If you like, I'll have a go at drafting a patch in that style.  Do you
happen to still have the executor side of what you did, so I don't have
to reinvent that?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund
On 2016-09-12 14:05:33 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 13:48:05 -0400, Tom Lane wrote:
> >> Andres Freund  writes:
> >>> I kind of like ROWS FROM (... AS VALUE), that seems to confer the
> >>> meaning quite well. As VALUE isn't a reserved keyword, that'd afaik only
> >>> really work inside ROWS FROM() where AS is required.
> 
> >> Hm, wouldn't ... AS RECORD convey the meaning better?
> 
> > I was kind of envisioning AS VALUE to work for composite types without
> > removing their original type (possibly even for TYPEFUNC_SCALAR
> > ones).
> 
> Maybe.  A problem with any of these proposals though is that there's no
> place to put a column alias.  Yeah, you can stick it on outside the ROWS
> FROM, but it seems a bit non-orthogonal to have to do it that way when
> you can do it inside the ROWS FROM when adding a coldeflist.

I don't necessarily see that as a problem. The coldeflists inside ROWS
FROM() already don't allow assigning aliases for !record/composite
types, and they require specifying types.


> >> (Although once you look at it that way, it's just a cast spelled in
> >> an idiosyncratic fashion.)
> 
> > Well, not quite, by virtue of keeping the original type around. After a
> > record cast you likely couldn't directly access the columns anymore,
> > even if it were a known composite type, right?
> 
> Same is true for any of these syntax proposals, no?  So far as the rest of
> the query is concerned, the function output is going to be an anonymous
> record type.

Not for composite types, no. As implemented ROWS FROM (.. AS()) does:
CREATE OR REPLACE FUNCTION get_pg_class() RETURNS SETOF pg_class LANGUAGE sql 
AS $$SELECT * FROM pg_class;$$;
SELECT DISTINCT pg_typeof(f) FROM ROWS FROM (get_pg_class() AS ()) f;
┌───┐
│ pg_typeof │
├───┤
│ pg_class  │
└───┘
(1 row)
SELECT (f).relname FROM ROWS FROM (get_pg_class() AS ()) f LIMIT 1;
┌┐
│relname │
├┤
│ pg_toast_77994 │
└┘
(1 row)
which seems sensible to me.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-12 13:48:05 -0400, Tom Lane wrote:
>> Andres Freund  writes:
>>> I kind of like ROWS FROM (... AS VALUE), that seems to confer the
>>> meaning quite well. As VALUE isn't a reserved keyword, that'd afaik only
>>> really work inside ROWS FROM() where AS is required.

>> Hm, wouldn't ... AS RECORD convey the meaning better?

> I was kind of envisioning AS VALUE to work for composite types without
> removing their original type (possibly even for TYPEFUNC_SCALAR
> ones).

Maybe.  A problem with any of these proposals though is that there's no
place to put a column alias.  Yeah, you can stick it on outside the ROWS
FROM, but it seems a bit non-orthogonal to have to do it that way when
you can do it inside the ROWS FROM when adding a coldeflist.

Maybe we could do it like

ROWS FROM (func(...) AS alias)

where the difference from a coldeflist is that there's no parenthesized
list of names/types.  It's a bit weird that adding an alias makes for
a semantic not just naming difference, but it's no weirder than these
other ideas.

>> (Although once you look at it that way, it's just a cast spelled in
>> an idiosyncratic fashion.)

> Well, not quite, by virtue of keeping the original type around. After a
> record cast you likely couldn't directly access the columns anymore,
> even if it were a known composite type, right?

Same is true for any of these syntax proposals, no?  So far as the rest of
the query is concerned, the function output is going to be an anonymous
record type.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund
On 2016-09-12 13:48:05 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 13:26:20 -0400, Tom Lane wrote:
> >> Andres Freund  writes:
> > On 2016-09-12 12:10:01 -0400, Tom Lane wrote:
>  I can't say that I like the proposed syntax much.
>
> >>> Me neither. But I haven't really found a better approach.  It seems
> >>> kinda consistent to have ROWS FROM (... AS ()) change the picked out
> >>> columns to 0, and just return the whole thing.
>
> >> I just remembered that we allow zero-column composite types, which
> >> makes this proposal formally ambiguous.  So we really need a different
> >> syntax.  I'm not especially in love with the cast-to-record idea, but
> >> it does dodge that problem.
>
> > I kind of like ROWS FROM (... AS VALUE), that seems to confer the
> > meaning quite well. As VALUE isn't a reserved keyword, that'd afaik only
> > really work inside ROWS FROM() where AS is required.
>
> Hm, wouldn't ... AS RECORD convey the meaning better?

I was kind of envisioning AS VALUE to work for composite types without
removing their original type (possibly even for TYPEFUNC_SCALAR
ones). That, for one, makes the SRF to ROWS FROM conversion easier, and
for another seems generally useful. composites keeping their type with
AS RECORD seems a bit confusing.  There's also the issue that VALUE is
already a keyword, record not...

> (Although once you look at it that way, it's just a cast spelled in
> an idiosyncratic fashion.)

Well, not quite, by virtue of keeping the original type around. After a
record cast you likely couldn't directly access the columns anymore,
even if it were a known composite type, right?

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-12 13:26:20 -0400, Tom Lane wrote:
>> Andres Freund  writes:
> On 2016-09-12 12:10:01 -0400, Tom Lane wrote:
 I can't say that I like the proposed syntax much.

>>> Me neither. But I haven't really found a better approach.  It seems
>>> kinda consistent to have ROWS FROM (... AS ()) change the picked out
>>> columns to 0, and just return the whole thing.

>> I just remembered that we allow zero-column composite types, which
>> makes this proposal formally ambiguous.  So we really need a different
>> syntax.  I'm not especially in love with the cast-to-record idea, but
>> it does dodge that problem.

> I kind of like ROWS FROM (... AS VALUE), that seems to confer the
> meaning quite well. As VALUE isn't a reserved keyword, that'd afaik only
> really work inside ROWS FROM() where AS is required.

Hm, wouldn't ... AS RECORD convey the meaning better?

(Although once you look at it that way, it's just a cast spelled in
an idiosyncratic fashion.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund
On 2016-09-12 13:26:20 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 12:10:01 -0400, Tom Lane wrote:
> >> I can't say that I like the proposed syntax much.
> 
> > Me neither. But I haven't really found a better approach.  It seems
> > kinda consistent to have ROWS FROM (... AS ()) change the picked out
> > columns to 0, and just return the whole thing.
> 
> I just remembered that we allow zero-column composite types, which
> makes this proposal formally ambiguous.  So we really need a different
> syntax.  I'm not especially in love with the cast-to-record idea, but
> it does dodge that problem.

I kind of like ROWS FROM (... AS VALUE), that seems to confer the
meaning quite well. As VALUE isn't a reserved keyword, that'd afaik only
really work inside ROWS FROM() where AS is required.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund
Hi,


On 2016-09-12 13:26:20 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-12 12:10:01 -0400, Tom Lane wrote:
> >> I can't say that I like the proposed syntax much.
> 
> > Me neither. But I haven't really found a better approach.  It seems
> > kinda consistent to have ROWS FROM (... AS ()) change the picked out
> > columns to 0, and just return the whole thing.
> 
> I just remembered that we allow zero-column composite types, which
> makes this proposal formally ambiguous.

Well, we errored out in the grammar for AS () so far... We might want to
fix that independently.


> Stepping back a little bit ... way back at the start of this thread
> you muttered about possibly implementing tSRFs as a special pipeline
> node type, a la Result.  That would have the same benefits in terms
> of being able to take SRF support out of the main execQual code paths.
> I, and I think some other people, felt that the LATERAL approach would
> be a cleaner answer --- but now that we're seeing some of the messy
> details required to make the LATERAL way work, I'm beginning to have
> second thoughts.  I wonder if we should do at least a POC implementation
> of the other way to get a better fix on which way is really cleaner.

I'm not particularly in love in restarting with a different approach. I
think fixing the ROWS FROM expansion is the only really painful bit, and
that seems like it's independently beneficial to allow for suppression
of expansion there.  I'm working on this to actually be finally able to
get some stuff from the "faster executor" thread in a committable
shape,...  The other stuff like making SELECT * FROM func; not
materialize also seems independently useful; it's something people have
complained about repeatedly over the years.


I actually had started to work on a Result style approach, and I don't
think it turned out that nice. But I didn't complete it, so I might just
be wrong.


> Also, one of the points that's come up repeatedly in these discussions
> is the way that the parser's implementation of *-expansion sucks for
> composite-returning functions.  That is, if you write
>   SELECT (foo(...)).* FROM ...
> you get
>   SELECT (foo(...)).col1, (foo(...)).col2, ... FROM ...
> so that the function is executed N times not once.  We had discussed
> fixing that for setof-composite-returning functions by folding multiple
> identical SRF calls into a single LATERAL entry, but that doesn't
> directly fix the problem for non-SRF composite functions.  Also the
> whole idea of having the planner undo the parser's damage in this way
> is kinda grotty, not least because we can't safely combine multiple
> calls of volatile functions, so it only works for not-volatile ones.


> That line of thought leads to the idea that if we could have the *parser*
> do the transformation to LATERAL form, we could avoid breaking a
> composite-returning function call into multiple copies in the first place.
> I had said that I didn't think we wanted this transformation done in the
> parser, but maybe this is a sufficient reason to do so.

I still don't like doing all this is in the parser. It'd just trigger
complaints of users that we're changing their query structure, and we'd
have to solve a good bit of the same problems we have to solve here.

If we really want to reduce the expansion cost - and to me that's a
largely independent issue from this - it seems better to have the parser
emit some structure that's easily recognized at plan time, rather than
have the praser do all the work.


Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-12 12:10:01 -0400, Tom Lane wrote:
>> I can't say that I like the proposed syntax much.

> Me neither. But I haven't really found a better approach.  It seems
> kinda consistent to have ROWS FROM (... AS ()) change the picked out
> columns to 0, and just return the whole thing.

I just remembered that we allow zero-column composite types, which
makes this proposal formally ambiguous.  So we really need a different
syntax.  I'm not especially in love with the cast-to-record idea, but
it does dodge that problem.

Stepping back a little bit ... way back at the start of this thread
you muttered about possibly implementing tSRFs as a special pipeline
node type, a la Result.  That would have the same benefits in terms
of being able to take SRF support out of the main execQual code paths.
I, and I think some other people, felt that the LATERAL approach would
be a cleaner answer --- but now that we're seeing some of the messy
details required to make the LATERAL way work, I'm beginning to have
second thoughts.  I wonder if we should do at least a POC implementation
of the other way to get a better fix on which way is really cleaner.

Also, one of the points that's come up repeatedly in these discussions
is the way that the parser's implementation of *-expansion sucks for
composite-returning functions.  That is, if you write
SELECT (foo(...)).* FROM ...
you get
SELECT (foo(...)).col1, (foo(...)).col2, ... FROM ...
so that the function is executed N times not once.  We had discussed
fixing that for setof-composite-returning functions by folding multiple
identical SRF calls into a single LATERAL entry, but that doesn't
directly fix the problem for non-SRF composite functions.  Also the
whole idea of having the planner undo the parser's damage in this way
is kinda grotty, not least because we can't safely combine multiple
calls of volatile functions, so it only works for not-volatile ones.

That line of thought leads to the idea that if we could have the *parser*
do the transformation to LATERAL form, we could avoid breaking a
composite-returning function call into multiple copies in the first place.
I had said that I didn't think we wanted this transformation done in the
parser, but maybe this is a sufficient reason to do so.

If we think in terms of pipeline evaluation nodes rather than LATERAL,
we could implement the above by having the parser emit multiple levels
of SELECT some-expressions FROM (SELECT some-expressions FROM ...),
with SRFs being rigidly separated into their own evaluation levels.

I'm not certain that any of these ideas are worth the electrons they're
written on, but I do think we ought to consider alternatives and not
just push forward with committing a first-draft implementation.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund
Hi,

On 2016-09-12 12:10:01 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > 0003-Avoid-materializing-SRFs-in-the-FROM-list.patch
> >   To avoid performance regressions from moving SRFM_ValuePerCall SRFs to
> >   ROWS FROM, nodeFunctionscan.c needs to support not materializing
> >   output.
> 
> Personally I'd put this one later, as it's a performance optimization not
> part of the core patch IMO --- or is there something in the later ones
> that directly depends on it?  Anyway I'm setting it aside for now.

Not strongly dependant. But the ROWS FROM stuff touches a lot of the
same code. And I wanted to implement this before ripping out the current
implementation, to allow for meaningful performance tests.

> > 0004-Allow-ROWS-FROM-to-return-functions-as-single-record.patch
> >   To allow transforming SELECT record_srf(); nodeFunctionscan.c needs to
> >   learn to return the result as a record.  I chose
> >   ROWS FROM (record_srf() AS ()) as the syntax for that. It doesn't
> >   necessarily have to be SQL exposed, but it does make testing easier.
> 
> The proposed commit message is wrong, as it claims aclexplode()
> demonstrates the problem which it doesn't --- we get the column set
> from the function's declared OUT parameters.

Oops. I'd probably tested with some self defined function and was
looking for an example...


> I can't say that I like the proposed syntax much.

Me neither. But I haven't really found a better approach.  It seems
kinda consistent to have ROWS FROM (... AS ()) change the picked out
columns to 0, and just return the whole thing.


> What about leaving out
> any syntax changes, and simply saying that "if the function returns record
> and hasn't got OUT parameters, then return its result as an unexpanded
> record"?  That might not take much more than removing the error check ;-)

Having the ability to do this for non-record returning functions turned
out to be quite convenient. Otherwise we need to create ROW()
expressions for composite returning functions, which is neither cheap,
nor fun..

As you say, that might be doable with some form of casting, but,
ugh. I'm actually kind of surprised that even works. The function call
that nodeFunctionscan.c sees, isn't a function call, much less a set
returning one. Which means this hits the direct_function_call == false
path in ExecMakeTableFunctionResult(). If it didn't, we'd have hit
/* We don't allow sets in the arguments of the table function */
if (argDone != ExprSingleResult)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("set-valued function called in 
context that cannot accept a set")));
therein. Which you'd hit e.g. with
SELECT * FROM ROWS FROM (int4mul(generate_series(1, 2), 3));

Thus this actually relies on the SRF code path in execQual.c;
the thing we want to rip out...

> A possible objection is that then you could not get the no-expansion
> behavior for functions that return named composite types or have OUT
> parameters that effectively give them known composite types.  From a
> semantic standpoint we could fix that by saying "just cast the result to
> record", ie ROWS FROM (aclexplode('whatever')::record) would give the
> no-expansion behavior.  I'm not sure if there might be any implementation
> glitches in the way of doing it like that.

See above.  Personally I think just using explicit syntax is cleaner,
but I don't feel like arguing about this a whole lot.


- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-12 11:29:37 -0400, Tom Lane wrote:
>> I'm on board with disallowing SRFs in UPDATE, because it produces
>> underdetermined and unspecified results; but the other restriction
>> seems 100% arbitrary.  There is no semantic difference between
>> SELECT a, b FROM ... ORDER BY srf();
>> and
>> SELECT a, b, srf() FROM ... ORDER BY 3;
>> except that in the first case the ordering column doesn't get returned to
>> the client.  I do not see why that's so awful that we should make it fail
>> after twenty years of allowing it.

> I do think it's awful that an ORDER BY / GROUP BY changes the number of
> rows processed. This should never have been allowed.

Meh.  That's just an opinion, and it's a bit late to be making such
changes.  I think the general consensus of the previous discussion was
that we would preserve existing tSRF behavior as far as it was reasonably
practical to do so, with the exception that there's wide agreement that
the least-common-multiple rule for number of rows emitted is bad.  I do
not think you're going to get anywhere near that level of agreement that
a SRF appearing only in ORDER BY is bad.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund
On 2016-09-12 11:29:37 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > 0002-Shore-up-some-weird-corner-cases-for-targetlist-SRFs.patch
> >   Forbid UPDATE ... SET foo = SRF() and ORDER BY / GROUP BY containing
> >   SRFs that would change the number of returned rows.  Without the
> >   latter e.g. SELECT 1 ORDER BY generate_series(1,10); returns 10 rows.
> 
> I'm on board with disallowing SRFs in UPDATE, because it produces
> underdetermined and unspecified results; but the other restriction
> seems 100% arbitrary.  There is no semantic difference between
>   SELECT a, b FROM ... ORDER BY srf();
> and
>   SELECT a, b, srf() FROM ... ORDER BY 3;
> except that in the first case the ordering column doesn't get returned to
> the client.  I do not see why that's so awful that we should make it fail
> after twenty years of allowing it.

I do think it's awful that an ORDER BY / GROUP BY changes the number of
rows processed. This should never have been allowed. You just need a
little typo somewhere that makes the targetlist entry not match the
ORDER/GROUP BY anymore and your results will differ in weird ways -
rather hard to debug in the GROUP BY case.


> And I certainly don't see why there
> would be an implementation reason why we couldn't support it anymore
> if we can still do the second one.

There's nothing requiring this here, indeed.


Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Andres Freund
On 2016-09-12 10:19:14 -0400, Tom Lane wrote:
> Andres Freund  writes:
> 
> > 0001-Add-some-more-targetlist-srf-tests.patch
> >   Add some test.
> 
> I think you should go ahead and push this tests-adding patch now, as it
> adds documentation of the current behavior that is a good thing to have
> independently of what the rest of the patchset does.  Also, I'm worried
> that some of the GROUP BY tests might have machine-dependent results
> (if they are implemented by hashing) so it would be good to get in a few
> buildfarm cycles and let that settle out before we start changing the
> answers.

Generally a sound plan - I started to noticeably expand it though,
there's some important edge cases it didn't cover.


> Although tsrf.sql doesn't currently need to create any views, it doesn't
> seem like a great idea to assume that it never will.  Maybe add this
> after misc_functions in the previous parallel group, instead?

WFM


> +-- it's weird to GROUP BYs that increase the number of results
> 
> "it's weird to have ..."
> 
> +-- nonsensically that seems to be allowed
> +UPDATE fewmore SET data = generate_series(4,9);
> 
> "nonsense that seems to be allowed..."
> 
> +-- SRFs are now allowed in RETURNING
> +INSERT INTO fewmore VALUES(1) RETURNING generate_series(1,3);
> 
> s/now/not/, apparently

Err, yes. Will update.


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane
Andres Freund  writes:
> 0003-Avoid-materializing-SRFs-in-the-FROM-list.patch
>   To avoid performance regressions from moving SRFM_ValuePerCall SRFs to
>   ROWS FROM, nodeFunctionscan.c needs to support not materializing
>   output.

Personally I'd put this one later, as it's a performance optimization not
part of the core patch IMO --- or is there something in the later ones
that directly depends on it?  Anyway I'm setting it aside for now.

> 0004-Allow-ROWS-FROM-to-return-functions-as-single-record.patch
>   To allow transforming SELECT record_srf(); nodeFunctionscan.c needs to
>   learn to return the result as a record.  I chose
>   ROWS FROM (record_srf() AS ()) as the syntax for that. It doesn't
>   necessarily have to be SQL exposed, but it does make testing easier.

The proposed commit message is wrong, as it claims aclexplode()
demonstrates the problem which it doesn't --- we get the column set
from the function's declared OUT parameters.

I can't say that I like the proposed syntax much.  What about leaving out
any syntax changes, and simply saying that "if the function returns record
and hasn't got OUT parameters, then return its result as an unexpanded
record"?  That might not take much more than removing the error check ;-)

A possible objection is that then you could not get the no-expansion
behavior for functions that return named composite types or have OUT
parameters that effectively give them known composite types.  From a
semantic standpoint we could fix that by saying "just cast the result to
record", ie ROWS FROM (aclexplode('whatever')::record) would give the
no-expansion behavior.  I'm not sure if there might be any implementation
glitches in the way of doing it like that.  Also there seems to be some
syntactic issue with it ... actually, the current behavior there is just
weird:

regression=# select * from rows from (aclexplode('{=r/postgres}')::record); 
ERROR:  syntax error at or near "::"
LINE 1: ...lect * from rows from (aclexplode('{=r/postgres}')::record);
 ^
regression=# select * from rows from (cast(aclexplode('{=r/postgres}') as 
record));
 grantor | grantee | privilege_type | is_grantable 
-+-++--
  10 |   0 | SELECT | f
(1 row)

I was not aware that there was *anyplace* in the grammar where :: and CAST
behaved differently, and I'm not very pleased to find this one.

I haven't looked at the code, as there probably isn't much point in
reviewing in any detail till we choose the syntax.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane
Andres Freund  writes:
> 0002-Shore-up-some-weird-corner-cases-for-targetlist-SRFs.patch
>   Forbid UPDATE ... SET foo = SRF() and ORDER BY / GROUP BY containing
>   SRFs that would change the number of returned rows.  Without the
>   latter e.g. SELECT 1 ORDER BY generate_series(1,10); returns 10 rows.

I'm on board with disallowing SRFs in UPDATE, because it produces
underdetermined and unspecified results; but the other restriction
seems 100% arbitrary.  There is no semantic difference between
SELECT a, b FROM ... ORDER BY srf();
and
SELECT a, b, srf() FROM ... ORDER BY 3;
except that in the first case the ordering column doesn't get returned to
the client.  I do not see why that's so awful that we should make it fail
after twenty years of allowing it.  And I certainly don't see why there
would be an implementation reason why we couldn't support it anymore
if we can still do the second one.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-12 Thread Tom Lane
Andres Freund  writes:
> Attached is a significantly updated patch series (see the mail one up
> for details about what this is, I don't want to quote it in its
> entirety).

I've finally cleared my plate enough to start reviewing this patchset.

> 0001-Add-some-more-targetlist-srf-tests.patch
>   Add some test.

I think you should go ahead and push this tests-adding patch now, as it
adds documentation of the current behavior that is a good thing to have
independently of what the rest of the patchset does.  Also, I'm worried
that some of the GROUP BY tests might have machine-dependent results
(if they are implemented by hashing) so it would be good to get in a few
buildfarm cycles and let that settle out before we start changing the
answers.

I do have some trivial nitpicks about 0001:

 # rules cannot run concurrently with any test that creates a view
-test: rules psql_crosstab amutils
+test: rules psql_crosstab amutils tsrf

Although tsrf.sql doesn't currently need to create any views, it doesn't
seem like a great idea to assume that it never will.  Maybe add this
after misc_functions in the previous parallel group, instead?

+-- it's weird to GROUP BYs that increase the number of results

"it's weird to have ..."

+-- nonsensically that seems to be allowed
+UPDATE fewmore SET data = generate_series(4,9);

"nonsense that seems to be allowed..."

+-- SRFs are now allowed in RETURNING
+INSERT INTO fewmore VALUES(1) RETURNING generate_series(1,3);

s/now/not/, apparently


More to come later, but 0001 is pushable with these fixes.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Andres Freund
On 2016-09-02 14:04:24 +0530, Robert Haas wrote:
> On Sun, Aug 28, 2016 at 3:18 AM, Andres Freund  wrote:
> > 0003-Avoid-materializing-SRFs-in-the-FROM-list.patch
> >   To avoid performance regressions from moving SRFM_ValuePerCall SRFs to
> >   ROWS FROM, nodeFunctionscan.c needs to support not materializing
> >   output.
> >
> >   In my present patch I've *ripped out* the support for materialization
> >   in nodeFunctionscan.c entirely. That means that rescans referencing
> >   volatile functions can change their behaviour (if a function is
> >   rescanned, without having it's parameters changed), and that native
> >   backward scan support is gone.  I don't think that's actually an issue.
> 
> Can you expand on why you think those things aren't an issue?  Because
> it seems like they might be.

Backward scans are, by the planner, easily implemented by adding a
materialize node. Which will, when ordinality or multiple ROWS FROM
expressions are present, even be more runtime & memory efficient.  I
also don't think all that many people use FOR SCROLL cursors over SRFs
containing queries.

The part about rewinding is a bit more complicated. As of HEAD, a
rewound scan where some of the SRFs have to change due to parameter
inputs, but others don't, will only re-compute the ones with parameter
changes.  I don't think it's more confusing to rescan the entire input,
rather parts of it in that case.  If the entire input is re-scanned, the
planner knows how to materialize the entire scan output.

I think it'd be pretty annoying to continue to always materialize
ValuePerCall SRFs just to support that type of re-scan behaviour. We
don't really, to my knowledge, flag well whether rescans are required
atm, so we can't even easily do it conditionally.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Tom Lane
Andres Freund  writes:
> ...  What we could do is to add efficient
> ROWS FROM (..) WITH ORDINALITY ORDER bY ordinality;
> support.

Hm?

regression=# explain select * from rows from (generate_series(1,10)) with 
ordinality order by ordinality;
   QUERY PLAN
-
 Function Scan on generate_series  (cost=0.00..10.00 rows=1000 width=12)
(1 row)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Andres Freund
On 2016-09-02 10:58:59 -0500, Kevin Grittner wrote:
> If it has no significant performance impact to maintain the
> historical order, then I have no problem with doing so.

It's not really a runtime issue, it's just a question of how to nicely
constraint the join order. There's no additional sorting or such.


> No.  I'm arguing that we track the order coming out of different
> nodes during planning, and sometimes take advantage of it to avoid
> a sort which would otherwise be required.

I don't think that's realistically possible with SRFs, given they're
often in some language which we have no insight on from the planner
point of view. We could possibly hack something up for SQL SRFs (that'd
be nice, but I'm doubtful it's worth it), but for everything else it
seems unrealistic.  What we could do is to add efficient
ROWS FROM (..) WITH ORDINALITY ORDER bY ordinality;
support.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Kevin Grittner
On Fri, Sep 2, 2016 at 10:10 AM, Tom Lane  wrote:
> Kevin Grittner  writes:
>> On Fri, Sep 2, 2016 at 9:51 AM, Tom Lane  wrote:
>>> regression=# select *, generate_series(1,3) from int8_tbl;
>
>> I'm sure that you realize that running a query of that form twice
>> against a table with more than one heap page could result in rows
>> in a different order, even if no changes had been made to the
>> database (including no vacuum activity, auto- or otherwise).
>
> You missed my point: they might complain about the generate_series
> output not being in the order they expect, independently of what
> the table rows are.

I didn't miss it, I just never thought that anyone would care about
a secondary sort key if the primary sort key was random.  I have
trouble imagining a use-case for that.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Kevin Grittner
On Fri, Sep 2, 2016 at 10:10 AM, Tom Lane  wrote:

> Also, before getting too high and mighty with users who expect
> "select * from table" to produce rows in a predictable order,
> you should reflect on the number of places in our regression
> tests that assume exactly that ...

An assumption that not infrequently breaks.  AFAIK, we generally
adjust the tests when that happens, rather than considering it a
bug in the code.  I never thought we did that because there was a
secret, undocumented guarantee of order, but to allow different
code paths to be tested than we would test if we always specified
an order.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Kevin Grittner
On Fri, Sep 2, 2016 at 10:31 AM, Andres Freund  wrote:
> On 2016-09-02 09:41:28 -0500, Kevin Grittner wrote:
>> On Fri, Sep 2, 2016 at 9:25 AM, Tom Lane  wrote:
>> > Andres Freund  writes:
>> >> Oh, and we've previously re-added that based on
>> >> complaints. C.f. d543170f2fdd6d9845aaf91dc0f6be7a2bf0d9e7 (and others
>> >> IIRC).
>> >
>> > That one wasn't about row order per se, but I agree that people *will*
>> > bitch if we change the behavior, especially if we don't provide a way
>> > to fix it.
>>
>> They might also bitch if you add any overhead to put rows in a
>> specific order when they subsequently sort the rows into some
>> different order.
>
> Huh? It's just the order the SRFs are returning rows. If they
> subsequently ORDER, there's no issue. And that doesn't have a
> performance impact afaict.

If it has no significant performance impact to maintain the
historical order, then I have no problem with doing so.  If you
burn resources putting them into historical order, that is going to
be completely wasted effort in some queries.  THAT is what I would
object to.  I'm certainly not arguing that we have any reason to go
out of our way to change the order.

>> You might even destroy an order that would have
>> allowed a sort step to be skipped, so you would pay twice -- once
>> to put them into some "implied" order and then to sort them back
>> into the order they would have had without that extra effort.
>
> So you're arguing that you can't rely on order, but that users rely on
> order?

No.  I'm arguing that we track the order coming out of different
nodes during planning, and sometimes take advantage of it to avoid
a sort which would otherwise be required.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Andres Freund
On 2016-09-02 09:41:28 -0500, Kevin Grittner wrote:
> On Fri, Sep 2, 2016 at 9:25 AM, Tom Lane  wrote:
> > Andres Freund  writes:
> >> Oh, and we've previously re-added that based on
> >> complaints. C.f. d543170f2fdd6d9845aaf91dc0f6be7a2bf0d9e7 (and others
> >> IIRC).
> >
> > That one wasn't about row order per se, but I agree that people *will*
> > bitch if we change the behavior, especially if we don't provide a way
> > to fix it.
> 
> They might also bitch if you add any overhead to put rows in a
> specific order when they subsequently sort the rows into some
> different order.

Huh? It's just the order the SRFs are returning rows. If they
subsequently ORDER, there's no issue. And that doesn't have a
performance impact afaict.


> You might even destroy an order that would have
> allowed a sort step to be skipped, so you would pay twice -- once
> to put them into some "implied" order and then to sort them back
> into the order they would have had without that extra effort.

So you're arguing that you can't rely on order, but that users rely on
order?

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Andres Freund
On 2016-09-02 10:34:54 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-02 10:20:42 -0400, Tom Lane wrote:
> >> ... ISTM all we
> >> need is that the SRF be on the inside of the join, which is automatic
> >> if it's LATERAL.
> 
> > Right. But there's nothing to force a lateral reference to be there
> > intrinsically. I've added a "fake" lateral reference to the ROWS FROM
> > RTE to the subquery, when there's none otherwise, but that's not
> > entirely pretty.
> 
> Hm, do you get cases like this right:
> 
>   select generate_series(1, t1.a) from t1, t2;
> 
> That would result in a lateral ref from the SRF RTE to t1, but you really
> need to treat it as laterally dependent on the join of t1/t2 in order to
> preserve the old semantics.  That is, you need to be laterally dependent
> on the whole FROM clause regardless of which variable references appear.

Yes - as the original query is moved into a subquery, the lateral
dependency I force-add simply is to the entire subquery atm (as a
wholerow var).


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Tom Lane
Kevin Grittner  writes:
> On Fri, Sep 2, 2016 at 9:51 AM, Tom Lane  wrote:
>> regression=# select *, generate_series(1,3) from int8_tbl;

> I'm sure that you realize that running a query of that form twice
> against a table with more than one heap page could result in rows
> in a different order, even if no changes had been made to the
> database (including no vacuum activity, auto- or otherwise).

You missed my point: they might complain about the generate_series
output not being in the order they expect, independently of what
the table rows are.

Also, before getting too high and mighty with users who expect
"select * from table" to produce rows in a predictable order,
you should reflect on the number of places in our regression
tests that assume exactly that ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Kevin Grittner
On Fri, Sep 2, 2016 at 9:51 AM, Tom Lane  wrote:

> regression=# select *, generate_series(1,3) from int8_tbl;

I'm sure that you realize that running a query of that form twice
against a table with more than one heap page could result in rows
in a different order, even if no changes had been made to the
database (including no vacuum activity, auto- or otherwise).  If
someone reported that as a bug, what would we tell them?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Tom Lane
Kevin Grittner  writes:
> On Fri, Sep 2, 2016 at 9:25 AM, Tom Lane  wrote:
>> ORDER BY is not a useful suggestion when there is nothing
>> you could order by to get the old behavior.

> I'm apparently missing something, because I see a column with the
> header "generate_series" in the result set.

You are apparently only thinking about generate_series and not any
other SRF.  Other SRFs don't necessarily produce outputs that are
in a nice sortable order.  Even for one that does, sorting by it
would destroy the existing behavior:

regression=# select *, generate_series(1,3) from int8_tbl;
q1|q2 | generate_series 
--+---+-
  123 |   456 |   1
  123 |   456 |   2
  123 |   456 |   3
  123 |  4567890123456789 |   1
  123 |  4567890123456789 |   2
  123 |  4567890123456789 |   3
 4567890123456789 |   123 |   1
 4567890123456789 |   123 |   2
 4567890123456789 |   123 |   3
 4567890123456789 |  4567890123456789 |   1
 4567890123456789 |  4567890123456789 |   2
 4567890123456789 |  4567890123456789 |   3
 4567890123456789 | -4567890123456789 |   1
 4567890123456789 | -4567890123456789 |   2
 4567890123456789 | -4567890123456789 |   3
(15 rows)

Now you could argue that the ordering of the table rows
themselves is poorly defined, and you'd be right, but that
doesn't change the fact that the generate_series output
has a well-defined repeating sequence.  People might be
relying on that property.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-02 10:20:42 -0400, Tom Lane wrote:
>> ... ISTM all we
>> need is that the SRF be on the inside of the join, which is automatic
>> if it's LATERAL.

> Right. But there's nothing to force a lateral reference to be there
> intrinsically. I've added a "fake" lateral reference to the ROWS FROM
> RTE to the subquery, when there's none otherwise, but that's not
> entirely pretty.

Hm, do you get cases like this right:

select generate_series(1, t1.a) from t1, t2;

That would result in a lateral ref from the SRF RTE to t1, but you really
need to treat it as laterally dependent on the join of t1/t2 in order to
preserve the old semantics.  That is, you need to be laterally dependent
on the whole FROM clause regardless of which variable references appear.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Kevin Grittner
On Fri, Sep 2, 2016 at 9:25 AM, Tom Lane  wrote:
> Andres Freund  writes:
>> Oh, and we've previously re-added that based on
>> complaints. C.f. d543170f2fdd6d9845aaf91dc0f6be7a2bf0d9e7 (and others
>> IIRC).
>
> That one wasn't about row order per se, but I agree that people *will*
> bitch if we change the behavior, especially if we don't provide a way
> to fix it.

They might also bitch if you add any overhead to put rows in a
specific order when they subsequently sort the rows into some
different order.  You might even destroy an order that would have
allowed a sort step to be skipped, so you would pay twice -- once
to put them into some "implied" order and then to sort them back
into the order they would have had without that extra effort.

> ORDER BY is not a useful suggestion when there is nothing
> you could order by to get the old behavior.

I'm apparently missing something, because I see a column with the
header "generate_series" in the result set.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Tom Lane
Andres Freund  writes:
> On 2016-09-02 09:05:35 -0500, Kevin Grittner wrote:
>>> In general, we've been skeptical about giving any guarantees about
>>> result ordering.

> Well, it's historically how we behaved for SRFs. I'm pretty sure that
> people will be confused if
> SELECT generate_series(1, 10) FROM sometbl;
> suddenly returns rows in an order that reverse from what
> generate_series() returns.

True, but how much "enforcement" do we need really?  This will be a cross
product join, which means that it can only be done as a nestloop not as a
merge or hash (there being no join key to merge or hash on).  ISTM all we
need is that the SRF be on the inside of the join, which is automatic
if it's LATERAL.

>> I think it is a very bad idea to move away from the statement that
>> a query generates a set of rows, and that no order is guaranteed
>> unless the top level has an ORDER BY clause.  How hard is it to add
>> ORDER BY 1, 2 to the above query?  Let the optimizer notice when a
>> node returns data in the needed order and skip the sort if possible.

> There's no such infrastructure for SRFS/ROWS FROM.

And in particular nothing to ORDER BY in this example.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Andres Freund
On 2016-09-02 10:20:42 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-09-02 09:05:35 -0500, Kevin Grittner wrote:
> >>> In general, we've been skeptical about giving any guarantees about
> >>> result ordering.
> 
> > Well, it's historically how we behaved for SRFs. I'm pretty sure that
> > people will be confused if
> > SELECT generate_series(1, 10) FROM sometbl;
> > suddenly returns rows in an order that reverse from what
> > generate_series() returns.
> 
> True, but how much "enforcement" do we need really?  This will be a cross
> product join, which means that it can only be done as a nestloop not as a
> merge or hash (there being no join key to merge or hash on).  ISTM all we
> need is that the SRF be on the inside of the join, which is automatic
> if it's LATERAL.

Right. But there's nothing to force a lateral reference to be there
intrinsically. I've added a "fake" lateral reference to the ROWS FROM
RTE to the subquery, when there's none otherwise, but that's not
entirely pretty.  I'm inclined to go with that though, unless somebody
has a better idea.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Tom Lane
Andres Freund  writes:
> Oh, and we've previously re-added that based on
> complaints. C.f. d543170f2fdd6d9845aaf91dc0f6be7a2bf0d9e7 (and others
> IIRC).

That one wasn't about row order per se, but I agree that people *will*
bitch if we change the behavior, especially if we don't provide a way
to fix it.  ORDER BY is not a useful suggestion when there is nothing
you could order by to get the old behavior.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Andres Freund
On 2016-09-02 07:11:10 -0700, Andres Freund wrote:
> On 2016-09-02 09:05:35 -0500, Kevin Grittner wrote:
> > On Fri, Sep 2, 2016 at 3:31 AM, Robert Haas  wrote:
> > > On Tue, Aug 23, 2016 at 3:10 AM, Andres Freund  wrote:
> > 
> > >> =# SELECT * FROM few, ROWS FROM(generate_series(1,3));
> > >> ┌┬─┐
> > >> │ id │ generate_series │
> > >> ├┼─┤
> > >> │  1 │   1 │
> > >> │  2 │   1 │
> > >> │  1 │   2 │
> > >> │  2 │   2 │
> > >> │  1 │   3 │
> > >> │  2 │   3 │
> > >> └┴─┘
> > >> (6 rows)
> > >> surely isn't what was intended.  So the join order needs to be enforced.
> > >
> > > In general, we've been skeptical about giving any guarantees about
> > > result ordering.
> 
> Well, it's historically how we behaved for SRFs. I'm pretty sure that
> people will be confused if
> SELECT generate_series(1, 10) FROM sometbl;
> suddenly returns rows in an order that reverse from what
> generate_series() returns.

Oh, and we've previously re-added that based on
complaints. C.f. d543170f2fdd6d9845aaf91dc0f6be7a2bf0d9e7 (and others
IIRC).


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Kevin Grittner
On Fri, Sep 2, 2016 at 9:11 AM, Andres Freund  wrote:
> On 2016-09-02 09:05:35 -0500, Kevin Grittner wrote:
>> On Fri, Sep 2, 2016 at 3:31 AM, Robert Haas  wrote:
>>> On Tue, Aug 23, 2016 at 3:10 AM, Andres Freund  wrote:
>>
 =# SELECT * FROM few, ROWS FROM(generate_series(1,3));
 ┌┬─┐
 │ id │ generate_series │
 ├┼─┤
 │  1 │   1 │
 │  2 │   1 │
 │  1 │   2 │
 │  2 │   2 │
 │  1 │   3 │
 │  2 │   3 │
 └┴─┘
 (6 rows)
 surely isn't what was intended.  So the join order needs to be enforced.
>>>
>>> In general, we've been skeptical about giving any guarantees about
>>> result ordering.
>
> Well, it's historically how we behaved for SRFs.

And until we had synchronized scans a sequential scan always
returned rows in the order they were present in the heap.
Implementation details are not guarantees.

> I'm pretty sure that people will be confused if
> SELECT generate_series(1, 10) FROM sometbl;
> suddenly returns rows in an order that reverse from what
> generate_series() returns.

If this changes, it is probably worth a mentioning in the release
notes.

>> I think it is a very bad idea to move away from the statement that
>> a query generates a set of rows, and that no order is guaranteed
>> unless the top level has an ORDER BY clause.  How hard is it to add
>> ORDER BY 1, 2 to the above query?  Let the optimizer notice when a
>> node returns data in the needed order and skip the sort if possible.
>
> There's no such infrastructure for SRFS/ROWS FROM.

Well, that's something to fix (or not), but not a justification for
"except on Tuesdays when the moon is full" sorts of exceptions to
simple rules about what to expect.  No ORDER BY means no order
guaranteed.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Andres Freund
Hi,

Thanks for looking.

On 2016-09-02 14:01:32 +0530, Robert Haas wrote:
> > 6) SETOF record type functions cannot directly be used in ROWS FROM() -
> >as ROWS FROM "expands" records returned by functions. When converting
> >something like
> > CREATE OR REPLACE FUNCTION setof_record_sql() RETURNS SETOF record LANGUAGE 
> > sql AS $$SELECT 1 AS a, 2 AS b UNION ALL SELECT 1, 2;$$;
> > SELECT setof_record_sql();
> >we don't have that available though.
> >
> > The best way to handle that seems to be to introduce the ability for
> > ROWS FROM not to expand the record returned by a column.  I'm currently
> > thinking that something like ROWS FROM(setof_record_sql() AS ()) would
> > do the trick.  That'd also considerably simplify the handling of
> > functions returning known composite types - my current POC patch
> > generates a ROW(a,b,..) type expression for those.
> >
> > I'm open to better syntax suggestions.
> 
> I definitely agree that having some syntax to avoid row-expansion in
> this case (and maybe in other cases) would be a good thing; I suspect
> that would get a good bit of use.  I don't care much for that
> particular choice of syntax, which seems fairly magical, but I'm not
> sure what would be better.

I'm not a fan either, but until somebody ocmes up with something better
:/

> That sounds good.
> > I've implemented ([2]) a prototype of this. My basic approach is:
> >
> > I) During parse-analysis, remember whether a query has any tSRFs
> >(Query->hasTargetSRF). That avoids doing a useless pass over the
> >query, if no tSRFs are present.
> > II) At the beginning of subquery_planner(), before doing any work
> >operating on subqueries and such, implement SRFs if ->hasTargetSRF().
> >(unsrfify() in the POC)
> > III) Unconditionally move the "current" query into a subquery. For that
> >do a mutator pass over the query, replacing Vars/Aggrefs/... in the
> >original targetlist with Var references to the new subquery.
> >(unsrfify_reference_subquery_mutator() in the POC)
> > IV) Do a pass over the outer query's targetlist, and implement any tSRFs
> >using a ROWS FROM() RTE (or multiple ones in case of nested tSRFs).
> >(unsrfify_implement_srfs_mutator() in the POC)
> >
> > that seems to mostly work well.
> 
> I gather that III and IV are skipped if hasTargetSRF isn't set.

Precisely.


> > d) Not a problem with the patch per-se, but I'm doubful that that's ok:
> > =# SELECT 1 ORDER BY generate_series(1, 10);
> > returns 10 rows ;) - maybe we should forbid that?
> 
> OK by me.  I feel like this isn't the only case where the presence of
> resjunk columns has user-visible effects, although I can't think of
> another one right at the moment.  It seems like something to avoid,
> though.

An early patch in the series now errors out if ORDER BY or GROUP BY adds
a retset resjunk element.


Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

2016-09-02 Thread Andres Freund
On 2016-09-02 09:05:35 -0500, Kevin Grittner wrote:
> On Fri, Sep 2, 2016 at 3:31 AM, Robert Haas  wrote:
> > On Tue, Aug 23, 2016 at 3:10 AM, Andres Freund  wrote:
> 
> >> =# SELECT * FROM few, ROWS FROM(generate_series(1,3));
> >> ┌┬─┐
> >> │ id │ generate_series │
> >> ├┼─┤
> >> │  1 │   1 │
> >> │  2 │   1 │
> >> │  1 │   2 │
> >> │  2 │   2 │
> >> │  1 │   3 │
> >> │  2 │   3 │
> >> └┴─┘
> >> (6 rows)
> >> surely isn't what was intended.  So the join order needs to be enforced.
> >
> > In general, we've been skeptical about giving any guarantees about
> > result ordering.

Well, it's historically how we behaved for SRFs. I'm pretty sure that
people will be confused if
SELECT generate_series(1, 10) FROM sometbl;
suddenly returns rows in an order that reverse from what
generate_series() returns.

> +
> 
> I think it is a very bad idea to move away from the statement that
> a query generates a set of rows, and that no order is guaranteed
> unless the top level has an ORDER BY clause.  How hard is it to add
> ORDER BY 1, 2 to the above query?  Let the optimizer notice when a
> node returns data in the needed order and skip the sort if possible.

There's no such infrastructure for SRFS/ROWS FROM.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >