On 23/08/16 09:40, Andres Freund wrote:
Hi,

as noted in [1] I started hacking on removing the current implementation
of SRFs in the targetlist (tSRFs henceforth). IM discussion brought the
need for a description of the problem, need and approach to light.

There are several reasons for wanting to get rid of tSRFs. The primary
ones in my opinion are that the current behaviour of several SRFs in one
targetlist is confusing, and that the implementation burden currently is
all over the executor.  The latter especially is what motivates me to
work on this, because it blocks my work on making the executor faster
for queries involving significant amounts of tuples.  Batching is hard
if random places in the query tree can increase the number of tuples.

The basic idea, hinted at in several threads, is, at plan time, to convert a 
query like
SELECT generate_series(1, 10);
into
SELECT generate_series FROM ROWS FROM(generate_series(1, 10));

thereby avoiding the complications in the executor (cf. execQual.c
handling of isDone/ExprMultipleResult and supporting code in many
executor nodes / node->*.ps.ps_TupFromTlist).

There are several design questions along the way:

1) How to deal with the least-common-multiple behaviour of tSRFs. E.g.
=# SELECT generate_series(1, 3), generate_series(1,2);
returning
┌─────────────────┬─────────────────┐
│ generate_series │ generate_series │
├─────────────────┼─────────────────┤
│               1 │               1 │
│               2 │               2 │
│               3 │               1 │
│               1 │               2 │
│               2 │               1 │
│               3 │               2 │
└─────────────────┴─────────────────┘
(6 rows)
but
=# SELECT generate_series(1, 3), generate_series(5,7);
returning
┌─────────────────┬─────────────────┐
│ generate_series │ generate_series │
├─────────────────┼─────────────────┤
│               1 │               5 │
│               2 │               6 │
│               3 │               7 │
└─────────────────┴─────────────────┘
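
The two results above can be reproduced with a rough Python model of the
least-common-multiple rule. This is only a sketch of the observable
behaviour, not the executor's code; the name tsrf_rows is hypothetical.

```python
from itertools import cycle, islice
from math import lcm  # available in Python 3.9+


def tsrf_rows(*srfs):
    """Model several SRFs in one targetlist (hypothetical helper).

    Each function restarts when it runs out of rows, and output stops
    once all of them finish on the same row, i.e. after
    lcm(len_1, len_2, ...) rows.
    """
    columns = [list(s) for s in srfs]
    n_rows = lcm(*(len(c) for c in columns))
    # Repeat every column cyclically, cut to n_rows, and pair row-wise.
    return list(zip(*(islice(cycle(c), n_rows) for c in columns)))


print(tsrf_rows(range(1, 4), range(1, 3)))  # lengths 3 and 2: 6 rows
print(tsrf_rows(range(1, 4), range(5, 8)))  # lengths 3 and 3: 3 rows
```

With lengths 3 and 2 the model yields lcm(3, 2) = 6 rows, with lengths 3
and 3 it yields 3, which is why the two queries look so different.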

Discussion in this thread came, according to my reading, to the
conclusion that that behaviour is just confusing and that the ROWS FROM
behaviour of
=# SELECT * FROM ROWS FROM(generate_series(1, 3), generate_series(1,2));
┌─────────────────┬─────────────────┐
│ generate_series │ generate_series │
├─────────────────┼─────────────────┤
│               1 │               1 │
│               2 │               2 │
│               3 │          (null) │
└─────────────────┴─────────────────┘
(3 rows)

makes more sense.
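
For comparison, the ROWS FROM result shown above behaves like a
"zip longest": the functions advance in lockstep and shorter ones are
padded with NULL. A minimal Python sketch of that observable behaviour,
with rows_from as a hypothetical name and None standing in for NULL:

```python
from itertools import zip_longest


def rows_from(*srfs):
    """Model ROWS FROM(f1, f2, ...) (hypothetical helper).

    All functions advance together; one that is exhausted early
    contributes None (standing in for SQL NULL) until the longest
    one is done.
    """
    return list(zip_longest(*srfs, fillvalue=None))


print(rows_from(range(1, 4), range(1, 3)))
```

The row count is the length of the longest input (3 here), not the
least common multiple of the lengths.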

I had always implicitly assumed that having two generated series would act as equivalent to:

SELECT
    sa,
    sb
FROM
    ROWS FROM(generate_series(1, 3)) AS sa,
    ROWS FROM(generate_series(5, 7)) AS sb
ORDER BY
    sa,
    sb;

 sa | sb
----+----
  1 |  5
  1 |  6
  1 |  7
  2 |  5
  2 |  6
  2 |  7
  3 |  5
  3 |  6
  3 |  7


Obviously I was wrong - but to me, my implicit assumption makes more sense!
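
The cross-join reading described above can be modelled the same way, as
a Cartesian product. This is an illustrative Python sketch only; the name
cross_join is hypothetical, and sorted() plays the role of the ORDER BY
clause.

```python
from itertools import product


def cross_join(*srfs):
    """Model the cross-join reading (hypothetical helper).

    Every row of one function is paired with every row of the others,
    and the result is sorted like ORDER BY sa, sb.
    """
    return sorted(product(*srfs))


rows = cross_join(range(1, 4), range(5, 8))
print(len(rows))  # 3 * 3 = 9 rows
print(rows[:3])   # [(1, 5), (1, 6), (1, 7)]
```

Here the row count is the product of the lengths, which differs from
both the least-common-multiple rule and the ROWS FROM padding rule.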
[...]


Cheers,
Gavin

