[HACKERS] Pushing restrictions down into GROUP BYs?

Jay Levitt Sat, 22 Sep 2012 11:02:55 -0700

A while back, I posted a pathological minimal-case query where, in orderto select one row from a users table, Postgres needed to scan the wholeusers table, because the restriction was not visible to the GROUP BY.


At the time, Tom wrote:


> Don't hold your breath waiting for that to change.  To do what you're
> wishing for, we'd have to treat the GROUP BY subquery as if it were an
> inner indexscan, and push a join condition into it.  That's not even
> possible today.  It might be possible after I get done with the
> parameterized-path stuff I've been speculating about for a couple of
> years now; but I suspect that even if it is possible, we won't do it
> for subqueries because of the planner-performance hit we'd take from
> repeatedly replanning the same subquery.

http://archives.postgresql.org/pgsql-performance/2011-11/msg00131.php

Given all the work that's been done on parameterized paths and LATERAL,is this something to revisit yet? In 9.3, it's at least possible tomanually tweak the SRF, thanks to LATERAL. But it'd be great to allowset-returning functions to remain blissfully unaware of theirsurroundings. Modular code, Single Responsibility Principle, all that.

I guess a more general question is: Are there cases where the plannercan *use* LATERAL functionality to push down restrictions like this? (DoLATERAL and pushdown conceptually overlap? I think maybe they do.)

Example code below - and before you say "but you could just use 'whereexists'", trust me that the original queries were much more involved :)


-----


    drop schema if exists jaytest cascade;
    create schema jaytest;
    set search_path to jaytest;

    create table questions (
      id int not null primary key,
      user_id int not null
    );
    insert into questions
      select generate_series(1,1100), (random()*2000000)::int;

    create table users (
      id int not null primary key
    );
    insert into users select generate_series(1, 2000000);

    vacuum freeze analyze;

    create function srf() returns table (user_id int) as $$
      select u.id
      from users as u
      group by u.id
    $$ language sql stable;

    -- Option 1: Use the set-returning function

    explain analyze
    select questions.id
    from questions
    join srf()
    on srf.user_id = questions.user_id
    where questions.id = 1;

    -- Option 2: Use the equivalent of the set-returning function
    -- (remove any doubts about function call overhead)

    explain analyze
    select questions.id
    from questions
    join (
      select u.id as user_id
      from users as u
      group by u.id
    ) as srf1
    on srf1.user_id = questions.user_id
    where questions.id = 1;

    -- Option 3: Now that we have LATERAL, manually
    -- push the join down into the set-returning function

create function srf_lateral(questions_user_id int) returns table(user_id int) as $$

      select u.id
      from users as u
      where u.id = questions_user_id
      group by u.id
    $$ language sql stable;

    explain analyze
    select questions.id
    from questions, lateral srf_lateral(questions.user_id)
    where questions.id = 1;

    drop schema jaytest cascade;

-----

On my machine, options 1 and 2 usually have runtimes of 100-400 ms, andoption 3 is about 0.04ms.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Pushing restrictions down into GROUP BYs?

Reply via email to