Teodor Sigaev <teo...@sigaev.ru> writes:
> I tried to look into patch and I had a question (one for now): why LimitPath 
> doesn't contain actual limit/offset value? I saw a lot of subqueries with 
> 1 which could be transformed into EXISTS subquery.

Oh, yeah, I intended to change that but didn't get to it yet.  Consider
it done.

> Me too. I applied the patch and can confirm that 'make test' doesn't fail on 
> FreeBSD 10.2. Now I will try to run kind of TPC-H with and without patch.

I do not think the patch will make a lot of performance difference as-is;
its value is more in what it will let us do later.  There are a couple of
regression test cases that change plans for the better, but it's sort of
accidental.  Those cases look like

  select d.* from d left join (select * from b group by b.id, b.c_id) s
    on d.a = s.id;

and what happens in HEAD is that the subquery chooses a hashagg plan
and then the upper query decides a mergejoin would be a good idea ...
so it has to sort the output of the hashagg.  With the patch, what
comes back from the subquery is a Path for the hashagg and a Path
for doing the GROUP BY with Sort/Uniq.  The second path is more expensive,
but it survives the add_path tournament because it can produce sorted
output.  Then the outer level discovers that it can use that to do its
mergejoin without a separate sort step, and that way is cheaper overall.
So instead of

!    ->  Sort
!          Sort Key: s.id
!          ->  Subquery Scan on s
!                ->  HashAggregate
!                      Group Key: b.id
!                      ->  Seq Scan on b

we get

!    ->  Group
!          Group Key: b.id
!          ->  Index Scan using b_pkey on b

which is noticeably cheaper, and not just because we got rid of the
Subquery Scan node.  So that's nice --- but it's more or less accidental,
because the outer level isn't telling the inner level that this sort order
might be interesting.

Once this infrastructure is in place, I want to look at passing down more
information to recursive subquery_planner calls so that we're not leaving
this kind of optimization to chance.  But the patch is big enough already,
so that (and a lot of other things) are getting left for later.

                        regards, tom lane

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to