On 10 October 2015 at 08:52, Sean Rhea <sean.c.r...@gmail.com> wrote:

>
> 1. When I join two tables with "WHERE id IN (...)" versus with an explicit
> join, and the join column for the inner table is a primary key, I would
> expect
> the same behavior in both cases, but the optimizer is choosing a merge
> join in
> one case and a merge semi join in the other. There's at most one customer
> with a given id. Why not do a semi join?
>
>
Unfortunately the 9.2 planner does not make any checks to verify that
customers.id is unique to perform a semi join. There is a pending patch in
the 9.6 cycle to add this optimisation.


> production=> select count(*) from customers;
>  count
> --------
>  473733
> (1 row)
>
> ...


>    ->  Index Scan using customers_pkey on customers o
>  (cost=0.00..63836.61 rows=836 width=8) (actual time=159.840..201915.765
> rows=7672 loops=1)
>          Filter: (group_id = 45)
>          Rows Removed by Filter: 212699113
>
>
Rows Removed by Filter: 212699113 seems to indicate that your 473733 row
count for "customers" is incorrect.

If you're doing lots of filtering on group_id, then perhaps you should
think about adding an index on customers (group_id,id)


--
 David Rowley                   http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
 PostgreSQL Development, 24x7 Support, Training & Services

Reply via email to