On 09/30/2015 03:12 AM, David Rowley wrote:
    CREATE TABLE f (id1 INT, id2 INT, PRIMARY KEY (id1, id2));

    CREATE TABLE d1 (id1 INT, id2 INT, FOREIGN KEY (id1, id2) REFERENCES
    f(id1, id2));
    CREATE TABLE d2 (id1 INT, id2 INT, FOREIGN KEY (id1, id2) REFERENCES
    f(id1, id2));

    INSERT INTO f SELECT i, i FROM generate_series(1,1000000) s(i);

    INSERT INTO d1 SELECT i, i FROM generate_series(1,100000) s(i);
    INSERT INTO d2 SELECT i, i FROM generate_series(1,300000) s(i);

    now, both pair-wise joins (f JOIN d1) and (f JOIN d2) are estimated
    perfectly accurately, but as soon as the query involves both of
    them, this happens:

    SELECT * FROM f JOIN d1 ON (f.id1 = d1.id1 AND f.id2 = d1.id2)
                     JOIN d2 ON (f.id1 = d2.id1 AND f.id2 = d2.id2);

                               QUERY PLAN
      Nested Loop  (cost=3334.43..12647.57 rows=30000 width=24)
                   (actual time=221.086..1767.206 rows=100000 loops=1)
        Join Filter: ((d1.id1 = f.id1) AND (d1.id2 = f.id2))
        ->  Hash Join  (cost=3334.00..12647.01 rows=1 width=16)
                       (actual time=221.058..939.482 rows=100000 loops=1)
              Hash Cond: ((d2.id1 = d1.id1) AND (d2.id2 = d1.id2))
              ->  Seq Scan on d2  (cost=0.00..4328.00 rows=300000 width=8)
                          (actual time=0.038..263.356 rows=300000 loops=1)
              ->  Hash  (cost=1443.00..1443.00 rows=100000 width=8)
                        (actual time=220.721..220.721 rows=100000 loops=1)
                    Buckets: 131072  Batches: 2  Memory Usage: 2982kB
                    ->  Seq Scan on d1  (cost=0.00..1443.00 rows=100000 ...)
                            (actual time=0.033..101.547 rows=100000 loops=1)
        ->  Index Only Scan using f_pkey on f  (cost=0.42..0.54 rows=1 ...)
                             (actual time=0.004..0.004 rows=1 loops=100000)
              Index Cond: ((id1 = d2.id1) AND (id2 = d2.id2))
              Heap Fetches: 100000

    Clearly, the inner join (d1 JOIN d2) is poorly estimated (1 vs.
    100000). I assume that's only because we find FK only on the second
    join with f.

    So it seems like a clear improvement, both compared to master and to
    the previous versions of the patch.

I've been experimenting with this example. Of course, the reason why we
get the 1 row estimate on the join between d1 and d2 is that there's no
foreign key between those two relations.

The attached patch changes things so that the foreign key matching code
is better able to see foreign keys "hidden" behind eclasses. It now does
in fact detect a foreign key on d2 referencing d1, by looking for
foreign keys whose Vars are in the same eclasses as the ones the
joinquals are built from. This has improved the result:

postgres=# EXPLAIN ANALYZE SELECT * FROM f JOIN d1 ON (f.id1 = d1.id1
AND f.id2 = d1.id2) JOIN d2 ON (f.id1 = d2.id1 AND f.id2 = d2.id2);

  Hash Join  (cost=16655.94..26066.95 rows=30000 width=24) (actual
time=267.322..468.383 rows=100000 loops=1)
    Hash Cond: ((d2.id1 = f.id1) AND (d2.id2 = f.id2))
    ->  Seq Scan on d2  (cost=0.00..4328.00 rows=300000 width=8) (actual
time=0.019..31.396 rows=300000 loops=1)
    ->  Hash  (cost=14666.94..14666.94 rows=100000 width=16) (actual
time=266.263..266.263 rows=100000 loops=1)
          Buckets: 131072  Batches: 2  Memory Usage: 3373kB
          ->  Merge Join  (cost=9748.32..14666.94 rows=100000 width=16)
(actual time=104.494..224.908 rows=100000 loops=1)
                Merge Cond: ((f.id1 = d1.id1) AND (f.id2 = d1.id2))
                ->  Index Only Scan using f_pkey on f
  (cost=0.42..36214.93 rows=1000000 width=8) (actual time=0.045..35.758
rows=100001 loops=1)
                      Heap Fetches: 100001
                ->  Sort  (cost=9747.82..9997.82 rows=100000 width=8)
(actual time=104.440..122.401 rows=100000 loops=1)
                      Sort Key: d1.id1, d1.id2
                      Sort Method: external sort  Disk: 2152kB
                      ->  Seq Scan on d1  (cost=0.00..1443.00
rows=100000 width=8) (actual time=0.019..9.443 rows=100000 loops=1)

The problem is that the code I added is sometimes a bit too optimistic
at finding a suitable foreign key. When performing estimates for the
join between (f,d1) <-> (d2), since the code loops over each relation
making up the set of relations on either side of the join, it finds a
foreign key on 'f' referencing d2; this one actually exists. It then
goes on and also finds a foreign key on d1 referencing d2; of course
this one does not exist and is only found due to the eclasses. The
problem here is, which one do we use? If we multiply the selectivities
of these foreign keys together then we'd end up with a selectivity =
(1.0 / 1000000) * (1.0 / 300000), which is a massive underestimation.
Perhaps doing this would be perfectly valid if the second match were a
genuinely different foreign key, but it seems wrong when we match the
same foreign key in both instances.

I've gone through a few variations on ways to handle this, and I'm a
bit stuck on what the best way is.

In the attached I've coded it to take the Min() selectivity when the
same quals are matched more than once. I know this is not correct, but
since it seems impossible to obtain an exact estimate in this case,
we'd need to decide on some logic which does well in the average case.
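To make the arithmetic concrete, here's a back-of-the-envelope
comparison in plain Python (using the row counts from the example
above, and assuming the simple rows_outer * rows_inner * selectivity
estimation formula) of multiplying both matched FK selectivities
versus taking the Min() of them:

```python
# Row counts from the example: |f| = 1,000,000, |d1| = 100,000, |d2| = 300,000.
# When estimating the (f,d1) <-> (d2) join, two candidate FKs get matched:
#   the real FK via f's PK, selectivity 1/1,000,000
#   the eclass-derived one involving d1, selectivity 1/300,000 (doesn't exist)
sel_f = 1.0 / 1000000
sel_d1 = 1.0 / 300000

outer_rows = 100000   # actual size of (f JOIN d1)
inner_rows = 300000   # size of d2

# Multiplying both selectivities together massively underestimates:
rows_product = max(1, round(outer_rows * inner_rows * sel_f * sel_d1))

# Applying only the Min() selectivity gives 30000, which is the rows=30000
# estimate seen at the top of the plans above:
rows_min = max(1, round(outer_rows * inner_rows * min(sel_f, sel_d1)))

print(rows_product, rows_min)   # -> 1 30000
```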

I don't think we should derive foreign keys like this. The basis for using foreign keys to improve estimates is that the foreign keys are supposed to provide guarantees of existence, but that's clearly not the case here - there's no foreign key between the two tables that get joined first, and the FK you derive this way guarantees nothing.

For example the tables might reference different subsets of the "f" table, e.g. d1 might reference odd rows while d2 even rows. That kinda breaks the assumption of containment, but well - that's exactly what FK constraints are supposed to guarantee (and what we use to improve the estimates).
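A toy illustration of that with Python sets (hypothetical data,
mirroring the odd/even scenario): both referencing tables satisfy
their FK to f, yet their join is empty, so the two FKs alone imply
nothing about the d1:d2 join size.

```python
f = set(range(1, 11))              # the referenced table's keys
d1 = {k for k in f if k % 2 == 1}  # references only odd rows of f
d2 = {k for k in f if k % 2 == 0}  # references only even rows of f

# Each FK guarantees containment in f...
assert d1 <= f and d2 <= f

# ...but guarantees nothing about d1 vs d2: their join is empty here,
# even though an eclass-derived "FK" between them would suggest a match
# for every row.
print(len(d1 & d2))   # -> 0
```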

The problem with estimating cardinality of the d1:d2 join is purely in our inability to estimate the cardinality of the pair of columns used in the join condition. The planner sees the conditions independently, estimates the selectivity as 1/ndistinct for the column and then multiplies that together. Sadly, in this case ndistinct is equal to cardinality of each of the tables, thus we get extreme under-estimate.
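As a sketch of that arithmetic in plain Python (the 1/max(nd1, nd2)
per-clause selectivity mirrors my understanding of what eqjoinsel does
in the absence of MCV lists, so treat the exact formula as an
assumption):

```python
# d1 has 100,000 rows, d2 has 300,000 rows (the original example).
# Each of id1, id2 is unique in both tables, so per-column ndistinct
# equals the table cardinality.
n1, n2 = 100000, 300000

# Per equality clause, the planner estimates selectivity as roughly
# 1/max(ndistinct1, ndistinct2), then multiplies the two clauses
# together as if they were independent:
sel_clause = 1.0 / max(n1, n2)            # for id1 = id1, and again for id2 = id2
rows = n1 * n2 * sel_clause * sel_clause  # = 100000/300000 = 0.33...

# Clamped to at least one row -- hence the rows=1 estimate on the
# d1:d2 join, while the join actually produces 100,000 rows.
print(max(1, round(rows)))   # -> 1
```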

Consider a simplified example:

CREATE TABLE d1 (id1 INT, id2 INT);
CREATE TABLE d2 (id1 INT, id2 INT);

INSERT INTO d1 SELECT i/100, i%100 FROM generate_series(0,9999) s(i);
INSERT INTO d2 SELECT i/548, i%548 FROM generate_series(0,299999) s(i);

In this case the data are constructed in a way that the product of column cardinalities is equal to table cardinality.

d1: 100 x 100 =  10,000
d2: 548 x 548 = 300,000 (approximately)
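Checking those cardinalities in plain Python (same expressions as the
INSERT ... generate_series statements above):

```python
# Mirror the generate_series expressions used to populate d1 and d2.
d1 = [(i // 100, i % 100) for i in range(0, 10000)]
d2 = [(i // 548, i % 548) for i in range(0, 300000)]

nd1_id1 = len({r[0] for r in d1})   # 100
nd1_id2 = len({r[1] for r in d1})   # 100
nd2_id1 = len({r[0] for r in d2})   # 548
nd2_id2 = len({r[1] for r in d2})   # 548

# Product of per-column ndistinct matches the table cardinality:
print(nd1_id1 * nd1_id2, nd2_id1 * nd2_id2)   # -> 10000 300304
```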

                               QUERY PLAN
 Hash Join  (cost=10000.00..30046.69 rows=9969 width=8)
            (actual time=278.989..306.935 rows=10000 loops=1)
   Hash Cond: ((d1.id1 = d2.id1) AND (d1.id2 = d2.id2))
   ->  Seq Scan on d1  (cost=0.00..145.00 rows=10000 width=8)
                       (actual time=0.031..4.202 rows=10000 loops=1)
   ->  Hash  (cost=4328.00..4328.00 rows=300000 width=8)
             (actual time=278.717..278.717 rows=300000 loops=1)
         Buckets: 131072  Batches: 4  Memory Usage: 3947kB
         ->  Seq Scan on d2  (cost=0.00..4328.00 rows=300000 width=8)
                     (actual time=0.020..129.025 rows=300000 loops=1)
 Planning time: 0.556 ms
 Execution time: 311.037 ms
(8 rows)

So fixing the cardinality estimate for (id1,id2) actually fixes the estimate, but that's a task for the multivariate stats patch, not for this one.

The FK improves the ndistinct estimate implicitly as it guarantees the cardinality of one side is exactly the cardinality of the table (thanks to referencing a primary key). Maybe we could use the existence of unique constraints in other cases.

Overall, I still believe the FK patch is a clear improvement of the current status - while it's true it does not improve all possible cases and there's a room for additional improvements (like handling multiple candidate FK constraints), it should not make existing estimates worse.


Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)