Hi,I've been looking at this issue today, and so far I don't think it's a bug in the foreign key estimation. It seems mostly that the 9.5 estimates were hopelessly bad, and the join estimation changes simply pushed it a tiny bit the wrong direction.
Although maybe there is a bug (or at least a change of behavior) in one case, but I'll get to that.
I've managed to extract a small part of Q20 that demonstrates the differences between versions quite nicely, I think. The part causing the trouble looks like this:
explain select ps_suppkey from partsupp, ( select l_partkey agg_partkey, l_suppkey agg_suppkey from lineitem where l_shipdate >= date '1997-01-01' and l_shipdate < date '1997-01-01' + interval '1' year group by l_partkey, l_suppkey ) agg_lineitem where agg_partkey = ps_partkey and agg_suppkey = ps_suppkey and ps_partkey in ( select p_partkey from part where p_name like 'hot%' );i.e. it aggregates the "lineitem" table, and then joins "partsupp" and "part" tables to it.
"aggregated lineitem" <-> partsupp <-> partI've collected estimates from four different variants of the query (see the attached exlain.sql):
1) SIMPLE - join directly to lineitem (without the aggregation) - remove the p_name LIKE pattern matching 2) SIMPLE+LIKE - like SIMPLE, but keep the LIKE condition 3) GROUPING - join to the aggregated lineitem table - remove the p_name LIKE pattern matching 4) GROUPING+LIKE - like GROUPING, but keep the LIKE conditionI've collected estimates on a 20GB data set, both from 9.5 (so without any of the FK estimation changes) and on master with different foreign keys between the tables.
no-keys - no foreign keys between the three tables lineitem - lineitem references partsupp partsupp - partsupp references part both - both foreign keysAnd the results look like this (actual row counts were collected on 9.5, but that should not matter - the results should be the same on all versions):
branch SIMPLE SIMPLE+LIKE GROUPING GROUPING+LIKE -------------------------------------------------------------------- actual 119994608 1311974 10897186 119238 9.5 2863 35 160 160 no-keys 2340 24 868 868 lineitem 119994848 1229750 868 868 partsupp 2340 24 1737 18 both-keys 119994848 1212065 1737 18 This seems mostly sane, I guess, but let's look at various cases.In the SIMPLE cases, the foreign key "lineitem->partsupp" makes a huge difference - the estimates are pretty exact, both with and without the LIKE condition. The "partsupp->part" key makes almost no difference, though - the minor differences (35/24 and 1229750/1212065) seem to be mostly due to minor differences in stats built by ANALYZE, particularly in histograms used by patternsel().
In the GROUPING cases, the situation is obviously much worse. The grouping makes it impossible to use the "lineitem->partsupp" foreign key, resulting in severe underestimates. The "partsupp->part" is used, but the difference is pretty negligible as it's a simple (one column) foreign key.
The change from 160 -> 868 is merely due to 84f9a35e3 changing how we estimate number of groups in a GROUP BY clause. In 9.5 we get this:
-> HashAggregate (rows=1836028) (actual rows=10897186) while since 9.6 we get this -> GroupAggregate (rows=9674242)Not only is that much closer to the actual value than the 9.5 estimate, but it's almost exactly the factor between 160 and 868:
9674242 / 1836028 = 5.27 160 * 5.26 = 843So I'd say the 160 vs. 868 is expected, although the result is still way off, of course.
Which brings me to the slightly suspicious bit. On 9.5, there's no difference between GROUP and GROUP+LIKE cases - the estimates are exactly the same in both cases. This is true too, but only without the foreign key between "partsupp" and "part", i.e. the two non-grouped relations in the join. And what's more, the difference (1737 vs. 16) is pretty much exactly 100x, which is the estimate for the LIKE condition.
So it kinda seems 9.5 does not apply this condition for semi-joins, while >=9.6 does that.
regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
QUERY PLAN ----------------------------------------------------------------------------------------------------------------- Nested Loop Semi Join (cost=23613912.46..24689677.34 rows=2340 width=4) -> Merge Join (cost=23613912.03..24673859.22 rows=2340 width=12) Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND (lineitem.l_suppkey = partsupp.ps_suppkey)) -> Sort (cost=20971852.46..21271839.58 rows=119994848 width=8) Sort Key: lineitem.l_partkey, lineitem.l_suppkey -> Seq Scan on lineitem (cost=0.00..3228950.48 rows=119994848 width=8) -> Materialize (cost=2642034.49..2722034.49 rows=16000000 width=8) -> Sort (cost=2642034.49..2682034.49 rows=16000000 width=8) Sort Key: partsupp.ps_partkey, partsupp.ps_suppkey -> Seq Scan on partsupp (cost=0.00..508759.00 rows=16000000 width=8) -> Index Only Scan using part_pkey on part (cost=0.43..6.75 rows=1 width=4) Index Cond: (p_partkey = lineitem.l_partkey) (12 rows) QUERY PLAN -------------------------------------------------------------------------------------------------- Nested Loop (cost=1.00..1621494.68 rows=24 width=4) -> Nested Loop (cost=0.43..473219.70 rows=161616 width=12) -> Seq Scan on part (cost=0.00..131919.11 rows=40404 width=4) Filter: ((p_name)::text ~~ 'hot%'::text) -> Index Scan using idx_partsupp_partkey on partsupp (cost=0.43..8.10 rows=35 width=8) Index Cond: (ps_partkey = part.p_partkey) -> Index Only Scan using idx_lineitem_part_supp on lineitem (cost=0.57..7.09 rows=1 width=8) Index Cond: ((l_partkey = partsupp.ps_partkey) AND (l_suppkey = partsupp.ps_suppkey)) (8 rows) QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop Semi Join (cost=5205236.43..6886133.98 rows=1737 width=4) Join Filter: (lineitem.l_partkey = part.p_partkey) -> Merge Join (cost=5205236.00..6885310.01 rows=1737 width=12) Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND (lineitem.l_suppkey = partsupp.ps_suppkey)) -> GroupAggregate (cost=5205235.57..5438183.70 rows=9674242 width=40) Group Key: lineitem.l_partkey, lineitem.l_suppkey -> Sort (cost=5205235.57..5250637.47 rows=18160762 width=8) Sort Key: lineitem.l_partkey, lineitem.l_suppkey -> Bitmap Heap Scan on lineitem (cost=465856.38..2767269.81 rows=18160762 width=8) Recheck Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on idx_lineitem_shipdate (cost=0.00..461316.19 rows=18160762 width=0) Index Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Index Only Scan using partsupp_pkey on partsupp (cost=0.43..1221995.31 rows=16000000 width=8) -> Index Only Scan using part_pkey on part (cost=0.43..0.47 rows=1 width=4) Index Cond: (p_partkey = partsupp.ps_partkey) (15 rows) QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=5671028.84..6050302.90 rows=18 width=4) Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND (lineitem.l_suppkey = partsupp.ps_suppkey)) -> GroupAggregate (cost=5205235.57..5438183.70 rows=9674242 width=40) Group Key: lineitem.l_partkey, lineitem.l_suppkey -> Sort (cost=5205235.57..5250637.47 rows=18160762 width=8) Sort Key: lineitem.l_partkey, lineitem.l_suppkey -> Bitmap Heap Scan on lineitem (cost=465856.38..2767269.81 rows=18160762 width=8) Recheck Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on idx_lineitem_shipdate (cost=0.00..461316.19 rows=18160762 width=0) Index Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Sort (cost=465793.27..466197.31 rows=161616 width=12) Sort Key: partsupp.ps_partkey, partsupp.ps_suppkey -> Nested Loop (cost=0.43..451811.70 rows=161616 width=12) -> Seq Scan on part (cost=0.00..131919.11 rows=40404 width=4) Filter: ((p_name)::text ~~ 'hot%'::text) -> Index Scan using idx_partsupp_partkey on partsupp (cost=0.43..7.57 rows=35 width=8) Index Cond: (ps_partkey = part.p_partkey) (17 rows)
QUERY PLAN ----------------------------------------------------------------------------------------------------------- Merge Join (cost=24060582.74..25120389.31 rows=119994848 width=4) Merge Cond: ((lineitem.l_suppkey = partsupp.ps_suppkey) AND (lineitem.l_partkey = partsupp.ps_partkey)) -> Sort (cost=20971852.46..21271839.58 rows=119994848 width=8) Sort Key: lineitem.l_suppkey, lineitem.l_partkey -> Seq Scan on lineitem (cost=0.00..3228950.48 rows=119994848 width=8) -> Materialize (cost=3088641.19..3168641.19 rows=16000000 width=12) -> Sort (cost=3088641.19..3128641.19 rows=16000000 width=12) Sort Key: partsupp.ps_suppkey, partsupp.ps_partkey -> Hash Join (cost=171919.20..900678.20 rows=16000000 width=12) Hash Cond: (partsupp.ps_partkey = part.p_partkey) -> Seq Scan on partsupp (cost=0.00..508759.00 rows=16000000 width=8) -> Hash (cost=121919.09..121919.09 rows=4000009 width=4) -> Seq Scan on part (cost=0.00..121919.09 rows=4000009 width=4) (13 rows) QUERY PLAN -------------------------------------------------------------------------------------------------- Nested Loop (cost=1.00..1621494.68 rows=1212065 width=4) -> Nested Loop (cost=0.43..473219.70 rows=161616 width=12) -> Seq Scan on part (cost=0.00..131919.11 rows=40404 width=4) Filter: ((p_name)::text ~~ 'hot%'::text) -> Index Scan using idx_partsupp_partkey on partsupp (cost=0.43..8.10 rows=35 width=8) Index Cond: (ps_partkey = part.p_partkey) -> Index Only Scan using idx_lineitem_part_supp on lineitem (cost=0.57..7.09 rows=1 width=8) Index Cond: ((l_partkey = partsupp.ps_partkey) AND (l_suppkey = partsupp.ps_suppkey)) (8 rows) QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop Semi Join (cost=5205236.43..6886133.98 rows=1737 width=4) Join Filter: (lineitem.l_partkey = part.p_partkey) -> Merge Join (cost=5205236.00..6885310.01 rows=1737 width=12) Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND (lineitem.l_suppkey = partsupp.ps_suppkey)) -> GroupAggregate (cost=5205235.57..5438183.70 rows=9674242 width=40) Group Key: lineitem.l_partkey, lineitem.l_suppkey -> Sort (cost=5205235.57..5250637.47 rows=18160762 width=8) Sort Key: lineitem.l_partkey, lineitem.l_suppkey -> Bitmap Heap Scan on lineitem (cost=465856.38..2767269.81 rows=18160762 width=8) Recheck Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on idx_lineitem_shipdate (cost=0.00..461316.19 rows=18160762 width=0) Index Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Index Only Scan using partsupp_pkey on partsupp (cost=0.43..1221995.31 rows=16000000 width=8) -> Index Only Scan using part_pkey on part (cost=0.43..0.47 rows=1 width=4) Index Cond: (p_partkey = partsupp.ps_partkey) (15 rows) QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=5671028.84..6050302.90 rows=18 width=4) Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND (lineitem.l_suppkey = partsupp.ps_suppkey)) -> GroupAggregate (cost=5205235.57..5438183.70 rows=9674242 width=40) Group Key: lineitem.l_partkey, lineitem.l_suppkey -> Sort (cost=5205235.57..5250637.47 rows=18160762 width=8) Sort Key: lineitem.l_partkey, lineitem.l_suppkey -> Bitmap Heap Scan on lineitem (cost=465856.38..2767269.81 rows=18160762 width=8) Recheck Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on idx_lineitem_shipdate (cost=0.00..461316.19 rows=18160762 width=0) Index Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Sort (cost=465793.27..466197.31 rows=161616 width=12) Sort Key: partsupp.ps_partkey, partsupp.ps_suppkey -> Nested Loop (cost=0.43..451811.70 rows=161616 width=12) -> Seq Scan on part (cost=0.00..131919.11 rows=40404 width=4) Filter: ((p_name)::text ~~ 'hot%'::text) -> Index Scan using idx_partsupp_partkey on partsupp (cost=0.43..7.57 rows=35 width=8) Index Cond: (ps_partkey = part.p_partkey) (17 rows)
QUERY PLAN ----------------------------------------------------------------------------------------------------------- Merge Join (cost=24060582.74..25120389.31 rows=119994848 width=4) Merge Cond: ((lineitem.l_suppkey = partsupp.ps_suppkey) AND (lineitem.l_partkey = partsupp.ps_partkey)) -> Sort (cost=20971852.46..21271839.58 rows=119994848 width=8) Sort Key: lineitem.l_suppkey, lineitem.l_partkey -> Seq Scan on lineitem (cost=0.00..3228950.48 rows=119994848 width=8) -> Materialize (cost=3088641.19..3168641.19 rows=16000000 width=12) -> Sort (cost=3088641.19..3128641.19 rows=16000000 width=12) Sort Key: partsupp.ps_suppkey, partsupp.ps_partkey -> Hash Join (cost=171919.20..900678.20 rows=16000000 width=12) Hash Cond: (partsupp.ps_partkey = part.p_partkey) -> Seq Scan on partsupp (cost=0.00..508759.00 rows=16000000 width=8) -> Hash (cost=121919.09..121919.09 rows=4000009 width=4) -> Seq Scan on part (cost=0.00..121919.09 rows=4000009 width=4) (13 rows) QUERY PLAN ------------------------------------------------------------------------------------------------------- Merge Join (cost=3947865.64..4839365.50 rows=1229750 width=4) Merge Cond: (partsupp.ps_partkey = lineitem.l_partkey) Join Filter: (lineitem.l_suppkey = partsupp.ps_suppkey) -> Index Scan using idx_partsupp_partkey on partsupp (cost=0.43..764250.44 rows=16000000 width=8) -> Sort (cost=3947864.78..3950939.15 rows=1229750 width=12) Sort Key: lineitem.l_partkey -> Hash Join (cost=132424.16..3823475.97 rows=1229750 width=12) Hash Cond: (lineitem.l_partkey = part.p_partkey) -> Seq Scan on lineitem (cost=0.00..3228950.48 rows=119994848 width=8) -> Hash (cost=131919.11..131919.11 rows=40404 width=4) -> Seq Scan on part (cost=0.00..131919.11 rows=40404 width=4) Filter: ((p_name)::text ~~ 'hot%'::text) (12 rows) QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop Semi Join (cost=5205236.43..6886133.98 rows=868 width=4) Join Filter: (lineitem.l_partkey = part.p_partkey) -> Merge Join (cost=5205236.00..6885310.01 rows=1737 width=12) Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND (lineitem.l_suppkey = partsupp.ps_suppkey)) -> GroupAggregate (cost=5205235.57..5438183.70 rows=9674242 width=40) Group Key: lineitem.l_partkey, lineitem.l_suppkey -> Sort (cost=5205235.57..5250637.47 rows=18160762 width=8) Sort Key: lineitem.l_partkey, lineitem.l_suppkey -> Bitmap Heap Scan on lineitem (cost=465856.38..2767269.81 rows=18160762 width=8) Recheck Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on idx_lineitem_shipdate (cost=0.00..461316.19 rows=18160762 width=0) Index Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Index Only Scan using partsupp_pkey on partsupp (cost=0.43..1221995.31 rows=16000000 width=8) -> Index Only Scan using part_pkey on part (cost=0.43..0.47 rows=1 width=4) Index Cond: (p_partkey = partsupp.ps_partkey) (15 rows) QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=5803124.91..6191904.83 rows=868 width=4) Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND (lineitem.l_suppkey = partsupp.ps_suppkey)) -> GroupAggregate (cost=5205235.57..5438183.70 rows=9674242 width=40) Group Key: lineitem.l_partkey, lineitem.l_suppkey -> Sort (cost=5205235.57..5250637.47 rows=18160762 width=8) Sort Key: lineitem.l_partkey, lineitem.l_suppkey -> Bitmap Heap Scan on lineitem (cost=465856.38..2767269.81 rows=18160762 width=8) Recheck Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on idx_lineitem_shipdate (cost=0.00..461316.19 rows=18160762 width=0) Index Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Sort (cost=597889.34..601461.54 rows=1428880 width=12) Sort Key: partsupp.ps_partkey, partsupp.ps_suppkey -> Nested Loop (cost=0.43..451811.70 rows=1428880 width=12) -> Seq Scan on part (cost=0.00..131919.11 rows=40404 width=4) Filter: ((p_name)::text ~~ 'hot%'::text) -> Index Scan using idx_partsupp_partkey on partsupp (cost=0.43..7.57 rows=35 width=8) Index Cond: (ps_partkey = part.p_partkey) (17 rows)
explain.sql
Description: application/sql
QUERY PLAN ----------------------------------------------------------------------------------------------------------------- Nested Loop Semi Join (cost=23613912.46..24689677.34 rows=2340 width=4) -> Merge Join (cost=23613912.03..24673859.22 rows=2340 width=12) Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND (lineitem.l_suppkey = partsupp.ps_suppkey)) -> Sort (cost=20971852.46..21271839.58 rows=119994848 width=8) Sort Key: lineitem.l_partkey, lineitem.l_suppkey -> Seq Scan on lineitem (cost=0.00..3228950.48 rows=119994848 width=8) -> Materialize (cost=2642034.49..2722034.49 rows=16000000 width=8) -> Sort (cost=2642034.49..2682034.49 rows=16000000 width=8) Sort Key: partsupp.ps_partkey, partsupp.ps_suppkey -> Seq Scan on partsupp (cost=0.00..508759.00 rows=16000000 width=8) -> Index Only Scan using part_pkey on part (cost=0.43..6.75 rows=1 width=4) Index Cond: (p_partkey = lineitem.l_partkey) (12 rows) QUERY PLAN ------------------------------------------------------------------------------------------------------- Merge Join (cost=3947865.64..4839365.50 rows=24 width=4) Merge Cond: (partsupp.ps_partkey = lineitem.l_partkey) Join Filter: (lineitem.l_suppkey = partsupp.ps_suppkey) -> Index Scan using idx_partsupp_partkey on partsupp (cost=0.43..764250.44 rows=16000000 width=8) -> Sort (cost=3947864.78..3950939.15 rows=1229750 width=12) Sort Key: lineitem.l_partkey -> Hash Join (cost=132424.16..3823475.97 rows=1229750 width=12) Hash Cond: (lineitem.l_partkey = part.p_partkey) -> Seq Scan on lineitem (cost=0.00..3228950.48 rows=119994848 width=8) -> Hash (cost=131919.11..131919.11 rows=40404 width=4) -> Seq Scan on part (cost=0.00..131919.11 rows=40404 width=4) Filter: ((p_name)::text ~~ 'hot%'::text) (12 rows) QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop Semi Join (cost=5205236.43..6886133.98 rows=868 width=4) Join Filter: (lineitem.l_partkey = part.p_partkey) -> Merge Join (cost=5205236.00..6885310.01 rows=1737 width=12) Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND (lineitem.l_suppkey = partsupp.ps_suppkey)) -> GroupAggregate (cost=5205235.57..5438183.70 rows=9674242 width=40) Group Key: lineitem.l_partkey, lineitem.l_suppkey -> Sort (cost=5205235.57..5250637.47 rows=18160762 width=8) Sort Key: lineitem.l_partkey, lineitem.l_suppkey -> Bitmap Heap Scan on lineitem (cost=465856.38..2767269.81 rows=18160762 width=8) Recheck Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on idx_lineitem_shipdate (cost=0.00..461316.19 rows=18160762 width=0) Index Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Index Only Scan using partsupp_pkey on partsupp (cost=0.43..1221995.31 rows=16000000 width=8) -> Index Only Scan using part_pkey on part (cost=0.43..0.47 rows=1 width=4) Index Cond: (p_partkey = partsupp.ps_partkey) (15 rows) QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=5803124.91..6191904.83 rows=868 width=4) Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND (lineitem.l_suppkey = partsupp.ps_suppkey)) -> GroupAggregate (cost=5205235.57..5438183.70 rows=9674242 width=40) Group Key: lineitem.l_partkey, lineitem.l_suppkey -> Sort (cost=5205235.57..5250637.47 rows=18160762 width=8) Sort Key: lineitem.l_partkey, lineitem.l_suppkey -> Bitmap Heap Scan on lineitem (cost=465856.38..2767269.81 rows=18160762 width=8) Recheck Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on idx_lineitem_shipdate (cost=0.00..461316.19 rows=18160762 width=0) Index Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Sort (cost=597889.34..601461.54 rows=1428880 width=12) Sort Key: partsupp.ps_partkey, partsupp.ps_suppkey -> Nested Loop (cost=0.43..451811.70 rows=1428880 width=12) -> Seq Scan on part (cost=0.00..131919.11 rows=40404 width=4) Filter: ((p_name)::text ~~ 'hot%'::text) -> Index Scan using idx_partsupp_partkey on partsupp (cost=0.43..7.57 rows=35 width=8) Index Cond: (ps_partkey = part.p_partkey) (17 rows)
select ps_suppkey from lineitem, partsupp where l_partkey = ps_partkey and l_suppkey = ps_suppkey and ps_partkey in ( select p_partkey from part ); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------ Nested Loop Semi Join (cost=23613854.76..24693102.68 rows=2863 width=4) (actual time=252662.703..884575.874 rows=119994608 loops=1) -> Merge Join (cost=23613854.33..24673749.14 rows=2863 width=12) (actual time=252662.656..458535.952 rows=119994608 loops=1) Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND (lineitem.l_suppkey = partsupp.ps_suppkey)) -> Sort (cost=20971812.62..21271799.14 rows=119994608 width=8) (actual time=228806.169..336527.618 rows=119994608 loops=1) Sort Key: lineitem.l_partkey, lineitem.l_suppkey Sort Method: external merge Disk: 2111136kB -> Seq Scan on lineitem (cost=0.00..3228948.08 rows=119994608 width=8) (actual time=0.227..31541.384 rows=119994608 loops=1) -> Materialize (cost=2642034.49..2722034.49 rows=16000000 width=8) (actual time=23856.476..44148.011 rows=120001929 loops=1) -> Sort (cost=2642034.49..2682034.49 rows=16000000 width=8) (actual time=23856.465..28096.959 rows=16000000 loops=1) Sort Key: partsupp.ps_partkey, partsupp.ps_suppkey Sort Method: external sort Disk: 281536kB -> Seq Scan on partsupp (cost=0.00..508759.00 rows=16000000 width=8) (actual time=0.013..4056.366 rows=16000000 loops=1) -> Index Only Scan using part_pkey on part (cost=0.43..6.75 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=119994608) Index Cond: (p_partkey = lineitem.l_partkey) Heap Fetches: 119994608 Planning time: 1.414 ms Execution time: 891781.476 ms (17 rows) select ps_suppkey from lineitem, partsupp where l_partkey = ps_partkey and l_suppkey = ps_suppkey and ps_partkey in ( select p_partkey from part where p_name like 'hot%' ); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=3973962.16..4902728.75 rows=35 width=4) (actual time=42931.190..51758.776 rows=1311974 loops=1) Merge Cond: (partsupp.ps_partkey = lineitem.l_partkey) Join Filter: (lineitem.l_suppkey = partsupp.ps_suppkey) Rows Removed by Join Filter: 3935922 -> Index Scan using idx_partsupp_partkey on partsupp (cost=0.43..764250.44 rows=16000000 width=8) (actual time=0.011..5189.382 rows=15998913 loops=1) -> Sort (cost=3973913.32..3977585.07 rows=1468702 width=12) (actual time=42930.688..43398.341 rows=5247893 loops=1) Sort Key: lineitem.l_partkey Sort Method: quicksort Memory: 110651kB -> Hash Join (cost=132424.70..3823473.36 rows=1468702 width=12) (actual time=899.792..42215.343 rows=1311974 loops=1) Hash Cond: (lineitem.l_partkey = part.p_partkey) -> Seq Scan on lineitem (cost=0.00..3228948.08 rows=119994608 width=8) (actual time=0.005..17835.189 rows=119994608 loops=1) -> Hash (cost=131919.64..131919.64 rows=40405 width=4) (actual time=899.698..899.698 rows=43703 loops=1) Buckets: 65536 Batches: 1 Memory Usage: 7498kB -> Seq Scan on part (cost=0.00..131919.64 rows=40405 width=4) (actual time=0.023..886.782 rows=43703 loops=1) Filter: ((p_name)::text ~~ 'hot%'::text) Rows Removed by Filter: 3956297 Planning time: 1.450 ms Execution time: 51809.901 ms (18 rows) select ps_suppkey from partsupp, ( select l_partkey agg_partkey, l_suppkey agg_suppkey, 0.5 * sum(l_quantity) AS agg_quantity from lineitem where l_shipdate >= date '1997-01-01' and l_shipdate < date '1997-01-01' + interval '1' year group by l_partkey, l_suppkey ) agg_lineitem where agg_partkey = ps_partkey and agg_suppkey = ps_suppkey and ps_partkey in ( select p_partkey from part ); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=3039101.38..4313205.40 rows=160 width=4) (actual time=49355.090..220422.294 rows=10897186 loops=1) -> Hash Join (cost=3039100.94..3101066.89 rows=918014 width=12) (actual time=49354.508..69808.651 rows=10897186 loops=1) Hash Cond: (lineitem.l_partkey = part.p_partkey) -> HashAggregate (cost=2867180.80..2885541.08 rows=1836028 width=8) (actual time=47150.671..53455.449 rows=10897186 loops=1) Group Key: lineitem.l_partkey, lineitem.l_suppkey -> Bitmap Heap Scan on lineitem (cost=470973.36..2775379.44 rows=18360272 width=8) (actual time=7924.016..28786.995 rows=18214751 loops=1) Recheck Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) Heap Blocks: exact=1924458 -> Bitmap Index Scan on idx_lineitem_shipdate (cost=0.00..466383.29 rows=18360272 width=0) (actual time=6847.338..6847.338 rows=18214751 loops=1) Index Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) -> Hash (cost=121919.51..121919.51 rows=4000051 width=4) (actual time=2196.673..2196.673 rows=4000000 loops=1) Buckets: 4194304 (originally 4194304) Batches: 4 (originally 1) Memory Usage: 229377kB -> Seq Scan on part (cost=0.00..121919.51 rows=4000051 width=4) (actual time=0.012..712.926 rows=4000000 loops=1) -> Index Only Scan using partsupp_pkey on partsupp (cost=0.43..1.31 rows=1 width=8) (actual time=0.013..0.013 rows=1 loops=10897186) Index Cond: ((ps_partkey = lineitem.l_partkey) AND (ps_suppkey = lineitem.l_suppkey)) Heap Fetches: 10897186 Planning time: 0.855 ms Execution time: 221281.967 ms (18 rows) select ps_suppkey from partsupp, ( select l_partkey agg_partkey, l_suppkey agg_suppkey, 0.5 * sum(l_quantity) AS agg_quantity from lineitem where l_shipdate >= date '1997-01-01' and l_shipdate < date '1997-01-01' + interval '1' year group by l_partkey, l_suppkey ) agg_lineitem where agg_partkey = ps_partkey and agg_suppkey = ps_suppkey and ps_partkey in ( select p_partkey from part where p_name like 'hot%' ); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=3690996.25..3715358.15 rows=160 width=4) (actual time=79478.109..83008.686 rows=119238 loops=1) Merge Cond: ((partsupp.ps_partkey = agg_lineitem.agg_partkey) AND (partsupp.ps_suppkey = agg_lineitem.agg_suppkey)) -> Sort (cost=596073.10..599603.57 rows=1412188 width=12) (actual time=1648.385..1659.393 rows=174812 loops=1) Sort Key: partsupp.ps_partkey, partsupp.ps_suppkey Sort Method: quicksort Memory: 14339kB -> Nested Loop (cost=0.43..451821.62 rows=1412188 width=12) (actual time=0.071..1548.566 rows=174812 loops=1) -> Seq Scan on part (cost=0.00..131919.64 rows=40405 width=4) (actual time=0.047..1006.884 rows=43703 loops=1) Filter: ((p_name)::text ~~ 'hot%'::text) Rows Removed by Filter: 3956297 -> Index Scan using idx_partsupp_partkey on partsupp (cost=0.43..7.57 rows=35 width=8) (actual time=0.010..0.011 rows=4 loops=43703) Index Cond: (ps_partkey = part.p_partkey) -> Sort (cost=3094923.15..3099513.22 rows=1836028 width=8) (actual time=77829.655..79681.366 rows=10896449 loops=1) Sort Key: agg_lineitem.agg_partkey, agg_lineitem.agg_suppkey Sort Method: external sort Disk: 234352kB -> Subquery Scan on agg_lineitem (cost=2867180.80..2903901.36 rows=1836028 width=8) (actual time=48973.005..55717.145 rows=10897186 loops=1) -> HashAggregate (cost=2867180.80..2885541.08 rows=1836028 width=8) (actual time=48973.003..54254.198 rows=10897186 loops=1) Group Key: lineitem.l_partkey, lineitem.l_suppkey -> Bitmap Heap Scan on lineitem (cost=470973.36..2775379.44 rows=18360272 width=8) (actual time=7918.595..29778.138 rows=18214751 loops=1) Recheck Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) Heap Blocks: exact=1924458 -> Bitmap Index Scan on idx_lineitem_shipdate (cost=0.00..466383.29 rows=18360272 width=0) (actual time=6849.743..6849.743 rows=18214751 loops=1) Index Cond: ((l_shipdate >= '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp without time zone)) Planning time: 0.723 ms Execution time: 83096.504 ms (24 rows)
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers