Hi Ashutosh, Thanks for your reply!
>I first thought enable_hashagg should be sufficient to choose one strategy over the other. But that is not true, enable_hashagg = true allows both the strategies, enable_hashagg = false disables just hash strategy. There's no way to disable group agg alone. So I think it makes sense to have this GUC. Yes, exactly! Thank you for the clarification. >I am surprised that we didn't see this being a problem for so long. I was also wondering why this GUC parameter hadn't been introduced earlier. >We seem to disable mixed strategy when enable_hashagg is false. Do we want to do the same when enable_groupagg = false as well? As I mentioned in my previous email, setting enable_groupagg = false resulted in better execution time compared to the mixed strategy (which, as I understand, includes both HashAgg and GroupAgg). So, I believe this new GUC parameter would be helpful for users in certain situations. Here’s how the current and proposed behaviors compare: Current behavior: enable_hashagg = ON → Mixed enable_hashagg = OFF → GroupAgg After applying the patch: enable_hashagg = ON, enable_groupagg = ON → Mixed enable_hashagg = OFF, enable_groupagg = ON → GroupAgg enable_hashagg = ON, enable_groupagg = OFF → HashAgg (new) enable_hashagg = OFF, enable_groupagg = OFF → GroupAgg In addition, if the both parameters = OFF, GroupAgg will be selected in the patch. The reason is that I found a comment that HashAgg might use too much memory, so I decided to prioritize using GroupAgg, which is more secure. In the last case, I chose to default to GroupAgg since I found a comment suggesting HashAgg might consume excessive memory, so GroupAgg seemed the safer fallback. Some hackers may want to compare the actual execution plans and times under different GUC settings, so I've attached my test results in "test_result.txt". >Some of those instances are for plan stability, all of which need not be replicated. But some of them explicitly test sort based grouping. For rest of them hash based plan seems to be the best one, so explicit enable_groupagg = false is not needed. We will need some test to test the switch though. Thanks for your advice. I'll create a regression test and send a new patch to -hackers in my next email. Regards, Tatsuro Yamada
# 1. Setting: default (enable_hashagg = ON, enable_groupagg = ON) # Mix strategy was used # Execution time was 41 sec ==== QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=159135.83..7790760.21 rows=100 width=66) (actual time=12449.124..41253.731 rows=100.00 loops=1) Buffers: shared hit=116473 read=30167, temp read=1482 written=11035 -> GroupAggregate (cost=159135.83..1689495507.58 rows=22136 width=66) (actual time=12449.121..41253.648 rows=100.00 loops=1) Group Key: cd.cd_gender, cd.cd_marital_status, cd.cd_education_status, cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count Buffers: shared hit=116473 read=30167, temp read=1482 written=11035 -> Nested Loop (cost=159135.83..1689494898.84 rows=22136 width=34) (actual time=12317.664..41252.393 rows=101.00 loops=1) Join Filter: (cd.cd_demo_sk = c.c_current_cdemo_sk) Rows Removed by Join Filter: 197188565 Buffers: shared hit=116473 read=30167, temp read=1482 written=11035 -> Gather Merge (cost=125190.47..348899.28 rows=1920800 width=38) (actual time=10122.049..10269.869 rows=34276.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=4889 read=11029, temp read=1482 written=11035 -> Sort (cost=124190.45..126191.28 rows=800333 width=38) (actual time=9985.641..10044.083 rows=11982.00 loops=3) Sort Key: cd.cd_gender, cd.cd_marital_status, cd.cd_education_status, cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count Sort Method: external merge Disk: 28488kB Buffers: shared hit=4889 read=11029, temp read=1482 written=11035 Worker 0: Sort Method: external merge Disk: 29392kB Worker 1: Sort Method: external merge Disk: 30112kB -> Parallel Seq Scan on customer_demographics cd (cost=0.00..23831.33 rows=800333 width=38) (actual time=0.377..852.678 rows=640266.67 loops=3) Buffers: shared hit=4799 read=11029 -> Materialize (cost=33945.36..1051363622.90 rows=22136 width=4) (actual time=0.052..0.337 rows=5752.97 loops=34276) Storage: Memory Maximum Storage: 289kB Buffers: shared hit=111584 read=19138 -> Nested Loop (cost=33945.36..1051363512.22 rows=22136 width=4) (actual time=1785.640..2183.175 rows=5753.00 loops=1) Buffers: shared hit=111584 read=19138 -> Nested Loop (cost=33945.07..1051355209.75 rows=24605 width=8) (actual time=1785.567..2119.781 rows=6409.00 loops=1) Buffers: shared hit=92357 read=19138 -> HashAggregate (cost=33944.78..34272.85 rows=32807 width=4) (actual time=819.215..842.604 rows=28544.00 loops=1) Group Key: ss.ss_customer_sk Batches: 1 Memory Usage: 2329kB Buffers: shared hit=2187 read=11744 -> Gather (cost=2683.76..33862.76 rows=32807 width=4) (actual time=23.253..768.211 rows=33609.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=2187 read=11744 -> Hash Join (cost=1683.76..29582.06 rows=13670 width=4) (actual time=22.719..780.073 rows=11203.00 loops=3) Hash Cond: (ss.ss_sold_date_sk = d.d_date_sk) Buffers: shared hit=2187 read=11744 -> Parallel Seq Scan on store_sales ss (cost=0.00..24747.68 rows=1200168 width=8) (actual time=5.144..424.772 rows=960134.67 loops=3) Buffers: shared hit=1002 read=11744 -> Hash (cost=1673.36..1673.36 rows=832 width=4) (actual time=17.501..17.502 rows=848.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 38kB Buffers: shared hit=1185 -> Seq Scan on date_dim d (cost=0.00..1673.36 rows=832 width=4) (actual time=0.024..17.253 rows=848.00 loops=3) Filter: ((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002)) Rows Removed by Filter: 72201 Buffers: shared hit=1185 -> Index Scan using customer_pkey on customer c (cost=0.29..36766.33 rows=1 width=12) (actual time=0.044..0.044 rows=0.22 loops=28544) Index Cond: (c_customer_sk = ss.ss_customer_sk) Filter: ((ANY (c_customer_sk = (hashed SubPlan 2).col1)) OR (ANY (c_customer_sk = (hashed SubPlan 4).col1))) Rows Removed by Filter: 1 Index Searches: 28544 Buffers: shared hit=90170 read=7394 SubPlan 2 -> Gather (cost=2683.76..10471.46 rows=8194 width=4) (actual time=18.769..103.477 rows=8156.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=2496 read=1873 -> Hash Join (cost=1683.76..8652.06 rows=3414 width=4) (actual time=27.114..97.488 rows=2718.67 loops=3) Hash Cond: (ws.ws_sold_date_sk = d_1.d_date_sk) Buffers: shared hit=2496 read=1873 -> Parallel Seq Scan on web_sales ws (cost=0.00..6181.43 rows=299743 width=8) (actual time=0.005..26.446 rows=239794.67 loops=3) Buffers: shared hit=1311 read=1873 -> Hash (cost=1673.36..1673.36 rows=832 width=4) (actual time=27.079..27.080 rows=848.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 38kB Buffers: shared hit=1185 -> Seq Scan on date_dim d_1 (cost=0.00..1673.36 rows=832 width=4) (actual time=0.033..26.791 rows=848.00 loops=3) Filter: ((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002)) Rows Removed by Filter: 72201 Buffers: shared hit=1185 SubPlan 4 -> Gather (cost=2683.76..18287.89 rows=16419 width=4) (actual time=11.472..831.348 rows=16874.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=2043 read=5521 -> Hash Join (cost=1683.76..15645.99 rows=6841 width=4) (actual time=18.591..824.234 rows=5624.67 loops=3) Hash Cond: (cs.cs_sold_date_sk = d_2.d_date_sk) Buffers: shared hit=2043 read=5521 -> Parallel Seq Scan on catalog_sales cs (cost=0.00..12385.45 rows=600645 width=8) (actual time=0.334..455.274 rows=480516.00 loops=3) Buffers: shared hit=858 read=5521 -> Hash (cost=1673.36..1673.36 rows=832 width=4) (actual time=18.205..18.206 rows=848.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 38kB Buffers: shared hit=1185 -> Seq Scan on date_dim d_2 (cost=0.00..1673.36 rows=832 width=4) (actual time=0.048..17.917 rows=848.00 loops=3) Filter: ((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002)) Rows Removed by Filter: 72201 Buffers: shared hit=1185 -> Index Scan using customer_address_pkey on customer_address ca (cost=0.29..0.34 rows=1 width=4) (actual time=0.009..0.009 rows=0.90 loops=6409) Index Cond: (ca_address_sk = c.c_current_addr_sk) Filter: ((ca_county)::text = ANY ('{"Rush County","Toole County","Jefferson County","Dona Ana County","La Porte County"}'::text[])) Rows Removed by Filter: 0 Index Searches: 6409 Buffers: shared hit=19227 Planning: Buffers: shared hit=316 Planning Time: 5.683 ms Execution Time: 41263.299 ms (98 rows) ==== # 2. Setting: enable_hashagg = OFF, enable_groupagg = ON # GroupAgg was used # Execution time was 44 sec ==== QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=161514.62..7793138.25 rows=100 width=66) (actual time=10197.188..44152.766 rows=100.00 loops=1) Buffers: shared hit=118758 read=27882, temp read=1482 written=11040 -> GroupAggregate (cost=161514.62..1689497722.34 rows=22136 width=66) (actual time=10197.187..44152.676 rows=100.00 loops=1) Group Key: cd.cd_gender, cd.cd_marital_status, cd.cd_education_status, cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count Buffers: shared hit=118758 read=27882, temp read=1482 written=11040 -> Nested Loop (cost=161514.62..1689497113.60 rows=22136 width=34) (actual time=10129.162..44151.929 rows=101.00 loops=1) Join Filter: (cd.cd_demo_sk = c.c_current_cdemo_sk) Rows Removed by Join Filter: 197184348 Buffers: shared hit=118758 read=27882, temp read=1482 written=11040 -> Gather Merge (cost=125190.47..348899.28 rows=1920800 width=38) (actual time=8741.033..8900.320 rows=34276.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=5469 read=10449, temp read=1482 written=11040 -> Sort (cost=124190.45..126191.28 rows=800333 width=38) (actual time=8499.285..8571.085 rows=11989.67 loops=3) Sort Key: cd.cd_gender, cd.cd_marital_status, cd.cd_education_status, cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count Sort Method: external merge Disk: 30368kB Buffers: shared hit=5469 read=10449, temp read=1482 written=11040 Worker 0: Sort Method: external merge Disk: 29104kB Worker 1: Sort Method: external merge Disk: 28560kB -> Parallel Seq Scan on customer_demographics cd (cost=0.00..23831.33 rows=800333 width=38) (actual time=2.220..766.199 rows=640266.67 loops=3) Buffers: shared hit=5379 read=10449 -> Materialize (cost=36324.15..1051365837.65 rows=22136 width=4) (actual time=0.032..0.363 rows=5752.84 loops=34276) Storage: Memory Maximum Storage: 289kB Buffers: shared hit=113289 read=17433 -> Nested Loop (cost=36324.15..1051365726.97 rows=22136 width=4) (actual time=1084.423..1391.426 rows=5753.00 loops=1) Buffers: shared hit=113289 read=17433 -> Nested Loop (cost=36323.86..1051357424.51 rows=24605 width=8) (actual time=1084.375..1346.642 rows=6409.00 loops=1) Buffers: shared hit=94062 read=17433 -> Unique (cost=36323.57..36487.60 rows=32807 width=4) (actual time=545.023..574.883 rows=28544.00 loops=1) Buffers: shared hit=2031 read=11900 -> Sort (cost=36323.57..36405.58 rows=32807 width=4) (actual time=545.019..556.733 rows=33609.00 loops=1) Sort Key: ss.ss_customer_sk Sort Method: quicksort Memory: 1537kB Buffers: shared hit=2031 read=11900 -> Gather (cost=2683.76..33862.76 rows=32807 width=4) (actual time=33.440..529.429 rows=33609.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=2031 read=11900 -> Hash Join (cost=1683.76..29582.06 rows=13670 width=4) (actual time=37.334..511.304 rows=11203.00 loops=3) Hash Cond: (ss.ss_sold_date_sk = d.d_date_sk) Buffers: shared hit=2031 read=11900 -> Parallel Seq Scan on store_sales ss (cost=0.00..24747.68 rows=1200168 width=8) (actual time=4.599..258.454 rows=960134.67 loops=3) Buffers: shared hit=846 read=11900 -> Hash (cost=1673.36..1673.36 rows=832 width=4) (actual time=32.247..32.248 rows=848.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 38kB Buffers: shared hit=1185 -> Seq Scan on date_dim d (cost=0.00..1673.36 rows=832 width=4) (actual time=0.024..31.303 rows=848.00 loops=3) Filter: ((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002)) Rows Removed by Filter: 72201 Buffers: shared hit=1185 -> Index Scan using customer_pkey on customer c (cost=0.29..36766.33 rows=1 width=12) (actual time=0.026..0.026 rows=0.22 loops=28544) Index Cond: (c_customer_sk = ss.ss_customer_sk) Filter: ((ANY (c_customer_sk = (hashed SubPlan 2).col1)) OR (ANY (c_customer_sk = (hashed SubPlan 4).col1))) Rows Removed by Filter: 1 Index Searches: 28544 Buffers: shared hit=92031 read=5533 SubPlan 2 -> Gather (cost=2683.76..10471.46 rows=8194 width=4) (actual time=28.354..100.442 rows=8156.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=4369 -> Hash Join (cost=1683.76..8652.06 rows=3414 width=4) (actual time=15.023..74.578 rows=2718.67 loops=3) Hash Cond: (ws.ws_sold_date_sk = d_1.d_date_sk) Buffers: shared hit=4369 -> Parallel Seq Scan on web_sales ws (cost=0.00..6181.43 rows=299743 width=8) (actual time=0.007..19.459 rows=239794.67 loops=3) Buffers: shared hit=3184 -> Hash (cost=1673.36..1673.36 rows=832 width=4) (actual time=14.987..14.987 rows=848.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 38kB Buffers: shared hit=1185 -> Seq Scan on date_dim d_1 (cost=0.00..1673.36 rows=832 width=4) (actual time=0.016..14.672 rows=848.00 loops=3) Filter: ((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002)) Rows Removed by Filter: 72201 Buffers: shared hit=1185 SubPlan 4 -> Gather (cost=2683.76..18287.89 rows=16419 width=4) (actual time=69.799..443.128 rows=16874.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=2031 read=5533 -> Hash Join (cost=1683.76..15645.99 rows=6841 width=4) (actual time=66.236..401.988 rows=5624.67 loops=3) Hash Cond: (cs.cs_sold_date_sk = d_2.d_date_sk) Buffers: shared hit=2031 read=5533 -> Parallel Seq Scan on catalog_sales cs (cost=0.00..12385.45 rows=600645 width=8) (actual time=10.580..208.806 rows=480516.00 loops=3) Buffers: shared hit=846 read=5533 -> Hash (cost=1673.36..1673.36 rows=832 width=4) (actual time=55.629..55.630 rows=848.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 38kB Buffers: shared hit=1185 -> Seq Scan on date_dim d_2 (cost=0.00..1673.36 rows=832 width=4) (actual time=0.021..53.491 rows=848.00 loops=3) Filter: ((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002)) Rows Removed by Filter: 72201 Buffers: shared hit=1185 -> Index Scan using customer_address_pkey on customer_address ca (cost=0.29..0.34 rows=1 width=4) (actual time=0.006..0.006 rows=0.90 loops=6409) Index Cond: (ca_address_sk = c.c_current_addr_sk) Filter: ((ca_county)::text = ANY ('{"Rush County","Toole County","Jefferson County","Dona Ana County","La Porte County"}'::text[])) Rows Removed by Filter: 0 Index Searches: 6409 Buffers: shared hit=19227 Settings: enable_hashagg = 'off' Planning: Buffers: shared hit=316 Planning Time: 2.101 ms Execution Time: 44168.266 ms (101 rows) ==== # 3. Setting: enable_hashagg = ON, enable_groupagg = OFF # HashAgg was used # Execution time was 1 sec ==== QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=1051399723.16..1051399723.41 rows=100 width=66) (actual time=1096.948..1097.642 rows=100.00 loops=1) Buffers: shared hit=133811 read=19929 -> Sort (cost=1051399723.16..1051399778.50 rows=22136 width=66) (actual time=1096.946..1097.633 rows=100.00 loops=1) Sort Key: cd.cd_gender, cd.cd_marital_status, cd.cd_education_status, cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count Sort Method: top-N heapsort Memory: 49kB Buffers: shared hit=133811 read=19929 -> HashAggregate (cost=1051398655.78..1051398877.14 rows=22136 width=66) (actual time=1087.378..1090.664 rows=5740.00 loops=1) Group Key: cd.cd_gender, cd.cd_marital_status, cd.cd_education_status, cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count Batches: 1 Memory Usage: 1049kB Buffers: shared hit=133805 read=19929 -> Nested Loop (cost=33945.79..1051398268.40 rows=22136 width=34) (actual time=702.467..1076.172 rows=5753.00 loops=1) Buffers: shared hit=133805 read=19929 -> Nested Loop (cost=33945.36..1051363512.22 rows=22136 width=4) (actual time=702.451..958.299 rows=5753.00 loops=1) Buffers: shared hit=114045 read=16677 -> Nested Loop (cost=33945.07..1051355209.75 rows=24605 width=8) (actual time=702.387..921.978 rows=6409.00 loops=1) Buffers: shared hit=94818 read=16677 -> HashAggregate (cost=33944.78..34272.85 rows=32807 width=4) (actual time=354.993..373.328 rows=28544.00 loops=1) Group Key: ss.ss_customer_sk Batches: 1 Memory Usage: 2329kB Buffers: shared hit=2223 read=11708 -> Gather (cost=2683.76..33862.76 rows=32807 width=4) (actual time=17.988..336.471 rows=33609.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=2223 read=11708 -> Hash Join (cost=1683.76..29582.06 rows=13670 width=4) (actual time=16.964..337.747 rows=11203.00 loops=3) Hash Cond: (ss.ss_sold_date_sk = d.d_date_sk) Buffers: shared hit=2223 read=11708 -> Parallel Seq Scan on store_sales ss (cost=0.00..24747.68 rows=1200168 width=8) (actual time=0.394..157.640 rows=960134.67 loops=3) Buffers: shared hit=1038 read=11708 -> Hash (cost=1673.36..1673.36 rows=832 width=4) (actual time=16.492..16.493 rows=848.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 38kB Buffers: shared hit=1185 -> Seq Scan on date_dim d (cost=0.00..1673.36 rows=832 width=4) (actual time=0.021..15.934 rows=848.00 loops=3) Filter: ((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002)) Rows Removed by Filter: 72201 Buffers: shared hit=1185 -> Index Scan using customer_pkey on customer c (cost=0.29..36766.33 rows=1 width=12) (actual time=0.018..0.018 rows=0.22 loops=28544) Index Cond: (c_customer_sk = ss.ss_customer_sk) Filter: ((ANY (c_customer_sk = (hashed SubPlan 2).col1)) OR (ANY (c_customer_sk = (hashed SubPlan 4).col1))) Rows Removed by Filter: 1 Index Searches: 28544 Buffers: shared hit=92595 read=4969 SubPlan 2 -> Gather (cost=2683.76..10471.46 rows=8194 width=4) (actual time=72.010..153.327 rows=8156.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=4369 -> Hash Join (cost=1683.76..8652.06 rows=3414 width=4) (actual time=44.282..111.007 rows=2718.67 loops=3) Hash Cond: (ws.ws_sold_date_sk = d_1.d_date_sk) Buffers: shared hit=4369 -> Parallel Seq Scan on web_sales ws (cost=0.00..6181.43 rows=299743 width=8) (actual time=0.006..22.145 rows=239794.67 loops=3) Buffers: shared hit=3184 -> Hash (cost=1673.36..1673.36 rows=832 width=4) (actual time=44.238..44.239 rows=848.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 38kB Buffers: shared hit=1185 -> Seq Scan on date_dim d_1 (cost=0.00..1673.36 rows=832 width=4) (actual time=0.015..42.452 rows=848.00 loops=3) Filter: ((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002)) Rows Removed by Filter: 72201 Buffers: shared hit=1185 SubPlan 4 -> Gather (cost=2683.76..18287.89 rows=16419 width=4) (actual time=34.268..183.058 rows=16874.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=2595 read=4969 -> Hash Join (cost=1683.76..15645.99 rows=6841 width=4) (actual time=30.132..174.422 rows=5624.67 loops=3) Hash Cond: (cs.cs_sold_date_sk = d_2.d_date_sk) Buffers: shared hit=2595 read=4969 -> Parallel Seq Scan on catalog_sales cs (cost=0.00..12385.45 rows=600645 width=8) (actual time=3.432..69.160 rows=480516.00 loops=3) Buffers: shared hit=1410 read=4969 -> Hash (cost=1673.36..1673.36 rows=832 width=4) (actual time=26.671..26.672 rows=848.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 38kB Buffers: shared hit=1185 -> Seq Scan on date_dim d_2 (cost=0.00..1673.36 rows=832 width=4) (actual time=0.019..26.377 rows=848.00 loops=3) Filter: ((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002)) Rows Removed by Filter: 72201 Buffers: shared hit=1185 -> Index Scan using customer_address_pkey on customer_address ca (cost=0.29..0.34 rows=1 width=4) (actual time=0.005..0.005 rows=0.90 loops=6409) Index Cond: (ca_address_sk = c.c_current_addr_sk) Filter: ((ca_county)::text = ANY ('{"Rush County","Toole County","Jefferson County","Dona Ana County","La Porte County"}'::text[])) Rows Removed by Filter: 0 Index Searches: 6409 Buffers: shared hit=19227 -> Index Scan using customer_demographics_pkey on customer_demographics cd (cost=0.43..1.57 rows=1 width=38) (actual time=0.019..0.019 rows=1.00 loops=5753) Index Cond: (cd_demo_sk = c.c_current_cdemo_sk) Index Searches: 5753 Buffers: shared hit=19760 read=3252 Settings: enable_groupagg = 'off' Planning: Buffers: shared hit=316 Planning Time: 21.057 ms Execution Time: 1105.672 ms (91 rows) ==== # 4. Setting: enable_hashagg = OFF, enable_groupagg = OFF # GroupAgg was used # Execution time was 38 sec ==== QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=161514.62..7793138.25 rows=100 width=66) (actual time=7611.145..38354.564 rows=100.00 loops=1) Buffers: shared hit=119783 read=26857, temp read=1482 written=11039 -> GroupAggregate (cost=161514.62..1689497722.34 rows=22136 width=66) (actual time=7611.128..38354.435 rows=100.00 loops=1) Group Key: cd.cd_gender, cd.cd_marital_status, cd.cd_education_status, cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count Buffers: shared hit=119783 read=26857, temp read=1482 written=11039 -> Nested Loop (cost=161514.62..1689497113.60 rows=22136 width=34) (actual time=7510.502..38353.114 rows=101.00 loops=1) Join Filter: (cd.cd_demo_sk = c.c_current_cdemo_sk) Rows Removed by Join Filter: 197184348 Buffers: shared hit=119783 read=26857, temp read=1482 written=11039 -> Gather Merge (cost=125190.47..348899.28 rows=1920800 width=38) (actual time=6363.769..6499.733 rows=34276.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=5833 read=10085, temp read=1482 written=11039 -> Sort (cost=124190.45..126191.28 rows=800333 width=38) (actual time=6163.457..6232.614 rows=11966.67 loops=3) Sort Key: cd.cd_gender, cd.cd_marital_status, cd.cd_education_status, cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count Sort Method: external merge Disk: 28992kB Buffers: shared hit=5833 read=10085, temp read=1482 written=11039 Worker 0: Sort Method: external merge Disk: 28752kB Worker 1: Sort Method: external merge Disk: 30280kB -> Parallel Seq Scan on customer_demographics cd (cost=0.00..23831.33 rows=800333 width=38) (actual time=0.737..321.713 rows=640266.67 loops=3) Buffers: shared hit=5743 read=10085 -> Materialize (cost=36324.15..1051365837.65 rows=22136 width=4) (actual time=0.025..0.325 rows=5752.84 loops=34276) Storage: Memory Maximum Storage: 289kB Buffers: shared hit=113950 read=16772 -> Nested Loop (cost=36324.15..1051365726.97 rows=22136 width=4) (actual time=865.332..1133.116 rows=5753.00 loops=1) Buffers: shared hit=113950 read=16772 -> Nested Loop (cost=36323.86..1051357424.51 rows=24605 width=8) (actual time=863.365..1073.375 rows=6409.00 loops=1) Buffers: shared hit=94723 read=16772 -> Unique (cost=36323.57..36487.60 rows=32807 width=4) (actual time=393.964..415.541 rows=28544.00 loops=1) Buffers: shared hit=2128 read=11803 -> Sort (cost=36323.57..36405.58 rows=32807 width=4) (actual time=393.957..402.146 rows=33609.00 loops=1) Sort Key: ss.ss_customer_sk Sort Method: quicksort Memory: 1537kB Buffers: shared hit=2128 read=11803 -> Gather (cost=2683.76..33862.76 rows=32807 width=4) (actual time=80.713..349.762 rows=33609.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=2128 read=11803 -> Hash Join (cost=1683.76..29582.06 rows=13670 width=4) (actual time=44.679..308.141 rows=11203.00 loops=3) Hash Cond: (ss.ss_sold_date_sk = d.d_date_sk) Buffers: shared hit=2128 read=11803 -> Parallel Seq Scan on store_sales ss (cost=0.00..24747.68 rows=1200168 width=8) (actual time=9.033..125.081 rows=960134.67 loops=3) Buffers: shared hit=943 read=11803 -> Hash (cost=1673.36..1673.36 rows=832 width=4) (actual time=35.605..35.606 rows=848.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 38kB Buffers: shared hit=1185 -> Seq Scan on date_dim d (cost=0.00..1673.36 rows=832 width=4) (actual time=0.036..35.172 rows=848.00 loops=3) Filter: ((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002)) Rows Removed by Filter: 72201 Buffers: shared hit=1185 -> Index Scan using customer_pkey on customer c (cost=0.29..36766.33 rows=1 width=12) (actual time=0.022..0.022 rows=0.22 loops=28544) Index Cond: (c_customer_sk = ss.ss_customer_sk) Filter: ((ANY (c_customer_sk = (hashed SubPlan 2).col1)) OR (ANY (c_customer_sk = (hashed SubPlan 4).col1))) Rows Removed by Filter: 1 Index Searches: 28544 Buffers: shared hit=92595 read=4969 SubPlan 2 -> Gather (cost=2683.76..10471.46 rows=8194 width=4) (actual time=39.476..124.540 rows=8156.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=4369 -> Hash Join (cost=1683.76..8652.06 rows=3414 width=4) (actual time=20.456..94.877 rows=2718.67 loops=3) Hash Cond: (ws.ws_sold_date_sk = d_1.d_date_sk) Buffers: shared hit=4369 -> Parallel Seq Scan on web_sales ws (cost=0.00..6181.43 rows=299743 width=8) (actual time=0.020..28.916 rows=239794.67 loops=3) Buffers: shared hit=3184 -> Hash (cost=1673.36..1673.36 rows=832 width=4) (actual time=20.396..20.397 rows=848.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 38kB Buffers: shared hit=1185 -> Seq Scan on date_dim d_1 (cost=0.00..1673.36 rows=832 width=4) (actual time=0.019..20.094 rows=848.00 loops=3) Filter: ((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002)) Rows Removed by Filter: 72201 Buffers: shared hit=1185 SubPlan 4 -> Gather (cost=2683.76..18287.89 rows=16419 width=4) (actual time=56.442..326.649 rows=16874.00 loops=1) Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=2595 read=4969 -> Hash Join (cost=1683.76..15645.99 rows=6841 width=4) (actual time=64.537..301.087 rows=5624.67 loops=3) Hash Cond: (cs.cs_sold_date_sk = d_2.d_date_sk) Buffers: shared hit=2595 read=4969 -> Parallel Seq Scan on catalog_sales cs (cost=0.00..12385.45 rows=600645 width=8) (actual time=11.234..130.365 rows=480516.00 loops=3) Buffers: shared hit=1410 read=4969 -> Hash (cost=1673.36..1673.36 rows=832 width=4) (actual time=53.242..53.243 rows=848.00 loops=3) Buckets: 1024 Batches: 1 Memory Usage: 38kB Buffers: shared hit=1185 -> Seq Scan on date_dim d_2 (cost=0.00..1673.36 rows=832 width=4) (actual time=0.025..52.328 rows=848.00 loops=3) Filter: ((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002)) Rows Removed by Filter: 72201 Buffers: shared hit=1185 -> Index Scan using customer_address_pkey on customer_address ca (cost=0.29..0.34 rows=1 width=4) (actual time=0.009..0.009 rows=0.90 loops=6409) Index Cond: (ca_address_sk = c.c_current_addr_sk) Filter: ((ca_county)::text = ANY ('{"Rush County","Toole County","Jefferson County","Dona Ana County","La Porte County"}'::text[])) Rows Removed by Filter: 0 Index Searches: 6409 Buffers: shared hit=19227 Settings: enable_hashagg = 'off', enable_groupagg = 'off' Planning: Buffers: shared hit=316 Planning Time: 10.736 ms Execution Time: 38358.196 ms (101 rows) ====