Hi Ashutosh,

Thanks for your reply!

>I first thought enable_hashagg should be sufficient to choose one strategy
over the other. But that is not true, enable_hashagg = true allows both the
strategies, enable_hashagg = false disables just hash strategy. There's no
way to disable group agg alone. So I think it makes sense to have this GUC.

Yes, exactly! Thank you for the clarification.


>I am surprised that we didn't see this being a problem for so long.

I was also wondering why this GUC parameter hadn't been introduced
earlier.


>We seem to disable mixed strategy when enable_hashagg is false. Do we want
to do the same when enable_groupagg = false as well?

As I mentioned in my previous email, setting enable_groupagg = false
resulted in better execution time compared to the mixed strategy (which,
as I understand, includes both HashAgg and GroupAgg). So, I believe
this new GUC parameter would be helpful for users in certain situations.

Here’s how the current and proposed behaviors compare:

Current behavior:
    enable_hashagg = ON  → Mixed
    enable_hashagg = OFF → GroupAgg

After applying the patch:
    enable_hashagg = ON,  enable_groupagg = ON  → Mixed
    enable_hashagg = OFF, enable_groupagg = ON  → GroupAgg
    enable_hashagg = ON,  enable_groupagg = OFF → HashAgg (new)
    enable_hashagg = OFF, enable_groupagg = OFF → GroupAgg

In addition, if the both parameters = OFF, GroupAgg will be selected in
the patch. The reason is that I found a comment that HashAgg might use
too much memory, so I decided to prioritize using GroupAgg, which is
more secure.

In the last case, I chose to default to GroupAgg since I found a
comment suggesting HashAgg might consume excessive memory, so GroupAgg
seemed the safer fallback.

Some hackers may want to compare the actual execution plans and times
under different GUC settings, so I've attached my test results in
"test_result.txt".


>Some of those instances are for plan stability, all of which need not be
replicated. But some of them explicitly test sort based grouping. For rest
of them hash based plan seems to be the best one, so explicit
enable_groupagg = false is not needed. We will need some test to test the
switch though.

Thanks for your advice. I'll create a regression test and send a new patch
to -hackers in my next email.

Regards,
Tatsuro Yamada
# 1. Setting: default (enable_hashagg = ON,  enable_groupagg = ON)
# Mix strategy was used
# Execution time was 41 sec
====
                                                                                
          QUERY PLAN                                                            
                              
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=159135.83..7790760.21 rows=100 width=66) (actual 
time=12449.124..41253.731 rows=100.00 loops=1)
   Buffers: shared hit=116473 read=30167, temp read=1482 written=11035
   ->  GroupAggregate  (cost=159135.83..1689495507.58 rows=22136 width=66) 
(actual time=12449.121..41253.648 rows=100.00 loops=1)
         Group Key: cd.cd_gender, cd.cd_marital_status, cd.cd_education_status, 
cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count
         Buffers: shared hit=116473 read=30167, temp read=1482 written=11035
         ->  Nested Loop  (cost=159135.83..1689494898.84 rows=22136 width=34) 
(actual time=12317.664..41252.393 rows=101.00 loops=1)
               Join Filter: (cd.cd_demo_sk = c.c_current_cdemo_sk)
               Rows Removed by Join Filter: 197188565
               Buffers: shared hit=116473 read=30167, temp read=1482 
written=11035
               ->  Gather Merge  (cost=125190.47..348899.28 rows=1920800 
width=38) (actual time=10122.049..10269.869 rows=34276.00 loops=1)
                     Workers Planned: 2
                     Workers Launched: 2
                     Buffers: shared hit=4889 read=11029, temp read=1482 
written=11035
                     ->  Sort  (cost=124190.45..126191.28 rows=800333 width=38) 
(actual time=9985.641..10044.083 rows=11982.00 loops=3)
                           Sort Key: cd.cd_gender, cd.cd_marital_status, 
cd.cd_education_status, cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count
                           Sort Method: external merge  Disk: 28488kB
                           Buffers: shared hit=4889 read=11029, temp read=1482 
written=11035
                           Worker 0:  Sort Method: external merge  Disk: 29392kB
                           Worker 1:  Sort Method: external merge  Disk: 30112kB
                           ->  Parallel Seq Scan on customer_demographics cd  
(cost=0.00..23831.33 rows=800333 width=38) (actual time=0.377..852.678 
rows=640266.67 loops=3)
                                 Buffers: shared hit=4799 read=11029
               ->  Materialize  (cost=33945.36..1051363622.90 rows=22136 
width=4) (actual time=0.052..0.337 rows=5752.97 loops=34276)
                     Storage: Memory  Maximum Storage: 289kB
                     Buffers: shared hit=111584 read=19138
                     ->  Nested Loop  (cost=33945.36..1051363512.22 rows=22136 
width=4) (actual time=1785.640..2183.175 rows=5753.00 loops=1)
                           Buffers: shared hit=111584 read=19138
                           ->  Nested Loop  (cost=33945.07..1051355209.75 
rows=24605 width=8) (actual time=1785.567..2119.781 rows=6409.00 loops=1)
                                 Buffers: shared hit=92357 read=19138
                                 ->  HashAggregate  (cost=33944.78..34272.85 
rows=32807 width=4) (actual time=819.215..842.604 rows=28544.00 loops=1)
                                       Group Key: ss.ss_customer_sk
                                       Batches: 1  Memory Usage: 2329kB
                                       Buffers: shared hit=2187 read=11744
                                       ->  Gather  (cost=2683.76..33862.76 
rows=32807 width=4) (actual time=23.253..768.211 rows=33609.00 loops=1)
                                             Workers Planned: 2
                                             Workers Launched: 2
                                             Buffers: shared hit=2187 read=11744
                                             ->  Hash Join  
(cost=1683.76..29582.06 rows=13670 width=4) (actual time=22.719..780.073 
rows=11203.00 loops=3)
                                                   Hash Cond: 
(ss.ss_sold_date_sk = d.d_date_sk)
                                                   Buffers: shared hit=2187 
read=11744
                                                   ->  Parallel Seq Scan on 
store_sales ss  (cost=0.00..24747.68 rows=1200168 width=8) (actual 
time=5.144..424.772 rows=960134.67 loops=3)
                                                         Buffers: shared 
hit=1002 read=11744
                                                   ->  Hash  
(cost=1673.36..1673.36 rows=832 width=4) (actual time=17.501..17.502 
rows=848.00 loops=3)
                                                         Buckets: 1024  
Batches: 1  Memory Usage: 38kB
                                                         Buffers: shared 
hit=1185
                                                         ->  Seq Scan on 
date_dim d  (cost=0.00..1673.36 rows=832 width=4) (actual time=0.024..17.253 
rows=848.00 loops=3)
                                                               Filter: ((d_moy 
>= 1) AND (d_moy <= 4) AND (d_year = 2002))
                                                               Rows Removed by 
Filter: 72201
                                                               Buffers: shared 
hit=1185
                                 ->  Index Scan using customer_pkey on customer 
c  (cost=0.29..36766.33 rows=1 width=12) (actual time=0.044..0.044 rows=0.22 
loops=28544)
                                       Index Cond: (c_customer_sk = 
ss.ss_customer_sk)
                                       Filter: ((ANY (c_customer_sk = (hashed 
SubPlan 2).col1)) OR (ANY (c_customer_sk = (hashed SubPlan 4).col1)))
                                       Rows Removed by Filter: 1
                                       Index Searches: 28544
                                       Buffers: shared hit=90170 read=7394
                                       SubPlan 2
                                         ->  Gather  (cost=2683.76..10471.46 
rows=8194 width=4) (actual time=18.769..103.477 rows=8156.00 loops=1)
                                               Workers Planned: 2
                                               Workers Launched: 2
                                               Buffers: shared hit=2496 
read=1873
                                               ->  Hash Join  
(cost=1683.76..8652.06 rows=3414 width=4) (actual time=27.114..97.488 
rows=2718.67 loops=3)
                                                     Hash Cond: 
(ws.ws_sold_date_sk = d_1.d_date_sk)
                                                     Buffers: shared hit=2496 
read=1873
                                                     ->  Parallel Seq Scan on 
web_sales ws  (cost=0.00..6181.43 rows=299743 width=8) (actual 
time=0.005..26.446 rows=239794.67 loops=3)
                                                           Buffers: shared 
hit=1311 read=1873
                                                     ->  Hash  
(cost=1673.36..1673.36 rows=832 width=4) (actual time=27.079..27.080 
rows=848.00 loops=3)
                                                           Buckets: 1024  
Batches: 1  Memory Usage: 38kB
                                                           Buffers: shared 
hit=1185
                                                           ->  Seq Scan on 
date_dim d_1  (cost=0.00..1673.36 rows=832 width=4) (actual time=0.033..26.791 
rows=848.00 loops=3)
                                                                 Filter: 
((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002))
                                                                 Rows Removed 
by Filter: 72201
                                                                 Buffers: 
shared hit=1185
                                       SubPlan 4
                                         ->  Gather  (cost=2683.76..18287.89 
rows=16419 width=4) (actual time=11.472..831.348 rows=16874.00 loops=1)
                                               Workers Planned: 2
                                               Workers Launched: 2
                                               Buffers: shared hit=2043 
read=5521
                                               ->  Hash Join  
(cost=1683.76..15645.99 rows=6841 width=4) (actual time=18.591..824.234 
rows=5624.67 loops=3)
                                                     Hash Cond: 
(cs.cs_sold_date_sk = d_2.d_date_sk)
                                                     Buffers: shared hit=2043 
read=5521
                                                     ->  Parallel Seq Scan on 
catalog_sales cs  (cost=0.00..12385.45 rows=600645 width=8) (actual 
time=0.334..455.274 rows=480516.00 loops=3)
                                                           Buffers: shared 
hit=858 read=5521
                                                     ->  Hash  
(cost=1673.36..1673.36 rows=832 width=4) (actual time=18.205..18.206 
rows=848.00 loops=3)
                                                           Buckets: 1024  
Batches: 1  Memory Usage: 38kB
                                                           Buffers: shared 
hit=1185
                                                           ->  Seq Scan on 
date_dim d_2  (cost=0.00..1673.36 rows=832 width=4) (actual time=0.048..17.917 
rows=848.00 loops=3)
                                                                 Filter: 
((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002))
                                                                 Rows Removed 
by Filter: 72201
                                                                 Buffers: 
shared hit=1185
                           ->  Index Scan using customer_address_pkey on 
customer_address ca  (cost=0.29..0.34 rows=1 width=4) (actual time=0.009..0.009 
rows=0.90 loops=6409)
                                 Index Cond: (ca_address_sk = 
c.c_current_addr_sk)
                                 Filter: ((ca_county)::text = ANY ('{"Rush 
County","Toole County","Jefferson County","Dona Ana County","La Porte 
County"}'::text[]))
                                 Rows Removed by Filter: 0
                                 Index Searches: 6409
                                 Buffers: shared hit=19227
 Planning:
   Buffers: shared hit=316
 Planning Time: 5.683 ms
 Execution Time: 41263.299 ms
(98 rows)
====



# 2. Setting: enable_hashagg = OFF,  enable_groupagg = ON
# GroupAgg was used
# Execution time was 44 sec
====
                                                                                
          QUERY PLAN                                                            
                                
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=161514.62..7793138.25 rows=100 width=66) (actual 
time=10197.188..44152.766 rows=100.00 loops=1)
   Buffers: shared hit=118758 read=27882, temp read=1482 written=11040
   ->  GroupAggregate  (cost=161514.62..1689497722.34 rows=22136 width=66) 
(actual time=10197.187..44152.676 rows=100.00 loops=1)
         Group Key: cd.cd_gender, cd.cd_marital_status, cd.cd_education_status, 
cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count
         Buffers: shared hit=118758 read=27882, temp read=1482 written=11040
         ->  Nested Loop  (cost=161514.62..1689497113.60 rows=22136 width=34) 
(actual time=10129.162..44151.929 rows=101.00 loops=1)
               Join Filter: (cd.cd_demo_sk = c.c_current_cdemo_sk)
               Rows Removed by Join Filter: 197184348
               Buffers: shared hit=118758 read=27882, temp read=1482 
written=11040
               ->  Gather Merge  (cost=125190.47..348899.28 rows=1920800 
width=38) (actual time=8741.033..8900.320 rows=34276.00 loops=1)
                     Workers Planned: 2
                     Workers Launched: 2
                     Buffers: shared hit=5469 read=10449, temp read=1482 
written=11040
                     ->  Sort  (cost=124190.45..126191.28 rows=800333 width=38) 
(actual time=8499.285..8571.085 rows=11989.67 loops=3)
                           Sort Key: cd.cd_gender, cd.cd_marital_status, 
cd.cd_education_status, cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count
                           Sort Method: external merge  Disk: 30368kB
                           Buffers: shared hit=5469 read=10449, temp read=1482 
written=11040
                           Worker 0:  Sort Method: external merge  Disk: 29104kB
                           Worker 1:  Sort Method: external merge  Disk: 28560kB
                           ->  Parallel Seq Scan on customer_demographics cd  
(cost=0.00..23831.33 rows=800333 width=38) (actual time=2.220..766.199 
rows=640266.67 loops=3)
                                 Buffers: shared hit=5379 read=10449
               ->  Materialize  (cost=36324.15..1051365837.65 rows=22136 
width=4) (actual time=0.032..0.363 rows=5752.84 loops=34276)
                     Storage: Memory  Maximum Storage: 289kB
                     Buffers: shared hit=113289 read=17433
                     ->  Nested Loop  (cost=36324.15..1051365726.97 rows=22136 
width=4) (actual time=1084.423..1391.426 rows=5753.00 loops=1)
                           Buffers: shared hit=113289 read=17433
                           ->  Nested Loop  (cost=36323.86..1051357424.51 
rows=24605 width=8) (actual time=1084.375..1346.642 rows=6409.00 loops=1)
                                 Buffers: shared hit=94062 read=17433
                                 ->  Unique  (cost=36323.57..36487.60 
rows=32807 width=4) (actual time=545.023..574.883 rows=28544.00 loops=1)
                                       Buffers: shared hit=2031 read=11900
                                       ->  Sort  (cost=36323.57..36405.58 
rows=32807 width=4) (actual time=545.019..556.733 rows=33609.00 loops=1)
                                             Sort Key: ss.ss_customer_sk
                                             Sort Method: quicksort  Memory: 
1537kB
                                             Buffers: shared hit=2031 read=11900
                                             ->  Gather  
(cost=2683.76..33862.76 rows=32807 width=4) (actual time=33.440..529.429 
rows=33609.00 loops=1)
                                                   Workers Planned: 2
                                                   Workers Launched: 2
                                                   Buffers: shared hit=2031 
read=11900
                                                   ->  Hash Join  
(cost=1683.76..29582.06 rows=13670 width=4) (actual time=37.334..511.304 
rows=11203.00 loops=3)
                                                         Hash Cond: 
(ss.ss_sold_date_sk = d.d_date_sk)
                                                         Buffers: shared 
hit=2031 read=11900
                                                         ->  Parallel Seq Scan 
on store_sales ss  (cost=0.00..24747.68 rows=1200168 width=8) (actual 
time=4.599..258.454 rows=960134.67 loops=3)
                                                               Buffers: shared 
hit=846 read=11900
                                                         ->  Hash  
(cost=1673.36..1673.36 rows=832 width=4) (actual time=32.247..32.248 
rows=848.00 loops=3)
                                                               Buckets: 1024  
Batches: 1  Memory Usage: 38kB
                                                               Buffers: shared 
hit=1185
                                                               ->  Seq Scan on 
date_dim d  (cost=0.00..1673.36 rows=832 width=4) (actual time=0.024..31.303 
rows=848.00 loops=3)
                                                                     Filter: 
((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002))
                                                                     Rows 
Removed by Filter: 72201
                                                                     Buffers: 
shared hit=1185
                                 ->  Index Scan using customer_pkey on customer 
c  (cost=0.29..36766.33 rows=1 width=12) (actual time=0.026..0.026 rows=0.22 
loops=28544)
                                       Index Cond: (c_customer_sk = 
ss.ss_customer_sk)
                                       Filter: ((ANY (c_customer_sk = (hashed 
SubPlan 2).col1)) OR (ANY (c_customer_sk = (hashed SubPlan 4).col1)))
                                       Rows Removed by Filter: 1
                                       Index Searches: 28544
                                       Buffers: shared hit=92031 read=5533
                                       SubPlan 2
                                         ->  Gather  (cost=2683.76..10471.46 
rows=8194 width=4) (actual time=28.354..100.442 rows=8156.00 loops=1)
                                               Workers Planned: 2
                                               Workers Launched: 2
                                               Buffers: shared hit=4369
                                               ->  Hash Join  
(cost=1683.76..8652.06 rows=3414 width=4) (actual time=15.023..74.578 
rows=2718.67 loops=3)
                                                     Hash Cond: 
(ws.ws_sold_date_sk = d_1.d_date_sk)
                                                     Buffers: shared hit=4369
                                                     ->  Parallel Seq Scan on 
web_sales ws  (cost=0.00..6181.43 rows=299743 width=8) (actual 
time=0.007..19.459 rows=239794.67 loops=3)
                                                           Buffers: shared 
hit=3184
                                                     ->  Hash  
(cost=1673.36..1673.36 rows=832 width=4) (actual time=14.987..14.987 
rows=848.00 loops=3)
                                                           Buckets: 1024  
Batches: 1  Memory Usage: 38kB
                                                           Buffers: shared 
hit=1185
                                                           ->  Seq Scan on 
date_dim d_1  (cost=0.00..1673.36 rows=832 width=4) (actual time=0.016..14.672 
rows=848.00 loops=3)
                                                                 Filter: 
((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002))
                                                                 Rows Removed 
by Filter: 72201
                                                                 Buffers: 
shared hit=1185
                                       SubPlan 4
                                         ->  Gather  (cost=2683.76..18287.89 
rows=16419 width=4) (actual time=69.799..443.128 rows=16874.00 loops=1)
                                               Workers Planned: 2
                                               Workers Launched: 2
                                               Buffers: shared hit=2031 
read=5533
                                               ->  Hash Join  
(cost=1683.76..15645.99 rows=6841 width=4) (actual time=66.236..401.988 
rows=5624.67 loops=3)
                                                     Hash Cond: 
(cs.cs_sold_date_sk = d_2.d_date_sk)
                                                     Buffers: shared hit=2031 
read=5533
                                                     ->  Parallel Seq Scan on 
catalog_sales cs  (cost=0.00..12385.45 rows=600645 width=8) (actual 
time=10.580..208.806 rows=480516.00 loops=3)
                                                           Buffers: shared 
hit=846 read=5533
                                                     ->  Hash  
(cost=1673.36..1673.36 rows=832 width=4) (actual time=55.629..55.630 
rows=848.00 loops=3)
                                                           Buckets: 1024  
Batches: 1  Memory Usage: 38kB
                                                           Buffers: shared 
hit=1185
                                                           ->  Seq Scan on 
date_dim d_2  (cost=0.00..1673.36 rows=832 width=4) (actual time=0.021..53.491 
rows=848.00 loops=3)
                                                                 Filter: 
((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002))
                                                                 Rows Removed 
by Filter: 72201
                                                                 Buffers: 
shared hit=1185
                           ->  Index Scan using customer_address_pkey on 
customer_address ca  (cost=0.29..0.34 rows=1 width=4) (actual time=0.006..0.006 
rows=0.90 loops=6409)
                                 Index Cond: (ca_address_sk = 
c.c_current_addr_sk)
                                 Filter: ((ca_county)::text = ANY ('{"Rush 
County","Toole County","Jefferson County","Dona Ana County","La Porte 
County"}'::text[]))
                                 Rows Removed by Filter: 0
                                 Index Searches: 6409
                                 Buffers: shared hit=19227
 Settings: enable_hashagg = 'off'
 Planning:
   Buffers: shared hit=316
 Planning Time: 2.101 ms
 Execution Time: 44168.266 ms
(101 rows)
====



# 3. Setting: enable_hashagg = ON,  enable_groupagg = OFF
# HashAgg was used
# Execution time was 1 sec
====
                                                                                
         QUERY PLAN                                                             
                             
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1051399723.16..1051399723.41 rows=100 width=66) (actual 
time=1096.948..1097.642 rows=100.00 loops=1)
   Buffers: shared hit=133811 read=19929
   ->  Sort  (cost=1051399723.16..1051399778.50 rows=22136 width=66) (actual 
time=1096.946..1097.633 rows=100.00 loops=1)
         Sort Key: cd.cd_gender, cd.cd_marital_status, cd.cd_education_status, 
cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count
         Sort Method: top-N heapsort  Memory: 49kB
         Buffers: shared hit=133811 read=19929
         ->  HashAggregate  (cost=1051398655.78..1051398877.14 rows=22136 
width=66) (actual time=1087.378..1090.664 rows=5740.00 loops=1)
               Group Key: cd.cd_gender, cd.cd_marital_status, 
cd.cd_education_status, cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count
               Batches: 1  Memory Usage: 1049kB
               Buffers: shared hit=133805 read=19929
               ->  Nested Loop  (cost=33945.79..1051398268.40 rows=22136 
width=34) (actual time=702.467..1076.172 rows=5753.00 loops=1)
                     Buffers: shared hit=133805 read=19929
                     ->  Nested Loop  (cost=33945.36..1051363512.22 rows=22136 
width=4) (actual time=702.451..958.299 rows=5753.00 loops=1)
                           Buffers: shared hit=114045 read=16677
                           ->  Nested Loop  (cost=33945.07..1051355209.75 
rows=24605 width=8) (actual time=702.387..921.978 rows=6409.00 loops=1)
                                 Buffers: shared hit=94818 read=16677
                                 ->  HashAggregate  (cost=33944.78..34272.85 
rows=32807 width=4) (actual time=354.993..373.328 rows=28544.00 loops=1)
                                       Group Key: ss.ss_customer_sk
                                       Batches: 1  Memory Usage: 2329kB
                                       Buffers: shared hit=2223 read=11708
                                       ->  Gather  (cost=2683.76..33862.76 
rows=32807 width=4) (actual time=17.988..336.471 rows=33609.00 loops=1)
                                             Workers Planned: 2
                                             Workers Launched: 2
                                             Buffers: shared hit=2223 read=11708
                                             ->  Hash Join  
(cost=1683.76..29582.06 rows=13670 width=4) (actual time=16.964..337.747 
rows=11203.00 loops=3)
                                                   Hash Cond: 
(ss.ss_sold_date_sk = d.d_date_sk)
                                                   Buffers: shared hit=2223 
read=11708
                                                   ->  Parallel Seq Scan on 
store_sales ss  (cost=0.00..24747.68 rows=1200168 width=8) (actual 
time=0.394..157.640 rows=960134.67 loops=3)
                                                         Buffers: shared 
hit=1038 read=11708
                                                   ->  Hash  
(cost=1673.36..1673.36 rows=832 width=4) (actual time=16.492..16.493 
rows=848.00 loops=3)
                                                         Buckets: 1024  
Batches: 1  Memory Usage: 38kB
                                                         Buffers: shared 
hit=1185
                                                         ->  Seq Scan on 
date_dim d  (cost=0.00..1673.36 rows=832 width=4) (actual time=0.021..15.934 
rows=848.00 loops=3)
                                                               Filter: ((d_moy 
>= 1) AND (d_moy <= 4) AND (d_year = 2002))
                                                               Rows Removed by 
Filter: 72201
                                                               Buffers: shared 
hit=1185
                                 ->  Index Scan using customer_pkey on customer 
c  (cost=0.29..36766.33 rows=1 width=12) (actual time=0.018..0.018 rows=0.22 
loops=28544)
                                       Index Cond: (c_customer_sk = 
ss.ss_customer_sk)
                                       Filter: ((ANY (c_customer_sk = (hashed 
SubPlan 2).col1)) OR (ANY (c_customer_sk = (hashed SubPlan 4).col1)))
                                       Rows Removed by Filter: 1
                                       Index Searches: 28544
                                       Buffers: shared hit=92595 read=4969
                                       SubPlan 2
                                         ->  Gather  (cost=2683.76..10471.46 
rows=8194 width=4) (actual time=72.010..153.327 rows=8156.00 loops=1)
                                               Workers Planned: 2
                                               Workers Launched: 2
                                               Buffers: shared hit=4369
                                               ->  Hash Join  
(cost=1683.76..8652.06 rows=3414 width=4) (actual time=44.282..111.007 
rows=2718.67 loops=3)
                                                     Hash Cond: 
(ws.ws_sold_date_sk = d_1.d_date_sk)
                                                     Buffers: shared hit=4369
                                                     ->  Parallel Seq Scan on 
web_sales ws  (cost=0.00..6181.43 rows=299743 width=8) (actual 
time=0.006..22.145 rows=239794.67 loops=3)
                                                           Buffers: shared 
hit=3184
                                                     ->  Hash  
(cost=1673.36..1673.36 rows=832 width=4) (actual time=44.238..44.239 
rows=848.00 loops=3)
                                                           Buckets: 1024  
Batches: 1  Memory Usage: 38kB
                                                           Buffers: shared 
hit=1185
                                                           ->  Seq Scan on 
date_dim d_1  (cost=0.00..1673.36 rows=832 width=4) (actual time=0.015..42.452 
rows=848.00 loops=3)
                                                                 Filter: 
((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002))
                                                                 Rows Removed 
by Filter: 72201
                                                                 Buffers: 
shared hit=1185
                                       SubPlan 4
                                         ->  Gather  (cost=2683.76..18287.89 
rows=16419 width=4) (actual time=34.268..183.058 rows=16874.00 loops=1)
                                               Workers Planned: 2
                                               Workers Launched: 2
                                               Buffers: shared hit=2595 
read=4969
                                               ->  Hash Join  
(cost=1683.76..15645.99 rows=6841 width=4) (actual time=30.132..174.422 
rows=5624.67 loops=3)
                                                     Hash Cond: 
(cs.cs_sold_date_sk = d_2.d_date_sk)
                                                     Buffers: shared hit=2595 
read=4969
                                                     ->  Parallel Seq Scan on 
catalog_sales cs  (cost=0.00..12385.45 rows=600645 width=8) (actual 
time=3.432..69.160 rows=480516.00 loops=3)
                                                           Buffers: shared 
hit=1410 read=4969
                                                     ->  Hash  
(cost=1673.36..1673.36 rows=832 width=4) (actual time=26.671..26.672 
rows=848.00 loops=3)
                                                           Buckets: 1024  
Batches: 1  Memory Usage: 38kB
                                                           Buffers: shared 
hit=1185
                                                           ->  Seq Scan on 
date_dim d_2  (cost=0.00..1673.36 rows=832 width=4) (actual time=0.019..26.377 
rows=848.00 loops=3)
                                                                 Filter: 
((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002))
                                                                 Rows Removed 
by Filter: 72201
                                                                 Buffers: 
shared hit=1185
                           ->  Index Scan using customer_address_pkey on 
customer_address ca  (cost=0.29..0.34 rows=1 width=4) (actual time=0.005..0.005 
rows=0.90 loops=6409)
                                 Index Cond: (ca_address_sk = 
c.c_current_addr_sk)
                                 Filter: ((ca_county)::text = ANY ('{"Rush 
County","Toole County","Jefferson County","Dona Ana County","La Porte 
County"}'::text[]))
                                 Rows Removed by Filter: 0
                                 Index Searches: 6409
                                 Buffers: shared hit=19227
                     ->  Index Scan using customer_demographics_pkey on 
customer_demographics cd  (cost=0.43..1.57 rows=1 width=38) (actual 
time=0.019..0.019 rows=1.00 loops=5753)
                           Index Cond: (cd_demo_sk = c.c_current_cdemo_sk)
                           Index Searches: 5753
                           Buffers: shared hit=19760 read=3252
 Settings: enable_groupagg = 'off'
 Planning:
   Buffers: shared hit=316
 Planning Time: 21.057 ms
 Execution Time: 1105.672 ms
(91 rows)
====



# 4. Setting: enable_hashagg = OFF,  enable_groupagg = OFF
# GroupAgg was used
# Execution time was 38 sec
====
                                                                                
           QUERY PLAN                                                           
                                 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=161514.62..7793138.25 rows=100 width=66) (actual 
time=7611.145..38354.564 rows=100.00 loops=1)
   Buffers: shared hit=119783 read=26857, temp read=1482 written=11039
   ->  GroupAggregate  (cost=161514.62..1689497722.34 rows=22136 width=66) 
(actual time=7611.128..38354.435 rows=100.00 loops=1)
         Group Key: cd.cd_gender, cd.cd_marital_status, cd.cd_education_status, 
cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count
         Buffers: shared hit=119783 read=26857, temp read=1482 written=11039
         ->  Nested Loop  (cost=161514.62..1689497113.60 rows=22136 width=34) 
(actual time=7510.502..38353.114 rows=101.00 loops=1)
               Join Filter: (cd.cd_demo_sk = c.c_current_cdemo_sk)
               Rows Removed by Join Filter: 197184348
               Buffers: shared hit=119783 read=26857, temp read=1482 
written=11039
               ->  Gather Merge  (cost=125190.47..348899.28 rows=1920800 
width=38) (actual time=6363.769..6499.733 rows=34276.00 loops=1)
                     Workers Planned: 2
                     Workers Launched: 2
                     Buffers: shared hit=5833 read=10085, temp read=1482 
written=11039
                     ->  Sort  (cost=124190.45..126191.28 rows=800333 width=38) 
(actual time=6163.457..6232.614 rows=11966.67 loops=3)
                           Sort Key: cd.cd_gender, cd.cd_marital_status, 
cd.cd_education_status, cd.cd_income_band, cd.cd_credit_rating, cd.cd_dep_count
                           Sort Method: external merge  Disk: 28992kB
                           Buffers: shared hit=5833 read=10085, temp read=1482 
written=11039
                           Worker 0:  Sort Method: external merge  Disk: 28752kB
                           Worker 1:  Sort Method: external merge  Disk: 30280kB
                           ->  Parallel Seq Scan on customer_demographics cd  
(cost=0.00..23831.33 rows=800333 width=38) (actual time=0.737..321.713 
rows=640266.67 loops=3)
                                 Buffers: shared hit=5743 read=10085
               ->  Materialize  (cost=36324.15..1051365837.65 rows=22136 
width=4) (actual time=0.025..0.325 rows=5752.84 loops=34276)
                     Storage: Memory  Maximum Storage: 289kB
                     Buffers: shared hit=113950 read=16772
                     ->  Nested Loop  (cost=36324.15..1051365726.97 rows=22136 
width=4) (actual time=865.332..1133.116 rows=5753.00 loops=1)
                           Buffers: shared hit=113950 read=16772
                           ->  Nested Loop  (cost=36323.86..1051357424.51 
rows=24605 width=8) (actual time=863.365..1073.375 rows=6409.00 loops=1)
                                 Buffers: shared hit=94723 read=16772
                                 ->  Unique  (cost=36323.57..36487.60 
rows=32807 width=4) (actual time=393.964..415.541 rows=28544.00 loops=1)
                                       Buffers: shared hit=2128 read=11803
                                       ->  Sort  (cost=36323.57..36405.58 
rows=32807 width=4) (actual time=393.957..402.146 rows=33609.00 loops=1)
                                             Sort Key: ss.ss_customer_sk
                                             Sort Method: quicksort  Memory: 
1537kB
                                             Buffers: shared hit=2128 read=11803
                                             ->  Gather  
(cost=2683.76..33862.76 rows=32807 width=4) (actual time=80.713..349.762 
rows=33609.00 loops=1)
                                                   Workers Planned: 2
                                                   Workers Launched: 2
                                                   Buffers: shared hit=2128 
read=11803
                                                   ->  Hash Join  
(cost=1683.76..29582.06 rows=13670 width=4) (actual time=44.679..308.141 
rows=11203.00 loops=3)
                                                         Hash Cond: 
(ss.ss_sold_date_sk = d.d_date_sk)
                                                         Buffers: shared 
hit=2128 read=11803
                                                         ->  Parallel Seq Scan 
on store_sales ss  (cost=0.00..24747.68 rows=1200168 width=8) (actual 
time=9.033..125.081 rows=960134.67 loops=3)
                                                               Buffers: shared 
hit=943 read=11803
                                                         ->  Hash  
(cost=1673.36..1673.36 rows=832 width=4) (actual time=35.605..35.606 
rows=848.00 loops=3)
                                                               Buckets: 1024  
Batches: 1  Memory Usage: 38kB
                                                               Buffers: shared 
hit=1185
                                                               ->  Seq Scan on 
date_dim d  (cost=0.00..1673.36 rows=832 width=4) (actual time=0.036..35.172 
rows=848.00 loops=3)
                                                                     Filter: 
((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002))
                                                                     Rows 
Removed by Filter: 72201
                                                                     Buffers: 
shared hit=1185
                                 ->  Index Scan using customer_pkey on customer 
c  (cost=0.29..36766.33 rows=1 width=12) (actual time=0.022..0.022 rows=0.22 
loops=28544)
                                       Index Cond: (c_customer_sk = 
ss.ss_customer_sk)
                                       Filter: ((ANY (c_customer_sk = (hashed 
SubPlan 2).col1)) OR (ANY (c_customer_sk = (hashed SubPlan 4).col1)))
                                       Rows Removed by Filter: 1
                                       Index Searches: 28544
                                       Buffers: shared hit=92595 read=4969
                                       SubPlan 2
                                         ->  Gather  (cost=2683.76..10471.46 
rows=8194 width=4) (actual time=39.476..124.540 rows=8156.00 loops=1)
                                               Workers Planned: 2
                                               Workers Launched: 2
                                               Buffers: shared hit=4369
                                               ->  Hash Join  
(cost=1683.76..8652.06 rows=3414 width=4) (actual time=20.456..94.877 
rows=2718.67 loops=3)
                                                     Hash Cond: 
(ws.ws_sold_date_sk = d_1.d_date_sk)
                                                     Buffers: shared hit=4369
                                                     ->  Parallel Seq Scan on 
web_sales ws  (cost=0.00..6181.43 rows=299743 width=8) (actual 
time=0.020..28.916 rows=239794.67 loops=3)
                                                           Buffers: shared 
hit=3184
                                                     ->  Hash  
(cost=1673.36..1673.36 rows=832 width=4) (actual time=20.396..20.397 
rows=848.00 loops=3)
                                                           Buckets: 1024  
Batches: 1  Memory Usage: 38kB
                                                           Buffers: shared 
hit=1185
                                                           ->  Seq Scan on 
date_dim d_1  (cost=0.00..1673.36 rows=832 width=4) (actual time=0.019..20.094 
rows=848.00 loops=3)
                                                                 Filter: 
((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002))
                                                                 Rows Removed 
by Filter: 72201
                                                                 Buffers: 
shared hit=1185
                                       SubPlan 4
                                         ->  Gather  (cost=2683.76..18287.89 
rows=16419 width=4) (actual time=56.442..326.649 rows=16874.00 loops=1)
                                               Workers Planned: 2
                                               Workers Launched: 2
                                               Buffers: shared hit=2595 
read=4969
                                               ->  Hash Join  
(cost=1683.76..15645.99 rows=6841 width=4) (actual time=64.537..301.087 
rows=5624.67 loops=3)
                                                     Hash Cond: 
(cs.cs_sold_date_sk = d_2.d_date_sk)
                                                     Buffers: shared hit=2595 
read=4969
                                                     ->  Parallel Seq Scan on 
catalog_sales cs  (cost=0.00..12385.45 rows=600645 width=8) (actual 
time=11.234..130.365 rows=480516.00 loops=3)
                                                           Buffers: shared 
hit=1410 read=4969
                                                     ->  Hash  
(cost=1673.36..1673.36 rows=832 width=4) (actual time=53.242..53.243 
rows=848.00 loops=3)
                                                           Buckets: 1024  
Batches: 1  Memory Usage: 38kB
                                                           Buffers: shared 
hit=1185
                                                           ->  Seq Scan on 
date_dim d_2  (cost=0.00..1673.36 rows=832 width=4) (actual time=0.025..52.328 
rows=848.00 loops=3)
                                                                 Filter: 
((d_moy >= 1) AND (d_moy <= 4) AND (d_year = 2002))
                                                                 Rows Removed 
by Filter: 72201
                                                                 Buffers: 
shared hit=1185
                           ->  Index Scan using customer_address_pkey on 
customer_address ca  (cost=0.29..0.34 rows=1 width=4) (actual time=0.009..0.009 
rows=0.90 loops=6409)
                                 Index Cond: (ca_address_sk = 
c.c_current_addr_sk)
                                 Filter: ((ca_county)::text = ANY ('{"Rush 
County","Toole County","Jefferson County","Dona Ana County","La Porte 
County"}'::text[]))
                                 Rows Removed by Filter: 0
                                 Index Searches: 6409
                                 Buffers: shared hit=19227
 Settings: enable_hashagg = 'off', enable_groupagg = 'off'
 Planning:
   Buffers: shared hit=316
 Planning Time: 10.736 ms
 Execution Time: 38358.196 ms
(101 rows)
====

Reply via email to