Re: [I] [Bug] Optimizer could choose HashAggregate to dedup based on cost. [cloudberry]

via GitHub Wed, 06 Nov 2024 23:01:08 -0800


aeschbacherEUF commented on issue #659:
URL: https://github.com/apache/cloudberry/issues/659#issuecomment-2461465573


   Hi Team,
   
   Eddie provided us with a test RPM package containing the single fix for 
issue #676 (already cherry-picked and built into v1.6.0):
   [Implement 3-phase aggregation with DEDUP HashAgg for DISTINCT. 
#676](https://github.com/apache/cloudberry/pull/676)
   
   We conducted some real-data testing, hoping for better performance with 
optimizer=on (needed for our typical workload) and gp_eager_distinct_dedup=on. 
Unfortunately, the PostgreSQL optimizer still outperforms ORCA by a factor of 
two. We've attached the EXPLAIN ANALYZE output for your reference.
   
[fix_659_distinct_cdb161.txt](https://github.com/user-attachments/files/17657284/fix_659_distinct_cdb161.txt)
   
   
   `set gp_eager_distinct_dedup = on;
   set optimizer = on;
   set optimizer_enable_multiple_distinct_aggs = off;
   explain (analyze, verbose, costs, timing, buffers) select count(distinct 
gsp_id) from eft_com_call_statistic;
   
                                                                           
QUERY PLAN                                                                      
    
   
-------------------------------------------------------------------------------------------------------------------------------------------------------------
    Finalize Aggregate  (cost=0.00..1252.01 rows=1 width=8) (actual 
time=9613.853..9613.854 rows=1 loops=1)
      Output: count(DISTINCT gsp_id)
      ->  Gather Motion 2:1  (slice1; segments: 2)  (cost=0.00..1252.01 rows=1 
width=8) (actual time=9408.391..9613.841 rows=2 loops=1)
            Output: (PARTIAL count(DISTINCT gsp_id))
            ->  Partial Aggregate  (cost=0.00..1252.01 rows=1 width=8) (actual 
time=9612.691..9612.693 rows=1 loops=1)
                  Output: PARTIAL count(DISTINCT gsp_id)
                  ->  Seq Scan on ebi_data.eft_com_call_statistic  
(cost=0.00..1103.70 rows=3958250 width=19) (actual time=0.222..940.759 
rows=3958373 loops=1)
                        Output: gsp_id
    Settings: effective_cache_size = '8GB', optimizer = 'on'
    Planning:
      Buffers: shared hit=2
    Planning Time: 5.153 ms
      (slice0)    Executor memory: 37K bytes.
      (slice1)    Executor memory: 130705K bytes avg x 2x(0) workers, 130705K 
bytes max (seg0).
    Memory used:  131072kB
    Optimizer: Pivotal Optimizer (GPORCA)
    Execution Time: 9614.687 ms
   (17 rows)
   
   Time: 9620,598 ms (00:09,621)`
   
   `set gp_eager_distinct_dedup = on;
   set optimizer = on;
   set optimizer_enable_multiple_distinct_aggs = off;
   explain (analyze, verbose, costs, timing, buffers) WITH cte_gsp_counts AS (
     SELECT DISTINCT gsp_id
     FROM eft_com_call_statistic
   )
   SELECT COUNT(*)
   FROM cte_gsp_counts;
                                                                               
QUERY PLAN                                                                      
       
   
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Finalize Aggregate  (cost=0.00..1721.24 rows=1 width=8) (actual 
time=3438.953..3438.955 rows=1 loops=1)
      Output: count()
      ->  Gather Motion 2:1  (slice1; segments: 2)  (cost=0.00..1721.24 rows=1 
width=8) (actual time=3423.515..3438.943 rows=2 loops=1)
            Output: (PARTIAL count())
            ->  Partial Aggregate  (cost=0.00..1721.24 rows=1 width=8) (actual 
time=3433.644..3433.645 rows=1 loops=1)
                  Output: PARTIAL count()
                  ->  HashAggregate  (cost=0.00..1721.24 rows=3958250 width=1) 
(actual time=1993.244..3310.569 rows=3958365 loops=1)
                        Output: gsp_id
                        Group Key: eft_com_call_statistic.gsp_id
                        Planned Partitions: 4
                        work_mem: 262433kB  Segments: 2  Max: 131217kB (segment 
0)  Workfile: (2 spilling)
                        ->  Seq Scan on ebi_data.eft_com_call_statistic  
(cost=0.00..1103.70 rows=3958250 width=19) (actual time=0.376..915.290 
rows=3958373 loops=1)
                              Output: gsp_id
    Settings: effective_cache_size = '8GB', optimizer = 'on'
    Planning:
      Buffers: shared hit=6
    Planning Time: 4.713 ms
      (slice0)    Executor memory: 24626K bytes.
      (slice1)    Executor memory: 123097K bytes avg x 2x(0) workers, 123117K 
bytes max (seg0).  Work_mem: 131217K bytes max.
    Memory used:  131072kB
    Optimizer: Pivotal Optimizer (GPORCA)
    Execution Time: 3448.772 ms
   (22 rows)
   
   Time: 3454,306 ms (00:03,454)`
   
   Best regards,
   
   Andreas


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [I] [Bug] Optimizer could choose HashAggregate to dedup based on cost. [cloudberry]

Reply via email to