Hi,
I have two similar queries that calculate "group by" summaries over a huge 
table (74.6mil rows).
The only difference between two queries is the number of columns that group by 
is performed on.
This difference is causing two different plans which are vary so very much in 
performance.
Postgres is 8.4.4. on Linux 64bit. Work_mem is 4GB for both queries and 
effective_cache_size = 30GB (server has 72GB RAM).
Both queries are 100% time on CPU (data is all in buffer cache or OS cache).
My questions are:

1)      Is there a way to force plan that uses hashaggregate for the second 
query?

2)      I am not trying to achieve any particular execution time for the query, 
but I noticed that when "disk sort" kicks in  (and that happens eventually once 
the dataset is large enough) the query drastically slows down, even if there is 
no physical IO going on. I wonder if it's possible to have predictable 
performance rather than sudden drop.

3)      Why hashAggregate plan uses so much less memory (work_mem) than the 
plan with groupAggregate/sort? HashAggregate plan for Query1 works even with 
work_mem='2GB'; The second plan decides to use disk sort even with 
work_mem='4GB'. Why sort is so memory greedy? Are there any plans to address 
the sorting memory efficiency issues?

Thank you!

Query1:
explain analyze
smslocate_edw-#   SELECT
smslocate_edw-#     month_code,
smslocate_edw-#     short_code,
smslocate_edw-#     gateway_carrier_id,
smslocate_edw-#     mp_code,
smslocate_edw-#     partner_id,
smslocate_edw-#     master_company_id,
smslocate_edw-#     ad_id,
smslocate_edw-#     sc_name_id,
smslocate_edw-#     sc_sports_league_id,
smslocate_edw-#     sc_sports_alert_type,
smslocate_edw-#     al_widget_id,
smslocate_edw-#     keyword_id,
smslocate_edw-#     cp_id,
smslocate_edw-#     sum(coalesce(message_count,0)),          -- message_cnt
smslocate_edw-#     sum(coalesce(message_sellable_count,0)), -- 
message_sellable_cnt
smslocate_edw-#     sum(coalesce(ad_cost_sum,0)),            -- ad_cost_sum
smslocate_edw-#     NULL::int4, --count(distinct device_number),           -- 
unique_user_cnt
smslocate_edw-#     NULL::int4, --count(distinct case when 
message_sellable_count <> 0 then device_number end), -- unique_user_sellable_cnt
smslocate_edw-#     NULL,                                    -- 
unique_user_first_time_cnt
smslocate_edw-#     1,  -- ALL
smslocate_edw-#     CURRENT_TIMESTAMP
smslocate_edw-#   from staging.agg_phones_monthly_snapshot
smslocate_edw-#   group by
smslocate_edw-#     month_code,
smslocate_edw-#     short_code,
smslocate_edw-#     gateway_carrier_id,
smslocate_edw-#     mp_code,
smslocate_edw-#     partner_id,
smslocate_edw-#     master_company_id,
smslocate_edw-#     ad_id,
smslocate_edw-#     sc_name_id,
smslocate_edw-#     sc_sports_league_id,
smslocate_edw-#     sc_sports_alert_type,
smslocate_edw-#     al_widget_id,
smslocate_edw-#     keyword_id,
smslocate_edw-#     cp_id
smslocate_edw-# ;
                                                                                
 QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------
 HashAggregate  (cost=5065227.32..5214455.48 rows=7461408 width=64) (actual 
time=183289.883..185213.565 rows=2240716 loops=1)
   ->  Append  (cost=0.00..2080664.40 rows=74614073 width=64) (actual 
time=0.030..58952.749 rows=74614237 loops=1)
         ->  Seq Scan on agg_phones_monthly  (cost=0.00..11.50 rows=1 
width=102) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: (month_code = '2010M04'::bpchar)
         ->  Seq Scan on agg_phones_monthly_2010m04 agg_phones_monthly  
(cost=0.00..2080652.90 rows=74614072 width=64) (actual time=0.027..42713.387 
rows=74614237 loops=1)
               Filter: (month_code = '2010M04'::bpchar)
 Total runtime: 185519.997 ms
(7 rows)

Time: 185684.396 ms

Query2:
explain analyze
smslocate_edw-#     SELECT
smslocate_edw-#     month_code,
smslocate_edw-#     gateway_carrier_id,
smslocate_edw-#     sum(coalesce(message_count,0)),          -- message_cnt
smslocate_edw-#     sum(coalesce(message_sellable_count,0)), -- 
message_sellable_cnt
smslocate_edw-#     sum(coalesce(ad_cost_sum,0)),            -- ad_cost_sum
smslocate_edw-#     count(distinct device_number),           -- unique_user_cnt
smslocate_edw-#     count(distinct case when message_sellable_count <> 0 then 
device_number end), -- unique_user_sellable_cnt
smslocate_edw-#     NULL,                                    -- 
unique_user_first_time_cnt
smslocate_edw-#     15, -- CARRIER
smslocate_edw-#     CURRENT_TIMESTAMP
smslocate_edw-#   from staging.agg_phones_monthly_snapshot
smslocate_edw-#   group by
smslocate_edw-#     month_code,
smslocate_edw-#     gateway_carrier_id
smslocate_edw-# ;
                                                                                
       QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------
 GroupAggregate  (cost=13877783.42..15371164.88 rows=40000 width=37) (actual 
time=1689525.151..2401444.441 rows=116 loops=1)
   ->  Sort  (cost=13877783.42..14064318.61 rows=74614073 width=37) (actual 
time=1664233.243..1716472.931 rows=74614237 loops=1)
         Sort Key: dw.agg_phones_monthly.month_code, 
dw.agg_phones_monthly.gateway_carrier_id
         Sort Method:  external merge  Disk: 3485424kB
         ->  Result  (cost=0.00..2080664.40 rows=74614073 width=37) (actual 
time=0.008..84421.927 rows=74614237 loops=1)
               ->  Append  (cost=0.00..2080664.40 rows=74614073 width=37) 
(actual time=0.007..64724.486 rows=74614237 loops=1)
                     ->  Seq Scan on agg_phones_monthly  (cost=0.00..11.50 
rows=1 width=574) (actual time=0.000..0.000 rows=0 loops=1)
                           Filter: (month_code = '2010M04'::bpchar)
                     ->  Seq Scan on agg_phones_monthly_2010m04 
agg_phones_monthly  (cost=0.00..2080652.90 rows=74614072 width=37) (actual 
time=0.005..48199.938 rows=74614237 loops=1)
                           Filter: (month_code = '2010M04'::bpchar)
 Total runtime: 2402137.632 ms
(11 rows)

Time: 2402139.642 ms

Reply via email to