On Tue, Mar 9, 2021 at 6:53 PM Tom Lane <t...@sss.pgh.pa.us> wrote:

> Radoslav Nedyalkov <rnedyal...@gmail.com> writes:
> > On Tue, Mar 9, 2021 at 6:03 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> >> Perhaps the accumulation is happening on the client side?  libpq doesn't
> >> have any provision for spilling a result set to disk.
>
> > Ah, I named it result set wrongly perhaps.
> > These are queries , part of a larger ETL function or statement which at
> the
> > end just write to a table.
> > The huge join is an intermediate step.
>
> Hm.  What's the query plan look like?
>
> The only thing I'm aware of that could consume unbounded memory
> on the server side is hash aggregation.  (v13 has improved that
> situation, but I'm guessing you are running some older version.)
> The planner won't choose hash aggregation if it estimates that
> the hash table would need to exceed work_mem, but its estimate
> could be wrong.  If that's the scenario, you could back off
> work_mem until hash aggregation isn't used, or you could try to
> improve the planner's estimates.  If your ETL process involves
> intermediate tables, you might need to explicitly ANALYZE those
> after you fill them so that the planner has a better idea how
> to plan the later steps.
>
>                         regards, tom lanea


Thanks Tom,
Query and plan attached.

Rado
db=# EXPLAIN 
db-#         CREATE TEMP TABLE gross_set_merchants AS (
db(#                 WITH gs_merchants as (
db(#                         SELECT
db(#                                 u.merchant_id,
db(#                                 uel.user_id
db(#                         FROM public.user_event_logs as uel
db(#                         JOIN users as u ON uel.user_id = u.id
db(#                         where 
db(#                                 --u.merchant_id = 1030616
db(#                                 uel.created_at >= '2018-01-01'
db(#                                 AND uel.category = 'settings'
db(#                                 AND uel.description IN ('Gross settlement 
enabled','Gross settlement disabled')
db(#                         GROUP BY 1,2
db(#                 ), setting_change as (
db(#                         SELECT
db(#                                 gsm.merchant_id,
db(#                                 cer.date,
db(#                                 SUM(CASE 
db(#                                         WHEN uel.description = 'Gross 
settlement enabled' THEN 1
db(#                                         WHEN uel.description = 'Gross 
settlement disabled' THEN -1
db(#                                         ELSE 0
db(#                                 END) change_setting
db(#                         FROM external.currency_exchange_rates as cer
db(#                         CROSS JOIN gs_merchants as gsm --full join, having 
all days for all gs merchants
db(#                         LEFT JOIN public.user_event_logs as uel ON 
uel.created_at::date = cer.date 
db(#                                 AND gsm.user_id = uel.user_id 
db(#                                 AND uel.created_at >= '2018-01-01'
db(#                                 AND uel.category = 'settings'
db(#                                 AND uel.description IN ('Gross settlement 
enabled','Gross settlement disabled')
db(#                         WHERE cer.currency = 'GBP'
db(#                                 AND cer.date >= '2018-01-01'
db(#                         GROUP BY 1,2
db(#                                 
db(#                 )
db(#                 
db(#                 SELECT
db(#                         sc.merchant_id,
db(#                         sc.date,
db(#                         SUM(change_setting) OVER (PARTITION BY 
sc.merchant_id ORDER BY date) as gs_on
db(#                 FROM setting_change as sc); 
                                                                                
                                             QUERY PLAN                         
                                                                                
                     
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=183885316.97..183890372.97 rows=252800 width=40)
   CTE gs_merchants
     ->  Group  (cost=2475689.43..2534384.63 rows=567218 width=8)
           Group Key: u.merchant_id, uel.user_id
           ->  Gather Merge  (cost=2475689.43..2532021.22 rows=472682 width=8)
                 Workers Planned: 2
                 ->  Group  (cost=2474689.41..2476461.97 rows=236341 width=8)
                       Group Key: u.merchant_id, uel.user_id
                       ->  Sort  (cost=2474689.41..2475280.26 rows=236341 
width=8)
                             Sort Key: u.merchant_id, uel.user_id
                             ->  Parallel Hash Join  
(cost=265272.50..2453595.37 rows=236341 width=8)
                                   Hash Cond: (uel.user_id = u.id)
                                   ->  Parallel Seq Scan on user_event_logs uel 
 (cost=0.00..2187702.48 rows=236341 width=4)
                                         Filter: ((created_at >= '2018-01-01 
00:00:00'::timestamp without time zone) AND ((description)::text = ANY 
('{"Gross settlement enabled","Gross settlement disabled"}'::text[])) AND 
((category)::text = 'settings'::text))
                                   ->  Parallel Hash  
(cost=209818.33..209818.33 rows=4436333 width=8)
                                         ->  Parallel Seq Scan on users u  
(cost=0.00..209818.33 rows=4436333 width=8)
   CTE setting_change
     ->  HashAggregate  (cost=181320662.52..181323190.52 rows=252800 width=16)
           Group Key: gsm.merchant_id, cer.date
           ->  Merge Left Join  (cost=161250580.17..170174828.82 rows=891666696 
width=71)
                 Merge Cond: ((gsm.user_id = uel_1.user_id) AND (cer.date = 
((uel_1.created_at)::date)))
                 ->  Sort  (cost=158950948.16..161180114.90 rows=891666696 
width=12)
                       Sort Key: gsm.user_id, cer.date
                       ->  Nested Loop  (cost=0.29..11157645.37 rows=891666696 
width=12)
                             ->  CTE Scan on gs_merchants gsm  
(cost=0.00..11344.36 rows=567218 width=8)
                             ->  Materialize  (cost=0.29..471.24 rows=1572 
width=4)
                                   ->  Index Only Scan using 
currency_exchange_rates_pk on currency_exchange_rates cer  (cost=0.29..463.38 
rows=1572 width=4)
                                         Index Cond: ((date >= 
'2018-01-01'::date) AND (currency = 'GBP'::text))
                 ->  Sort  (cost=2299632.01..2301050.06 rows=567218 width=75)
                       Sort Key: uel_1.user_id, ((uel_1.created_at)::date)
                       ->  Gather  (cost=1000.00..2245424.28 rows=567218 
width=75)
                             Workers Planned: 2
                             ->  Parallel Seq Scan on user_event_logs uel_1  
(cost=0.00..2187702.48 rows=236341 width=75)
                                   Filter: ((created_at >= '2018-01-01 
00:00:00'::timestamp without time zone) AND ((description)::text = ANY 
('{"Gross settlement enabled","Gross settlement disabled"}'::text[])) AND 
((category)::text = 'settings'::text))
   ->  Sort  (cost=27741.81..28373.81 rows=252800 width=16)
         Sort Key: sc.merchant_id, sc.date
         ->  CTE Scan on setting_change sc  (cost=0.00..5056.00 rows=252800 
width=16)
(37 rows)

Time: 15.080 ms
db=# 

Reply via email to