On Sun, May 7, 2023 at 8:42 PM Laurenz Albe <laurenz.a...@cybertec.at> wrote:
> On Sat, 2023-05-06 at 03:14 +0200, Marc Millas wrote: > > postgres 14.2 on Linux redhat > > > > temp_file_limit set around 210 GB. > > > > a select request with 2 left join have crashed the server (oom killer) > after the postgres > > disk occupation did grow from 15TB to 16 TB. > > > > What are the cases where postgres may grow without caring about > temp_file_limit ? > > That's too little information for a decent answer. > One obvious answer is: if it is not writing temporary files. > > Yours, > Laurenz Albe > Logical ... so here is a little more info: db=# analyze myschema.table_a; ANALYZE db=# with ta as(select 'myschema' as s, 'table_a' as t), p as (select * from information_schema.columns, ta where table_schema=ta.s and table_name=ta.t), tableid as(select relid from pg_stat_user_tables, ta where schemaname=ta.s and relname=ta.t) select staattnum, column_name, stanullfrac, stadistinct from tableid, pg_statistic join p on p.ordinal_position=staattnum where starelid=tableid.relid order by staattnum; staattnum | column_name | stanullfrac | stadistinct -----------+--------------+-------------+------------- 1 | col_ne | 0 | -0.6100224 2 | col_brg | 0.0208 | 6 3 | col_ano | 0 | 447302 4 | col_ine | 0 | -0.5341927 5 | col_cha | 0 | 11 (5 rows) db=# select count(*) from myschema.table_a; count ---------- 13080776 (1 row) db=# select count(distinct col_ano) from myschema.table_a; count ---------- 10149937 (1 row) // stats looks somewhat absurd, as analyze guess 20 times less distinct values as a select distinct does on column col_ano... db=# analyze myschema.table_b; ANALYZE db=# with ta as(select 'myschema' as s, 'table_b' as t), p as (select * from information_schema.columns, ta where table_schema=ta.s and table_name=ta.t), tableid as(select relid from pg_stat_user_tables, ta where schemaname=ta.s and relname=ta.t) select staattnum, column_name, stanullfrac, stadistinct from tableid, pg_statistic join p on p.ordinal_position=staattnum where starelid=tableid.relid order by staattnum; staattnum | column_name | stanullfrac | stadistinct -----------+-----------------+-------------+------------- 1 | col_nerg | 0 | 161828 2 | col_ibi | 0 | 362161 3 | col_imi | 0 | 381023 4 | col_ipi | 0 | 391915 5 | col_ne | 0 | -0.53864235 6 | col_ano | 0 | 482225 (6 rows) db=# select count(*) from myschema.table_b; count ---------- 14811485 (1 row) db=# select count(distinct col_ano) from myschema.table_b; count ---------- 10149937 (1 row) //same remark db=# explain select distinct t1.col_ine, case when t2.col_ibi is null then t3.col_ibi else t2.col_ibi end from myschema.table_a t1 left join myschema.table_b t2 on t1.col_ano=t2.col_ano Left join myschema.table_b t3 on t1.NUM_ENQ=t3.NUM_ENQ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Unique (cost=72377463163.02..201012533981.80 rows=1021522829864 width=97) -> Gather Merge (cost=72377463163.02..195904919832.48 rows=1021522829864 width=97) Workers Planned: 5 -> Sort (cost=72377462162.94..72888223577.87 rows=204304565973 width=97) Sort Key: t1.col_ine, (CASE WHEN (t2.col_ibi IS NULL) THEN t3.col_ibi ELSE t2.col_ibi END) -> Parallel Hash Left Join (cost=604502.76..1276224253.51 rows=204304565973 width=97) Hash Cond: ((t1.col_ano)::text = (t2.col_ano)::text) -> Parallel Hash Left Join (cost=300803.38..582295.38 rows=4857277 width=52) Hash Cond: ((t1.col_ne)::text = (t3.col_ne)::text) -> Parallel Seq Scan on table_a t1 (cost=0.00..184052.35 rows=2616335 width=53) -> Parallel Hash (cost=243466.06..243466.06 rows=2965306 width=31) -> Parallel Seq Scan on table_b t3 (cost=0.00..243466.06 rows=2965306 width=31) -> Parallel Hash (cost=243466.06..243466.06 rows=2965306 width=34) -> Parallel Seq Scan on table_b t2 (cost=0.00..243466.06 rows=2965306 width=34) JIT: Functions: 19 Options: Inlining true, Optimization true, Expressions true, Deforming true (17 rows) //so.. the planner guess that those 2 join will generate 1000 billions rows... //so, I try to change stats db=# alter table myschema.table_a alter column col_ano set (n_distinct=-0.8); ALTER TABLE db=# analyze myschema.table_a; ANALYZE db=# with ta as(select 'myschema' as s, 'table_a' as t), p as (select * from information_schema.columns, ta where table_schema=ta.s and table_name=ta.t), tableid as(select relid from pg_stat_user_tables, ta where schemaname=ta.s and relname=ta.t) select staattnum, column_name, stanullfrac, stadistinct from tableid, pg_statistic join p on p.ordinal_position=staattnum where starelid=tableid.relid order by staattnum; staattnum | column_name | stanullfrac | stadistinct -----------+--------------+-------------+------------- 1 | col_ne | 0 | -0.6694066 2 | col_brg | 0.0224 | 6 3 | col_ano | 0 | -0.8 4 | col_ine | 0 | -0.542651 5 | col_cha | 0 | 12 (5 rows) db=# alter table myschema.table_b alter column col_ano set (n_distinct=-0.8); ALTER TABLE db=# analyze myschema.table_b; ANALYZE db=# with ta as(select 'myschema' as s, 'table_b' as t), p as (select * from information_schema.columns, ta where table_schema=ta.s and table_name=ta.t), tableid as(select relid from pg_stat_user_tables, ta where schemaname=ta.s and relname=ta.t) select staattnum, column_name, stanullfrac, stadistinct from tableid, pg_statistic join p on p.ordinal_position=staattnum where starelid=tableid.relid order by staattnum; staattnum | column_name | stanullfrac | stadistinct -----------+-----------------+-------------+------------- 1 | col_nerg | 0 | 166151 2 | col_ibi | 0 | 374699 3 | col_imi | 0 | 394296 4 | col_ipi | 0 | 404515 5 | col_ne | 0 | -0.47193906 6 | col_ano | 0 | -0.8 (6 rows) //doing the explain of the joins again db=# explain select distinct t1.col_ine, case when t2.col_ibi is null then t3.col_ibi else t2.col_ibi end from myschema.table_a t1 left join myschema.table_b t2 on t1.col_ano=t2.col_ano Left join myschema.table_b t3 on t1.NUM_ENQ=t3.NUM_ENQ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Unique (cost=62824833042.98..174732492390.53 rows=888686328950 width=97) -> Gather Merge (cost=62824833042.98..170289060745.78 rows=888686328950 width=97) Workers Planned: 5 -> Sort (cost=62824832042.91..63269175207.38 rows=177737265790 width=97) Sort Key: t1.col_ine, (CASE WHEN (t2.col_ibi IS NULL) THEN t3.col_ibi ELSE t2.col_ibi END) -> Parallel Hash Left Join (cost=604274.78..1148041043.61 rows=177737265790 width=97) Hash Cond: ((t1.col_ano)::text = (t2.col_ano)::text) -> Parallel Hash Left Join (cost=300691.39..581280.67 rows=4424044 width=53) Hash Cond: ((t1.col_ne)::text = (t3.col_ne)::text) -> Parallel Seq Scan on table_a t1 (cost=0.00..184049.73 rows=2616073 width=54) -> Parallel Hash (cost=243427.84..243427.84 rows=2961484 width=31) -> Parallel Seq Scan on table_b t3 (cost=0.00..243427.84 rows=2961484 width=31) -> Parallel Hash (cost=243427.84..243427.84 rows=2961484 width=34) -> Parallel Seq Scan on table_b t2 (cost=0.00..243427.84 rows=2961484 width=34) JIT: Functions: 19 Options: Inlining true, Optimization true, Expressions true, Deforming true (17 rows) //almost equivalent. //temp_file_limit is set to 210 GB. // if I run the request, or explain analyze it (same thing) the disk occupation grows of 1.1 TB, and the RAM occupation grows until oom kills //BTW, if I simplify the select to keep only the first join, the number of estimated rows is still ""huge"" and I didnt try to execute it. thanks