[ 
https://issues.apache.org/jira/browse/IMPALA-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-13052:
-------------------------------------
    Description: 
Sampling aggregates (sample, appx_median, histogram) return a string that can 
be quite large, but the planner assumes it to have a fixed small size.

Examples:
select sample(l_orderkey) from tpch.lineitem;
according to plan: row-size=12B
in reality: TotalBytesSent: 254.45 KB (this is  single row sent by a host)

select appx_median(l_orderkey) from tpch.lineitem;
according to plan: row-size= 8B
in reality: TotalBytesSent: 254.68 KB (this is  single row sent by a host)

select histogram(l_orderkey) from tpch.lineitem;
according to plan: row-size=12B
in reality: TotalBytesSent: 254.35 KB (this is  single row sent by a host)

This may be also relevant for datasketches functions, haven't checked those  
yet.

This can lead to highly underestimating the memory needs of grouping 
aggregators:
select appx_median(l_shipmode) from lineitem group by l_orderkey order by 1 
limit 1
04:AGGREGATE  FINALIZE Peak Mem:  2.19 GB   Est. Peak Mem:  18.00 MB
01:AGGREGATE STREAMING  Peak Mem:   2.37 GB   Est. Peak Mem:  45.79 MB

Enforcing PREAGG_BYTES_LIMIT also doesn't seem to work well -setting a 40MB 
limit decreased peak mem to 1.5 GB. My guess is that the pre-aggregation logic 
is not prepared for aggregation states that grow during the execution, so it 
can decide to not add another group to the hash table, but can't deny 
increasing an existing one's state.


  was:
Sampling aggregates (sample, appx_median, histogram) return a string that can 
be quite large, but the planner assumes it to have a fixed small size.

Examples:
select sample(l_orderkey) from tpch.lineitem;
according to plan: row-size=12B
in reality: TotalBytesSent: 254.45 KB (this is  single row sent by a host)

select appx_median(l_orderkey) from tpch.lineitem;
according to plan: row-size= 8B
in reality: TotalBytesSent: 254.68 KB (this is  single row sent by a host)

select histogram(l_orderkey) from tpch.lineitem;
according to plan: row-size=12B
in reality: TotalBytesSent: 254.35 KB (this is  single row sent by a host)

This may be also relevant for datasketches functions, haven't checked thos  yet.

This can lead to highly underestimating the memory needs of grouping 
aggregators:
select appx_median(l_shipmode) from lineitem group by l_orderkey order by 1 
limit 1
04:AGGREGATE  FINALIZE Peak Mem:  2.19 GB   Est. Peak Mem:  18.00 MB
01:AGGREGATE STREAMING  Peak Mem:   2.37 GB   Est. Peak Mem:  45.79 MB


> Sampling aggregate result sizes are underestimated
> --------------------------------------------------
>
>                 Key: IMPALA-13052
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13052
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Csaba Ringhofer
>            Priority: Major
>
> Sampling aggregates (sample, appx_median, histogram) return a string that can 
> be quite large, but the planner assumes it to have a fixed small size.
> Examples:
> select sample(l_orderkey) from tpch.lineitem;
> according to plan: row-size=12B
> in reality: TotalBytesSent: 254.45 KB (this is  single row sent by a host)
> select appx_median(l_orderkey) from tpch.lineitem;
> according to plan: row-size= 8B
> in reality: TotalBytesSent: 254.68 KB (this is  single row sent by a host)
> select histogram(l_orderkey) from tpch.lineitem;
> according to plan: row-size=12B
> in reality: TotalBytesSent: 254.35 KB (this is  single row sent by a host)
> This may be also relevant for datasketches functions, haven't checked those  
> yet.
> This can lead to highly underestimating the memory needs of grouping 
> aggregators:
> select appx_median(l_shipmode) from lineitem group by l_orderkey order by 1 
> limit 1
> 04:AGGREGATE  FINALIZE Peak Mem:  2.19 GB   Est. Peak Mem:  18.00 MB
> 01:AGGREGATE STREAMING  Peak Mem:   2.37 GB   Est. Peak Mem:  45.79 MB
> Enforcing PREAGG_BYTES_LIMIT also doesn't seem to work well -setting a 40MB 
> limit decreased peak mem to 1.5 GB. My guess is that the pre-aggregation 
> logic is not prepared for aggregation states that grow during the execution, 
> so it can decide to not add another group to the hash table, but can't deny 
> increasing an existing one's state.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to