[
https://issues.apache.org/jira/browse/IMPALA-7791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pooja Nilangekar reassigned IMPALA-7791:
----------------------------------------
Assignee: Pooja Nilangekar
> Aggregation Node memory estimates don't account for number of fragment
> instances
> --------------------------------------------------------------------------------
>
> Key: IMPALA-7791
> URL: https://issues.apache.org/jira/browse/IMPALA-7791
> Project: IMPALA
> Issue Type: Sub-task
> Affects Versions: Impala 3.1.0
> Reporter: Pooja Nilangekar
> Assignee: Pooja Nilangekar
> Priority: Major
>
> AggregationNode's memory estimates are calculated based on the input
> cardinality of the node, without accounting for the division of input data
> across fragment instances. This results in very high memory estimates. In
> reality, the nodes often use only a part of this memory.
> Example query:
> {code:java}
> [localhost:21000] default> select distinct * from tpch.lineitem limit 5;
> {code}
> Summary:
> {code:java}
> +--------------+--------+----------+----------+-------+------------+-----------+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> | Operator | #Hosts | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem
> | Est. Peak Mem | Detail
>
>
>
>
> |
> +--------------+--------+----------+----------+-------+------------+-----------+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> | 04:EXCHANGE | 1 | 21.24us | 21.24us | 5 | 5 | 48.00 KB
> | 16.00 KB | UNPARTITIONED
>
>
>
>
> |
> | 03:AGGREGATE | 3 | 5.11s | 5.15s | 15 | 5 | 576.21
> MB | 1.62 GB | FINALIZE
>
>
>
>
> |
> | 02:EXCHANGE | 3 | 709.75ms | 728.91ms | 6.00M | 6.00M | 5.46 MB
> | 10.78 MB |
> HASH(tpch.lineitem.l_orderkey,tpch.lineitem.l_partkey,tpch.lineitem.l_suppkey,tpch.lineitem.l_linenumber,tpch.lineitem.l_quantity,tpch.lineitem.l_extendedprice,tpch.lineitem.l_discount,tpch.lineitem.l_tax,tpch.lineitem.l_returnflag,tpch.lineitem.l_linestatus,tpch.lineitem.l_shipdate,tpch.lineitem.l_commitdate,tpch.lineitem.l_receiptdate,tpch.lineitem.l_shipinstruct,tpch.lineitem.l_shipmode,tpch.lineitem.l_comment)
> |
> | 01:AGGREGATE | 3 | 4.37s | 4.70s | 6.00M | 6.00M | 36.77 MB
> | 1.62 GB | STREAMING
>
>
>
>
> |
> | 00:SCAN HDFS | 3 | 437.14ms | 480.60ms | 6.00M | 6.00M | 65.51 MB
> | 264.00 MB | tpch.lineitem
>
>
>
>
> |
> +--------------+--------+----------+----------+-------+------------+-----------+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
> {code}
> The plan estimates 3.50 GB memory per host but the query ends up with a peak
> memory usage of 682.07 MB.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]