Pooja Nilangekar created IMPALA-7791:
----------------------------------------

             Summary: Aggregation Node memory estimates don't account for number of fragment instances
                 Key: IMPALA-7791
                 URL: https://issues.apache.org/jira/browse/IMPALA-7791
             Project: IMPALA
          Issue Type: Sub-task
    Affects Versions: Impala 3.1.0
            Reporter: Pooja Nilangekar


AggregationNode's memory estimates are computed from the node's total input 
cardinality and do not account for that input being divided across fragment 
instances. The resulting estimates are far too high: in practice, each node 
often uses only a fraction of the estimated memory.
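
To make the gap concrete, here is a minimal sketch of how dividing the input cardinality by the number of fragment instances would shrink a per-instance estimate. This is not Impala's planner code: the function name, the 290 bytes-per-row figure, and the assumption that rows spread evenly across instances are all hypothetical and chosen only for illustration.

```python
import math


def agg_mem_estimate_bytes(input_cardinality, bytes_per_row, num_instances=1):
    """Hypothetical memory estimate for one aggregation instance.

    Assumes rows are spread evenly across instances (illustration only;
    real data can be skewed).
    """
    if num_instances < 1:
        raise ValueError("num_instances must be >= 1")
    rows_per_instance = math.ceil(input_cardinality / num_instances)
    return rows_per_instance * bytes_per_row


# 6.00M input rows and 3 instances mirror the example in this report;
# 290 bytes per row is a made-up value.
whole_input = agg_mem_estimate_bytes(6_000_000, 290)
per_instance = agg_mem_estimate_bytes(6_000_000, 290, num_instances=3)
print(whole_input, per_instance)
```

With an even split, the per-instance figure is one third of the whole-input figure, which is the kind of reduction this issue asks the estimate to reflect.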

Example query:

{code:java}
[localhost:21000] default> select distinct * from tpch.lineitem limit 5; 
{code}

Summary: 

{code:java}
+--------------+--------+----------+----------+-------+------------+-----------+---------------+---------------+
| Operator     | #Hosts | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem  | Est. Peak Mem | Detail        |
+--------------+--------+----------+----------+-------+------------+-----------+---------------+---------------+
| 04:EXCHANGE  | 1      | 21.24us  | 21.24us  | 5     | 5          | 48.00 KB  | 16.00 KB      | UNPARTITIONED |
| 03:AGGREGATE | 3      | 5.11s    | 5.15s    | 15    | 5          | 576.21 MB | 1.62 GB       | FINALIZE      |
| 02:EXCHANGE  | 3      | 709.75ms | 728.91ms | 6.00M | 6.00M      | 5.46 MB   | 10.78 MB      | HASH(tpch.lineitem.l_orderkey,tpch.lineitem.l_partkey,tpch.lineitem.l_suppkey,tpch.lineitem.l_linenumber,tpch.lineitem.l_quantity,tpch.lineitem.l_extendedprice,tpch.lineitem.l_discount,tpch.lineitem.l_tax,tpch.lineitem.l_returnflag,tpch.lineitem.l_linestatus,tpch.lineitem.l_shipdate,tpch.lineitem.l_commitdate,tpch.lineitem.l_receiptdate,tpch.lineitem.l_shipinstruct,tpch.lineitem.l_shipmode,tpch.lineitem.l_comment) |
| 01:AGGREGATE | 3      | 4.37s    | 4.70s    | 6.00M | 6.00M      | 36.77 MB  | 1.62 GB       | STREAMING     |
| 00:SCAN HDFS | 3      | 437.14ms | 480.60ms | 6.00M | 6.00M      | 65.51 MB  | 264.00 MB     | tpch.lineitem |
+--------------+--------+----------+----------+-------+------------+-----------+---------------+---------------+
{code}


The plan estimates 3.50 GB of memory per host, but the query's peak memory 
usage is only 682.07 MB.
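
As a rough check on the size of the gap, the short sketch below computes how many times larger each estimate is than the observed peak, using only the figures quoted in the summary above. The helper names are hypothetical, and the conversion assumes binary units (1 GB = 1024 MB), which may not match exactly how the values were rounded for display.

```python
# Assumed binary unit multipliers for the values shown in the summary.
UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3}


def to_bytes(text):
    """Convert a string such as '1.62 GB' to a byte count."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def overestimate(estimated, actual):
    """Return estimated / actual as a plain ratio."""
    return to_bytes(estimated) / to_bytes(actual)


# (estimated, observed peak) pairs copied from the report.
figures = {
    "03:AGGREGATE": ("1.62 GB", "576.21 MB"),
    "01:AGGREGATE": ("1.62 GB", "36.77 MB"),
    "query (per-host estimate vs. peak)": ("3.50 GB", "682.07 MB"),
}

for name, (estimated, actual) in figures.items():
    print(f"{name}: {overestimate(estimated, actual):.1f}x")
```

Under these assumptions the two aggregation nodes account for most of the overestimate, with the streaming pre-aggregation showing the largest ratio.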


