Riza Suminto created IMPALA-14574:
-------------------------------------
Summary: Lower memory estimate by analyzing Pipeline Membership
Key: IMPALA-14574
URL: https://issues.apache.org/jira/browse/IMPALA-14574
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Reporter: Riza Suminto
Assignee: Riza Suminto
IMPALA-7231 groups PlanNodes into a set of pipelines and displays that
information in the query profile like this:
{code:java}
in pipelines: 07(GETNEXT), 01(OPEN)
{code}
The meeting point between a GETNEXT pipeline and an OPEN pipeline is usually a
blocking operator, where all PlanNode operators that belong to the GETNEXT
pipeline must wait until all operators in the OPEN pipeline finish.
Examples of this are HASH JOIN,
{code:java}
03:HASH JOIN [LEFT OUTER JOIN, BROADCAST]
| hash-table-id=00
| hash predicates: i1.i_manufact = i_manufact
| fk/pk conjuncts: none
| other predicates: zeroifnull(count(*)) > CAST(0 AS BIGINT)
| mem-estimate=0B mem-reservation=0B spill-buffer=64.00KB thread-reservation=0
| tuple-ids=0,2N row-size=90B cardinality=10.20K
| in pipelines: 00(GETNEXT), 07(OPEN)
{code}
Final AGGREGATION,
{code:java}
10:AGGREGATE [FINALIZE]
| group by: (i_product_name)
| mem-estimate=10.00MB mem-reservation=1.94MB spill-buffer=64.00KB thread-reservation=0
| tuple-ids=4 row-size=32B cardinality=10.20K
| in pipelines: 10(GETNEXT), 00(OPEN)
{code}
SORT/TOPN,
{code:java}
05:TOP-N [LIMIT=100]
| order by: (i_product_name) ASC
| mem-estimate=3.10KB mem-reservation=0B thread-reservation=0
| tuple-ids=5 row-size=32B cardinality=100
| in pipelines: 05(GETNEXT), 10(OPEN)
{code}
And so on.
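As an illustration only, the "in pipelines:" lines shown above can be read mechanically. The following is a minimal Python sketch, not Impala code: the helper names parse_pipelines and is_meeting_point are hypothetical, and the regular expression assumes the phase is written as an uppercase word in parentheses, as in the examples above.

```python
import re

# Hypothetical helper, not part of Impala. Assumes each membership looks like
# "<pipeline id>(<PHASE>)", as in the plan snippets quoted in this issue.
_MEMBERSHIP_RE = re.compile(r"(\d+)\(([A-Z_]+)\)")


def parse_pipelines(line):
    """Return the (pipeline_id, phase) pairs found in an 'in pipelines:' line."""
    return _MEMBERSHIP_RE.findall(line)


def is_meeting_point(line):
    """True if the node takes part in both a GETNEXT and an OPEN pipeline."""
    phases = {phase for _, phase in parse_pipelines(line)}
    return {"GETNEXT", "OPEN"} <= phases


print(parse_pipelines("| in pipelines: 00(GETNEXT), 07(OPEN)"))
# prints [('00', 'GETNEXT'), ('07', 'OPEN')]
```

A node for which is_meeting_point returns True is a candidate blocking operator in the sense described above; whether it actually blocks still depends on the operator type.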
Currently, Impala estimates the memory usage of a query by simply adding up the
memory estimates of all query fragments. Impala should be able to produce a
lower estimate by analyzing these pipeline dependencies in the query plan tree.
Fragments that belong to a GETNEXT pipeline are less likely to consume their
full memory allotment until all OPEN pipelines adjacent to that GETNEXT
pipeline finish.
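One possible way to picture the idea is the Python sketch below. It is a deliberately simplified model, not a proposed implementation: it works per plan node rather than per fragment, it assumes pipelines run one at a time, and it charges a node's full estimate to every pipeline the node belongs to. The node names, memory values (in MB), and function names are all hypothetical.

```python
from collections import defaultdict


def naive_estimate(nodes):
    """Current behaviour as described above: add up every estimate."""
    return sum(mem for mem, _ in nodes.values())


def pipeline_aware_estimate(nodes):
    """Peak over pipelines of the memory of the nodes active in each one.

    Simplifying assumption: only one pipeline runs at a time, and a node
    holds its full estimate during every pipeline it is a member of.
    """
    per_pipeline = defaultdict(int)
    for mem, pipeline_ids in nodes.values():
        for pid in set(pipeline_ids):
            per_pipeline[pid] += mem
    return max(per_pipeline.values(), default=0)


# Hypothetical plan: node name -> (mem estimate in MB, pipeline ids).
nodes = {
    "scan_a": (16, ["00"]),         # 00(GETNEXT)
    "scan_b": (32, ["07"]),         # 07(GETNEXT)
    "hash_join": (64, ["00", "07"]),  # 00(GETNEXT), 07(OPEN)
    "aggregate": (128, ["10", "00"]),  # 10(GETNEXT), 00(OPEN)
    "top_n": (1, ["05", "10"]),     # 05(GETNEXT), 10(OPEN)
}

print(naive_estimate(nodes))           # prints 241
print(pipeline_aware_estimate(nodes))  # prints 208
```

In this toy plan the peak falls in pipeline 00, where scan_a, hash_join, and aggregate are all active, so scan_b and top_n no longer contribute to the estimate. A real estimate would also have to account for pipelines that can run concurrently.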