[ https://issues.apache.org/jira/browse/IMPALA-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabor Kaszab reassigned IMPALA-9671:
------------------------------------
Assignee: (was: Gabor Kaszab)
> Improve SINGULAR ROW SRC Node Explain Output
> --------------------------------------------
>
> Key: IMPALA-9671
> URL: https://issues.apache.org/jira/browse/IMPALA-9671
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Shant Hovsepian
> Priority: Minor
> Labels: complextype, observability
>
> For queries that involve more than one level of unnesting of complex/nested
> types, the explain output can be tricky to read and reason about. The SUBPLAN
> node produces a tree shape that's not quite the same as other node types. In
> particular, it can be tricky to understand what a SINGULAR ROW SRC node is
> acting on or producing.
> Currently the explain output for a SINGULAR ROW SRC node doesn't provide any
> reference to what it's doing. It may not be guaranteed, but leaf nodes in an
> Impala plan tree are usually annotated in square brackets "[]" with the input
> source they are working on, for example SCAN and UNNEST nodes; SINGULAR
> ROW SRC provides no such annotation. It would be great to fix this so that, in
> explain strings,
> {{SINGULAR ROW SRC}}
> _{{becomes}}_
> {{SINGULAR ROW SRC [input]}}
> Take the query below (SET EXPLAIN_LEVEL=3):
>
> {code:java}
> Query: explain select c_custkey, o_orderkey, l_partkey from customer c, c.c_orders o, o.o_lineitems as li
> +----------------------------------------------------------------------------------------+
> | Explain String                                                                         |
> +----------------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=16.00MB Threads=3                            |
> | Per-Host Resource Estimates: Memory=274MB                                              |
> | Analyzed query: SELECT c_custkey, o_orderkey, l_partkey FROM                           |
> | tpch_nested_parquet.customer c, c.c_orders o, o.o_lineitems li                         |
> |                                                                                        |
> | F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1                                  |
> | |  Per-Host Resources: mem-estimate=10.06MB mem-reservation=0B thread-reservation=1    |
> | PLAN-ROOT SINK                                                                         |
> | |  output exprs: c_custkey, o_orderkey, l_partkey                                      |
> | |  mem-estimate=0B mem-reservation=0B thread-reservation=0                             |
> | |                                                                                      |
> | 09:EXCHANGE [UNPARTITIONED]                                                            |
> | |  mem-estimate=10.06MB mem-reservation=0B thread-reservation=0                        |
> | |  tuple-ids=2,1,0 row-size=48B cardinality=15.00M                                     |
> | |  in pipelines: 00(GETNEXT)                                                           |
> | |                                                                                      |
> | F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=1                                         |
> | Per-Host Resources: mem-estimate=264.00MB mem-reservation=16.00MB thread-reservation=2 |
> | 01:SUBPLAN                                                                             |
> | |  mem-estimate=0B mem-reservation=0B thread-reservation=0                             |
> | |  tuple-ids=2,1,0 row-size=48B cardinality=15.00M                                     |
> | |  in pipelines: 00(GETNEXT)                                                           |
> | |                                                                                      |
> | |--08:NESTED LOOP JOIN [CROSS JOIN]                                                    |
> | |  |  mem-estimate=20B mem-reservation=0B thread-reservation=0                         |
> | |  |  tuple-ids=2,1,0 row-size=48B cardinality=100                                     |
> | |  |  in pipelines: 00(GETNEXT)                                                        |
> | |  |                                                                                   |
> | |  |--02:SINGULAR ROW SRC                                                              |
> | |  |     parent-subplan=01                                                             |
> | |  |     mem-estimate=0B mem-reservation=0B thread-reservation=0                       |
> | |  |     tuple-ids=0 row-size=20B cardinality=1                                        |
> | |  |     in pipelines: 00(GETNEXT)                                                     |
> | |  |                                                                                   |
> | |  04:SUBPLAN                                                                          |
> | |  |  mem-estimate=0B mem-reservation=0B thread-reservation=0                          |
> | |  |  tuple-ids=2,1 row-size=28B cardinality=100                                       |
> | |  |  in pipelines: 00(GETNEXT)                                                        |
> | |  |                                                                                   |
> | |  |--07:NESTED LOOP JOIN [CROSS JOIN]                                                 |
> | |  |  |  mem-estimate=20B mem-reservation=0B thread-reservation=0                      |
> | |  |  |  tuple-ids=2,1 row-size=28B cardinality=10                                     |
> | |  |  |  in pipelines: 00(GETNEXT)                                                     |
> | |  |  |                                                                                |
> | |  |  |--05:SINGULAR ROW SRC                                                           |
> | |  |  |     parent-subplan=04                                                          |
> | |  |  |     mem-estimate=0B mem-reservation=0B thread-reservation=0                    |
> | |  |  |     tuple-ids=1 row-size=20B cardinality=1                                     |
> | |  |  |     in pipelines: 00(GETNEXT)                                                  |
> | |  |  |                                                                                |
> | |  |  06:UNNEST [o.o_lineitems li]                                                     |
> | |  |     parent-subplan=04                                                             |
> | |  |     mem-estimate=0B mem-reservation=0B thread-reservation=0                       |
> | |  |     tuple-ids=2 row-size=0B cardinality=10                                        |
> | |  |     in pipelines: 00(GETNEXT)                                                     |
> | |  |                                                                                   |
> | |  03:UNNEST [c.c_orders o]                                                            |
> | |     parent-subplan=01                                                                |
> | |     mem-estimate=0B mem-reservation=0B thread-reservation=0                          |
> | |     tuple-ids=1 row-size=0B cardinality=10                                           |
> | |     in pipelines: 00(GETNEXT)                                                        |
> | |                                                                                      |
> | 00:SCAN HDFS [tpch_nested_parquet.customer c, RANDOM]                                  |
> |    HDFS partitions=1/1 files=4 size=289.13MB                                           |
> |    predicates: !empty(c.c_orders)                                                      |
> |    predicates on o: !empty(o.o_lineitems)                                              |
> |    stored statistics:                                                                  |
> |      table: rows=150.00K size=289.13MB                                                 |
> |      columns missing stats: c_orders                                                   |
> |    extrapolated-rows=disabled max-scan-range-rows=50.11K                               |
> |    mem-estimate=264.00MB mem-reservation=16.00MB thread-reservation=1                  |
> |    tuple-ids=0 row-size=20B cardinality=150.00K                                        |
> |    in pipelines: 00(GETNEXT)                                                           |
> +----------------------------------------------------------------------------------------+
> {code}
>
> It's easy to figure out what node 05 is doing, but kind of tricky to
> understand what node 02 is doing.
> One option would be for node 02 to have the following annotation, or
> something else more informative:
>
> {{SINGULAR ROW SRC [c.c_orders o, o.o_lineitems li]}}
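
One way to picture how such an annotation could be derived is sketched below in Python. This is not Impala frontend code: `PlanNode`, `unnest_details`, and `singular_row_src_label` are hypothetical names, and the child ordering in the hand-built tree (subplan input first, then the subplan tree) is a modeling assumption chosen so the labels come out in the order shown in the proposal. The sketch simply collects the bracketed text of every UNNEST node found in the subplan tree that contains the SINGULAR ROW SRC node.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    """Hypothetical stand-in for a plan node; not Impala's PlanNode class."""
    node_id: int
    kind: str                      # e.g. "SUBPLAN", "UNNEST", "SINGULAR ROW SRC"
    detail: Optional[str] = None   # text shown in square brackets, if any
    children: List["PlanNode"] = field(default_factory=list)


def unnest_details(node: PlanNode) -> List[str]:
    """Collect the bracketed text of UNNEST nodes, depth-first, in child order."""
    found = [node.detail] if node.kind == "UNNEST" and node.detail else []
    for child in node.children:
        found.extend(unnest_details(child))
    return found


def singular_row_src_label(subplan_tree: PlanNode) -> str:
    """Build a label from the UNNEST nodes inside the enclosing subplan tree."""
    details = unnest_details(subplan_tree)
    return "SINGULAR ROW SRC" + (f" [{', '.join(details)}]" if details else "")


# Hand-built model of the subplan trees from the explain output above.
# Assumption: a SUBPLAN lists its input first, then its subplan tree.
node_07 = PlanNode(7, "NESTED LOOP JOIN", "CROSS JOIN", [
    PlanNode(6, "UNNEST", "o.o_lineitems li"),
    PlanNode(5, "SINGULAR ROW SRC"),
])
node_04 = PlanNode(4, "SUBPLAN", None, [
    PlanNode(3, "UNNEST", "c.c_orders o"),
    node_07,
])
node_08 = PlanNode(8, "NESTED LOOP JOIN", "CROSS JOIN", [
    node_04,
    PlanNode(2, "SINGULAR ROW SRC"),
])

# Node 02 lives in the subplan tree rooted at 08 (parent-subplan=01).
print(singular_row_src_label(node_08))
# Node 05 lives in the subplan tree rooted at 07 (parent-subplan=04).
print(singular_row_src_label(node_07))
```

Under these assumptions the label for node 02 matches the annotation proposed above, while node 05 would only list the inner collection. Whether the annotation should list the unnested collections or the row the node emits is left open by the issue.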