[ https://issues.apache.org/jira/browse/IMPALA-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong updated IMPALA-7293:
----------------------------------
Description:
There's currently no way to tell in the explain plan what the contents of each
tuple are. At explain_level>=2 we include "tuple-ids" but no information about
what is actually in the tuples.
{noformat}
[localhost:21000] default> explain select min(regexp_replace(l_comment, ".", "x"))
from tpch.lineitem; summary;
Query: explain select min(regexp_replace(l_comment, ".", "x"))
from tpch.lineitem
+---------------------------------------------------------------------------------------+
| Explain String                                                                        |
+---------------------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=8.00MB Threads=3                            |
| Per-Host Resource Estimates: Memory=284.00MB                                          |
|                                                                                       |
| F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1                                 |
| | Per-Host Resources: mem-estimate=10.00MB mem-reservation=0B thread-reservation=1    |
| PLAN-ROOT SINK                                                                        |
| | mem-estimate=0B mem-reservation=0B thread-reservation=0                             |
| |                                                                                     |
| 03:AGGREGATE [FINALIZE]                                                               |
| | output: min:merge(regexp_replace(l_comment, '.', 'x'))                              |
| | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB thread-reservation=0    |
| | tuple-ids=1 row-size=16B cardinality=1                                              |
| |                                                                                     |
| 02:EXCHANGE [UNPARTITIONED]                                                           |
| | mem-estimate=0B mem-reservation=0B thread-reservation=0                             |
| | tuple-ids=1 row-size=16B cardinality=1                                              |
| |                                                                                     |
| F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3                                        |
| Per-Host Resources: mem-estimate=274.00MB mem-reservation=8.00MB thread-reservation=2 |
| 01:AGGREGATE                                                                          |
| | output: min(regexp_replace(l_comment, '.', 'x'))                                    |
| | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB thread-reservation=0    |
| | tuple-ids=1 row-size=16B cardinality=1                                              |
| |                                                                                     |
| 00:SCAN HDFS [tpch.lineitem, RANDOM]                                                  |
| partitions=1/1 files=1 size=718.94MB                                                  |
| stored statistics:                                                                    |
| table: rows=6001215 size=718.94MB                                                     |
| columns: all                                                                          |
| extrapolated-rows=disabled max-scan-range-rows=1068457                                |
| mem-estimate=264.00MB mem-reservation=8.00MB thread-reservation=1                     |
| tuple-ids=0 row-size=42B cardinality=6001215                                          |
+---------------------------------------------------------------------------------------+
Fetched 32 row(s) in 0.01s
Summary not available
{noformat}
We already have debugString() methods that print a human-readable
representation. We could start by printing one tuple descriptor per line at
the end of the explain plan, tweaking the output where necessary to make it
more readable, e.g. by hiding non-materialized slots.
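A rough sketch of that idea, in Python for brevity. The `Slot` and `TupleDesc` classes, the `explain_line()` and `append_tuple_layout()` helpers, and the output format are all hypothetical stand-ins, not Impala's actual frontend descriptors or their debugString() output; the slot names, types and sizes are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Slot:
    """Hypothetical stand-in for a slot descriptor."""
    slot_id: int
    label: str
    type_name: str
    byte_size: int
    materialized: bool = True


@dataclass
class TupleDesc:
    """Hypothetical stand-in for a tuple descriptor."""
    tuple_id: int
    slots: List[Slot] = field(default_factory=list)

    def explain_line(self) -> str:
        # Hide non-materialized slots to keep the line readable.
        shown = [s for s in self.slots if s.materialized]
        parts = [f"{s.label}:{s.type_name}({s.byte_size}B)" for s in shown]
        return f"tuple-id={self.tuple_id} slots=[{', '.join(parts)}]"


def append_tuple_layout(explain_lines, tuples):
    """Return the plan lines followed by one descriptor line per tuple."""
    out = list(explain_lines)
    out.append("Tuple descriptors:")
    ordered = sorted(tuples, key=lambda t: t.tuple_id)
    out.extend("  " + t.explain_line() for t in ordered)
    return out


# Illustrative data only; names, types and sizes are assumptions.
tuples = [
    TupleDesc(1, [Slot(2, "min(...)", "STRING", 16)]),
    TupleDesc(0, [Slot(0, "l_comment", "STRING", 16),
                  Slot(1, "l_orderkey", "BIGINT", 8, materialized=False)]),
]
plan = ["01:AGGREGATE", "00:SCAN HDFS [tpch.lineitem, RANDOM]"]
result = append_tuple_layout(plan, tuples)
print("\n".join(result))
```

Appending the descriptors after the plan, rather than inlining them under each node, leaves the existing per-node lines unchanged, so the "tuple-ids" values already shown at explain_level>=2 can be looked up in the trailing section.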
> Show tuple layout in explain plan
> ---------------------------------
>
> Key: IMPALA-7293
> URL: https://issues.apache.org/jira/browse/IMPALA-7293
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Tim Armstrong
> Priority: Major
> Labels: observability
>