alamb opened a new issue #779:
URL: https://github.com/apache/arrow-datafusion/issues/779
**Is your feature request related to a problem or challenge? Please describe
what you are trying to do.**
Now that we have `EXPLAIN <query>`, we know what plan DataFusion *will*
execute. However, there is no easy way to see what actually happened when the
plan ran (e.g., how many rows were read or filtered by each operator).
**Describe the solution you'd like**
I would like to extend DataFusion's EXPLAIN functionality to also include
the ability to actually run the plan, capture metrics, and display them.
I imagine something like the following (adding the `executed_plan` row):
```
> EXPLAIN ANALYZE SELECT * from foo;
+---------------+--------------------------------------------------------------------------+
| plan_type     | plan                                                                     |
+---------------+--------------------------------------------------------------------------+
| logical_plan  | Projection: #foo.x                                                       |
|               |   TableScan: foo projection=Some([0])                                    |
| physical_plan | ProjectionExec: expr=[x@0 as x]                                          |
|               |   RepartitionExec: partitioning=RoundRobinBatch(16)                      |
|               |     CsvExec: source=Path(/tmp/foo.csv: [/tmp/foo.csv]), has_header=false |
| executed_plan | ProjectionExec: num_rows=2 exec_ms=6                                     |
|               |   RepartitionExec: num_rows=2 exec_ms=4                                  |
|               |     CsvExec: num_rows=2, exec_ms=300                                     |
+---------------+--------------------------------------------------------------------------+
```
2 rows in set. Query took 0.002 seconds.
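To make the shape of the `executed_plan` rows concrete, here is a minimal standalone sketch of how per-operator metrics could be collected into a tree and rendered one line per operator. `OperatorMetrics` and `render_executed_plan` are hypothetical names used only for illustration; they are not existing DataFusion types or APIs, and the real metrics would have to come from the physical operators themselves.

```rust
/// Hypothetical per-operator metrics captured while the plan runs.
/// Illustrative only: this is not a DataFusion type.
#[derive(Debug)]
struct OperatorMetrics {
    name: String,
    num_rows: usize,
    exec_ms: u64,
    children: Vec<OperatorMetrics>,
}

impl OperatorMetrics {
    fn new(name: &str, num_rows: usize, exec_ms: u64, children: Vec<OperatorMetrics>) -> Self {
        Self {
            name: name.to_string(),
            num_rows,
            exec_ms,
            children,
        }
    }
}

/// Render the metrics tree as text, one operator per line, indenting each
/// child by two spaces relative to its parent (as the EXPLAIN rows do).
fn render_executed_plan(root: &OperatorMetrics) -> String {
    fn walk(node: &OperatorMetrics, depth: usize, out: &mut String) {
        out.push_str(&"  ".repeat(depth));
        out.push_str(&format!(
            "{}: num_rows={} exec_ms={}\n",
            node.name, node.num_rows, node.exec_ms
        ));
        for child in &node.children {
            walk(child, depth + 1, out);
        }
    }

    let mut out = String::new();
    walk(root, 0, &mut out);
    out
}
```

Building a three-level tree (`ProjectionExec` over `RepartitionExec` over `CsvExec`) and passing it to `render_executed_plan` yields the same three indented lines as the `executed_plan` row above.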
**Additional context**
We probably need something like
https://github.com/apache/arrow-datafusion/issues/679 completed prior to doing
this?
cc @Dandandan and @andygrove