alamb opened a new issue #733: URL: https://github.com/apache/arrow-datafusion/issues/733
**Describe the bug**
The output of `EXPLAIN VERBOSE` does not include all the different optimizer passes or the final physical plan.

**To Reproduce**
Run `EXPLAIN VERBOSE SELECT ...`

**Expected behavior**
I expect all the optimizer passes to be shown, as well as the physical plan. In practice, only `projection_push_down` and `simplify_expressions` are shown, even though I know (from putting `println` in the code) that other passes such as `aggregate_statistics` are being run.

**Additional context**
I was working to add some tests in IOx based on explain plans, and I expected to see the results of statistics replacement in the explain plan (that is, I expected to see `count(*)` rewritten to `num_rows` by AggregateStatistics in https://github.com/apache/arrow-datafusion/blob/master/datafusion/src/optimizer/aggregate_statistics.rs#L41).

Not only was the optimizer pass not included in the `EXPLAIN VERBOSE` output, but its results were not reflected in the explain plan.

Here is an example of what came out:

```
: EXPLAIN VERBOSE SELECT count(*) from h2o;
+-----------------------------------------+-----------------------------------------------------------------------------+
| plan_type                               | plan                                                                        |
+-----------------------------------------+-----------------------------------------------------------------------------+
| logical_plan                            | Projection: #COUNT(UInt8(1))                                                |
|                                         |   Aggregate: groupBy=[[]], aggr=[[COUNT(UInt8(1))]]                         |
|                                         |     TableScan: h2o projection=None                                          |
| logical_plan after projection_push_down | Projection: #COUNT(UInt8(1))                                                |
|                                         |   Aggregate: groupBy=[[]], aggr=[[COUNT(UInt8(1))]]                         |
|                                         |     TableScan: h2o projection=Some([0])                                     |
| logical_plan after simplify_expressions | Projection: #COUNT(UInt8(1))                                                |
|                                         |   Aggregate: groupBy=[[]], aggr=[[COUNT(UInt8(1))]]                         |
|                                         |     TableScan: h2o projection=Some([0])                                     |
| physical_plan                           | ProjectionExec: expr=[COUNT(UInt8(1))@0 as COUNT(UInt8(1))]                 |
|                                         |   HashAggregateExec: mode=Final, gby=[], aggr=[COUNT(UInt8(1))]             |
|                                         |     HashAggregateExec: mode=Partial, gby=[], aggr=[COUNT(UInt8(1))]         |
|                                         |       ProjectionExec: expr=[city@0 as city]                                 |
|                                         |         DeduplicateExec: [city@0 ASC,state@1 ASC,time@2 ASC]                |
|                                         |           SortExec: [city@0 ASC,state@1 ASC,time@2 ASC]                     |
|                                         |             IOxReadFilterNode: table_name=h2o, chunks=1 predicate=Predicate |
+-----------------------------------------+-----------------------------------------------------------------------------+
```

Here is what should have happened (note the removal of the actual scan). This is the output I got when I added the call to `optimize_explain` in AggregateStatistics:

```
+-----------------------------------------+-------------------------------------------------------------+
| plan_type                               | plan                                                        |
+-----------------------------------------+-------------------------------------------------------------+
| logical_plan                            | Projection: #COUNT(UInt8(1))                                |
|                                         |   Aggregate: groupBy=[[]], aggr=[[COUNT(UInt8(1))]]         |
|                                         |     TableScan: h2o projection=None                          |
| logical_plan after aggregate_statistics | Projection: #COUNT(UInt8(1))                                |
|                                         |   Projection: UInt64(3) AS COUNT(Uint8(1))                  |
|                                         |     EmptyRelation                                           |
| logical_plan after projection_push_down | Projection: #COUNT(UInt8(1))                                |
|                                         |   Projection: UInt64(3) AS COUNT(Uint8(1))                  |
|                                         |     EmptyRelation                                           |
| logical_plan after simplify_expressions | Projection: #COUNT(UInt8(1))                                |
|                                         |   Projection: UInt64(3) AS COUNT(Uint8(1))                  |
|                                         |     EmptyRelation                                           |
| physical_plan                           | ProjectionExec: expr=[COUNT(UInt8(1))@0 as COUNT(Uint8(1))] |
|                                         |   ProjectionExec: expr=[3 as COUNT(Uint8(1))]               |
|                                         |     EmptyExec: produce_one_row=true                         |
+-----------------------------------------+-------------------------------------------------------------+
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
