alamb opened a new pull request #7959: URL: https://github.com/apache/arrow/pull/7959

To help users and developers understand what DataFusion's planner is doing, this PR adds an `"EXPLAIN PLAN"` feature. All other database systems I have worked with have such a feature (e.g. see [MySQL](https://dev.mysql.com/doc/refman/8.0/en/explain-output.html)).

Example printout (the plans printed are simply the `std::fmt::Debug` representation of the plan structures):

```
> explain SELECT status, COUNT(1) FROM http_api_requests_total WHERE path = '/api/v2/write' GROUP BY status;
+--------------+----------------------------------------------------------+
| plan_type    | plan                                                     |
+--------------+----------------------------------------------------------+
| logical_plan | Aggregate: groupBy=[[#status]], aggr=[[COUNT(UInt8(1))]] |
|              |   Selection: #path Eq Utf8("/api/v2/write")              |
|              |     TableScan: http_api_requests_total projection=None   |
+--------------+----------------------------------------------------------+
1 rows in set. Query took 0 seconds.
```

And an example `EXPLAIN VERBOSE`:

```
> explain verbose SELECT status, COUNT(1) FROM http_api_requests_total WHERE path = '/api/v2/write' GROUP BY status;
+----------------------+----------------------------------------------------------------+
| plan_type            | plan                                                           |
+----------------------+----------------------------------------------------------------+
| logical_plan         | Aggregate: groupBy=[[#status]], aggr=[[COUNT(UInt8(1))]]       |
|                      |   Selection: #path Eq Utf8("/api/v2/write")                    |
|                      |     TableScan: http_api_requests_total projection=None         |
| projection_push_down | Aggregate: groupBy=[[#status]], aggr=[[COUNT(UInt8(1))]]       |
|                      |   Selection: #path Eq Utf8("/api/v2/write")                    |
|                      |     TableScan: http_api_requests_total projection=Some([6, 8]) |
| type_coercion        | Aggregate: groupBy=[[#status]], aggr=[[COUNT(UInt8(1))]]       |
|                      |   Selection: #path Eq Utf8("/api/v2/write")                    |
|                      |     TableScan: http_api_requests_total projection=Some([6, 8]) |
| physical_plan        | HashAggregateExec {                                            |
|                      |     group_expr: [                                              |
|                      |         Column {                                               |
|                      |             name: "status",                                    |
|                      |         },                                                     |
|                      |     ],                                                         |
|                      |     aggr_expr: [                                               |
|                      |         Count {                                                |
|                      |             expr: Literal {                                    |
|                      |                 value: UInt8(                                  |
|                      |                     1,                                         |
|                      |                 ),                                             |
|                      |             },                                                 |
|                      |         },                                                     |
|                      |     ],                                                         |
|                      |     input: SelectionExec {                                     |
|                      |         expr: BinaryExpr {                                     |
|                      |             left: Column {                                     |
|                      |                 name: "path",                                  |
|                      |             },                                                 |
|                      |             op: Eq,                                            |
|                      |             right: Literal {                                   |
|                      |                 value: Utf8(                                   |
|                      |                     "/api/v2/write",                           |
|                      |                 ),                                             |
|                      |             },                                                 |
|                      |         },                                                     |
|                      |         input: DataSourceExec {                                |
|                      |             schema: Schema {                                   |
|                      |                 fields: [                                      |
|                      |                     Field {                                    |
|                      |                         name: "path",                          |
|                      |                         data_type: Utf8,                       |
|                      |                         nullable: true,                        |
|                      |                         dict_id: 0,                            |
|                      |                         dict_is_ordered: false,                |
|                      |                     },                                         |
|                      |                     Field {                                    |
|                      |                         name: "status",                        |
|                      |                         data_type: Utf8,                       |
|                      |                         nullable: true,                        |
|                      |                         dict_id: 0,                            |
|                      |                         dict_is_ordered: false,                |
|                      |                     },                                         |
|                      |                 ],                                             |
|                      |                 metadata: {},                                  |
|                      |             },                                                 |
|                      |             partitions.len: 1,                                 |
|                      |         },                                                     |
|                      |     },                                                         |
|                      |     schema: Schema {                                           |
|                      |         fields: [                                              |
|                      |             Field {                                            |
|                      |                 name: "status",                                |
|                      |                 data_type: Utf8,                               |
|                      |                 nullable: true,                                |
|                      |                 dict_id: 0,                                    |
|                      |                 dict_is_ordered: false,                        |
|                      |             },                                                 |
|                      |             Field {                                            |
|                      |                 name: "COUNT(UInt8(1))",                       |
|                      |                 data_type: UInt64,                             |
|                      |                 nullable: true,                                |
|                      |                 dict_id: 0,                                    |
|                      |                 dict_is_ordered: false,                        |
|                      |             },                                                 |
|                      |         ],                                                     |
|                      |         metadata: {},                                          |
|                      |     },                                                         |
|                      | }                                                              |
+----------------------+----------------------------------------------------------------+
4 row in set. Query took 0 seconds.
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
