alamb opened a new pull request #8619: URL: https://github.com/apache/arrow/pull/8619
# Rationale:
I have been tracking down potential issues in DataFusion for my work project, and I have found myself wanting to print out the state of the logical_plan several times. The existing debug formatting is OK, but it was missing a few key items:
1. Schema information (as in when did columns appear / disappear in the plan)
2. A visual representation (graphviz)

# Open questions:
1. Would it be better to split the visitor into `visitor.rs` and display code into `display.rs`? I am torn -- this is all logically part of logical_plan, but the module is getting kind of big.

# Changes:
This PR adds several additional formatting options to logical plans in addition to the existing indent. Examples are included below.

To do so it also provides a generalized "Visitor" pattern for walking logical plan nodes, as well as a general pattern to display logical plan nodes with multiple potential formats.

Note it should be straightforward to get this wired up into EXPLAIN as well: https://issues.apache.org/jira/browse/ARROW-9746

## Existing Formatting
Here is what master currently allows:

```
Projection: #id
  Filter: #state Eq Utf8("CO")
    CsvScan: employee.csv projection=Some([0, 3])
```

## With Schema Information
This PR adds a dump with schema information:

```
Projection: #id [id:Int32]
  Filter: #state Eq Utf8("CO") [id:Int32, state:Utf8]
    TableScan: employee.csv projection=Some([0, 3]) [id:Int32, state:Utf8]
```

## As Graphviz
This PR adds the ability to display plans using [Graphviz](http://www.graphviz.org).

Here is an example Graphviz plan that comes out:

```
// Begin DataFusion GraphViz Plan (see https://graphviz.org)
digraph {
  subgraph cluster_1
  {
    graph[label="LogicalPlan"]
    2[label="Projection: #id"]
    3[label="Filter: #state Eq Utf8(_CO_)"]
    2 -> 3 [arrowhead=none, arrowtail=normal, dir=back]
    4[label="TableScan: employee.csv projection=Some([0, 3])"]
    3 -> 4 [arrowhead=none, arrowtail=normal, dir=back]
  }
  subgraph cluster_5
  {
    graph[label="Detailed LogicalPlan"]
    6[label="Projection: #id\nSchema: [id:Int32]"]
    7[label="Filter: #state Eq Utf8(_CO_)\nSchema: [id:Int32, state:Utf8]"]
    6 -> 7 [arrowhead=none, arrowtail=normal, dir=back]
    8[label="TableScan: employee.csv projection=Some([0, 3])\nSchema: [id:Int32, state:Utf8]"]
    7 -> 8 [arrowhead=none, arrowtail=normal, dir=back]
  }
}
// End DataFusion GraphViz Plan
```

Here is what that looks like rendered:

<img width="1679" alt="Screen Shot 2020-11-09 at 2 30 07 PM" src="https://user-images.githubusercontent.com/490673/98606322-0f891880-22b5-11eb-8e1c-669ce85f0f52.png">
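
## Sketch of the visitor idea
To make the "Visitor" plus multiple-formats idea concrete, here is a minimal, self-contained sketch. It is not the code in this PR: `PlanNode`, `PlanVisitor`, `IndentVisitor`, `GraphvizVisitor`, `display_indent`, `display_graphviz` and `example_plan` are all hypothetical names made up for illustration, and the plan type is a simplified stand-in for `LogicalPlan`. The Graphviz output is also simplified (a single `digraph` with no `subgraph cluster_N` grouping).

```rust
/// Hypothetical, simplified stand-in for a logical plan node (not DataFusion's type).
pub struct PlanNode {
    pub label: String,                             // e.g. "Projection: #id"
    pub schema: Vec<(&'static str, &'static str)>, // (column name, type name)
    pub inputs: Vec<PlanNode>,
}

/// Hypothetical visitor trait: one callback before and one after a node's inputs.
pub trait PlanVisitor {
    fn pre_visit(&mut self, node: &PlanNode);
    fn post_visit(&mut self, node: &PlanNode);
}

impl PlanNode {
    /// Walks the tree depth-first, calling the visitor around each node's inputs.
    pub fn accept<V: PlanVisitor>(&self, visitor: &mut V) {
        visitor.pre_visit(self);
        for input in &self.inputs {
            input.accept(visitor);
        }
        visitor.post_visit(self);
    }
}

/// Renders a schema as `[name:Type, ...]`.
fn schema_string(node: &PlanNode) -> String {
    let fields: Vec<String> = node.schema.iter().map(|(n, t)| format!("{}:{}", n, t)).collect();
    format!("[{}]", fields.join(", "))
}

/// Indented text format, optionally with schema information on each line.
pub struct IndentVisitor { with_schema: bool, depth: usize, out: String }

impl PlanVisitor for IndentVisitor {
    fn pre_visit(&mut self, node: &PlanNode) {
        if !self.out.is_empty() {
            self.out.push('\n');
        }
        self.out.push_str(&"  ".repeat(self.depth)); // two spaces per level
        self.out.push_str(&node.label);
        if self.with_schema {
            self.out.push_str(&format!(" {}", schema_string(node)));
        }
        self.depth += 1;
    }
    fn post_visit(&mut self, _node: &PlanNode) {
        self.depth -= 1;
    }
}

/// Graphviz format: numbers each node and links it to its parent.
pub struct GraphvizVisitor { next_id: usize, parents: Vec<usize>, out: String }

impl PlanVisitor for GraphvizVisitor {
    fn pre_visit(&mut self, node: &PlanNode) {
        let id = self.next_id;
        self.next_id += 1;
        // Labels are double-quoted in DOT, so embedded quotes are replaced with '_'.
        let label = node.label.replace('"', "_");
        self.out.push_str(&format!("  {}[label=\"{}\"]\n", id, label));
        if let Some(parent) = self.parents.last() {
            self.out.push_str(&format!(
                "  {} -> {} [arrowhead=none, arrowtail=normal, dir=back]\n", parent, id));
        }
        self.parents.push(id);
    }
    fn post_visit(&mut self, _node: &PlanNode) {
        self.parents.pop();
    }
}

pub fn display_indent(plan: &PlanNode, with_schema: bool) -> String {
    let mut v = IndentVisitor { with_schema, depth: 0, out: String::new() };
    plan.accept(&mut v);
    v.out
}

pub fn display_graphviz(plan: &PlanNode) -> String {
    let mut v = GraphvizVisitor { next_id: 1, parents: vec![], out: String::from("digraph {\n") };
    plan.accept(&mut v);
    v.out.push_str("}\n");
    v.out
}

/// Builds the three-node plan used in the examples of this description.
pub fn example_plan() -> PlanNode {
    let scan = PlanNode {
        label: "TableScan: employee.csv projection=Some([0, 3])".to_string(),
        schema: vec![("id", "Int32"), ("state", "Utf8")],
        inputs: vec![],
    };
    let filter = PlanNode {
        label: "Filter: #state Eq Utf8(\"CO\")".to_string(),
        schema: vec![("id", "Int32"), ("state", "Utf8")],
        inputs: vec![scan],
    };
    PlanNode {
        label: "Projection: #id".to_string(),
        schema: vec![("id", "Int32")],
        inputs: vec![filter],
    }
}
```

In this sketch, `display_indent(&example_plan(), true)` walks the tree once and produces the indented text with schemas, while `display_graphviz(&example_plan())` reuses the same walk with a different visitor. Keeping the traversal in `accept` and the formatting in the visitors is what lets a new output format be added without touching the plan type.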