houqp commented on pull request #1067:
URL: https://github.com/apache/arrow-datafusion/pull/1067#issuecomment-933018902


   > 
https://github.com/apache/arrow-datafusion/blob/master/benchmarks/src/bin/tpch.rs#L1092
 is ok and get a plan, the plan is correct?
   
   Yes, i think the logical plan you got there is correct.
   
   > Which part makes Logical Plan Proto to Logical Plan get errors, now I 
don't know how to fix it.
   
   I believe this is due to the alias patching is done at the sql planner 
layer, not the plan builder layer. We only have this alias info in the sql 
planner right now (parsed from the sql query). When you patch the projection in 
the SQL planner with the alias, this alias info is lost in the logical plan. In 
other words, it's not possible to reconstruct the same alias patched projection 
plan node using only the information provided by the logical plan tree.
   
   If you look at the children of the patched projection, there is no 
mentioning of that subquery alias b at all.
   
   ```
   |               |     TableScan: a projection=Some([0, 1])                   
                                                                                
                                              |
   |               |     Projection: #customer.column_1, #customer.column_2     
                                                                                
                                              |
   |               |       TableScan: customer projection=Some([0, 1])     
   ```
   
   In order to make the full plan serializable without access to the raw SQL 
query, we need to add the subquery alias to the logical plan tree as well. 
During protobuf plan ser/de in ballista, we don't pass along the SQL query, but 
only the planned logical plan from the SQL planner.
   
   I can see two ways to accomplish this:
   
   First approach is to add an optional alias field to our projection plan node 
similar to what we have with union:
   
   
https://github.com/apache/arrow-datafusion/blob/2f04d67156ec91afa628a3ed47003b8f992450bf/datafusion/src/logical_plan/plan.rs#L160
   
   Then we can perform the schema qualifier patch in the plan builder's project 
method similar to what we do with union alias:
   
   
https://github.com/apache/arrow-datafusion/blob/2f04d67156ec91afa628a3ed47003b8f992450bf/datafusion/src/logical_plan/builder.rs#L584
   
   Second approach is to introduce a new type of Alias plan node that we can 
use to wrap any plan node to perform this qualifier patching logic.
   
   I think adding an alias field to the projection plan node would be simpler. 
@alamb @Dandandan @jorgecarleitao @andygrove WDTY?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to