xudong963 commented on issue #1228:
URL: https://github.com/apache/arrow-datafusion/issues/1228#issuecomment-969443381


   Hi @rdettai, long time no see. Thanks for your feedback.
   
   Now, this is not just a question of subqueries, but a question of which 
design is better.
   
   During my implementation of subqueries, I encountered some obstacles caused by the current pattern:
   
   ```rust
   pub enum LogicalPlan {
      ...
       TableScan {
           /// The name of the table
           table_name: String,
           /// The source of the table
           source: Arc<dyn TableProvider>,
           /// Optional column indices to use as a projection
           projection: Option<Vec<usize>>,
           /// The schema description of the output
           projected_schema: DFSchemaRef,
           /// Optional expressions to be used as filters by the table provider
           filters: Vec<Expr>,
           /// Optional limit to skip reading
           limit: Option<usize>,
       },
      ...
   }
   ```
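   To make the obstacle concrete, here is a hypothetical, simplified sketch (the types are stand-ins, not the real DataFusion definitions): because `LogicalPlan::TableScan` is not a type, a helper must either accept every field separately or accept the whole `LogicalPlan` and destructure it again.
    ```rust
    use std::sync::Arc;

    // Hypothetical stand-ins for the real DataFusion types.
    struct DFSchema;
    type DFSchemaRef = Arc<DFSchema>;

    enum LogicalPlan {
        TableScan {
            table_name: String,
            projection: Option<Vec<usize>>,
            projected_schema: DFSchemaRef,
        },
    }

    // Without a struct for the variant, a helper must accept each field...
    fn describe_scan(table_name: &str, projection: &Option<Vec<usize>>) -> String {
        format!("{}: {:?}", table_name, projection)
    }

    // ...or accept the whole plan and destructure it yet again.
    fn describe(plan: &LogicalPlan) -> String {
        match plan {
            LogicalPlan::TableScan { table_name, projection, .. } => {
                describe_scan(table_name, projection)
            }
        }
    }
    ```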
   So I started to think about the disadvantages of this pattern.
   
   First of all, I think the following pattern is more common in Rust; see, for example, 
https://github.com/rust-lang/rust/blob/ad808d95c4839caedc2be76d0ed059dc920ab4b6/src/librustc/hir/mod.rs#L2752
   ```rust
   pub struct TableScanPlan {
       /// The name of the table
       table_name: String,
       /// The source of the table
       source: Arc<dyn TableProvider>,
       /// Optional column indices to use as a projection
       projection: Option<Vec<usize>>,
       /// The schema description of the output
       projected_schema: DFSchemaRef,
       /// Optional expressions to be used as filters by the table provider
       filters: Vec<Expr>,
       /// Optional limit to skip reading
       limit: Option<usize>,
   }
   
   pub enum LogicalPlan {
     ...
     TableScan(TableScanPlan),
     ...
   }
   ```
   
   In addition to the above, it has some other benefits:
   
   1. It's clearer and more flexible; thinking long term, we may profit from 
extracting the other `LogicalPlan` variants into structs as well.
   2. Enum variants are not types, so `LogicalPlan::TableScan` can't be used as a 
function parameter or stored on its own; you have to pass the whole 
`LogicalPlan`. But if each variant wraps its own struct, a `TableScanPlan` can 
be passed and stored directly.
   3. When matching, the two designs above lead to the following styles:
       ```rust
       match plan {
           LogicalPlan::TableScan { table_name, source, projection, projected_schema, filters, .. } =>
               /* some actions */
       }
       ```
    
       ```rust
       match plan {
           LogicalPlan::TableScan(ts) => /* ts.table_name, ts.source, ... */
       }
       ```
   
   4. If multiple enums share the same variant, the first pattern forces us to 
repeat the variant with all of its fields in every enum, while a shared struct 
can simply be reused.
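   As a hypothetical sketch of benefit 2 (again with simplified stand-in types, not the real DataFusion definitions), the struct-wrapped variant becomes a value in its own right that can be returned, passed, and stored:
    ```rust
    // Hypothetical, simplified stand-ins for the real DataFusion types.
    struct TableScanPlan {
        table_name: String,
        projection: Option<Vec<usize>>,
    }

    enum LogicalPlan {
        TableScan(TableScanPlan),
    }

    // The variant's payload is a real type, so it can be returned...
    fn as_table_scan(plan: &LogicalPlan) -> Option<&TableScanPlan> {
        match plan {
            LogicalPlan::TableScan(ts) => Some(ts),
        }
    }

    // ...and accepted as a parameter, with no re-matching needed.
    fn scan_description(scan: &TableScanPlan) -> String {
        format!(
            "scan of {} ({} columns projected)",
            scan.table_name,
            scan.projection.as_ref().map_or(0, |p| p.len())
        )
    }
    ```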
   
   Sorry, I haven't linked all the subtasks to an umbrella issue; I'll do that now.
   
   cc @alamb @houqp  and welcome everyone to discuss!

