xudong963 edited a comment on issue #1228:
URL:
https://github.com/apache/arrow-datafusion/issues/1228#issuecomment-969443381
Hi @rdettai, long time no see. Thanks for your feedback.
Now, this is not just a question of subqueries, but a question of which
design is better.
During my implementation of subqueries, I encountered some obstacles due to
the current pattern.
```rust
pub enum LogicalPlan {
...
TableScan {
/// The name of the table
table_name: String,
/// The source of the table
source: Arc<dyn TableProvider>,
/// Optional column indices to use as a projection
projection: Option<Vec<usize>>,
/// The schema description of the output
projected_schema: DFSchemaRef,
/// Optional expressions to be used as filters by the table provider
filters: Vec<Expr>,
/// Optional limit to skip reading
limit: Option<usize>,
},
...
}
```
So I started to think about the disadvantages of this pattern.
First of all, I think the following pattern is more common in Rust, such as
https://github.com/rust-lang/rust/blob/ad808d95c4839caedc2be76d0ed059dc920ab4b6/src/librustc/hir/mod.rs#L2752
```rust
pub struct TableScanPlan {
/// The name of the table
table_name: String,
/// The source of the table
source: Arc<dyn TableProvider>,
/// Optional column indices to use as a projection
projection: Option<Vec<usize>>,
/// The schema description of the output
projected_schema: DFSchemaRef,
/// Optional expressions to be used as filters by the table provider
filters: Vec<Expr>,
/// Optional limit to skip reading
limit: Option<usize>,
}
pub enum LogicalPlan {
...
TableScan(TableScanPlan),
...
}
```
In addition to the above, it has some other benefits:
1. It's more clearer and flexiable, long-term thinking, maybe we can profit
from extracting LogicalPlan into structs.
2. Enum variants are not types, so `LogicalPlan::TableScan` can't be a
function parameter or stored, you need to pass `Logical Plan`. But if each enum
variant has its own type, `TableScan` can be passed and stored.
3. When a match occurs, the following are the match styles corresponding to
the above two writing methods
```rust
match LogicalPlan {
TableScan {table_name, source, projection, projected_schema, filters,
...} => /* some actions */
}
```
```rust
match LogicalPlan {
TableScan(ts) => /* ts.table_name, ts.source, ... */
}
```
4. If multiple enums have the same variants, we need to repeat to write the
variant with all its fields when using the first pattern.
Sorry, I don't link all the subtasks to an umbrella issue, I'll do it now.
above all, I think this is a meaningful and valuable refactoring, despite
major changes to the codebase.
cc @alamb @houqp and welcome everyone to discuss!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]