adriangb opened a new pull request, #20091: URL: https://github.com/apache/datafusion/pull/20091
## Motivation

Currently, `TableScan` stores projections as column indices (`Option<Vec<usize>>`), which requires constant conversion between indices and expressions throughout the codebase. By storing expressions directly, we:

1. **Simplify the data model** - projections are naturally expressions
2. **Enable future consolidation** of filter expressions into projections
3. **Reduce conversion overhead** in optimization passes

This is the first of two PRs splitting PR #20061, which consolidates both projections and filters. This PR focuses solely on the projection type change to make the refactoring easier to review.

## Changes

### Core Type Change

- Changed `TableScan.projection` from `Option<Vec<usize>>` to `Option<Vec<Expr>>`
- Each expression is a simple `Expr::Column` reference

### New APIs

- Added `TableScanBuilder` for constructing `TableScan` nodes with expression-based projections directly
- Added `projection_indices_from_exprs()` helper in `utils.rs` to convert expressions back to indices when needed (for physical planning and serialization)

### Backward Compatibility

- `TableScan::try_new()` still accepts `Option<Vec<usize>>` and converts indices to expressions internally
- `LogicalPlanBuilder::scan*` methods still accept indices for backward compatibility

### Updated Components

- **optimize_projections**: Updated to work with expression-based projections using `TableScanBuilder`
- **physical_planner**: Converts expressions back to indices for `ScanArgs`
- **proto serialization**: Extracts column names from expressions for serialization
- **substrait**: Converts between expressions and Substrait field indices
- **SQL unparser**: Extracts column names from projection expressions

## Related Issues / PRs

- Split from #20061 (which consolidates both projections and filters)
- Enables future work on filter consolidation in TableScan

## Test Plan

- [x] `cargo check` passes for all affected crates
- [x] `cargo test -p datafusion-expr` passes
- [x] `cargo test -p datafusion-optimizer` passes
- [x] Proto and substrait crates compile successfully

🤖 Generated with [Claude Code](https://claude.ai/code)
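## Illustrative Sketch

The index-to-expression conversion kept for backward compatibility, and the reverse conversion needed for physical planning and serialization, might look roughly like the following. This is a minimal, self-contained sketch, not the code in this PR: the `Expr` enum here is a simplified stand-in for DataFusion's `Expr`, the schema is reduced to a slice of column names, and the function names `exprs_from_indices` and `indices_from_exprs` are hypothetical (the real helper is `projection_indices_from_exprs()`, whose signature is not shown in this description).

```rust
// Simplified stand-in for illustration only; NOT DataFusion's real `Expr`.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// A plain column reference, identified by name.
    Column(String),
    /// Any non-column expression; present only to show the error path.
    Literal(i64),
}

/// Hypothetical helper: turn index-based projections into column expressions,
/// as a backward-compatible constructor could do internally.
#[allow(dead_code)]
fn exprs_from_indices(schema: &[&str], indices: &[usize]) -> Result<Vec<Expr>, String> {
    indices
        .iter()
        .map(|&i| {
            schema
                .get(i)
                .map(|name| Expr::Column(name.to_string()))
                .ok_or_else(|| {
                    format!(
                        "projection index {} out of bounds for {} columns",
                        i,
                        schema.len()
                    )
                })
        })
        .collect()
}

/// Hypothetical helper: recover indices from column expressions, as a
/// consumer that still needs indices (e.g. a physical planner) would.
#[allow(dead_code)]
fn indices_from_exprs(schema: &[&str], exprs: &[Expr]) -> Result<Vec<usize>, String> {
    exprs
        .iter()
        .map(|expr| match expr {
            // Look the column name up in the schema to get its position.
            Expr::Column(name) => schema
                .iter()
                .position(|col| *col == name.as_str())
                .ok_or_else(|| format!("column '{}' not found in schema", name)),
            // Anything other than a column reference cannot map to an index.
            other => Err(format!("expected a column reference, got {:?}", other)),
        })
        .collect()
}
```

Under these assumptions, converting indices to expressions and back is lossless as long as every projection is a plain column reference, which is the invariant stated under Core Type Change. The error path for non-column expressions marks where such a helper would need to change once filters are consolidated into projections.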
