adriangb opened a new pull request, #20091:
URL: https://github.com/apache/datafusion/pull/20091

   ## Motivation
   
   Currently, `TableScan` stores projections as column indices (`Option<Vec<usize>>`), which forces repeated conversion between indices and expressions throughout the codebase. By storing expressions directly, we:
   
   1. **Simplify the data model** - projections are naturally expressions
   2. **Enable future consolidation** of filter expressions into projections  
   3. **Reduce conversion overhead** in optimization passes
   
   This is the first of two PRs splitting PR #20061, which consolidates both 
projections and filters. This PR focuses solely on the projection type change 
to make the refactoring easier to review.
   
   ## Changes
   
   ### Core Type Change
   - Changed `TableScan.projection` from `Option<Vec<usize>>` to 
`Option<Vec<Expr>>`
   - Each expression is a simple `Expr::Column` reference
   
   ### New APIs
   - Added `TableScanBuilder` for constructing `TableScan` nodes with 
expression-based projections directly
   - Added `projection_indices_from_exprs()` helper in `utils.rs` to convert 
expressions back to indices when needed (for physical planning and 
serialization)
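
As a rough illustration of the expression-to-index direction, the sketch below resolves column expressions against a schema by name. It uses simplified stand-in types: the `Expr` enum, the `&[&str]` schema, and the `indices_from_exprs` function are all hypothetical and are not DataFusion's actual types or the real signature of `projection_indices_from_exprs()`.

```rust
// Simplified stand-ins for illustration only; NOT DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
}

/// Hypothetical helper: map each column expression to the position of the
/// matching field in `schema`. Fails if an expression is not a plain column
/// reference or names a field that the schema does not contain.
fn indices_from_exprs(exprs: &[Expr], schema: &[&str]) -> Result<Vec<usize>, String> {
    exprs
        .iter()
        .map(|expr| match expr {
            Expr::Column(name) => schema
                .iter()
                .position(|field| *field == name.as_str())
                .ok_or_else(|| format!("column '{name}' not found in schema")),
            other => Err(format!("expected a column reference, got {other:?}")),
        })
        .collect()
}
```

Returning an error for non-column expressions is one possible design choice for this sketch; it makes the index-based consumers (physical planning, serialization) fail loudly if a projection ever stops being a list of plain column references.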
   
   ### Backward Compatibility
   - `TableScan::try_new()` still accepts `Option<Vec<usize>>` and converts 
indices to expressions internally
   - `LogicalPlanBuilder::scan*` methods still accept indices for backward 
compatibility
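
The index-to-expression conversion that keeps the old entry points working can be pictured roughly as follows. This is a minimal sketch with stand-in types: `Expr`, the `&[&str]` schema, `exprs_from_indices`, and `normalize_projection` are hypothetical names for illustration, not the code in `TableScan::try_new()`.

```rust
// Simplified stand-in for illustration only; NOT DataFusion's real `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
}

/// Hypothetical helper: turn projection indices into column expressions by
/// looking up each index in `schema`. Fails on an out-of-bounds index.
fn exprs_from_indices(indices: &[usize], schema: &[&str]) -> Result<Vec<Expr>, String> {
    indices
        .iter()
        .map(|&i| {
            schema
                .get(i)
                .map(|name| Expr::Column((*name).to_string()))
                .ok_or_else(|| {
                    format!("projection index {i} out of bounds for {} fields", schema.len())
                })
        })
        .collect()
}

/// Hypothetical wrapper mirroring an index-based constructor: `None` (no
/// projection) stays `None`; `Some(indices)` becomes `Some(expressions)`.
fn normalize_projection(
    projection: Option<&[usize]>,
    schema: &[&str],
) -> Result<Option<Vec<Expr>>, String> {
    projection
        .map(|indices| exprs_from_indices(indices, schema))
        .transpose()
}
```

With a schema of `["id", "name", "age"]`, the indices `[2, 0]` would become column expressions for `age` and `id`, in that order, so existing callers can keep passing indices while the stored projection is expression-based.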
   
   ### Updated Components
   - **optimize_projections**: Updated to work with expression-based 
projections using `TableScanBuilder`
   - **physical_planner**: Converts expressions back to indices for `ScanArgs`
   - **proto serialization**: Extracts column names from expressions for 
serialization
   - **substrait**: Converts between expressions and Substrait field indices
   - **SQL unparser**: Extracts column names from projection expressions
   
   ## Related Issues / PRs
   
   - Split from #20061 (which consolidates both projections and filters)
   - Enables future work on filter consolidation in `TableScan`
   
   ## Test Plan
   
   - [x] `cargo check` passes for all affected crates
   - [x] `cargo test -p datafusion-expr` passes
   - [x] `cargo test -p datafusion-optimizer` passes  
   - [x] Proto and substrait crates compile successfully
   
   🤖 Generated with [Claude Code](https://claude.ai/code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

