adriangb opened a new pull request, #18719:
URL: https://github.com/apache/datafusion/pull/18719
## Summary
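This PR improves projection handling in physical-expr: it adds trait
implementations and a `project_batch()` method, fixes a bug in
`update_expr()`, and changes the `from_indices()` signature. These changes
are needed for better projection management in datasources.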
## Changes
1. **Add trait implementations**:
- Added `PartialEq` and `Eq` for `ProjectionExpr`
- Added `PartialEq` and `Eq` for `ProjectionExprs`
2. **Add `project_batch()` method**:
- Projects a `RecordBatch` using a pre-computed schema
- Handles empty projections correctly
- Reduces schema projection overhead for repeated calls
3. **Fix `update_expr()` bug**:
- **Bug**: Previously returned `None` for literal expressions (no column
references)
- **Fix**: Now returns `Some(expr)` for both `Unchanged` and
`RewrittenValid` states
- **Impact**: Critical for queries like `SELECT 1 FROM table` where no
file columns are needed
4. **Change `from_indices()` signature**:
- Changed from `&SchemaRef` to `&Schema` for consistency
5. **Add comprehensive tests**:
- `test_merge_empty_projection_with_literal()` - Reproduces the roundtrip
issue
- `test_update_expr_with_literal()` - Tests literal handling
- `test_update_expr_with_complex_literal_expr()` - Tests mixed expressions
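
To illustrate the `update_expr()` fix from item 3, here is a minimal, self-contained sketch. It is a toy model and not the DataFusion implementation: the `Expr` enum, the `rewrite` helper, and the `RewrittenInvalid` variant are assumptions made for illustration. Only the `Unchanged` and `RewrittenValid` state names and the `Some`/`None` behavior come from the description above.

```rust
// Toy stand-in for a physical expression (hypothetical, not DataFusion's type).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(usize),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
}

// Tracks what happened while rewriting column references.
// `RewrittenInvalid` is an assumed name for the failure case.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RewriteState {
    Unchanged,
    RewrittenValid,
    RewrittenInvalid,
}

// Replace each `Column(i)` with the i-th projected expression (hypothetical helper).
fn rewrite(expr: &Expr, projected: &[Expr], state: &mut RewriteState) -> Expr {
    match expr {
        Expr::Column(i) => match projected.get(*i) {
            Some(replacement) => {
                // Only upgrade from `Unchanged`; an invalid state stays invalid.
                if *state == RewriteState::Unchanged {
                    *state = RewriteState::RewrittenValid;
                }
                replacement.clone()
            }
            None => {
                *state = RewriteState::RewrittenInvalid;
                expr.clone()
            }
        },
        // Literals reference no columns, so the state is left untouched.
        Expr::Literal(_) => expr.clone(),
        Expr::Add(l, r) => Expr::Add(
            Box::new(rewrite(l, projected, state)),
            Box::new(rewrite(r, projected, state)),
        ),
    }
}

// Sketch of the fixed behavior: both `Unchanged` and `RewrittenValid` yield
// `Some(expr)`. Returning `Some` only for `RewrittenValid` would drop
// literal-only expressions, which is the bug described above.
fn update_expr(expr: &Expr, projected: &[Expr]) -> Option<Expr> {
    let mut state = RewriteState::Unchanged;
    let rewritten = rewrite(expr, projected, &mut state);
    match state {
        RewriteState::Unchanged | RewriteState::RewrittenValid => Some(rewritten),
        RewriteState::RewrittenInvalid => None,
    }
}
```

In this model, `update_expr(&Expr::Literal(1), &[])` returns `Some(Expr::Literal(1))`, which mirrors the `SELECT 1 FROM table` case where the projection needs no file columns.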
## Part of
This PR is part of #18627 - a larger effort to refactor projection handling
in DataFusion.
## Testing
All tests pass:
- ✅ New projection tests
- ✅ Existing physical-expr test suite
- ✅ Doc tests
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]