andygrove opened a new pull request, #3464:
URL: https://github.com/apache/datafusion-comet/pull/3464

   ## Summary
   
   This is a **work-in-progress** draft PR for testing fixes in CI. It builds 
on the DataFusion 52 migration work in PR #3052 with fixes for failing 
Rust-level tests.
   
   ### Fixes included:
   - **Date32 +/- Int8/Int16/Int32 arithmetic**: Use 
`SparkDateAdd`/`SparkDateSub` UDFs, since DF52's arrow-arith supports only 
`Date32 +/- Interval` arithmetic, not raw integer operands
   - **Schema adapter nested types**: Replace `equals_datatype` with 
`PartialEq` (`==`) so struct field name differences are detected and 
`spark_parquet_convert` is invoked for field-name-based selection
   - **Schema adapter complex nested casts**: Add fallback path 
(`wrap_all_type_mismatches`) when the default adapter fails for complex nested 
type casts (List<Struct>, Map)
   - **Schema adapter CastColumnExpr replacement**: Route Struct/List/Map casts 
through `CometCastColumnExpr` with `spark_parquet_convert`, and route simple 
scalars through Spark Cast
   - **Dictionary unpack tests**: Restructure polling to handle DF52's 
`FilterExec` batch coalescer, which accumulates rows before returning output
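
For context on the Date32 item above: an Arrow `Date32` value is a count of days since the Unix epoch, so adding or subtracting an integer number of days is plain integer arithmetic once the operand is widened to `i32`. The sketch below is only a toy model of that idea. `Date32`, `date_add_days`, and `date_sub_days` are hypothetical names for illustration, not Comet's `SparkDateAdd`/`SparkDateSub`, and returning `None` on overflow is an assumption of this sketch rather than a statement about Spark's behavior.

```rust
/// Toy model of an Arrow `Date32` value: days since the Unix epoch (1970-01-01).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Date32(i32);

/// Hypothetical helper: add a day count held in any integer type that widens
/// losslessly to i32 (i8, i16, i32).
/// Returning None on i32 overflow is an assumption of this sketch.
fn date_add_days<T: Into<i32>>(date: Date32, days: T) -> Option<Date32> {
    date.0.checked_add(days.into()).map(Date32)
}

/// Hypothetical helper: subtract a day count, with the same widening and
/// overflow policy as `date_add_days`.
fn date_sub_days<T: Into<i32>>(date: Date32, days: T) -> Option<Date32> {
    date.0.checked_sub(days.into()).map(Date32)
}
```

With these definitions, `date_add_days(Date32(19_000), 30i8)` evaluates to `Some(Date32(19_030))`, and the same call shape works for `i16` and `i32` day counts.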
   
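To illustrate the nested-types item above, here is a minimal sketch of why a name-insensitive comparison can hide a struct mismatch that `==` catches. `ToyType`, `equals_ignoring_names`, and `two_field_struct` are hypothetical stand-ins written for this example. They are not arrow's `DataType` or its `equals_datatype` method, and they only approximate the behavior described in the list.

```rust
/// Toy stand-in for a nested data type; NOT arrow's `DataType`.
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Int32,
    Utf8,
    /// Ordered (field name, field type) pairs.
    Struct(Vec<(String, ToyType)>),
}

/// Structural comparison that ignores struct field names, so two structs with
/// the same field types in the same order compare as equal.
fn equals_ignoring_names(a: &ToyType, b: &ToyType) -> bool {
    match (a, b) {
        (ToyType::Struct(fa), ToyType::Struct(fb)) => {
            fa.len() == fb.len()
                && fa
                    .iter()
                    .zip(fb)
                    .all(|((_, ta), (_, tb))| equals_ignoring_names(ta, tb))
        }
        // Non-struct types fall back to ordinary equality.
        _ => a == b,
    }
}

/// Build a two-field struct (Int32, Utf8) with the given field names.
fn two_field_struct(first: &str, second: &str) -> ToyType {
    ToyType::Struct(vec![
        (first.to_string(), ToyType::Int32),
        (second.to_string(), ToyType::Utf8),
    ])
}
```

For `two_field_struct("a", "b")` and `two_field_struct("x", "y")`, `equals_ignoring_names` returns `true` while the derived `==` returns `false`. Only the second comparison would signal that a name-based conversion step is needed.
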
   ### Known remaining issues (Spark-level test failures being investigated):
   - **SortMergeJoin LeftSemi duplicate rows**: DF52 had a major SMJ rewrite 
(datafusion#18875) that may cause duplicate rows in LeftSemi joins
   - **NaN predicate expression**: `SELECT c1 FROM test_table WHERE c3 >= NaN` 
returns 0 rows instead of the expected 1 row
   
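As background for the NaN item above: Spark SQL documents NaN as equal to itself and larger than any other numeric value, whereas IEEE 754 comparisons involving NaN are always false. The sketch below contrasts the two for a `>=` predicate. `ge_nan_largest` and `count_matching` are hypothetical helpers for illustration, and the sample column is invented. This shows why one row might be expected. It is not a diagnosis of the failure.

```rust
/// `lhs >= rhs` where NaN equals NaN and NaN is larger than every other value
/// (the ordering Spark SQL documents for NaN).
fn ge_nan_largest(lhs: f64, rhs: f64) -> bool {
    match (lhs.is_nan(), rhs.is_nan()) {
        (true, _) => true,       // NaN >= anything, including NaN
        (false, true) => false,  // no non-NaN value reaches NaN
        (false, false) => lhs >= rhs,
    }
}

/// Count the values in `column` that satisfy `pred(value, threshold)`.
fn count_matching(column: &[f64], threshold: f64, pred: fn(f64, f64) -> bool) -> usize {
    column.iter().filter(|&&v| pred(v, threshold)).count()
}

/// Plain IEEE 754 comparison, for contrast: false whenever either side is NaN.
fn ge_ieee(lhs: f64, rhs: f64) -> bool {
    lhs >= rhs
}
```

On an invented column `[1.0, f64::NAN, 3.0]` with a NaN threshold, `count_matching` with `ge_ieee` finds no rows, while `ge_nan_largest` finds the single NaN row.
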
   ### Relationship to #3052
   This branch is based on `comphead/df52` with `apache/main` merged in. The 
fixes here address Rust-level test failures discovered during the DF52 
migration.
   
   ## Test plan
   - [ ] CI passes for Rust-level tests (140 native tests)
   - [ ] Investigate and fix remaining Spark-level test failures
   - [ ] Verify no regressions in existing test suites
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

