andygrove opened a new pull request, #3675: URL: https://github.com/apache/datafusion-comet/pull/3675
## Which issue does this PR close?

Related to: #3645, #3644, #3346, #3332, #3330, #3180, #3173, #3016, #2649, #2646, #1897, #1729, #1630

## Rationale for this change

Several expressions are currently marked as Compatible (Spark-compatible) but have open correctness issues that can produce incorrect results. These expressions should fall back to Spark by default to prevent silent data corruption, and can be explicitly enabled by users who understand the trade-offs via `allowIncompatible=true`.

## What changes are included in this PR?

**Expressions marked as Incompatible (9 expressions across 6 serde files):**

| Expression | Issue(s) | Condition |
|---|---|---|
| `ArrayContains` | #3346 | Always (empty array with literal) |
| `GetArrayItem` | #3330, #3332 | Always (index handling bugs) |
| `ArrayRemove` | #3173 | Always (null element removal) |
| `Hour`, `Minute`, `Second` | #3180 | Only for TimestampNTZ inputs |
| `TruncTimestamp` | #2649 | Only for non-UTC timezones |
| `Ceil`, `Floor` | #1729 | Only for Decimal type inputs |
| `Tan` | #1897 | Always (negative zero) |
| `Corr` | #2646 | Always (null vs NaN) |
| `StructsToJson` | #3016 | Always (Infinity values) |

Where possible, the incompatibility is conditional on the specific input type that triggers the bug (e.g., Hour/Minute/Second are only incompatible for TimestampNTZ, Ceil/Floor only for Decimal, TruncTimestamp only for non-UTC timezones).

**Documentation updates:**

- `expressions.md`: Updated Spark-Compatible status from "Yes" to "No" for all affected expressions, with compatibility notes linking to tracking issues
- `compatibility.md`: Added detailed "Incompatible Expressions" subsections organized by category (Array, Date/Time, Math, Aggregate, Struct) with descriptions and issue links

## How are these changes tested?

These changes only add `getSupportLevel` overrides (which cause expressions to fall back to Spark by default) and update documentation.
The existing test suite covers the fallback mechanism. No new behavioral logic is introduced.
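
The conditional fallback in the PR's Condition column can be modeled roughly as below. This is a minimal Python sketch for illustration, not Comet's serde code: `Expr`, `SupportLevel`, `support_level`, `runs_natively`, and the type-name strings are all hypothetical, and the `allow_incompatible` argument only stands in for the `allowIncompatible=true` setting mentioned in the rationale.

```python
from dataclasses import dataclass
from enum import Enum


class SupportLevel(Enum):
    COMPATIBLE = "compatible"
    INCOMPATIBLE = "incompatible"


@dataclass(frozen=True)
class Expr:
    """Hypothetical stand-in for a Spark expression (not a Comet or Spark class)."""
    name: str
    input_type: str = ""   # e.g. "TimestampNTZ", "Decimal", "Double"
    timezone: str = "UTC"  # only consulted for TruncTimestamp


# Expressions the PR marks Incompatible regardless of input.
ALWAYS_INCOMPATIBLE = {
    "ArrayContains", "GetArrayItem", "ArrayRemove", "Tan", "Corr", "StructsToJson",
}


def support_level(expr: Expr) -> SupportLevel:
    """Mirror the Condition column of the PR's table.

    Expressions not listed in the table are treated as Compatible purely
    to keep the sketch short.
    """
    if expr.name in ALWAYS_INCOMPATIBLE:
        return SupportLevel.INCOMPATIBLE
    if expr.name in {"Hour", "Minute", "Second"} and expr.input_type == "TimestampNTZ":
        return SupportLevel.INCOMPATIBLE
    # Assumption: a plain string comparison against "UTC" is enough here;
    # real timezone handling would need to recognise aliases.
    if expr.name == "TruncTimestamp" and expr.timezone != "UTC":
        return SupportLevel.INCOMPATIBLE
    if expr.name in {"Ceil", "Floor"} and expr.input_type == "Decimal":
        return SupportLevel.INCOMPATIBLE
    return SupportLevel.COMPATIBLE


def runs_natively(expr: Expr, allow_incompatible: bool = False) -> bool:
    """Return False when the expression would fall back to Spark."""
    return support_level(expr) is SupportLevel.COMPATIBLE or allow_incompatible
```

In this model, `runs_natively(Expr("Hour", "TimestampNTZ"))` is false (fallback to Spark), while the same expression over a regular timestamp input, or with `allow_incompatible=True`, is accepted.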
