andygrove opened a new pull request, #3675:
URL: https://github.com/apache/datafusion-comet/pull/3675

   ## Which issue does this PR close?
   
   Related to: #3645, #3644, #3346, #3332, #3330, #3180, #3173, #3016, #2649, 
#2646, #1897, #1729, #1630
   
   ## Rationale for this change
   
   Several expressions are currently marked as Compatible (Spark-compatible) 
but have open correctness issues and can produce incorrect results. These 
expressions should fall back to Spark by default to prevent silent data 
corruption; users who understand the trade-offs can explicitly re-enable them 
via `allowIncompatible=true`.
   
   ## What changes are included in this PR?
   
   **Expressions marked as Incompatible (9 expressions across 6 serde files):**
   
   | Expression | Issue(s) | Condition |
   |---|---|---|
   | `ArrayContains` | #3346 | Always (empty array with literal) |
   | `GetArrayItem` | #3330, #3332 | Always (index handling bugs) |
   | `ArrayRemove` | #3173 | Always (null element removal) |
   | `Hour`, `Minute`, `Second` | #3180 | Only for TimestampNTZ inputs |
   | `TruncTimestamp` | #2649 | Only for non-UTC timezones |
   | `Ceil`, `Floor` | #1729 | Only for Decimal type inputs |
   | `Tan` | #1897 | Always (negative zero) |
   | `Corr` | #2646 | Always (null vs NaN) |
   | `StructsToJson` | #3016 | Always (Infinity values) |
   
   Where possible, the incompatibility is conditional on the specific input 
type that triggers the bug (e.g., Hour/Minute/Second are only incompatible for 
TimestampNTZ, Ceil/Floor only for Decimal, TruncTimestamp only for non-UTC 
timezones).
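
   The conditional fallback described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual Comet code: the `SupportLevel`/`Incompatible` names mirror Comet's serde layer, but the type definitions and the `hourSupportLevel` helper here are minimal stand-ins invented for the example.

```scala
// Stand-in for Comet's support-level ADT (assumed shape, not the real API).
sealed trait SupportLevel
case object Compatible extends SupportLevel
final case class Incompatible(notes: Option[String]) extends SupportLevel

// Minimal stand-in for Spark's timestamp data types.
sealed trait DataType
case object TimestampType extends DataType
case object TimestampNTZType extends DataType

// Conditional incompatibility: Hour is only incompatible for TimestampNTZ
// inputs (the case tracked by #3180); other inputs remain Compatible and
// continue to run natively.
def hourSupportLevel(inputType: DataType): SupportLevel = inputType match {
  case TimestampNTZType =>
    Incompatible(Some("Hour produces incorrect results for TimestampNTZ, see #3180"))
  case _ =>
    Compatible
}
```

   An expression reporting `Incompatible` falls back to Spark unless the user opts in via `allowIncompatible=true`, as described in the rationale above.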
   
   **Documentation updates:**
   - `expressions.md`: Updated Spark-Compatible status from "Yes" to "No" for 
all affected expressions, with compatibility notes linking to tracking issues
   - `compatibility.md`: Added detailed "Incompatible Expressions" subsections 
organized by category (Array, Date/Time, Math, Aggregate, Struct) with 
descriptions and issue links
   
   ## How are these changes tested?
   
   These changes only add `getSupportLevel` overrides (which cause expressions 
to fall back to Spark by default) and update documentation. The existing test 
suite covers the fallback mechanism. No new behavioral logic is introduced.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
