yadavay-amzn commented on PR #56244: URL: https://github.com/apache/spark/pull/56244#issuecomment-4627464289
@yyanyy Great catch on the NULL semantics, you're right. Spark's struct equality uses `InterpretedOrdering`, which treats null = null within fields as equal (returns TRUE), while `EqualTo(null, null)` returns NULL.

Fixed: the decomposition now uses `EqualNullSafe` (`<=>`) for per-field comparisons, which matches the struct equality semantics exactly:
- `null <=> null` → true (matches struct behavior)
- `null <=> 2` → false (matches struct behavior)

The only remaining discrepancy is when the entire struct itself is null (original returns NULL, decomposed returns FALSE), but since our rule only fires in Filter context, this is harmless (both NULL and FALSE exclude the row from WHERE).

Also added a width guard (max 100 fields) to prevent stack overflow on very wide/deeply nested structs, per your second concern.
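
For illustration, the per-field semantics described above can be modelled in a few lines of plain Python. This is only a sketch under stated assumptions, not the Catalyst rule from the PR: `None` stands in for SQL NULL, structs are non-null tuples with the same schema, and every function name (`equal_to`, `equal_null_safe`, `and3`, `struct_equal`, `decomposed`, `where_keeps`) is hypothetical.

```python
from typing import Callable, Optional, Sequence

Field = Optional[int]   # None models a SQL NULL field value
Bool3 = Optional[bool]  # None models a SQL NULL (unknown) result


def equal_to(a: Field, b: Field) -> Bool3:
    """Model of SQL `=`: NULL if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b


def equal_null_safe(a: Field, b: Field) -> Bool3:
    """Model of SQL `<=>`: never NULL; two NULLs compare as equal."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


def and3(x: Bool3, y: Bool3) -> Bool3:
    """Three-valued AND: FALSE dominates, then NULL, then TRUE."""
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True


def struct_equal(s: Sequence[Field], t: Sequence[Field]) -> bool:
    """Struct equality as described in the comment, for two
    non-null structs: NULL fields on both sides count as equal."""
    return all(equal_null_safe(a, b) for a, b in zip(s, t))


def decomposed(s: Sequence[Field], t: Sequence[Field],
               field_eq: Callable[[Field, Field], Bool3]) -> Bool3:
    """AND of per-field comparisons using the given field operator."""
    result: Bool3 = True
    for a, b in zip(s, t):
        result = and3(result, field_eq(a, b))
    return result


def where_keeps(pred: Bool3) -> bool:
    """A WHERE clause keeps a row only when the predicate is TRUE."""
    return pred is True


s, t = (1, None), (1, None)
print(struct_equal(s, t))                 # struct comparison
print(decomposed(s, t, equal_to))         # per-field `=`
print(decomposed(s, t, equal_null_safe))  # per-field `<=>`
```

With `=` per field, the decomposition of `(1, NULL)` against `(1, NULL)` evaluates to NULL in this model, while the struct comparison and the `<=>` decomposition both evaluate to TRUE. `where_keeps` returns `False` for both NULL and FALSE, which is the reasoning behind treating the null-struct discrepancy as harmless under a filter.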
