n0r0shi opened a new pull request, #20781:
URL: https://github.com/apache/datafusion/pull/20781

   ## Which issue does this PR close?
   
   Closes https://github.com/apache/datafusion/issues/15914 (partial: adds one more Spark-compatible function)
   
   Related: https://github.com/apache/datafusion-comet/issues/3645
   
   ## Rationale
   
   Spark's `arrays_overlap` uses three-valued null logic, which differs from DataFusion's built-in `array_has_any`:
   
   | Input | Spark `arrays_overlap` | DataFusion `array_has_any` |
   |-------|----------------------|--------------------------|
   | `[1, 2]`, `[2, 3]` | `true` | `true` |
   | `[1, 2]`, `[3, 4]` | `false` | `false` |
   | `[1, NULL]`, `[3]` | `null` | `false` |
   | `[1, 2]`, `[3, NULL]` | `null` | `false` |
   | `[1, NULL]`, `[1, 3]` | `true` | `true` |
   
   In Spark, when the two arrays share no non-null element but either array contains a null element, the result is `null` rather than `false`, because the null element could have been a match.
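As a rough illustration of the rule in the table, here is a minimal sketch in plain Rust using only the standard library. It is not the code in this PR: the function name is borrowed from Spark for readability, elements are modelled as `Option<i32>` with `None` standing in for a SQL null, and the result uses `None` for a SQL `null`. Behaviour for empty arrays is not covered by the description above and is not addressed here.

```rust
/// Three-valued overlap check, following the rule described in the table.
/// `None` elements model SQL nulls; a `None` result models a SQL `null`.
/// Illustrative only: this is not the implementation added by the PR.
fn arrays_overlap(left: &[Option<i32>], right: &[Option<i32>]) -> Option<bool> {
    // A non-null element present in both arrays is a definite match.
    let definite_match = left
        .iter()
        .flatten()
        .any(|l| right.iter().flatten().any(|r| l == r));
    if definite_match {
        return Some(true);
    }

    // No definite match: a null on either side makes the answer unknown.
    let has_null = left.iter().chain(right.iter()).any(|v| v.is_none());
    if has_null {
        None
    } else {
        Some(false)
    }
}
```

Calling it with `[1, NULL]` and `[3]` gives `None`, while `[1, NULL]` and `[1, 3]` gives `Some(true)`, matching the third and fifth rows of the table.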
   
   ## What changes are included in this PR?
   
   Adds `SparkArraysOverlap` to the `datafusion-spark` crate, following the same pattern as `SparkArrayContains`: delegate to DataFusion's `array_has_any`, then replace the result with `null` for every row where it is `false` and either input array contains null elements.
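The delegate-then-patch shape can be sketched row by row in plain Rust. This is a simplified model under stated assumptions, not the PR's Arrow-based code: `List`, `has_any_stand_in`, `contains_null`, and `overlap_rows` are hypothetical names, and `has_any_stand_in` only mimics the `array_has_any` results shown in the table (it ignores null elements) rather than calling DataFusion.

```rust
/// One row of a list column: `None` is a null list, inner `None`s are null elements.
/// Hypothetical type alias used only for this sketch.
type List = Option<Vec<Option<i32>>>;

/// Hypothetical stand-in for the delegated call: true only when a non-null
/// element is shared, false otherwise (nulls are ignored).
fn has_any_stand_in(left: &[Option<i32>], right: &[Option<i32>]) -> bool {
    left.iter()
        .flatten()
        .any(|l| right.iter().flatten().any(|r| l == r))
}

/// Whether a list holds at least one null element.
fn contains_null(list: &[Option<i32>]) -> bool {
    list.iter().any(|v| v.is_none())
}

/// Step 1: compute the two-valued result. Step 2: patch `false` to null
/// wherever either input list holds a null element.
fn overlap_rows(left: &[List], right: &[List]) -> Vec<Option<bool>> {
    left.iter()
        .zip(right.iter())
        .map(|(l, r)| {
            // A null list on either side yields a null result.
            let (l, r) = match (l, r) {
                (Some(l), Some(r)) => (l, r),
                _ => return None,
            };
            match has_any_stand_in(l, r) {
                true => Some(true),
                // The patch step: `false` is only trustworthy without nulls.
                false if contains_null(l) || contains_null(r) => None,
                false => Some(false),
            }
        })
        .collect()
}
```

Only `false` rows are revisited, so a definite `true` from the delegated call is never downgraded to `null`.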
   
   ## Are these changes tested?
   
   Unit tests covering:
   - Definite overlap → `true`
   - No overlap, no nulls → `false`
   - No overlap, null in left → `null`
   - No overlap, null in right → `null`
   - Overlap with nulls present → `true` (definite match trumps null)
   - Null list → `null`
   - Multi-row mixed cases
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

