andygrove commented on PR #4726:
URL: 
https://github.com/apache/datafusion-comet/pull/4726#issuecomment-4810139987

   Nice fix. The short-circuit logic matches Spark's `ArrayInsert.eval` 
exactly, and evaluating `src` before `pos` also lines up error precedence with 
Spark when more than one argument can throw.
   
   One suggestion on test coverage. The new `test_array_insert_short_circuit` 
case nicely exercises skipping `pos` (and `item`) when `src` is `NULL`. There's 
a second, distinct path that the new `&& pos_value.is_valid(row)` guard on 
`evaluate_item` protects: `src` non-null, `pos` `NULL`, and an `item` that 
would throw if evaluated. Spark short-circuits `item` there too (the inner `if 
(value2 != null)` check), so it returns `NULL` rather than raising. It might be 
worth pinning that down with something like:
   
   ```sql
   statement
   INSERT INTO test_array_insert_short_circuit VALUES
     (NULL, 0),
     (array(1), 1),
     (array(1), 2)
   
   query
   SELECT array_insert(arr, CAST(NULL AS INT), element_at(array(1), idx))
   FROM test_array_insert_short_circuit
   ```
   
   Here `pos` is `NULL` for every row, so `item = element_at(array(1), idx)` 
should never be evaluated, including the `idx = 0` and `idx = 2` rows that 
would otherwise throw. Without the `pos` validity check on `evaluate_item` this 
would error, so it guards that branch directly.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to