[GitHub] [arrow-datafusion] tustvold commented on pull request #5811: Scalar arithmetic should return error when overflows.

via GitHub Mon, 03 Apr 2023 06:16:46 -0700


tustvold commented on PR #5811:
URL: 
https://github.com/apache/arrow-datafusion/pull/5811#issuecomment-1494309259


   > mostly for consistency with the semantics of arrow-rs
   
   So this is not consistent with the arrow kernels that DataFusion makes use 
of, in particular, DataFusion is currently using the unchecked arithmetic 
kernels, which do not return an error on overflow. So this PR will introduce 
inconsistency between ScalarValue arithmetic, and any arithmetic involving 
arrays.
   
   The major reason I'm a little apprehensive about changing this is that the 
checked kernels are at least an order of magnitude slower in the presence of 
nulls, as LLVM can't vectorise them correctly
   
   ```
   add(0.1)                time:   [7.4481 µs 7.4539 µs 7.4608 µs]
   Found 6 outliers among 100 measurements (6.00%)
     2 (2.00%) high mild
     4 (4.00%) high severe
   
   add_checked(0.1)        time:   [101.62 µs 106.31 µs 111.40 µs]
   ```
   
   I definitely think whatever semantics we settle on should be consistent and 
well documented, but I don't have a strong opinion what it should be. However, 
I do feel that changing the current semantics warrants some communication due 
to the major performance regression it would entail.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [arrow-datafusion] tustvold commented on pull request #5811: Scalar arithmetic should return error when overflows.

Reply via email to