tustvold commented on PR #5811:
URL:
https://github.com/apache/arrow-datafusion/pull/5811#issuecomment-1494309259
> mostly for consistency with the semantics of arrow-rs
So this is not consistent with the arrow kernels that DataFusion makes use
of, in particular, DataFusion is currently using the unchecked arithmetic
kernels, which do not return an error on overflow. So this PR will introduce
inconsistency between ScalarValue arithmetic, and any arithmetic involving
arrays.
The major reason I'm a little apprehensive about changing this is that the
checked kernels are at least an order of magnitude slower in the presence of
nulls, as LLVM can't vectorise them correctly
```
add(0.1) time: [7.4481 µs 7.4539 µs 7.4608 µs]
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) high mild
4 (4.00%) high severe
add_checked(0.1) time: [101.62 µs 106.31 µs 111.40 µs]
```
I definitely think whatever semantics we settle on should be consistent and
well documented, but I don't have a strong opinion what it should be. However,
I do feel that changing the current semantics warrants some communication due
to the major performance regression it would entail.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]