js8544 commented on issue #37678:
URL: https://github.com/apache/arrow/issues/37678#issuecomment-1719364274
I wrote a simple kernel exec that checks for nulls only when an overflow
occurs:
```cpp
Status AddCheckedExec(KernelContext* ctx, const ExecSpan& span, ExecResult* result) {
  int64_t length = span.length;
  auto left_arr = span[0].array;
  auto right_arr = span[1].array;
  auto* out_span = result->array_span_mutable();
  // Buffer 1 holds the values; buffer 0 is the validity bitmap.
  auto* left_it = left_arr.GetValues<int32_t>(1);
  auto* right_it = right_arr.GetValues<int32_t>(1);
  auto* out_it = out_span->GetValues<int32_t>(1);
  for (int64_t i = 0; i < length; ++i) {
    // Always add; look at validity only in the rare overflow case.
    if (ARROW_PREDICT_FALSE(AddWithOverflow(*left_it++, *right_it++, out_it++))) {
      auto left_valid = left_arr.IsValid(i);
      auto right_valid = right_arr.IsValid(i);
      // An overflow in a null slot is harmless, so only fail when both are valid.
      if (left_valid && right_valid) {
        return Status::Invalid("overflow");
      }
    }
  }
  return Status::OK();
}
```
This gives a similar 2.5x time to the one above.
> 1. detect potential overflow over all K pairs of values
> 2. if no overflow was detected, add all K pairs of values
I'm not sure how to detect overflow without performing the actual addition.
So I implemented the following algo:
1. Compute all additions and save the overflow status to a vector.
2. Loop over the overflow vector and check for nulls wherever an overflow was recorded.
```cpp
Status AddCheckedExec(KernelContext* ctx, const ExecSpan& span, ExecResult* result) {
  int64_t length = span.length;
  auto left_arr = span[0].array;
  auto right_arr = span[1].array;
  auto* out_span = result->array_span_mutable();
  auto* left_it = left_arr.GetValues<int32_t>(1);
  auto* right_it = right_arr.GetValues<int32_t>(1);
  auto* out_it = out_span->GetValues<int32_t>(1);
  // Pass 1: compute every sum and record whether it overflowed.
  std::vector<int> overflow(length, false);
  for (int64_t i = 0; i < length; ++i) {
    overflow[i] = AddWithOverflow(*left_it++, *right_it++, out_it++);
  }
  // Pass 2: an overflow is an error only if both inputs are valid at that slot.
  for (int64_t i = 0; i < length; ++i) {
    if (ARROW_PREDICT_FALSE(overflow[i])) {
      auto left_valid = left_arr.IsValid(i);
      auto right_valid = right_arr.IsValid(i);
      if (left_valid && right_valid) {
        return Status::Invalid("overflow");
      }
    }
  }
  return Status::OK();
}
```
This, however, is 5x slower. Writing to `overflow[i]` seems to be the
bottleneck here.
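
For reference, here is an untested-for-performance sketch of the quoted two-step scheme that avoids the per-element `overflow` vector. It relies on the fact that signed overflow can be predicted from the operands alone by comparing against the type limits. Everything here is hypothetical and outside Arrow: `WouldOverflow`, `AddCheckedBlocks`, the block size `kBlock`, and the byte-per-slot `valid` array (nonzero when both inputs are non-null) are stand-ins I made up for illustration, not the kernel's real API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: true if a + b would overflow int32_t. Decided from the
// operands alone; no addition is performed.
inline bool WouldOverflow(int32_t a, int32_t b) {
  constexpr int32_t kMax = std::numeric_limits<int32_t>::max();
  constexpr int32_t kMin = std::numeric_limits<int32_t>::min();
  return (b > 0 && a > kMax - b) || (b < 0 && a < kMin - b);
}

// Hypothetical block-wise checked add over plain buffers.
// valid[i] is nonzero when both inputs are non-null at slot i.
// Returns false if a slot with two valid inputs overflows.
bool AddCheckedBlocks(const int32_t* left, const int32_t* right,
                      const uint8_t* valid, int32_t* out, size_t length) {
  constexpr size_t kBlock = 64;  // K; an arbitrary choice for this sketch
  for (size_t start = 0; start < length; start += kBlock) {
    const size_t end = std::min(length, start + kBlock);

    // Step 1: OR the overflow predicate over the block into a single flag.
    bool any_overflow = false;
    for (size_t i = start; i < end; ++i) {
      any_overflow |= WouldOverflow(left[i], right[i]);
    }

    // Rare path: consult validity only if something in the block overflowed.
    if (any_overflow) {
      for (size_t i = start; i < end; ++i) {
        if (valid[i] && WouldOverflow(left[i], right[i])) return false;
      }
    }

    // Step 2: add all pairs. Unsigned arithmetic wraps, so null slots that
    // would overflow do not trigger signed-overflow undefined behavior.
    for (size_t i = start; i < end; ++i) {
      out[i] = static_cast<int32_t>(static_cast<uint32_t>(left[i]) +
                                    static_cast<uint32_t>(right[i]));
    }
  }
  return true;
}
```

The only state carried between the two steps is one flag per block rather than one entry per element. I have not benchmarked this, so whether it beats the single-loop version is an open question.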