thisisnic commented on a change in pull request #10624:
URL: https://github.com/apache/arrow/pull/10624#discussion_r669142885
##########
File path: r/src/compute.cpp
##########
@@ -316,6 +316,19 @@ std::shared_ptr<arrow::compute::FunctionOptions>
make_compute_options(
return std::make_shared<Options>(max_splits, reverse);
}
+ if (func_name == "utf8_slice_codeunits") {
+ using Options = arrow::compute::SliceOptions;
+ int64_t start = 1;
+ int64_t stop = -1;
Review comment:
Sorry, I didn't explain this properly, my fault! What I mean is that
here stop has been set to `-1` but the default C++ value isn't `-1`, so the
default value here probably shouldn't be `-1` either.
Check out this line of code here that shows the default value of `stop` in
the C++: https://github.com/apache/arrow/blob/7114c4b6fdb63
9b3500d77cfd66649af8c5c5e6b/cpp/src/arrow/compute/api_scalar.h#L205-L206
The default value of `stop` is `std::numeric_limits<int64_t>::max()` which
is the biggest int64. So if you don't supply a value for `stop` and just use
the default, this allows you to slice to the end of the string.
In some of the other functions in this file, this line has been used to set
the default values to the ones from the C++ code:
`std::make_shared<Options>(Options::Defaults());`
Maybe instead of manually setting the value of stop to `-1`, if you use the
line above, it might make it easier to fix some of the failing tests as now
you'll be able to slice to the end of a string. If this doesn't make sense,
let me know!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]