edponce commented on a change in pull request #11023:
URL: https://github.com/apache/arrow/pull/11023#discussion_r706434101
##########
File path: cpp/src/arrow/compute/kernels/scalar_string.cc
##########
@@ -2357,6 +2584,79 @@ void AddSplit(FunctionRegistry* registry) {
#endif
}
+template <typename Type1, typename Type2>
+struct StrRepeatTransform : public StringBinaryTransformBase {
+ using ArrayType1 = typename TypeTraits<Type1>::ArrayType;
+ using ArrayType2 = typename TypeTraits<Type2>::ArrayType;
+
+ int64_t MaxCodeunits(int64_t inputs, int64_t input_ncodeunits,
+ const std::shared_ptr<Scalar>& input2) override {
+ auto nrepeats = static_cast<int64_t>(UnboxScalar<Type2>::Unbox(*input2));
+ return std::max(input_ncodeunits * nrepeats, int64_t(0));
+ }
+
+ int64_t MaxCodeunits(int64_t inputs, int64_t input_ncodeunits,
+ const std::shared_ptr<ArrayData>& data2) override {
+ ArrayType2 array2(data2);
+ // Ideally, we would like to calculate the exact output size by iterating
over
+ // all strings offsets and summing each length multiplied by the
corresponding repeat
+ // value, but this requires traversing the data twice (now and during
transform).
+ // The upper limit is to assume that all strings are repeated the max
number of
+ // times knowing that a resize operation is performed at end of execution.
+ auto max_nrepeats =
+ static_cast<int64_t>(**std::max_element(array2.begin(), array2.end()));
+ return std::max(input_ncodeunits * max_nrepeats, int64_t(0));
+ }
+
+ int64_t Transform(const uint8_t* input, int64_t input_string_ncodeunits,
+ const std::shared_ptr<Scalar>& input2, uint8_t* output) {
Review comment:
No, because the `ExecXXX` method in `StringBinaryTransformExecBase`
can't deduce what is the type of the second parameter to compute function, only
if it is a Scalar or Array.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]