edponce commented on a change in pull request #11023:
URL: https://github.com/apache/arrow/pull/11023#discussion_r706925612



##########
File path: cpp/src/arrow/compute/kernels/scalar_string.cc
##########
@@ -2357,6 +2584,79 @@ void AddSplit(FunctionRegistry* registry) {
 #endif
 }
 
+template <typename Type1, typename Type2>
+struct StrRepeatTransform : public StringBinaryTransformBase {
+  using ArrayType1 = typename TypeTraits<Type1>::ArrayType;
+  using ArrayType2 = typename TypeTraits<Type2>::ArrayType;
+
+  int64_t MaxCodeunits(int64_t inputs, int64_t input_ncodeunits,
+                       const std::shared_ptr<Scalar>& input2) override {
+    auto nrepeats = static_cast<int64_t>(UnboxScalar<Type2>::Unbox(*input2));
+    return std::max(input_ncodeunits * nrepeats, int64_t(0));
+  }
+
+  int64_t MaxCodeunits(int64_t inputs, int64_t input_ncodeunits,
+                       const std::shared_ptr<ArrayData>& data2) override {
+    ArrayType2 array2(data2);
+    // Ideally, we would like to calculate the exact output size by iterating over
+    // all strings offsets and summing each length multiplied by the corresponding repeat
+    // value, but this requires traversing the data twice (now and during transform).
+    // The upper limit is to assume that all strings are repeated the max number of
+    // times knowing that a resize operation is performed at end of execution.
+    auto max_nrepeats =
+        static_cast<int64_t>(**std::max_element(array2.begin(), array2.end()));
+    return std::max(input_ncodeunits * max_nrepeats, int64_t(0));
+  }
+
+  int64_t Transform(const uint8_t* input, int64_t input_string_ncodeunits,
+                    const std::shared_ptr<Scalar>& input2, uint8_t* output) {

Review comment:
       After switching to `GetView()`, we now get the raw data type, so it works. Thanks!
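
For reference, the trade-off described in the code comment of `MaxCodeunits` above (a cheap upper bound versus an exact size that needs an extra pass) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the PR's implementation: it uses plain `std::vector` and `std::string` instead of Arrow arrays, and the helper names `MaxCodeunitsUpperBound` and `ExactCodeunits` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: upper bound on the output size, assuming every string
// is repeated the maximum number of times. Needs only the total input size.
int64_t MaxCodeunitsUpperBound(int64_t input_ncodeunits,
                               const std::vector<int64_t>& nrepeats) {
  if (nrepeats.empty()) return 0;  // avoid dereferencing end() on empty input
  const int64_t max_nrepeats = *std::max_element(nrepeats.begin(), nrepeats.end());
  return std::max(input_ncodeunits * max_nrepeats, int64_t(0));
}

// Hypothetical helper: exact output size, which requires visiting every
// string length (the extra traversal the code comment wants to avoid).
int64_t ExactCodeunits(const std::vector<std::string>& strings,
                       const std::vector<int64_t>& nrepeats) {
  assert(strings.size() == nrepeats.size());
  int64_t total = 0;
  for (std::size_t i = 0; i < strings.size(); ++i) {
    // Negative repeat counts are treated as zero in this sketch.
    total += static_cast<int64_t>(strings[i].size()) *
             std::max(nrepeats[i], int64_t(0));
  }
  return total;
}
```

The upper bound may over-allocate when repeat counts vary widely, which is acceptable only because the output is resized at the end of execution, as the code comment notes.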




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.