alamb commented on code in PR #8658:
URL: https://github.com/apache/arrow-rs/pull/8658#discussion_r2445703077


##########
arrow-buffer/src/buffer/mutable.rs:
##########
@@ -222,6 +222,75 @@ impl MutableBuffer {
         }
     }
 
+    /// Adding to this mutable buffer `slice_to_repeat` repeated `repeat_count` times.
+    ///
+    /// # Example
+    ///
+    /// ## Repeat the same string bytes multiple times
+    /// ```
+    /// # use arrow_buffer::buffer::MutableBuffer;
+    /// let mut buffer = MutableBuffer::new(0);
+    /// let bytes_to_repeat = b"ab";
+    /// buffer.repeat_slice_n_times(bytes_to_repeat, 3);
+    /// assert_eq!(buffer.as_slice(), b"ababab");
+    /// ```
+    pub fn repeat_slice_n_times<T: ArrowNativeType>(
+        &mut self,
+        slice_to_repeat: &[T],
+        repeat_count: usize,
+    ) {
+        if repeat_count == 0 || slice_to_repeat.is_empty() {
+            return;
+        }
+
+        let bytes_to_repeat = size_of_val(slice_to_repeat);
+
+        // Ensure capacity
+        self.reserve(repeat_count * bytes_to_repeat);
+
+        // Save the length before we do all the copies to know where to start from
+        let length_before = self.len;
+
+        // Copy the initial slice once so we can use doubling strategy on it
+        self.extend_from_slice(slice_to_repeat);
+
+        // This tracks how much bytes we have added by repeating so far
+        let added_repeats_length = bytes_to_repeat;
+        assert_eq!(
+            self.len - length_before,
+            added_repeats_length,
+            "should copy exactly the same number of bytes"
+        );
+
+        // Number of times the slice was repeated
+        let mut already_repeated_times = 1;
+
+        // We will use doubling strategy to fill the buffer in log(repeat_count) steps
+        while already_repeated_times < repeat_count {
+            // How many slices can we copy in this iteration
+            // (either double what we have, or just the remaining ones)
+            let number_of_slices_to_copy =
+                already_repeated_times.min(repeat_count - already_repeated_times);
+            let number_of_bytes_to_copy = number_of_slices_to_copy * bytes_to_repeat;
+
+            unsafe {
+                // Get to the start of the data before we started copying anything
+                let src = self.data.as_ptr().add(length_before) as *const u8;

Review Comment:
   rustc can probably figure this out, but `src` is the same on every loop
iteration (the buffer is reserved up front, so it does not move), so I think
it could be computed once before the loop.
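
As a rough illustration of the hoisting suggested above, here is a minimal sketch on a plain `Vec<u8>` rather than `MutableBuffer`. The function name `repeat_with_hoisted_src` is hypothetical and not part of arrow-rs; the sketch assumes the capacity is reserved before any pointer is taken, so the source pointer stays valid for the whole loop.

```rust
use std::ptr;

/// Hypothetical helper (not the arrow-rs API): appends `slice` to `out`
/// `n` times using a doubling copy, with `src` computed once.
fn repeat_with_hoisted_src(out: &mut Vec<u8>, slice: &[u8], n: usize) {
    if n == 0 || slice.is_empty() {
        return;
    }
    // Guard against overflow, since the unsafe block relies on this size.
    let total = slice.len().checked_mul(n).expect("length overflow");

    // Reserve everything up front so the allocation does not move later.
    out.reserve(total);
    let start = out.len();

    // Seed the output with one copy of the slice.
    out.extend_from_slice(slice);
    let mut done = 1;

    unsafe {
        let base = out.as_mut_ptr();
        // Computed once: the start of the repeated region never changes.
        let src = base.add(start) as *const u8;

        while done < n {
            // Either double what is already there, or copy the remainder.
            let k = done.min(n - done);
            let bytes = k * slice.len();
            let dst = base.add(start + done * slice.len());
            // `src..src+bytes` ends at or before `dst`, so no overlap.
            ptr::copy_nonoverlapping(src, dst, bytes);
            done += k;
        }
        // All `total` bytes after `start` are now initialized.
        out.set_len(start + total);
    }
}
```

Whether hoisting changes the generated code at all would need to be checked in the actual PR, since the compiler may already do it.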



##########
arrow-buffer/src/buffer/mutable.rs:
##########
@@ -222,6 +222,75 @@ impl MutableBuffer {
         }
     }
 
+    /// Adding to this mutable buffer `slice_to_repeat` repeated `repeat_count` times.

Review Comment:
   I am wondering how much of a difference the unsafe logarithmic (doubling)
copy makes here, compared with simply ensuring `reserve` is called correctly
up front.
   
   Did you measure against code like this?
   
   ```rust
   // `reserve` takes a number of bytes, so use the slice's size in bytes
   self.reserve(size_of_val(slice_to_repeat) * repeat_count);
   for _ in 0..repeat_count {
       self.extend_from_slice(slice_to_repeat);
   }
   ```
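
To make the comparison concrete, the two strategies can be sketched side by side on a plain `Vec<u8>`. Both function names are hypothetical and this is not the `MutableBuffer` code from the PR; the doubling variant here uses the safe `Vec::extend_from_within` instead of raw pointer copies. The sketch only shows that the two approaches produce the same bytes. Any speed difference would have to be measured with a real benchmark.

```rust
/// Hypothetical baseline: reserve once, then append the slice `n` times.
fn repeat_simple(out: &mut Vec<u8>, slice: &[u8], n: usize) {
    out.reserve(slice.len() * n);
    for _ in 0..n {
        out.extend_from_slice(slice);
    }
}

/// Hypothetical doubling variant: reserve once, then copy what has
/// already been written, so the loop runs about log2(n) times.
fn repeat_doubling(out: &mut Vec<u8>, slice: &[u8], n: usize) {
    if n == 0 || slice.is_empty() {
        return;
    }
    out.reserve(slice.len() * n);
    let start = out.len();

    // Seed with a single copy.
    out.extend_from_slice(slice);
    let mut done = 1;

    while done < n {
        // Either double what is already there, or copy the remainder.
        let k = done.min(n - done);
        out.extend_from_within(start..start + k * slice.len());
        done += k;
    }
}
```

A benchmark over several slice lengths and repeat counts would show where, if anywhere, the doubling variant pulls ahead of the simple loop.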


