Dandandan commented on code in PR #7309:
URL: https://github.com/apache/arrow-rs/pull/7309#discussion_r2023700374


##########
arrow-array/src/builder/generic_bytes_builder.rs:
##########
@@ -129,6 +129,48 @@ impl<T: ByteArrayType> GenericByteBuilder<T> {
         self.offsets_builder.append(self.next_offset());
     }
 
+    /// Appends array values and null to this builder as is
+    /// (this means that underlying null values are copied as is).
+    #[inline]
+    pub fn append_array(&mut self, array: &GenericByteArray<T>) {
+        if array.len() == 0 {
+            return;
+        }
+
+        let offsets = array.offsets();
+
+        // If the offsets are contiguous, we can append them directly avoiding the need to align
+        // them
+        if self.next_offset() == offsets[0] {
+            self.offsets_builder.append_slice(&offsets[1..]);
+        } else {
+            // Shifting all the offsets
+            let shift: T::Offset = self.next_offset() - offsets[0];
+
+            // Creating intermediate offsets instead of pushing each offset is faster
+            // (even if we make MutableBuffer to avoid updating length on each push
+            //  and reserve the necessary capacity, it's still slower)
+            let mut intermediate = Vec::with_capacity(offsets.len() - 1);

Review Comment:
   I think many of these instances could be changed to use `Vec` (and take advantage of its optimized `extend`, etc.).
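
   As a rough illustration of the suggestion (not code from the PR), the shifted offsets could be built with a single `Vec::extend` call rather than pushing one by one. The helper name `shifted_offsets` and the use of plain `i32` in place of `T::Offset` are assumptions made for this sketch only:

```rust
/// Hypothetical helper (not part of arrow-rs): rebases `offsets[1..]` so that
/// they continue from `next_offset`, mirroring the shift done in the hunk above.
fn shifted_offsets(next_offset: i32, offsets: &[i32]) -> Vec<i32> {
    // With fewer than two offsets there is nothing to append.
    if offsets.len() < 2 {
        return Vec::new();
    }

    // Distance between where the builder currently ends and where the
    // incoming array's offsets start.
    let shift = next_offset - offsets[0];

    let mut out = Vec::with_capacity(offsets.len() - 1);
    // The iterator has a known length, so `extend` can fill the vector
    // without a capacity check on every element.
    out.extend(offsets[1..].iter().map(|&o| o + shift));
    out
}
```

   The resulting `Vec` could then be handed to something like `append_slice` in one call. Whether this is actually faster than the alternatives would need to be benchmarked.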


