Dandandan commented on code in PR #7309: URL: https://github.com/apache/arrow-rs/pull/7309#discussion_r2023700374
##########
arrow-array/src/builder/generic_bytes_builder.rs:
##########
@@ -129,6 +129,48 @@ impl<T: ByteArrayType> GenericByteBuilder<T> {
         self.offsets_builder.append(self.next_offset());
     }
 
+    /// Appends array values and null to this builder as is
+    /// (this means that underlying null values are copied as is).
+    #[inline]
+    pub fn append_array(&mut self, array: &GenericByteArray<T>) {
+        if array.len() == 0 {
+            return;
+        }
+
+        let offsets = array.offsets();
+
+        // If the offsets are contiguous, we can append them directly avoiding the need to align
+        // them
+        if self.next_offset() == offsets[0] {
+            self.offsets_builder.append_slice(&offsets[1..]);
+        } else {
+            // Shifting all the offsets
+            let shift: T::Offset = self.next_offset() - offsets[0];
+
+            // Creating intermediate offsets instead of pushing each offset is faster
+            // (even if we make MutableBuffer to avoid updating length on each push
+            // and reserve the necessary capacity, it's still slower)
+            let mut intermediate = Vec::with_capacity(offsets.len() - 1);

Review Comment:
   I think many of these instances could be changed to use `Vec` (and use optimized extend, etc. from them)
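A minimal sketch of what the `Vec`-based approach suggested above might look like, using only the standard library. It is not the code in the PR: `append_shifted_offsets` is a hypothetical helper, the offsets are fixed to `i32` instead of the generic `T::Offset`, and a plain `Vec<i32>` stands in for the builder's offsets buffer.

```rust
/// Hypothetical helper (not part of arrow-rs): appends `src[1..]` to `dst`,
/// rebased so that the new offsets continue from the last offset in `dst`.
/// Assumes `dst` already holds at least one offset, as an offsets buffer does.
fn append_shifted_offsets(dst: &mut Vec<i32>, src: &[i32]) {
    // An offsets slice has `len + 1` entries, so fewer than two entries
    // means the source array is empty and there is nothing to append.
    if src.len() < 2 {
        return;
    }

    let next_offset = *dst.last().expect("offsets buffer is never empty");

    if next_offset == src[0] {
        // Contiguous case: the offsets can be copied as they are.
        dst.extend_from_slice(&src[1..]);
    } else {
        // Shift every offset; the shift may be negative, which is fine
        // because the offset type is signed.
        let shift = next_offset - src[0];
        // `extend` can reserve capacity up front from the iterator's size
        // hint, so no separate intermediate buffer is needed here.
        dst.extend(src[1..].iter().map(|&offset| offset + shift));
    }
}
```

For example, appending the offsets `[10, 12, 17]` to a buffer holding `[0, 3, 5]` gives `[0, 3, 5, 7, 12]`. Whether this is faster than the intermediate buffer in the PR would need to be benchmarked.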