houqp commented on a change in pull request #7365:
URL: https://github.com/apache/arrow/pull/7365#discussion_r436336846



##########
File path: rust/arrow/src/array/builder.rs
##########
@@ -577,6 +632,81 @@ where
         self
     }
 
+    /// Appends data from other arrays into the builder
+    ///
+    /// This is most useful when concatenating arrays of the same type into a 
builder.
+    fn append_data(&mut self, data: &[ArrayDataRef]) -> Result<()> {
+        if !check_array_data_type(&self.data_type(), data) {
+            return Err(ArrowError::InvalidArgumentError(
+                "Cannot append data to builder if data types are 
different".to_string(),
+            ));
+        }
+        // determine the latest offset on the builder
+        let mut cum_offset = if self.offsets_builder.len() == 0 {
+            0
+        } else {
+            // peek into buffer to get last appended offset
+            let buffer = self.offsets_builder.buffer.data();
+            let len = self.offsets_builder.len();
+            let (start, end) = ((len - 1) * 4, len * 4);
+            let slice = &buffer[start..end];
+            i32::from_le_bytes(slice.try_into().unwrap())
+        };
+        for array in data {
+            if array.child_data().len() != 1 {

Review comment:
       I think we can leave the memory allocation performance optimization to 
future PRs given that number of arrays in `data: &[ArrayDataRef]` should not be 
too large.
   
   But I do think we should make sure error handling behavior is consistent, 
i.e. invalid input should not lead to partial append to the array. If we are to 
use ArrayDataRef as input type, then I feel like we do need to have custom 
validation logic for each array type as you mentioned.
   
   The idea of using `ArrayRef` as input type is interesting. It does look like 
a simpler interface for end users and could simplify the error handling logic. 
What's the downside of using `ArrayRef` here compared to `ArrayDataRef`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to