tustvold commented on code in PR #2237:
URL: https://github.com/apache/arrow-rs/pull/2237#discussion_r933983922


##########
parquet/src/arrow/array_reader/complex_object_array.rs:
##########
@@ -135,23 +135,44 @@ where
             .iter_mut()
             .for_each(|buf| buf.truncate(num_read));
 
-        self.def_levels_buffer = def_levels_buffer;
-        self.rep_levels_buffer = rep_levels_buffer;
+        if let Some(mut def_levels_buffer) = def_levels_buffer {
+            match &mut self.def_levels_buffer {
+                None => {
+                    self.def_levels_buffer = Some(def_levels_buffer);
+                }
+                Some(buf) => buf.append(&mut def_levels_buffer),
+            }
+        }
+
+        if let Some(mut rep_levels_buffer) = rep_levels_buffer {
+            match &mut self.rep_levels_buffer {
+                None => {
+                    self.rep_levels_buffer = Some(rep_levels_buffer);
+                }
+                Some(buf) => buf.append(&mut rep_levels_buffer),
+            }
+        }
+
+        self.data_buffer.append(&mut data_buffer);
 
+        Ok(num_read)
+    }
+
+    fn consume_batch(&mut self) -> Result<ArrayRef> {
         let data: Vec<Option<T::T>> = if self.def_levels_buffer.is_some() {
-            data_buffer
-                .into_iter()
+            self.data_buffer

Review Comment:
   Perhaps we could use std::mem::take to avoid a copy here?
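To make the suggestion concrete, here is a minimal sketch of the `std::mem::take` pattern. The `Reader` struct and its `i32` element type are hypothetical stand-ins, not the actual `ComplexObjectArrayReader`; only the handling of `data_buffer` is meant to carry over.

```rust
use std::mem;

/// Hypothetical stand-in for the reader in the diff; only the buffer matters here.
struct Reader {
    data_buffer: Vec<i32>,
}

impl Reader {
    /// Moves the accumulated values out of `self`, leaving an empty Vec behind.
    fn consume_batch(&mut self) -> Vec<Option<i32>> {
        // `mem::take` replaces the field with `Vec::default()` and returns the
        // old Vec by value, so `into_iter()` consumes the elements without
        // cloning them.
        mem::take(&mut self.data_buffer)
            .into_iter()
            .map(Some)
            .collect()
    }
}
```

Because `mem::take` leaves an empty `Vec` in the field, a separate `self.data_buffer = vec![];` reset would no longer be needed in this sketch.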



##########
parquet/src/arrow/array_reader/complex_object_array.rs:
##########
@@ -160,6 +181,10 @@ where
             array = arrow::compute::cast(&array, &self.data_type)?;
         }
 
+        self.data_buffer = vec![];
+        self.def_levels_buffer = None;
+        self.rep_levels_buffer = None;

Review Comment:
   I think this will break: RecordReader assumes the definition levels stay 
alive until the next call to consume_batch.
   
   I'm not sure we actually have test coverage of, say, a nullable StructArray 
containing a DecimalArray 🤔
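One way to address the lifetime concern, sketched under assumptions: keep two slots, one for levels still being accumulated and one for the batch most recently returned, so the levels remain readable until the next `consume_batch`. `ToyReader`, `pending_def_levels`, and the `usize` return value are hypothetical and only illustrate the ownership pattern; this is not the arrow-rs implementation.

```rust
/// Hypothetical toy reader; names loosely mirror the diff but this is not
/// the arrow-rs type.
struct ToyReader {
    /// Levels accumulated by `read_records` since the last `consume_batch`.
    pending_def_levels: Option<Vec<i16>>,
    /// Levels belonging to the batch most recently returned by `consume_batch`.
    def_levels_buffer: Option<Vec<i16>>,
}

impl ToyReader {
    /// Appends a chunk of definition levels to the pending buffer.
    fn read_records(&mut self, levels: &[i16]) {
        self.pending_def_levels
            .get_or_insert_with(Vec::new)
            .extend_from_slice(levels);
    }

    /// Publishes the pending levels instead of clearing them, and returns
    /// how many levels the batch contains.
    fn consume_batch(&mut self) -> usize {
        // `Option::take` moves the pending Vec out and leaves `None` behind,
        // so the previous batch's levels are replaced only at this point.
        self.def_levels_buffer = self.pending_def_levels.take();
        self.def_levels_buffer.as_ref().map_or(0, |l| l.len())
    }

    /// Levels of the last consumed batch; valid until the next `consume_batch`.
    fn get_def_levels(&self) -> Option<&[i16]> {
        self.def_levels_buffer.as_deref()
    }
}
```

In this sketch a caller that inspects the levels after `consume_batch` still sees them, and they are dropped only when the following batch is consumed.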


