valkum commented on code in PR #8573:
URL: https://github.com/apache/arrow-rs/pull/8573#discussion_r2414222202


##########
parquet/src/arrow/array_reader/byte_array_dictionary.rs:
##########
@@ -175,9 +179,6 @@ where
         let buffer = self.record_reader.consume_record_data();
         let null_buffer = self.record_reader.consume_bitmap_buffer();
         let array = buffer.into_array(null_buffer, &self.data_type)?;
-
-        self.def_levels_buffer = self.record_reader.consume_def_levels();
-        self.rep_levels_buffer = self.record_reader.consume_rep_levels();
         self.record_reader.reset();

Review Comment:
   Thanks for the quick action on this.
   While debugging this bug, I also concluded that advancing the rep and def 
level buffers would make sense, but I lack familiarity with these internals.
   
   The docs for `reset` state that it should be called after consuming data, 
but it currently isn't called in the shortcut case. I am missing some context 
here, and the docs for `GenericRecordReader` aren't helping.
   
   During my debug session, in the error case, `num_values()` and 
`num_records()` were both already 0 at that point, so resetting would not 
change much there. 
   But I am wondering whether we can reach a state where `num_records > num_values`.
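   
   To make the question concrete, here is a toy sketch of the consume-then-reset 
contract as I read it from the docs. `ToyRecordReader` and all of its methods are 
hypothetical stand-ins, not the parquet crate's `GenericRecordReader`, and the 
sketch assumes every record contributes at least one value, which I have not 
verified against the real reader.

```rust
/// Hypothetical stand-in for a record reader; NOT the parquet crate's type.
#[derive(Debug, Default)]
struct ToyRecordReader {
    num_records: usize,
    num_values: usize,
    values: Vec<i32>,
}

impl ToyRecordReader {
    /// Buffer one record made up of `vals`.
    /// Assumption of this sketch: a record always holds at least one value.
    fn read_record(&mut self, vals: &[i32]) {
        assert!(!vals.is_empty(), "toy model: a record has at least one value");
        self.values.extend_from_slice(vals);
        self.num_values += vals.len();
        self.num_records += 1;
    }

    /// Hand the buffered values to the caller; the counters are left untouched.
    fn consume_record_data(&mut self) -> Vec<i32> {
        std::mem::take(&mut self.values)
    }

    /// Clear the counters; meant to be called after the data has been consumed.
    fn reset(&mut self) {
        self.num_records = 0;
        self.num_values = 0;
    }

    /// True when both counters are zero, i.e. `reset` would change nothing.
    fn is_reset(&self) -> bool {
        self.num_records == 0 && self.num_values == 0
    }
}
```

   In this toy model, skipping `reset` is harmless only when both counters are 
already 0, and `num_records` can never exceed `num_values` because of the 
one-value-per-record assumption. Whether the real reader guarantees the same is 
exactly what I am unsure about.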


