HippoBaro commented on code in PR #9848:
URL: https://github.com/apache/arrow-rs/pull/9848#discussion_r3494355469


##########
parquet/src/arrow/record_reader/mod.rs:
##########
@@ -62,6 +62,19 @@ pub struct GenericRecordReader<V, CV> {
     num_records: usize,
     /// Capacity hint for pre-allocating buffers based on batch size
     capacity_hint: usize,
+    /// Number of values in the values buffer (may differ from num_values when
+    /// padding_threshold is set, since list-level padding is excluded).
+    values_written: usize,
+    /// When set, `pad_nulls` only pads item-level nulls (def >= threshold)

Review Comment:
   Agreed. I added the following doc comment:
   
   ```rust
       /// Definition-level threshold used for selective null padding.
       ///
       /// With full padding (`None`), the leaf values buffer has one slot for each
       /// decoded definition level. This includes placeholders for null or empty
       /// parent lists, which parent `ListArrayReader`s later have to filter out
       /// before computing offsets.
       ///
       /// With selective padding (`Some(threshold)`), the threshold is the nearest
       /// enclosing list/map definition level. Entries with `def < threshold`
       /// describe a null/empty parent and are skipped entirely. Entries with
       /// `def >= threshold` belong to an actual child item slot: real values are
       /// copied, and item-level nulls are padded. The companion `compact_bitmap`
       /// has the same compact length and becomes the leaf null bitmap.
       padding_threshold: Option<i16>,
   ```
   
   which hopefully helps the reader follow along. Let me know if you'd like to change that.
   
   Ref: https://github.com/HippoBaro/arrow-rs/blob/6598016d3ce76145594d913c4de468e28b9587a6/parquet/src/arrow/record_reader/mod.rs#L68-L81
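
For illustration, the difference between full and selective padding can be sketched in isolation. This is not the code in the PR: `pad_values` is a hypothetical helper, the `0` placeholder for null slots is an arbitrary choice, and the level assignment in the comments (an optional list of optional `i32` items with a maximum definition level of 3 and a threshold of 2) is an assumption made for the example.

```rust
/// Hypothetical stand-in for the padding step; not the arrow-rs API.
///
/// Assumed levels for an optional list of optional i32 items:
///   0 = null list, 1 = empty list, 2 = null item, 3 = present item.
/// `values` holds only the decoded non-null leaf values, in order.
/// Returns the padded values buffer and a per-slot validity vector.
fn pad_values(
    def_levels: &[i16],
    max_def: i16,
    values: &[i32],
    threshold: Option<i16>,
) -> (Vec<i32>, Vec<bool>) {
    let mut out = Vec::new();
    let mut valid = Vec::new();
    let mut decoded = values.iter().copied();

    for &def in def_levels {
        // Selective padding: entries below the threshold describe a
        // null/empty parent list and get no slot at all.
        if let Some(t) = threshold {
            if def < t {
                continue;
            }
        }
        if def == max_def {
            // A real value: copy it into the next slot.
            out.push(decoded.next().expect("one value per max-def entry"));
            valid.push(true);
        } else {
            // A null slot: pad with a placeholder and mark it invalid.
            out.push(0);
            valid.push(false);
        }
    }
    (out, valid)
}
```

Under those assumptions, the rows `[1, null]`, `null`, `[]`, `[2]` give definition levels `[3, 2, 0, 1, 3]` and decoded values `[1, 2]`. With `None` the helper yields five slots, two of which are placeholders for the null and empty lists; with `Some(2)` it yields three slots, one per actual child item, so a parent reader would not need to filter anything out before computing offsets.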


