alamb commented on code in PR #6105:
URL: https://github.com/apache/arrow-rs/pull/6105#discussion_r1693392323
##########
parquet/src/file/metadata/mod.rs:
##########
@@ -1045,6 +1216,28 @@ impl ColumnIndexBuilder {
self.null_counts.push(null_count);
}
+ /// Append the given page-level histograms to the [`ColumnIndex`] histograms.
+ /// Does nothing if the `ColumnIndexBuilder` is not in the `valid` state.
+ pub fn append_histograms(
+ &mut self,
+ repetition_level_histogram: &Option<LevelHistogram>,
+ definition_level_histogram: &Option<LevelHistogram>,
+ ) {
+ if !self.valid {
+ return;
+ }
+ if let Some(ref rep_lvl_hist) = repetition_level_histogram {
+ let hist = self.repetition_level_histograms.get_or_insert(Vec::new());
+ hist.reserve(rep_lvl_hist.len());
+ hist.extend(rep_lvl_hist.values());
Review Comment:
I went back and forth on this. I think I assumed this code was building
the thrift structure, so we should use the thrift representation (`Vec<i64>`),
but I may have gotten that wrong, as I find the naming quite confusing.
How about we explore doing it in a follow-on PR?
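To make the two representations concrete, here is a minimal sketch of the flattening under discussion. It is an illustration only: the `LevelHistogram` below is a hypothetical stand-in that models just the `len` and `values` calls visible in the diff, and `append_page` and `split_pages` are made-up helper names, not functions from the crate.

```rust
/// Hypothetical stand-in for the crate's `LevelHistogram`; only the
/// `len` and `values` methods used in the diff are modelled here.
#[derive(Debug, Clone, PartialEq)]
pub struct LevelHistogram {
    inner: Vec<i64>,
}

impl LevelHistogram {
    pub fn new(inner: Vec<i64>) -> Self {
        Self { inner }
    }

    pub fn len(&self) -> usize {
        self.inner.len()
    }

    pub fn values(&self) -> &[i64] {
        &self.inner
    }
}

/// Append one page's histogram to a flat, thrift-style `Vec<i64>`.
/// Mirrors the diff: a `None` page histogram leaves the list untouched.
pub fn append_page(flat: &mut Option<Vec<i64>>, page: &Option<LevelHistogram>) {
    if let Some(page) = page {
        let hist = flat.get_or_insert_with(Vec::new);
        hist.reserve(page.len());
        hist.extend_from_slice(page.values());
    }
}

/// Split a flat list back into per-page histograms, assuming every page
/// contributed exactly `num_levels` entries (an assumption of this sketch).
pub fn split_pages(flat: &[i64], num_levels: usize) -> Vec<LevelHistogram> {
    if num_levels == 0 {
        return Vec::new();
    }
    flat.chunks(num_levels)
        .map(|chunk| LevelHistogram::new(chunk.to_vec()))
        .collect()
}
```

Appending pages `[3, 1]` and `[0, 4]` leaves a single flat list whose page boundaries are implied only by the per-page length. That is the trade-off between the flat `Vec<i64>` and a typed per-page structure.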