fatemehp commented on code in PR #14603:
URL: https://github.com/apache/arrow/pull/14603#discussion_r1049255563
##########
cpp/src/parquet/column_reader.h:
##########
@@ -55,6 +56,29 @@ static constexpr uint32_t kDefaultMaxPageHeaderSize = 16 * 1024 * 1024;
// 16 KB is the default expected page header size
static constexpr uint32_t kDefaultPageHeaderSize = 16 * 1024;
+// \brief DataPageStats stores encoded statistics and number of values/rows for
+// a page.
+struct PARQUET_EXPORT DataPageStats {
+ DataPageStats(EncodedStatistics* encoded_statistics, int32_t num_values,
+ std::optional<int32_t> num_rows)
+ : encoded_statistics(encoded_statistics),
+ is_stats_set(encoded_statistics->is_set()),
+ num_values(num_values),
+ num_rows(num_rows) {}
+
+ // Encoded statistics extracted from the page header.
+ EncodedStatistics* encoded_statistics;
+ // False if there were no encoded statistics in the page header.
+ bool is_stats_set;
Review Comment:
Good point. I removed is_stats_set and added a comment noting that
encoded_statistics is nullptr when the page header carries no statistics.
I also added a test for that case.
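For readers following along, here is a rough sketch of what the struct might look like after this change, and how a caller could handle the nullptr case. It is not the code in the PR: the EncodedStatistics type below is a simplified stand-in for the real parquet class, and ShouldSkipPage, its threshold parameter, and the byte-wise comparison of the max value are hypothetical, for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Simplified stand-in for parquet::EncodedStatistics (assumption: only the
// two fields used by this sketch).
struct EncodedStatistics {
  std::string min;
  std::string max;
};

// Sketch of DataPageStats without is_stats_set: a null pointer now signals
// that the page header had no encoded statistics.
struct DataPageStats {
  DataPageStats(EncodedStatistics* encoded_statistics, int32_t num_values,
                std::optional<int32_t> num_rows)
      : encoded_statistics(encoded_statistics),
        num_values(num_values),
        num_rows(num_rows) {}

  // nullptr if there were no encoded statistics in the page header.
  EncodedStatistics* encoded_statistics;
  int32_t num_values;
  std::optional<int32_t> num_rows;
};

// Hypothetical filter: skip a page only when statistics are present and the
// encoded max compares (byte-wise) below the threshold.
bool ShouldSkipPage(const DataPageStats& stats, const std::string& threshold) {
  assert(stats.num_values >= 0);  // sanity check on the caller's input
  if (stats.encoded_statistics == nullptr) {
    return false;  // no statistics, so the page must be read
  }
  return stats.encoded_statistics->max < threshold;
}
```

The point of the sketch is that callers check the pointer before dereferencing it, which replaces the earlier check on the is_stats_set flag.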
##########
cpp/src/parquet/column_reader.h:
##########
@@ -115,11 +141,27 @@ class PARQUET_EXPORT PageReader {
bool always_compressed = false,
const CryptoContext* ctx = NULLPTR);
+ // If data_page_filter_ is present (not null), NextPage() will call the
Review Comment:
Done