pitrou commented on code in PR #14603:
URL: https://github.com/apache/arrow/pull/14603#discussion_r1042229280


##########
cpp/src/parquet/column_reader.h:
##########
@@ -115,11 +116,30 @@ class PARQUET_EXPORT PageReader {
                                           bool always_compressed = false,
                                           const CryptoContext* ctx = NULLPTR);
 
+  // If skip_page_callback_ is present (not null), NextPage() will call the
+  // callback function exactly once per page in the order the pages appear in
+  // the column. If the callback function returns true the page will be
+  // skipped. The callback will be called only if the page type is DATA_PAGE or
+  // DATA_PAGE_V2. Dictionary pages will not be skipped.
+  // This setter must be called at most once to set the callback.
+  // \note API EXPERIMENTAL
+  void set_skip_page_callback(
+      std::function<bool(const DataPageStats&)> skip_page_callback) {
+    if (skip_page_callback_) {
+      throw ParquetException("set_skip_page_callback was called more than once");

Review Comment:
   Well, unless we do a similar thing for other configuration properties, this seems a bit gratuitous.
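
   For illustration only, here is a minimal sketch of what a plain setter without the "called more than once" check might look like, which is one reading of the comment above. `MockPageReader` and the `num_values` field are hypothetical stand-ins, not the actual `parquet::PageReader` or the PR's `DataPageStats` definition; only the callback signature is taken from the hunk.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the PR's DataPageStats; the real struct's
// fields are not shown in this hunk.
struct DataPageStats {
  int64_t num_values;
};

// Hypothetical mock reader, NOT parquet::PageReader. It only models the
// "ask the callback once per data page, in column order" behaviour
// described in the doc comment.
class MockPageReader {
 public:
  explicit MockPageReader(std::vector<DataPageStats> pages)
      : pages_(std::move(pages)) {}

  // Plain setter: a later call simply replaces the earlier callback,
  // the way an ordinary configuration property would behave.
  void set_skip_page_callback(
      std::function<bool(const DataPageStats&)> skip_page_callback) {
    skip_page_callback_ = std::move(skip_page_callback);
  }

  // Returns the indices of the pages that were not skipped, in order.
  std::vector<std::size_t> ReadAll() const {
    std::vector<std::size_t> kept;
    for (std::size_t i = 0; i < pages_.size(); ++i) {
      // With no callback set, no page is skipped.
      if (skip_page_callback_ && skip_page_callback_(pages_[i])) continue;
      kept.push_back(i);
    }
    return kept;
  }

 private:
  std::vector<DataPageStats> pages_;
  std::function<bool(const DataPageStats&)> skip_page_callback_;
};
```

   With this shape, calling the setter a second time is not an error; the most recently set callback wins.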



