mapleFU opened a new issue, #38880: URL: https://github.com/apache/arrow/issues/38880
### Describe the enhancement requested

`parquet::ColumnReader::HasNextInternal` might call `ReadNewPage` to check the record boundary.

```c++
bool HasNextInternal() {
  // Either there is no data page available yet, or the data page has been
  // exhausted
  if (num_buffered_values_ == 0 || num_decoded_values_ == num_buffered_values_) {
    if (!ReadNewPage() || num_buffered_values_ == 0) {
      return false;
    }
  }
  return true;
}
```

And `ReadNewPage` will call:

```c++
// Advance to the next data page
bool ReadNewPage() {
  // Loop until we find the next data page.
  while (true) {
    current_page_ = pager_->NextPage();
    if (!current_page_) {
      // EOS
      return false;
    }
```

When a `data_page_filter` is set and the file uses the v1 data page format, it seems that `NextPage` might filter out the data page?

### Component(s)

C++, Parquet
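
To make the concern about `HasNextInternal` and a filtering `NextPage` concrete, here is a minimal, self-contained sketch. It is an assumption-laden model, not Arrow code: `MockPage`, `MockPager`, `MockColumnReader`, and `PagesSkippedByBoundaryCheck` are hypothetical names invented for illustration, and the filter is modeled as a simple callback that returns `true` when a page should be skipped. It only shows how a "has next" check that pulls a new page could advance past filtered pages as a side effect; whether the real reader behaves this way is exactly the open question.

```c++
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the Arrow types.
struct MockPage {
  int num_values;
};

class MockPager {
 public:
  using Filter = std::function<bool(const MockPage&)>;  // true => skip the page

  MockPager(std::vector<MockPage> pages, Filter filter)
      : pages_(std::move(pages)), filter_(std::move(filter)) {}

  // Returns the next page that survives the filter, or nullptr at end of stream.
  std::unique_ptr<MockPage> NextPage() {
    while (next_ < pages_.size()) {
      const MockPage& page = pages_[next_++];
      if (filter_ && filter_(page)) {
        ++pages_skipped_;  // the page is dropped without the caller noticing
        continue;
      }
      return std::make_unique<MockPage>(page);
    }
    return nullptr;
  }

  int pages_skipped() const { return pages_skipped_; }

 private:
  std::vector<MockPage> pages_;
  Filter filter_;
  std::size_t next_ = 0;
  int pages_skipped_ = 0;
};

class MockColumnReader {
 public:
  explicit MockColumnReader(MockPager* pager) : pager_(pager) {}

  // Same shape as the HasNextInternal quoted in the issue.
  bool HasNext() {
    if (num_buffered_values_ == 0 || num_decoded_values_ == num_buffered_values_) {
      if (!ReadNewPage() || num_buffered_values_ == 0) return false;
    }
    return true;
  }

  // Pretend that every value of the current page has been decoded.
  void ConsumePage() { num_decoded_values_ = num_buffered_values_; }

  int num_buffered_values() const { return num_buffered_values_; }

 private:
  bool ReadNewPage() {
    current_page_ = pager_->NextPage();
    if (!current_page_) return false;  // end of stream
    num_buffered_values_ = current_page_->num_values;
    num_decoded_values_ = 0;
    return true;
  }

  MockPager* pager_;
  std::unique_ptr<MockPage> current_page_;
  int num_buffered_values_ = 0;
  int num_decoded_values_ = 0;
};

// Three pages, the middle one rejected by the filter. Returns how many pages
// were skipped as a side effect of the second HasNext() call.
int PagesSkippedByBoundaryCheck() {
  MockPager pager(std::vector<MockPage>{{10}, {20}, {30}},
                  [](const MockPage& p) { return p.num_values == 20; });
  MockColumnReader reader(&pager);

  const bool first = reader.HasNext();  // loads the 10-value page
  assert(first);
  (void)first;
  reader.ConsumePage();

  reader.HasNext();  // intended as a check, but it advances past the 20-value page
  return pager.pages_skipped();
}
```

In this model, calling `PagesSkippedByBoundaryCheck()` reports one skipped page: the second `HasNext()` is meant only to probe for more data, yet it silently moves the reader past the filtered page. If a record could span the boundary into that page, the boundary check would see the wrong page.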