zhuqi-lucas opened a new pull request, #9937:
URL: https://github.com/apache/arrow-rs/pull/9937
# Which issue does this PR close?
Tracking issue: #9934 — *parquet: Support reverse page iteration for reverse
streaming with filter pushdown*.
This PR is a **proof-of-concept (Phase 1)** for the building block proposed
in that issue. It is **not** intended for immediate merge — its purpose is to
ground the design discussion in real code, demonstrate feasibility, and gather
feedback on the API shape before a final PR.
# Rationale for this change
DataFusion has merged a row-group-level reverse scan
(apache/datafusion#18817) for `ORDER BY DESC LIMIT N` queries on
ascending-sorted Parquet files. That granularity is coarse (~128 MB per row
group), which means high first-batch latency and high peak memory. For `WHERE
filter ORDER BY DESC LIMIT N`, where `RowSelection` cannot pre-select the
matching rows, a reverse stream is the only correct strategy.
A page-level reverse iterator is the missing primitive. The key insight (per
#9934) is to separate **two distinct concepts**:
| Concept | Possible? |
|---|---|
| Reverse-decoding rows *within* a single page | ❌ No (RLE / dict / delta / bit-packing are forward streams) |
| Iterating page **traversal order** in reverse, decoding each page forward | ✅ Yes (via `OffsetIndex.page_locations`) |
This PR implements the second.
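As a minimal self-contained sketch of that separation (the `PageLocation` struct and function names below are illustrative stand-ins, not the PR's actual API): the byte ranges from the page index can be visited in any order, while each individual page is still consumed strictly front-to-back.

```rust
/// Stand-in for an `OffsetIndex.page_locations` entry: byte offset + length.
#[derive(Clone, Copy)]
struct PageLocation {
    offset: usize,
    length: usize,
}

/// Decode a single page strictly front-to-back, as real codecs
/// (RLE / dictionary / delta / bit-packing) require.
fn decode_page_forward(chunk: &[u8], loc: PageLocation) -> Vec<u8> {
    chunk[loc.offset..loc.offset + loc.length].to_vec()
}

/// Visit pages in reverse *traversal* order while decoding each one forward.
fn decode_pages_reversed(chunk: &[u8], locations: &[PageLocation]) -> Vec<Vec<u8>> {
    locations
        .iter()
        .rev() // reverse only the traversal order
        .map(|&loc| decode_page_forward(chunk, loc)) // forward decode per page
        .collect()
}

fn main() {
    let chunk = b"AABBBCC"; // three toy "pages": AA, BBB, CC
    let locs = [
        PageLocation { offset: 0, length: 2 },
        PageLocation { offset: 2, length: 3 },
        PageLocation { offset: 5, length: 2 },
    ];
    let pages = decode_pages_reversed(chunk, &locs);
    // Pages arrive last-to-first, each decoded front-to-back.
    assert_eq!(pages, vec![b"CC".to_vec(), b"BBB".to_vec(), b"AA".to_vec()]);
    println!("ok");
}
```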
# What changes are included in this PR?
A new `pub mod reverse_serialized_reader` exposing
`ReverseSerializedPageReader<R: ChunkReader>`:
- Implements `PageReader + Iterator<Item = Result<Page>>`
- Emits the dictionary page (if any) first, then data pages from the last
`PageLocation` to the first
- Reuses existing `decode_page`, `ThriftSliceInputProtocol`, and `Codec`
plumbing — **no changes to `serialized_reader.rs`**
```rust
let reader = ReverseSerializedPageReader::new(
    chunk_reader,
    column_chunk_metadata,
    offset_index,
)?;
for page in reader {
    let page = page?; // the Iterator item is Result<Page>
    // each page is decoded forward, but pages are emitted in reverse order
}
```
## Limitations (Phase 1 POC, intentional)
- Encryption is not supported.
- `peek_next_page` does not populate `num_rows`.
- Requires the column chunk to have an `OffsetIndex` (page index).
## Test coverage (20 tests)
| Category | Tests |
|---|---|
| Page buffer / metadata equivalence | UNCOMPRESSED / SNAPPY / GZIP / ZSTD; V1 + V2 pages; with / without dict |
| Value-level decode equivalence (via `ColumnReaderImpl`) | INT32 / INT64 / BYTE_ARRAY; with / without dict; V1 + V2 |
| NULL handling | OPTIONAL columns, V1 + V2, ~30% nulls |
| State machine | `peek_next_page` over full lifecycle; `skip_next_page` with / without dict |
| Edge cases | Single-data-page chunks; 50k-row stress (> 20 pages); `Iterator` impl |
The page-level tests verify byte-identical buffers between forward
(`SerializedPageReader`) and reverse readers. The decode-level tests further
verify that, after page-boundary slicing and re-arrangement, `ColumnReaderImpl`
produces identical values and `def_levels`.
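Schematically, the equivalence the page-level tests rely on can be stated with plain strings standing in for decoded page buffers (this is a toy model, not the real `Page` type or test code): the dictionary page is pinned to the front in both orders, and reversing the remaining data pages recovers the forward reader's sequence byte-for-byte.

```rust
/// Reconstruct forward page order from the reverse reader's emission order:
/// the dictionary page (if any) stays first; the data pages are reversed back.
fn forward_order_from_reverse(reverse: &[&'static str], has_dict: bool) -> Vec<&'static str> {
    let (dict, data) = if has_dict {
        (&reverse[..1], &reverse[1..])
    } else {
        (&reverse[..0], &reverse[..])
    };
    dict.iter().chain(data.iter().rev()).copied().collect()
}

fn main() {
    // Reverse reader: dictionary page first, then data pages last-to-first.
    let reverse = ["dict", "data2", "data1", "data0"];
    // Un-reversing recovers the forward reader's order; buffers are untouched.
    assert_eq!(
        forward_order_from_reverse(&reverse, true),
        vec!["dict", "data0", "data1", "data2"]
    );
    println!("ok");
}
```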
# Why is this an RFC / draft?
The remaining work for end-to-end usefulness sits **above** this primitive:
- **Phase 2**: a `ParquetRecordBatchReader` mode that aligns batch
boundaries with page boundaries and reverses each page's `RecordBatch` after
decoding, so the final stream is fully reversed. This requires deciding whether
to integrate with `ParquetRecordBatchReaderBuilder` or expose a separate
reverse-reader path.
- **Phase 3**: filter + limit early termination (the killer use case from
#9934).
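The Phase 2 invariant above can be sketched with plain vectors standing in for page-aligned `RecordBatch`es (again illustrative, not the proposed implementation): reversing page traversal order (this PR's primitive) *and* reversing the rows within each batch yields the fully reversed row stream.

```rust
/// Emit pages last-to-first, reversing each page's rows as it is emitted.
fn fully_reversed(pages: &[Vec<i32>]) -> Vec<i32> {
    pages
        .iter()
        .rev() // page traversal reversed (this PR's primitive)
        .flat_map(|page| page.iter().rev().copied()) // per-batch row reversal (Phase 2)
        .collect()
}

fn main() {
    let pages = vec![vec![1, 2, 3], vec![4, 5], vec![6, 7, 8, 9]];

    // Expected: the whole row stream in reverse.
    let mut expected: Vec<i32> = pages.iter().flatten().copied().collect();
    expected.reverse();

    assert_eq!(fully_reversed(&pages), expected);
    println!("ok");
}
```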
Specific feedback I'd appreciate:
1. **API surface**: low-level `ReverseSerializedPageReader` (this PR) vs. a
high-level `ArrowReaderBuilder::with_reverse_pages(true)` — or both?
2. **Module placement**: keep as a sibling module
`file/reverse_serialized_reader.rs`, or fold the reverse path into
`serialized_reader.rs` itself?
3. **Encryption**: extend `SerializedPageReaderContext` to be reusable, or
defer encryption support to a later PR?
4. **`peek_next_page` semantics**: precise `num_rows` requires threading
`total_rows` through; worth doing in this PR?
# Are there any user-facing changes?
A new public type
`parquet::file::reverse_serialized_reader::ReverseSerializedPageReader`. No
existing APIs are changed.
# Verification
```
cargo build -p parquet
cargo check -p parquet --all-features --tests
cargo test -p parquet --lib reverse_serialized_reader # 20 passed
cargo clippy -p parquet --all-targets -- -D warnings
cargo fmt -p parquet --check
```