zhuqi-lucas opened a new pull request, #9937:
URL: https://github.com/apache/arrow-rs/pull/9937
# Which issue does this PR close?
Tracking issue: #9934 — *parquet: Support reverse page iteration for reverse
streaming with filter pushdown*.
This PR is a **proof-of-concept (Phase 1)** for the building block proposed
in that issue. It is **not** intended for immediate merge — its purpose is to
ground the design discussion in real code, demonstrate feasibility, and gather
feedback on the API shape before a final PR.
# Rationale for this change
DataFusion has merged a row-group-level reverse scan
(apache/datafusion#18817) for `ORDER BY DESC LIMIT N` queries on
ascending-sorted Parquet files. That granularity is coarse (~128 MB per row
group), which means high first-batch latency and high peak memory. For `WHERE
filter ORDER BY DESC LIMIT N`, where `RowSelection` cannot pre-select the
matching rows, a reverse stream is the only correct strategy.
A page-level reverse iterator is the missing primitive. The key insight (per
#9934) is to separate **two distinct concepts**:
| Concept | Possible? |
|---|---|
| Reverse-decoding rows *within* a single page | ❌ No (RLE / dict / delta / bit-packing are forward streams) |
| Iterating page **traversal order** in reverse, decoding each page forward | ✅ Yes (via `OffsetIndex.page_locations`) |
This PR implements the second.
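As a minimal self-contained sketch of that separation (the `PageLocation` struct and function names below are illustrative stand-ins, not the PR's actual API): the byte ranges from the page index can be visited in any order, while each individual page is still consumed strictly front-to-back.

```rust
/// Stand-in for an `OffsetIndex.page_locations` entry: byte offset + length.
#[derive(Clone, Copy)]
struct PageLocation {
    offset: usize,
    length: usize,
}

/// Decode a single page strictly front-to-back, as real codecs
/// (RLE / dictionary / delta / bit-packing) require.
fn decode_page_forward(chunk: &[u8], loc: PageLocation) -> Vec<u8> {
    chunk[loc.offset..loc.offset + loc.length].to_vec()
}

/// Visit pages in reverse *traversal* order while decoding each one forward.
fn decode_pages_reversed(chunk: &[u8], locations: &[PageLocation]) -> Vec<Vec<u8>> {
    locations
        .iter()
        .rev() // reverse only the traversal order
        .map(|&loc| decode_page_forward(chunk, loc)) // forward decode per page
        .collect()
}

fn main() {
    let chunk = b"AABBBCC"; // three toy "pages": AA, BBB, CC
    let locs = [
        PageLocation { offset: 0, length: 2 },
        PageLocation { offset: 2, length: 3 },
        PageLocation { offset: 5, length: 2 },
    ];
    let pages = decode_pages_reversed(chunk, &locs);
    // Pages arrive last-to-first, each decoded front-to-back.
    assert_eq!(pages, vec![b"CC".to_vec(), b"BBB".to_vec(), b"AA".to_vec()]);
    println!("ok");
}
```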
# What changes are included in this PR?
A new `pub mod reverse_serialized_reader` exposing
`ReverseSerializedPageReader<R: ChunkReader>`:
- Implements `PageReader + Iterator<Item = Result<Page>>`
- Emits the dictionary page (if any) first, then data pages from the last
`PageLocation` to the first
- Reuses existing `decode_page`, `ThriftSliceInputProtocol`, and `Codec`
plumbing — **no changes to `serialized_reader.rs`**
```rust
let reader = ReverseSerializedPageReader::new(
    chunk_reader,
    column_chunk_metadata,
    offset_index,
)?;
for page in reader {
    let page = page?; // the Iterator item is Result<Page>
    // each page is decoded forward, but pages are emitted in reverse order
}
```
## Limitations (Phase 1 POC, intentional)
- Encryption is not supported.
- `peek_next_page` does not populate `num_rows`.
- Requires the column chunk to have an `OffsetIndex` (page index).
## Test coverage (20 tests)
| Category | Tests |
|---|---|
| Page buffer / metadata equivalence | UNCOMPRESSED / SNAPPY / GZIP / ZSTD; V1 + V2 pages; with / without dict |
| Value-level decode equivalence (via `ColumnReaderImpl`) | INT32 / INT64 / BYTE_ARRAY; with / without dict; V1 + V2 |
| NULL handling | OPTIONAL columns, V1 + V2, ~30% nulls |
| State machine | `peek_next_page` over full lifecycle; `skip_next_page` with / without dict |
| Edge cases | Single-data-page chunks; 50k-row stress (> 20 pages); `Iterator` impl |
The page-level tests verify byte-identical buffers between forward
(`SerializedPageReader`) and reverse readers. The decode-level tests further
verify that, after page-boundary slicing and re-arrangement, `ColumnReaderImpl`
produces identical values and `def_levels`.
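Schematically, the equivalence the page-level tests rely on can be stated with plain strings standing in for decoded page buffers (this is a toy model, not the real `Page` type or test code): the dictionary page is pinned to the front in both orders, and reversing the remaining data pages recovers the forward reader's sequence byte-for-byte.

```rust
/// Reconstruct forward page order from the reverse reader's emission order:
/// the dictionary page (if any) stays first; the data pages are reversed back.
fn forward_order_from_reverse(reverse: &[&'static str], has_dict: bool) -> Vec<&'static str> {
    let (dict, data) = if has_dict {
        (&reverse[..1], &reverse[1..])
    } else {
        (&reverse[..0], &reverse[..])
    };
    dict.iter().chain(data.iter().rev()).copied().collect()
}

fn main() {
    // Reverse reader: dictionary page first, then data pages last-to-first.
    let reverse = ["dict", "data2", "data1", "data0"];
    // Un-reversing recovers the forward reader's order; buffers are untouched.
    assert_eq!(
        forward_order_from_reverse(&reverse, true),
        vec!["dict", "data0", "data1", "data2"]
    );
    println!("ok");
}
```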
# Why is this an RFC / draft?
The remaining work for end-to-end usefulness sits **above** this primitive:
- **Phase 2**: a `ParquetRecordBatchReader` mode that aligns batch
boundaries with page boundaries and reverses each page's `RecordBatch` after
decoding, so the final stream is fully reversed. This requires deciding whether
to integrate with `ParquetRecordBatchReaderBuilder` or expose a separate
reverse-reader path.
- **Phase 3**: filter + limit early termination (the killer use case from
#9934).
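The Phase 2 invariant above can be sketched with plain vectors standing in for page-aligned `RecordBatch`es (again illustrative, not the proposed implementation): reversing page traversal order (this PR's primitive) *and* reversing the rows within each batch yields the fully reversed row stream.

```rust
/// Emit pages last-to-first, reversing each page's rows as it is emitted.
fn fully_reversed(pages: &[Vec<i32>]) -> Vec<i32> {
    pages
        .iter()
        .rev() // page traversal reversed (this PR's primitive)
        .flat_map(|page| page.iter().rev().copied()) // per-batch row reversal (Phase 2)
        .collect()
}

fn main() {
    let pages = vec![vec![1, 2, 3], vec![4, 5], vec![6, 7, 8, 9]];

    // Expected: the whole row stream in reverse.
    let mut expected: Vec<i32> = pages.iter().flatten().copied().collect();
    expected.reverse();

    assert_eq!(fully_reversed(&pages), expected);
    println!("ok");
}
```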
Specific feedback I'd appreciate:
1. **API surface**: low-level `ReverseSerializedPageReader` (this PR) vs. a
high-level `ArrowReaderBuilder::with_reverse_pages(true)` — or both?
2. **Module placement**: keep as a sibling module
`file/reverse_serialized_reader.rs`, or fold the reverse path into
`serialized_reader.rs` itself?
3. **Encryption**: extend `SerializedPageReaderContext` to be reusable, or
defer encryption support to a later PR?
4. **`peek_next_page` semantics**: precise `num_rows` requires threading
`total_rows` through; worth doing in this PR?
# Are there any user-facing changes?
A new public type
`parquet::file::reverse_serialized_reader::ReverseSerializedPageReader`. No
existing APIs are changed.
# Verification
```
cargo build -p parquet
cargo check -p parquet --all-features --tests
cargo test -p parquet --lib reverse_serialized_reader # 20 passed
cargo clippy -p parquet --all-targets -- -D warnings
cargo fmt -p parquet --check
```