joechenrh commented on issue #865: URL: https://github.com/apache/arrow-go/issues/865#issuecomment-4842211021
A sketch of the approach — would appreciate a sanity check on the shape before I build it out. **Goal:** opt-in streaming decode for large data pages so peak memory tracks the requested batch size rather than the uncompressed page size. Off by default, no behavior change for existing readers. **Scope (first cut):** Data Page V1 + V2, `PLAIN` encoding, types `int32/int64/float/double/FLBA/byte_array`. Codecs: `UNCOMPRESSED/GZIP/BROTLI/ZSTD` only — kept as an explicit allowlist rather than a [`StreamingCodec`](https://github.com/apache/arrow-go/blob/d17f6870a0569b670d985e6eac60741af621bc88/parquet/compress/compress.go#L98) assertion, since Snappy's `NewReader` is the framed format while Parquet stores raw blocks, and LZ4_RAW has no streaming reader. Everything else falls back to the current materialized path. **Core idea — a `ValueBuffer` source between page and decoder.** Today decoders read from an in-memory `[]byte` ([`SetData`](https://github.com/apache/arrow-go/blob/d17f6870a0569b670d985e6eac60741af621bc88/parquet/internal/encoding/types.go#L37)) sliced out of the fully-decompressed page ([`decompress`](https://github.com/apache/arrow-go/blob/d17f6870a0569b670d985e6eac60741af621bc88/parquet/file/page_reader.go#L501)). Introduce: ```go type ValueBuffer interface { Bytes() []byte // currently buffered; valid until next Fill Fill(consumed, need int) ([]byte, error) // drop consumed, ensure >= need contiguous (grows for oversized values) Stable() bool // true => caller may zero-copy alias (buffer never slides) } ``` Two implementations: a materialized one wrapping the existing page buffer (never refills, `Stable()=true`) and a streaming one wrapping `codec.NewReader(io.LimitReader(...))`. PLAIN decoders are reworked to index `Bytes()` directly and `Fill` only on underflow, so the materialized path is the degenerate "never refill" case and existing behavior is preserved (to be confirmed by benchmarks). **Pages — no new type.** [`DataPageV1`/`DataPageV2`](https://github.com/apache/arrow-go/blob/d17f6870a0569b670d985e6eac60741af621bc88/parquet/file/page_reader.go#L135) gain a `valueSource` field (non-nil ⇒ streaming), and the [`DataPage`](https://github.com/apache/arrow-go/blob/d17f6870a0569b670d985e6eac60741af621bc88/parquet/file/page_reader.go#L107) interface gains: ```go LevelData() []byte // bytes for the level decoders ValueSource(levelBytes int) encoding.ValueBuffer // source for the value decoder ``` rep/def levels stay fully materialized (small); a streaming page's `Data()` returns nil. The streaming-vs-materialized branch lives inside these two methods, so the column reader stays branch-free. **Page reader.** Eligible pages skip `decompress()` in [`Next()`](https://github.com/apache/arrow-go/blob/d17f6870a0569b670d985e6eac60741af621bc88/parquet/file/page_reader.go#L740). V2 levels are uncompressed with header byte-lengths; for V1 the rep|def region is peeled off the decompressed stream using exactly the length rule [`LevelDecoder.SetData`](https://github.com/apache/arrow-go/blob/d17f6870a0569b670d985e6eac60741af621bc88/parquet/internal/encoding/levels.go#L187) already uses (RLE 4-byte prefix / BIT_PACKED computed), leaving the stream at the first value byte. The value stream wraps an `io.LimitReader` sized to the compressed region; on page release it is drained and the decompressor closed, so the reader lands on the next header even if the caller skipped values. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
