Alec Mocatta created ARROW-4314:
-----------------------------------
Summary: Strongly-typed reading of Parquet data
Key: ARROW-4314
URL: https://issues.apache.org/jira/browse/ARROW-4314
Project: Apache Arrow
Issue Type: New Feature
Components: Rust
Reporter: Alec Mocatta
See the proposal I made on [~csun]'s repository
[here|https://github.com/sunchao/parquet-rs/issues/205] for more details.
This aims to let the user opt in to strong typing and substantial performance
improvements (2x-7x, see
[here|https://github.com/sunchao/parquet-rs/issues/205#issuecomment-446016254])
by optionally specifying the type of the records that they are iterating over.
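As a rough illustration of what "optionally specifying the type of the records" could look like, here is a minimal, self-contained sketch. It does not use the parquet crate or the API from the linked proposal: `Value`, `Row`, `Record`, `Trade`, `read_records` and `sample_rows` are all hypothetical names invented for this example. It only shows the shape of the opt-in (one generic entry point, with the untyped row as one possible record type) and says nothing about where the reported speedup comes from.

```rust
/// A dynamically typed field value, standing in for an untyped field.
/// (Hypothetical; not a type from the parquet crate.)
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Float(f64),
    Text(String),
}

/// An untyped row: field names and types are only known at run time.
pub type Row = Vec<(String, Value)>;

/// Hypothetical trait for anything a reader can yield as a record.
pub trait Record: Sized {
    fn from_row(row: &Row) -> Result<Self, String>;
}

/// The untyped case: hand the row back unchanged.
impl Record for Row {
    fn from_row(row: &Row) -> Result<Self, String> {
        Ok(row.clone())
    }
}

/// A user-defined record type that the caller opts in to.
#[derive(Debug, PartialEq)]
pub struct Trade {
    pub id: i64,
    pub price: f64,
    pub symbol: String,
}

/// Look up a field by name, reporting a missing field as an error.
fn field<'a>(row: &'a Row, name: &str) -> Result<&'a Value, String> {
    row.iter()
        .find(|(n, _)| n.as_str() == name)
        .map(|(_, v)| v)
        .ok_or_else(|| format!("missing field `{}`", name))
}

impl Record for Trade {
    fn from_row(row: &Row) -> Result<Self, String> {
        let id = match field(row, "id")? {
            Value::Int(v) => *v,
            other => return Err(format!("`id`: expected Int, got {:?}", other)),
        };
        let price = match field(row, "price")? {
            Value::Float(v) => *v,
            other => return Err(format!("`price`: expected Float, got {:?}", other)),
        };
        let symbol = match field(row, "symbol")? {
            Value::Text(s) => s.clone(),
            other => return Err(format!("`symbol`: expected Text, got {:?}", other)),
        };
        Ok(Trade { id, price, symbol })
    }
}

/// One generic entry point: the caller picks the record type `R`.
pub fn read_records<R: Record>(rows: &[Row]) -> Result<Vec<R>, String> {
    rows.iter().map(R::from_row).collect()
}

/// A single made-up row used to exercise the sketch.
pub fn sample_rows() -> Vec<Row> {
    vec![vec![
        ("id".to_string(), Value::Int(1)),
        ("price".to_string(), Value::Float(9.5)),
        ("symbol".to_string(), Value::Text("ABC".to_string())),
    ]]
}
```

With this shape, `read_records::<Row>(&rows)` keeps the dynamically typed behaviour, while `read_records::<Trade>(&rows)` yields plain structs and surfaces a type mismatch as an `Err` instead of leaving it to later field accesses.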
It is currently a work in progress. All pre-existing tests succeed, bar those
in src/record/api.rs which are commented out as they require reworking. Where
relevant, pre-existing tests and benchmarks have been duplicated to make new
strongly-typed tests and benchmarks, which all also succeed. I've tried to
maintain pre-existing APIs where possible. Some changes have been made to
better align with prior art in the Rust ecosystem.
Any feedback while I continue working on it is very welcome! Looking forward to
hopefully seeing this merged when it's ready.
[https://github.com/alecmocatta/arrow]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)