alamb opened a new issue, #8164: URL: https://github.com/apache/arrow-rs/issues/8164
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

- part of https://github.com/apache/arrow-rs/issues/8000

The current [`ParquetMetaDataReader`](https://docs.rs/parquet/latest/parquet/file/metadata/struct.ParquetMetaDataReader.html) is a wonder of software engineering thanks to @etseidl. However, it is somewhat complicated to use: it has both async and sync methods, and it keeps state internally in a non-obvious way. For example, do you call `try_parse` or `parse_and_finish`? And how is `load_via_suffix_and_finish` related?

Compared to what came before it, `ParquetMetaDataReader` is an amazing improvement, but I think we could do better.

I ran into this when I discovered that the metadata is needed when implementing a push decoder for Parquet:

- https://github.com/apache/arrow-rs/issues/7983

Basically, I want a way to parse the metadata without **ALSO** doing the IO at the same time.

**Describe the solution you'd like**

If we want to truly separate IO and CPU, we also need a way to decode the metadata without explicit IO. Hence this PR, which provides a way to decode metadata "push style", where the decoder tells you what bytes are needed. It follows the same API as the parquet push decoder.
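To make the "push style" idea concrete, here is a minimal sketch of what such a decoder could look like. It is not the API proposed in the PR: `FooterDecoder`, `DecodeResult`, `push_range`, and `try_decode` are hypothetical names chosen for illustration. The only fact it relies on is the Parquet footer layout, where a file ends with a 4-byte little-endian metadata length followed by the magic bytes `PAR1`. The sketch stops at locating the raw metadata bytes and does not parse the Thrift-encoded metadata itself.

```rust
use std::ops::Range;

/// Outcome of asking the decoder to make progress (hypothetical type).
#[derive(Debug, PartialEq)]
enum DecodeResult<T> {
    /// The caller must fetch these byte ranges of the file and push them in.
    NeedsData(Vec<Range<u64>>),
    /// Decoding is complete.
    Data(T),
}

/// Hypothetical push-style decoder: it never performs IO itself.
struct FooterDecoder {
    file_len: u64,
    /// Byte ranges the caller has pushed so far, with their contents.
    buffered: Vec<(Range<u64>, Vec<u8>)>,
}

impl FooterDecoder {
    fn new(file_len: u64) -> Self {
        Self { file_len, buffered: Vec::new() }
    }

    /// Hand the decoder bytes that the caller fetched however it likes
    /// (sync read, async object store request, memory map, ...).
    fn push_range(&mut self, range: Range<u64>, data: Vec<u8>) {
        assert_eq!(data.len() as u64, range.end - range.start);
        self.buffered.push((range, data));
    }

    /// Return the bytes for `want` if some pushed buffer fully covers it.
    fn get(&self, want: &Range<u64>) -> Option<&[u8]> {
        self.buffered.iter().find_map(|(have, bytes)| {
            if have.start <= want.start && want.end <= have.end {
                let start = (want.start - have.start) as usize;
                let end = (want.end - have.start) as usize;
                Some(&bytes[start..end])
            } else {
                None
            }
        })
    }

    /// Pure CPU work: either report which bytes are missing or return
    /// the raw (still Thrift-encoded) metadata bytes.
    fn try_decode(&self) -> Result<DecodeResult<Vec<u8>>, String> {
        if self.file_len < 8 {
            return Err("file too small to contain a Parquet footer".to_string());
        }
        // Step 1: the last 8 bytes hold the metadata length and the magic.
        let body_len = self.file_len - 8;
        let tail_range = body_len..self.file_len;
        let tail = match self.get(&tail_range) {
            Some(bytes) => bytes,
            None => return Ok(DecodeResult::NeedsData(vec![tail_range])),
        };
        if &tail[4..8] != b"PAR1" {
            return Err("missing PAR1 magic at end of file".to_string());
        }
        let len = u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]) as u64;
        if len > body_len {
            return Err("metadata length exceeds file size".to_string());
        }
        // Step 2: the metadata sits immediately before those 8 bytes.
        let meta_range = (body_len - len)..body_len;
        match self.get(&meta_range) {
            Some(bytes) => Ok(DecodeResult::Data(bytes.to_vec())),
            None => Ok(DecodeResult::NeedsData(vec![meta_range])),
        }
    }
}

fn main() {
    // Build a fake "file" in memory: payload, metadata, length, magic.
    let metadata = b"fake-thrift-metadata".to_vec();
    let mut file = b"PAR1row-group-bytes".to_vec();
    file.extend_from_slice(&metadata);
    file.extend_from_slice(&(metadata.len() as u32).to_le_bytes());
    file.extend_from_slice(b"PAR1");

    // The caller owns the IO loop; the decoder only says what it needs.
    let mut decoder = FooterDecoder::new(file.len() as u64);
    let decoded = loop {
        match decoder.try_decode().unwrap() {
            DecodeResult::NeedsData(ranges) => {
                for r in ranges {
                    let bytes = file[r.start as usize..r.end as usize].to_vec();
                    decoder.push_range(r, bytes);
                }
            }
            DecodeResult::Data(bytes) => break bytes,
        }
    };
    assert_eq!(decoded, metadata);
}
```

Because the decoder only reports byte ranges and consumes buffers, the same decoding logic can sit behind a blocking file read, an async object store client, or a prefetched buffer, which is the IO/CPU separation the issue asks for.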