etseidl commented on issue #8441: URL: https://github.com/apache/arrow-rs/issues/8441#issuecomment-3353091174
Wow. Thanks, @alamb, this looks better than I imagined (although I'll admit to being sad I couldn't get the red bar any higher).

As a follow-up, I'd be interested in exploring skipping whole row groups and column chunks, both by simply skipping them during the read and by creating an index into the metadata. The index could be built either by a fast pass through the footer bytes or by modifying the writer to create the index and write it somewhere before the footer. The trick there is finding the index before beginning the footer parse, but I think we could use the new [extension](https://github.com/apache/parquet-format/blob/master/BinaryProtocolExtensions.md) mechanism for this.
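To make the "fast pass through the footer bytes" idea concrete, here is a minimal sketch of an offset index. It assumes a toy footer layout in which each row group is a 4-byte little-endian length followed by its payload; this is not the Thrift encoding Parquet actually uses, and it does not model the extension mechanism. The names `RowGroupSpan`, `encode_toy_footer`, `build_index`, and `row_group_bytes` are hypothetical and do not exist in arrow-rs.

```rust
// Toy footer layout (an assumption for illustration, NOT Parquet's Thrift encoding):
// each row group is a 4-byte little-endian length followed by that many payload bytes.

/// Byte range of one row group's payload inside the footer buffer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RowGroupSpan {
    pub start: usize,
    pub len: usize,
}

/// Encode payloads into the toy layout (stands in for a writer).
pub fn encode_toy_footer(row_groups: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    for rg in row_groups {
        out.extend_from_slice(&(rg.len() as u32).to_le_bytes());
        out.extend_from_slice(rg);
    }
    out
}

/// Fast pass: read only the length prefixes and hop over the payloads.
/// Returns `None` if the buffer is truncated or malformed.
pub fn build_index(footer: &[u8]) -> Option<Vec<RowGroupSpan>> {
    let mut index = Vec::new();
    let mut pos = 0usize;
    while pos < footer.len() {
        // A partial length prefix at the end of the buffer yields `None`.
        let prefix = footer.get(pos..pos.checked_add(4)?)?;
        let len = u32::from_le_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]) as usize;
        let start = pos + 4;
        let end = start.checked_add(len)?;
        if end > footer.len() {
            return None; // payload runs past the end of the buffer
        }
        index.push(RowGroupSpan { start, len });
        pos = end;
    }
    Some(index)
}

/// Fetch the bytes of row group `i` without touching any other row group.
pub fn row_group_bytes<'a>(
    footer: &'a [u8],
    index: &[RowGroupSpan],
    i: usize,
) -> Option<&'a [u8]> {
    let span = index.get(i)?;
    footer.get(span.start..span.start + span.len)
}
```

With an index like this, a reader that needs only row group 2 would call `row_group_bytes(&footer, &index, 2)` and decode just that slice. In the writer-side variant, the same list of spans would be produced at write time and stored ahead of the footer, so the reader could skip the scanning pass entirely.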
