ahmarsuhail opened a new issue, #3386: URL: https://github.com/apache/parquet-java/issues/3386
This issue tracks adding a new module, [parquet-io], to hold IO optimisations for the parquet-java repository. The goal is to have a single place where the following optimisations can be implemented:

* Vectored reads
* Reading the tail of a file in one request rather than multiple small requests (avoiding the Parquet footer dance and the multiple requests for the pageIndex)
* Reading small Parquet files in a single request
* Sequential prefetching

[Parquet Java Input Stream Optimisations](https://docs.google.com/document/d/1Xdlh23tmCs-KvzHhY2RuwFYmc3xntUKcmwb8yxEl78Y/edit?usp=sharing): this doc explains the features the module will implement.

[Analytics Accelerator for S3](https://docs.google.com/document/d/13shy0RWotwfWC_qQksb95PXdi-vSUCKQyDzjoExQEN0/edit?tab=t.0#heading=h.3lc3p7s26rnw): this doc explains the IO optimisations made in the analytics accelerator library.
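To illustrate the tail-read idea from the list of optimisations, here is a minimal sketch, not the proposed parquet-io implementation. It relies on the Parquet file layout, where an unencrypted file ends with a 4-byte little-endian footer length followed by the `PAR1` magic. The class name `TailReadSketch`, its method names, and the choice of tail size are hypothetical and exist only for this example. The sketch fetches the last bytes of a file in one positional read and then checks whether the whole footer already sits in that buffer, in which case no second request is needed.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of parquet-java.
public class TailReadSketch {
    static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);
    // Trailer = 4-byte little-endian footer length + 4-byte magic.
    static final int TRAILER_BYTES = 8;

    /** Reads up to tailSize bytes from the end of the file using positional reads. */
    static ByteBuffer readTail(Path file, int tailSize) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long fileLen = ch.size();
            int n = (int) Math.min(fileLen, tailSize);
            ByteBuffer buf = ByteBuffer.allocate(n);
            long start = fileLen - n;
            // Loop because a single read call may return fewer bytes than requested.
            while (buf.hasRemaining()) {
                int r = ch.read(buf, start + buf.position());
                if (r < 0) {
                    break;
                }
            }
            buf.flip();
            return buf;
        }
    }

    /** Parses the footer length from the trailer at the end of the tail buffer. */
    static int footerLength(ByteBuffer tail) {
        int end = tail.limit();
        if (end < TRAILER_BYTES) {
            throw new IllegalArgumentException("tail is shorter than the 8-byte trailer");
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (tail.get(end - MAGIC.length + i) != MAGIC[i]) {
                throw new IllegalArgumentException("missing PAR1 magic");
            }
        }
        return tail.duplicate().order(ByteOrder.LITTLE_ENDIAN).getInt(end - TRAILER_BYTES);
    }

    /** True when the footer and trailer both fit in the tail, so no extra request is needed. */
    static boolean footerInTail(ByteBuffer tail) {
        return (long) footerLength(tail) + TRAILER_BYTES <= tail.limit();
    }
}
```

A caller would pick a tail size large enough to cover typical footers, call `readTail` once, and fall back to a second read only when `footerInTail` returns false. The same buffer also covers the small-file case: if the tail size is at least the file length, the whole file arrives in that one read.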
