ahmarsuhail opened a new issue, #3386:
URL: https://github.com/apache/parquet-java/issues/3386

   This issue tracks the addition of a new module, [parquet-io], with the goal of 
adding IO optimisations to the parquet-java repository. 
   
   The goal is to have a single place where the following optimisations can be 
implemented:
   
   * Vectored Reads
   * Reading the tail of a file in one request rather than multiple small 
requests (avoiding the Parquet footer dance and the multiple requests for the pageIndex)
   * Reading small Parquet files in a single request
   * Sequential prefetching 
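
   To illustrate the tail-read item, here is a minimal sketch of how a reader could locate the footer inside one already-fetched tail buffer instead of issuing a separate request for the 8-byte trailer and another for the footer. It relies only on the Parquet file layout (the file ends with a 4-byte little-endian footer length followed by the `PAR1` magic). The class and method names (`TailReadSketch`, `locateFooter`, `footerBytes`) are hypothetical and are not the API of the proposed module; how the tail is fetched, and how large it should be, is left to the caller.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of parquet-java: parses the end of a Parquet
// file out of a tail buffer that was fetched with a single request.
final class TailReadSketch {
    static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);
    static final int TRAILER_LEN = 8; // 4-byte footer length + 4-byte magic

    // Where the footer sits in the file, and whether the tail already contains it.
    static final class FooterLocation {
        final long footerStart;
        final int footerLength;
        final boolean inTail;

        FooterLocation(long footerStart, int footerLength, boolean inTail) {
            this.footerStart = footerStart;
            this.footerLength = footerLength;
            this.inTail = inTail;
        }
    }

    // 'tail' holds the last tail.length bytes of a file that is fileLength bytes long.
    static FooterLocation locateFooter(byte[] tail, long fileLength) {
        int n = tail.length;
        if (n < TRAILER_LEN || n > fileLength) {
            throw new IllegalArgumentException("tail must cover at least the 8-byte trailer");
        }
        if (!Arrays.equals(Arrays.copyOfRange(tail, n - 4, n), MAGIC)) {
            throw new IllegalArgumentException("missing PAR1 magic at end of file");
        }
        // The footer length is stored little-endian just before the magic.
        int footerLength = ByteBuffer.wrap(tail, n - TRAILER_LEN, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
        long footerStart = fileLength - TRAILER_LEN - footerLength;
        if (footerLength < 0 || footerStart < 0) {
            throw new IllegalArgumentException("corrupt footer length: " + footerLength);
        }
        // If the footer fits in the tail, no second request is needed.
        boolean inTail = (long) footerLength + TRAILER_LEN <= n;
        return new FooterLocation(footerStart, footerLength, inTail);
    }

    // Copies the footer bytes out of the tail; only valid when loc.inTail is true.
    static byte[] footerBytes(byte[] tail, FooterLocation loc) {
        if (!loc.inTail) {
            throw new IllegalStateException("footer is larger than the fetched tail");
        }
        int end = tail.length - TRAILER_LEN;
        return Arrays.copyOfRange(tail, end - loc.footerLength, end);
    }
}
```

   When `inTail` is false, a caller would fall back to one more ranged read starting at `footerStart`, so the worst case under this sketch matches the usual two-request pattern.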
   
   [Parquet Java Input Stream 
Optimisations](https://docs.google.com/document/d/1Xdlh23tmCs-KvzHhY2RuwFYmc3xntUKcmwb8yxEl78Y/edit?usp=sharing):
 This doc explains the features the new module will implement.
   
   [Analytics Accelerator for 
S3](https://docs.google.com/document/d/13shy0RWotwfWC_qQksb95PXdi-vSUCKQyDzjoExQEN0/edit?tab=t.0#heading=h.3lc3p7s26rnw):
 This doc explains the IO optimisations made in the Analytics Accelerator library.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

