[GitHub] [arrow-rs] zeevm commented on pull request #1154: Add `async` arrow parquet reader

GitBox Sat, 05 Feb 2022 05:39:03 -0800


zeevm commented on pull request #1154:
URL: https://github.com/apache/arrow-rs/pull/1154#issuecomment-1030627167



   I see a few issues with this.
   
   First, the notion that the column chunk is the basic I/O unit for Parquet is 
somewhat outdated with the introduction of the index page.
   
   Second, a major premise of Parquet is "read only what you need", where what 
you need is usually dictated by some query engine. Continuously downloading 
data in the background that the client may not even want or need doesn't 
seem right, especially as the cost is complicating all existing clients with the 
added `Send` constraint.
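
   As a rough illustration of reading at page rather than column-chunk 
granularity, here is a minimal sketch. It assumes the "index page" above refers 
to Parquet's page-level index metadata, which records where each data page 
lives and which row it starts at. `PageLocation` and `byte_ranges_for_rows` 
are hypothetical names made up for this example, not the parquet crate's API.

```rust
use std::ops::Range;

/// Hypothetical stand-in for one page-index entry: where a data page
/// lives in the file and which row of the row group it starts at.
#[derive(Debug, Clone, Copy)]
pub struct PageLocation {
    pub offset: u64,          // byte offset of the page in the file
    pub compressed_size: u64, // length of the page in bytes
    pub first_row: u64,       // index of the first row stored in the page
}

/// Returns the byte ranges of the pages that overlap `rows`.
/// Assumes `pages` is sorted by `first_row` and that `total_rows`
/// is the number of rows in the row group.
pub fn byte_ranges_for_rows(
    pages: &[PageLocation],
    total_rows: u64,
    rows: Range<u64>,
) -> Vec<Range<u64>> {
    let mut out: Vec<Range<u64>> = Vec::new();
    if rows.is_empty() {
        return out;
    }
    for (i, page) in pages.iter().enumerate() {
        // A page ends where the next one starts, or at the end of the row group.
        let page_end_row = pages.get(i + 1).map_or(total_rows, |p| p.first_row);
        let overlaps = page.first_row < rows.end && rows.start < page_end_row;
        if !overlaps {
            continue;
        }
        let range = page.offset..page.offset + page.compressed_size;
        // Merge with the previous range when the pages are adjacent on disk,
        // so that neighbouring pages are fetched with a single request.
        if let Some(prev) = out.last_mut() {
            if prev.end == range.start {
                prev.end = range.end;
                continue;
            }
        }
        out.push(range);
    }
    out
}
```

With a helper like this, a query engine that only needs rows 1500..2500 of a 
column would request just the pages covering those rows instead of the whole 
column chunk.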


