[GitHub] [arrow] westonpace commented on issue #36765: [Python][Dataset][Parquet] Enable Pre-Buffering by default for Parquet s3 datasets

via GitHub Wed, 19 Jul 2023 10:24:43 -0700


westonpace commented on issue #36765:
URL: https://github.com/apache/arrow/issues/36765#issuecomment-1642473281


   I did some testing a few weeks ago and also found that `buffer_size` 
was better when it came to memory usage.  `pre_buffer` will fetch the entire 
column for the entire row group (or row groups) that you are reading.
   
   However, if `pre_buffer` is the default for `read_table` then I think it is 
ok to make it the default for datasets.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
