westonpace commented on issue #36765: URL: https://github.com/apache/arrow/issues/36765#issuecomment-1642473281
I did some testing a few weeks ago and also found that `buffer_size` was better than `pre_buffer` when it came to memory usage. `pre_buffer` will fetch the entire column for the entire row group (or row groups) that you are reading. However, if `pre_buffer` is the default for `read_table` then I think it is ok to make it the default for datasets.
