mapleFU commented on issue #38389: URL: https://github.com/apache/arrow/issues/38389#issuecomment-1779360466
> Hmm, s3fs does some readahead. If you are scanning the whole file, this probably helps. However, Arrow's FS + pre-buffer should be better in general (readahead is actively harmful if you aren't scanning the whole file)

In my opinion, when reading the same file through ArrowFS on a version before 14.0, the default option is not Lazy, so the reader will pre_buffer all row groups and all columns at once. I don't know how it could be several times faster... Maybe I need to reproduce this myself and take a look 🤔
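To make the quoted trade-off concrete, here is a toy model in plain Python. It is only a sketch under stated assumptions: it does not call s3fs or Arrow, and the file layout (10 row groups of 10 MiB, each holding ten 1 MiB column chunks), the 5 MiB readahead window, and both helper names are hypothetical. It counts the bytes and requests a simple readahead cache would issue, compared with fetching exactly the ranges that are needed.

```python
MiB = 1024 * 1024

def readahead_cost(ranges, file_size, readahead):
    """Return (bytes_fetched, requests) for a single-window readahead cache.

    Hypothetical model: every read that misses the cached window fetches
    the requested range plus `readahead` extra bytes past its end.
    """
    fetched = 0
    requests = 0
    cache_start = cache_end = 0  # cached window is [cache_start, cache_end)
    for offset, length in ranges:
        end = offset + length
        if cache_start <= offset and end <= cache_end:
            continue  # served entirely from the cached window
        fetch_end = min(file_size, end + readahead)
        fetched += fetch_end - offset
        requests += 1
        cache_start, cache_end = offset, fetch_end
    return fetched, requests

def exact_cost(ranges):
    """Return (bytes_fetched, requests) when only the needed ranges are read."""
    return sum(length for _, length in ranges), len(ranges)

file_size = 100 * MiB

# Partial scan: one 1 MiB column chunk from each of the 10 row groups.
one_column = [(rg * 10 * MiB, MiB) for rg in range(10)]

# Full scan: every 1 MiB chunk in file order.
whole_file = [(i * MiB, MiB) for i in range(100)]

partial_readahead = readahead_cost(one_column, file_size, 5 * MiB)
partial_exact = exact_cost(one_column)
full_readahead = readahead_cost(whole_file, file_size, 5 * MiB)
full_exact = exact_cost(whole_file)

print("one column, readahead:", partial_readahead[0] // MiB, "MiB in", partial_readahead[1], "requests")
print("one column, exact:    ", partial_exact[0] // MiB, "MiB in", partial_exact[1], "requests")
print("whole file, readahead:", full_readahead[0] // MiB, "MiB in", full_readahead[1], "requests")
print("whole file, exact:    ", full_exact[0] // MiB, "MiB in", full_exact[1], "requests")
```

In this model, readahead cuts the number of requests on a full scan but fetches six times the needed bytes when only one column is read. It says nothing about how pre-buffering everything at once compares in wall-clock time, which is the part that still needs to be reproduced.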
