Re: [I] [EPIC] Improve the performance of ListingTable [arrow-datafusion]

via GitHub Mon, 15 Apr 2024 16:12:49 -0700


Lordworms commented on issue #9964:
URL: https://github.com/apache/arrow-datafusion/issues/9964#issuecomment-2057962070


   > @Lordworms If I recall correctly, the S3 list call is made on every
   > query, and if the number of files is large, this can be non-trivial.
   > So if the listed files can be cached after creating the table or
   > after the first query, we could save that listing time on subsequent
   > queries.
   
   Yeah, I got it. I am implementing it, but I am unsure about the caching
   granularity: whether to cache the whole Parquet file or only a portion
   of it.
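
   To make the quoted suggestion concrete, the listing cache could look
   roughly like the sketch below. This is only a minimal sketch under
   assumptions, not DataFusion's actual API: `FileMeta`, `ListingCache`,
   `get_or_list`, and `invalidate` are hypothetical names, and the `list`
   closure stands in for the real object store list call.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the per-file metadata a list call returns.
#[derive(Clone, Debug, PartialEq)]
pub struct FileMeta {
    pub path: String,
    pub size: u64,
}

/// Hypothetical cache of listing results, keyed by table prefix, so that
/// only the first query pays for the (possibly slow) remote list call.
#[derive(Default)]
pub struct ListingCache {
    entries: HashMap<String, Vec<FileMeta>>,
}

impl ListingCache {
    /// Returns the cached listing for `prefix`; `list` runs only on a miss.
    pub fn get_or_list<F>(&mut self, prefix: &str, list: F) -> &[FileMeta]
    where
        F: FnOnce(&str) -> Vec<FileMeta>,
    {
        self.entries
            .entry(prefix.to_string())
            .or_insert_with(|| list(prefix))
    }

    /// Drops a cached listing, e.g. after files are added or removed.
    pub fn invalidate(&mut self, prefix: &str) {
        self.entries.remove(prefix);
    }
}
```

   With this shape, repeated queries against the same prefix reuse the
   stored listing, and `invalidate` is the hook for refreshing it when the
   underlying files change.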


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

