alamb commented on issue #15067: URL: https://github.com/apache/datafusion/issues/15067#issuecomment-2706979844
I think the solution here is to make more requests, each for a smaller amount of data. For example, instead of a single request for 93MB, it could make roughly 23 requests of 4MB each, or roughly 100 requests of 1MB each.

The only real question in my mind is where to add this logic (in the parquet reader or as an object store wrapper).

The easiest thing for now is probably to make an object store wrapper, like `LimitedRequestSizeObjectStore` above, that makes more, smaller requests.

Longer term it might make sense to look into making the parquet reader more fine grained (that is, able to start decoding pages from a row group before all of them are fetched).
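To make the splitting idea concrete, here is a minimal sketch of how one large byte range could be divided into smaller ones before the requests are issued. The helper name `split_range` and its signature are hypothetical, chosen only for illustration; this is not the `LimitedRequestSizeObjectStore` code referred to above, and it uses only the Rust standard library rather than any `object_store` API.

```rust
use std::ops::Range;

/// Split `range` into consecutive sub-ranges of at most `max_size` bytes each.
///
/// Hypothetical helper for illustration only; it is not part of
/// `object_store` or DataFusion.
fn split_range(range: Range<u64>, max_size: u64) -> Vec<Range<u64>> {
    assert!(max_size > 0, "max_size must be non-zero");
    let mut chunks = Vec::new();
    let mut start = range.start;
    while start < range.end {
        // Clamp the final chunk so it never extends past the requested range.
        let end = start.saturating_add(max_size).min(range.end);
        chunks.push(start..end);
        start = end;
    }
    chunks
}
```

With a 4MB limit, a 93MB range comes out as 24 sub-ranges: 23 full 4MB chunks plus a final 1MB chunk. A wrapper built along these lines would presumably fetch each sub-range separately and concatenate the results in order, so callers still see a single contiguous buffer.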