Tom-Newton commented on PR #14286: URL: https://github.com/apache/datafusion/pull/14286#issuecomment-2613996363
We are still having quite significant problems, but we can't reproduce them reliably, and I haven't personally been working on this recently. Anecdotally, we think the problem is more frequent when:

1. Reading between regions (e.g. compute in Azure US South Central reading Azure US East blob storage).
2. Spawning a large number of parallel jobs that all read the same thing using delta-rs.
3. Reading one particular table where we need to read a larger data volume. delta-rs via object-store is still only reading metadata though (on the order of 100 MB), and the larger data volumes loaded using the `pyarrow` Azure filesystem don't seem to suffer from the same problem.