eeroel commented on PR #37868: URL: https://github.com/apache/arrow/pull/37868#issuecomment-1737643540
> > But that request you linked is also problematic with regards to threading
>
> Can you expand on this?

I was running some experiments and looking at the debug logs, and it seems that these HEAD requests always get executed on the main (?) thread. The code suggests the same: the call is not async. So when a dataset consists of multiple fragments, the file reads effectively start in sequence, and the latency of each HEAD request adds up.
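
To illustrate why this matters, here is a rough sketch of the effect rather than Arrow's actual code. `fake_head`, `HEAD_LATENCY_S`, and the fragment names are hypothetical stand-ins: a blocking HEAD request is modelled as a plain sleep, and the script compares issuing one per fragment in sequence against issuing them from a thread pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor

HEAD_LATENCY_S = 0.05  # assumed round-trip time of one HEAD request


def fake_head(fragment: str) -> str:
    """Hypothetical stand-in for a blocking HEAD request; it only sleeps."""
    time.sleep(HEAD_LATENCY_S)
    return fragment


def open_sequentially(fragments):
    """Issue one blocking HEAD per fragment on the calling thread."""
    start = time.perf_counter()
    for fragment in fragments:
        fake_head(fragment)
    return time.perf_counter() - start


def open_concurrently(fragments):
    """Issue the same HEAD calls from a thread pool so they overlap."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        list(pool.map(fake_head, fragments))
    return time.perf_counter() - start


fragments = [f"part-{i}.parquet" for i in range(4)]
sequential = open_sequentially(fragments)
concurrent = open_concurrently(fragments)
print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

In the sequential case the total wait grows roughly linearly with the number of fragments, which matches what I think I am seeing in the logs. Overlapping the requests keeps it close to the latency of a single request.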
