cozmo commented on PR #5268: URL: https://github.com/apache/arrow-rs/pull/5268#issuecomment-1878230167
I wanted to play around with some "real world" use cases of S3 Express 1Z serving Parquet, and this seemed like a good place to start. Once I had a test project running against this branch, I noticed that the same datafusion queries, executed against the same files, took roughly 2x as long against an express1z bucket as against a normal bucket.

When I dug into why, I noticed that (in my code at least) the `SessionProvider` `fetch_token` method was being executed for every call of `GetClient` `get_request`. I observed this by adding logs to both methods and seeing them logged 1:1. I'm new to this project, and I know this code is WIP, so it's very possible I was holding it wrong (does the caller need to set up credential caching somehow when initializing the object store?), and also very possible that I misunderstand how this code is supposed to work. That said, I wanted to flag it here in case it's unexpected that `fetch_token` is called that often.
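To make the question concrete, here is a minimal sketch of the kind of token caching that would avoid one `fetch_token` call per `get_request`. It is not the code in this PR or in `object_store`: `TemporaryToken`, `TokenCache`, `get_or_fetch`, and `count_fetches` are hypothetical names, the 5-second refresh margin is an arbitrary assumption, and the fetch closure stands in for the real session request.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// A token together with the instant at which it stops being valid.
/// (Hypothetical type for illustration.)
#[derive(Clone, Debug)]
struct TemporaryToken {
    token: String,
    expiry: Instant,
}

/// Hypothetical cache: hands out the stored token until it is close to expiry.
struct TokenCache {
    cached: Mutex<Option<TemporaryToken>>,
    /// Refresh this long before the real expiry to avoid sending a stale token.
    min_ttl: Duration,
}

impl TokenCache {
    fn new(min_ttl: Duration) -> Self {
        Self { cached: Mutex::new(None), min_ttl }
    }

    /// Returns the cached token if it is still fresh; otherwise calls `fetch`
    /// once, stores the result, and returns the new token.
    fn get_or_fetch<F>(&self, fetch: F) -> String
    where
        F: FnOnce() -> TemporaryToken,
    {
        let mut guard = self.cached.lock().unwrap();
        let now = Instant::now();
        if let Some(t) = guard.as_ref() {
            if t.expiry.saturating_duration_since(now) > self.min_ttl {
                return t.token.clone();
            }
        }
        let fresh = fetch();
        let token = fresh.token.clone();
        *guard = Some(fresh);
        token
    }
}

/// Simulates `requests` sequential requests and returns how many times the
/// (stand-in) token fetch actually ran.
fn count_fetches(requests: usize, token_lifetime: Duration) -> usize {
    let cache = TokenCache::new(Duration::from_secs(5));
    let mut fetches = 0;
    for _ in 0..requests {
        cache.get_or_fetch(|| {
            fetches += 1;
            TemporaryToken {
                token: format!("session-{fetches}"),
                expiry: Instant::now() + token_lifetime,
            }
        });
    }
    fetches
}
```

With a cache like this, `count_fetches(5, Duration::from_secs(300))` runs the fetch once for five requests, whereas the 1:1 logging described above corresponds to the fetch running on every request.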
