simonvandel commented on issue #8819: URL: https://github.com/apache/arrow-datafusion/issues/8819#issuecomment-1918960253
Hi @alamb,

I sometimes have workloads where planning a query takes longer than actually executing it. This can happen in the trivial case where the TableProviders return no data.

Say the same logical plan is used for every request, just with different parameters. My idea was to generate a logical plan with parameter placeholders, optimize it, and cache it. Each request would then fetch the cached optimized plan and replace the placeholders with actual values. I assumed that optimizing the final logical plan, after the placeholders had been replaced, would be very fast. That may have been naive.

Long story short, I wanted to cache optimized logical plans so they can be reused with different parameters.

Note that the optimizer is quite a bit faster with the latest PRs, so optimization time might not be a problem anymore. I can still imagine the use case being relevant, though.
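To make the caching idea concrete, here is a minimal sketch in Python. It uses a toy plan representation, not DataFusion's actual types or API: `Filter`, `Placeholder`, `PlanCache`, `optimize`, `bind`, and the `"events_filter"` cache key are all hypothetical names made up for illustration, and the only "optimizer rule" is constant folding of `+`. The point is just the shape of the flow: optimize the parameterized template once, cache it, then bind values per request and run a second, hopefully cheap, optimization pass.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


# --- Toy expression and plan types (hypothetical, not DataFusion's) ---
@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class Placeholder:
    id: str  # e.g. "$1"


@dataclass(frozen=True)
class BinaryExpr:
    left: "Expr"
    op: str
    right: "Expr"


Expr = Union[Column, Literal, Placeholder, BinaryExpr]


@dataclass(frozen=True)
class Filter:
    predicate: Expr
    table: str


def fold(expr: Expr) -> Expr:
    """Toy optimizer rule: fold `literal + literal` into one literal."""
    if isinstance(expr, BinaryExpr):
        left, right = fold(expr.left), fold(expr.right)
        if expr.op == "+" and isinstance(left, Literal) and isinstance(right, Literal):
            return Literal(left.value + right.value)
        return BinaryExpr(left, expr.op, right)
    return expr


def optimize(plan: Filter) -> Filter:
    return Filter(fold(plan.predicate), plan.table)


def bind(expr: Expr, params: Dict[str, object]) -> Expr:
    """Replace every placeholder with the literal value supplied for it."""
    if isinstance(expr, Placeholder):
        return Literal(params[expr.id])
    if isinstance(expr, BinaryExpr):
        return BinaryExpr(bind(expr.left, params), expr.op, bind(expr.right, params))
    return expr


class PlanCache:
    """Caches optimized plan templates that still contain placeholders."""

    def __init__(self) -> None:
        self._plans: Dict[str, Filter] = {}
        self.template_optimizations = 0  # how often the full pass ran

    def get_or_optimize(self, key: str, build: Callable[[], Filter]) -> Filter:
        if key not in self._plans:
            self.template_optimizations += 1
            self._plans[key] = optimize(build())
        return self._plans[key]


def build_template() -> Filter:
    # SELECT ... FROM events WHERE a > $1 + (1 + 2)
    inner = BinaryExpr(Literal(1), "+", Literal(2))
    rhs = BinaryExpr(Placeholder("$1"), "+", inner)
    return Filter(BinaryExpr(Column("a"), ">", rhs), "events")


def plan_for_request(cache: PlanCache, params: Dict[str, object]) -> Filter:
    template = cache.get_or_optimize("events_filter", build_template)
    bound = Filter(bind(template.predicate, params), template.table)
    # Second pass over the bound plan; the hope is that this one is cheap.
    return optimize(bound)


if __name__ == "__main__":
    cache = PlanCache()
    print(plan_for_request(cache, {"$1": 10}))
    print(plan_for_request(cache, {"$1": 20}))
    print("template optimizations:", cache.template_optimizations)
```

In this toy version the template is optimized once no matter how many requests arrive, and the per-request pass only has to handle what the newly bound literals make possible. Whether a real optimizer's second pass is actually cheap is exactly the part I was unsure about.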
