alamb commented on issue #8819:
URL: https://github.com/apache/arrow-datafusion/issues/8819#issuecomment-1919574887

   Thank you @simonvandel  -- that makes total sense. 
   
   FWIW the physical optimizer passes that create an ExecutionPlan still do 
non-trivial work even after the LogicalPlan is created.
   
   I agree this is a reasonable use case, where running the optimizer takes 
non-trivial time compared to query execution time. In fact, this was the 
original use case for parameterized queries in OLTP engines, where the cost of 
planning dominated the cost of actually running the query, so reusing a 
prepared statement was an important optimization.
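
   To make the amortization argument concrete, here is a minimal sketch in 
Python. It is not DataFusion's API: `PreparedStatementCache` and `toy_planner` 
are hypothetical names, and the "plan" is just a closure standing in for an 
optimized plan. The only point is that the planner runs once per SQL template, 
while each execution supplies new parameter values.

```python
class PreparedStatementCache:
    """Toy cache: plan each SQL template once, then reuse the plan."""

    def __init__(self, planner):
        # Hypothetical planner: a callable taking SQL text and returning a plan.
        self._planner = planner
        self._plans = {}
        self.plan_calls = 0  # counts how often the (expensive) planner ran

    def execute(self, sql, params):
        plan = self._plans.get(sql)
        if plan is None:
            # Cache miss: pay the planning cost once for this template.
            self.plan_calls += 1
            plan = self._planner(sql)
            self._plans[sql] = plan
        # Cache hit or freshly planned: only bind parameters and run.
        return plan(params)


def toy_planner(sql):
    """Stand-in for an expensive optimizer; returns a callable 'plan'."""
    def plan(params):
        # A real engine would execute the query; here we echo the inputs.
        return (sql, tuple(params))
    return plan


cache = PreparedStatementCache(toy_planner)
for user_id in (1, 2, 3):
    result = cache.execute("SELECT * FROM t WHERE id = $1", [user_id])
```

   After the loop, `cache.plan_calls` is 1 even though the statement ran three 
times, which is the saving a prepared statement is meant to provide when 
planning is expensive relative to execution.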
   
   The use case is much less common in classic analytic systems, where query 
execution time was often far longer than even tens of milliseconds of planning 
time.
   
   However, as analytics is pushed everywhere, planning time becomes more 
important. I think the fact that the DataFusion optimizer is so slow makes this 
even more pronounced. Ergo, I think making planning faster via #5637 is very 
important.
   
   I am pretty sure that, from our (InfluxData's) perspective, the prepared 
statement use case is not much of a priority (especially compared to making 
planning faster overall), so @appletreeisyellow likely can't spend a lot of 
time on this issue (though maybe she feels differently).
   
   Thus, what I suggest is that we polish up 
https://github.com/apache/arrow-datafusion/pull/9073, which improves the error 
messages, and then leave this particular ticket open for anyone else who has 
the use case and wants to improve it.

