alamb commented on PR #19462:
URL: https://github.com/apache/datafusion/pull/19462#issuecomment-3700924809

   
   > > In general I would really love to help make DataFusion planning (much) 
faster -- I think we have all the pieces now, but it will take some focused 
profiling effort to knock down the things that consume time to plan
   > 
   > A planning performance boost would make much more sense to me, but in this 
issue and PR, I am only considering the situation where the plan is already 
built, optimized, and ready to be reused as an artifact (since re-planning can 
sometimes run in the background to account for changes in statistics, etc.). 
What do you think about the introduced feature? Can we move the state out of 
the plans to make re-execution cheaper (of course, under a feature flag, as 
it requires redesigning some `ExecutionPlan` trait methods)?
   
   I think the idea of moving state out of the plans is a nice design in 
theory. However, I am concerned about the practical ability to actually migrate 
the codebase (and all the consumers of DataFusion) to this pattern. 
   
   > It seems this approach implies that we must somehow know that the 
properties remain unchanged for each particular plan. That sounds much harder 
to me than extracting state from the plan and not calling with_new_children at 
all -- in other words, avoiding analysis unless it is required.
   
   Implementing the feature could potentially be harder, but I think 
it would be much easier to migrate all downstream consumers (as they wouldn't 
have to do anything).
   
   It might not be as bad as it sounds. For example, what if we added some sort 
of fingerprint (maybe a hash) for EquivalenceProperties that is very fast to 
compute? That would make it simple and straightforward to check for equality 🤔 
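
To make the fingerprint idea concrete, here is a rough sketch of what I have in mind. It is not DataFusion code: `Props` is a hypothetical stand-in for `EquivalenceProperties`, and `FingerprintedProps`, `fingerprint`, and `same_as` are names made up for illustration. The hash is computed once at construction, so a later comparison is a single `u64` check in the common case, with a full comparison only when the hashes match (a hash match alone does not prove equality).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for `EquivalenceProperties`; the real type is
/// considerably richer. Only used here to illustrate the idea.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Props {
    /// Known output orderings, as lists of column names (assumption).
    pub orderings: Vec<Vec<String>>,
    /// Columns known to be constant (assumption).
    pub constants: Vec<String>,
}

/// Wrapper that caches a hash of the properties at construction time.
#[derive(Debug, Clone)]
pub struct FingerprintedProps {
    props: Props,
    fingerprint: u64,
}

impl FingerprintedProps {
    /// Computes the fingerprint once, up front.
    pub fn new(props: Props) -> Self {
        let mut hasher = DefaultHasher::new();
        props.hash(&mut hasher);
        let fingerprint = hasher.finish();
        Self { props, fingerprint }
    }

    /// Returns the cached fingerprint without rehashing.
    pub fn fingerprint(&self) -> u64 {
        self.fingerprint
    }

    /// Cheap rejection when fingerprints differ; falls back to a full
    /// comparison when they match, to guard against hash collisions.
    pub fn same_as(&self, other: &Self) -> bool {
        self.fingerprint == other.fingerprint && self.props == other.props
    }
}
```

With something like this, a node whose children report an unchanged fingerprint could skip recomputing its own properties, and downstream consumers would not need to change anything.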
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

