findepi commented on code in PR #5623:
URL: https://github.com/apache/datafusion/pull/5623#discussion_r1664596130
##########
datafusion/optimizer/src/optimizer.rs:
##########
@@ -330,15 +324,14 @@ impl Optimizer {
}
log_plan(&format!("Optimized plan (pass {i})"), &new_plan);
- // TODO this is an expensive way to see if the optimizer did
anything and
- // it would be better to change the OptimizerRule trait to return
an Option
- // instead
- if old_plan.as_ref() == &new_plan {
+ // HashSet::insert returns, whether the value was newly inserted.
+ let plan_is_fresh =
+ previous_plans.insert(LogicalPlanSignature::new(&new_plan));
+ if !plan_is_fresh {
Review Comment:
@mslapek thanks for response!
> What motivated you to review this merged (one year ago) PR?
I find the source PR as an efficient way to ask a question about the code
and reach people with context -- code author(s) and reviewers -- in one shot.
> Yes, a cycle shows a logical error in the optimizer - because at each step
the performance should be improved.
Agreed!
> At the same time, it does NOT imply a correctness error. The plan probably
will be still correct (but not optimal).
Agreed!
> We should take the perspective of a user. If a data scientist does
research, should we harm the availability of the database?
Suboptimal plan can be orders of magnitude more expensive to execute, so
allowing it to run may cause unavailability for others, but I see your point.
It's especially difficult to transition from bug lenient treatment to more
strict. It should be gradual. I like the idea of having this initially
controlled with a flag. In tests and on CI this flag should be set to "fail".
Then we can switch the flag on for runtime as well.
@alamb @jackwener thoughts?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]