Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/10577#discussion_r49159276
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
---
@@ -37,6 +37,13 @@ abstract class Optimizer extends RuleExecutor[LogicalPlan] {
// SubQueries are only needed for analysis and can be removed before execution.
Batch("Remove SubQueries", FixedPoint(100),
EliminateSubQueries) ::
+ // - Do the first call of CombineUnions before starting the major Optimizer rules,
+ // since it can reduce the number of iteration and the other rules could add/move
+ // extra operators between two adjacent Union operators.
+ // - Call CombineUnions again in Batch("Operator Optimizations"),
+ // since the other rules might make two separate Unions operators adjacent.
+ Batch("Union", FixedPoint(100),
--- End diff --
Yeah, running it once is enough. I will reduce it to once.
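
A toy sketch of why a single run can be enough, assuming the rule flattens every chain of directly nested unions in one traversal. `Plan`, `Relation`, `Union`, and `combineUnions` here are hypothetical stand-ins for illustration, not Spark's Catalyst classes and not the implementation in this pull request.

```scala
// Toy model (assumption): a plan is either a leaf relation or an n-ary union.
sealed trait Plan
case class Relation(name: String) extends Plan
case class Union(children: List[Plan]) extends Plan

object CombineUnionsSketch {
  // Flattens all directly nested unions bottom-up in a single traversal.
  def combineUnions(plan: Plan): Plan = plan match {
    case Union(children) =>
      // Flatten each child first, then splice any child union into this one.
      Union(children.map(combineUnions).flatMap {
        case Union(grandChildren) => grandChildren
        case other => List(other)
      })
    case leaf => leaf
  }

  def main(args: Array[String]): Unit = {
    // Three levels of nesting: ((a UNION b) UNION c) UNION d
    val nested = Union(List(
      Union(List(
        Union(List(Relation("a"), Relation("b"))),
        Relation("c"))),
      Relation("d")))

    val once = combineUnions(nested)
    val twice = combineUnions(once)

    println(once)
    // A second application changes nothing, so a fixed-point loop adds no value here.
    println(once == twice) // prints "true"
  }
}
```

In this model the rule is idempotent, so running the batch once gives the same plan as iterating to a fixed point.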