Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/15238#discussion_r80753108
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala ---
@@ -191,25 +191,34 @@ object ExtractFiltersAndInnerJoins extends PredicateHelper {
 /**
  * A pattern that collects all adjacent unions and returns their children as a Seq.
+ * If the top union is wrapped in a [[Distinct]], then the [[Distinct]] in the adjacent unions, if
+ * any, will be eliminated.
  */
 object Unions {
-  def unapply(plan: LogicalPlan): Option[Seq[LogicalPlan]] = plan match {
-    case u: Union => Some(collectUnionChildren(mutable.Stack(u), Seq.empty[LogicalPlan]))
+  def unapply(plan: LogicalPlan): Option[(Seq[LogicalPlan], Boolean)] = plan match {
+    case u: Union =>
+      Some(collectUnionChildren(mutable.Stack(u), Seq.empty[LogicalPlan], false), false)
+    case Distinct(u: Union) =>
+      Some(collectUnionChildren(mutable.Stack(u), Seq.empty[LogicalPlan], true), true)
     case _ => None
   }
   // Doing a depth-first tree traversal to combine all the union children.
   @tailrec
   private def collectUnionChildren(
       plans: mutable.Stack[LogicalPlan],
-      children: Seq[LogicalPlan]): Seq[LogicalPlan] = {
+      children: Seq[LogicalPlan],
+      dedupDistinct: Boolean): Seq[LogicalPlan] = {
--- End diff --
Why is this method so complicated? It seems that we can do without the
stack. A stack only makes sense if you do not want to use recursion (and in
that case you can use a while loop instead). This comment is not about the
change in this PR, but about the code as it was to begin with. Could you fix
it anyway?
cc @gatorsmile as well
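
To make the two alternatives concrete, here is a minimal sketch of what I have in mind. It is not the actual `collectUnionChildren` body (the diff above does not show it), and it does not use Catalyst's classes: `Plan`, `Leaf`, `Union`, `Distinct`, `FlattenUnions`, `flattenRecursive` and `flattenIterative` are hypothetical stand-ins so that the example runs without Spark. The handling of `dedupDistinct` is my reading of the doc comment in the diff, so treat it as an assumption.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for Catalyst plan nodes (assumption: not Spark's real classes).
sealed trait Plan
case class Leaf(name: String) extends Plan
case class Union(children: Seq[Plan]) extends Plan
case class Distinct(child: Plan) extends Plan

object FlattenUnions {
  // Option 1: plain recursion, no explicit stack.
  // Not tail-recursive, so the depth is bounded by the JVM call stack.
  def flattenRecursive(plan: Plan, dedupDistinct: Boolean): Seq[Plan] = plan match {
    case Union(children) =>
      children.flatMap(flattenRecursive(_, dedupDistinct))
    case Distinct(Union(children)) if dedupDistinct =>
      // Drop the nested Distinct and keep flattening (assumed semantics).
      children.flatMap(flattenRecursive(_, dedupDistinct))
    case other =>
      Seq(other)
  }

  // Option 2: explicit stack driven by a while loop, no recursion.
  def flattenIterative(plan: Plan, dedupDistinct: Boolean): Seq[Plan] = {
    val stack = mutable.Stack[Plan](plan)
    val result = mutable.ArrayBuffer.empty[Plan]
    while (stack.nonEmpty) {
      stack.pop() match {
        // pushAll leaves the last element on top, so push in reverse
        // to visit the children left to right.
        case Union(children) =>
          stack.pushAll(children.reverse)
        case Distinct(Union(children)) if dedupDistinct =>
          stack.pushAll(children.reverse)
        case other =>
          result += other
      }
    }
    result.toSeq
  }
}
```

Both variants visit the children depth-first and left to right, so they should return the same sequence. The recursive one is shorter; the loop avoids deep call stacks for very long union chains. Mixing the two (a stack passed through a recursive method) is what makes the current code harder to read than it needs to be.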