cboumalh commented on code in PR #54236:
URL: https://github.com/apache/spark/pull/54236#discussion_r2801424618


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala:
##########
@@ -566,9 +566,58 @@ abstract class UnionBase extends LogicalPlan {
   }
 }
 
+/**
+ * Trait for union operators that support standard optimizer rules like 
flattening and
+ * projection push-down.
+ *
+ * IMPORTANT: Child order preservation semantics:
+ * - Optimizer rules operating on UnionLike MUST preserve the order of children
+ * - Some implementations (e.g., SequentialStreamingUnion) have semantically 
significant
+ *   child ordering where children are processed sequentially in order
+ * - Other implementations (e.g., Union) have no semantic ordering - children 
are processed
+ *   in parallel and their order is insignificant for correctness
+ * - Order-dependent optimizations (e.g., reordering children for performance) 
should ONLY
+ *   be applied to specific implementations where order is not significant, 
NOT to UnionLike
+ *   in general
+ *
+ * Optimizer rules that are safe for all UnionLike implementations:
+ * - Flattening nested unions (preserves child order within the flattened 
sequence)
+ * - Pushing projections down to children (each child maintains its position)
+ * - Collapsing adjacent projects (order-preserving transformation)
+ *
+ * IMPORTANT: Union and SequentialStreamingUnion should never be merged with 
each other
+ * as they have different execution semantics. Use `isSameType` to check 
compatibility.
+ *
+ * This trait provides:
+ * - byName: controls whether column resolution is by name vs position
+ * - allowMissingCol: allows children to have different columns (requires 
byName)
+ * - children: inherited from LogicalPlan via UnionBase
+ * - isSameType: checks if another UnionLike can be merged with this one
+ */
+trait UnionLike extends LogicalPlan { self: UnionBase =>
+  def byName: Boolean
+  def allowMissingCol: Boolean
+
+  /**
+   * Returns true if the other UnionLike is the same concrete type as this one.
+   * Used during flattening to ensure Union and SequentialStreamingUnion are 
not merged.
+   */
+  def isSameType(other: UnionLike): Boolean
+
+  /**
+   * Creates a new instance of this UnionLike with the specified children.
+   * Implementations should delegate to their case class copy() method.
+   */
+  def withNewChildrenSeq(newChildren: Seq[LogicalPlan]): UnionLike

Review Comment:
   Nice fix, also a small nit, can we just use the function that is provided by 
TreeNode instead of adding it to the trait? 
https://github.com/apache/spark/blob/ac9b01ecff6f26d6f06e97afa059cee0d075ebab/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala#L363



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to