cloud-fan commented on a change in pull request #27842: [SPARK-31078][SQL] Respect aliases in output ordering
URL: https://github.com/apache/spark/pull/27842#discussion_r389513418
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/AliasAwareOutputPartitioningAndOrdering.scala
##########
@@ -16,16 +16,18 @@
*/
package org.apache.spark.sql.execution
-import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, AttributeReference, Expression, NamedExpression}
+import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, AttributeReference, Expression, NamedExpression, SortOrder}
import org.apache.spark.sql.catalyst.plans.physical.{HashPartitioning, Partitioning}
/**
- * A trait that handles aliases in the `outputExpressions` to produce `outputPartitioning`
- * that satisfies output distribution requirements.
+ * A trait that handles aliases in the `outputExpressions` and `orderingExpressions` to produce
+ * `outputPartitioning` and `outputOrdering` that satisfy distribution and ordering requirements.
*/
-trait AliasAwareOutputPartitioning extends UnaryExecNode {
+trait AliasAwareOutputPartitioningAndOrdering extends UnaryExecNode {
protected def outputExpressions: Seq[NamedExpression]
+ protected def orderingExpressions: Seq[SortOrder] = child.outputOrdering
Review comment:
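This default implicitly makes the plan inherit its output ordering from its
child. That seems risky to me, since `SparkPlan.outputOrdering` is `Nil` by
default: a node that mixes in this trait would silently switch from reporting
no ordering to reporting its child's ordering.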
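
To make the concern concrete, here is a minimal, self-contained sketch. It does not use Spark's classes: `Plan`, `UnaryPlan`, `AliasAwareOrdering`, `SortedScan`, `PlainNode` and `AliasAwareNode` are hypothetical stand-ins, and `SortOrder` is a simplified case class rather than the Catalyst expression. The sketch also assumes that the trait derives `outputOrdering` from `orderingExpressions`, which the hunk above does not show.

```scala
object OrderingDefaultSketch {
  // Simplified stand-in for Catalyst's SortOrder (assumption, not the real class).
  final case class SortOrder(column: String, ascending: Boolean = true)

  // Stand-in for SparkPlan: by default a plan guarantees no output ordering.
  trait Plan {
    def outputOrdering: Seq[SortOrder] = Nil
  }

  // Stand-in for UnaryExecNode: a plan with exactly one child.
  trait UnaryPlan extends Plan {
    def child: Plan
  }

  // Reduced, hypothetical version of the trait under review.
  // The default below is what the comment calls risky.
  trait AliasAwareOrdering extends UnaryPlan {
    protected def orderingExpressions: Seq[SortOrder] = child.outputOrdering
    override def outputOrdering: Seq[SortOrder] = orderingExpressions
  }

  // A leaf that really does produce sorted output.
  final case class SortedScan(ordering: Seq[SortOrder]) extends Plan {
    override def outputOrdering: Seq[SortOrder] = ordering
  }

  // Same shape of node, with and without the trait mixed in.
  final case class PlainNode(child: Plan) extends UnaryPlan
  final case class AliasAwareNode(child: Plan) extends AliasAwareOrdering

  def main(args: Array[String]): Unit = {
    val scan = SortedScan(Seq(SortOrder("a")))
    // Without the trait, the node keeps the conservative default: no ordering.
    println(PlainNode(scan).outputOrdering)
    // With the trait, the node reports the child's ordering without opting in.
    println(AliasAwareNode(scan).outputOrdering)
  }
}
```

In this model, mixing in the trait is enough to change what a node reports. If the node does not actually preserve its child's row order, a planner that trusts `outputOrdering` could skip a sort that is still needed.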