Re: [PR] [SPARK-50469][SQL] V1Writes should respect the output ordering [spark]

via GitHub Fri, 03 Jan 2025 08:01:51 -0800


wecharyu commented on code in PR #49027:
URL: https://github.com/apache/spark/pull/49027#discussion_r1901932140



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala:
##########
@@ -958,6 +958,18 @@ case class Sort(
   override protected def withNewChildInternal(newChild: LogicalPlan): Sort = copy(child = newChild)
 }
 
+/**
+ * Clustering data within the partition.
+ *
+ * @param cluster The clustering expressions
+ * @param child   Child logical plan
+ */
+case class Clustering(cluster: Seq[SortOrder], child: LogicalPlan) extends UnaryNode {

Review Comment:
   Do you mean keeping the `ClusterSpec` from the reverted commit, and 
converting the `Clustering` logical plan to a `SortExec` that covers both 
`clusterKeys` and `sortKeys`?



##########
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/V1WriteCommandSuite.scala:
##########
@@ -93,9 +93,10 @@ trait V1WriteCommandSuiteBase extends SQLTestUtils {
     assert(optimizedPlan != null)
     // Check whether exists a logical sort node of the write query.
     // If user specified sort matches required ordering, the sort node may not at the top of query.
-    assert(optimizedPlan.exists(_.isInstanceOf[Sort]) == hasLogicalSort,
-      s"Expect hasLogicalSort: $hasLogicalSort," +
-        s"Actual: ${optimizedPlan.exists(_.isInstanceOf[Sort])}")
+    def isSort(plan: LogicalPlan): Boolean =
+      plan.isInstanceOf[Sort] || plan.isInstanceOf[Clustering]

Review Comment:
   Only tests that enable V1 writes and whose output ordering does not match 
the required ordering will contain `Clustering`.
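
   To illustrate why the helper accepts either node type, here is a minimal, 
self-contained sketch. The `Plan`, `Relation`, `Project`, `Sort`, and 
`Clustering` types below are toy stand-ins, not the real Catalyst classes, and 
`hasLogicalSort` is a hypothetical wrapper; only the shape of `isSort` and the 
`exists` traversal follow the diff above.

```scala
// Toy stand-ins for Catalyst plan nodes (assumption: NOT the real Spark classes).
sealed trait Plan {
  def children: Seq[Plan]
  // Same idea as the `exists` call in the test: true if this node or any descendant matches.
  def exists(p: Plan => Boolean): Boolean = p(this) || children.exists(_.exists(p))
}
case class Relation(name: String) extends Plan { val children: Seq[Plan] = Nil }
case class Project(child: Plan) extends Plan { val children: Seq[Plan] = Seq(child) }
case class Sort(child: Plan) extends Plan { val children: Seq[Plan] = Seq(child) }
case class Clustering(child: Plan) extends Plan { val children: Seq[Plan] = Seq(child) }

// Mirrors the helper in the diff: either node counts as a logical sort.
def isSort(plan: Plan): Boolean =
  plan.isInstanceOf[Sort] || plan.isInstanceOf[Clustering]

// Hypothetical wrapper: does any node in the plan count as a logical sort?
def hasLogicalSort(plan: Plan): Boolean = plan.exists(isSort)
```

   With this model, a plan such as `Project(Clustering(Relation("t")))` 
satisfies the check even though no `Sort` node is present, which is the case 
the updated assertion needs to cover.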



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
