pan3793 commented on code in PR #52584:
URL: https://github.com/apache/spark/pull/52584#discussion_r2427975373


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala:
##########
@@ -157,6 +153,24 @@ object FileFormatWriter extends Logging {
     val actualOrdering = writeFilesOpt.map(_.child)
       .getOrElse(materializeAdaptiveSparkPlan(plan))
       .outputOrdering
+
+    val requiredOrdering = {
+      // We should first sort by dynamic partition columns, then bucket id, and finally sorting
+      // columns.
+      val ordering = partitionColumns.drop(numStaticPartitionCols) ++
+        writerBucketSpec.map(_.bucketIdExpression) ++ sortColumns
+      plan.logicalLink match {

Review Comment:
   > I'm a bit worried about this. In AQE we have a fallback to find logical link in the children, so that it's more reliable.
   
   @cloud-fan do you suggest 
   
   ```patch
   - plan.logicalLink match {
   + plan.logicalLink.orElse {
   +   plan.collectFirst { case p if p.logicalLink.isDefined => p.logicalLink.get }
   + } match {
   ```
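
To make the fallback semantics concrete, here is a minimal, self-contained sketch of the `orElse` + `collectFirst` pattern being proposed. The `Node` class and `resolve` helper are illustrative stand-ins, not Spark's actual `TreeNode`/`SparkPlan` API: the node itself is consulted first, and only if it carries no logical link do we fall back to the first descendant (pre-order) that has one.

```scala
// Hypothetical sketch of the suggested logicalLink fallback.
// `Node` stands in for Spark's TreeNode; `logicalLink` is modeled as a plain Option[String].
case class Node(logicalLink: Option[String], children: Seq[Node]) {
  // Pre-order traversal: check this node first, then its subtrees,
  // mirroring TreeNode.collectFirst.
  def collectFirst[B](pf: PartialFunction[Node, B]): Option[B] =
    if (pf.isDefinedAt(this)) Some(pf(this))
    else children.view.flatMap(_.collectFirst(pf)).headOption
}

object LogicalLinkFallback {
  // The pattern from the suggested patch: prefer the node's own link,
  // otherwise search the children for the first defined one.
  def resolve(plan: Node): Option[String] =
    plan.logicalLink.orElse {
      plan.collectFirst { case p if p.logicalLink.isDefined => p.logicalLink.get }
    }

  def main(args: Array[String]): Unit = {
    val root = Node(None, Seq(Node(None, Nil), Node(Some("Project"), Nil)))
    // The root has no link, so the fallback finds the descendant's link.
    assert(resolve(root).contains("Project"))
  }
}
```

This keeps the behavior of a bare `plan.logicalLink` lookup when the link is present, and only pays the traversal cost when it is missing.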
   
   > Shall we remove the adding sort here completely if planned write is enabled (`WriteFiles` is present)?
   
   I think the current code already satisfies your expectation when planned write is enabled:
   1. if concurrent writer is disabled, the calculated required ordering won't be used.
   2. if concurrent writer is enabled, the calculated required ordering is only used in step 2 of the concurrent writer.
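
For reference, the required ordering computed in the hunk above is just the concatenation of dynamic partition columns, the optional bucket id expression, and the sort columns. A toy sketch of that construction, with `String` standing in for Spark's `Expression` and hypothetical parameter names mirroring the diff:

```scala
// Simplified model of the requiredOrdering construction from the hunk above.
// Strings stand in for Catalyst expressions; names mirror the patch, not real Spark APIs.
object RequiredOrdering {
  def requiredOrdering(
      partitionColumns: Seq[String],
      numStaticPartitionCols: Int,
      bucketIdExpression: Option[String],
      sortColumns: Seq[String]): Seq[String] =
    // Dynamic partition columns first (static ones are dropped),
    // then the bucket id if the table is bucketed, then sort columns.
    partitionColumns.drop(numStaticPartitionCols) ++
      bucketIdExpression ++ sortColumns

  def main(args: Array[String]): Unit = {
    val ord = requiredOrdering(Seq("year", "month"), 1, Some("bucketId"), Seq("ts"))
    // "year" is static and dropped; "month" is dynamic and leads the ordering.
    assert(ord == Seq("month", "bucketId", "ts"))
  }
}
```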
   
   
https://github.com/apache/spark/blob/29434ea766b0fc3c3bf6eaadb43a8f931133649e/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatDataWriter.scala#L393-L406



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

