zzzzming95 commented on code in PR #38358:
URL: https://github.com/apache/spark/pull/38358#discussion_r1008670694
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala:
##########
@@ -187,8 +188,17 @@ object FileFormatWriter extends Logging {
// We should first sort by partition columns, then bucket id, and finally sorting columns.
val requiredOrdering =
  partitionColumns ++ writerBucketSpec.map(_.bucketIdExpression) ++ sortColumns
+
+ // SPARK-40588: plan may contain an AdaptiveSparkPlanExec, which does not know
+ // its final plan's ordering, so we have to materialize that plan first
+ def materializeAdaptiveSparkPlan(plan: SparkPlan): SparkPlan = plan match {
Review Comment:
I found a bug similar to this issue. The root cause in both cases is that
`V1Writes#prepareQuery()` generates a new `Sort`:
https://github.com/apache/spark/blob/a2f3958a29ad16a1b3e372156d6c6ae4959d5e8c/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/V1Writes.scala#L70
I opened a PR for it: https://github.com/apache/spark/pull/38356
My solution there is that `V1Writes#prepareQuery()` also carries over the
ordering fields that are missing from the top-level `Sort`.
If `AdaptiveSparkPlanExec` is removed and its plan is extracted, could that
cause other unexpected behavior? For example, would removing
`AdaptiveSparkPlanExec` prevent some AQE features from taking effect?
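
To make the ordering problem concrete, here is a minimal sketch of the kind of check involved. It is a simplified model, not Spark's actual code: `Plan`, `SortNode`, `Scan`, `AdaptiveWrapper`, and `OrderingCheck` are hypothetical names, and orderings are plain strings rather than `SortOrder` expressions. It assumes the writer adds a sort whenever the required ordering is not a prefix of the child's reported ordering, and that an adaptive wrapper reports no ordering until its final plan is known.

```scala
// Simplified model with hypothetical names; these are not Spark's real classes.
sealed trait Plan { def outputOrdering: Seq[String] }

// A sort node that reports the ordering it produces.
final case class SortNode(order: Seq[String], child: Plan) extends Plan {
  val outputOrdering: Seq[String] = order
}

// A leaf that guarantees no particular ordering.
case object Scan extends Plan {
  val outputOrdering: Seq[String] = Seq.empty
}

// Stand-in for an adaptive wrapper (assumption): it hides the inner plan's
// ordering until it has been materialized.
final case class AdaptiveWrapper(inner: Plan, materialized: Boolean = false) extends Plan {
  def outputOrdering: Seq[String] =
    if (materialized) inner.outputOrdering else Seq.empty
}

object OrderingCheck {
  // The required ordering is satisfied when it is a prefix of the actual ordering.
  def orderingMatched(required: Seq[String], actual: Seq[String]): Boolean =
    required.length <= actual.length &&
      required.zip(actual).forall { case (r, a) => r == a }

  // Wrap the plan in a sort only when it does not already provide the ordering.
  def ensureOrdering(required: Seq[String], plan: Plan): Plan =
    if (orderingMatched(required, plan.outputOrdering)) plan
    else SortNode(required, plan)
}
```

In this model, `ensureOrdering` leaves an already sorted plan untouched, adds a second sort on top of an unmaterialized `AdaptiveWrapper` around that same plan, and stops adding it once the wrapper is marked as materialized. The sketch only shows why the extra sort appears. It says nothing about which AQE features depend on the wrapper staying in place, which is the open question above.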