Github user chrysan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19001#discussion_r175350408
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala
---
@@ -156,40 +144,14 @@ object FileFormatWriter extends Logging {
statsTrackers = statsTrackers
)
- // We should first sort by partition columns, then bucket id, and finally sorting columns.
- val requiredOrdering = partitionColumns ++ bucketIdExpression ++ sortColumns
- // the sort order doesn't matter
- val actualOrdering = plan.outputOrdering.map(_.child)
- val orderingMatched = if (requiredOrdering.length > actualOrdering.length) {
- false
- } else {
- requiredOrdering.zip(actualOrdering).forall {
- case (requiredOrder, childOutputOrder) =>
- requiredOrder.semanticEquals(childOutputOrder)
- }
- }
-
SQLExecution.checkSQLExecutionId(sparkSession)
// This call shouldn't be put into the `try` block below because it only initializes and
// prepares the job, any exception thrown from here shouldn't cause abortJob() to be called.
committer.setupJob(job)
try {
- val rdd = if (orderingMatched) {
- plan.execute()
- } else {
- // SPARK-21165: the `requiredOrdering` is based on the attributes from analyzed plan, and
- // the physical plan may have different attribute ids due to optimizer removing some
- // aliases. Here we bind the expression ahead to avoid potential attribute ids mismatch.
- val orderingExpr = requiredOrdering
- .map(SortOrder(_, Ascending))
- .map(BindReferences.bindReference(_, outputSpec.outputColumns))
- SortExec(
--- End diff --
Removing SortExec here and adding it in the EnsureRequirements strategy will
affect the many other DataWritingCommands that depend on FileFormatWriter,
such as CreateDataSourceTableAsSelectCommand. To fix this, those
DataWritingCommand implementations need code changes so that they export
requiredDistribution and requiredOrdering.
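
To make the concern concrete, here is a rough, self-contained sketch of the shape such a change might take. It is a toy model only: `Col`, `Plan`, `Scan`, `Sort`, `WriteCommand`, `InsertCommand` and `ensureOrdering` are hypothetical stand-ins, not Spark's actual classes or the API proposed in this PR. The idea is that each write command exposes its own required ordering, and a planner-side rule adds a sort only when the child's output ordering does not already satisfy it, using the same prefix check as the removed `orderingMatched` logic.

```scala
// Hypothetical, simplified model; none of these types are Spark's real classes.
object RequiredOrderingSketch {
  // Stand-in for a Catalyst expression, identified only by a column name.
  final case class Col(name: String)

  // Stand-in for a physical plan node that reports the ordering of its output.
  trait Plan {
    def outputOrdering: Seq[Col]
  }

  final case class Scan(outputOrdering: Seq[Col]) extends Plan

  // Stand-in for an explicit sort; its output ordering is what it sorts by.
  final case class Sort(ordering: Seq[Col], child: Plan) extends Plan {
    def outputOrdering: Seq[Col] = ordering
  }

  // Assumed contract: every write command states the ordering it needs.
  trait WriteCommand {
    def partitionColumns: Seq[Col]
    def bucketIdExpression: Option[Col]
    def sortColumns: Seq[Col]
    // Same composition as the removed code:
    // partition columns, then bucket id, then sort columns.
    def requiredOrdering: Seq[Col] =
      partitionColumns ++ bucketIdExpression ++ sortColumns
  }

  final case class InsertCommand(
      partitionColumns: Seq[Col],
      bucketIdExpression: Option[Col],
      sortColumns: Seq[Col]) extends WriteCommand

  // Mirrors the removed check: the required ordering must be a prefix
  // of the ordering the child already produces.
  def orderingMatched(required: Seq[Col], actual: Seq[Col]): Boolean =
    required.length <= actual.length &&
      required.zip(actual).forall { case (r, a) => r == a }

  // Planner-side rule: wrap the child in a Sort only when needed.
  def ensureOrdering(cmd: WriteCommand, child: Plan): Plan =
    if (orderingMatched(cmd.requiredOrdering, child.outputOrdering)) child
    else Sort(cmd.requiredOrdering, child)
}
```

In this sketch, a command that does not expose `requiredOrdering` would silently get no sort at all, which is the risk for commands such as CreateDataSourceTableAsSelectCommand once the sort no longer lives in FileFormatWriter.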
---