jeanlyn commented on code in PR #41628:
URL: https://github.com/apache/spark/pull/41628#discussion_r1280658515


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelationCommand.scala:
##########
@@ -162,33 +193,36 @@ case class InsertIntoHadoopFsRelationCommand(
                 retainData = true /* already deleted */).run(sparkSession)
             }
           }
+          // The `customPartitionLocations` is derived from the written paths. Hence, we need
+          // to move files from `qualifiedOutputPath` to the custom locations.
+          if (updatedPartitions.nonEmpty && dynamicPartitionOverwrite) {

Review Comment:
   It doesn't change the commit protocol. Before this PR, 
`customPartitionLocations` was built by pulling all partitions from the metastore. 
This PR sets `customPartitionLocations` to empty (see 
https://github.com/apache/spark/pull/41628/files#diff-15b529afe19e971b138fc604909bcab2e42484babdcea937f41d18cb22d9401dR129),
 because the partition set is unpredictable when it is lazily derived from the written files 
during a dynamic partition write. Hence, we need to write into 
`customPartitionLocations` only for those written files that match these custom partitions.
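   To illustrate the idea (a minimal sketch, not Spark's actual code; the names `resolveCustomLocations`, `updatedPartitions`, and `allCustomLocations` are hypothetical here), the lazily-derived mapping keeps only the custom locations whose partition spec was actually touched by the write, so files can then be moved from the staging output path to those locations:

```scala
// Hypothetical sketch: after a dynamic partition overwrite we only know
// which partitions were actually written, so we filter the full custom
// location map down to those partitions instead of pulling everything
// from the metastore up front.
object CustomLocationSketch {
  type TablePartitionSpec = Map[String, String]

  def resolveCustomLocations(
      updatedPartitions: Set[TablePartitionSpec],
      allCustomLocations: Map[TablePartitionSpec, String])
      : Map[TablePartitionSpec, String] = {
    // Only partitions the write actually touched need their files
    // relocated from `qualifiedOutputPath` to a custom location.
    allCustomLocations.filter { case (spec, _) =>
      updatedPartitions.contains(spec)
    }
  }

  def main(args: Array[String]): Unit = {
    val written = Set(Map("dt" -> "2023-01-01"), Map("dt" -> "2023-01-02"))
    val custom = Map(
      Map("dt" -> "2023-01-01") -> "/custom/loc1",
      Map("dt" -> "2023-03-01") -> "/custom/loc3")
    // Only the 2023-01-01 partition both was written and has a custom location.
    println(resolveCustomLocations(written, custom))
  }
}
```

Under this sketch, the metastore is never consulted for the full partition list; the move step consumes only the intersection of written partitions and configured custom locations.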



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
