Clarkkkkk opened a new pull request #26090: [SPARK-29302] Fix writing file collision in dynamic partition overwrite mode within speculative execution
URL: https://github.com/apache/spark/pull/26090

### What changes were proposed in this pull request?
When inserting into a partitioned DataSource table with dynamic partition overwrite and speculative execution enabled, multiple attempts of the same task try to write to the same files. The problem does not reproduce with a Hive table. This PR reuses FileOutputCommitter to avoid the write collision, and renames files from the staging directory to the final output directory using the original logic in HadoopMapReduceCommitProtocol#commitJob.

### Why are the changes needed?
Tasks fail in this circumstance.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
This patch is tested by the existing tests in org.apache.spark.sql.sources.InsertSuite.
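
The collision and the general idea behind the fix can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: it does not use Spark or Hadoop, and the directory layout and the helper names `write_attempt`, `commit_task`, and `commit_job` are hypothetical, chosen for illustration rather than taken from the patch. It shows two attempts of the same task writing the same file name into attempt-private directories, one attempt being committed into the staging directory, and the job commit renaming staged partitions into the final output.

```python
import os
import shutil
import tempfile


def write_attempt(staging_dir, task_id, attempt_id, partition, filename, data):
    """Write one attempt's output under an attempt-private directory (hypothetical layout)."""
    attempt_dir = os.path.join(
        staging_dir, "_temporary", f"task_{task_id}_attempt_{attempt_id}"
    )
    target_dir = os.path.join(attempt_dir, partition)
    os.makedirs(target_dir, exist_ok=True)
    with open(os.path.join(target_dir, filename), "w") as f:
        f.write(data)
    return attempt_dir


def commit_task(staging_dir, attempt_dir):
    """Promote the committed attempt's files into the staging directory."""
    for root, _dirs, files in os.walk(attempt_dir):
        rel = os.path.relpath(root, attempt_dir)
        dest_dir = os.path.normpath(os.path.join(staging_dir, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in files:
            os.replace(os.path.join(root, name), os.path.join(dest_dir, name))
    shutil.rmtree(attempt_dir)


def commit_job(staging_dir, output_dir):
    """Discard uncommitted attempts, then rename staged partitions into the output."""
    shutil.rmtree(os.path.join(staging_dir, "_temporary"), ignore_errors=True)
    os.makedirs(output_dir, exist_ok=True)
    for partition in sorted(os.listdir(staging_dir)):
        final = os.path.join(output_dir, partition)
        # Dynamic overwrite: only partitions written by this job are replaced.
        shutil.rmtree(final, ignore_errors=True)
        os.rename(os.path.join(staging_dir, partition), final)


def demo():
    root = tempfile.mkdtemp()
    staging = os.path.join(root, ".staging-job1")
    output = os.path.join(root, "table")

    # An existing partition that this job does not touch.
    os.makedirs(os.path.join(output, "p=2"))
    with open(os.path.join(output, "p=2", "part-00000"), "w") as f:
        f.write("old")

    # Original and speculative attempts of task 0 write the same file name.
    a0 = write_attempt(staging, 0, 0, "p=1", "part-00000", "attempt-0")
    a1 = write_attempt(staging, 0, 1, "p=1", "part-00000", "attempt-1")
    distinct_paths = a0 != a1

    commit_task(staging, a1)  # only one attempt is committed
    commit_job(staging, output)

    with open(os.path.join(output, "p=1", "part-00000")) as f:
        written = f.read()
    with open(os.path.join(output, "p=2", "part-00000")) as f:
        untouched = f.read()
    partitions = sorted(os.listdir(output))
    shutil.rmtree(root)
    return {
        "distinct_paths": distinct_paths,
        "written": written,
        "untouched": untouched,
        "partitions": partitions,
    }


if __name__ == "__main__":
    print(demo())
```

Because each attempt writes under its own directory, the two attempts never open the same path, and the uncommitted attempt's output is simply discarded at job commit.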
