turboFei commented on issue #25795: [WIP][SPARK-29037][Core] Spark gives duplicate result when an application was killed
URL: https://github.com/apache/spark/pull/25795#issuecomment-532618199

> So #25739 is to support concurrent writes to different locations, and this PR is to detect concurrent writes to the same location and fail fast?
>
> I am fine with that as long as @turboFei is ok.

- Firstly, for dynamicPartitionOverwrite, we should skip the FileOutputCommitter {setupJob/commitJob/abortJob} calls, and the concurrency support should be resolved by https://github.com/apache/spark/pull/25739 @advancedxy.
- Then, when dynamicPartitionOverwrite=false, if a sub-directory already exists under $table_output/_temporary, the job should fail fast.
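The fail-fast check in the second bullet could be sketched roughly as follows. This is a hypothetical, filesystem-level illustration in Python, not Spark's actual Scala implementation; the helper name `detect_concurrent_write` and the error message are invented, and a real implementation would go through the Hadoop `FileSystem` API rather than the local filesystem:

```python
import os

def detect_concurrent_write(table_output: str) -> None:
    """Fail fast if $table_output/_temporary already contains attempt
    sub-directories (hypothetical helper, for illustration only)."""
    temporary = os.path.join(table_output, "_temporary")
    if os.path.isdir(temporary):
        # FileOutputCommitter stages output under one sub-directory per
        # application attempt inside _temporary; finding an existing one
        # suggests a concurrent (or killed but not cleaned up) write to
        # the same output location.
        attempts = [d for d in os.listdir(temporary)
                    if os.path.isdir(os.path.join(temporary, d))]
        if attempts:
            raise RuntimeError(
                "Concurrent write detected: {} already contains attempt "
                "dirs {}".format(temporary, attempts))
```

A caller would invoke this during job setup, before any output is staged, so a conflicting writer is rejected immediately instead of silently producing duplicate results.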
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
