[ https://issues.apache.org/jira/browse/SPARK-29649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun resolved SPARK-29649.
-----------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

Issue resolved by pull request 26312
[https://github.com/apache/spark/pull/26312]

> Stop task set if FileAlreadyExistsException was thrown when writing to output
> file
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-29649
>                 URL: https://issues.apache.org/jira/browse/SPARK-29649
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, SQL
>    Affects Versions: 3.0.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>             Fix For: 3.0.0
>
> We already know that task attempts that do not clean up output files in the
> staging directory can cause job failure (SPARK-27194). There have been
> proposals to fix this by changing the output filename or by deleting existing
> output files, but none of these proposals is completely reliable.
> The difficulty is that because a previous failed task attempt wrote the
> output file, at the next task attempt the output file is still under the same
> staging directory, even if the output file name is different.
> If the job is going to fail eventually, there is no point in re-running the
> task until the maximum number of attempts is reached. For long-running jobs,
> re-running the task can waste a lot of time.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
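The fail-fast idea behind this change can be sketched as a retry loop that treats FileAlreadyExistsException as non-retryable and aborts immediately instead of burning through the remaining attempts. This is a minimal illustration, not Spark's actual scheduler code: the class, interface, and constant names below are hypothetical, and it uses the JDK's java.nio.file.FileAlreadyExistsException rather than Hadoop's equivalent.

```java
import java.nio.file.FileAlreadyExistsException;

// Hypothetical sketch of fail-fast retry handling; not Spark's actual code.
public class FailFastRetry {
    static final int MAX_ATTEMPTS = 4; // mirrors Spark's default spark.task.maxFailures

    interface TaskBody {
        String run(int attempt) throws Exception;
    }

    /**
     * Runs the task, retrying transient failures up to MAX_ATTEMPTS, but
     * aborting immediately on FileAlreadyExistsException: the stale output
     * file from a previous attempt is still in the staging directory, so
     * further attempts cannot succeed.
     */
    static String runWithRetries(TaskBody task) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try {
                return task.run(attempt);
            } catch (FileAlreadyExistsException e) {
                // Non-retryable: fail the task set now instead of retrying.
                throw e;
            } catch (Exception e) {
                last = e; // assumed transient: retry
            }
        }
        throw last;
    }
}
```

Under this sketch, a task that hits FileAlreadyExistsException fails on its first attempt, while a task with a transient failure still gets its remaining retries.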