[ 
https://issues.apache.org/jira/browse/SPARK-29649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-29649.
-----------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

Issue resolved by pull request 26312
[https://github.com/apache/spark/pull/26312]

> Stop task set if FileAlreadyExistsException was thrown when writing to output 
> file
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-29649
>                 URL: https://issues.apache.org/jira/browse/SPARK-29649
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, SQL
>    Affects Versions: 3.0.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>             Fix For: 3.0.0
>
>
> We already know that task attempts which do not clean up their output files in the 
> staging directory can cause job failure (SPARK-27194). There were proposals to fix 
> this by changing the output filename or by deleting existing output files, but none 
> of these proposals is completely reliable.
> The difficulty is that, because a previous failed task attempt already wrote the 
> output file, that file is still under the same staging directory when the next task 
> attempt runs, even if the output file name is different.
> If the job is going to fail eventually, there is no point in re-running the task 
> until the maximum number of attempts is reached. For long-running jobs, re-running 
> the task can waste a lot of time.
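
For illustration only, a minimal Scala sketch of the retry decision described in the
quoted description. This is not Spark's actual scheduler code: the names
RetryDecisionSketch, TaskSetAction, RetryTask, AbortTaskSet and onTaskFailure are
invented for this example, and it assumes hadoop-common is on the classpath for
org.apache.hadoop.fs.FileAlreadyExistsException. The point is simply that a
FileAlreadyExistsException is deterministic, so the task set should be failed
immediately instead of retrying until spark.task.maxFailures is exhausted.

// Hypothetical sketch, not Spark internals: models how a task failure could be
// classified as retryable vs. fatal for the whole task set.
import org.apache.hadoop.fs.FileAlreadyExistsException

object RetryDecisionSketch {

  sealed trait TaskSetAction
  case object RetryTask    extends TaskSetAction  // transient failure, worth another attempt
  case object AbortTaskSet extends TaskSetAction  // deterministic failure, fail the job now

  // Decide how to react to a task failure, following the reasoning in the issue.
  def onTaskFailure(failure: Throwable, attempt: Int, maxAttempts: Int): TaskSetAction =
    failure match {
      // A leftover output file from a previous attempt means every retry will
      // hit the same exception, so give up immediately.
      case _: FileAlreadyExistsException => AbortTaskSet
      // Other failures may be transient; retry until the attempt budget is spent.
      case _ if attempt < maxAttempts    => RetryTask
      case _                             => AbortTaskSet
    }

  def main(args: Array[String]): Unit = {
    val conflict = new FileAlreadyExistsException("part-00000 already exists in the staging directory")
    println(onTaskFailure(conflict, attempt = 1, maxAttempts = 4))                                 // AbortTaskSet
    println(onTaskFailure(new RuntimeException("executor lost"), attempt = 1, maxAttempts = 4))    // RetryTask
  }
}

Running main prints AbortTaskSet for the file-conflict case and RetryTask for a
transient failure, which is the behaviour the issue asks the scheduler to adopt.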


