[ 
https://issues.apache.org/jira/browse/SPARK-29649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-29649:
----------------------------------
    Component/s:     (was: SQL)
                 Spark Core

> Stop task set if FileAlreadyExistsException was thrown when writing to output 
> file
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-29649
>                 URL: https://issues.apache.org/jira/browse/SPARK-29649
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: L. C. Hsieh
>            Assignee: L. C. Hsieh
>            Priority: Major
>
> We already know that task attempts which do not clean up output files in the 
> staging directory can cause job failures (SPARK-27194). There have been 
> proposals to fix this by changing the output filename or by deleting existing 
> output files, but none of them is completely reliable.
> The difficulty is that, because the previously failed task attempt wrote its 
> output file, that file is still sitting under the same staging directory when 
> the next task attempt runs, even if the new attempt uses a different output 
> file name.
> If the job is going to fail eventually anyway, there is no point in re-running 
> the task until the maximum number of attempts is reached. For long-running 
> jobs, re-running the task can waste a lot of time.
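The proposed behavior can be illustrated with a minimal, hypothetical sketch (this is not Spark's actual scheduler code): a FileAlreadyExistsException from the output path is deterministic, so it will recur on every retry, and the task set may as well be stopped immediately. Here `FileAlreadyExistsException` is stubbed locally to stand in for Hadoop's class, and `shouldAbortTaskSet` is an illustrative helper, not a real Spark API.

```java
import java.io.IOException;

public class TaskFailureClassifier {
    // Stand-in for org.apache.hadoop.fs.FileAlreadyExistsException,
    // stubbed here so the sketch is self-contained.
    static class FileAlreadyExistsException extends IOException {
        FileAlreadyExistsException(String msg) { super(msg); }
    }

    // Walk the cause chain, since task runners commonly wrap the
    // original exception before it reaches the scheduler.
    static boolean shouldAbortTaskSet(Throwable failure) {
        for (Throwable t = failure; t != null; t = t.getCause()) {
            if (t instanceof FileAlreadyExistsException) {
                return true;  // deterministic failure: retrying cannot succeed
            }
        }
        return false;  // possibly transient failure: let the scheduler retry
    }

    public static void main(String[] args) {
        Throwable wrapped = new RuntimeException("task 3.0 failed",
            new FileAlreadyExistsException("/staging/part-00003 already exists"));
        System.out.println(shouldAbortTaskSet(wrapped));                   // true
        System.out.println(shouldAbortTaskSet(new RuntimeException("x"))); // false
    }
}
```

Keying the decision on the exception type, rather than on the retry count, is what avoids burning through all the attempts on a failure that is guaranteed to repeat.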



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
