[ https://issues.apache.org/jira/browse/SPARK-11328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975361#comment-14975361 ]

Yin Huai commented on SPARK-11328:
----------------------------------

The "file already exists" error is thrown at this line, where we try to create a record writer:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala#L237

> Correctly propagate error message in the case of failures when writing parquet
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-11328
>                 URL: https://issues.apache.org/jira/browse/SPARK-11328
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Yin Huai
>
> When saving data to S3 (e.g. as Parquet), if an error occurs during query
> execution, the partial file generated by the failed task is uploaded to S3,
> and retries of that task then throw a "file already exists" error. This is
> very confusing to users, because they may take the "file already exists"
> error for the cause of the job failure. The real error can only be found in
> the Spark UI (on the stage page).
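
The failure mode described above can be reproduced in miniature outside Spark. The following is only a sketch in plain Python, not Spark code: `write_attempt`, `run_with_retries`, and `TaskFailure` are hypothetical names, and a local temp file stands in for the S3 object. The first attempt fails with the real error and leaves a partial file behind, every retry then fails with a "file already exists" error, and the wrapper reports the first error instead of the last one.

```python
import os
import tempfile


class TaskFailure(Exception):
    """Hypothetical wrapper that reports the first error seen across attempts."""


def write_attempt(path, rows):
    # Exclusive create ("x") fails if an earlier attempt left a file behind,
    # loosely mimicking the "file already exists" error on retry.
    with open(path, "x") as out:
        for row in rows:
            if row is None:
                # Stands in for the real error raised during query execution.
                raise ValueError("bad record")
            out.write(f"{row}\n")


def run_with_retries(path, rows, max_attempts=3):
    first_error = None
    last_error = None
    for _ in range(max_attempts):
        try:
            write_attempt(path, rows)
            return
        except Exception as err:
            if first_error is None:
                first_error = err  # remember the root cause
            last_error = err
    # Surface the first error in the message; chain the last one as the cause.
    raise TaskFailure(f"task failed; first error: {first_error!r}") from last_error


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "part-00000")
    try:
        run_with_retries(target, ["a", None, "b"])
    except TaskFailure as failure:
        print(failure)                  # mentions the original ValueError
        print(repr(failure.__cause__))  # the FileExistsError from the last retry
```

Without the `first_error` bookkeeping, a caller would see only the `FileExistsError` from the final retry, which is the confusion this issue is about.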



