[ https://issues.apache.org/jira/browse/SPARK-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509250#comment-14509250 ]

Jason Hubbard commented on SPARK-6067:
--------------------------------------

I was able to test this.  I initially had trouble reproducing the issue in a 
different environment and realized I was using a TEXT storage handler instead 
of a PARQUET handler in my tests.  After switching to PARQUET, I was able to 
reproduce the failure and then verified that the patch works: the old file was 
deleted and the retried task recreated it.

Just an FYI: the text, sequence, and avro record writers call the file 
system's create with overwrite (either via the default or by specifying it 
directly), while the PARQUET and ORC record writers do not overwrite, so they 
are the ones affected by this.
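
For reference, a rough Scala sketch of the kind of cleanup involved (the name 
prepareForRetry is mine, not from the patch; it only illustrates the Hadoop 
FileSystem calls in play):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Sketch only: text/sequence/avro writers effectively call
// fs.create(path, /* overwrite = */ true), but the Parquet/ORC paths create
// the file without overwrite, so a file left behind by a failed attempt makes
// every retry fail with "... already exists". Deleting the stale file before
// the retry opens it avoids that.
def prepareForRetry(outputPath: String, conf: Configuration): Unit = {
  val path = new Path(outputPath)
  val fs = path.getFileSystem(conf)
  if (fs.exists(path)) {
    // Remove the partial output of the previous failed attempt (non-recursive).
    fs.delete(path, false)
  }
}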

> Spark sql hive dynamic partitions job will fail if task fails
> -------------------------------------------------------------
>
>                 Key: SPARK-6067
>                 URL: https://issues.apache.org/jira/browse/SPARK-6067
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.2.0
>            Reporter: Jason Hubbard
>            Priority: Minor
>         Attachments: job.log
>
>
> When inserting into a hive table from spark sql while using dynamic 
> partitioning, if a task fails, its retry attempts will continue to fail and 
> eventually fail the job.
> /mytable/.hive-staging_hive_2015-02-27_11-53-19_573_222-3/-ext-10000/partition=2015-02-04/part-00001
>  for client <ip> already exists
> The retried task may need to clean up the output left by the previously 
> failed task before writing to the same location.


