[ 
https://issues.apache.org/jira/browse/SPARK-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14556246#comment-14556246
 ] 

Imran Rashid commented on SPARK-7829:
-------------------------------------

I'll submit a PR shortly

> SortShuffleWriter writes inconsistent data & index files on stage retry
> -----------------------------------------------------------------------
>
>                 Key: SPARK-7829
>                 URL: https://issues.apache.org/jira/browse/SPARK-7829
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.3.1
>            Reporter: Imran Rashid
>            Assignee: Imran Rashid
>
> When a stage is retried, even if a shuffle map task was successful, it may 
> get retried in any case.  If it happens to get scheduled on the same 
> executor, the old data file is *appended*, while the index file still assumes 
> the data starts in position 0.  This leads to an apparently corrupt shuffle 
> map output, since when the data file is read, the index file points to the 
> wrong location.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to