[
https://issues.apache.org/jira/browse/SPARK-29257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939075#comment-16939075
]
Kent Yao commented on SPARK-29257:
----------------------------------
this issue should be not a problem anymore after this fix -
[https://github.com/apache/spark/pull/25620] . Each index and data file will
have a unique name with task attempt id.
> All Task attempts scheduled to the same executor inevitably access the same
> bad disk
> ------------------------------------------------------------------------------------
>
> Key: SPARK-29257
> URL: https://issues.apache.org/jira/browse/SPARK-29257
> Project: Spark
> Issue Type: Improvement
> Components: Shuffle
> Affects Versions: 2.3.4, 2.4.4
> Reporter: Kent Yao
> Priority: Major
> Attachments: image-2019-09-26-16-44-48-554.png
>
>
> We have an HDFS/YARN cluster with about 2k~3k nodes, each node has 8T * 12
> local disks for storage and shuffle. Sometimes, one or more disks get into
> bad status during computations. Sometimes it does cause job level failure,
> sometimes does.
> The following picture shows one failure job caused by 4 task attempts were
> all delivered to the same node and failed with almost the same exception for
> writing the index temporary file to the same bad disk.
>
> This is caused by two reasons:
> # As we can see in the figure the data and the node have the best data
> locality for reading from HDFS. As the default spark.locality.wait(3s) taking
> effect, there is a high probability that those attempts will be scheduled to
> this node.
> # The index file or data file name for a particular shuffle map task is
> fixed. It is formed by the shuffle id, the map id and the noop reduce id
> which is always 0. The root local dir is picked by the fixed file name's
> non-negative hash code % the disk number. Thus, this value is also fixed.
> Even when we have 12 disks in total and only one of them is broken, if the
> broken one is once picked, all the following attempts of this task will
> inevitably pick the broken one.
>
>
> !image-2019-09-26-16-44-48-554.png!
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]