[
https://issues.apache.org/jira/browse/SPARK-15371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15288013#comment-15288013
]
Saisai Shao commented on SPARK-15371:
-------------------------------------
I think what you mentioned above is related to SPARK-14963, which has
already been fixed and merged into master; the target release version is 2.1.0.
> YARNShuffleService doesn't get current local-dirs from NodeManager
> ------------------------------------------------------------------
>
> Key: SPARK-15371
> URL: https://issues.apache.org/jira/browse/SPARK-15371
> Project: Spark
> Issue Type: Bug
> Components: Shuffle, YARN
> Affects Versions: 1.5.0, 1.5.1, 1.5.2, 1.6.0, 1.6.1, 1.6.2, 2.0.0
> Reporter: Jeff Field
> Priority: Minor
>
> In YarnShuffleService.java, the YarnShuffleService loads the conf settings
> from YARN to get a list of local directories. If it doesn't find an existing
> LevelDB file on any of them (for recovery), it creates one in the directory
> that is the first element of the list. Since it isn't asking YARN for the
> current list of healthy local-dirs (only for the ones in the config), if the
> first directory is a location the NodeManager knows to be bad,
> YarnShuffleService will continue to try to use it.
> Removing the bad directory from the config fixes this, but Spark should get a
> current list from YARN instead of using the list from the config. There are
> examples of this in
> https://github.com/apache/hadoop/blob/branch-2.7.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/TestDiskFailures.java
> but I'm not sure of the right way for Spark to implement that.
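
The lookup described in the issue can be sketched roughly as follows. This is
a simplified illustration, not the actual YarnShuffleService code: the class
name RecoveryFileLocator, the method findRecoveryFile, the file name
"recovery.ldb" and the directory paths are all hypothetical, and the list of
directories is assumed to come straight from the static configuration.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

public class RecoveryFileLocator {

    // Hypothetical helper: reuse an existing recovery file if any configured
    // directory has one, otherwise fall back to the first configured directory.
    public static File findRecoveryFile(List<String> configuredLocalDirs, String fileName) {
        for (String dir : configuredLocalDirs) {
            File candidate = new File(dir, fileName);
            if (candidate.exists()) {
                return candidate;
            }
        }
        // No health check happens here: the first entry is used even if the
        // NodeManager has already marked that disk as bad.
        return new File(configuredLocalDirs.get(0), fileName);
    }

    public static void main(String[] args) {
        // Illustrative paths only; assumed to mirror the configured local-dirs.
        List<String> dirs = Arrays.asList("/data1/yarn/local", "/data2/yarn/local");
        System.out.println(findRecoveryFile(dirs, "recovery.ldb"));
    }
}
```

In this sketch the fallback never consults the NodeManager, so a failed first
directory keeps being chosen until it is removed from the config, which
matches the behaviour reported above.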