[
https://issues.apache.org/jira/browse/SPARK-19812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900821#comment-15900821
]
Saisai Shao edited comment on SPARK-19812 at 3/8/17 7:40 AM:
-------------------------------------------------------------
[~tgraves], I'm not quite sure what you mean here?
bq. The tests are using files rather than directories so it didn't catch. We
need to fix the test also.
From my understanding, this issue happens when the dest dir is not empty and we
try to move with REPLACE_EXISTING. It can also happen when the rename call
fails and the source dir is not an empty directory.
But I cannot imagine how this happened, because if the dest dir is not empty,
it should have returned earlier and never gone on to check the old NM local
dirs.
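As an illustration of the first failure mode only (this is not the Spark code, and the paths and class names are made up for the demo), a minimal sketch showing that Files.move with REPLACE_EXISTING cannot replace a non-empty target directory; the second mode, where a cross-filesystem rename falls back to copy/delete and the delete of a non-empty source fails, needs two mount points and is not reproduced here:

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class MoveDemo {
    // Returns the simple name of the exception thrown when moving one
    // non-empty directory onto another, or "none" if the move succeeds.
    public static String moveOntoNonEmptyDir() {
        try {
            Path src = Files.createTempDirectory("srcRecovery");
            Path dst = Files.createTempDirectory("dstRecovery");
            Files.createFile(src.resolve("a.ldb"));
            Files.createFile(dst.resolve("b.ldb")); // target is not empty
            // REPLACE_EXISTING can replace a file or an *empty* directory,
            // but a non-empty target directory cannot be replaced.
            Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
            return "none";
        } catch (DirectoryNotEmptyException e) {
            return "DirectoryNotEmptyException";
        } catch (IOException e) {
            return "IOException";
        }
    }

    public static void main(String[] args) {
        System.out.println(moveOntoNonEmptyDir());
    }
}
```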
was (Author: jerryshao):
[~tgraves], I'm not quite sure what you mean here?
bq. The tests are using files rather than directories so it didn't catch. We
need to fix the test also.
From my understanding, this issue happens when the dest dir is not empty and
we try to move with REPLACE_EXISTING, but I cannot imagine how this happened,
because if the dest dir is not empty, it should have returned earlier and
never gone on to check the old NM local dirs.
> YARN shuffle service fails to relocate recovery DB directories
> --------------------------------------------------------------
>
> Key: SPARK-19812
> URL: https://issues.apache.org/jira/browse/SPARK-19812
> Project: Spark
> Issue Type: Bug
> Components: YARN
> Affects Versions: 2.0.1
> Reporter: Thomas Graves
> Assignee: Thomas Graves
>
> The YARN shuffle service tries to switch from the YARN local directories to
> the real recovery directory but can fail to move the existing recovery DBs.
> It fails because Files.move does not handle directories that have contents.
> 2017-03-03 14:57:19,558 [main] ERROR yarn.YarnShuffleService: Failed to move
> recovery file sparkShuffleRecovery.ldb to the path
> /mapred/yarn-nodemanager/nm-aux-services/spark_shuffle
> java.nio.file.DirectoryNotEmptyException: /yarn-local/sparkShuffleRecovery.ldb
> at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:498)
> at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
> at java.nio.file.Files.move(Files.java:1395)
> at org.apache.spark.network.yarn.YarnShuffleService.initRecoveryDb(YarnShuffleService.java:369)
> at org.apache.spark.network.yarn.YarnShuffleService.createSecretManager(YarnShuffleService.java:200)
> at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:174)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:143)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:262)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:357)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:636)
> at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:684)
> This used to use f.renameTo, and we switched it in the PR due to review
> comments; it looks like a final real test was not done. The tests are using
> files rather than directories, so they didn't catch this. We need to fix the
> tests as well.
> history:
> https://github.com/apache/spark/pull/14999/commits/65de8531ccb91287f5a8a749c7819e99533b9440
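The linked PR replaced f.renameTo with a single Files.move call, which is what breaks on directories with contents. As a hedged sketch (not the actual Spark fix; the class and method names below are illustrative only), a copy-then-delete walk of the tree sidesteps both the non-empty-target and the cross-filesystem rename problems:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RecursiveMove {

    // Hypothetical replacement for the single Files.move call: copy the
    // whole tree into place, then delete the source, children first.
    public static void moveRecursively(Path src, Path dst) throws IOException {
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(src)) {
            entries = walk.collect(Collectors.toList()); // pre-order: dirs first
        }
        for (Path p : entries) {
            Path target = dst.resolve(src.relativize(p).toString());
            if (Files.isDirectory(p)) {
                Files.createDirectories(target); // no-op if it already exists
            } else {
                Files.copy(p, target, StandardCopyOption.REPLACE_EXISTING);
            }
        }
        // Delete the source tree; reverse order puts children before parents.
        entries.sort(Comparator.reverseOrder());
        for (Path p : entries) {
            Files.delete(p);
        }
    }

    // Self-contained check: nested source moved into a non-empty destination.
    public static boolean demo() {
        try {
            Path src = Files.createTempDirectory("recoverySrc");
            Files.createDirectories(src.resolve("sub"));
            Files.write(src.resolve("sub").resolve("data.ldb"), new byte[]{1});
            Path dst = Files.createTempDirectory("recoveryDst");
            Files.createFile(dst.resolve("leftover")); // dest is not empty
            moveRecursively(src, dst);
            return Files.exists(dst.resolve("sub").resolve("data.ldb"))
                    && !Files.exists(src);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo() ? "moved" : "failed");
    }
}
```

Copy-then-delete is slower than a rename but is the only portable option once the source and destination sit on different mounts, which the /yarn-local to /mapred paths in the log suggest.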
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]