GitHub user jerryshao commented on a diff in the pull request:
https://github.com/apache/spark/pull/18905#discussion_r132896967
--- Diff:
common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java
---
@@ -333,33 +333,63 @@ protected Path getRecoveryPath(String fileName) {
}
/**
+ * Check the chosen DB file available or not.
+ */
+ protected Boolean checkFileAvailable(File file) {
--- End diff --
I'm not sure this is a thorough way to check disk health. In one of our
internal cases, a disk was not mounted (due to a failure), and trying to write
to the unmounted disk threw a permission-denied exception.
An unwritable disk is just one kind of unhealthy disk, so maybe we should
look at YARN's disk health check mechanism instead.
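
To make the concern concrete, here is a rough sketch of the kind of write
probe being discussed. The class and method names (DiskProbe, isDirUsable)
are hypothetical and are not taken from this PR or from YARN. It treats any
I/O or permission failure as "unusable", which would cover the unmounted-disk
case above, but it still only detects the unwritable case and not other kinds
of disk failure.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Spark or YARN.
public class DiskProbe {

  /**
   * Returns true only if a small file can be created, written, read back,
   * and deleted inside the given directory.
   */
  public static boolean isDirUsable(File dir) {
    if (dir == null || !dir.isDirectory()) {
      return false;
    }
    Path probe = null;
    try {
      probe = Files.createTempFile(dir.toPath(), "probe", ".tmp");
      byte[] payload = {1, 2, 3};
      Files.write(probe, payload);
      // Read the bytes back so a silently failing write is also caught.
      return Arrays.equals(payload, Files.readAllBytes(probe));
    } catch (IOException | SecurityException e) {
      // Permission-denied errors surface here as an IOException subclass.
      return false;
    } finally {
      if (probe != null) {
        try {
          Files.deleteIfExists(probe);
        } catch (IOException ignored) {
          // Best-effort cleanup; the probe result is already decided.
        }
      }
    }
  }
}
```

A probe like this only says something about the moment it runs, so it would
need to be repeated whenever a recovery directory is chosen.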