GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/14371
[SPARK-16736] WiP Core+ SQL superfluous fs calls
## What changes were proposed in this pull request?
This patch reviews the code, working back from Hadoop's `FileSystem.exists()`
and `FileSystem.isDirectory()` implementations, and removes uses of those calls
where they are superfluous.
1. `delete()` is harmless if called on a nonexistent path, so no checks are
made before deletes.
1. Any `FileSystem.exists()` check before `getFileStatus()` or `open()` is
superfluous, as the operation itself does the check. Instead, the
`FileNotFoundException` is caught and triggers the downgraded path. Where a
`FileNotFoundException` was thrown before, the code still creates a new FNFE
with the same error messages, but the original exception is now nested inside
it for easier diagnostics.
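
The two rules above can be sketched as follows. This is a minimal illustration, not code from the patch: it uses `java.io` and `java.nio.file` from the JDK as a stand-in for Hadoop's `FileSystem`, and the class name `SuperfluousChecks` and its helper methods are hypothetical names chosen for the example.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration class; JDK file APIs stand in for Hadoop's FileSystem.
final class SuperfluousChecks {

    // Rule 1: no exists() probe before a delete. The call reports whether
    // anything was removed instead of failing on a missing path.
    static boolean deleteWithoutCheck(Path path) throws IOException {
        return Files.deleteIfExists(path);
    }

    // Rule 2, downgraded path: open directly and treat FileNotFoundException
    // as "use the fallback" instead of probing with exists() first.
    // InputStream.readAllBytes() needs Java 9 or later.
    static String readOrDefault(String path, String fallback) throws IOException {
        try (InputStream in = new FileInputStream(path)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (FileNotFoundException e) {
            return fallback;
        }
    }

    // Rule 2, failing path: still raise a FileNotFoundException with a
    // caller-specific message, but nest the original exception as its cause.
    static InputStream openOrExplain(String path, String purpose)
            throws FileNotFoundException {
        try {
            return new FileInputStream(path);
        } catch (FileNotFoundException e) {
            FileNotFoundException wrapped =
                new FileNotFoundException("Cannot open " + purpose + ": " + path);
            wrapped.initCause(e); // FileNotFoundException has no (String, Throwable) constructor
            throw wrapped;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("superfluous-checks");
        String missing = dir.resolve("missing.txt").toString();
        System.out.println(deleteWithoutCheck(dir.resolve("missing.txt")));
        System.out.println(readOrDefault(missing, "fallback"));
        try {
            openOrExplain(missing, "example file");
        } catch (FileNotFoundException e) {
            System.out.println(e.getMessage() + " (cause: " + e.getCause() + ")");
        }
    }
}
```

Each helper makes one call where the check-then-act version makes two, which matters most on object stores, where every existence probe is a remote request. It also closes the window in which the file could vanish between the check and the operation.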
## How was this patch tested?
Initially, relying on Jenkins test runs.
One trouble spot here is that some of the code paths are clearly error
situations, and it's not clear that they have any test coverage. Creating the
failure conditions in tests would be ideal, but it will also be hard.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/steveloughran/spark cloud/SPARK-16736-superfluous-fs-calls
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/14371.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #14371
----
commit d318ff4e4af0b3948eefd81aa5786603b060fa16
Author: Steve Loughran <[email protected]>
Date: 2016-07-26T12:31:28Z
[SPARK-16376] cut out exists() checks when a subsequent operation will
raise an exception anyway; either rethrow the exception or downgrade, depending
on original code
commit 02dc129b9338be6c13041fb71a5d18e866872fd8
Author: Steve Loughran <[email protected]>
Date: 2016-07-26T13:16:56Z
[SPARK-16736] updated
commit e577688c47b3835b40a88dda14f19d236ab29631
Author: Steve Loughran <[email protected]>
Date: 2016-07-26T14:13:14Z
[SPARK-16736] fix style warnings
----