Josh Rosen created SPARK-8132:
---------------------------------
Summary: Race condition if task is cancelled with interruption
while fetching file dependencies
Key: SPARK-8132
URL: https://issues.apache.org/jira/browse/SPARK-8132
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 1.3.1, 1.4.0
Reporter: Josh Rosen
This is a borderline impossible-to-reproduce bug:
If {{spark.files.overwrite = false}} (the default) and a Spark executor is
fetching large file dependencies from the driver _and_ the first task that
triggered file dependency loading is cancelled after it has started copying /
moving the downloaded file to its target directory, then the executor may be
put into a bad state where all subsequent tasks fail with errors about refusing
to overwrite an existing file because its contents differ from the file being
fetched.
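To make the failure mode concrete, here is a simplified model of it, not Spark's actual code. The class {{FetchRaceModel}} and its {{install}} method are hypothetical names used only for illustration; the model assumes the executor compares the fetched file against any existing target and refuses to replace it when overwriting is off.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Simplified, hypothetical model of the "install a fetched file" step. */
public final class FetchRaceModel {
    private FetchRaceModel() {}

    /**
     * Installs {@code fetched} at {@code target}.
     * If the target already exists with different contents, this either
     * replaces it (overwrite = true) or refuses with an IOException.
     */
    public static void install(Path fetched, Path target, boolean overwrite)
            throws IOException {
        if (Files.exists(target)) {
            byte[] expected = Files.readAllBytes(fetched);
            byte[] actual = Files.readAllBytes(target);
            if (Arrays.equals(expected, actual)) {
                return; // Target is already complete; nothing to do.
            }
            if (!overwrite) {
                // A partially written target left by a cancelled task lands here,
                // and so does every later task that needs the same file.
                throw new IOException("File " + target
                        + " exists and does not match contents of " + fetched);
            }
            Files.delete(target);
        }
        Files.copy(fetched, target);
    }
}
```

In this model, a truncated target left behind by an interrupted copy makes every later {{install(fetched, target, false)}} call fail, while {{install(fetched, target, true)}} repairs the file.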
There are a few ways to mitigate this:
- Set {{spark.files.overwrite = true}}. We should probably remove or
deprecate this configuration: the only reason that it was added was to work
around an obscure Spark 0.8-era bug where Spark would delete files out of the
driver's CWD when running tasks in local mode. This concern may have been
mitigated by other changes. Regardless, there are many environments where this
overwrite protection can safely be disabled.
- Disable {{spark.files.useFetchCache}}, which should probably be off by
default (see SPARK-8130); this will shorten the window over which the race can
occur.
- Catch InterruptedException and perform cleanup in our file moving / copying
code; this is somewhat tricky to reason about / get right because the right
behavior differs based on whether we're overwriting or creating a new file.
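As a rough sketch of the cleanup idea in the last item, one option is to copy into a temporary sibling file and only move it into place once the copy has finished, deleting the temporary file on any failure. This is an illustration under assumptions, not a proposed patch: {{SafeCopy}} and {{copyAtomically}} are hypothetical names, and the sketch covers only the case where a new file is created. It uses {{try}}/{{finally}} rather than catching {{InterruptedException}} directly, because an interrupted NIO copy typically surfaces as an {{IOException}} subclass.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: install a file without ever exposing a partial copy. */
public final class SafeCopy {
    private SafeCopy() {}

    public static void copyAtomically(Path source, Path target) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // The temporary file lives next to the target so the final move
        // stays on one filesystem.
        Path tmp = Files.createTempFile(dir, target.getFileName() + ".", ".tmp");
        try {
            Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING);
            // If the task was cancelled during the copy, give up before
            // touching the target at all.
            if (Thread.currentThread().isInterrupted()) {
                throw new InterruptedIOException("interrupted while copying " + source);
            }
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; removes the partial copy otherwise.
            Files.deleteIfExists(tmp);
        }
    }
}
```

When the target already exists, whether {{ATOMIC_MOVE}} replaces it is implementation specific, so the overwrite case would need separate handling.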
Given that this can be fixed with conf changes for the cases that I've seen,
I'm not sure that this needs to be a high-priority fix, although I would be
glad to review patches to clean up / audit this code to properly fix this issue.