Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1580#issuecomment-50246267
Actually we have also seen this happen multiple times.
A few of them have been fixed, but not all have been identified.
For example, there is incorrect double-checked locking (DCL) for directory
creation in Spark. The tricky bit is preventing creation of directories while
a shutdown is in progress (either via exit or via a driver message).
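For illustration, here is a minimal sketch of the double-checked pattern I
mean (names and structure are illustrative, not the actual DiskBlockManager
code): check outside the lock for the common case, then re-check under the
lock before creating the directory.

```scala
import java.io.{File, IOException}

// Illustrative double-checked creation of sub-directories: an unsynchronized
// fast-path read, then a re-check under the lock before calling mkdirs(),
// so concurrent writers do not race on creating the same directory.
class SubDirCache(baseDir: File, numSubDirs: Int) {
  private val subDirs = new Array[File](numSubDirs)

  def getSubDir(i: Int): File = {
    val existing = subDirs(i)        // first check, outside the lock
    if (existing != null) {
      existing
    } else {
      subDirs.synchronized {         // second check, under the lock
        val recheck = subDirs(i)
        if (recheck != null) {
          recheck
        } else {
          val dir = new File(baseDir, "%02x".format(i))
          if (!dir.exists() && !dir.mkdirs()) {
            throw new IOException(s"Failed to create directory $dir")
          }
          subDirs(i) = dir
          dir
        }
      }
    }
  }
}
```

The unsynchronized first read relies on safe publication of the stored File,
which is part of why DCL is so easy to get wrong, and nothing here stops a
directory from being created after a shutdown hook has already started
deleting the local dirs.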
The consolidated shuffle bug fix includes some attempts to resolve it, but we
could not identify/nail down all the cases. So we introduced something
similar: retry in case there is a FileNotFoundException when creating the
file stream, but skip the retry if we are going to shut down (so that we
don't mess up shutdown hooks trying to remove directories!); see the sketch
below.
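Just to be concrete (only a sketch; the flag and helper names are made up for
illustration, this is not the actual Spark code):

```scala
import java.io.{File, FileNotFoundException, FileOutputStream}
import java.util.concurrent.atomic.AtomicBoolean

object RetryOnMissingDir {
  // Assumed flag, set from an exit hook or a driver "stop" message.
  val shuttingDown = new AtomicBoolean(false)

  // Retry once if the stream cannot be created (e.g. the parent directory
  // disappeared), but never while shutting down, so we do not recreate
  // directories that the shutdown hooks are busy removing.
  def openForAppend(file: File): FileOutputStream = {
    try {
      new FileOutputStream(file, true)
    } catch {
      case _: FileNotFoundException if !shuttingDown.get() =>
        file.getParentFile.mkdirs()   // recreate the missing directory
        new FileOutputStream(file, true)
    }
  }
}
```

If the flag is set, the FileNotFoundException simply propagates instead of
re-creating the directory under the feet of the shutdown hook.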
Regards,
Mridul
On Fri, Jul 25, 2014 at 11:41 PM, Matei Zaharia <[email protected]>
wrote:
> But the point above was that the code that creates this object goes
> through DiskBlockManager.getFile, which already creates any non-existent
> directories. So I don't think this will be a problem, unless a directory is
> deleted exactly in the instant when we return a File and we start writing.
>
> Reply to this email directly or view it on GitHub
> <https://github.com/apache/spark/pull/1580#issuecomment-50184666>.
>