#2808: createDirectoryIfMissing should be atomic
------------------------------------+---------------------------------------
Reporter: EricKow | Owner: igloo
Type: merge | Status: reopened
Priority: normal | Milestone: 6.10.2
Component: libraries/directory | Version: 6.10.1
Severity: normal | Resolution:
Keywords: | Difficulty: Unknown
Testcase: | Os: Unknown/Multiple
Architecture: Unknown/Multiple |
------------------------------------+---------------------------------------
Comment (by duncan):
Just to clarify for other readers...
`createDirectoryIfMissing` cannot be simultaneously atomic and sensible.
The problem is that POSIX `mkdir` does not distinguish if a directory
already exists vs if a file already exists. We want to fail in one case
and `return ()` in the other.
One way to square this circle is to test for the existence of the
directory separately from the call to `mkdir`.
The original code did the test before. That meant it was non-atomic and a
`createDirectoryIfMissing` raced against itself would sometimes fail with
an exception about the directory already existing.
The current code does not do the test at all. That means that we `return
()` when a file exists and this is not sensible.
The proposed code does the test afterwards. This makes the `mkdir` bit
atomic but we can get a wrong exception if another thread does something
else between the `mkdir` and test. The proposed code goes wrong when
another thread deletes the directory in between the call to
`createDirectory` and `doesDirectoryExist` and in this case throws an
already-exists exception which is a bit confusing. Ideally in this
scenario there would be no exception (though if it were a parent dir that
was deleted then we would expect an exception when it tries to create the
child directory). There are really three cases that we want to distinguish
when `createDirectory` throws a `AlreadyExistsError`:
* a dir exists, in which case we return ();
* a non-directory filesystem object exists, in which case we re-throw
the exception;
* no filesystem object exists, in which case we return ().
It just so happens that we can handle these three cases with just
`doesFileExist` because we want to do the same thing in two cases that
`doesFileExist` identifies. This also relies on the fact that
`doesFileExist` considers all non-directory files to be files, including
symlinks, sockets, fifos, etc. If that ever changes then this trick does
not work. It is more explicit to use the internal `withFileStatus`
function.
Another behavior question is if `createDirectoryIfMissing` should ever
attempt to re-create a directory that it previously created and got
subsequently deleted. The current `createDirectory001` test assumes this
behavior by racing `removeDirectoryRecursive` with
`createDirectoryIfMissing` and expecting no exceptions. If instead we
think that `createDirectoryIfMissing` should create each parent directory
at most once then the test cannot assume never to encounter a does-not-
exist exception. This is because the thread running
`createDirectoryIfMissing` could make a parent dir, the other thread
delete it and the first thread then try to make the child directory inside
the now-deleted parent directory. The proposed code has this behavior.
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/2808#comment:10>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs