#2808: createDirectoryIfMissing should be atomic
------------------------------------+---------------------------------------
    Reporter:  EricKow              |        Owner:  igloo           
        Type:  merge                |       Status:  reopened        
    Priority:  normal               |    Milestone:  6.10.2          
   Component:  libraries/directory  |      Version:  6.10.1          
    Severity:  normal               |   Resolution:                  
    Keywords:                       |   Difficulty:  Unknown         
    Testcase:                       |           Os:  Unknown/Multiple
Architecture:  Unknown/Multiple     |  
------------------------------------+---------------------------------------
Comment (by duncan):

 Just to clarify for other readers...

 `createDirectoryIfMissing` cannot be simultaneously atomic and sensible.
 The problem is that POSIX `mkdir` does not distinguish if a directory
 already exists vs if a file already exists. We want to fail in one case
 and `return ()` in the other.

 One way to square this circle is to test for the existence of the
 directory separately from the call to `mkdir`.

 The original code did the test before. That meant it was non-atomic and a
 `createDirectoryIfMissing` raced against itself would sometimes fail with
 an exception about the directory already existing.

 The current code does not do the test at all. That means that we `return
 ()` when a file exists and this is not sensible.

 The proposed code does the test afterwards. This makes the `mkdir` bit
 atomic but we can get a wrong exception if another thread does something
 else between the `mkdir` and test. The proposed code goes wrong when
 another thread deletes the directory in between the call to
 `createDirectory` and `doesDirectoryExist` and in this case throws an
 already-exists exception which is a bit confusing. Ideally in this
 scenario there would be no exception (though if it were a parent dir that
 was deleted then we would expect an exception when it tries to create the
 child directory). There are really three cases that we want to distinguish
 when `createDirectory` throws a `AlreadyExistsError`:
   * a dir exists, in which case we return ();
   * a non-directory filesystem object exists, in which case we re-throw
 the exception;
   * no filesystem object exists, in which case we return ().

 It just so happens that we can handle these three cases with just
 `doesFileExist` because we want to do the same thing in two cases that
 `doesFileExist` identifies. This also relies on the fact that
 `doesFileExist` considers all non-directory files to be files, including
 symlinks, sockets, fifos, etc. If that ever changes then this trick does
 not work. It is more explicit to use the internal `withFileStatus`
 function.

 Another behavior question is if `createDirectoryIfMissing` should ever
 attempt to re-create a directory that it previously created and got
 subsequently deleted. The current `createDirectory001` test assumes this
 behavior by racing `removeDirectoryRecursive` with
 `createDirectoryIfMissing` and expecting no exceptions. If instead we
 think that `createDirectoryIfMissing` should create each parent directory
 at most once then the test cannot assume never to encounter a does-not-
 exist exception. This is because the thread running
 `createDirectoryIfMissing` could make a parent dir, the other thread
 delete it and the first thread then try to make the child directory inside
 the now-deleted parent directory. The proposed code has this behavior.

-- 
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/2808#comment:10>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler
_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs

Reply via email to