[
https://issues.apache.org/jira/browse/HADOOP-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Devaraj Das updated HADOOP-1513:
--------------------------------
Status: Open (was: Patch Available)
Ok, I realized that what I said in my last comment holds only for a single
mkdir() call, but we are making a mkdirs() call (which internally makes a chain
of mkdir() calls, one for each component in the path). mkdirs() will return
false if any of those mkdir() calls returns false. So here is a case where
breaking up the expression evaluated within the 'if' statement will not solve
the problem:
{noformat}
dir.mkdirs();
if (!dir.exists()) {
  throw new DiskErrorException("can not create directory: "
                               + dir.toString());
}
{noformat}
Suppose two threads/processes (t1 and t2) enter the mkdirs() call, t1 makes the
first few (successful) mkdir() calls, and then t2 gets to run. t2's mkdirs()
returns false immediately, since the first component in the path already exists
(t1 just created it). t2 then goes on to the exists() call, which might also
return false because t1 may not yet have created the entire directory tree.
Thus the exception is thrown, and that is not right.
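To make the interleaving concrete, one possible schedule looks like this
(purely illustrative; the path /a/b/c is hypothetical):
{noformat}
t1: mkdir(/a)      -> true   (t1 creates the first component)
t2: mkdir(/a)      -> false  (component already exists), so t2's mkdirs() returns false
t2: exists(/a/b/c) -> false  (t1 has not created /a/b and /a/b/c yet)
t2: throws DiskErrorException, even though the directory is about to appear
t1: mkdir(/a/b)    -> true
t1: mkdir(/a/b/c)  -> true
{noformat}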
We have to make the above exists() check for each component in the path whose
mkdir() call fails. So we could have a custom implementation of mkdirs(), say
mkdirsExists(), that returns false only if, for some component, both mkdir()
and exists() return false:
{noformat}
boolean mkdirsExists(String path) {
  ...........
  if (!component.mkdir() && !component.exists()) {
    return false;
  }
  ..........
}
{noformat}
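To flesh that out, here is a minimal sketch of such a helper, assuming a
File-based signature and the name mkdirsExists() purely for illustration; the
idea is simply to treat a path component as created if either mkdir() or
exists() succeeds for it. checkDir() would then throw the DiskErrorException
only when this returns false.
{noformat}
import java.io.File;

public class DiskCheckerSketch {
  /**
   * Race-tolerant variant of File.mkdirs(): a path component counts as
   * created if our own mkdir() succeeds OR the component already exists
   * (for example, because another thread/process created it concurrently).
   * Like the pseudocode above, this uses exists(), so a pre-existing
   * regular file with the same name would also pass the check.
   */
  static boolean mkdirsExists(File dir) {
    if (dir == null) {
      return false;
    }
    File parent = dir.getParentFile();
    // Make sure the parent chain exists first; the recursion ends at the
    // root (or at a relative path's first component), whose parent is null.
    if (parent != null && !mkdirsExists(parent)) {
      return false;
    }
    // Fail only if we can neither create this component nor see it.
    return dir.mkdir() || dir.exists();
  }
}
{noformat}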
Makes sense?
> A likely race condition between the creation of a directory and checking for
> its existence in the DiskChecker class
> -------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-1513
> URL: https://issues.apache.org/jira/browse/HADOOP-1513
> Project: Hadoop
> Issue Type: Bug
> Components: fs
> Affects Versions: 0.14.0
> Reporter: Devaraj Das
> Assignee: Devaraj Das
> Priority: Critical
> Fix For: 0.14.0
>
> Attachments: 1513.patch
>
>
> Got this exception in a job run. It looks like the problem is a race
> condition between the creation of a directory and checking for its existence.
> Specifically, the line:
> if (!dir.exists() && !dir.mkdirs()), doesn't seem safe when invoked by
> multiple processes at the same time.
> 2007-06-21 07:55:33,583 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 1
> 2007-06-21 07:55:33,818 WARN org.apache.hadoop.fs.AllocatorPerContext: org.apache.hadoop.util.DiskChecker$DiskErrorException: can not create directory: /export/crawlspace/kryptonite/ddas/dfs/data/tmp
> at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:26)
> at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createPath(LocalDirAllocator.java:211)
> at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:248)
> at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:276)
> at org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite(LocalDirAllocator.java:155)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.newBackupFile(DFSClient.java:1171)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:1136)
> at org.apache.hadoop.dfs.DFSClient.create(DFSClient.java:342)
> at org.apache.hadoop.dfs.DistributedFileSystem$RawDistributedFileSystem.create(DistributedFileSystem.java:145)
> at org.apache.hadoop.fs.ChecksumFileSystem$FSOutputSummer.<init>(ChecksumFileSystem.java:368)
> at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:443)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:254)
> at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:675)
> at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:165)
> at org.apache.hadoop.examples.RandomWriter$Map.map(RandomWriter.java:137)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:189)
> at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1740)
> 2007-06-21 07:55:33,821 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.