Matthew Ho created GOBBLIN-1619:
-----------------------------------

             Summary: WriterUtils.mkdirsWithRecursivePermission contains race 
condition and puts unnecessary load on filesystem
                 Key: GOBBLIN-1619
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1619
             Project: Apache Gobblin
          Issue Type: Bug
            Reporter: Matthew Ho


The current implementation, which recursively calls fs.mkdirs, has the following issues:
 * *Race condition when creating parent directories, causing a FileNotFound 
exception even when the file exists on the file system*

 * *HDFS fs.mkdirs atomically creates missing parent directories. Thus, the 
recursive approach is making unnecessary calls.*

HDFS, which the current FileSystem interface is built upon, guarantees the 
parents will be created. So all FileSystem class implementations should also 
follow this behavior. 
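
As a rough illustration of the two approaches, here is a minimal sketch using java.nio.file on the local file system as a stand-in. It is not Gobblin's code and does not use Hadoop's FileSystem; the class and method names (MkdirsSketch, mkdirsRecursive, mkdirsSingleCall) are hypothetical. The exception raised by the local analog also differs from the FileNotFound failure reported here, but the check-then-create window is the same kind of race. With Hadoop, the single call would be FileSystem.mkdirs(Path, FsPermission).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MkdirsSketch {

  // Recursive check-then-create, one level at a time (hypothetical analog of
  // the pattern described in this issue). Another writer can create the
  // directory between the exists() check and createDirectory(), in which
  // case createDirectory() throws FileAlreadyExistsException.
  static void mkdirsRecursive(Path dir) throws IOException {
    if (dir == null || Files.exists(dir)) {
      return;
    }
    mkdirsRecursive(dir.getParent());
    Files.createDirectory(dir);
  }

  // Single call: the library creates any missing parents and does not treat
  // an already-existing directory as an error.
  static void mkdirsSingleCall(Path dir) throws IOException {
    Files.createDirectories(dir);
  }

  public static void main(String[] args) throws IOException {
    Path base = Files.createTempDirectory("mkdirs-sketch");
    Path target = base.resolve("a").resolve("b").resolve("c");
    mkdirsSingleCall(target);
    mkdirsSingleCall(target); // repeating the call is harmless
    System.out.println(Files.isDirectory(target));
  }
}
```

The single-call version also issues one request per target directory instead of one existence check plus one create per path component, which is the "unnecessary load" part of this issue.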

 

*Note the 
[FileSystem|https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileSystem.html]
 abstract class documentation says the following:*

The behaviour of the filesystem is [specified in the Hadoop 
documentation|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/filesystem/filesystem.html].
 However, the normative specification of the behavior of this class is actually HDFS: 
{color:#de350b}if HDFS does not behave the way these Javadocs or the 
specification in the Hadoop documentations define, assume that the 
documentation is incorrect{color}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
