[ 
https://issues.apache.org/jira/browse/HADOOP-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683330#action_12683330
 ] 

He Yongqiang commented on HADOOP-5438:
--------------------------------------

Yes, it would be more POSIX-like to add this to FileSystem.open(). But 
Hadoop's FileSystem.open() opens an InputStream for reading, while 
FileSystem.create() and FileSystem.append() open an OutputStream for writing, 
so I think it would be easier to add this to create() or append(). 
I think we have two options here:
1) Add a boolean flag to append() indicating whether to create the file if 
it does not exist.
2) Add an enum-like flag type (let's call it Flag or something like that) and 
pass it as a mode parameter to create(), like:
{code}
public abstract FSDataOutputStream create(Path f,
    FsPermission permission,
    // boolean overwrite,   (replaced by the mode bit mask below)
    short mode,
    int bufferSize,
    short replication,
    long blockSize,
    Progressable progress) throws IOException;

static class Flag {
  // Bit masks that can be OR-ed together into the mode passed to create().
  public static final short OVERWRITE = (short) 0x01;
  public static final short APPEND = (short) 0x02;
  private final short mode;

  public Flag(short mode) {
    this.mode = mode;
  }

  public boolean isOverwrite() {
    return (this.mode & OVERWRITE) != 0;
  }

  public boolean isAppend() {
    return (this.mode & APPEND) != 0;
  }
}
{code}
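
For comparison, the same two flags could also be expressed as a real Java enum combined through EnumSet instead of a short bit mask. This is only a rough sketch: the names CreateFlag and CreateFlagDemo are made up for illustration and are not part of any patch attached to this issue, and rejecting OVERWRITE together with APPEND is my assumption about the intended semantics.

```java
import java.util.EnumSet;

// Hypothetical names for illustration only; not part of the proposed patch.
enum CreateFlag {
    OVERWRITE,  // replace the file if it already exists
    APPEND      // append to the file if it already exists
}

final class CreateFlagDemo {
    // Assumption: OVERWRITE and APPEND contradict each other, so reject the pair early.
    static void validate(EnumSet<CreateFlag> flags) {
        if (flags.contains(CreateFlag.OVERWRITE) && flags.contains(CreateFlag.APPEND)) {
            throw new IllegalArgumentException("OVERWRITE and APPEND are mutually exclusive");
        }
    }

    public static void main(String[] args) {
        EnumSet<CreateFlag> flags = EnumSet.of(CreateFlag.APPEND);
        validate(flags);
        System.out.println(flags.contains(CreateFlag.APPEND));    // true
        System.out.println(flags.contains(CreateFlag.OVERWRITE)); // false
    }
}
```

An EnumSet keeps the call site readable and type-safe, at the cost of changing the create() signature in the same way the short mode would.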

1) and 2) have the same effect. 2) is more POSIX-like, but almost every 
supported file system already implements create(), so it needs more 
modifications to the code.
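
As a point of reference for what "POSIX-like" means here, the JDK's own java.nio.file API already takes open options on a single call, so one call covers both the "file is missing" and "file exists" cases. The sketch below uses only the local file system, not Hadoop; the class name OpenOrAppendDemo and the helper appendLine are made up for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; it illustrates the semantics on a local file only.
final class OpenOrAppendDemo {
    // Opens the file for appending, creating it first if it is missing.
    static void appendLine(Path file, String line) throws IOException {
        try (OutputStream out = Files.newOutputStream(file,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempDirectory("demo").resolve("log.txt");
        appendLine(file, "first");   // file does not exist yet, so it is created
        appendLine(file, "second");  // file exists, so the line is appended
        System.out.println(Files.readAllLines(file)); // [first, second]
    }
}
```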

With 1), we only need to add a few lines to FSNamesystem's startFileInternal, 
like this:

{code}
 if (append) {
        if (myFile == null) {
+           if (create) {
+              // create the file
+              startFileInternal(src, permissions, holder, clientMachine, false,
+                                false, replication, blockSize);
+           } else {
               throw new FileNotFoundException("failed to append to non-existent file "
                                       + src + " on client " + clientMachine);
             }
        } else if (myFile.isDirectory()) {
          throw new IOException("failed to append to directory " + src 
                                +" on client " + clientMachine);
        }
      } else if (!dir.isValidToCreate(src)) {
        if (overwrite) {
          delete(src, true);
        } else {
          throw new IOException("failed to create file " + src 
                                +" on client " + clientMachine
                                +" either because the filename is invalid or the file exists");
        }
      }
{code}
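
To make the control flow of option 1 easier to discuss, here is a toy model of the same branches against an in-memory map. It is a sketch under heavy assumptions: ToyNamespace and its startFile method are hypothetical, there are no leases, permissions, or directory checks, and a StringBuilder stands in for the file contents.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Toy in-memory namespace for illustration; this is not the real FSNamesystem.
final class ToyNamespace {
    private final Map<String, StringBuilder> files = new HashMap<>();

    // Returns the buffer to write to, mirroring the append/create/overwrite branches.
    StringBuilder startFile(String src, boolean append, boolean create, boolean overwrite)
            throws IOException {
        StringBuilder existing = files.get(src);
        if (append) {
            if (existing == null) {
                if (!create) {
                    throw new FileNotFoundException(
                        "failed to append to non-existent file " + src);
                }
                // Option 1: fall back to a plain create instead of failing.
                return startFile(src, false, false, false);
            }
            return existing;
        }
        if (existing != null && !overwrite) {
            throw new IOException("failed to create file " + src + " because the file exists");
        }
        // Either the file is new or overwrite was requested: start from empty contents.
        StringBuilder fresh = new StringBuilder();
        files.put(src, fresh);
        return fresh;
    }
}
```

With create set to true, the caller no longer needs the exists() round trip before choosing between create() and append().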

Please comment on which option we should choose, or suggest other options.

> Merge FileSystem.create and FileSystem.append
> ---------------------------------------------
>
>                 Key: HADOOP-5438
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5438
>             Project: Hadoop Core
>          Issue Type: Improvement
>            Reporter: He Yongqiang
>
> Currently, when a user wants to modify a file, the user first calls exists() 
> to check whether the file is already there, and then calls create() or 
> append() depending on the result.
> The code looks like:
> {code}
> FSDataOutputStream out_1 = null;
> if (fs.exists(path_1))
>    out_1 = fs.append(path_1);
> else
>    out_1 = fs.create(path_1);
> {code}
> On the performance side, this involves two RPCs. On the ease-of-use side, it 
> is not very convenient compared with the traditional open interface.
> It gets more complicated if an overwrite parameter is specified. I do not 
> know whether there is a bug in 'overwrite' in 0.19, but sometimes it takes a 
> long time for an overwriting create to return. So I made the file-writing 
> code with the overwrite parameter work like this:
> {code}
> boolean exists = fs.exists(name);
> if (overwrite) {
>   if (exists)
>     fs.delete(name, true);
>   this.out = fs.create(name, overwrite, bufferSize, replication,
>                        blockSize, progress);
>   this.currentRowID = 0;
> } else {
>   if (!exists)
>     this.out = fs.create(name, overwrite, bufferSize,
>                          replication, blockSize, progress);
>   else
>     this.out = fs.append(name, bufferSize, progress);
> }
> {code}
> Some of these statements are redundant, especially the delete(). But without 
> deleting first, the overwrite takes a long time to return.
> BTW, I will create another issue about the overwrite problem. If it is not a 
> bug at all, or is a duplicate, someone please close it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
