[ 
https://issues.apache.org/jira/browse/HADOOP-12756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15323487#comment-15323487
 ] 

Kai Zheng commented on HADOOP-12756:
------------------------------------

I've looked at the patch; some comments on the code:

1. There are lots of configuration items introduced in {{Constants}}. It would 
be great if we could follow the pattern used in {{CommonConfigurationKeys}} to 
name the property keys. For example, in
{code}
 // Time until we give up on a connection to oss
 public static final String SOCKET_TIMEOUT = "fs.oss.connection.timeout";
 public static final int DEFAULT_SOCKET_TIMEOUT = 200000;
{code}
SOCKET_TIMEOUT -> SOCKET_TIMEOUT_KEY
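A small self-contained sketch of what that renaming might look like; the values are the ones from the patch, and the {{_KEY}} suffix follows the {{CommonConfigurationKeys}} convention (the class shape here is illustrative, not the patch's actual file):
{code}
final class Constants {
  private Constants() {
  }

  // Time until we give up on a connection to oss; the property name keeps
  // the "fs.oss." prefix, only the Java constant gains the _KEY suffix.
  public static final String SOCKET_TIMEOUT_KEY = "fs.oss.connection.timeout";
  public static final int DEFAULT_SOCKET_TIMEOUT = 200000;
}
{code}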

2. As discussed above, it would be better to rename: org.apache.hadoop.fs.oss -> 
org.apache.hadoop.fs.aliyun.oss, and OSSFileSystem -> AliyunOssFileSystem.

3. In the delete operation:
{code}
      if (!recursive) {
        FileStatus[] statuses = listStatus(status.getPath());
        if (statuses.length > 0) {
          throw new IOException("Cannot remove " + path + ": Is a directory!");
        } else {
          // Delete empty directory without '-r'
          ossClient.deleteObject(bucketName, key);
          statistics.incrementWriteOps(1);
        }
      } else {
         // A large block here 
      }
{code}
The exception message could be more specific, like: "Cannot remove the 
directory, it's not empty". 
And the comment "Delete empty directory without '-r'" should be moved up, next 
to the delete call it describes.
The large block of code that deletes the directory with '-r' would be better 
moved to a separate method.
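To illustrate the restructuring, a self-contained sketch; the helper names ({{listChildren}}, {{deleteRecursively}}) and the minimal client stub are assumptions for illustration only, not the patch's actual code:
{code}
class DeleteSketch {
  // Minimal stand-ins so the sketch compiles on its own.
  interface OssClient {
    void deleteObject(String bucket, String key);
  }

  static final java.util.List<String> deleted = new java.util.ArrayList<>();
  static final OssClient ossClient = (bucket, key) -> deleted.add(key);
  static final String bucketName = "demo-bucket";

  // Children of the directory key; an empty array means an empty directory.
  static String[] listChildren(String key) {
    return new String[0];
  }

  static boolean delete(String key, boolean recursive)
      throws java.io.IOException {
    if (!recursive) {
      String[] children = listChildren(key);
      if (children.length > 0) {
        // More specific than "Is a directory!".
        throw new java.io.IOException(
            "Cannot remove directory " + key + ": it is not empty");
      }
      // Delete empty directory without '-r'.
      ossClient.deleteObject(bucketName, key);
      return true;
    }
    // The large '-r' block becomes its own method.
    return deleteRecursively(key);
  }

  static boolean deleteRecursively(String key) {
    for (String child : listChildren(key)) {
      ossClient.deleteObject(bucketName, child);
    }
    ossClient.deleteObject(bucketName, key);
    return true;
  }
}
{code}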

{code}
    //TODO: optimize logic here
    try {
      Path pPath = status.getPath().getParent();
      FileStatus pStatus = getFileStatus(pPath);
      if (pStatus.isDirectory()) {
        return true;
      } else {
        throw new IOException("Path " + pPath +
            " is assumed to be a directory!");
      }
    } catch (FileNotFoundException fnfe) {
      return mkdir(bucketName, pathToKey(status.getPath().getParent()));
    }
{code}
Could you explain this post-processing logic that runs after the file/directory 
object has been deleted successfully? Why might the parent directory not be 
found, so that we need to do the mkdir here?

4. Do you want to rename {{validatePath}} to {{validatePathForMkdir}}, as 
indicated by the thrown exception:
{code}
          throw new FileAlreadyExistsException(String.format(
              "Can't make directory for path '%s' since it is a file.", fPart));
{code}
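A sketch of the renamed check; only the rename and the exception are from the patch, while the component traversal and the {{isFile}} stand-in here are hypothetical simplifications:
{code}
class MkdirSketch {
  // Hypothetical stand-in: whether this path component exists as a file.
  static boolean isFile(String part) {
    return part.endsWith(".log");
  }

  // Renamed from validatePath: the check is specific to mkdirs(), as the
  // exception message suggests.
  static void validatePathForMkdir(String[] parts)
      throws java.nio.file.FileAlreadyExistsException {
    for (String fPart : parts) {
      if (isFile(fPart)) {
        throw new java.nio.file.FileAlreadyExistsException(String.format(
            "Can't make directory for path '%s' since it is a file.", fPart));
      }
    }
  }
}
{code}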

5. Please also apply similar checks to the code in other operations, such as 
{{rename}}.

6. Could we move the Aliyun OSS-specific logic into a utility class, to 
simplify the code in the file system implementation and the input/output 
streams?
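For example, a small self-contained sketch of the kind of helper that could live in such a utility class; the class name {{AliyunOSSUtils}} and this particular method are assumptions for illustration, not part of the patch:
{code}
final class AliyunOSSUtils {
  private AliyunOSSUtils() {
  }

  // Convert an absolute path into an OSS object key by stripping the
  // leading '/', so "/user/data" becomes the key "user/data".
  static String pathToKey(String absolutePath) {
    return absolutePath.startsWith("/")
        ? absolutePath.substring(1) : absolutePath;
  }
}
{code}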

The tests look complete and great. Thanks!



> Incorporate Aliyun OSS file system implementation
> -------------------------------------------------
>
>                 Key: HADOOP-12756
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12756
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs
>    Affects Versions: 2.8.0
>            Reporter: shimingfei
>            Assignee: shimingfei
>         Attachments: HADOOP-12756-v02.patch, HADOOP-12756.003.patch, 
> HADOOP-12756.004.patch, HCFS User manual.md, OSS integration.pdf, OSS 
> integration.pdf
>
>
> Aliyun OSS is widely used among China’s cloud users, but currently it is not 
> easy to access data stored on OSS from a user’s Hadoop/Spark application, 
> because Hadoop has no native support for OSS.
> This work aims to integrate Aliyun OSS with Hadoop. With simple 
> configuration, Spark/Hadoop applications can read/write data from OSS 
> without any code change, narrowing the gap between a user’s application and 
> its data storage, as has been done for S3 in Hadoop.


