[
https://issues.apache.org/jira/browse/HADOOP-12756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15323487#comment-15323487
]
Kai Zheng commented on HADOOP-12756:
------------------------------------
Have looked at the patch, some comments about the codes:
1. There are lots of configuration items introduced in {{Constants}}. It would
be great if we could follow the pattern used in {{CommonConfigurationKeys}} to
name the property keys. For example, in
{code}
// Time until we give up on a connection to oss
public static final String SOCKET_TIMEOUT = "fs.oss.connection.timeout";
public static final int DEFAULT_SOCKET_TIMEOUT = 200000;
{code}
SOCKET_TIMEOUT -> SOCKET_TIMEOUT_KEY
2. As discussed above, better to: org.apache.hadoop.fs.oss ->
org.apache.hadoop.fs.aliyun.oss, OSSFileSystem -> AliyunOssFileSystem.
3. In the delete operation:
{code}
if (!recursive) {
FileStatus[] statuses = listStatus(status.getPath());
if (statuses.length > 0) {
throw new IOException("Cannot remove " + path + ": Is a directory!");
} else {
// Delete empty directory without '-r'
ossClient.deleteObject(bucketName, key);
statistics.incrementWriteOps(1);
}
} else {
// A large block here
}
{code}
The exception message could be more specific, like: "Cannot remote the
directory, it's not empty".
And the comment "Delete empty directory without '-r'" should be a little above.
The large block of codes for delete the directory with '-r' would be better to
move to a separate function.
{code}
//TODO: optimize logic here
try {
Path pPath = status.getPath().getParent();
FileStatus pStatus = getFileStatus(pPath);
if (pStatus.isDirectory()) {
return true;
} else {
throw new IOException("Path " + pPath +
" is assumed to be a directory!");
}
} catch (FileNotFoundException fnfe) {
return mkdir(bucketName, pathToKey(status.getPath().getParent()));
}
{code}
Any explanation about this post processing logic after the File/Directory
object is deleted successfully? Why the parent directory could be not found,
and then we need to do the mkdir here?
4. Do you want to rename {{validatePath}} to {{validatePathForMkdir}}, as
indicated by the thrown exception:
{code}
throw new FileAlreadyExistsException(String.format(
"Can't make directory for path '%s' since it is a file.", fPart));
{code}
5. Please also check codes in other operations like {{rename}}, similarly.
6. Could we move the very AliyunOss specific logic codes to an utility class to
simplify the codes in the file system impl and input/output stream?
The tests look complete and great. Thanks!
> Incorporate Aliyun OSS file system implementation
> -------------------------------------------------
>
> Key: HADOOP-12756
> URL: https://issues.apache.org/jira/browse/HADOOP-12756
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs
> Affects Versions: 2.8.0
> Reporter: shimingfei
> Assignee: shimingfei
> Attachments: HADOOP-12756-v02.patch, HADOOP-12756.003.patch,
> HADOOP-12756.004.patch, HCFS User manual.md, OSS integration.pdf, OSS
> integration.pdf
>
>
> Aliyun OSS is widely used among China’s cloud users, but currently it is not
> easy to access data laid on OSS storage from user’s Hadoop/Spark application,
> because of no original support for OSS in Hadoop.
> This work aims to integrate Aliyun OSS with Hadoop. By simple configuration,
> Spark/Hadoop applications can read/write data from OSS without any code
> change. Narrowing the gap between user’s APP and data storage, like what have
> been done for S3 in Hadoop
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]