[ https://issues.apache.org/jira/browse/HADOOP-9377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
James Yu updated HADOOP-9377:
-----------------------------
Attachment: HADOOP-9377.diff
Suggested code fix.
svn diff
Index: src/main/java/org/apache/hadoop/fs/ftp/FTPFileSystem.java
===================================================================
--- src/main/java/org/apache/hadoop/fs/ftp/FTPFileSystem.java (revision 1453599)
+++ src/main/java/org/apache/hadoop/fs/ftp/FTPFileSystem.java (working copy)
@@ -380,7 +380,7 @@
FTPFile[] ftpFiles = client.listFiles(absolute.toUri().getPath());
FileStatus[] fileStats = new FileStatus[ftpFiles.length];
for (int i = 0; i < ftpFiles.length; i++) {
- fileStats[i] = getFileStatus(ftpFiles[i], absolute);
+ fileStats[i] = getFileStatus(ftpFiles[i], absolute, workDir);
}
return fileStats;
}
@@ -422,7 +422,7 @@
if (ftpFiles != null) {
for (FTPFile ftpFile : ftpFiles) {
if (ftpFile.getName().equals(file.getName())) { // file found in dir
- fileStat = getFileStatus(ftpFile, parentPath);
+ fileStat = getFileStatus(ftpFile, parentPath, workDir);
break;
}
}
@@ -442,7 +442,7 @@
* @param parentPath
* @return FileStatus
*/
- private FileStatus getFileStatus(FTPFile ftpFile, Path parentPath) {
+ private FileStatus getFileStatus(FTPFile ftpFile, Path parentPath, Path workDir) {
long length = ftpFile.getSize();
boolean isDir = ftpFile.isDirectory();
int blockReplication = 1;
@@ -456,7 +456,7 @@
String group = ftpFile.getGroup();
Path filePath = new Path(parentPath, ftpFile.getName());
return new FileStatus(length, isDir, blockReplication, blockSize, modTime,
- accessTime, permission, user, group, filePath.makeQualified(this));
+ accessTime, permission, user, group, filePath.makeQualified(this.getUri(), workDir));
}
> FTPFileSystem.listStatus() runs very slow, due to inappropriate call of
> filePath.makeQualified
> ----------------------------------------------------------------------------------------------
>
> Key: HADOOP-9377
> URL: https://issues.apache.org/jira/browse/HADOOP-9377
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Affects Versions: 2.0.3-alpha
> Reporter: James Yu
> Attachments: HADOOP-9377.diff
>
>
> FTPFileSystem.listStatus() calls
> getFileStatus(ftpFiles[i], absolute) calls
> new FileStatus(....) calls
> filePath.makeQualified(...) calls
> fs.getWorkingDirectory() calls
> getHomeDirectory()
> which creates a new FTP connection every time just to get the working directory.
> As a result, FTPFileSystem.listStatus() takes a long time to run (on average
> 3-6 seconds per file in my test).
> I have attached a suggested fix to FTPFileSystem.java, only 4 lines of change.
> With the fix applied, the slowness is gone.
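
The idea behind the fix can be illustrated without Hadoop. The following is a minimal sketch, not the FTPFileSystem code: `FakeFtpServer`, `lookupWorkingDirectory`, `qualify`, `listSlow` and `listFast` are hypothetical names, and the connection counter only stands in for the cost of opening an FTP connection. It assumes the working directory does not change while a listing is in progress, so it can be resolved once and passed down instead of being looked up for every file.

```java
import java.util.ArrayList;
import java.util.List;

class WorkDirHoistSketch {

    /** Hypothetical stand-in for the remote server; counts costly round trips. */
    static class FakeFtpServer {
        int connections = 0;

        /** Models a lookup that opens a new connection on every call. */
        String lookupWorkingDirectory() {
            connections++;
            return "/home/user";
        }
    }

    /** Resolves a relative name against the working directory. */
    static String qualify(String workDir, String name) {
        return name.startsWith("/") ? name : workDir + "/" + name;
    }

    /** Unpatched shape: the working directory is looked up once per file. */
    static List<String> listSlow(FakeFtpServer server, String[] names) {
        List<String> qualified = new ArrayList<>();
        for (String name : names) {
            String workDir = server.lookupWorkingDirectory(); // one connection per file
            qualified.add(qualify(workDir, name));
        }
        return qualified;
    }

    /** Patched shape: the working directory is resolved once and passed down. */
    static List<String> listFast(FakeFtpServer server, String[] names) {
        String workDir = server.lookupWorkingDirectory(); // single lookup
        List<String> qualified = new ArrayList<>();
        for (String name : names) {
            qualified.add(qualify(workDir, name));
        }
        return qualified;
    }

    public static void main(String[] args) {
        String[] names = {"a.txt", "b.txt", "c.txt"};

        FakeFtpServer slowServer = new FakeFtpServer();
        listSlow(slowServer, names);
        System.out.println("per-file lookups: " + slowServer.connections);

        FakeFtpServer fastServer = new FakeFtpServer();
        listFast(fastServer, names);
        System.out.println("hoisted lookups: " + fastServer.connections);
    }
}
```

Both variants return the same qualified names; only the number of lookups differs (one per file versus one per listing). In the attached diff, the already known `workDir` plays the role of the hoisted value.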