[
https://issues.apache.org/jira/browse/HADOOP-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595335#action_12595335
]
Doug Cutting commented on HADOOP-3246:
--------------------------------------
Overall this looks very good! A few minor suggestions:
- We should accept the username and password in the URI, as
ftp://user:password@host/, with the password optional (a rough sketch of the
URI parsing follows this list).
- It would be nice if folks could specify different usernames and passwords
for different hosts in their configuration, perhaps with properties like
ftp.user.host.example.com and ftp.password.host.example.com (sketch below).
- Rather than keeping a connection open in the FileSystem instance, perhaps we
should open and close a new connection for each operation (file read, write,
rename, etc.)? FileSystem.java caches FileSystem instances forever, and an FTP
connection might time out. Also, the working-directory state of the connection
makes this not thread-safe, which a connection per request would fix (sketch
below).
- It would be best if the unit tests ran standalone, without requiring an
external FTP server. We might include the Mina FTP server just for testing?
We could put the jars somewhere in src/test (a test-server sketch follows).
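
Sketch 1, URI credentials: a minimal illustration of pulling an optional
user:password out of the URI with java.net.URI; the class and method names
are placeholders for illustration, not part of the patch.

  import java.net.URI;

  public class FtpUriCredentials {
    // Returns {user, password}; falls back to anonymous when no user info is
    // present, and to an empty password when only a user is given.
    public static String[] credentials(URI uri) {
      String userInfo = uri.getUserInfo();  // "user" or "user:password", or null
      if (userInfo == null) {
        return new String[] { "anonymous", "" };
      }
      int colon = userInfo.indexOf(':');
      if (colon == -1) {
        return new String[] { userInfo, "" };
      }
      return new String[] { userInfo.substring(0, colon),
                            userInfo.substring(colon + 1) };
    }
  }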
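
Sketch 2, per-host configuration: how the per-host properties could be looked
up from the Hadoop Configuration; the property names follow the suggestion
above, everything else is illustrative. Credentials carried in the URI (sketch
1) could take precedence over these.

  import org.apache.hadoop.conf.Configuration;

  public class FtpCredentialLookup {
    // "ftp.user.<host>" / "ftp.password.<host>", e.g. ftp.user.host.example.com
    public static String userFor(String host, Configuration conf) {
      return conf.get("ftp.user." + host, "anonymous");
    }
    public static String passwordFor(String host, Configuration conf) {
      return conf.get("ftp.password." + host, "");
    }
  }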
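
Sketch 3, connection per request: assuming the patch sits on top of Commons
Net's FTPClient, each FileSystem operation could connect, act, and disconnect,
which sidesteps server-side idle timeouts and the shared working-directory
state. Method names and the rename example are illustrative only.

  import java.io.IOException;
  import org.apache.commons.net.ftp.FTP;
  import org.apache.commons.net.ftp.FTPClient;
  import org.apache.commons.net.ftp.FTPReply;

  public class FtpPerRequest {
    // Open and log in to a fresh connection for a single operation.
    static FTPClient connect(String host, int port, String user, String password)
        throws IOException {
      FTPClient client = new FTPClient();
      client.connect(host, port);
      if (!FTPReply.isPositiveCompletion(client.getReplyCode())
          || !client.login(user, password)) {
        client.disconnect();
        throw new IOException("Login to " + host + " failed");
      }
      client.setFileType(FTP.BINARY_FILE_TYPE);
      return client;
    }

    static void disconnect(FTPClient client) throws IOException {
      try {
        client.logout();
      } finally {
        client.disconnect();
      }
    }

    // A rename, for example, becomes connect / rename / disconnect.
    static boolean rename(String host, int port, String user, String password,
                          String src, String dst) throws IOException {
      FTPClient client = connect(host, port, user, password);
      try {
        return client.rename(src, dst);
      } finally {
        disconnect(client);
      }
    }
  }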
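
Sketch 4, standalone tests: roughly what starting an embedded Apache MINA
FtpServer from the test setup could look like, so no external server is
needed. The factory calls below are assumptions about the ftpserver API and
would need to match whatever jars we drop under src/test.

  import org.apache.ftpserver.FtpServer;
  import org.apache.ftpserver.FtpServerFactory;
  import org.apache.ftpserver.ftplet.FtpException;
  import org.apache.ftpserver.ftplet.UserManager;
  import org.apache.ftpserver.listener.ListenerFactory;
  import org.apache.ftpserver.usermanager.PropertiesUserManagerFactory;
  import org.apache.ftpserver.usermanager.impl.BaseUser;

  public class EmbeddedFtpServer {
    // Starts a server on the given port with a single "test"/"test" account
    // rooted at homeDir; the caller stops it in tearDown().
    public static FtpServer start(int port, String homeDir) throws FtpException {
      FtpServerFactory serverFactory = new FtpServerFactory();

      ListenerFactory listenerFactory = new ListenerFactory();
      listenerFactory.setPort(port);
      serverFactory.addListener("default", listenerFactory.createListener());

      // In-memory user manager holding one test account (assumes
      // PropertiesUserManagerFactory works without a backing file).
      UserManager userManager = new PropertiesUserManagerFactory().createUserManager();
      BaseUser user = new BaseUser();
      user.setName("test");
      user.setPassword("test");
      user.setHomeDirectory(homeDir);
      userManager.save(user);
      serverFactory.setUserManager(userManager);

      FtpServer server = serverFactory.createServer();
      server.start();
      return server;
    }
  }
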
> FTP client over HDFS
> --------------------
>
> Key: HADOOP-3246
> URL: https://issues.apache.org/jira/browse/HADOOP-3246
> Project: Hadoop Core
> Issue Type: New Feature
> Components: util
> Affects Versions: 0.16.3
> Reporter: Ankur
> Priority: Minor
> Attachments: ftpFileSystem.patch
>
>
> An FTP client that writes content directly into HDFS allows data from FTP
> servers to be stored in HDFS without first copying the data locally and then
> uploading it. The benefits are apparent from an administrative perspective,
> as large datasets can be pulled from FTP servers with minimal human
> intervention.