[
https://issues.apache.org/jira/browse/HDFS-3598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563153#comment-13563153
]
Konstantin Shvachko commented on HDFS-3598:
-------------------------------------------
Let me summarise the ideas expressed here.
The requirement is to have a common interface for WebHDFS and
DistributedFileSystem, so that any application written for HDFS
(DistributedFileSystem) would work for WebHDFS as well by just replacing the
URI scheme.
{code}
hadoop fs -ls hdfs://nn1/user/shv
hadoop fs -ls webhdfs://nn2/user/shv
{code}
Both commands should work the same.
So far the common interface has been FileSystem. So in order to satisfy the
above requirement, concat() should be added to FileSystem. Then all other
subclasses of FileSystem need to implement concat(). Per Harsh's observation,
LocalFileSystem in particular will have to implement concat(), and the LFS
implementation will work on files of arbitrary size, as opposed to the
restricted DFS one.
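For illustration, a local concat() could be as simple as appending the sources to the target and then removing them. The sketch below uses plain java.nio rather than Hadoop classes. The class name LocalConcatSketch is hypothetical, and deleting the sources after the append is an assumption made here to mirror what DFS concat does.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; not the actual LocalFileSystem code.
final class LocalConcatSketch {

    // Appends each source to the existing target, in order, then deletes the sources.
    static void concat(Path target, Path... sources) throws IOException {
        for (Path src : sources) {
            if (Files.isSameFile(src, target)) {
                throw new IllegalArgumentException("source equals target: " + src);
            }
        }
        // No CREATE option: the target must already exist.
        try (OutputStream out = Files.newOutputStream(
                target, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            for (Path src : sources) {
                Files.copy(src, out); // no size or block alignment restriction
            }
        }
        for (Path src : sources) {
            Files.delete(src);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("concat-sketch");
        Path trg = Files.write(dir.resolve("trg"), "one ".getBytes(StandardCharsets.UTF_8));
        Path src = Files.write(dir.resolve("src"), "two".getBytes(StandardCharsets.UTF_8));
        concat(trg, src);
        System.out.println(new String(Files.readAllBytes(trg), StandardCharsets.UTF_8));
    }
}
```

Nothing here depends on file or block sizes, which is the behavioural difference from DFS that Harsh pointed out.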
Is this confusing? Well, yes and no. Yes, because the behaviour is different,
indeed. No, because implementations in different file systems can differ in
their restrictions and semantics. We already have that in LFS with
getFileBlockLocations() returning "localhost:50010", which could just as well
have been "IDontKnow" or "OverTheHills:ThroughTheWoods".
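To make the "same method, different restrictions" point concrete, here is a rough in-memory sketch. All types in it (SketchFileSystem, UnrestrictedFs, BlockAlignedFs) are hypothetical stand-ins, not Hadoop classes, and the full-block rule in the DFS-like variant is an assumed restriction chosen only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a common interface; not Hadoop's FileSystem.
abstract class SketchFileSystem {
    protected final Map<String, byte[]> files = new HashMap<>();

    void create(String path, byte[] data) { files.put(path, data.clone()); }
    boolean exists(String path) { return files.containsKey(path); }
    byte[] read(String path) { return files.get(path).clone(); }

    // Shared contract: append srcs to trg in order, then remove srcs.
    final void concat(String trg, String... srcs) {
        if (!exists(trg)) throw new IllegalArgumentException("missing target: " + trg);
        for (String src : srcs) {
            if (!exists(src) || src.equals(trg)) {
                throw new IllegalArgumentException("bad source: " + src);
            }
        }
        checkRestrictions(trg, srcs); // the only part that differs per implementation
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] head = files.get(trg);
        out.write(head, 0, head.length);
        for (String src : srcs) {
            byte[] data = files.get(src);
            out.write(data, 0, data.length);
        }
        for (String src : srcs) files.remove(src);
        files.put(trg, out.toByteArray());
    }

    protected abstract void checkRestrictions(String trg, String... srcs);

    public static void main(String[] args) {
        SketchFileSystem lfs = new UnrestrictedFs();
        lfs.create("/t", new byte[] {1});
        lfs.create("/s", new byte[] {2, 3});
        lfs.concat("/t", "/s");
        System.out.println(lfs.read("/t").length);
    }
}

// LFS-like: accepts files of any size.
final class UnrestrictedFs extends SketchFileSystem {
    @Override protected void checkRestrictions(String trg, String... srcs) { }
}

// DFS-like: the full-block rule below is an assumed restriction for illustration.
final class BlockAlignedFs extends SketchFileSystem {
    private final int blockSize;

    BlockAlignedFs(int blockSize) { this.blockSize = blockSize; }

    // Target and every source except the last must consist of full blocks.
    @Override protected void checkRestrictions(String trg, String... srcs) {
        requireFullBlocks(trg);
        for (int i = 0; i < srcs.length - 1; i++) requireFullBlocks(srcs[i]);
    }

    private void requireFullBlocks(String path) {
        if (files.get(path).length % blockSize != 0) {
            throw new IllegalArgumentException("not block aligned: " + path);
        }
    }
}
```

A caller written against SketchFileSystem works with either variant; only the set of inputs each one accepts differs.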
A new API as Nicholas proposes (HadoopDistributedFileSystem) would work, but it
will separate WebHDFS and DFS from other file systems. I am guessing HttpFS
will also want to extend that, then LFS will follow suit, and we will end
up in the same place, having HadoopDistributedFileSystem essentially
equivalent to FileSystem.
I think we should add concat() to FileSystem. People will start using it, and
somebody might try implementing full concatenation. Exposing a restricted API
may be beneficial in this case as opposed to hiding it, especially since the
restricted version is pretty powerful by itself.
Does this make sense to you guys?
> WebHDFS: support file concat
> ----------------------------
>
> Key: HDFS-3598
> URL: https://issues.apache.org/jira/browse/HDFS-3598
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: webhdfs
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: Plamen Jeliazkov
>
> In trunk and branch-2, DistributedFileSystem has a new
> concat(Path trg, Path[] psrcs) method. WebHDFS should support it.