[ 
https://issues.apache.org/jira/browse/HDFS-3598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563153#comment-13563153
 ] 

Konstantin Shvachko commented on HDFS-3598:
-------------------------------------------

Let me summarise the ideas expressed here.
The requirement is to have a common interface for WebHDFS and 
DistributedFileSystem, so that any application written for HDFS 
(DistributedFileSystem) also works with WebHDFS just by replacing the 
URI scheme.
{code}
hadoop fs -ls hdfs://nn1/user/shv
hadoop fs -ls webhdfs://nn2/user/shv
{code}
should work the same.
So far the common interface has been FileSystem. So in order to satisfy the 
above requirement, concat() should be added to FileSystem. Then all other 
subclasses of FileSystem need to implement concat(). Per Harsh's observation, 
LocalFileSystem in particular will have to implement concat(), and the LFS 
implementation will work on files of arbitrary size, unlike the DFS one.
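
To illustrate the LFS point, here is a minimal sketch of what an unrestricted 
local concat could look like. It uses plain java.nio rather than Hadoop 
classes; LocalConcatSketch and its concat() signature are hypothetical names 
made up for this example, and deleting the sources after appending is my 
assumption about the intended semantics, not the actual LFS code.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Hadoop: concat on a local file system.
final class LocalConcatSketch {
    private LocalConcatSketch() {}

    /**
     * Appends every source to the end of target, in order, then deletes the
     * sources. There is no size restriction on any of the files.
     * The target must already exist (APPEND is used without CREATE).
     */
    static void concat(Path target, Path... sources) throws IOException {
        for (Path src : sources) {
            if (src.equals(target)) {
                throw new IllegalArgumentException("source equals target: " + src);
            }
        }
        try (OutputStream out =
                 Files.newOutputStream(target, StandardOpenOption.APPEND)) {
            for (Path src : sources) {
                Files.copy(src, out); // streams all bytes of src into target
            }
        }
        // Assumption: sources disappear once their content is in the target.
        for (Path src : sources) {
            Files.delete(src);
        }
    }
}
```

Sources are removed only after all appends have succeeded, so a failed copy 
leaves every source file in place.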

Is this confusing? Well, yes and no. Yes, because the behaviour is indeed 
different. No, because implementations in different file systems can differ in 
their restrictions and semantics. We already have that in LFS with 
getFileBlockLocations() returning "localhost:50010", which could just as well 
have been "IDontKnow" or "OverTheHills:ThroughTheWoods".

A new API as Nicholas proposes (HadoopDistributedFileSystem) would work, but it 
will separate WebHDFS and DFS from other file systems. I am guessing HttpFS 
will also want to extend it, then LFS will follow suit, and we will end 
up in the same place, with HadoopDistributedFileSystem essentially 
equivalent to FileSystem.

I think we should add concat() to FileSystem. People will start using it, and 
somebody might try implementing full concatenation. Exposing a restricted API 
may be beneficial in this case, as opposed to hiding it, especially since the 
restricted version is pretty powerful by itself.
Does this make sense to you guys?
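
A rough sketch of the shape I have in mind, using hypothetical stand-in 
classes rather than the real Hadoop ones: the common base class declares 
concat(), file systems that cannot support it keep a default that refuses, 
and those that can override it with whatever restrictions they need. The 
names SketchFileSystem, InMemorySketchFs and NoConcatSketchFs, the String 
paths, and the choice of UnsupportedOperationException as the default are 
all assumptions for illustration only.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

// Hypothetical stand-in for the common base class; not Hadoop's FileSystem.
abstract class SketchFileSystem {
    /** Appends sources to target. Default: the operation is not supported. */
    void concat(String target, String... sources) throws IOException {
        throw new UnsupportedOperationException(
            getClass().getSimpleName() + " does not support concat");
    }
}

// A file system that opts in, with its own (illustrative) restrictions.
final class InMemorySketchFs extends SketchFileSystem {
    private final Map<String, String> files = new HashMap<>();

    void put(String path, String content) { files.put(path, content); }

    String get(String path) { return files.get(path); }

    @Override
    void concat(String target, String... sources) throws IOException {
        // Validate everything first so a failure leaves the state unchanged.
        if (!files.containsKey(target)) {
            throw new FileNotFoundException(target);
        }
        if (new HashSet<>(Arrays.asList(sources)).size() != sources.length) {
            throw new IllegalArgumentException("duplicate source path");
        }
        for (String src : sources) {
            if (src.equals(target)) {
                throw new IllegalArgumentException("source equals target");
            }
            if (!files.containsKey(src)) {
                throw new FileNotFoundException(src);
            }
        }
        StringBuilder merged = new StringBuilder(files.get(target));
        for (String src : sources) {
            merged.append(files.remove(src)); // source is consumed
        }
        files.put(target, merged.toString());
    }
}

// A file system that does not override concat() inherits the refusal.
final class NoConcatSketchFs extends SketchFileSystem {}
```

With this shape an application codes against the base class only, and 
switching the concrete file system changes the restrictions it hits, not the 
API it calls.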
                
> WebHDFS: support file concat
> ----------------------------
>
>                 Key: HDFS-3598
>                 URL: https://issues.apache.org/jira/browse/HDFS-3598
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Plamen Jeliazkov
>
> In trunk and branch-2, DistributedFileSystem has a new concat(Path trg, Path 
> [] psrcs) method.  WebHDFS should support it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
