[ 
https://issues.apache.org/jira/browse/HDFS-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated HDFS-5224:
--------------------------------

    Attachment: HDFS-5224.1.patch

Here is an initial cut at a patch.  I don't think it's final, but I want to 
discuss further with [~andrew.wang] and [~cmccabe] before doing any more work.

This changes the client-facing API in {{DistributedFileSystem}} so that any 
occurrence of a path is represented by a {{Path}} object instead of a 
{{String}}.  This includes the internal member of {{PathBasedCacheDirective}} 
and {{PathBasedCacheDescriptor}}.  Input and output {{Path}} objects are 
subject to validation and qualification against the file system and its working 
directory, just like other methods that use a {{Path}}.  {{EmptyPathError}} has 
been removed, because it is impossible to create a {{Path}} with an empty 
string.
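
To illustrate the kind of validation and qualification described above, here is a minimal sketch. It uses {{java.net.URI}} as a stand-in and is not Hadoop's {{Path}} implementation; the class name, method name, and host name are hypothetical.

```java
import java.net.URI;

// Hypothetical stand-in for the checks a Path-based API performs.
// Uses java.net.URI only; this is not Hadoop's Path class.
final class PathQualificationSketch {

    // Rejects an empty path up front, then resolves it against the working directory.
    static URI qualify(URI workingDir, String path) {
        if (path == null || path.isEmpty()) {
            throw new IllegalArgumentException("path must not be empty");
        }
        // A relative path is resolved under workingDir; an absolute path keeps
        // its own path component and inherits the scheme and authority.
        return workingDir.resolve(path);
    }

    public static void main(String[] args) {
        // Assumed working directory; the trailing slash matters for resolution.
        URI workingDir = URI.create("hdfs://namenode.example.com:8020/user/alice/");
        System.out.println(qualify(workingDir, "data/logs"));
        System.out.println(qualify(workingDir, "/tmp/cache"));
    }
}
```

Because the empty string is rejected when the object is built, a separate error type for empty paths has nothing left to report.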

From this version of the patch, it looks like {{Path}} is useful in the 
client-facing API of {{DistributedFileSystem}}, for the reasons discussed in 
earlier issues: existing validation logic, path qualification, and consistency 
with other methods.  However, it also looks like {{Path}} is more of a 
hindrance after you cross the barrier from {{DistributedFileSystem}} into the 
protocol and the namenode implementation.  The path string portion of the 
{{Path}} is the only useful thing at that point, so we just end up passing it 
around a lot until we need to unpack it via {{Path#toUri#getPath}}.
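
For reference, the unpacking step amounts to discarding everything except the path component. The sketch below shows this with plain {{java.net.URI}} rather than a Hadoop {{Path}}; the class name, method name, and host name are hypothetical.

```java
import java.net.URI;

// Shows which part of a fully qualified location survives the unpacking step.
// java.net.URI stands in for the URI that a Path would expose.
final class PathStringSketch {

    // Returns only the path component, dropping scheme and authority.
    static String pathOnly(URI qualified) {
        return qualified.getPath();
    }

    public static void main(String[] args) {
        URI qualified = URI.create("hdfs://namenode.example.com:8020/user/alice/data");
        // Prints the path portion only.
        System.out.println(pathOnly(qualified));
    }
}
```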

I'd like to float the idea of defining different objects, similar to 
{{PathBasedCacheDirective}}/{{PathBasedCacheDescriptor}}, but using strings for 
the paths.  The existing objects would be used in the interface of 
{{DistributedFileSystem}}.  The new objects would be used at all layers below: 
{{DFSClient}}, {{ClientProtocol}}, {{CacheManager}}, etc.  This would decouple 
client and server so that the two sides can evolve independently.  There is 
some existing precedent for this in that {{DFSClient}} and {{ClientProtocol}} 
use {{String}} instead of {{Path}}.
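
A rough sketch of what that split could look like follows. Everything in it is hypothetical: the class names, the {{pool}} field, and the conversion methods are illustrations only, and {{java.net.URI}} stands in for {{Path}}. It is not what the attached patch does.

```java
import java.net.URI;

// Hypothetical two-layer split: a client-facing object with a qualified
// location, and a lower-layer object that carries only a path string.
final class DirectiveLayersSketch {

    // Client-facing form, as it might appear in the DistributedFileSystem API.
    static final class ClientDirective {
        final URI path;     // stand-in for a qualified Path
        final String pool;  // assumed extra field, for illustration only
        ClientDirective(URI path, String pool) { this.path = path; this.pool = pool; }
    }

    // Lower-layer form, as it might be passed below the client-facing API.
    static final class WireDirective {
        final String path;  // plain path string, no scheme or authority
        final String pool;
        WireDirective(String path, String pool) { this.path = path; this.pool = pool; }
    }

    // Conversion done once, at the boundary, on the way down.
    static WireDirective toWire(ClientDirective d) {
        return new WireDirective(d.path.getPath(), d.pool);
    }

    // Conversion done once, at the boundary, on the way back up.
    static ClientDirective toClient(WireDirective w, URI fileSystemUri) {
        return new ClientDirective(fileSystemUri.resolve(w.path), w.pool);
    }

    public static void main(String[] args) {
        URI fs = URI.create("hdfs://namenode.example.com:8020/");
        ClientDirective in = new ClientDirective(fs.resolve("/user/alice/data"), "pool1");
        WireDirective wire = toWire(in);
        ClientDirective out = toClient(wire, fs);
        System.out.println(wire.path + " -> " + out.path);
    }
}
```

With this shape, the lower layers never see a qualified location, and the conversion in each direction happens in exactly one place.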

Colin and Andrew, thoughts?


> Refactor PathBasedCache* methods to use a Path rather than a String
> -------------------------------------------------------------------
>
>                 Key: HDFS-5224
>                 URL: https://issues.apache.org/jira/browse/HDFS-5224
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: HDFS-4949
>            Reporter: Andrew Wang
>            Assignee: Chris Nauroth
>         Attachments: HDFS-5224.1.patch
>
>
> As discussed in HDFS-5213, we should refactor PathBasedCacheDirective and 
> related methods in DistributedFileSystem to use a Path to represent paths to 
> cache, rather than a String.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
