[
https://issues.apache.org/jira/browse/HDFS-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Nauroth updated HDFS-5224:
--------------------------------
Attachment: HDFS-5224.1.patch
Here is an initial cut at a patch. I don't think it's final, but I want to
discuss further with [~andrew.wang] and [~cmccabe] before doing any more work.
This changes the client-facing API in {{DistributedFileSystem}} so that any
occurrence of a path is represented by a {{Path}} object instead of a
{{String}}. This includes the internal member of {{PathBasedCacheDirective}}
and {{PathBasedCacheDescriptor}}. Input and output {{Path}} objects are
subject to validation and qualification against the file system and its working
directory, just like other methods that use a {{Path}}. {{EmptyPathError}} has
been removed, because it is impossible to create a {{Path}} with an empty
string.
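As a rough illustration of the validation and qualification behavior described above, here is a minimal, self-contained sketch. It is not code from the patch and does not use the real Hadoop classes: {{SketchPath}} is a hypothetical stand-in for {{Path}}, {{qualify}} only imitates resolving a path against a file system URI and working directory, and the host name and directories are made up.

```java
import java.net.URI;

final class PathValidationSketch {

    /** Hypothetical stand-in for Hadoop's Path; this is not the real class. */
    static final class SketchPath {
        private final URI uri;

        SketchPath(String pathString) {
            // Reject empty input at construction, so no separate
            // empty-path error is needed further down the call chain.
            if (pathString == null || pathString.isEmpty()) {
                throw new IllegalArgumentException(
                    "Cannot create a path from an empty string");
            }
            this.uri = URI.create(pathString);
        }

        /** Fills in scheme, authority, and working directory when the path lacks them. */
        SketchPath qualify(URI fsUri, SketchPath workingDir) {
            String path = uri.getPath();
            String absolute = path.startsWith("/")
                ? path
                : workingDir.uri.getPath() + "/" + path;
            String scheme = uri.getScheme() != null
                ? uri.getScheme() : fsUri.getScheme();
            String authority = uri.getAuthority() != null
                ? uri.getAuthority() : fsUri.getAuthority();
            return new SketchPath(scheme + "://" + authority + absolute);
        }

        @Override
        public String toString() {
            return uri.toString();
        }
    }

    public static void main(String[] args) {
        URI fsUri = URI.create("hdfs://nn.example.com:8020");  // assumed file system URI
        SketchPath workingDir = new SketchPath("/user/chris"); // assumed working directory
        System.out.println(new SketchPath("data/part-0").qualify(fsUri, workingDir));
    }
}
```

Because the empty string is rejected in the constructor, callers further down never see an empty path, which is the same reasoning that makes {{EmptyPathError}} unnecessary.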
From this version of the patch, it looks like {{Path}} is useful in the
client-facing API of {{DistributedFileSystem}}, for the reasons discussed in
earlier issues: existing validation logic, path qualification, and consistency
with other methods. However, it also looks like {{Path}} is more of a
hindrance after you cross the barrier from {{DistributedFileSystem}} into the
protocol and the namenode implementation. The path string portion of the
{{Path}} is the only useful thing at that point, so we just end up passing it
around a lot until we need to unpack it via {{Path#toUri#getPath}}.
I'd like to float the idea of defining different objects, similar to
{{PathBasedCacheDirective}}/{{PathBasedCacheDescriptor}}, but using strings for
the paths. The existing objects would be used in the interface of
{{DistributedFileSystem}}. The new objects would be used at all layers below:
{{DFSClient}}, {{ClientProtocol}}, {{CacheManager}}, etc. This would decouple
client and server so that the two sides can evolve independently. There is
some existing precedent for this in that {{DFSClient}} and {{ClientProtocol}}
use {{String}} instead of {{Path}}.
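To make the proposed split concrete, here is a small sketch of how the two kinds of objects might relate. All names are hypothetical ({{ClientDirective}}, {{InternalDirective}}, {{toInternal}}, {{toClient}}), the {{pool}} field is only an illustrative extra attribute, and {{java.net.URI}} stands in for {{Path}} so the example is self-contained. It is not a proposal for the actual class layout.

```java
import java.net.URI;

final class LayeredDirectiveSketch {

    /** Client-facing form (hypothetical): keeps a qualified path; URI stands in for Path. */
    static final class ClientDirective {
        final URI path;
        final String pool; // illustrative extra attribute

        ClientDirective(URI path, String pool) {
            this.path = path;
            this.pool = pool;
        }
    }

    /** Internal form (hypothetical): carries only the plain path string. */
    static final class InternalDirective {
        final String path;
        final String pool;

        InternalDirective(String path, String pool) {
            this.path = path;
            this.pool = pool;
        }
    }

    /** Unpack the path string once, at the client API boundary. */
    static InternalDirective toInternal(ClientDirective directive) {
        return new InternalDirective(directive.path.getPath(), directive.pool);
    }

    /** Re-qualify a plain path string against the file system URI for the caller. */
    static ClientDirective toClient(InternalDirective directive, URI fsUri) {
        return new ClientDirective(fsUri.resolve(directive.path), directive.pool);
    }

    public static void main(String[] args) {
        URI fsUri = URI.create("hdfs://nn.example.com:8020"); // assumed file system URI
        ClientDirective client =
            new ClientDirective(URI.create("hdfs://nn.example.com:8020/cache/data"), "pool1");
        InternalDirective internal = toInternal(client);
        System.out.println(internal.path);
        System.out.println(toClient(internal, fsUri).path);
    }
}
```

With this shape, the conversion happens in exactly one place, and the layers below the client API only ever handle strings.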
Colin and Andrew, thoughts?
> Refactor PathBasedCache* methods to use a Path rather than a String
> -------------------------------------------------------------------
>
> Key: HDFS-5224
> URL: https://issues.apache.org/jira/browse/HDFS-5224
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, namenode
> Affects Versions: HDFS-4949
> Reporter: Andrew Wang
> Assignee: Chris Nauroth
> Attachments: HDFS-5224.1.patch
>
>
> As discussed in HDFS-5213, we should refactor PathBasedCacheDirective and
> related methods in DistributedFileSystem to use a Path to represent paths to
> cache, rather than a String.