[
https://issues.apache.org/jira/browse/HADOOP-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13442315#comment-13442315
]
Chris Douglas edited comment on HADOOP-8724 at 8/27/12 6:25 PM:
----------------------------------------------------------------
bq. Simply instantiating a Glob object when the path might not be a glob may be
problematic. Perhaps a better way to handle this might be to have a static
Path.create(String) which returns either a Path or GlobPath.
Distinguishing intent where the {{Path}} is created (as above) could solve part
of the problem with HADOOP-8709 (caller can resolve which API to use?), but I
don't think a subtype of {{Path}} will solve the other issues. Dispatch is
still on the static type, so nothing is solved for the callee.
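To illustrate the dispatch point, here is a minimal sketch. {{SketchPath}}, {{SketchGlobPath}}, {{create()}} and {{describe()}} are hypothetical stand-ins invented for this example, not Hadoop's classes, and the metacharacter check is only an assumption about how such a factory might decide:

```java
// Hypothetical stand-ins for Path and GlobPath; not Hadoop's real classes.
class SketchPath {
    final String raw;

    SketchPath(String raw) { this.raw = raw; }

    // Assumed factory in the spirit of Path.create(String): returns the
    // subtype when the string contains a glob metacharacter.
    static SketchPath create(String s) {
        return s.matches(".*[*?\\[{].*") ? new SketchGlobPath(s) : new SketchPath(s);
    }
}

class SketchGlobPath extends SketchPath {
    SketchGlobPath(String raw) { super(raw); }
}

class DispatchDemo {
    // Two overloads, as a callee might offer for plain paths and globs.
    static String describe(SketchPath p) { return "path"; }
    static String describe(SketchGlobPath p) { return "glob"; }

    public static void main(String[] args) {
        SketchPath p = SketchPath.create("/users/hadoop/*.foo");
        // The runtime type is the subtype...
        System.out.println(p instanceof SketchGlobPath); // true
        // ...but the overload is picked at compile time from the static type.
        System.out.println(describe(p)); // path
    }
}
```

The callee would still need an explicit {{instanceof}} check (or a virtual method on the path type) to tell the two apart.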
bq. The first was to simply have a base path and a string pattern as the
parameters to globStatus. I thought it would be better to encapsulate the two
into a single Glob object so it is obvious when an API takes a glob and when it
does not.
Having an API advertise its "globiness" is useful; I like it. However, users
specifying a single resource will need to create {{Glob}} objects on top of
{{Path}} objects, which are really URIs... which seems unnecessarily confusing.
Still, the methods for {{Configuration}} are straightforward: {{setGlob()}}
would need to escape everything on the URI side first (so users could continue
to specify globs on the command line as {{Paths}} with special characters), but
aside from that there is little to it. "Correcting" it everywhere in the
code may be prohibitive, though...
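As a rough sketch of the escaping step, something like the following could be applied to the literal (URI) part before it is combined with a pattern. The helper name {{escapeLiteral}} and the exact metacharacter set are assumptions made for illustration; nothing here is an existing Hadoop method:

```java
class GlobEscapeSketch {
    // Characters assumed to be special in a glob: \ * ? [ ] { }
    private static final String META = "\\*?[]{}";

    // Hypothetical helper: backslash-escape every assumed metacharacter so
    // the literal part of a path can never be read as a pattern.
    static String escapeLiteral(String literal) {
        StringBuilder sb = new StringBuilder(literal.length() + 8);
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (META.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints /data/run\[1\]/part\*
        System.out.println(escapeLiteral("/data/run[1]/part*"));
    }
}
```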
Since many of these are user-facing, do you think we need a more specific type
than {{String}} for the glob part? {{ls /users/hadoop/\*.foo}} translated into:
{code}fs.globStatus(new Glob(new Path("hdfs://nn:8020/users/hadoop"),
Pattern.compile("*.foo"))){code}
seems like it's strayed from sanity...
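One more wrinkle, assuming {{Pattern}} above were read as {{java.util.regex.Pattern}} (the snippet does not say which class is meant): the glob string is not even a valid regex, so the two syntaxes cannot be used interchangeably. The sketch below checks that, and uses the JDK's own glob matcher only as a point of comparison, not as Hadoop's glob implementation:

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class GlobVsRegexSketch {
    // Returns true if the string compiles as a java.util.regex pattern.
    static boolean isValidRegex(String s) {
        try {
            Pattern.compile(s);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Matches a bare file name against a glob using the JDK's glob syntax.
    static boolean globMatches(String glob, String fileName) {
        PathMatcher m = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return m.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("*.foo"));            // false: leading '*' has nothing to repeat
        System.out.println(globMatches("*.foo", "data.foo")); // true
    }
}
```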
> Add improved APIs for globbing
> ------------------------------
>
> Key: HADOOP-8724
> URL: https://issues.apache.org/jira/browse/HADOOP-8724
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Robert Joseph Evans
> Assignee: Robert Joseph Evans
>
> After the discussion on HADOOP-8709 it was decided that we need better APIs
> for globbing to remove some of the inconsistencies with other APIs. In order
> to maintain backwards compatibility we should deprecate the existing APIs and
> add in new ones.
> See HADOOP-8709 for more information about exactly how those APIs should look
> and behave.