[ 
https://issues.apache.org/jira/browse/HADOOP-8724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13442315#comment-13442315
 ] 

Chris Douglas edited comment on HADOOP-8724 at 8/27/12 6:25 PM:
----------------------------------------------------------------

bq. Simply instantiating a Glob object when the path might not be a glob may be 
problematic. Perhaps a better way to handle might be to have a static 
Path.create(String) which returns either a Path or GlobPath.

Distinguishing intent where the {{Path}} is created (as above) could solve part 
of the problem with HADOOP-8709 (caller can resolve which API to use?), but I 
don't think a subtype of {{Path}} will solve the other issues. Dispatch is 
still on the static type, so nothing is solved for the callee.
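
To make the dispatch point concrete, here is a minimal sketch. {{DemoPath}}, {{DemoGlobPath}}, and {{describe()}} are hypothetical stand-ins invented for illustration, not Hadoop classes, and the factory's glob check is deliberately naive:

```java
// Hypothetical stand-ins for Path and a GlobPath subtype; not Hadoop's classes.
class DemoPath {
    final String value;

    DemoPath(String value) { this.value = value; }

    // Hypothetical factory in the spirit of Path.create(String):
    // it chooses the subtype at runtime (naive check, for illustration only).
    static DemoPath create(String s) {
        boolean looksLikeGlob = s.indexOf('*') >= 0 || s.indexOf('?') >= 0;
        return looksLikeGlob ? new DemoGlobPath(s) : new DemoPath(s);
    }
}

class DemoGlobPath extends DemoPath {
    DemoGlobPath(String value) { super(value); }
}

class StaticDispatchDemo {
    // Two overloads; the compiler selects one from the argument's declared type.
    static String describe(DemoPath p)     { return "plain path"; }
    static String describe(DemoGlobPath p) { return "glob"; }

    public static void main(String[] args) {
        // Declared type is DemoPath, runtime type is DemoGlobPath.
        DemoPath p = DemoPath.create("/users/hadoop/*.foo");
        System.out.println(describe(p));                 // prints "plain path"
        System.out.println(describe((DemoGlobPath) p));  // prints "glob"
    }
}
```

A callee that only sees the declared type still lands in the non-glob overload unless it checks the runtime type or casts.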

bq. The first was to simply have a base path and a string pattern as the 
parameters to globStatus. I thought it would be better to encapsulate the two 
into a single Glob object so it is obvious when an API takes a glob and when it 
does not.

Having an API advertise its "globiness" is useful; I like it. However, users 
specifying a single resource would need to create {{Glob}} objects on top of 
{{Path}} objects, which are really URIs, and that seems unnecessarily confusing. 
Still, the methods for {{Configuration}} are straightforward: {{setGlob()}} 
would need to escape everything on the URI side first (so users could continue 
to specify globs on the command line as {{Path}}s with special characters), but 
nothing else about it looks hard. "Correcting" it everywhere in the 
code may be prohibitive, though...
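
One possible reading of that escaping step, as a rough sketch. {{GlobEscape}}, {{escape()}}, and {{toConfigString()}} are hypothetical names, and the metacharacter set is an assumption; the real set depends on whatever glob syntax the new API settles on:

```java
// Hypothetical helper; not part of Hadoop's Configuration API.
class GlobEscape {
    // Assumed glob metacharacters (backslash included so it stays unambiguous).
    private static final String META = "\\*?[]{}";

    // Prefix each metacharacter with a backslash so it matches literally.
    static String escape(String literal) {
        StringBuilder out = new StringBuilder(literal.length() + 8);
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (META.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    // Escape only the base (URI) side, then append the pattern untouched,
    // yielding a single string in the style of a path with special characters.
    static String toConfigString(String base, String pattern) {
        return escape(base) + "/" + pattern;
    }

    public static void main(String[] args) {
        System.out.println(toConfigString("hdfs://nn:8020/users/hadoop", "*.foo"));
        System.out.println(toConfigString("/logs/run[1]", "part-?"));
    }
}
```

Only the base is escaped, so a literal {{[}} in a directory name stays literal while the pattern side keeps its meaning.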

Since many of these are user-facing, do you think we need a more specific type 
than {{String}} for the glob part? {{ls /users/hadoop/\*.foo}} translated into:
{code}fs.globStatus(new Glob(new Path("hdfs://nn:8020/users/hadoop"), 
Pattern.compile("*.foo"))){code}
seems like it's strayed from sanity...
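
For comparison, a sketch of the {{String}}-typed alternative. It uses {{java.nio.file}}'s glob matcher purely as a stand-in (its syntax is not necessarily the one Hadoop uses), and {{StringGlobDemo}} is a hypothetical name. It also shows that, if {{Pattern}} above were taken to mean {{java.util.regex.Pattern}}, the text {{*.foo}} would not even compile as a regular expression:

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; java.nio.file globs stand in for Hadoop globs here.
class StringGlobDemo {
    // Match a file name against a glob that is held as a plain String.
    static boolean matches(String glob, String fileName) {
        PathMatcher m = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return m.matches(Paths.get(fileName));
    }

    // True if java.util.regex accepts the text as a regular expression.
    static boolean isValidRegex(String text) {
        try {
            Pattern.compile(text);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(matches("*.foo", "part-0.foo")); // true
        System.out.println(matches("*.foo", "part-0.bar")); // false
        System.out.println(isValidRegex("*.foo"));          // false: leading '*' has nothing to repeat
    }
}
```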
                
> Add improved APIs for globbing
> ------------------------------
>
>                 Key: HADOOP-8724
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8724
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>
> After the discussion on HADOOP-8709 it was decided that we need better APIs 
> for globbing to remove some of the inconsistencies with other APIs.  In order 
> to maintain backwards compatibility we should deprecate the existing APIs and 
> add in new ones.
> See HADOOP-8709 for more information about exactly how those APIs should look 
> and behave.
