[ 
https://issues.apache.org/jira/browse/HADOOP-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943020#comment-15943020
 ] 

Steve Loughran commented on HADOOP-14235:
-----------------------------------------

The bad news: "one line fixes" are just as much trouble as anything else to get 
working. The main difference is that each line gets more review time.

This is the [hadoop-aws testing policy|
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md].
No declared test run: no review. No extra tests: either justify the lack of 
tests or expect no review.

Here I'm not going to worry about it, as HADOOP-3257 shows this is a broader 
problem. This alternative construction trick may work here, but it's only going 
to delay the problem. For example, it won't do anything for writing data when 
the destination path has a ":" in it, for recursive listFiles(path, recursive=true) 
calls, or anywhere else where the Path/2 constructors are used to build paths, 
which is a lot of places in the code.
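
As a rough illustration of why a colon is awkward here, the sketch below uses java.net.URI directly rather than Hadoop's Path. The class and method names (ColonInPathDemo, describe, describeParts) are made up for the example, and it only assumes that Path runs into the same URI rules when it builds a path from a string containing ":".

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative only: exercises java.net.URI, not Hadoop's Path or Globber.
final class ColonInPathDemo {

    /** Parses a single string and reports how URI split it, or that parsing failed. */
    static String describe(String raw) {
        try {
            URI uri = new URI(raw);
            return "scheme=" + uri.getScheme() + " path=" + uri.getPath();
        } catch (URISyntaxException e) {
            return "URISyntaxException";
        }
    }

    /** Builds a URI from components, as code assembling paths piecewise might do. */
    static String describeParts(String scheme, String authority, String path) {
        try {
            URI uri = new URI(scheme, authority, path, null, null);
            return "scheme=" + uri.getScheme() + " path=" + uri.getPath();
        } catch (URISyntaxException e) {
            return "URISyntaxException";
        }
    }

    public static void main(String[] args) {
        // Colon after the first slash: treated as part of an ordinary relative path.
        System.out.println(describe("logs/12:30.log"));   // scheme=null path=logs/12:30.log
        // Colon in the first segment after a letter: the prefix is taken as a scheme.
        System.out.println(describe("part:0001"));        // scheme=part path=null
        // Colon in the first segment after a digit: not a legal scheme name.
        System.out.println(describe("12:30.log"));        // URISyntaxException
        // A scheme combined with a relative path is rejected outright.
        System.out.println(describeParts("s3a", "bucket", "12:30.log")); // URISyntaxException
    }
}
```

The point of the sketch is that the outcome depends on where the colon sits and on how the URI is assembled, which is why patching a single call site does not cover the other code paths.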

I don't think we can or should try to fix this with a stack-trace-by-stack-trace 
approach, as it will simply break again the moment someone changes the 
codepath. Path itself is going to need tuning. That's not impossible; if you 
look at what goes on there to handle Windows paths like "C:/something" you can 
see what has been done before. It's just a significant body of work, which someone 
who understands that bit of the code (not me!) needs to do.


> S3A Path does not understand colon (:) when globbing
> ----------------------------------------------------
>
>                 Key: HADOOP-14235
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14235
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.8.0, 3.0.0-alpha1, 3.0.0-alpha2, 2.8.1
>         Environment: EC2, AWS
>            Reporter: Kazuyuki Tanimura
>
> Colons ":" are valid characters in S3 paths. However, the Java URI 
> class, which is used in the Path class, does not allow them.
> This becomes a problem particularly when we are globbing S3 paths. The 
> globber thinks paths with colons are invalid paths and throws 
> URISyntaxException.
> The reason is that we share Globber.java with all other filesystems. Some of 
> the rules for regular filesystems do not apply to S3, this colon being one 
> example.
> The same issue is reported at https://issues.apache.org/jira/browse/SPARK-20061
> The good news is I have a one line fix, for which I am about to send a pull request.
> However, for a proper fix, we should separate the S3 globber from 
> Globber.java as proposed at https://issues.apache.org/jira/browse/HADOOP-13371



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
