[ https://issues.apache.org/jira/browse/HADOOP-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943020#comment-15943020 ]
Steve Loughran commented on HADOOP-14235:
-----------------------------------------
The bad news: "one-line fixes" are just as much trouble as anything else to get
working. The main difference is that each line gets more review time.
This is the [hadoop-aws testing policy|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md].
No declared test run: no review. No extra tests: either justify the lack of
tests or expect no review.
Here I'm not going to worry about it, as HADOOP-3257 shows this is a broader
problem. This alternative construction trick may work here, but it's only going
to delay the problem. For example, it won't do anything for writing data when
the destination path has a ":" in it, for recursive
listFiles(path, recursive=true) calls, or anywhere else where the Path/2
constructors are used to build paths, which is a lot of places in the code.
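As a rough illustration of why the failure depends on which code path builds the path, here is a minimal sketch using only java.net.URI. It is not the actual Path code: the helper tryBuild and its rule of treating the text before the first ":" as a scheme are assumptions made for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

class ColonPathDemo {
    /**
     * Hypothetical helper (not Hadoop code): split a path string at the first
     * ':' that precedes any '/', treat the left side as a scheme, and hand the
     * pieces to the five-argument java.net.URI constructor.
     */
    static String tryBuild(String name) {
        int colon = name.indexOf(':');
        int slash = name.indexOf('/');
        String scheme = null;
        String path = name;
        // Assumed rule: a ':' with no '/' before it marks the end of a scheme.
        if (colon != -1 && (slash == -1 || colon < slash)) {
            scheme = name.substring(0, colon);
            path = name.substring(colon + 1);
        }
        try {
            // URI requires the path to be empty or start with '/' when a scheme is given.
            return new URI(scheme, null, path, null, null).toString();
        } catch (URISyntaxException e) {
            return "rejected: " + e.getReason();
        }
    }

    public static void main(String[] args) {
        // Absolute path: the '/' comes first, so the colons stay in the path.
        System.out.println(tryBuild("/data/2017-03-27T10:15:00.log"));
        // Bare child name: the text before the first ':' is taken as a scheme.
        System.out.println(tryBuild("2017-03-27T10:15:00.log"));
    }
}
```

Under these assumptions the absolute form is accepted while the bare child name is rejected, so whether a colon breaks things depends on how the caller happened to construct the path.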
I don't think we can/should be trying to fix this with a
stack-trace-by-stack-trace approach, as it will simply break again the moment
someone changes the codepath. Path itself is going to need tuning. That's not
impossible; if you look at what goes on there to handle Windows paths like
"C:/something" you can see what has been done already. It's just a significant
body of work, which someone who understands that bit of the code (not me!)
needs to do.
> S3A Path does not understand colon (:) when globbing
> ----------------------------------------------------
>
> Key: HADOOP-14235
> URL: https://issues.apache.org/jira/browse/HADOOP-14235
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 2.8.0, 3.0.0-alpha1, 3.0.0-alpha2, 2.8.1
> Environment: EC2, AWS
> Reporter: Kazuyuki Tanimura
>
> Colons ":" are valid characters in S3 paths. However, the Java URI
> class, which is used in the Path class, does not allow them.
> This becomes a problem particularly when we are globbing S3 paths. The
> globber thinks paths with colons are invalid paths and throws
> URISyntaxException.
> The reason is that we are sharing Globber.java with all other filesystems.
> Some of the rules for regular filesystems are not applicable to S3; this
> colon is one example.
> The same issue is reported here: https://issues.apache.org/jira/browse/SPARK-20061
> The good news is I have a one-line fix for which I am about to send a pull request.
> However, for a proper fix, we should separate the S3 globber from
> Globber.java as proposed at https://issues.apache.org/jira/browse/HADOOP-13371