[
https://issues.apache.org/jira/browse/HADOOP-14235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kazuyuki Tanimura updated HADOOP-14235:
---------------------------------------
Description:
Colons ":" are valid characters in S3 paths. However, the Java URI
class, which the Path class uses, does not allow them.
This becomes a problem particularly when globbing S3 paths: the globber
treats paths containing colons as invalid and throws URISyntaxException.
The reason is that Globber.java is shared across all file systems, and some
of its rules for regular file systems, such as this handling of colons, do
not apply to S3.
The same issue is reported at https://issues.apache.org/jira/browse/SPARK-20061
The good news is that I have a one-line fix and am about to send a pull
request for it. However, for a proper fix, we should separate the S3 globber
from Globber.java, as proposed at https://issues.apache.org/jira/browse/HADOOP-13371
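
For illustration, the URISyntaxException described above can be reproduced
with java.net.URI alone, without Hadoop on the classpath. The sketch below is
not the actual Path or Globber code: the class ColonPathDemo and the helper
canBuildUri are hypothetical names, and the scheme-splitting rule (everything
before the first colon is a scheme unless a slash comes first) is an
assumption about roughly how a path string gets split before the URI is built.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class ColonPathDemo {

    /**
     * Hypothetical helper (not a Hadoop API). Assumed rule: text before the
     * first colon is taken as a URI scheme unless a slash precedes the colon.
     * Returns true if java.net.URI accepts the resulting components.
     */
    public static boolean canBuildUri(String pathString) {
        String scheme = null;
        String path = pathString;
        int colon = pathString.indexOf(':');
        int slash = pathString.indexOf('/');
        if (colon != -1 && (slash == -1 || colon < slash)) {
            scheme = pathString.substring(0, colon);
            path = pathString.substring(colon + 1);
        }
        try {
            // With a scheme present, java.net.URI requires a non-empty path
            // to start with '/', otherwise it throws URISyntaxException.
            new URI(scheme, null, path, null, null);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "part-00000.csv",                // plain file name, no colon
            "2017-03-24T10:15:00.csv",       // file name containing colons
            "/logs/2017-03-24T10:15:00.csv"  // same name behind a leading slash
        };
        for (String s : samples) {
            System.out.println(s + " -> " + canBuildUri(s));
        }
    }
}
```

Under these assumptions, only the bare file name containing colons is
rejected; the same name behind a leading slash is accepted. That would be
consistent with the failure showing up when individual path components are
handled on their own, as happens during globbing.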
> S3A Path does not understand colon (:) when globbing
> ----------------------------------------------------
>
> Key: HADOOP-14235
> URL: https://issues.apache.org/jira/browse/HADOOP-14235
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 2.8.0, 3.0.0-alpha1, 3.0.0-alpha2, 2.8.1
> Reporter: Kazuyuki Tanimura