[ https://issues.apache.org/jira/browse/HADOOP-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated HADOOP-18753:
------------------------------------
Component/s: fs/s3
(was: tools)
> S3AFileSystem doesn't consistently handle prefixes that are both files and
> directories between versions
> -------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-18753
> URL: https://issues.apache.org/jira/browse/HADOOP-18753
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 3.3.4
> Reporter: Helen Weng
> Priority: Major
>
> We have a prefix structure in which the prefix that Spark reads is both a
> file and a directory: s3://a/b is the file we are trying to read, but
> s3://a/b/c is also a file. In 3.2.1, listStatus identifies a/b as a file,
> but after a change in 3.3.4 it identifies a/b as a directory and tries to
> read a/b/c instead of a/b.
> When s3GetFileStatus is called on the path with the StatusProbeEnum HEAD
> probe, the path is reported as a file. However, innerListStatus first
> assumes that any non-empty prefix is a directory; it calls s3GetFileStatus
> only on empty directories and on the listObjects results for the prefix.
> Is this a known issue, and are there any suggestions for working around it
> without changing the prefix structure?
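
The difference in probe order described above can be illustrated with a small in-memory model. This is only a sketch, not Hadoop code: the class PrefixProbeSketch, the Kind enum, and the methods headFirst, listFirst and hasChildren are hypothetical names, and a sorted set of key strings stands in for the bucket. It assumes, as the report describes, that one code path checks for an exact object first while the other treats any non-empty prefix as a directory before looking for an object.

```java
import java.util.List;
import java.util.TreeSet;

// Hypothetical in-memory model of the two probe orders; not the S3A implementation.
public class PrefixProbeSketch {

    public enum Kind { FILE, DIRECTORY, NOT_FOUND }

    /** HEAD-first: an exact key match wins, so "b" is a file even if "b/c" exists. */
    public static Kind headFirst(TreeSet<String> keys, String path) {
        if (keys.contains(path)) {
            return Kind.FILE;
        }
        return hasChildren(keys, path) ? Kind.DIRECTORY : Kind.NOT_FOUND;
    }

    /** LIST-first: any key under "path/" makes the path a directory; the exact match is only a fallback. */
    public static Kind listFirst(TreeSet<String> keys, String path) {
        if (hasChildren(keys, path)) {
            return Kind.DIRECTORY;
        }
        return keys.contains(path) ? Kind.FILE : Kind.NOT_FOUND;
    }

    /** True if at least one key starts with "path/". */
    public static boolean hasChildren(TreeSet<String> keys, String path) {
        String prefix = path + "/";
        // The smallest key that sorts at or after the prefix is the only candidate child.
        String next = keys.ceiling(prefix);
        return next != null && next.startsWith(prefix);
    }

    public static void main(String[] args) {
        // Keys in a bucket "a": both "b" and "b/c" are objects, as in the report.
        TreeSet<String> keys = new TreeSet<>(List.of("b", "b/c"));
        System.out.println("HEAD-first: " + headFirst(keys, "b"));
        System.out.println("LIST-first: " + listFirst(keys, "b"));
    }
}
```

With the keys "b" and "b/c", the HEAD-first order classifies "b" as a file and the LIST-first order classifies it as a directory, which matches the change in behaviour reported between 3.2.1 and 3.3.4.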
--
This message was sent by Atlassian Jira
(v8.20.10#820010)