[ https://issues.apache.org/jira/browse/HADOOP-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran resolved HADOOP-18753.
-------------------------------------
    Resolution: Won't Fix

> S3AFileSystem doesn't consistently handle prefixes that are both files and directories between versions
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-18753
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18753
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.3.4
>            Reporter: Helen Weng
>            Priority: Major
>
> We have a prefix structure where the prefix Spark reads is both a file and a
> directory. So s3://a/b is the file we are trying to read, but s3://a/b/c is
> also a file. In 3.2.1, listStatuses identifies a/b as a File, but a change in
> 3.3.4 now identifies a/b as a directory and tries to read a/b/c instead of
> a/b.
> When s3GetFileStatus is called on the path with StatusProbeEnum HEAD, the
> path does return as "File". However innerListStatus first assumes that any
> prefix that is "nonempty" is a directory; it only calls s3GetFileStatus on
> empty directories and on listObjects results of the prefix.
> Wonder if this is known/if there are any suggestions to get around this
> without changing the prefix structure?
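
The ambiguity described in the report can be illustrated with a toy model of an object store. This is only a sketch and not the S3A implementation: the key set and the function names (`head_exists`, `has_children`, `classify_head_first`, `classify_list_first`) are hypothetical, and the two classifiers merely mimic the two probe orders the reporter describes (exact-key check first versus "non-empty prefix means directory" first).

```python
from typing import AbstractSet


def head_exists(keys: AbstractSet[str], key: str) -> bool:
    """Stand-in for a HEAD probe: is there an object at exactly this key?"""
    return key in keys


def has_children(keys: AbstractSet[str], key: str) -> bool:
    """Stand-in for a LIST probe: does any object live under key + '/'?"""
    prefix = key.rstrip("/") + "/"
    return any(k.startswith(prefix) for k in keys)


def classify_head_first(keys: AbstractSet[str], key: str) -> str:
    """Check for an exact object first, then fall back to listing."""
    if head_exists(keys, key):
        return "file"
    if has_children(keys, key):
        return "directory"
    return "missing"


def classify_list_first(keys: AbstractSet[str], key: str) -> str:
    """Treat any non-empty prefix as a directory before checking for an object."""
    if has_children(keys, key):
        return "directory"
    if head_exists(keys, key):
        return "file"
    return "missing"


if __name__ == "__main__":
    # Hypothetical bucket contents matching the report: a/b is an object,
    # and a/b/c is also an object.
    keys = {"a/b", "a/b/c"}
    print(classify_head_first(keys, "a/b"))  # file
    print(classify_list_first(keys, "a/b"))  # directory
```

Under these assumptions the same key set yields two different answers for `a/b` depending purely on probe order, which is the inconsistency between versions that the report describes. A key that is only a file or only a prefix is classified identically by both functions.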