[
https://issues.apache.org/jira/browse/HADOOP-13230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554808#comment-16554808
]
Steve Jacobs commented on HADOOP-13230:
---------------------------------------
Could this be implemented by replacing the HEAD request for the fake directory
entry with a listObjects call? That would take the same number of API calls in
the 'empty fakeDir' case, and no more work in the populated directory case.
I recently ran into this issue when using Presto to insert into Hive partitions.
Presto does not use the S3A driver and does not delete the fakedir objects.
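
To make the suggestion concrete, here is a rough sketch of the two probes
against a toy in-memory bucket (a plain set of keys). It is not S3A or AWS SDK
code; head_object, list_objects, probe_with_head and probe_with_list are
hypothetical names, and asking for at most two keys is my assumption about how
a list-based check could be written.

```python
# Toy in-memory stand-in for an S3 bucket: a set of object keys.
# Nothing here is S3A or AWS SDK code; all names are hypothetical.

def head_object(keys, key):
    """Model of a HEAD request: does this exact key exist?"""
    return key in keys


def list_objects(keys, prefix, max_keys):
    """Model of a listObjects call: up to max_keys keys under prefix, sorted."""
    return sorted(k for k in keys if k.startswith(prefix))[:max_keys]


def probe_with_head(keys, path):
    """Marker-trusting probe: a marker object means 'empty directory'."""
    marker = path.rstrip("/") + "/"
    return "empty" if head_object(keys, marker) else "unknown"


def probe_with_list(keys, path):
    """List-based probe: one call, asking for at most two keys."""
    marker = path.rstrip("/") + "/"
    found = list_objects(keys, marker, max_keys=2)
    if not found:
        return "missing"
    # Anything other than the marker itself means the directory has children.
    return "empty" if found == [marker] else "populated"


# Another tool (for example Presto) wrote data but left the marker in place.
bucket = {"table/part=1/", "table/part=1/data.orc"}
print(probe_with_head(bucket, "table/part=1"))  # empty (stale answer)
print(probe_with_list(bucket, "table/part=1"))  # populated
```

Both probes make a single call when the marker exists; the list-based one just
does not take the marker's presence as proof that the directory is empty.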
> s3a's use of fake empty directory blobs does not interoperate with other s3
> tools
> ---------------------------------------------------------------------------------
>
> Key: HADOOP-13230
> URL: https://issues.apache.org/jira/browse/HADOOP-13230
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 2.9.0
> Reporter: Aaron Fabbri
> Priority: Major
>
> Users of s3a may not realize that, in some cases, it does not interoperate
> well with other s3 tools, such as the AWS CLI. (See HIVE-13778, IMPALA-3558).
> Specifically, if a user:
> - Creates an empty directory with hadoop fs -mkdir s3a://bucket/path
> - Copies data into that directory via another tool, e.g. the AWS CLI.
> - Tries to access the data in that directory with any Hadoop software.
> Then the last step fails, because the fake empty directory blob that s3a wrote
> in the first step causes s3a (listStatus() etc.) to continue to treat that
> directory as empty, even though the second step was supposed to populate the
> directory with data.
> I wanted to document this fact for users. We may mark this as won't-fix, "by
> design". It may also be interesting to brainstorm solutions and/or a config
> option to change the behavior if folks care.
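
The three steps in the description can be replayed against a toy in-memory
bucket (a dict of key to bytes). This is only an illustrative model, not S3A
code: mkdir, external_put and list_status_trusting_marker are hypothetical
names, and the "stop at the marker" rule is a simplification of the behavior
the issue describes.

```python
# Toy model of the three steps; not S3A code. All names are hypothetical.

def mkdir(bucket, path):
    """Step 1: model 'hadoop fs -mkdir' writing a zero-byte marker at 'path/'."""
    bucket[path.rstrip("/") + "/"] = b""


def external_put(bucket, key, data):
    """Step 2: model another tool uploading an object; the marker is untouched."""
    bucket[key] = data


def list_status_trusting_marker(bucket, path):
    """Step 3: a listing that stops as soon as it sees the marker."""
    prefix = path.rstrip("/") + "/"
    if prefix in bucket:
        return []  # marker present, so the directory is assumed empty
    return sorted(k for k in bucket if k.startswith(prefix))


bucket = {}
mkdir(bucket, "path")
external_put(bucket, "path/data.csv", b"a,b\n")
print(sorted(bucket))                               # ['path/', 'path/data.csv']
print(list_status_trusting_marker(bucket, "path"))  # []
```

The data object is in the bucket, yet the listing comes back empty for as long
as the stale marker remains.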
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)