[
https://issues.apache.org/jira/browse/HADOOP-14172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15907162#comment-15907162
]
Steve Loughran commented on HADOOP-14172:
-----------------------------------------
-1 to tree walks, it does in real-world situations
I'd be happier with an enum of list options, where "all" would be allowed. The
choice of what to filter out (e.g. "/" and s3n folders) is then something that
s3guard can handle itself. I know the other strategy would be to support
some name predicate to pass in, but having seen downstream code doing expensive
operations in such predicates (e.g. constructing a FileStatus instance off the
name, querying the values, ...), I'd rather return the iterator and let the
caller do the filtering themselves.
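
A rough sketch of the shape suggested above, not actual S3AFileSystem code. The
names ListOption, Entry, list() and importable() are hypothetical, Entry is a
stand-in for FileStatus, and the "_$folder$" suffix check for s3n folder
markers is an assumption made for illustration. The listing call takes an enum
rather than a predicate and hands back an iterator; the caller then does its
own cheap filtering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ListOptionSketch {

    /** Hypothetical listing modes; ALL returns every entry unfiltered. */
    enum ListOption { FILES_ONLY, ALL }

    /** Hypothetical stand-in for a listing entry (not Hadoop's FileStatus). */
    static final class Entry {
        final String path;
        final boolean isDirectory;

        Entry(String path, boolean isDirectory) {
            this.path = path;
            this.isDirectory = isDirectory;
        }
    }

    /** Hypothetical listing call: selects by enum, takes no caller predicate. */
    static Iterator<Entry> list(List<Entry> store, ListOption option) {
        List<Entry> out = new ArrayList<>();
        for (Entry e : store) {
            if (option == ListOption.ALL || !e.isDirectory) {
                out.add(e);
            }
        }
        return out.iterator();
    }

    /** Caller-side filtering: drop the root and (assumed) s3n folder markers. */
    static List<String> importable(Iterator<Entry> entries) {
        List<String> paths = new ArrayList<>();
        while (entries.hasNext()) {
            Entry e = entries.next();
            if (e.path.equals("/") || e.path.endsWith("_$folder$")) {
                continue;
            }
            paths.add(e.path);
        }
        return paths;
    }

    public static void main(String[] args) {
        List<Entry> store = Arrays.asList(
            new Entry("/", true),
            new Entry("/data", true),
            new Entry("/data/empty", true),
            new Entry("/data/file.txt", false),
            new Entry("/old_$folder$", false));
        // With ALL, the empty directory survives the caller's filter.
        System.out.println(importable(list(store, ListOption.ALL)));
    }
}
```

The filter stays in the caller, so any expensive work done there is the
caller's own choice rather than something run inside the listing call.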
> S3Guard: import does not import empty directory
> -----------------------------------------------
>
> Key: HADOOP-14172
> URL: https://issues.apache.org/jira/browse/HADOOP-14172
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Sean Mackrory
> Assignee: Sean Mackrory
>
> It imports everything that comes up from listFiles, which includes only files
> (and their parent directories as a side-effect). My first thought on doing this
> would be to override S3AFileSystem to add an optional parameter to use
> AcceptAllButSelfAndS3nDirs instead of AcceptFilesOnly. But we could also
> manually traverse the tree to get all FileStatus objects directory by
> directory like we do for diff. That's far slower but doesn't add surface area
> to S3AFileSystem. But there's also the impact to other S3 clients to worry
> about - I could go either way on that.