[
https://issues.apache.org/jira/browse/HADOOP-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15620602#comment-15620602
]
Chris Nauroth commented on HADOOP-13651:
----------------------------------------
bq. I'll go study the FileSystem cache now (i.e. does it guarantee one instance
per-bucket, or get close to that?)
Yes, the relevant piece to look at is the cache {{Key}} class inside
{{FileSystem}}. This data structure defines a composite key for entries in the
cache:
{code}
  /** FileSystem.Cache.Key */
  static class Key {
    final String scheme;
    final String authority;
    final UserGroupInformation ugi;
    final long unique;   // an artificial way to make a key unique
    // ... (rest of the class omitted)
  }
{code}
The {{scheme}} will be "s3a" and the {{authority}} will be the S3 bucket, so
the same instance is reused for the same bucket, as long as the same user is
running the code that allocates the {{FileSystem}}. The {{unique}} field is an
artificial cache buster for callers that explicitly do not want to share an
instance and instead request a unique one by calling
{{FileSystem#newInstance}}. Calling {{FileSystem#close}} evicts the instance
from the cache. There are some pretty big gotchas that can come up related to
this {{FileSystem}} cache, but for the sake of this discussion, we can say that
it works as expected.
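To make the keying behavior concrete, here is a minimal, self-contained sketch
of a cache with the same kind of composite key. It is a simplified model and
not Hadoop's actual implementation: {{FsCacheSketch}} and {{FakeFs}} are
hypothetical names, a plain {{String}} user stands in for
{{UserGroupInformation}}, and the use of 0 as the "shared" value of
{{unique}} is an assumption made for the illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicLong;

/** Simplified model of a FileSystem-style cache; NOT Hadoop's actual code. */
class FsCacheSketch {

  /** Composite key with the same four kinds of fields as the excerpt above. */
  static final class Key {
    final String scheme;
    final String authority;
    final String user;   // assumption: stands in for UserGroupInformation
    final long unique;   // assumption: 0 means "shared", non-zero means "unique"

    Key(String scheme, String authority, String user, long unique) {
      this.scheme = scheme;
      this.authority = authority;
      this.user = user;
      this.unique = unique;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof Key)) return false;
      Key k = (Key) o;
      return unique == k.unique
          && Objects.equals(scheme, k.scheme)
          && Objects.equals(authority, k.authority)
          && Objects.equals(user, k.user);
    }

    @Override
    public int hashCode() {
      return Objects.hash(scheme, authority, user, unique);
    }
  }

  /** Hypothetical placeholder for a file system instance. */
  static final class FakeFs {
    final Key key;
    FakeFs(Key key) { this.key = key; }
  }

  private final Map<Key, FakeFs> cache = new HashMap<>();
  private final AtomicLong counter = new AtomicLong(1);

  /** Shared lookup: same scheme, authority and user yield the same instance. */
  FakeFs get(String scheme, String authority, String user) {
    return cache.computeIfAbsent(new Key(scheme, authority, user, 0L), FakeFs::new);
  }

  /** Cache buster: a fresh value of 'unique' forces a fresh instance. */
  FakeFs newInstance(String scheme, String authority, String user) {
    Key key = new Key(scheme, authority, user, counter.getAndIncrement());
    return cache.computeIfAbsent(key, FakeFs::new);
  }

  /** Closing an instance evicts it from the cache. */
  void close(FakeFs fs) {
    cache.remove(fs.key);
  }

  public static void main(String[] args) {
    FsCacheSketch c = new FsCacheSketch();
    FakeFs a = c.get("s3a", "bucket-one", "alice");
    System.out.println(a == c.get("s3a", "bucket-one", "alice"));         // true
    System.out.println(a == c.get("s3a", "bucket-two", "alice"));         // false
    System.out.println(a == c.get("s3a", "bucket-one", "bob"));           // false
    System.out.println(a == c.newInstance("s3a", "bucket-one", "alice")); // false
    c.close(a);
    System.out.println(a == c.get("s3a", "bucket-one", "alice"));         // false
  }
}
```

In this model, anything that needs to exist once per bucket could hang off the
shared instance, but only for callers that share a user and do not ask for a
unique instance.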
I don't have any objection to a plan of proceeding with this patch and
converting to one {{MetadataStore}} instance per {{S3AFileSystem}} in a later
patch if that's helpful for the development process. We have the freedom to
work that way on a
feature branch. However, I wonder if that's problematic for tests that access
multiple buckets, like the tests that read from the public landsat-pds bucket.
> S3Guard: S3AFileSystem Integration with MetadataStore
> -----------------------------------------------------
>
> Key: HADOOP-13651
> URL: https://issues.apache.org/jira/browse/HADOOP-13651
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Aaron Fabbri
> Assignee: Aaron Fabbri
> Attachments: HADOOP-13651-HADOOP-13345.001.patch,
> HADOOP-13651-HADOOP-13345.002.patch, HADOOP-13651-HADOOP-13345.003.patch
>
>
> Modify S3AFileSystem et al. to optionally use a MetadataStore for metadata
> consistency and caching.
> Implementation should have minimal overhead when no MetadataStore is
> configured.