[ 
https://issues.apache.org/jira/browse/HADOOP-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15620602#comment-15620602
 ] 

Chris Nauroth commented on HADOOP-13651:
----------------------------------------

bq. I'll go study the FileSystem cache now (i.e. does it guarantee one instance 
per-bucket, or get close to that?)

Yes, the relevant piece to look at is the cache {{Key}} class inside 
{{FileSystem}}.  This data structure defines a composite key for entries in the 
cache:

{code}
    /** FileSystem.Cache.Key */
    static class Key {
      final String scheme;
      final String authority;
      final UserGroupInformation ugi;
      final long unique;   // an artificial way to make a key unique
      // ... constructors, hashCode() and equals() omitted from this excerpt
    }
{code}

The {{scheme}} will be "s3a", and the {{authority}} will be the S3 bucket, so 
the cache guarantees that the same instance is reused for the same bucket, as 
long as the same user runs the code that allocates the {{FileSystem}}.  The 
{{unique}} field is an artificial cache buster for callers that explicitly 
do not want to share an instance and instead request a unique one by calling 
{{FileSystem#newInstance}}.  Calling {{FileSystem#close}} evicts the instance 
from the cache.  There are some pretty big gotchas that can come up related to 
this {{FileSystem}} cache, but for the sake of this discussion, we can say that 
it works as expected.
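
To make the keying behavior concrete, here is a minimal, self-contained sketch of a cache keyed the same way.  It is a simplified model, not the Hadoop implementation: {{FsCacheSketch}}, {{Instance}}, and the bucket and user names in the usage note are hypothetical, and the user is a plain {{String}} standing in for {{UserGroupInformation}}.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Simplified stand-in for the FileSystem cache; NOT the Hadoop code. */
class FsCacheSketch {

  /** Composite key with the same four fields as the excerpt above. */
  static final class Key {
    final String scheme;
    final String authority;
    final String user;   // assumption: a String replaces UserGroupInformation
    final long unique;   // 0 for shared lookups, non-zero for unique ones

    Key(String scheme, String authority, String user, long unique) {
      this.scheme = scheme;
      this.authority = authority;
      this.user = user;
      this.unique = unique;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof Key)) {
        return false;
      }
      Key k = (Key) o;
      return unique == k.unique
          && Objects.equals(scheme, k.scheme)
          && Objects.equals(authority, k.authority)
          && Objects.equals(user, k.user);
    }

    @Override
    public int hashCode() {
      return Objects.hash(scheme, authority, user, unique);
    }
  }

  /** Hypothetical placeholder for a FileSystem instance. */
  static final class Instance {
    final Key key;

    Instance(Key key) {
      this.key = key;
    }
  }

  private final Map<Key, Instance> cache = new HashMap<>();
  private long counter = 0;

  /** Shared lookup: unique is always 0, so equal (scheme, authority, user) hit the same entry. */
  Instance get(String scheme, String authority, String user) {
    return cache.computeIfAbsent(new Key(scheme, authority, user, 0L), Instance::new);
  }

  /** Unshared lookup: a fresh non-zero unique value forces a cache miss. */
  Instance newInstance(String scheme, String authority, String user) {
    Key key = new Key(scheme, authority, user, ++counter);
    Instance instance = new Instance(key);
    cache.put(key, instance);
    return instance;
  }

  /** Closing an instance evicts its entry from the cache. */
  void close(Instance instance) {
    cache.remove(instance.key);
  }
}
```

In this model, two {{get("s3a", "bucket-a", "alice")}} calls return the same object, while changing the bucket or the user, calling {{newInstance}}, or calling {{close}} before the next {{get}} each produce a different object.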

I don't have any objection to a plan of proceeding with this patch and 
converting to an instance per {{S3AFileSystem}} in a later patch if that's 
helpful for the development process.  We have the freedom to work that way on a 
feature branch.  However, I wonder if that's problematic for tests that access 
multiple buckets, like the tests that read from the public landsat-pds bucket.

> S3Guard: S3AFileSystem Integration with MetadataStore
> -----------------------------------------------------
>
>                 Key: HADOOP-13651
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13651
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Aaron Fabbri
>            Assignee: Aaron Fabbri
>         Attachments: HADOOP-13651-HADOOP-13345.001.patch, 
> HADOOP-13651-HADOOP-13345.002.patch, HADOOP-13651-HADOOP-13345.003.patch
>
>
> Modify S3AFileSystem et al. to optionally use a MetadataStore for metadata 
> consistency and caching.
> Implementation should have minimal overhead when no MetadataStore is 
> configured.


