[ 
https://issues.apache.org/jira/browse/HADOOP-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563782#comment-15563782
 ] 

Aaron Fabbri commented on HADOOP-13651:
---------------------------------------

Minor status update, since this JIRA has a long gestation period. I'm working 
on this now.  So far I have code for:

- New config values: {{fs.s3a.metadatastore.authoritative}} and 
{{fs.s3a.metadatastore.impl}}.
- getFileStatus()
- listStatus()
- rename()
- delete()
- mkdirs()
- copyFromLocalFile()
- copyFile()

What remains for this jira:
- create().  I'm figuring out the OutputStream plumbing now.
- More testing.

What I'd like to do as separate jiras (because I favor smaller code reviews):
- Delete tracking
- Retries (i.e. eventual consistency retry policy).  Would love to see this in 
isolation since it is non-trivial.

I'm inserting TODO comments as I go at key locations for those two items.

Interesting things about my approach so far:

I'm trying to minimize changes to {{S3AFileSystem}}:
   - diff stat so far: {quote}
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java | 116 ++++++++++++++++++++++++++++++------
{quote}
   - I introduce a "metadatastore s3a helper/glue" class, S3Guard, which is a 
bunch of static helper functions so far.
   - I introduce {{NullMetadataStore}}, which is a no-op metadata store.  The 
goal was to simplify the S3AFileSystem changes (always call MetadataStore, 
don't care if it is a no-op), but I also like that it further clarifies 
{{MetadataStore}} semantics.  It turns out S3AFileSystem still sometimes wants 
to know whether there is no MetadataStore, to avoid allocating stuff that isn't 
needed.  Seems like an ok tradeoff, but I'll let folks comment when I post the 
v1 patch.
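
For readers unfamiliar with the no-op store idea, here is a minimal sketch of it. It is not the code from the patch: every type and method name below ({{SimpleMetadataStore}}, {{NullStore}}, {{InMemoryStore}}, {{FileSystemSketch}}, {{isNoOp()}}) is hypothetical and chosen only for illustration. It shows a caller that always calls the store without null checks, plus the one place where it still asks whether the store is a no-op so it can skip unnecessary work.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, heavily simplified stand-in for a metadata store interface.
// The real MetadataStore in the patch may look quite different.
interface SimpleMetadataStore {
    void put(String path, long length);
    Long get(String path);   // returns null when there is no entry
    boolean isNoOp();        // assumed helper, not taken from the patch
}

// No-op store: accepts every call and records nothing.
final class NullStore implements SimpleMetadataStore {
    public void put(String path, long length) { /* intentionally empty */ }
    public Long get(String path) { return null; }
    public boolean isNoOp() { return true; }
}

// Trivial real store, included only so the two can be compared.
final class InMemoryStore implements SimpleMetadataStore {
    private final Map<String, Long> entries = new HashMap<>();
    public void put(String path, long length) { entries.put(path, length); }
    public Long get(String path) { return entries.get(path); }
    public boolean isNoOp() { return false; }
}

// Stand-in for the file system: it never null-checks the store.
final class FileSystemSketch {
    private final SimpleMetadataStore store;

    FileSystemSketch(SimpleMetadataStore store) { this.store = store; }

    void writeFinished(String path, long length) {
        store.put(path, length);   // same call whether or not a store is configured
    }

    Long cachedLength(String path) { return store.get(path); }

    // The tradeoff mentioned above: sometimes the caller still wants to know
    // that nothing is listening, so it can avoid building unused objects.
    boolean shouldBuildListingMetadata() { return !store.isNoOp(); }
}
```

With {{new FileSystemSketch(new NullStore())}}, {{writeFinished()}} is harmless and {{cachedLength()}} always returns null; swapping in {{InMemoryStore}} changes behavior without touching the caller.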

I'm trying to keep PathMetadata simple:  either you have a PathMetadata, 
including an S3AFileStatus, or you don't.  There are some spots where it would 
be convenient to just record "this path exists, but we don't have metadata 
yet" (e.g. create() -> OutputStream.close() -> S3AFileSystem.writeFinished(); 
at that point I don't have a FileStatus), but that would complicate the 
S3AFileSystem logic.  We'll see.
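
As a rough illustration of that "all or nothing" rule, here is a sketch under stated assumptions. {{StatusSketch}} and {{PathMetadataSketch}} are hypothetical stand-ins, not the real S3AFileStatus or PathMetadata classes. The constructor rejects a missing status, so a "path exists but has no metadata yet" state cannot be represented.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in for a file status object.
final class StatusSketch {
    final String path;
    final long length;
    final boolean isDirectory;

    StatusSketch(String path, long length, boolean isDirectory) {
        this.path = Objects.requireNonNull(path, "path is required");
        this.length = length;
        this.isDirectory = isDirectory;
    }
}

// Hypothetical stand-in for PathMetadata: it always carries a status.
final class PathMetadataSketch {
    private final StatusSketch status;

    PathMetadataSketch(StatusSketch status) {
        // No "exists, but status unknown" state: a null status is rejected.
        this.status = Objects.requireNonNull(status, "status is required");
    }

    StatusSketch getStatus() { return status; }
}
```

Under this design a caller that has no status yet (the writeFinished() case above) has to build or fetch one before it can record anything, which is the inconvenience the paragraph describes.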


> S3Guard: S3AFileSystem Integration with MetadataStore
> -----------------------------------------------------
>
>                 Key: HADOOP-13651
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13651
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Aaron Fabbri
>            Assignee: Aaron Fabbri
>
> Modify S3AFileSystem et al. to optionally use a MetadataStore for metadata 
> consistency and caching.
> Implementation should have minimal overhead when no MetadataStore is 
> configured.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
