[
https://issues.apache.org/jira/browse/HADOOP-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392892#comment-15392892
]
Chris Nauroth commented on HADOOP-13345:
----------------------------------------
[~ajfabbri] and [~eddyxu], thank you for sharing your work. I see a lot of
commonality between the two efforts so far.
I also explored something like the Layered FileSystem vs. Pluggable Metadata
trade-off that you described. Specifically, I had an earlier prototype (not
attached) that was a {{FilterFileSystem}}, with the intent that you could layer
it over any other {{FileSystem}} implementation. I abandoned this idea when I
got into implementation and found that I was going to need to coordinate more
directly with the S3A logic in a way that wasn't amenable to overriding
{{FileSystem}} methods. For example, I needed special-case logic around
{{createFakeDirectoryIfNecessary}}. It looks like you came to the same
conclusion in your patch.
The main difference I see is that my work focused more on consistency, with the
S3 bucket still treated as the source of truth, and your work focused more on
performance. I hadn't tried anything with the DynamoDB lookup completely
short-circuiting the S3 lookup. I think we can reconcile this, though. Like
you said, we can support configurable policies for different use cases. For
example, if a user is willing to commit to performing all access through S3A,
with no external tools, then I expect it's safe for them to turn on a more
aggressive caching policy that satisfies all metadata lookups from DynamoDB.
Alternatively, there can be a fix-up tool like you described. This might fold
into HADOOP-13311, where I proposed a new shell entry point for S3A-specific
administration commands.
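As a rough illustration of the configurable-policy idea, here is a minimal,
self-contained Java sketch. Every name in it ({{LookupPolicySketch}},
{{Policy}}, {{MapSource}}, {{lookup}}) is hypothetical and is not taken from
S3A or from either patch. The fallback used in the bucket-as-truth mode is also
only an assumption about how a lagging S3 answer might be covered by the
metadata store.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: none of these names come from S3A or either patch.
final class LookupPolicySketch {

    /** Hypothetical policy switch between the two approaches discussed. */
    enum Policy { BUCKET_IS_TRUTH, METADATA_STORE_AUTHORITATIVE }

    /** Stand-in for a metadata source; it only tracks a length per path. */
    static final class MapSource {
        final Map<String, Long> entries = new ConcurrentHashMap<>();
        int lookups = 0; // how many times this source was consulted

        Optional<Long> lengthOf(String path) {
            lookups++;
            return Optional.ofNullable(entries.get(path));
        }
    }

    static Optional<Long> lookup(Policy policy, MapSource metadataStore,
                                 MapSource bucket, String path) {
        if (policy == Policy.METADATA_STORE_AUTHORITATIVE) {
            // Aggressive mode: answer from the metadata store alone and
            // never consult the bucket.
            return metadataStore.lengthOf(path);
        }
        // Conservative mode: ask the bucket first. If it does not (yet) show
        // the path, fall back to what the metadata store has recorded.
        Optional<Long> fromBucket = bucket.lengthOf(path);
        if (fromBucket.isPresent()) {
            return fromBucket;
        }
        return metadataStore.lengthOf(path);
    }

    public static void main(String[] args) {
        MapSource metadataStore = new MapSource();
        MapSource bucket = new MapSource();
        // Simulate a write that the metadata store knows about but that the
        // bucket does not yet show.
        metadataStore.entries.put("/data/part-0000", 42L);
        for (Policy policy : Policy.values()) {
            System.out.println(policy + " -> "
                + lookup(policy, metadataStore, bucket, "/data/part-0000"));
        }
        System.out.println("bucket lookups: " + bucket.lookups);
    }
}
```

The aggressive mode is only safe under the assumption stated above, that all
access goes through the same client path, because nothing in it would ever
notice a change made directly to the bucket.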
Another interesting example in this area is GCS, which has something like the
policies we are describing in terms of their {{DirectoryListCache}}. This
includes an implementation like the in-memory one included in your patch.
https://github.com/GoogleCloudPlatform/bigdata-interop/blob/1447da82f2bded2ac8493b07797a5c2483b70497/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/InMemoryDirectoryListCache.java
The JavaDocs advertise it as providing consistency within a single process.
Like you said, there is no cache coherence across processes.
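To make the single-process guarantee concrete, here is a small Java sketch of
an in-memory listing cache. It is not the GCS {{DirectoryListCache}} and not
the code from your patch. The class and method names are hypothetical, and
directories are modelled as plain strings for brevity. The idea shown is that
the process remembers its own creates and deletes and overlays them on
whatever listing the store returns.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: hypothetical names, not GCS or S3A code.
final class InProcessListingCacheSketch {

    // directory -> (child name -> TRUE if created here, FALSE if deleted here)
    private final Map<String, Map<String, Boolean>> local =
        new ConcurrentHashMap<>();

    void recordCreate(String dir, String child) {
        local.computeIfAbsent(dir, d -> new ConcurrentHashMap<>())
             .put(child, Boolean.TRUE);
    }

    void recordDelete(String dir, String child) {
        // Keep a tombstone so that a stale store listing cannot bring the
        // entry back.
        local.computeIfAbsent(dir, d -> new ConcurrentHashMap<>())
             .put(child, Boolean.FALSE);
    }

    /** Overlay this process's own changes on the listing the store returned. */
    Set<String> list(String dir, Set<String> storeListing) {
        Set<String> merged = new TreeSet<>(storeListing);
        Map<String, Boolean> known =
            local.getOrDefault(dir, Collections.emptyMap());
        for (Map.Entry<String, Boolean> e : known.entrySet()) {
            if (e.getValue()) {
                merged.add(e.getKey());
            } else {
                merged.remove(e.getKey());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        InProcessListingCacheSketch cache = new InProcessListingCacheSketch();
        cache.recordCreate("/logs", "app.log");
        // The store has not caught up yet and returns an empty listing.
        System.out.println(cache.list("/logs", Collections.emptySet()));
    }
}
```

A second process holding its own instance would see none of these recorded
changes, which is the missing cross-process coherence mentioned above. The
sketch also never expires entries, which a real cache would need to do.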
[HADOOP-12876|https://issues.apache.org/jira/browse/HADOOP-12876] is slightly
related. Azure Data Lake has implemented an in-memory {{FileStatus}} cache
(patch not yet available). When this idea was suggested, I raised a concern
about cache coherence, but system testing with that caching enabled has gone
well. That's a good sign that the cache coherence problem might not cause much
harm to applications in practice. I had been thinking the HADOOP-12876 work
could eventually be refactored to hadoop-common for any {{FileSystem}} to use,
effectively becoming something like the "dentry cache" of Hadoop. I had been
thinking this could happen independent of S3Guard. We can explore further if
that makes sense, or if it's really beneficial to push the caching lower into
S3A itself. (Some of the internal S3 listing calls don't map exactly to
{{FileSystem}} method calls.)
To summarize, though, I see more commonality than difference, so I'd like to
proceed with collaborating on this. I'd start by creating a feature branch and
folding all of the information into a shared design doc. Please let me know
your thoughts.
> S3Guard: Improved Consistency for S3A
> -------------------------------------
>
> Key: HADOOP-13345
> URL: https://issues.apache.org/jira/browse/HADOOP-13345
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs/s3
> Reporter: Chris Nauroth
> Assignee: Chris Nauroth
> Attachments: HADOOP-13345.prototype1.patch,
> S3C-ConsistentListingonS3-Design.pdf, S3GuardImprovedConsistencyforS3A.pdf,
> s3c.001.patch
>
>
> This issue proposes S3Guard, a new feature of S3A, to provide an option for a
> stronger consistency model than what is currently offered. The solution
> coordinates with a strongly consistent external store to resolve
> inconsistencies caused by the S3 eventual consistency model.