Aaron Fabbri created HADOOP-15779:
-------------------------------------
Summary: S3Guard: add inconsistency detection metrics
Key: HADOOP-15779
URL: https://issues.apache.org/jira/browse/HADOOP-15779
Project: Hadoop Common
Issue Type: Bug
Components: fs/s3
Affects Versions: 3.2.0
Reporter: Aaron Fabbri
S3Guard uses a trailing log of metadata changes made to an S3 bucket to add
consistency to the eventually-consistent AWS S3 service. We should add some
metrics that are incremented when we detect inconsistency (eventual
consistency) in S3.
I'm thinking of at least two counters: (1) getFileStatus() (HEAD) inconsistency
detected, and (2) listing inconsistency detected. We may want to further
separate these into categories (present / not present, etc.).
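
As a rough illustration of the two counters, here is a minimal sketch. The
class and method names (InconsistencyMetrics, headInconsistencyDetected, and
so on) are hypothetical and are not existing S3A or S3Guard APIs; a real patch
would presumably hook into S3A's existing instrumentation instead.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical holder for the two proposed counters; not an existing S3A class. */
final class InconsistencyMetrics {
    // Bumped when a getFileStatus() (HEAD) result disagrees with the MetadataStore.
    private final LongAdder headInconsistencies = new LongAdder();
    // Bumped when an S3 listing disagrees with the MetadataStore.
    private final LongAdder listingInconsistencies = new LongAdder();

    void headInconsistencyDetected() {
        headInconsistencies.increment();
    }

    void listingInconsistencyDetected() {
        listingInconsistencies.increment();
    }

    long headInconsistencyCount() {
        return headInconsistencies.sum();
    }

    long listingInconsistencyCount() {
        return listingInconsistencies.sum();
    }
}
```

LongAdder is used here only because the counters would be incremented from
many threads and read rarely; splitting into present / not-present categories
would just mean more counters of the same shape.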
This is related to Auth. Mode and TTL work that is ongoing, so let me outline
how I think this should all evolve:
This should happen after HADOOP-15621 (TTL for dynamo MetadataStore), since
that will change *when* we query both S3 and the MetadataStore (e.g. Dynamo)
for metadata. There I suggest that:
# Prune time is different from TTL. Prune time: "how long until inconsistency
is no longer a problem". TTL: "how long a MetadataStore entry is
considered authoritative before refresh".
# Prune expired: entries are deleted (when the hadoop CLI prune command is
run). TTL expired: entries become non-authoritative.
# Prune is implemented in each MetadataStore, but TTL filtering happens in S3A.
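
To make the prune / TTL distinction concrete, here is a small sketch under
stated assumptions. SketchMetadataStore, Entry and TtlFilter are made-up names
for illustration, the store is a plain in-memory map, and timestamps are
passed in explicitly; none of this is the actual MetadataStore interface.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for a MetadataStore; illustrative only. */
final class SketchMetadataStore {
    /** An entry remembers when it was last written (epoch milliseconds). */
    static final class Entry {
        final long lastUpdatedMillis;

        Entry(long lastUpdatedMillis) {
            this.lastUpdatedMillis = lastUpdatedMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    void put(String path, long nowMillis) {
        entries.put(path, new Entry(nowMillis));
    }

    Entry get(String path) {
        return entries.get(path);
    }

    /** Prune: physically delete entries older than the prune age. */
    int prune(long nowMillis, long pruneAgeMillis) {
        int removed = 0;
        for (Iterator<Entry> it = entries.values().iterator(); it.hasNext();) {
            if (nowMillis - it.next().lastUpdatedMillis > pruneAgeMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}

/** TTL filtering on the S3A side: the entry stays, but is no longer trusted alone. */
final class TtlFilter {
    static boolean isAuthoritative(SketchMetadataStore.Entry e,
                                   long nowMillis, long ttlMillis) {
        return e != null && nowMillis - e.lastUpdatedMillis <= ttlMillis;
    }
}
```

The point of the sketch is that a TTL-expired entry is still present in the
store (and still useful for detecting inconsistency) until a separate prune
removes it.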
Once we have this, S3A will be consulting both S3 and the MetadataStore,
depending on configuration and/or the age of the entry in the MetadataStore.
Today HEAD/getFileStatus() always short-circuits (skips S3 if the MetadataStore
returns results). I think S3A should consult both when the TTL has expired, and
invoke a callback on inconsistency that increments the new metrics. For
listing, we are already comparing both sources of truth (except when S3A auth
mode is on and a directory is marked authoritative in the MetadataStore), so it
would be pretty simple to invoke a callback on inconsistency and bump some
metrics.
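
A minimal sketch of the callback idea follows. Everything named here
(InconsistencyCallback, SketchStatusLookup, the "head" / "list" labels) is
hypothetical, a file status is reduced to a Long length (null meaning "not
found"), and returning the MetadataStore's answer on a mismatch is an
assumption of this sketch, not something decided above.

```java
import java.util.Objects;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical callback; not an existing S3A interface. */
interface InconsistencyCallback {
    void onInconsistency(String operation, String path);
}

final class SketchStatusLookup {
    // Both lookups map a path to a file length, or null if the path is absent.
    private final Function<String, Long> metadataStore;
    private final Function<String, Long> s3Head;
    private final InconsistencyCallback callback;

    SketchStatusLookup(Function<String, Long> metadataStore,
                       Function<String, Long> s3Head,
                       InconsistencyCallback callback) {
        this.metadataStore = metadataStore;
        this.s3Head = s3Head;
        this.callback = callback;
    }

    /** HEAD path: short-circuit while the entry is fresh, otherwise compare. */
    Long getFileLength(String path, boolean ttlExpired) {
        Long fromMs = metadataStore.apply(path);
        if (fromMs != null && !ttlExpired) {
            return fromMs; // today's behaviour: S3 is skipped entirely
        }
        Long fromS3 = s3Head.apply(path);
        if (fromMs != null && !Objects.equals(fromMs, fromS3)) {
            callback.onInconsistency("head", path);
            return fromMs; // assumption: prefer the MetadataStore on mismatch
        }
        return fromS3;
    }

    /** Listing path: both sources are already fetched, so only compare them. */
    static void checkListing(String dir, Set<String> fromMs, Set<String> fromS3,
                             InconsistencyCallback callback) {
        if (!fromMs.equals(fromS3)) {
            callback.onInconsistency("list", dir);
        }
    }
}
```

In practice the callback would be the place to bump the new counters, which
keeps the metrics code out of the lookup logic itself.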
Comments / suggestions / questions welcomed.