[ 
https://issues.apache.org/jira/browse/HADOOP-13453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15905231#comment-15905231
 ] 

Steve Loughran commented on HADOOP-13453:
-----------------------------------------

I'm afraid HADOOP-13914 has just broken the patch, which means, sadly, you get 
to do the merge. Let's get this in *before* anything else traumatic comes in, 
so other patches get to suffer next time.

I like what you've done measuring latency as well as counts; I think we could 
actually do this more broadly. The timing should be recorded in a finally 
clause though, so that timings for failures get included too. (Side issue: 
should we count successes and failures separately, with different timings?)
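
To make the finally-clause point concrete, here is a minimal sketch. It is not 
code from the patch: TimedOperationStats and its fields are hypothetical names, 
and plain AtomicLong counters stand in for whatever Hadoop metrics types the 
patch actually uses. It records the elapsed time in the finally clause so that 
failed calls are timed too, and keeps success and failure figures apart.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper for illustration; not part of the patch or of Hadoop. */
final class TimedOperationStats {
  final AtomicLong successes = new AtomicLong();
  final AtomicLong failures = new AtomicLong();
  final AtomicLong successNanos = new AtomicLong();
  final AtomicLong failureNanos = new AtomicLong();

  /** Runs the operation and records its duration whether or not it throws. */
  <T> T time(Callable<T> operation) throws Exception {
    long start = System.nanoTime();
    boolean succeeded = false;
    try {
      T result = operation.call();
      succeeded = true;
      return result;
    } finally {
      // Reached on both the normal and the exceptional path.
      long elapsed = System.nanoTime() - start;
      if (succeeded) {
        successes.incrementAndGet();
        successNanos.addAndGet(elapsed);
      } else {
        failures.incrementAndGet();
        failureNanos.addAndGet(elapsed);
      }
    }
  }
}
```

A caller would wrap each store call, e.g. stats.time(() -> doTheCall()), where 
doTheCall is again a placeholder. Keeping the two timings separate avoids 
fast failures dragging the success latency down.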

I would like to think about how we could avoid having to pass the 
instrumentation around all the time. Ideally, we could just pass it in as a 
constructor argument to the metadata store. Alternatively, that store could 
collect metrics itself and we could wire it up, but I don't see an easy way to 
do that in Hadoop metrics (compared to Coda Hale's). The easiest would be just 
to pass the S3AInstrumentation (or an inner class) down, but currently the 
metastore interface is not specific to S3A.

If we add an interface for metadata store instrumentation, then 
S3AInstrumentation can implement it in an inner class, and S3AFS can pass it 
down during initialization. This would let the metastore do all it wants, with 
well-defined strings, of course.
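
A rough sketch of that shape, to give people something to react to. Everything 
here is an assumption for discussion: MetastoreInstrumentation, 
InstrumentationSketch and MetadataStoreSketch are made-up names, the sketch 
classes are stand-ins rather than the real S3AInstrumentation or MetadataStore, 
and a plain map of counters replaces the Hadoop metrics registry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical store-neutral interface; name and method are assumptions. */
interface MetastoreInstrumentation {
  void recordOperation(String name, long durationNanos, boolean success);
}

/** Stand-in for S3AInstrumentation; the real class has a different API. */
class InstrumentationSketch {
  private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

  private void add(String key, long delta) {
    counters.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(delta);
  }

  long counter(String key) {
    AtomicLong value = counters.get(key);
    return value == null ? 0 : value.get();
  }

  /** Hands the metastore a narrow view instead of the whole object. */
  MetastoreInstrumentation newMetastoreInstrumentation() {
    return new MetastoreView();
  }

  /** Inner class implementing the store-neutral interface. */
  private final class MetastoreView implements MetastoreInstrumentation {
    @Override
    public void recordOperation(String name, long durationNanos, boolean success) {
      add(name + (success ? ".success" : ".failure"), 1);
      add(name + ".nanos", durationNanos);
    }
  }
}

/** Stand-in for a metadata store that knows nothing about S3A. */
class MetadataStoreSketch {
  /** A well-defined metric name owned by the store. */
  static final String OP_PUT = "metastore_put";

  private MetastoreInstrumentation instrumentation;

  void initialize(MetastoreInstrumentation instrumentation) {
    this.instrumentation = instrumentation;
  }

  void put(String path) {
    long start = System.nanoTime();
    boolean success = false;
    try {
      // A real store would write the entry for 'path' here.
      success = true;
    } finally {
      instrumentation.recordOperation(OP_PUT, System.nanoTime() - start, success);
    }
  }
}
```

In this sketch the filesystem would call 
store.initialize(instrumentation.newMetastoreInstrumentation()) once during its 
own initialization, so nothing needs to be passed on each call and the store 
never sees an S3A-specific type.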

What do people think?


> S3Guard: Instrument new functionality with Hadoop metrics.
> ----------------------------------------------------------
>
>                 Key: HADOOP-13453
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13453
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Chris Nauroth
>            Assignee: Ai Deng
>         Attachments: HADOOP-13453-HADOOP-13345-001.patch, 
> HADOOP-13453-HADOOP-13345-002.patch
>
>
> Provide Hadoop metrics showing operational details of the S3Guard 
> implementation.
> The metrics will be implemented in this ticket:
> ● S3GuardRechecksNthPercentileLatency (MutableQuantiles) - Percentile time
>   spent in rechecks attempting to achieve consistency. Repeated for multiple
>   percentile values of N. This metric is an indicator of the additional
>   latency cost of running S3A with S3Guard.
> ● S3GuardRechecksNumOps (MutableQuantiles) - Number of times a consistency
>   recheck was required while attempting to achieve consistency.
> ● S3GuardStoreNthPercentileLatency (MutableQuantiles) - Percentile time
>   spent in operations against the consistent store, including both write
>   operations during file system mutations and read operations during file
>   system consistency checks. Repeated for multiple percentile values of N.
>   This metric is an indicator of latency to the consistent store
>   implementation.
> ● S3GuardConsistencyStoreNumOps (MutableQuantiles) - Number of operations
>   against the consistent store, including both write operations during file
>   system mutations and read operations during file system consistency checks.
> ● S3GuardConsistencyStoreFailures (MutableCounterLong) - Number of failures
>   during operations against the consistent store implementation.
> ● S3GuardConsistencyStoreTimeouts (MutableCounterLong) - Number of timeouts
>   during operations against the consistent store implementation.
> ● S3GuardInconsistencies (MutableCounterLong) - Count of times S3Guard
>   failed to achieve consistency, even after exhausting all rechecks. A high
>   count may indicate unexpected out-of-band modification of the S3 bucket
>   contents, such as by an external tool that does not make corresponding
>   updates to the consistent store.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

