lewismc opened a new pull request, #876:
URL: https://github.com/apache/nutch/pull/876

   PR for [NUTCH-3134](https://issues.apache.org/jira/browse/NUTCH-3134). 
Notably, this PR introduces a new class, `LatencyTracker`, which 
tracks latency metrics. The implementation wraps the TDigest data structure to 
collect latency samples and emit Hadoop counters with count, sum, and 
percentile values (p50, p95, p99). For now this is limited to the Fetcher, 
Parser, and Indexer jobs, but it could be extended to other jobs in the 
future.
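
For readers who have not opened the diff, here is a minimal sketch of the record-then-emit pattern described above. It is not the code from this PR: the class name `LatencySketch`, the counter suffixes (`_count`, `_sum_ms`, `_p50_ms`, ...) and the `Map` standing in for Hadoop counters are all hypothetical. To stay dependency-free it keeps every sample and computes exact nearest-rank percentiles, whereas the PR uses a TDigest estimate.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for illustration; not the LatencyTracker class from the PR. */
class LatencySketch {
  private final String prefix;                       // assumed counter name prefix
  private final List<Long> samples = new ArrayList<>();
  private long sumMillis;

  LatencySketch(String prefix) {
    this.prefix = prefix;
  }

  /** Records one latency sample in milliseconds. */
  void record(long millis) {
    samples.add(millis);
    sumMillis += millis;
  }

  /** Exact nearest-rank percentile; the PR relies on a TDigest estimate instead. */
  long percentile(int pct) {
    if (samples.isEmpty()) {
      return 0L;
    }
    List<Long> sorted = new ArrayList<>(samples);
    Collections.sort(sorted);
    int rank = (pct * sorted.size() + 99) / 100;     // integer ceil(pct * n / 100)
    return sorted.get(Math.max(rank, 1) - 1);
  }

  /** Builds the long-valued counters; a plain Map replaces the Hadoop counter API here. */
  Map<String, Long> emit() {
    Map<String, Long> counters = new LinkedHashMap<>();
    counters.put(prefix + "_count", (long) samples.size());
    counters.put(prefix + "_sum_ms", sumMillis);
    counters.put(prefix + "_p50_ms", percentile(50));
    counters.put(prefix + "_p95_ms", percentile(95));
    counters.put(prefix + "_p99_ms", percentile(99));
    return counters;
  }

  public static void main(String[] args) {
    LatencySketch fetch = new LatencySketch("fetch_latency");
    long start = System.nanoTime();                  // start boundary
    try {
      Thread.sleep(5);                               // placeholder for the timed work
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      // Stop boundary: recorded in finally so failures are timed as well.
      fetch.record((System.nanoTime() - start) / 1_000_000L);
    }
    // Emit once, after all samples are in (for example at task cleanup).
    fetch.emit().forEach((name, value) -> System.out.println(name + "=" + value));
  }
}
```

Hadoop counters hold `long` values, which is why the percentiles are emitted as whole milliseconds in this sketch. Emitting once at the end matters because a percentile is only meaningful over the full set of samples for the task.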
   
   A note for reviewers: please sanity-check that
   
   1. latency start and stop boundaries are accurate.
   2. counters are emitted at the correct times.
   
   Thanks for any review. Local testing looks good. My next step will be to 
share my WIP Nutch observability solution via user@.
