Hi dev@,

*What?*
I've been chipping away at documentation intended as a one-stop shop for
understanding Nutch metrics. My first pass is available at
https://cwiki.apache.org/confluence/display/NUTCH/Metrics
It relates to the JIRA issue I recently filed about establishing a Nutch
metrics convention:
https://issues.apache.org/jira/browse/NUTCH-2909

*My ask*
It is probably unrealistic to ask for a full review of the metrics
documentation given its length; I expect it will be more useful as
reference material further down the line. My request is therefore simple:
if you are able to take a look at NUTCH-2909, it would be greatly
appreciated.

*My intention*
Depending on the feedback I get, I intend to
1. begin by applying the naming convention across all existing Hadoop
counters, updating the above documentation as I go;
2. prototype and experiment with sending Nutch metrics to systems like
StatsD (https://github.com/apache/nutch/pull/712) and Prometheus;
3. based on step 2, roll out a full-blown metrics implementation which would
allow expressive metrics to be collected in a dedicated metrics system.

The overall goal is to help Nutch administrators by giving them better
insight into how Nutch runs in production. This is a fairly large task, so
it will most likely take a couple more months.

lewismc


-- 
http://home.apache.org/~lewismc/
http://people.apache.org/keys/committer/lewismc
