Lewis John McGibbney created NUTCH-3132:
-------------------------------------------
Summary: Standardize existing Nutch metrics naming and implementation
Key: NUTCH-3132
URL: https://issues.apache.org/jira/browse/NUTCH-3132
Project: Nutch
Issue Type: Sub-task
Components: metrics
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 1.22
This task will create a centralized metrics constants class following
[Prometheus naming conventions|https://prometheus.io/docs/practices/naming/],
cache counter references to reduce lookup overhead, and standardize naming
across the codebase. This affects 88 counter operations in 17 files.
The naming rules, following [Prometheus best practices|https://prometheus.io/docs/practices/naming/]:
||Rule||Convention||Example||
|Prefix|Application namespace {{nutch_}}|{{nutch_fetcher_...}}|
|Case|snake_case|{{robots_denied}}, not {{RobotsDenied}}|
|Units|Base units as suffix|{{_bytes}}, {{_seconds}}|
|Counts|{{_total}} suffix|{{pages_fetched_total}}|
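The rules in the table above can be checked mechanically. The following is a minimal sketch, not the planned implementation: the class {{MetricNames}} and its methods are hypothetical, and it assumes the full metric name is the group name and the counter name joined by an underscore.

```java
import java.util.regex.Pattern;

/** Hypothetical helper; the issue does not prescribe this class or its methods. */
final class MetricNames {
  // nutch_ prefix, lowercase snake_case segments, and a trailing _total for counters.
  private static final Pattern COUNTER =
      Pattern.compile("nutch_[a-z][a-z0-9]*(_[a-z0-9]+)*_total");

  private MetricNames() {}

  /** Returns true if the full name follows the prefix, case, and suffix rules for counters. */
  static boolean isValidCounterName(String name) {
    return COUNTER.matcher(name).matches();
  }

  /** Assumption: the full name is the group and the counter name joined by '_'. */
  static String fullName(String group, String counter) {
    return group + "_" + counter;
  }
}
```

A check like this could run in a unit test over all declared constants, so that a name such as {{nutch_fetcher_RobotsDenied}} is rejected before it reaches a dashboard.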
h3. Counter Group Mapping
||Current||Prometheus-style||
|{{FetcherStatus}}|{{nutch_fetcher}}|
|{{Generator}}|{{nutch_generator}}|
|{{IndexerStatus}}|{{nutch_indexer}}|
|{{CrawlDB status}}|{{nutch_crawldb}}|
|{{injector}}|{{nutch_injector}}|
|{{UpdateHostDb}}|{{nutch_hostdb}}|
|{{ParserStatus}}|{{nutch_parser}}|
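As an illustration of how the group mapping above might look in a centralized constants class, here is a minimal sketch. The class name {{NutchMetrics}}, the constant names, and the {{LEGACY_GROUPS}} lookup table are assumptions made for this example. Only the string values come from the table.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical constants holder; names of the class and fields are illustrative. */
final class NutchMetrics {
  static final String GROUP_FETCHER = "nutch_fetcher";
  static final String GROUP_GENERATOR = "nutch_generator";
  static final String GROUP_INDEXER = "nutch_indexer";
  static final String GROUP_CRAWLDB = "nutch_crawldb";
  static final String GROUP_INJECTOR = "nutch_injector";
  static final String GROUP_HOSTDB = "nutch_hostdb";
  static final String GROUP_PARSER = "nutch_parser";

  /** Current group name mapped to its Prometheus-style replacement. */
  static final Map<String, String> LEGACY_GROUPS;

  static {
    Map<String, String> m = new LinkedHashMap<>();
    m.put("FetcherStatus", GROUP_FETCHER);
    m.put("Generator", GROUP_GENERATOR);
    m.put("IndexerStatus", GROUP_INDEXER);
    m.put("CrawlDB status", GROUP_CRAWLDB);
    m.put("injector", GROUP_INJECTOR);
    m.put("UpdateHostDb", GROUP_HOSTDB);
    m.put("ParserStatus", GROUP_PARSER);
    LEGACY_GROUPS = Collections.unmodifiableMap(m);
  }

  private NutchMetrics() {}
}
```

Keeping the old-to-new mapping next to the constants would make it easier to document the rename for anyone with tooling that reads the current counter group names.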
h3. Counter Name Mapping (examples)
||Current||Prometheus||
|{{bytes_downloaded}}|{{bytes_downloaded_total}}|
|{{robots_denied}}|{{robots_denied_total}}|
|{{URL_FILTERS_REJECTED}}|{{urls_filtered_total}}|
|{{urls_injected}}|{{urls_injected_total}}|
|{{indexed (add/update)}}|{{docs_indexed_total}}|
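To illustrate the counter-reference caching mentioned in the description, here is a minimal sketch under stated assumptions. {{CounterCache}} is a hypothetical class, and the lookup function stands in for whatever resolves a counter by group and name. In a Hadoop task that would presumably be {{context.getCounter(group, name)}}, but the sketch avoids the Hadoop dependency so it stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

/**
 * Hypothetical cache of counter handles: the (group, name) lookup runs once
 * per counter, and later calls reuse the cached reference.
 * Not thread-safe; assumes one instance per task thread.
 */
final class CounterCache<C> {
  private final BiFunction<String, String, C> lookup;
  private final Map<String, C> cache = new HashMap<>();

  CounterCache(BiFunction<String, String, C> lookup) {
    this.lookup = lookup;
  }

  /** Returns the cached counter, resolving it through the lookup on first use. */
  C get(String group, String name) {
    // NUL separator keeps ("a_b", "c") and ("a", "b_c") from colliding.
    String key = group + '\u0000' + name;
    return cache.computeIfAbsent(key, k -> lookup.apply(group, name));
  }
}
```

With names from the tables above, a call site might read {{cache.get("nutch_fetcher", "robots_denied_total")}} and then increment the returned counter. Holding the counter in a field initialized once during task setup would be an equally plausible design and avoids even the map lookup.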
--
This message was sent by Atlassian Jira
(v8.20.10#820010)