Lewis John McGibbney created NUTCH-3132:
-------------------------------------------

             Summary: Standardize existing Nutch metrics naming and 
implementation
                 Key: NUTCH-3132
                 URL: https://issues.apache.org/jira/browse/NUTCH-3132
             Project: Nutch
          Issue Type: Sub-task
          Components: metrics
            Reporter: Lewis John McGibbney
            Assignee: Lewis John McGibbney
             Fix For: 1.22


This task will create a centralized metrics constants class following 
[Prometheus naming conventions|https://prometheus.io/docs/practices/naming/], 
cache counter references to reduce lookup overhead, and standardize naming 
across the codebase. This affects 88 counter operations in 17 files.
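
As a rough illustration of the two ideas above (one constants class, plus counter references resolved once and reused), here is a minimal sketch. The class names {{NutchMetrics}}, {{CounterRegistry}} and {{FetcherCounters}} are hypothetical, the pairing of counter names with the fetcher and injector groups is an assumption, and {{CounterRegistry}} is only a stand-in for the Hadoop counter lookup so that the example runs without Hadoop on the classpath.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical constants holder; class and field names are assumptions. */
final class NutchMetrics {
  private NutchMetrics() {}

  // Counter groups, taken from the group mapping in this issue.
  static final String GROUP_FETCHER = "nutch_fetcher";
  static final String GROUP_INJECTOR = "nutch_injector";

  // Counter names, taken from the name mapping examples in this issue.
  // Which group each name belongs to is an assumption of this sketch.
  static final String FETCHER_BYTES_DOWNLOADED_TOTAL = "bytes_downloaded_total";
  static final String FETCHER_ROBOTS_DENIED_TOTAL = "robots_denied_total";
  static final String INJECTOR_URLS_INJECTED_TOTAL = "urls_injected_total";
}

/** Stand-in for a counter lookup by group and name (not a Nutch or Hadoop API). */
final class CounterRegistry {
  private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();

  /** Returns the counter for the given group and name, creating it on first use. */
  LongAdder getCounter(String group, String name) {
    return counters.computeIfAbsent(group + "." + name, key -> new LongAdder());
  }
}

/** Resolves its counters once in the constructor and reuses the references afterwards. */
final class FetcherCounters {
  private final LongAdder bytesDownloaded;
  private final LongAdder robotsDenied;

  FetcherCounters(CounterRegistry registry) {
    // The only lookups happen here; the per-record methods below do none.
    bytesDownloaded = registry.getCounter(
        NutchMetrics.GROUP_FETCHER, NutchMetrics.FETCHER_BYTES_DOWNLOADED_TOTAL);
    robotsDenied = registry.getCounter(
        NutchMetrics.GROUP_FETCHER, NutchMetrics.FETCHER_ROBOTS_DENIED_TOTAL);
  }

  void recordDownload(long bytes) {
    bytesDownloaded.add(bytes);
  }

  void recordRobotsDenied() {
    robotsDenied.increment();
  }
}
```

In a real job the cached fields would presumably hold the framework's own counter objects, obtained once during task setup instead of on every record.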

The naming rules follow [Prometheus best 
practices|https://prometheus.io/docs/practices/naming/]:
||Rule ||Convention||Example||
|Prefix|Application namespace {{nutch_}}|{{nutch_fetcher_...}}|
|Case|snake_case|{{robots_denied}} not {{RobotsDenied}}|
|Units|Base units as suffix|{{_bytes}}, {{_seconds}}|
|Counts|{{_total}} suffix|{{pages_fetched_total}}|
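
The rules in the table could be checked mechanically, for example in a unit test over the constants class. The following is a minimal sketch only: {{MetricNames}} and its methods are hypothetical helpers, and the regular expression is one possible reading of "snake_case", not something specified in this issue.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical helper that checks names against the rules in the table above. */
final class MetricNames {
  private MetricNames() {}

  // Lowercase words of letters and digits, separated by single underscores.
  private static final Pattern SNAKE_CASE =
      Pattern.compile("[a-z][a-z0-9]*(_[a-z0-9]+)*");

  private static final String NAMESPACE = "nutch_";

  /** Converts CamelCase or UPPER_CASE to snake_case, e.g. RobotsDenied to robots_denied. */
  static String toSnakeCase(String name) {
    return name.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase(Locale.ROOT);
  }

  /** A group must be snake_case and carry the application namespace prefix. */
  static boolean isValidGroup(String group) {
    return group.startsWith(NAMESPACE) && SNAKE_CASE.matcher(group).matches();
  }

  /** A count must be snake_case and end in the _total suffix. */
  static boolean isValidCountName(String name) {
    return name.endsWith("_total") && SNAKE_CASE.matcher(name).matches();
  }
}
```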

 
h3. Counter Group Mapping
||Current||Prometheus-style||
|{{FetcherStatus}}|{{nutch_fetcher}}|
|{{Generator}}|{{nutch_generator}}|
|{{IndexerStatus}}|{{nutch_indexer}}|
|{{CrawlDB status}}|{{nutch_crawldb}}|
|{{injector}}|{{nutch_injector}}|
|{{UpdateHostDb}}|{{nutch_hostdb}}|
|{{ParserStatus}}|{{nutch_parser}}|

h3. Counter Name Mapping (examples)
||Current||Prometheus||
|{{bytes_downloaded}}|{{bytes_downloaded_total}}|
|{{robots_denied}}|{{robots_denied_total}}|
|{{URL_FILTERS_REJECTED}}|{{urls_filtered_total}}|
|{{urls_injected}}|{{urls_injected_total}}|
|{{indexed (add/update)}}|{{docs_indexed_total}}|
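
For dashboards or scripts that still refer to the old names, the two mapping tables could be kept as lookup tables during the transition. This is a sketch under assumptions: {{LegacyMetricNames}} is a hypothetical class, the entries are copied verbatim from the tables in this issue, and falling back to the input for unknown names is a choice made here for illustration.

```java
import java.util.Map;

/** Hypothetical lookup from the current names to the Prometheus-style names. */
final class LegacyMetricNames {
  private LegacyMetricNames() {}

  // Entries copied from the "Counter Group Mapping" table.
  static final Map<String, String> GROUPS = Map.of(
      "FetcherStatus", "nutch_fetcher",
      "Generator", "nutch_generator",
      "IndexerStatus", "nutch_indexer",
      "CrawlDB status", "nutch_crawldb",
      "injector", "nutch_injector",
      "UpdateHostDb", "nutch_hostdb",
      "ParserStatus", "nutch_parser");

  // Entries copied from the "Counter Name Mapping (examples)" table.
  static final Map<String, String> COUNTERS = Map.of(
      "bytes_downloaded", "bytes_downloaded_total",
      "robots_denied", "robots_denied_total",
      "URL_FILTERS_REJECTED", "urls_filtered_total",
      "urls_injected", "urls_injected_total",
      "indexed (add/update)", "docs_indexed_total");

  /** Returns the new group name, or the input unchanged if it is not in the table. */
  static String group(String legacy) {
    return GROUPS.getOrDefault(legacy, legacy);
  }

  /** Returns the new counter name, or the input unchanged if it is not in the table. */
  static String counter(String legacy) {
    return COUNTERS.getOrDefault(legacy, legacy);
  }
}
```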

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
