[
https://issues.apache.org/jira/browse/NUTCH-3132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18044242#comment-18044242
]
ASF GitHub Bot commented on NUTCH-3132:
---------------------------------------
lewismc commented on PR #871:
URL: https://github.com/apache/nutch/pull/871#issuecomment-3638717426
One thing I neglected to mention, @sebastian-nagel (and anyone else): this PR
proposes backwards-incompatible changes to the Nutch metrics system. For
example, anyone who already consumes the Hadoop metrics system directly, or
mines metrics from Nutch logs, would find that their logic breaks. The upside
is that the metric names should now be much clearer.
Thanks for the review.
> Standardize existing Nutch metrics naming and implementation
> ------------------------------------------------------------
>
> Key: NUTCH-3132
> URL: https://issues.apache.org/jira/browse/NUTCH-3132
> Project: Nutch
> Issue Type: Sub-task
> Components: metrics
> Reporter: Lewis John McGibbney
> Assignee: Lewis John McGibbney
> Priority: Major
> Fix For: 1.22
>
>
> This task will create a centralized metrics constants class following
> [Prometheus naming conventions|https://prometheus.io/docs/practices/naming/],
> cache counter references to reduce lookup overhead, and standardize naming
> across the codebase. This affects 88 counter operations in 17 files.
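
To illustrate the "cache counter references" idea, here is a minimal sketch. The `Registry` class is a hypothetical stand-in for a counter store, not the Hadoop API (in a real job the lookup would presumably be `context.getCounter(group, name)`), and `FetcherTask` is an illustrative name rather than a Nutch class. The point is only that the lookup happens once and the hot path increments a held reference.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Illustrative sketch only; Registry is a stand-in, not the Hadoop counter API. */
final class CachedCounterSketch {

    /** Hypothetical counter store keyed by group and counter name. */
    static final class Registry {
        private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();
        int lookups = 0; // counts how often a lookup by name happens

        LongAdder getCounter(String group, String name) {
            lookups++;
            return counters.computeIfAbsent(group + "." + name, k -> new LongAdder());
        }
    }

    /** Hypothetical task that resolves its counter once and reuses the reference. */
    static final class FetcherTask {
        private final LongAdder robotsDenied;

        FetcherTask(Registry registry) {
            // Single lookup at construction time instead of one per increment.
            this.robotsDenied = registry.getCounter("nutch_fetcher", "robots_denied_total");
        }

        void onRobotsDenied() {
            robotsDenied.increment(); // no name-based lookup on the hot path
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        FetcherTask task = new FetcherTask(registry);
        for (int i = 0; i < 1000; i++) {
            task.onRobotsDenied();
        }
        System.out.println("lookups so far: " + registry.lookups);
    }
}
```

Whether the saved lookups matter in practice depends on how often each counter is incremented per record; the sketch does not measure that.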
> Following [Prometheus best
> practices|https://prometheus.io/docs/practices/naming/]:
>
> ||Rule ||Convention||Example||
> |Prefix|Application namespace {{nutch_}}|{{nutch_fetcher_...}}|
> |Units|Base units as suffix|{{_bytes}}, {{_seconds}}|
> |Counts|{{_total}} suffix|{{pages_fetched_total}}|
> |Case|snake_case|{{robots_denied}} not {{RobotsDenied}}|
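
The rules in the table above could be checked mechanically. The following is a rough sketch, not code from the PR: `MetricNameRules` and both of its methods are hypothetical, and treating a full counter name as one string with the `nutch_` prefix and `_total` suffix is an assumption made for illustration.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Nutch: applies the naming rules from the table. */
final class MetricNameRules {
    // Lowercase words made of letters and digits, joined by single underscores.
    private static final Pattern SNAKE_CASE =
        Pattern.compile("[a-z][a-z0-9]*(_[a-z0-9]+)*");

    private MetricNameRules() {}

    /** Converts a CamelCase name such as "RobotsDenied" to "robots_denied". */
    static String toSnakeCase(String name) {
        // Insert an underscore wherever a lowercase letter or digit meets an uppercase letter.
        return name.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase(Locale.ROOT);
    }

    /** True if the name is snake_case, starts with "nutch_" and ends with "_total". */
    static boolean isConventionalCounter(String fullName) {
        return SNAKE_CASE.matcher(fullName).matches()
            && fullName.startsWith("nutch_")
            && fullName.endsWith("_total");
    }

    public static void main(String[] args) {
        System.out.println(toSnakeCase("RobotsDenied"));
        System.out.println(isConventionalCounter("nutch_fetcher_robots_denied_total"));
    }
}
```

Names containing spaces or punctuation, such as {{CrawlDB status}}, are outside what this conversion handles and would need an explicit mapping.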
> h2. Counter Group Mapping
> ||Current||Prometheus-style||
> |{{FetcherStatus}}|{{nutch_fetcher}}|
> |{{Generator}}|{{nutch_generator}}|
> |{{IndexerStatus}}|{{nutch_indexer}}|
> |{{CrawlDB status}}|{{nutch_crawldb}}|
> |{{injector}}|{{nutch_injector}}|
> |{{UpdateHostDb}}|{{nutch_hostdb}}|
> |{{ParserStatus}}|{{nutch_parser}}|
> h2. Counter Name Mapping (examples)
> ||Current||Prometheus||
> |{{bytes_downloaded}}|{{bytes_downloaded_total}}|
> |{{robots_denied}}|{{robots_denied_total}}|
> |{{URL_FILTERS_REJECTED}}|{{urls_filtered_total}}|
> |{{urls_injected}}|{{urls_injected_total}}|
> |{{indexed (add/update)}}|{{docs_indexed_total}}|
>
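
For anyone whose tooling reads the old names, a small translation table may ease the transition. The sketch below copies only the group and counter pairs listed in the two mapping tables above; the class `LegacyMetricNames`, the `translate` method, and joining group and counter with an underscore are assumptions for illustration, not something the PR provides. The tables also do not say which counter belongs to which group, so the caller supplies both.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical lookup from legacy names to the new ones; not part of Nutch. */
final class LegacyMetricNames {
    // Group mapping, copied from the "Counter Group Mapping" table.
    static final Map<String, String> GROUPS = Map.of(
        "FetcherStatus", "nutch_fetcher",
        "Generator", "nutch_generator",
        "IndexerStatus", "nutch_indexer",
        "CrawlDB status", "nutch_crawldb",
        "injector", "nutch_injector",
        "UpdateHostDb", "nutch_hostdb",
        "ParserStatus", "nutch_parser");

    // Counter mapping, copied from the "Counter Name Mapping (examples)" table.
    static final Map<String, String> COUNTERS = Map.of(
        "bytes_downloaded", "bytes_downloaded_total",
        "robots_denied", "robots_denied_total",
        "URL_FILTERS_REJECTED", "urls_filtered_total",
        "urls_injected", "urls_injected_total",
        "indexed (add/update)", "docs_indexed_total");

    private LegacyMetricNames() {}

    /** Returns "group_counter" in the new style, or empty if either name is unknown. */
    static Optional<String> translate(String legacyGroup, String legacyCounter) {
        String group = GROUPS.get(legacyGroup);
        String counter = COUNTERS.get(legacyCounter);
        if (group == null || counter == null) {
            return Optional.empty(); // unknown names are reported, never guessed
        }
        return Optional.of(group + "_" + counter);
    }

    public static void main(String[] args) {
        System.out.println(translate("FetcherStatus", "robots_denied").orElse("unmapped"));
    }
}
```

Returning an empty result for unlisted names keeps the shim honest: the tables above are examples, so a complete mapping would have to come from the PR itself.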