lewismc opened a new pull request, #892: URL: https://github.com/apache/nutch/pull/892
PR for [NUTCH-3150](https://issues.apache.org/jira/browse/NUTCH-3150), which implements counter caching across all MapReduce jobs to eliminate repeated `context.getCounter()` lookups in hot paths.

Breaking this PR down...

* Counter caching is now implemented in 16 MapReduce classes using a standardized `initCounters(Context context)` pattern, which I think makes the code easier to read and allows for more intuitive future development around metrics. I saw @igiguere evolving metrics counters in https://github.com/apache/nutch/pull/891 which is excellent :)
* Migrated `DomainStatistics.java` from a custom enum to NutchMetrics constants with cached counters.
* Refactored inline counter initialization into dedicated `initCounters()` methods for consistency across:
  * Core crawl jobs: Fetcher, Generator, Injector, CrawlDbFilter, CrawlDbReducer
  * Post-processing: DeduplicationJob, CleaningJob, ParseSegment
  * Analytics: DomainStatistics, WebGraph, SitemapProcessor
  * HostDB: UpdateHostDbMapper, UpdateHostDbReducer, ResolverThread
  * Export: WARCExporter
  * Indexing: IndexerMapReduce

... the metrics journey continues.
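
For readers unfamiliar with the pattern, here is a minimal sketch of what caching counters in an `initCounters(Context context)` method can look like. It is not code from this PR: `Context` and `Counter` are simplified stand-ins for the Hadoop types so the example runs on its own, and the group and counter names (`sketch`, `records_total`, `records_filtered`) are hypothetical rather than real NutchMetrics constants.

```java
import java.util.HashMap;
import java.util.Map;

public class CounterCachingSketch {

  /** Simplified stand-in for a Hadoop counter (hypothetical, not the real API). */
  static final class Counter {
    private long value;
    void increment(long n) { value += n; }
    long getValue() { return value; }
  }

  /** Simplified stand-in for the task context; records how often getCounter() is called. */
  static final class Context {
    private final Map<String, Counter> counters = new HashMap<>();
    int lookups = 0;

    Counter getCounter(String group, String name) {
      lookups++;
      return counters.computeIfAbsent(group + ":" + name, k -> new Counter());
    }
  }

  /** Mapper-like class that resolves its counters once instead of on every record. */
  static final class CachingMapper {
    private Counter recordsCounter;
    private Counter filteredCounter;

    void setup(Context context) {
      initCounters(context);
    }

    // All counter lookups live here, so the hot path only touches cached fields.
    private void initCounters(Context context) {
      recordsCounter = context.getCounter("sketch", "records_total");
      filteredCounter = context.getCounter("sketch", "records_filtered");
    }

    void map(String url) {
      recordsCounter.increment(1);
      if (url.isEmpty()) {
        filteredCounter.increment(1);
      }
    }
  }

  public static void main(String[] args) {
    Context context = new Context();
    CachingMapper mapper = new CachingMapper();
    mapper.setup(context);
    for (String url : new String[] {"https://example.org/", "", "https://example.com/"}) {
      mapper.map(url);
    }
    // The number of lookups stays fixed no matter how many records are mapped.
    System.out.println("getCounter() calls: " + context.lookups);
  }
}
```

In this sketch the number of `getCounter()` calls depends only on how many counters the class declares, not on how many records pass through `map()`, which is the effect the caching is after.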

