Sebastian Nagel created NUTCH-3162:
--------------------------------------
Summary: Latency metrics to properly merge data from all threads
and tasks
Key: NUTCH-3162
URL: https://issues.apache.org/jira/browse/NUTCH-3162
Project: Nutch
Issue Type: Bug
Affects Versions: 1.22
Reporter: Sebastian Nagel
Fix For: 1.23
The latency metrics (NUTCH-3134) have two issues:
1. Only the data from one thread is used if a tool is multi-threaded.
That's definitely the case for Fetcher. The "emitCounters" methods need to
increment the counter values instead of calling "setValue". However, this is
not the correct approach for the percentiles, see also the next point.
2. When running in full cluster mode with multiple parallel tasks, the task
counters are summed up to the job counter value. However, percentiles are not
additive, so the summed values of the latency percentiles turn out to be too high.
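To illustrate the second point, here is a minimal sketch in plain Java (no Hadoop or Nutch classes). It compares the sum of per-task percentiles, which is what a summed counter would report, with the percentile over the merged samples. The class name, method names and sample latencies are hypothetical and only serve the illustration; this is not the actual Nutch implementation.

```java
import java.util.Arrays;

/** Illustration only: hypothetical names, made-up latency samples (ms). */
final class LatencyMergeSketch {

  /** Nearest-rank percentile (p in (0, 100]) of the given samples. */
  static long percentile(long[] samples, double p) {
    long[] sorted = samples.clone();
    Arrays.sort(sorted);
    int rank = (int) Math.ceil(p / 100.0 * sorted.length);
    return sorted[Math.max(rank, 1) - 1];
  }

  /** What summing per-task percentile counters effectively computes. */
  static long sumOfPercentiles(long[][] tasks, double p) {
    long sum = 0;
    for (long[] task : tasks) {
      sum += percentile(task, p);
    }
    return sum;
  }

  /** Percentile computed over the merged samples of all tasks. */
  static long percentileOfMerged(long[][] tasks, double p) {
    long[] merged = Arrays.stream(tasks).flatMapToLong(Arrays::stream).toArray();
    return percentile(merged, p);
  }

  public static void main(String[] args) {
    long[][] tasks = { {100, 120, 140, 160}, {110, 130, 150, 170} };
    System.out.println(sumOfPercentiles(tasks, 50));   // 120 + 130 = 250
    System.out.println(percentileOfMerged(tasks, 50)); // 130
  }
}
```

With two tasks of similar latency, the summed median is roughly twice the median of the merged data, and the error grows with the number of tasks.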
--
This message was sent by Atlassian Jira
(v8.20.10#820010)