Sebastian Nagel created NUTCH-3162:
--------------------------------------

             Summary: Latency metrics to properly merge data from all threads 
and tasks
                 Key: NUTCH-3162
                 URL: https://issues.apache.org/jira/browse/NUTCH-3162
             Project: Nutch
          Issue Type: Bug
    Affects Versions: 1.22
            Reporter: Sebastian Nagel
             Fix For: 1.23


The latency metrics (NUTCH-3134) have two issues:

1. If a tool is multi-threaded, only the data from one thread is used. 
That's definitely the case for Fetcher. The "emitCounters" methods need to 
increment the counter values instead of calling "setValue". However, 
incrementing is not the correct approach for the percentiles, see also the 
next point.

2. When running in full cluster mode with multiple parallel tasks, the task 
counters are summed up to the job counter value. However, percentiles are not 
additive, so the summed values of the latency percentiles turn out to be too 
high.
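
Both points can be illustrated with a minimal, self-contained Java sketch. It 
is not the Nutch code: "SimpleCounter" is a hypothetical stand-in for a job 
counter that only mimics "setValue" and "increment", the sample latencies are 
made up, and the nearest-rank percentile is just one possible definition. 
Merging the raw samples before computing the percentile is shown as one 
conceivable direction, not necessarily the fix chosen for this issue.

```java
import java.util.Arrays;

final class LatencyMergeSketch {

  /** Hypothetical stand-in for a job counter (not the Hadoop class). */
  static final class SimpleCounter {
    private long value;
    void setValue(long v) { value = v; }
    void increment(long delta) { value += delta; }
    long getValue() { return value; }
  }

  /** Every thread calls setValue: the last writer overwrites the others. */
  static long emitWithSetValue(long[] perThreadCounts) {
    SimpleCounter counter = new SimpleCounter();
    for (long n : perThreadCounts) {
      counter.setValue(n);
    }
    return counter.getValue();
  }

  /** Every thread calls increment: the contributions accumulate. */
  static long emitWithIncrement(long[] perThreadCounts) {
    SimpleCounter counter = new SimpleCounter();
    for (long n : perThreadCounts) {
      counter.increment(n);
    }
    return counter.getValue();
  }

  /** Nearest-rank percentile of a non-empty sample, percent in 1..100. */
  static long percentile(long[] samples, int percent) {
    long[] sorted = samples.clone();
    Arrays.sort(sorted);
    // Integer form of ceil(percent / 100 * n), avoids floating-point rounding.
    int rank = (percent * sorted.length + 99) / 100;
    return sorted[Math.max(rank, 1) - 1];
  }

  /** Concatenates the raw samples of two tasks (or threads). */
  static long[] merge(long[] a, long[] b) {
    long[] out = Arrays.copyOf(a, a.length + b.length);
    System.arraycopy(b, 0, out, a.length, b.length);
    return out;
  }

  public static void main(String[] args) {
    // Point 1: per-thread request counts (made-up numbers).
    long[] counts = {120, 80, 100};
    System.out.println("setValue:  " + emitWithSetValue(counts));
    System.out.println("increment: " + emitWithIncrement(counts));

    // Point 2: per-task latencies in milliseconds (made-up numbers).
    long[] task1 = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
    long[] task2 = {15, 25, 35, 45, 55, 65, 75, 85, 95, 105};
    long summed = percentile(task1, 90) + percentile(task2, 90);
    long merged = percentile(merge(task1, task2), 90);
    System.out.println("p90 summed over tasks: " + summed);
    System.out.println("p90 of merged samples: " + merged);
  }
}
```

With these sample values, "setValue" keeps only the last thread's count (100 
instead of 300), and summing the two per-task p90 values yields 185 ms although 
no single request took longer than 105 ms; the p90 of the merged samples is 
95 ms.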



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
