Sebastian Nagel created NUTCH-3059:
--------------------------------------
Summary: Generator: selector job does not count reduce output records
Key: NUTCH-3059
URL: https://issues.apache.org/jira/browse/NUTCH-3059
Project: Nutch
Issue Type: Bug
Components: generator
Affects Versions: 1.20
Reporter: Sebastian Nagel
Fix For: 1.21
The selector step (job) of the Generator does not count the reduce output
records; the counter "Reduce output records" shows the count "0":
{noformat}
2024-06-05 13:57:09,299 INFO o.a.n.c.Generator [main] Generator: starting
2024-06-05 13:57:09,299 INFO o.a.n.c.Generator [main] Generator: selecting best-scoring urls due for fetch.
...
Map-Reduce Framework
Map input records=6
Map output records=6
...
Combine input records=0
Combine output records=0
Reduce input groups=1
Reduce shuffle bytes=594
Reduce input records=6
Reduce output records=0
Spilled Records=12
...
{noformat}
This is not a big issue, but we should investigate why it happens. The other
counters seem to work properly, and the partitioner job does show the reduce
output records. The issue is observed in both local and distributed mode.
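
To make the symptom easy to spot in job logs, the check could be scripted. The following is a minimal sketch, not part of Nutch: the names parse_counters and reduce_output_looks_uncounted are hypothetical, and it assumes the counters are printed one per line as "Name=value", as in the excerpt above. A reducer may legitimately emit nothing, so this is only a heuristic.

```python
import re

# Matches counter lines of the form "Name=123" (assumed log layout).
COUNTER_RE = re.compile(r"^\s*([A-Za-z][A-Za-z \-]*?)=(\d+)\s*$")


def parse_counters(log_text):
    """Return a dict mapping counter names to integer values."""
    counters = {}
    for line in log_text.splitlines():
        match = COUNTER_RE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def reduce_output_looks_uncounted(counters):
    """Heuristic: the reducer received records but reported none written."""
    return (
        counters.get("Reduce input records", 0) > 0
        and counters.get("Reduce output records", 0) == 0
    )


# Counter lines copied from the log excerpt in this issue.
SAMPLE = """\
Map-Reduce Framework
Map input records=6
Map output records=6
Combine input records=0
Combine output records=0
Reduce input groups=1
Reduce shuffle bytes=594
Reduce input records=6
Reduce output records=0
Spilled Records=12
"""

if __name__ == "__main__":
    print(reduce_output_looks_uncounted(parse_counters(SAMPLE)))
```

For the counters quoted in this issue the check returns True, since 6 records enter the reducer while 0 are reported as written.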