Michael Coffey created NUTCH-2574:
-------------------------------------
Summary: hostCount >= maxCount comparison wrong
Key: NUTCH-2574
URL: https://issues.apache.org/jira/browse/NUTCH-2574
Project: Nutch
Issue Type: Bug
Components: generator
Affects Versions: 1.13
Reporter: Michael Coffey
In the Generator.Selector.reduce function, there is a comparison of
hostCount[1] to maxCount, to determine whether or not to push the current URL
to the next segment. The purpose is to honor generate.max.count.
Sebastian noticed that it should test if (hostCount[1] > maxCount) rather than
if (hostCount[1] >= maxCount). As it stands, the code sometimes puts one fewer
URL into a segment than generate.max.count allows.
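
The off-by-one can be illustrated with a small standalone sketch. This is not
the Nutch code: the class and method names below are hypothetical, and the
sketch assumes the per-host counter is incremented before it is compared to
maxCount. Under that assumption, ">=" cuts the host off one URL early.

```java
// Simplified model of a per-host limit check (hypothetical, not Nutch source).
class MaxCountSketch {

    // Number of URLs from one host accepted into a segment when the
    // check is "count >= maxCount" (the comparison reported in this issue).
    static int acceptedWithGreaterOrEqual(int urls, int maxCount) {
        int hostCount = 0;
        int accepted = 0;
        for (int i = 0; i < urls; i++) {
            hostCount++;                 // assumption: increment happens before the check
            if (hostCount >= maxCount) {
                break;                   // URL would be pushed to the next segment
            }
            accepted++;
        }
        return accepted;
    }

    // Same loop with the proposed "count > maxCount" comparison.
    static int acceptedWithGreater(int urls, int maxCount) {
        int hostCount = 0;
        int accepted = 0;
        for (int i = 0; i < urls; i++) {
            hostCount++;
            if (hostCount > maxCount) {
                break;
            }
            accepted++;
        }
        return accepted;
    }

    public static void main(String[] args) {
        // 10 candidate URLs from one host, limit of 3 per segment
        System.out.println(acceptedWithGreaterOrEqual(10, 3)); // prints 2
        System.out.println(acceptedWithGreater(10, 3));        // prints 3
    }
}
```

With a limit of 3, the ">=" variant accepts only 2 URLs in this model, while
">" accepts the full 3.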