This is an automated email from the ASF dual-hosted git repository.

snagel pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/nutch.git.

    from 873d7bf  Merge pull request #473 from sebastian-nagel/NUTCH-2381-text-prof-signature-lexicographic-sorting
     new f02c98e  NUTCH-2737 Generator: count and log reason of rejections during selection
                  - add counters for rejections in Generator's SelectorMapper
                  - parameterize log messages to simplify code
     new e46232d  NUTCH-2738 Generator: document property generate.restrict.status
                  - add generate.restrict.status to nutch-default.xml
                  - get status (byte) from status name in setConf() to speed up comparison in SelectorMapper
     new 8d21260  Generator: fix logging of hostdb path
     new 35da06f  NUTCH-2737 Generator: count and log reason of rejections during selection
                  - count rejections by `generate.max.count`
                    * number of hosts (resp. domains) affected
                    * number of URLs skipped total (for all hosts)
     new 44ded9b  Generator: apply formatting
     new 4d68c08  NUTCH-2740 Generator: generate.max.count overflow not logged
     new 2f310ae  Generator: improve description of crawl.gen.delay
     new a2762f0  Merge pull request #477 from sebastian-nagel/NUTCH-2737-generator-log-selection

The 2970 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 conf/nutch-default.xml                          |  17 +-
 src/java/org/apache/nutch/crawl/CrawlDatum.java |   9 +
 src/java/org/apache/nutch/crawl/Generator.java  | 837 ++++++++++++------------
 3 files changed, 456 insertions(+), 407 deletions(-)