[ https://issues.apache.org/jira/browse/NUTCH-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902588#comment-17902588 ]
Sebastian Nagel commented on NUTCH-3096:
----------------------------------------
+1 lgtm.
The bucketing code could be moved into its own method. ResolverThread.run()
contains a try-catch block with most of the code in the catch part, which is
not easy to read.
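
A minimal sketch of what such a helper might look like. The class name,
method names, bucket boundaries and counter prefix are all hypothetical and
not taken from the attached patch; the point is only that a bounded set of
labels keeps the number of distinct counter names bounded.

```java
// Hypothetical helper, not the code from NUTCH-3096.patch.
class FailureBuckets {

  // Map a raw DNS failure count to one of a few fixed bucket labels.
  // The boundaries are illustrative assumptions.
  static String bucketFor(int numFailures) {
    if (numFailures <= 0) {
      return "0";
    }
    if (numFailures == 1) {
      return "1";
    }
    if (numFailures <= 5) {
      return "2-5";
    }
    if (numFailures <= 10) {
      return "6-10";
    }
    if (numFailures <= 20) {
      return "11-20";
    }
    return "gt_20";
  }

  // Build the counter name from the bucket, so at most six distinct
  // names can ever be produced. The "failures_" prefix is an assumption.
  static String counterName(int numFailures) {
    return "failures_" + bucketFor(numFailures);
  }
}
```

With something like this, the catch block in run() would only need one call
to obtain the counter name, whatever the actual failure count is.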
> HostDB ResolverThread can create too many job counters
> ------------------------------------------------------
>
> Key: NUTCH-3096
> URL: https://issues.apache.org/jira/browse/NUTCH-3096
> Project: Nutch
> Issue Type: Bug
> Components: hostdb
> Affects Versions: 1.20
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Priority: Major
> Fix For: 1.21
>
> Attachments: NUTCH-3096-1.15.patch, NUTCH-3096.patch
>
>
> Hadoop allows no more than 120 distinct counters by default. If we have a
> large number of distinct DNS lookup failure counts, we exceed the limit,
> Hadoop complains, and the job fails.
>
> Let's limit the number of possibilities by grouping the numFailures into
> just a few buckets.