Sebastian Nagel created NUTCH-2364:
--------------------------------------

             Summary: http.agent.rotate: IllegalArgumentException / last element of agent names ignored
                 Key: NUTCH-2364
                 URL: https://issues.apache.org/jira/browse/NUTCH-2364
             Project: Nutch
          Issue Type: Bug
          Components: protocol
    Affects Versions: 1.12, 1.11, 1.10
            Reporter: Sebastian Nagel
            Priority: Minor
             Fix For: 1.13


With http.agent.rotate == true and a one-element agent name list, the following 
exception is thrown:
{noformat}
% cat .../conf/agents.txt
my-test-crawler/Nutch-1.13
% .../bin/nutch parsechecker -Dhttp.agent.rotate=true http://nutch.apache.org/
...
Fetch failed with protocol status: exception(16), lastModified=0: java.lang.IllegalArgumentException: bound must be positive
% cat .../logs/hadoop.log
...
2017-03-03 11:17:19,750 ERROR http.Http - Failed to get protocol output
java.lang.IllegalArgumentException: bound must be positive
        at java.util.concurrent.ThreadLocalRandom.nextInt(ThreadLocalRandom.java:352)
        at org.apache.nutch.protocol.http.api.HttpBase.getUserAgent(HttpBase.java:379)
        at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:180)
...
{noformat}

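This is caused by the following expression in HttpBase.getUserAgent():
{code}
userAgentNames.get(ThreadLocalRandom.current().nextInt(userAgentNames.size()-1));
{code}
However, nextInt(...) is defined as: "Returns a pseudorandom int value between zero 
(inclusive) and the specified bound (exclusive)." With a single agent name the bound 
is 0, which triggers the exception. With more than one agent name the last element 
of the list is never selected.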
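
A minimal, self-contained sketch of the off-by-one, for illustration only. The class and method names (AgentRotationSketch, pickAgentBuggy, pickAgent) are hypothetical and not part of Nutch; the "fixed" variant simply passes size() as the bound, which is one possible correction and not necessarily the patch that will be committed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical stand-alone class; it only mirrors the expression quoted above.
public class AgentRotationSketch {

    // Same bound as in the report: size() - 1.
    // For a one-element list the bound is 0 and nextInt throws
    // IllegalArgumentException; for longer lists the last index is never drawn.
    public static String pickAgentBuggy(List<String> userAgentNames) {
        return userAgentNames.get(
            ThreadLocalRandom.current().nextInt(userAgentNames.size() - 1));
    }

    // Assumed fix: the bound is exclusive, so size() covers indices 0..size()-1.
    // An empty list would still make nextInt throw and needs separate handling.
    public static String pickAgent(List<String> userAgentNames) {
        return userAgentNames.get(
            ThreadLocalRandom.current().nextInt(userAgentNames.size()));
    }

    public static void main(String[] args) {
        List<String> one = Arrays.asList("my-test-crawler/Nutch-1.13");
        System.out.println("fixed variant picked: " + pickAgent(one));
        try {
            pickAgentBuggy(one);
        } catch (IllegalArgumentException e) {
            System.out.println("buggy variant failed: " + e.getMessage());
        }
    }
}
```

With a two-element list, pickAgentBuggy always returns the first element, which corresponds to the "last element of agent names ignored" part of the summary.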



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
