Have you tried setting that property in conf/nutch-default.xml?
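
In case it helps, here is a minimal sketch of what that entry could look like. It assumes you keep the value mini3 from your nutch-site.xml; I have left out the description text, and the comment is only illustrative:

```xml
<property>
  <name>http.agent.name</name>
  <!-- assumed value, copied from the nutch-site.xml quoted below; must not be empty -->
  <value>mini3</value>
</property>
```

The entry has to sit inside the top-level <configuration> element of the file, next to the other <property> blocks.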

-- Khang

On Fri, Oct 29, 2010 at 2:08 PM, Matthew Stevens <[email protected]> wrote:

> Running the following command:
>
> ./bin/nutch crawl urls -dir crawl.test -depth 3 >& crawl.log
>
> Generates the following text in crawl.log
>
> Fetcher: No agents listed in 'http.agent.name' property.
>
> Exception in thread "main" java.lang.IllegalArgumentException: Fetcher: No agents listed in 'http.agent.name' property.
>     at org.apache.nutch.fetcher.Fetcher.checkConfiguration(Fetcher.java:1166)
>     at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1068)
>     at org.apache.nutch.crawl.Crawl.main(Crawl.java:133)
>
>
> My questions are:
>
> Is the property being referred to supposed to be the one listed in
> *nutch-site.xml*? If so, then the XML value is:
>
> <property>
>  <name>http.agent.name</name>
>  <value>mini3</value>
>  <description>HTTP 'User-Agent' request header. MUST NOT be empty -
>  please set this to a single word uniquely related to your organization.
>
>  NOTE: You should also check other related properties:
>
>       http.robots.agents
>       http.agent.description
>       http.agent.url
>       http.agent.email
>       http.agent.version
>
>  and set their values appropriately.
>
>  </description>
> </property>
>
> Can someone reproduce this error or tell me how to correct it? Additionally
> it should be noted that I have not yet gotten this to run successfully.
>
