Newbie issue resolved. The filename was mangled, and the correctly named file did not actually contain the correct value.
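For anyone hitting the same error, here is a minimal sketch of a check for this situation. It assumes the config lives in a directory containing nutch-site.xml in the usual property/name/value layout. The function names and the "nutch-site.xml.txt" style of mangling are hypothetical illustrations; the message above does not say how the filename was actually mangled.

```python
import os
import xml.etree.ElementTree as ET


def read_property(conf_file, name):
    """Return the stripped <value> of the named property, or None if absent or empty."""
    root = ET.parse(conf_file).getroot()
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip() or None
    return None


def check_agent_name(conf_dir):
    """Report whether conf_dir/nutch-site.xml exists and sets http.agent.name."""
    expected = "nutch-site.xml"
    names = os.listdir(conf_dir)
    if expected not in names:
        # Look for mangled variants (hypothetical example: "nutch-site.xml.txt").
        lookalikes = sorted(n for n in names if "nutch-site" in n)
        return "missing %s; similar files: %s" % (expected, lookalikes)
    value = read_property(os.path.join(conf_dir, expected), "http.agent.name")
    if value is None:
        return "http.agent.name is not set or is empty"
    return "http.agent.name = " + value
```

Calling check_agent_name("conf") from the Nutch install directory should return one short status line. It only inspects the file on disk and does not reproduce how Nutch itself resolves its configuration.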

On , Matthew Stevens <[email protected]> wrote:
Yes, the files are identical. No change in behavior.



Matthew Stevens



On Oct 30, 2010, at 9:44, Khang Ich <[email protected]> wrote:




Have you tried setting that property in conf/nutch-default.xml?



-- Khang



On Fri, Oct 29, 2010 at 2:08 PM, Matthew Stevens <[email protected]> wrote:




Running the following command:



./bin/nutch crawl urls -dir crawl.test -depth 3 >& crawl.log



Generates the following text in crawl.log



Fetcher: No agents listed in 'http.agent.name' property.
Exception in thread "main" java.lang.IllegalArgumentException: Fetcher: No agents listed in 'http.agent.name' property.
        at org.apache.nutch.fetcher.Fetcher.checkConfiguration(Fetcher.java:1166)
        at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1068)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:133)





My questions are:



Is the property being referred to supposed to be the one listed in *nutch-site.xml*? If so, then the XML value is:





<property>
  <name>http.agent.name</name>
  <value>mini3</value>
  <description>HTTP 'User-Agent' request header. MUST NOT be empty -
  please set this to a single word uniquely related to your organization.

  NOTE: You should also check other related properties:

    http.robots.agents
    http.agent.description
    http.agent.url
    http.agent.email
    http.agent.version

  and set their values appropriately.
  </description>
</property>









Can someone reproduce this error or tell me how to correct it? Additionally, it should be noted that I have not yet gotten this to run successfully.




