Have you tried setting that property in conf/nutch-default.xml?
-- Khang
On Fri, Oct 29, 2010 at 2:08 PM, Matthew Stevens <[email protected]> wrote:
> Running the following command:
>
> ./bin/nutch crawl urls -dir crawl.test -depth 3 >& crawl.log
>
> Generates the following text in crawl.log
>
> Fetcher: No agents listed in 'http.agent.name' property.
>
> Exception in thread "main" java.lang.IllegalArgumentException: Fetcher: No
> agents listed in 'http.agent.name' property.
>
> at org.apache.nutch.fetcher.Fetcher.checkConfiguration(Fetcher.java:1166)
>
> at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1068)
>
> at org.apache.nutch.crawl.Crawl.main(Crawl.java:133)
>
>
> My questions are:
>
> Is the property being referred to supposed to be that listed in
>
> *nutch-site.xml*
> If so, then the xml value is:
>
> <property>
> <name>http.agent.name</name>
> <value>mini3</value>
> <description>HTTP 'User-Agent' request header. MUST NOT be empty -
> please set this to a single word uniquely related to your organization.
>
> NOTE: You should also check other related properties:
>
> http.robots.agents
> http.agent.description
> http.agent.url
> http.agent.email
> http.agent.version
>
> and set their values appropriately.
>
> </description>
> </property>
>
> Can someone reproduce this error or tell me how to correct it? Additionally
> it should be noted that I have not yet gotten this to run successfully.
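One way to narrow this down is to check which value of http.agent.name actually results from the configuration files. Below is a rough Python sketch, not part of Nutch. It assumes the usual Hadoop-style layout of <property> elements with <name> and <value> children, and that values from a later file (nutch-site.xml) override those from an earlier one (nutch-default.xml). The function names and the inline sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is None:
            continue  # skip malformed entries that have no <name>
        value = prop.findtext("value") or ""
        props[name.strip()] = value.strip()
    return props


def effective_value(name, *xml_texts):
    """Merge documents in order (later ones override earlier ones) and look up name."""
    merged = {}
    for text in xml_texts:
        merged.update(read_properties(text))
    return merged.get(name, "")


# Hypothetical stand-ins for conf/nutch-default.xml and conf/nutch-site.xml.
DEFAULT_XML = """
<configuration>
  <property><name>http.agent.name</name><value></value></property>
</configuration>
"""
SITE_XML = """
<configuration>
  <property><name>http.agent.name</name><value>mini3</value></property>
</configuration>
"""

if __name__ == "__main__":
    agent = effective_value("http.agent.name", DEFAULT_XML, SITE_XML)
    print("http.agent.name =", repr(agent))
```

If the value printed for your real files is empty, the property is probably not being picked up from the file you edited, for example because the <property> block sits outside <configuration> or because a different conf directory is on the classpath.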

