Hello Nutch folks,

I'm running Nutch built from trunk. When I run

bin/nutch crawl urls -dir crawl -threads 5 -depth 2 -topN 20

I get the following errors in hadoop.log:
2007-12-13 15:20:59,920 ERROR http.Http - java.lang.NullPointerException
2007-12-13 15:20:59,920 ERROR http.Http - java.lang.NullPointerException
2007-12-13 15:20:59,920 ERROR http.Http - at org.apache.nutch.protocol.Content.getContentType(Content.java:327)
2007-12-13 15:20:59,920 ERROR http.Http - at org.apache.nutch.protocol.Content.getContentType(Content.java:327)
2007-12-13 15:20:59,920 ERROR http.Http - at org.apache.nutch.protocol.Content.<init>(Content.java:95)
2007-12-13 15:20:59,920 ERROR http.Http - at org.apache.nutch.protocol.Content.<init>(Content.java:95)
2007-12-13 15:20:59,920 ERROR http.Http - at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:226)
2007-12-13 15:20:59,920 ERROR http.Http - at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:226)
2007-12-13 15:20:59,920 ERROR http.Http - at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:164)
2007-12-13 15:20:59,920 ERROR http.Http - at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:164)
2007-12-13 15:20:59,922 INFO fetcher.Fetcher - fetch of http://[SOMEURLHERE] failed with: java.lang.NullPointerException

My environment: I'm on Linux with Java 1.6. hadoop-site.xml is empty, so I'm using the local file system. nutch-site.xml has all the agent-related properties set.

I can get the 0.9 release to work, but I consistently get this error with the trunk version. All the fetches fail, and crawldb contains 0 fetched URLs at the end of the run.

Does anyone have an idea what might be wrong?

Thanks,
Sandeep
