Hi George,

How were you executing parsing on this page?
The toArgMap method in ToolUtil can throw a runtime exception, but this doesn't look like the one you're getting.

What other logging do you have around here, specifically around the point where the parse method kicks in? That might give us a better idea of exactly where this is happening.

Thanks
Lewis

On Fri, May 25, 2012 at 9:00 PM, George Smith <[email protected]> wrote:
> I've been using the nutchgora branch for a few months so I'm very new to it
> and I've been able to find information on the user or dev list, jira, or
> regular web searches for most of the issues I've encountered except one.
>
> This error occurs frequently when parsing but not always and there doesn't
> seem to be a common element among the pages it is erroring out on.
>
> I have tried a number of revisions of the nutchgora branch that built with
> ant/ivy fine and also with eclipse. I also tried the prebuilt copies from
> Jenkins. I've run them on Debian 6.0.5, ubuntu 10, and ubuntu 12 with the
> openjdk and the sun jdk and always receive the same error.
>
> Can someone shed some light on this or point me in the right direction.
> Thanks.
>
> The error output to stdout is:
> Parsing http://www.site.com/dir/page.html
> Exception in thread "main" java.lang.RuntimeException: job failed: name=parse, jobid=job_local_0001
>     at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:47)
>     at org.apache.nutch.parse.ParserJob.run(ParserJob.java:242)
>     at org.apache.nutch.parse.ParserJob.parse(ParserJob.java:257)
>     at org.apache.nutch.parse.ParserJob.run(ParserJob.java:300)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.nutch.parse.ParserJob.main(ParserJob.java:304)
>
> The error output in the hadoop.log is:
> 2012-05-25 13:50:29,635 INFO parse.ParserJob - Parsing http://www.site.com/dir/page.html
> 2012-05-25 13:50:29,638 WARN mapred.FileOutputCommitter - Output path is null in cleanup
> 2012-05-25 13:50:29,639 WARN mapred.LocalJobRunner - job_local_0001
> java.lang.NullPointerException
>     at org.apache.avro.util.Utf8.<init>(Utf8.java:37)
>     at org.apache.nutch.parse.ParseUtil.process(ParseUtil.java:212)
>     at org.apache.nutch.parse.ParserJob$ParserMapper.map(ParserJob.java:123)
>     at org.apache.nutch.parse.ParserJob$ParserMapper.map(ParserJob.java:76)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)

--
Lewis
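
For what it's worth, the hadoop.log trace bottoms out in the Utf8 constructor, called from ParseUtil.process. That is consistent with a null String reaching the constructor, although the trace alone doesn't say which parsed field would be null. The sketch below only illustrates that failure pattern and a possible guard. TextValue is a hypothetical stand-in, not Avro's Utf8, and toTextOrNull is a made-up helper, not anything in Nutch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in, NOT Avro's Utf8 class: it only mimics a constructor
// that encodes its String argument without checking for null first.
class TextValue {
    private final byte[] bytes;

    TextValue(String s) {
        // Throws NullPointerException when s is null.
        this.bytes = s.getBytes(StandardCharsets.UTF_8);
    }

    int length() {
        return bytes.length;
    }
}

class NullGuardSketch {
    // Hypothetical helper: returns null instead of throwing when a parsed
    // field (for example a missing page title) is absent.
    static TextValue toTextOrNull(String s) {
        return s == null ? null : new TextValue(s);
    }

    // Reports whether the unguarded constructor fails on a null argument.
    static boolean throwsOnNull() {
        try {
            new TextValue(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("unguarded null throws NPE: " + throwsOnNull());
        System.out.println("guarded null: " + toTextOrNull(null));
        System.out.println("guarded text length: " + toTextOrNull("Example").length());
    }
}
```

If that is what is happening, logging the parsed fields for a failing URL just before the conversion should show which one is null.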

