Hi,

I hit an exception when running Nutch 2.1 with MySQL. The command line is:
bin/nutch crawl urls -depth 1 -topN 5
Here are the steps to reproduce the issue:
1. Start nutch.
2. Stop it while it is executing.
3. Start nutch again.
The problem can be worked around by clearing the 'webpage' table.
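
For anyone who needs the same workaround, here is a minimal sketch of the cleanup, not a definitive procedure. The database name (nutch) and user (root) are assumptions, so take the real values from your gora.properties. The reset_sql helper and the RUN switch are hypothetical names used only for this illustration. Note that this discards all crawl data stored in the table.

```shell
#!/bin/sh
# Sketch: empty the Nutch 'webpage' table after an interrupted crawl.
# DB_NAME and DB_USER are assumed defaults; use the values from gora.properties.
DB_NAME="${DB_NAME:-nutch}"
DB_USER="${DB_USER:-root}"

# Build the SQL statement that empties the given table (hypothetical helper).
reset_sql() {
    printf 'TRUNCATE TABLE `%s`;' "$1"
}

# Touch the database only when explicitly requested with RUN=1,
# because the statement removes every stored page.
if [ "${RUN:-0}" = "1" ]; then
    mysql -u "$DB_USER" -p "$DB_NAME" -e "$(reset_sql webpage)"
fi
```

Saved under a name of your choice, the script does nothing unless RUN=1 is set in the environment, which allows the statement to be checked before anything is deleted.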

========================= Error in the console =====================================
Skipping http://blog.foofactory.fi/2007/03/perfomance-history-for-nutch.html; different batch id (null)
Exception in thread "main" java.lang.RuntimeException: job failed: name=parse, jobid=job_local_0004
        at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
        at org.apache.nutch.parse.ParserJob.run(ParserJob.java:251)
        at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
        at org.apache.nutch.crawl.Crawler.run(Crawler.java:171)
        at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)

========================= Error in the logs/hadoop.log =====================================
2012-12-12 22:26:33,379 INFO  parse.ParserJob - Skipping http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html; different batch id (null)
2012-12-12 22:26:33,379 INFO  parse.ParserJob - Skipping http://blog.foofactory.fi/2007/03/perfomance-history-for-nutch.html; different batch id (null)
2012-12-12 22:26:33,380 WARN  mapred.FileOutputCommitter - Output path is null in cleanup
2012-12-12 22:26:33,381 WARN  mapred.LocalJobRunner - job_local_0004
java.io.IOException: java.io.EOFException
        at org.apache.gora.sql.query.SqlResult.nextInner(SqlResult.java:58)
        at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:112)
        at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:111)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
        at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: java.io.EOFException
        at org.apache.avro.io.BinaryDecoder$InputStreamByteSource.readRaw(BinaryDecoder.java:818)
        at org.apache.avro.io.BinaryDecoder.doReadBytes(BinaryDecoder.java:340)
        at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:265)
        at org.apache.gora.mapreduce.FakeResolvingDecoder.readString(FakeResolvingDecoder.java:131)
        at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:280)
        at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:191)
        at org.apache.gora.avro.PersistentDatumReader.readMap(PersistentDatumReader.java:182)
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:83)
        at org.apache.gora.avro.PersistentDatumReader.read(PersistentDatumReader.java:102)
        at org.apache.gora.util.IOUtils.deserialize(IOUtils.java:259)
        at org.apache.gora.sql.store.SqlStore.readField(SqlStore.java:565)
        at org.apache.gora.sql.store.SqlStore.readObject(SqlStore.java:486)
        at org.apache.gora.sql.query.SqlResult.nextInner(SqlResult.java:54)
        ... 8 more

Thanks.

Regards,
Rui
