In which file should I put this property?

----- Original Message -----
From: Markus Jelsma
Sent: 19.12.11 11:20
To: [email protected]
Subject: Re: 'A record version mismatch occured'

Does io.skip.checksum.errors = true help?

On Monday 19 December 2011 10:17:54 Danicela nutch wrote:
> Hi,
>
> During segment updates, a persistent crawldb checksum error appeared:
>
> 2011-12-18 02:06:30,703 WARN mapred.LocalJobRunner - job_local_0001
> org.apache.hadoop.fs.ChecksumException: Checksum error: file:/home/nutch/nutch@beetween/runs/fr1/crawldb/current/part-00000/data at 1337333760
>
> Last time this problem occurred, I removed both .crc files in the crawldb and
> it worked.
>
> But now, removing the .crc files brings another persistent error:
>
> 2011-12-19 08:47:21,918 WARN mapred.LocalJobRunner - job_local_0001
> A record version mismatch occured. Expecting v2, found v66
>     at org.apache.nutch.protocol.ProtocolStatus.readFields(ProtocolStatus.java:168)
>     at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:167)
>     at org.apache.nutch.crawl.CrawlDatum.readFields(CrawlDatum.java:272)
>     at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>     at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>     at org.apache.hadoop.io.SequenceFile$Reader.deserializeValue(SequenceFile.java:1817)
>     at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1790)
>     at org.apache.hadoop.mapred.SequenceFileRecordReader.getCurrentValue(SequenceFileRecordReader.java:103)
>     at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:78)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
> 2011-12-19 08:47:22,861 FATAL crawl.CrawlDb - CrawlDb update:
> java.io.IOException: Job failed!
>     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
>     at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:94)
>     at org.apache.nutch.crawl.CrawlDb.run(CrawlDb.java:189)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.nutch.crawl.CrawlDb.main(CrawlDb.java:150)
>
> What can I do?
>
> Thanks.

--
Markus Jelsma - CTO - Openindex

