according to the docs it's in core-default but any file should work.

On Monday 19 December 2011 12:14:07 Danicela nutch wrote:
> In which file should I put this property ?
>
> ----- Original Message -----
> From: Markus Jelsma
> Sent: 19.12.11 11:20
> To: [email protected]
> Subject: Re: 'A record version mismatch occured'
>
> Does io.skip.checksum.errors = true help?
>
> On Monday 19 December 2011 10:17:54 Danicela nutch wrote:
> > Hi,
> >
> > During segment updates, a persistent crawldb checksum error appeared:
> >
> > 2011-12-18 02:06:30,703 WARN mapred.LocalJobRunner - job_local_0001
> > org.apache.hadoop.fs.ChecksumException: Checksum error:
> > file:/home/nutch/nutch@beetween/runs/fr1/crawldb/current/part-00000/data
> > at 1337333760
> >
> > Last time this problem occurred, I removed both .crc in the crawldb
> > and it worked.
> >
> > But now, removing the crcs brings another persistent error:
> >
> > 2011-12-19 08:47:21,918 WARN mapred.LocalJobRunner - job_local_0001
> > A record version mismatch occured. Expecting v2, found v66
> >   at org.apache.nutch.protocol.ProtocolStatus.readFields(ProtocolStatus.java:168)
> >   at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:167)
> >   at org.apache.nutch.crawl.CrawlDatum.readFields(CrawlDatum.java:272)
> >   at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
> >   at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
> >   at org.apache.hadoop.io.SequenceFile$Reader.deserializeValue(SequenceFile.java:1817)
> >   at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1790)
> >   at org.apache.hadoop.mapred.SequenceFileRecordReader.getCurrentValue(SequenceFileRecordReader.java:103)
> >   at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:78)
> >   at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
> >   at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
> >   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
> >   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
> >   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
> >
> > 2011-12-19 08:47:22,861 FATAL crawl.CrawlDb - CrawlDb update: java.io.IOException: Job failed!
> >   at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
> >   at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:94)
> >   at org.apache.nutch.crawl.CrawlDb.run(CrawlDb.java:189)
> >   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >   at org.apache.nutch.crawl.CrawlDb.main(CrawlDb.java:150)
> >
> > What can I do?
> >
> > Thanks.
>
> --
> Markus Jelsma - CTO - Openindex
-- Markus Jelsma - CTO - Openindex
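[Editor's note: a minimal sketch of how the property discussed above could be set, assuming it is added to conf/nutch-site.xml (or any Hadoop config file on the classpath, as Markus suggests). The property name comes from the thread; the file placement follows the usual Hadoop override convention for values defaulted in core-default.xml.]

```xml
<!-- Sketch: add to conf/nutch-site.xml (or another Hadoop config file
     on the classpath) to override the default from core-default.xml. -->
<configuration>
  <property>
    <name>io.skip.checksum.errors</name>
    <value>true</value>
    <description>If true, entries that hit a checksum error while
    reading a sequence file are skipped instead of failing the job.
    Default is false.</description>
  </property>
</configuration>
```

Note that skipping checksum errors silences the symptom rather than repairing the crawldb; the record version mismatch above suggests the underlying data file is corrupt.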

