Yes, we've had trouble with it too, a similar exception, but we did not sync our Nutch with that issue IIRC.
Will check further this week.

-----Original message-----
> From: Sebastian Nagel <[email protected]>
> Sent: Tuesday 17th September 2013 15:40
> To: [email protected]
> Subject: IOException reading segments with current trunk
>
> Hi,
>
> recently I got some IO exceptions when reading older segments
> with recent trunk builds. Did anyone make similar observations?
>
> According to the stack it seems possible that NUTCH-1622
> causes segments' parse_data to be incompatible between versions?
>
> Thanks,
> Sebastian
>
> java.io.IOException: IO error in map input file
> file:.../segments/20130623204752/parse_data/part-00000/data
> ...
> Caused by: java.io.EOFException
>     at java.io.DataInputStream.readFully(DataInputStream.java:197)
>     at java.io.DataInputStream.readUTF(DataInputStream.java:609)
>     at java.io.DataInputStream.readUTF(DataInputStream.java:564)
>     at org.apache.hadoop.io.AbstractMapWritable.readFields(AbstractMapWritable.java:199)
>     at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:146)
>     at org.apache.nutch.parse.Outlink.readFields(Outlink.java:54)
>     at org.apache.nutch.parse.Outlink.read(Outlink.java:84)
>     at org.apache.nutch.parse.ParseData.readFields(ParseData.java:133)
>     at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>     at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
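For what it's worth, the EOFException inside readUTF() is the classic symptom of a Writable format change: if NUTCH-1622 made Outlink.readFields expect fields that older segments never wrote, the reader runs off the end of the old record. A minimal sketch of that failure mode (hypothetical field names, plain DataInput/DataOutput rather than the actual Nutch/Hadoop Writables):

```java
import java.io.*;

// Sketch: an "old" writer serializes two fields; a "new" reader expects a
// third, so readUTF() hits end-of-record with an EOFException -- the same
// failure mode as in the stack trace above.
public class FormatMismatchDemo {
    public static void main(String[] args) throws IOException {
        // "Old" format: url + anchor only.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeUTF("http://example.com/");
        out.writeUTF("anchor text");
        out.close();

        // "New" reader: consumes the two old fields, then expects an
        // additional (hypothetical) metadata field that was never written.
        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(in.readUTF());
        System.out.println(in.readUTF());
        try {
            in.readUTF(); // field added by the newer format; not present
        } catch (EOFException e) {
            System.out.println("EOFException: old record ends here");
        }
    }
}
```

If that is indeed the cause, old segments would need to be re-parsed (or read with a version check) rather than deserialized directly by the new code.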

