Thanks, good to know. We should add a warning to the release notes and CHANGES.txt. Ideally, of course, reading segments should be backward compatible.
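
One possible approach (just a sketch, not the actual Outlink code; the class name is hypothetical, and it assumes the metadata map added by NUTCH-1622 is the last field serialized) would be to treat the new trailing field as optional when deserializing:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.EOFException;
    import java.io.IOException;

    import org.apache.hadoop.io.MapWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.Writable;

    // Hypothetical example, not the actual Outlink implementation:
    // a Writable that gained a metadata field in a later version and
    // tries to stay readable for records written before that field existed.
    public class VersionTolerantOutlink implements Writable {

      private String toUrl = "";
      private String anchor = "";
      private MapWritable metadata = new MapWritable();

      public void write(DataOutput out) throws IOException {
        Text.writeString(out, toUrl);
        Text.writeString(out, anchor);
        metadata.write(out);            // field added in the newer format
      }

      public void readFields(DataInput in) throws IOException {
        toUrl = Text.readString(in);
        anchor = Text.readString(in);
        metadata = new MapWritable();
        try {
          // Present only in new-format records; old records end here.
          metadata.readFields(in);
        } catch (EOFException e) {
          // Old-format record without metadata: leave the map empty.
          // NOTE: this is only safe if the reader hands readFields()
          // exactly the bytes of one record, so that EOF really means
          // "end of this record" rather than "end of the file".
        }
      }
    }

The cleaner fix would be an explicit version byte written before any optional fields, but that only helps if it is there from the first release; for segment data already on disk, the warning in the release notes is probably the safest we can do.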
Sebastian

2013/9/17 Markus Jelsma <[email protected]>

> Yes, we've got trouble with it too, similar exception, but we did not
> sync our nutch with that issue IIRC.
>
> Will check further this week.
>
> -----Original message-----
> > From: Sebastian Nagel <[email protected]>
> > Sent: Tuesday 17th September 2013 15:40
> > To: [email protected]
> > Subject: IOException reading segments with current trunk
> >
> > Hi,
> >
> > recently I got some IO exceptions when reading older segments
> > with recent trunk builds. Did anyone make similar observations?
> >
> > According to the stack it seems possible that NUTCH-1622
> > causes segments' parse_data to be incompatible between versions?
> >
> > Thanks,
> > Sebastian
> >
> > java.io.IOException: IO error in map input file
> > file:.../segments/20130623204752/parse_data/part-00000/data
> > ...
> > Caused by: java.io.EOFException
> >     at java.io.DataInputStream.readFully(DataInputStream.java:197)
> >     at java.io.DataInputStream.readUTF(DataInputStream.java:609)
> >     at java.io.DataInputStream.readUTF(DataInputStream.java:564)
> >     at org.apache.hadoop.io.AbstractMapWritable.readFields(AbstractMapWritable.java:199)
> >     at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:146)
> >     at org.apache.nutch.parse.Outlink.readFields(Outlink.java:54)
> >     at org.apache.nutch.parse.Outlink.read(Outlink.java:84)
> >     at org.apache.nutch.parse.ParseData.readFields(ParseData.java:133)
> >     at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
> >     at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)

