Thanks, good to know.

We should add a warning to release notes and CHANGES.txt.
Ideally, of course, reading segments should be backward compatible.
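
If we do want to keep old segments readable, one option could be to
treat the metadata field added by NUTCH-1622 as optional when
deserializing. Just a rough sketch on my side, not the actual Outlink
code -- the field names and the assumption that the read hits EOF at
the record boundary are mine:

  import java.io.DataInput;
  import java.io.DataOutput;
  import java.io.EOFException;
  import java.io.IOException;
  import org.apache.hadoop.io.MapWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.io.Writable;

  public class Outlink implements Writable {
    private String toUrl = "";
    private String anchor = "";
    private MapWritable metadata = new MapWritable();

    public void readFields(DataInput in) throws IOException {
      toUrl = Text.readString(in);
      anchor = Text.readString(in);
      metadata = new MapWritable();
      try {
        // NUTCH-1622 appended this MapWritable; records written before
        // it end right after the anchor, so the read runs off the end
        // of the record buffer.
        metadata.readFields(in);
      } catch (EOFException e) {
        metadata = new MapWritable(); // old record: no metadata bytes
      }
    }

    public void write(DataOutput out) throws IOException {
      Text.writeString(out, toUrl);
      Text.writeString(out, anchor);
      metadata.write(out);
    }
  }

A version byte written up front would be cleaner for future changes,
but it wouldn't help with segments that were already written without
one.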

Sebastian


2013/9/17 Markus Jelsma <[email protected]>

> Yes, we've had trouble with it too, a similar exception, but we did not
> sync our Nutch with that issue IIRC.
>
> Will check further this week.
>
> -----Original message-----
> > From:Sebastian Nagel <[email protected]>
> > Sent: Tuesday 17th September 2013 15:40
> > To: [email protected]
> > Subject: IOException reading segments with current trunk
> >
> > Hi,
> >
> > recently I got some IO exceptions when reading older segments
> > with recent trunk builds. Has anyone made similar observations?
> >
> > According to the stack trace it seems possible that NUTCH-1622
> > made segments' parse_data incompatible between versions?
> >
> > Thanks,
> > Sebastian
> >
> > java.io.IOException: IO error in map input file file:.../segments/20130623204752/parse_data/part-00000/data
> > ...
> > Caused by: java.io.EOFException
> >         at java.io.DataInputStream.readFully(DataInputStream.java:197)
> >         at java.io.DataInputStream.readUTF(DataInputStream.java:609)
> >         at java.io.DataInputStream.readUTF(DataInputStream.java:564)
> >         at org.apache.hadoop.io.AbstractMapWritable.readFields(AbstractMapWritable.java:199)
> >         at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:146)
> >         at org.apache.nutch.parse.Outlink.readFields(Outlink.java:54)
> >         at org.apache.nutch.parse.Outlink.read(Outlink.java:84)
> >         at org.apache.nutch.parse.ParseData.readFields(ParseData.java:133)
> >         at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
> >         at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
> >
>
