Hi,

Recently I got some IO exceptions when reading older segments
with recent trunk builds. Has anyone observed something similar?

According to the stack trace below, it seems possible that NUTCH-1622
makes the parse_data of segments incompatible between versions.

Thanks,
Sebastian

java.io.IOException: IO error in map input file
file:.../segments/20130623204752/parse_data/part-00000/data
...
Caused by: java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:197)
        at java.io.DataInputStream.readUTF(DataInputStream.java:609)
        at java.io.DataInputStream.readUTF(DataInputStream.java:564)
        at org.apache.hadoop.io.AbstractMapWritable.readFields(AbstractMapWritable.java:199)
        at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:146)
        at org.apache.nutch.parse.Outlink.readFields(Outlink.java:54)
        at org.apache.nutch.parse.Outlink.read(Outlink.java:84)
        at org.apache.nutch.parse.ParseData.readFields(ParseData.java:133)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
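
To illustrate the kind of incompatibility suspected here, below is a
minimal sketch using only java.io. It is not Nutch or Hadoop code: the
class OutlinkFormatSketch, its methods, and the record layout (two UTF
strings in the "old" format, one extra trailing field expected by the
"new" reader) are assumptions made up for illustration. It only shows
how a reader that expects an additional field runs into an EOFException
on records written without it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

public class OutlinkFormatSketch {

    // Hypothetical "old" layout: target URL and anchor text only.
    static byte[] writeOldRecord(String toUrl, String anchor) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(toUrl);
            out.writeUTF(anchor);
        }
        return buf.toByteArray();
    }

    // Hypothetical "new" reader: expects one more field after the anchor.
    static String[] readNewRecord(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            String toUrl = in.readUTF();
            String anchor = in.readUTF();
            // Old records end here, so this read hits the end of the data.
            String metadata = in.readUTF();
            return new String[] {toUrl, anchor, metadata};
        }
    }

    // Returns true if reading an old-format record with the new reader
    // fails with an EOFException.
    static boolean oldRecordFailsInNewReader() throws IOException {
        byte[] old = writeOldRecord("http://example.com/", "example");
        try {
            readNewRecord(old);
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("EOFException on old record: " + oldRecordFailsInNewReader());
    }
}
```

If this is indeed what happens, a version marker in the serialized
record would be one common way to let a reader distinguish old from new
data, though whether that applies to Outlink is an open question.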
