I am using MapFile.Reader to iterate through the file, reading the key into a Text object and the metadata into a ParseData object. I get the following exception:
Exception in thread "main" java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:197)
    at org.apache.hadoop.io.Text.readString(Text.java:402)
    at org.apache.nutch.metadata.Metadata.readFields(Metadata.java:243)
    at org.apache.nutch.parse.ParseData.readFields(ParseData.java:144)
    at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1813)
    at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1941)
    at org.apache.hadoop.io.MapFile$Reader.next(MapFile.java:517)
    at NearDuplicates.main(NearDuplicates.java:58)

Thanks,

Regards,
Ami Parikh
(213)590-0005

On Thu, Feb 26, 2015 at 11:00 AM, Renxia Wang <[email protected]> wrote:

> Hi Ami,
>
> What method of what class do you use to get the meta data? Please provide
> more info about this, log etc.
>
> Zhique
>
> On Thu, Feb 26, 2015 at 10:53 AM, Ami Akshay Parikh <[email protected]>
> wrote:
>
>> Hello,
>>
>> When I try to use the parse_data from the segment directory for getting
>> the MetaData for finding near duplicates, My code runs into a EOFException.
>> I found something about a bug in nutch in the archives, but I wanted to
>> know if anyone else is facing this problem and how can I possibly resolve
>> it.
>>
>> Thanks,
>>
>> Regards,
>> Ami Parikh
>> (213)590-0005
>>
>
>

