I am using MapFile.Reader to iterate through the file, reading each key
into a Text object and each value (the metadata) into a ParseData object.
I get the following exception:

Exception in thread "main" java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at org.apache.hadoop.io.Text.readString(Text.java:402)
at org.apache.nutch.metadata.Metadata.readFields(Metadata.java:243)
at org.apache.nutch.parse.ParseData.readFields(ParseData.java:144)
at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1813)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1941)
at org.apache.hadoop.io.MapFile$Reader.next(MapFile.java:517)
at NearDuplicates.main(NearDuplicates.java:58)
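
To illustrate where the trace fails: Text.readString reads a length and then
calls readFully for that many bytes, so the EOFException means fewer bytes were
left in the stream than the length claimed. One possible cause (an assumption,
not something I have confirmed) is that the reader expects a different field
layout than the one actually written to parse_data, for example if the Nutch
version reading the segment differs from the one that wrote it. The sketch
below reproduces that failure mode using only java.io. LengthPrefixDemo and
its methods are hypothetical names made up for this example; this is not
Nutch or Hadoop code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Hadoop or Nutch.
class LengthPrefixDemo {

    // Writer layout: [int length][UTF-8 bytes].
    static byte[] writeRecord(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
            out.writeInt(bytes.length);
            out.write(bytes);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reader whose expected layout matches the writer's.
    static String readMatching(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            byte[] bytes = new byte[in.readInt()];
            in.readFully(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reader that wrongly expects [byte version][int length][bytes].
    // The extra readByte() shifts the stream by one byte, so the "length"
    // it reads is garbage and readFully runs past the end of the data.
    static boolean mismatchedReadHitsEof(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            in.readByte();                    // field the writer never wrote
            byte[] bytes = new byte[in.readInt()];
            in.readFully(bytes);              // throws EOFException here
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] record = writeRecord("hello");
        System.out.println(readMatching(record));
        System.out.println(mismatchedReadHitsEof(record));
    }
}
```

If something like this is what is happening, the data itself may be fine and
only the class doing the reading is out of step with it.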

Thanks,

Regards,
Ami Parikh
(213)590-0005

On Thu, Feb 26, 2015 at 11:00 AM, Renxia Wang <[email protected]> wrote:

> Hi Ami,
>
> Which method of which class do you use to get the metadata? Please provide
> more information about this, such as logs.
>
> Zhique
>
> On Thu, Feb 26, 2015 at 10:53 AM, Ami Akshay Parikh <[email protected]>
> wrote:
>
>> Hello,
>>
>> When I try to use parse_data from the segment directory to get the
>> metadata for finding near duplicates, my code runs into an EOFException.
>> I found something in the archives about a bug in Nutch, but I wanted to
>> know whether anyone else is facing this problem and how I can resolve
>> it.
>>
>> Thanks,
>>
>> Regards,
>> Ami Parikh
>> (213)590-0005
>>
>
>
