2010/1/18 Olivier Grisel <olivier.gri...@ensta.org>:
> 2010/1/18 Robin Anil <robin.a...@gmail.com>:
>> could you be specific on which map/reduce job you encountered the error?
>
> I thought it was on:
>
> hadoop jar examples/target/mahout-examples-0.3-SNAPSHOT.job \
>     org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver \
>     -i "wikipediadump/chunk-0001.xml" -o wikipediainput-eof-exception \
>     -c examples/src/test/resources/country.txt
>
> I just ran it again... successfully... The next time I encounter that
> error I will note the complete stack trace, however uninformative it
> looks.
I ran the same job again on all the chunks and could reproduce the error:

$ hadoop jar examples/target/mahout-examples-0.3-SNAPSHOT.job \
      org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorDriver \
      -i "wikipediadump" -o wikipediainput-eof-exception \
      -c examples/src/test/resources/country.txt
[...]
10/01/18 16:20:46 INFO mapred.JobClient:  map 100% reduce 83%
10/01/18 16:21:42 INFO mapred.JobClient: Task Id : attempt_201001172109_0010_r_000000_2, Status : FAILED
java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:250)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
        at org.apache.hadoop.io.Text.readString(Text.java:400)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2869)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2794)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2077)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2263)

I have no idea where it could possibly stem from.

--
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name