On Jan 18, 2007, at 4:44 PM, Andrzej Bialecki wrote:


java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:178)
        at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:57)
        at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:91)
        at org.apache.hadoop.io.UTF8.readChars(UTF8.java:212)
        at org.apache.hadoop.io.UTF8.readString(UTF8.java:204)
        at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:173)

UTF8? How weird - recent versions of Nutch tools, such as Crawl, Generate et al. (and SegmentMerger), do NOT use UTF8; they use Text. It seems this data was created with older versions. Please check that you don't have older versions of Hadoop or Nutch classes on your classpath.

I printed my CLASSPATH in the bin/nutch script before it calls anything, and all the jars and jobs are local to the nightly directory which I downloaded today, except for /usr/local/java/lib/tools.jar. All are dated 2007-01-17 19:42.

hadoop-0.10.1-core is in there.

And the data is brand new (I delete the crawl dir before doing my test run.)
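
A further check beyond printing CLASSPATH would be to ask the JVM which jar each class is actually loaded from. This is only a minimal sketch, assuming it is compiled and run with the same classpath that bin/nutch builds; the class name WhereIsClass and its locationOf helper are made up for illustration and are not part of Nutch or Hadoop.

```java
import java.security.CodeSource;

public class WhereIsClass {

    // Returns a description of where the named class was loaded from.
    // Hypothetical helper for illustration only.
    public static String locationOf(String className) {
        try {
            // 'false' means: load the class but do not run its static initializers.
            Class<?> c = Class.forName(className, false,
                    WhereIsClass.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader have no code source.
            return src == null ? "bootstrap (no code source)"
                               : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "not found on classpath";
        }
    }

    public static void main(String[] args) {
        // Defaults are the two Hadoop classes discussed in this thread.
        String[] names = args.length > 0 ? args
                : new String[] {"org.apache.hadoop.io.UTF8",
                                "org.apache.hadoop.io.Text"};
        for (String name : names) {
            System.out.println(name + " -> " + locationOf(name));
        }
    }
}
```

If either class resolves to a jar outside the nightly directory, that would point to a stale jar being picked up; if both resolve to hadoop-0.10.1-core, the classpath is probably not the cause.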

-Brian

