On Jan 18, 2007, at 4:44 PM, Andrzej Bialecki wrote:
java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:57)
    at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:91)
    at org.apache.hadoop.io.UTF8.readChars(UTF8.java:212)
    at org.apache.hadoop.io.UTF8.readString(UTF8.java:204)
    at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:173)
UTF8? How weird - recent versions of Nutch tools, such as Crawl,
Generate et al. (and SegmentMerger), do NOT use UTF8; they use Text.
It seems this data was created with older versions. Please check
that you don't have older versions of Hadoop or Nutch classes on
your classpath.
I printed my CLASSPATH in the bin/nutch script before it calls
anything, and all the jars and jobs are local to the nightly
directory, which I downloaded today, except for
/usr/local/java/lib/tools.jar. All are dated 2007-01-17 19:42.
hadoop-0.10.1-core is in there.
And the data is brand new (I delete the crawl dir before doing my
test run.)
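
Printing CLASSPATH only shows what is listed, not which entry a given
class is actually loaded from. A minimal sketch of a more direct check,
assuming it is run with the same CLASSPATH that bin/nutch builds: the
class name WhichJar and its locate helper are hypothetical, made up for
this example, and are not part of Nutch or Hadoop.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Nutch or Hadoop): reports where a class is loaded from.
class WhichJar {
    /** Returns the jar or directory a class comes from, or a short note if unknown. */
    static String locate(String className) {
        try {
            // Load without initializing, so static initializers do not run.
            Class<?> cls = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes from the JVM's own runtime usually have no code source.
            if (src == null || src.getLocation() == null) {
                return "(no code source)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "not found on classpath";
        }
    }

    public static void main(String[] args) {
        // Default to the two classes mentioned in this thread.
        String[] names = args.length > 0
                ? args
                : new String[] {"org.apache.hadoop.io.UTF8", "org.apache.hadoop.io.Text"};
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If either class resolves to a jar outside the nightly directory, that
would point to a stale copy being picked up first.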
-Brian