Hi there,
I am trying to add a new data field, a simple String, to the
Page class.
I used the existing URL field in the Page class as a template.
But when I run WebDBInject, I get the error messages below.
It seems readFields() is not reading from the right
position.
I wonder whether it is feasible to change the Page class
at all, since I understand the Nutch webdb has an advanced
structure and set of operations. From an OO point of view,
all Page fields should be accessed through the Page class
interface, but I have run into something odd.
Thanks,
Michael Ji
---------------------------------------------
Exception in thread "main" java.io.EOFException
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:310)
    at org.apache.nutch.io.UTF8.readFields(UTF8.java:101)
    at org.apache.nutch.db.Page.readFields(Page.java:146)
    at org.apache.nutch.io.SequenceFile$Reader.next(SequenceFile.java:278)
    at org.apache.nutch.io.MapFile$Reader.next(MapFile.java:349)
    at org.apache.nutch.db.WebDBWriter$PagesByURLProcessor.mergeEdits(WebDBWriter.java:618)
    at org.apache.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:557)
    at org.apache.nutch.db.WebDBWriter.close(WebDBWriter.java:1544)
    at org.apache.nutch.db.WebDBInjector.close(WebDBInjector.java:336)
    at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:581)
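
The trace shows a string read inside Page.readFields() running off the end of a record. One way this can happen, though the trace alone does not confirm it, is that the record being read was written without the new field, either because write() was not extended in the same order as readFields() or because the data on disk predates the change. Below is a minimal sketch of that situation using only java.io. It is not Nutch code: the class PageFieldSketch, the field name note, the version byte, and the version numbers are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical stand-in for a serialized record such as Page; not the Nutch class.
class PageFieldSketch {
    static final byte OLD_VERSION = 1; // assumed old layout: version, url
    static final byte CUR_VERSION = 2; // assumed new layout: version, url, note

    String url = "";
    String note = ""; // the newly added String field (name is an assumption)

    // Serialize in the current layout. The field order must mirror readFields().
    void write(DataOutput out) throws IOException {
        out.writeByte(CUR_VERSION);
        out.writeUTF(url);
        out.writeUTF(note);
    }

    // Deserialize, reading the new field only if the record was written with it.
    void readFields(DataInput in) throws IOException {
        byte version = in.readByte();
        if (version > CUR_VERSION) {
            throw new IOException("unknown record version " + version);
        }
        url = in.readUTF();
        note = (version >= CUR_VERSION) ? in.readUTF() : "";
    }

    // Naive variant: always expects the new field, whatever the version says.
    void readFieldsNaive(DataInput in) throws IOException {
        in.readByte();
        url = in.readUTF();
        note = in.readUTF(); // runs past the end of an old-layout record
    }

    // Bytes of a record as the old code (without the new field) would write it.
    static byte[] oldFormatBytes(String url) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeByte(OLD_VERSION);
        out.writeUTF(url);
        out.flush();
        return buf.toByteArray();
    }

    static DataInputStream open(byte[] bytes) {
        return new DataInputStream(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws IOException {
        byte[] old = oldFormatBytes("http://example.com/");

        // Version-aware read: the missing field falls back to an empty string.
        PageFieldSketch p = new PageFieldSketch();
        p.readFields(open(old));
        System.out.println("version-aware read: url=" + p.url + " note='" + p.note + "'");

        // Naive read: the second readUTF() finds no length prefix to read.
        try {
            new PageFieldSketch().readFieldsNaive(open(old));
        } catch (EOFException e) {
            System.out.println("naive read of old record: EOFException");
        }
    }
}
```

In this sketch the naive reader fails while trying to read the two-byte length prefix of a string that was never written, which resembles the readUnsignedShort frame at the top of the trace. If the same thing is happening in the webdb, the things to check would be that Page.write() emits the new field in the same position that readFields() expects it, and whether any existing db or edit files were produced by the unmodified class.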