Stefan Groschupf wrote:
Can someone please tell me what the technical difference is between
org.apache.nutch.io.Writable and java.io.Externalizable?

To me they look very similar, and Externalizable has been available since JDK 1.1.
What am I missing?

You're not missing much!

I avoided using Java's built-in Serialization and RMI when first writing Nutch as I wanted close control of how objects are written and of the client/server architecture (how it connects, how many connections, what happens when things fail, etc). I felt that it might be difficult to use parts of Serialization and RMI without getting tangled in the rest.

Yes, we could easily switch to using java.io.Externalizable in place of org.apache.nutch.io.Writable. We would also then need to switch to using ObjectInput and ObjectOutput in place of DataInput and DataOutput. But how should we implement ObjectOutput.writeObject() and ObjectInput.readObject()? I'm hesitant to use ObjectInputStream and ObjectOutputStream, since these carry a lot of other baggage, but maybe I'm just paranoid.
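
To make the comparison concrete, here is a minimal sketch, not Nutch code. SimpleWritable is a hypothetical stand-in that assumes Writable has a write(DataOutput)/readFields(DataInput) shape; Counter and SerializationComparison are likewise invented for illustration. Because ObjectOutput extends DataOutput (and ObjectInput extends DataInput), one class can satisfy both contracts with the same field layout:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical stand-in for org.apache.nutch.io.Writable (assumed shape).
interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// One value type that implements both contracts with the same field layout.
class Counter implements SimpleWritable, Externalizable {
    int value;

    public Counter() {}                 // Externalizable needs a public no-arg constructor
    Counter(int value) { this.value = value; }

    public void write(DataOutput out) throws IOException { out.writeInt(value); }
    public void readFields(DataInput in) throws IOException { value = in.readInt(); }

    // ObjectOutput extends DataOutput and ObjectInput extends DataInput,
    // so the Externalizable methods can simply delegate.
    public void writeExternal(ObjectOutput out) throws IOException { write(out); }
    public void readExternal(ObjectInput in) throws IOException { readFields(in); }
}

class SerializationComparison {
    // Writes only the fields: no stream header, no class descriptor.
    static byte[] viaDataOutput(Counter c) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            c.write(out);
        }
        return bytes.toByteArray();
    }

    // Goes through the full Java serialization machinery.
    static byte[] viaObjectOutputStream(Counter c) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(c);
        }
        return bytes.toByteArray();
    }

    // Reads the raw field layout back into a fresh instance.
    static Counter readBack(byte[] data) throws IOException {
        Counter c = new Counter();
        c.readFields(new DataInputStream(new ByteArrayInputStream(data)));
        return c;
    }
}
```

For new Counter(42), viaDataOutput yields exactly the 4 bytes of the int, while viaObjectOutputStream yields a longer array, since ObjectOutputStream also writes a stream header and a class descriptor. That extra framing is part of the "baggage" in question.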

That said, in org.apache.nutch.io.ObjectWritable (mapred branch) I have now recreated much of object serialization, so perhaps it is time to seriously reconsider this decision.
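
As a rough illustration of what "recreating object serialization" can look like on top of DataOutput alone, here is a sketch that tags each value with its class name and instantiates it reflectively on read. This is an assumption about the general technique, not the actual ObjectWritable code; TaggedWritable, Url and TaggedIO are hypothetical names:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical Writable-style contract (assumed shape, not the Nutch interface).
interface TaggedWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Example value type; needs a no-arg constructor for reflective creation.
class Url implements TaggedWritable {
    String text = "";

    public Url() {}
    Url(String text) { this.text = text; }

    public void write(DataOutput out) throws IOException { out.writeUTF(text); }
    public void readFields(DataInput in) throws IOException { text = in.readUTF(); }
}

class TaggedIO {
    // Writes the class name first, then lets the value write its own fields.
    static void writeObject(DataOutput out, TaggedWritable value) throws IOException {
        out.writeUTF(value.getClass().getName());
        value.write(out);
    }

    // Reads the class name, creates an empty instance, then fills in its fields.
    // Note: loading classes by name from untrusted input is unsafe; a real
    // implementation would restrict which class names it accepts.
    static TaggedWritable readObject(DataInput in) throws IOException {
        String className = in.readUTF();
        try {
            TaggedWritable value = (TaggedWritable) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
            value.readFields(in);
            return value;
        } catch (ReflectiveOperationException e) {
            throw new IOException("cannot instantiate " + className, e);
        }
    }

    // Convenience helper: serialize to a byte array and read it back.
    static TaggedWritable roundTrip(TaggedWritable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            writeObject(out, value);
        }
        return readObject(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    }
}
```

Even this small sketch needs a type tag, reflective construction and error handling, which is already a slice of what Java's serialization provides.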

In general I try not to adopt libraries into the core that include a lot of complex functionality we don't intend to use. Java's Serialization provides a lot of features needed for RMI that I don't think Nutch requires.

What do others think?

Doug

