Curt Cox wrote:
In my experience, using Serialization instead of DataInput/DataOutput streams has a major impact on versioning. Serialization keeps a lot of metadata in the stream. This makes detecting format changes very easy, but can really complicate backward compatibility.
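To make the trade-off above concrete, here is a minimal sketch in plain java.io (not Hadoop code). The `Point` class, `FORMAT_VERSION`, and the "version 1 stored only x" layout are all hypothetical, made up for illustration. It writes the same record once through ObjectOutputStream, which embeds a stream header and class descriptor, and once through DataOutputStream, where the only version information is a byte the writer adds by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class VersioningSketch {
    // Hypothetical record type, used only for illustration.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Hypothetical format version, maintained by hand.
    static final int FORMAT_VERSION = 2;

    // Serialization: the stream carries a header plus a class descriptor
    // (class name, serialVersionUID, field names and types) ahead of the data.
    static byte[] writeWithSerialization(Point p) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(p);
        }
        return buf.toByteArray();
    }

    // DataOutput: only the bytes we write. The single version byte is the
    // sole piece of format metadata in the stream.
    static byte[] writeWithDataOutput(Point p) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeByte(FORMAT_VERSION);
            out.writeInt(p.x);
            out.writeInt(p.y);
        }
        return buf.toByteArray();
    }

    // Backward compatibility is explicit: the reader branches on the version.
    // Assumption: a hypothetical version 1 stored only x, so y defaults to 0.
    static Point readWithDataInput(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int version = in.readUnsignedByte();
            int x = in.readInt();
            int y = version >= 2 ? in.readInt() : 0;
            return new Point(x, y);
        }
    }

    public static void main(String[] args) throws IOException {
        Point p = new Point(3, 4);
        System.out.println("serialization bytes: " + writeWithSerialization(p).length);
        System.out.println("data output bytes:   " + writeWithDataOutput(p).length);
        Point back = readWithDataInput(writeWithDataOutput(p));
        System.out.println("round trip: " + back.x + "," + back.y);
    }
}
```

The DataOutput form is compact and leaves old-format handling entirely to the reader; the Serialization form will notice an incompatible class change on its own, but the reader then has to work around the metadata that is already in the stream.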
FYI, Owen has just proposed a mechanism for object & API versioning that stores the versions separately from the data.
http://issues.apache.org/jira/browse/HADOOP-558

Doug