Power and flexibility of Serialization is under exploited

Peter Firmstone Wed, 12 Oct 2011 19:06:44 -0700

Serialization has an undeservedly bad reputation, perhaps caused by too many developers just adding implements Serializable and accepting the default serialized form in public API, then turning around and saying they won't support backward compatible Serialization.

In the implementation discussed below, all objects are created through just one public API class with static factory methods, to keep it simple for user developers.

I've been adding serialization to reference collections: just a bunch of wrapper classes that encapsulate any collection framework interface and perform the boilerplate of retrieving referents, wrapping them in references and removing enqueued references from those collections, allowing the choice of Weak, Soft or Strong references with identity, equals and comparable semantics.

All the following package private wrapper classes share a single serialized form at present:


ReferenceCollection
ReferenceList
ReferenceSet
ReferenceSortedSet
ReferenceNavigableSet
ReferenceQueue
ReferenceDeque
ReferenceBlockingQueue
ReferenceBlockingDeque

The serial form is generated using writeReplace, and it recreates the correct collection using readResolve.

Now because each wrapper class is only publicly visible to the client as a Java Collections Framework (JCF) interface, the serialized form (also called a serialization proxy) rebuilds it using the standard public API factory class during de-serialization, based on the JCF interface it implemented. So the remote end is free to use another implementation.
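
As a minimal sketch of the writeReplace / readResolve pairing described above, assuming a single List wrapper: RefFactory, WrappedList, ListForm and ProxyRoundTrip are hypothetical names invented for illustration, not the classes from the implementation under discussion.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the single public factory class. */
final class RefFactory {
    private RefFactory() {}

    static <T> List<T> list(List<T> backing) {
        return new WrappedList<>(backing);
    }
}

/** Package-private wrapper: clients only ever see it as java.util.List. */
final class WrappedList<T> extends AbstractList<T> implements Serializable {
    private static final long serialVersionUID = 1L;
    private final List<T> backing; // never written: writeReplace substitutes the form below

    WrappedList(List<T> backing) { this.backing = backing; }

    @Override public T get(int index) { return backing.get(index); }
    @Override public int size() { return backing.size(); }
    @Override public void add(int index, T element) { backing.add(index, element); }

    /** Serialization writes the serial form in place of this object. */
    private Object writeReplace() {
        return new ListForm<>(new ArrayList<>(backing));
    }
}

/** The serialized form: it records the elements, not the implementation. */
final class ListForm<T> implements Serializable {
    private static final long serialVersionUID = 1L;
    private final ArrayList<T> elements;

    ListForm(ArrayList<T> elements) { this.elements = elements; }

    /** De-serialization rebuilds through the public factory. */
    private Object readResolve() {
        return RefFactory.list(elements);
    }
}

/** Helper that serializes to a byte array and reads the object back. */
final class ProxyRoundTrip {
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the stream only ever contains ListForm, the receiving end could change what RefFactory.list returns without touching the serialized form.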

Now there's a readResolve bug worth mentioning here, with regard to circular references. writeReplace replaces all original object instances with your serialized form object, but readResolve doesn't replace circularly referenced objects during de-serialization. So if you're utilising readResolve to replace your serialized form, you'll end up with a mix of the serialized form object and your freshly constructed implementation object. You'll get ClassCastExceptions etc.
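
The behaviour can be reproduced with a small self-referencing object. This is only an illustrative sketch: Holder, HolderForm and CycleDemo are hypothetical names, and the field is typed Object so the stray serial form shows up as a wrong class rather than a ClassCastException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** An object whose field points back at itself, forming a cycle. */
final class Holder implements Serializable {
    private static final long serialVersionUID = 1L;
    Object self;

    private Object writeReplace() {
        return new HolderForm(self);
    }
}

/** Serial form for Holder. */
final class HolderForm implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Object self;

    HolderForm(Object self) { this.self = self; }

    private Object readResolve() {
        Holder rebuilt = new Holder();
        // The back-reference was read before readResolve ran, so at this
        // point 'self' is this HolderForm, not the rebuilt Holder.
        rebuilt.self = self;
        return rebuilt;
    }
}

final class CycleDemo {
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the simple class name found where the cycle used to be. */
    static String classInsideCycle() {
        Holder original = new Holder();
        original.self = original;
        Holder copy = (Holder) roundTrip(original);
        return copy.self.getClass().getSimpleName();
    }
}
```

Had the field been declared as Holder rather than Object, de-serialization would fail with a ClassCastException when the stream tried to assign the HolderForm to it.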

Bob Lee, that's Crazy Bob from JSR330 and Google Guice, came up with the idea of having the serialization proxy and original objects share the same interface, then having all methods redirected to the newly built object upon de-serialization.
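
A rough sketch of that idea, under the assumption that both classes expose java.util.Collection: Bag, BagForm and SharedInterfaceDemo are hypothetical names. Any serial form left behind inside a cycle still behaves like the rebuilt collection, because it forwards every call to it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;

/** The real implementation, visible to clients only as a Collection. */
final class Bag extends AbstractCollection<Object> implements Serializable {
    private static final long serialVersionUID = 1L;
    private final ArrayList<Object> items;

    Bag(Collection<?> items) { this.items = new ArrayList<>(items); }

    @Override public Iterator<Object> iterator() { return items.iterator(); }
    @Override public int size() { return items.size(); }
    @Override public boolean add(Object o) { return items.add(o); }

    private Object writeReplace() { return new BagForm(items); }
}

/** Serial form that shares the Collection interface with Bag. */
final class BagForm extends AbstractCollection<Object> implements Serializable {
    private static final long serialVersionUID = 1L;
    private final ArrayList<Object> items;
    private transient Bag built; // only set on the receiving side, by readResolve

    BagForm(ArrayList<Object> items) { this.items = items; }

    private Object readResolve() {
        built = new Bag(items);
        return built;
    }

    // Every call is redirected to the collection built during de-serialization.
    @Override public Iterator<Object> iterator() { return built.iterator(); }
    @Override public int size() { return built.size(); }
    @Override public boolean add(Object o) { return built.add(o); }
}

final class SharedInterfaceDemo {
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Size reported by the element that used to be the self-reference. */
    static int sizeSeenThroughCycle() {
        Bag original = new Bag(new ArrayList<Object>());
        original.add("x");
        original.add(original); // the collection contains itself
        Bag copy = (Bag) roundTrip(original);
        Object last = null;
        for (Object o : copy) {
            last = o; // ends up as the leftover BagForm
        }
        return ((Collection<?>) last).size();
    }
}
```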

So to implement that, I've got an inheritance hierarchy for the serialization proxy, to separate each function:


SerializationOfReferenceCollection
                |
ReadResolveFixCollectionCircularReferences
                |
ReferenceCollectionRefreshAfterSerialization
                |
ReferenceCollectionSerialData

Now right about now, you're probably saying 4 classes in an inheritance hierarchy is a bit heavy for serialization?

Well no, not when you consider: they serialize 9 classes, and of all those classes, only one, ReferenceCollection, has to implement a final writeReplace method, while all have to implement a readObject method that throws an exception to prevent direct de-serialization.
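
A sketch of that division of labour, assuming a base class with two subclasses in the same package: Wrapper, ListWrapper, QueueWrapper, WrapperForm and GuardDemo are hypothetical names. The non-private final writeReplace is inherited by every subclass, while each class carries its own private readObject guard.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

abstract class Wrapper implements Serializable {
    private static final long serialVersionUID = 1L;
    final ArrayList<String> items;

    Wrapper(List<String> items) { this.items = new ArrayList<>(items); }

    /** Not private, so subclasses in this package inherit it; final, so none override it. */
    final Object writeReplace() {
        return new WrapperForm(getClass().getSimpleName(), items);
    }

    /** Rejects streams that name this class directly instead of the serial form. */
    private void readObject(ObjectInputStream in) throws InvalidObjectException {
        throw new InvalidObjectException("serial form required");
    }
}

final class ListWrapper extends Wrapper {
    private static final long serialVersionUID = 1L;
    ListWrapper(List<String> items) { super(items); }

    private void readObject(ObjectInputStream in) throws InvalidObjectException {
        throw new InvalidObjectException("serial form required");
    }
}

final class QueueWrapper extends Wrapper {
    private static final long serialVersionUID = 1L;
    QueueWrapper(List<String> items) { super(items); }

    private void readObject(ObjectInputStream in) throws InvalidObjectException {
        throw new InvalidObjectException("serial form required");
    }
}

/** One serial form shared by every wrapper class. */
final class WrapperForm implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String kind;
    private final ArrayList<String> items;

    WrapperForm(String kind, ArrayList<String> items) {
        this.kind = kind;
        this.items = items;
    }

    private Object readResolve() throws InvalidObjectException {
        switch (kind) {
            case "ListWrapper":  return new ListWrapper(items);
            case "QueueWrapper": return new QueueWrapper(items);
            default: throw new InvalidObjectException("unknown kind: " + kind);
        }
    }
}

final class GuardDemo {
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```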

So all 9 classes are freed from the implementation of Serialization; it's now the responsibility of the 4 classes in the serialization proxy (Serialization builder pattern) inheritance hierarchy.


Function of each class in the inheritance hierarchy:

SerializationOfReferenceCollection is an abstract class with a static factory method.

ReadResolveFixCollectionCircularReferences implements all the JCF collection based interfaces and redirects their calls to the ReferenceCollection implementation built during de-serialization.

ReferenceCollectionRefreshAfterSerialization updates all the References contained by the collection so they belong to the same garbage collection ReferenceQueue, and creates new References for all referents.

ReferenceCollectionSerialData contains the fields transferred during serialization and implements abstract methods for the super classes to "get" these fields.
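
The layering can be sketched in three levels rather than four, leaving out the interface-forwarding layer. SerialForm, RefreshingForm, SerialData and LayerDemo are hypothetical names, String referents and WeakReference are assumptions made to keep the example short, and a real collection would keep hold of the ReferenceQueue rather than dropping it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

/** Top layer: abstract entry point with the static factory and readResolve. */
abstract class SerialForm implements Serializable {
    private static final long serialVersionUID = 1L;

    static SerialForm of(List<String> referents) {
        return new SerialData(referents);
    }

    abstract List<String> referents(); // supplied by the data layer
    abstract List<WeakReference<String>> refresh(List<String> referents);

    /** Inherited by the concrete data class, which is in the same package. */
    final Object readResolve() {
        return refresh(referents());
    }
}

/** Middle layer: gives every referent a new Reference on one local queue. */
abstract class RefreshingForm extends SerialForm {
    private static final long serialVersionUID = 1L;

    @Override
    final List<WeakReference<String>> refresh(List<String> referents) {
        ReferenceQueue<String> queue = new ReferenceQueue<>();
        List<WeakReference<String>> refs = new ArrayList<>();
        for (String referent : referents) {
            refs.add(new WeakReference<>(referent, queue));
        }
        return refs;
    }
}

/** Bottom layer: holds only the fields that travel in the stream. */
final class SerialData extends RefreshingForm {
    private static final long serialVersionUID = 1L;
    private final ArrayList<String> referents;

    SerialData(List<String> referents) {
        this.referents = new ArrayList<>(referents);
    }

    @Override
    List<String> referents() { return referents; }
}

final class LayerDemo {
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Only SerialData declares fields, so a change to the transferred data touches one class while the rebuilding logic above it stays put.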

Now the interesting part is, I'm considering having three different serialized forms, each with a different purpose, that the client can choose from:

1. A non-serialization class that prevents serialization, where a developer wants to prevent access to serialized state.


2. The default serial data.

3. Defensive copying of serial data, to prevent stolen references to internal state during de-serialization.


The choice between the three serial forms can be left until runtime. The recipient of these objects when serialized doesn't have a choice of which serial form is used; only the creator of the original object does.

Items 1 and 3 would only be used in a local sense, where a client program might try to use serialization to gain access to internal implementation state.

Item 2 would be used in a genuine distributed environment, over a secure connection, where there is no point using defensive copies.
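
Purely as a hypothetical sketch of how the creator might pick among the three forms at runtime: SerialPolicy, Guarded, PlainForm, CopyingForm and PolicyDemo are invented names, and none of this reflects code that exists in the implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.ArrayList;

/** Chosen by whoever creates the object, never by the recipient. */
enum SerialPolicy { FORBIDDEN, DEFAULT, DEFENSIVE_COPY }

final class Guarded implements Serializable {
    private static final long serialVersionUID = 1L;
    final ArrayList<String> items; // adopted as-is, not copied
    private final SerialPolicy policy;

    Guarded(ArrayList<String> items, SerialPolicy policy) {
        this.items = items;
        this.policy = policy;
    }

    private Object writeReplace() throws ObjectStreamException {
        switch (policy) {
            case FORBIDDEN:
                throw new NotSerializableException(Guarded.class.getName());
            case DEFENSIVE_COPY:
                return new CopyingForm(items);
            default:
                return new PlainForm(items);
        }
    }
}

/** Item 2: trusts the stream and adopts the de-serialized list directly. */
final class PlainForm implements Serializable {
    private static final long serialVersionUID = 1L;
    private final ArrayList<String> items;

    PlainForm(ArrayList<String> items) { this.items = items; }

    private Object readResolve() {
        return new Guarded(items, SerialPolicy.DEFAULT);
    }
}

/** Item 3: copies the list so nothing else in the stream can alias it. */
final class CopyingForm implements Serializable {
    private static final long serialVersionUID = 1L;
    private final ArrayList<String> items;

    CopyingForm(ArrayList<String> items) { this.items = items; }

    private Object readResolve() {
        return new Guarded(new ArrayList<>(items), SerialPolicy.DEFENSIVE_COPY);
    }
}

final class PolicyDemo {
    static Object roundTrip(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True when the FORBIDDEN policy makes serialization fail. */
    static boolean refusesSerialization() {
        try {
            roundTrip(new Guarded(new ArrayList<String>(), SerialPolicy.FORBIDDEN));
            return false;
        } catch (IllegalStateException e) {
            return e.getCause() instanceof NotSerializableException;
        }
    }
}
```

Since PlainForm and CopyingForm are separate classes, both can be de-serialized by the same receiver, and a further form could be added later without retiring either.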

I've only implemented Item 2 of course. I decided that while it is possible to do 1 and 3 as well, to demonstrate just how flexible serialization can be, it wasn't warranted based on that alone. It will be possible to do this at some point in future, or to change the serial form in a non-compatible manner, by adding a new serial form class while retaining the original, so that both the old and new serial forms can be de-serialized.

When you apply object design principles of responsibility, even serialization can be flexible.

Serialized form lock-in is the same as inappropriate use of public fields or other poor programming practices. Note that there are times where standard rules don't apply, like the use of public fields in Entries, which is totally appropriate, just as accepting the standard serial form in package private classes is appropriate too.


Cheers,

Peter.
