On Sep 10, 2008, at 10:09 AM, Charles Oliver Nutter wrote:

>
> Attila Szegedi wrote:
>> Well, overriding resolveClass() seems like a good solution to me. What
>> problems do you have with that?
>
> I can give you one reason that it's not good enough, at least for the
> JRuby use case of needing to do our own initialization of objects as
> they're deserialized: I can't make others use our overridden OIS
> subclass, like app servers and so on.

That's true -- I specifically said this only works when it's your code
doing the deserialization. Yeah, serialization is seriously broken,
and there's not much you can do to fix it. Some runtime objects in a
JVM just don't lend themselves to being serialized; threads and class
loaders are prime examples. And since a class loader is part of a
class's identity, a class can't be faithfully written to a stream
either, so there you go...
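
For reference, the resolveClass() override discussed above can be as small as the following. This is only a sketch: the class name LoaderAwareObjectInputStream and the roundTrip helper are made up for illustration, and it helps only when your own code constructs the stream, which is exactly the limitation Charlie points out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;

// Hypothetical name: an ObjectInputStream that resolves classes against
// an explicitly supplied class loader instead of the default one.
class LoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            // initialize=false: only locate the class, don't run static initializers here
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // Fall back to the stock lookup (which also handles primitive type names)
            return super.resolveClass(desc);
        }
    }

    // Illustration helper: serialize to memory, then read back through the given loader.
    static Object roundTrip(Object value, ClassLoader loader)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new LoaderAwareObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()), loader)) {
            return in.readObject();
        }
    }
}
```

Note that the stream format still records only class names, so the loader has to be supplied out of band by whoever creates the stream.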

Even if someone were to embark on a project to create a new pair of
serialization stream classes that can somehow represent class loader
identity, it would be a very tough nut to crack. Still, here are some
ideas:

* URLClassLoaders can be "serialized" by storing their URL lists. Of
course, if the URLs aren't global, then deserialization could fail if
the JAR files aren't in the same location. Further, you would end up
with duplicate class loaders on multiple deserializations, unless you
happen to have a global registry of sorts.

* A better idea would be to have a JNDI (javax.naming) or similar
context that could be passed to the serialization/deserialization
streams to bind class loaders to names known in the JVM instance, and
have the ObjectStreamClass also contain the JNDI name of the class's
loader. It'd turn into a headache for "anonymous" class loaders
created to load generated code though, but...

* ... on the subject of generated code, I think a language runtime
creating classes on-the-fly for the purpose of holding compiled code
could have those classes define writeReplace to have their instances
replaced with a code generator object. The code generator would in
turn need to have a readResolve method to generate the class,
instantiate it, and return it. Maybe utilize a global cache keyed by
the SHA-1 digest of the code to eliminate duplicates.
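
To make the writeReplace/readResolve idea a bit more concrete, here is a rough sketch. Everything in it is hypothetical: CompiledUnit stands in for an instance of a runtime-generated class, Recipe plays the role of the "code generator object", and "compiling" is reduced to wrapping the source string, since real bytecode generation is beyond what fits here.

```java
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for an instance of a class generated at runtime.
final class CompiledUnit implements Serializable {
    private static final long serialVersionUID = 1L;

    // Global cache keyed by the SHA-1 hex digest of the code, to eliminate duplicates.
    private static final ConcurrentMap<String, CompiledUnit> CACHE = new ConcurrentHashMap<>();

    private final String source;

    private CompiledUnit(String source) {
        this.source = source;
    }

    // Stand-in for "generate the class and instantiate it"; identical code
    // yields the same cached instance.
    static CompiledUnit compile(String source) {
        return CACHE.computeIfAbsent(sha1Hex(source), key -> new CompiledUnit(source));
    }

    String source() {
        return source;
    }

    // Called by serialization: the stream receives the recipe, never this object.
    private Object writeReplace() {
        return new Recipe(source);
    }

    static String sha1Hex(String text) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : sha1.digest(text.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // The "code generator object" that actually travels in the stream.
    private static final class Recipe implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String source;

        Recipe(String source) {
            this.source = source;
        }

        // Called by deserialization: regenerate (or fetch from the cache) and return that.
        private Object readResolve() {
            return compile(source);
        }
    }
}
```

Within one JVM a round trip hands back the cached instance; in a fresh JVM the recipe would regenerate it on first use.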

I know, it's a bit scary to store the actual code of the class in a
serialized stream, but if you think of it, that's exactly the defining
aspect of the generated-code classes. Unless you can store an
external reference to the source code, that is. That's exactly what
I'm doing in a Rhino-based system I have, where we're serializing
continuations. I can avoid serializing the objects that contain actual
function code (reachable through the Function object), because I'm
registering such code objects from all scripts in a map, and using
that in my serialization subclasses to replace them with named stubs.
So, if you have a script loaded from "foo.js" that looks like this:

function a()
{
     var b = function()
     {
         // ...
     };
     var c = function()
     {
         // ...
     };
}

then the code object of the script itself (which is a top-level
function in JS) is registered as "foo.js", the code of function a as
"foo.js/0", the code of its nested function b as "foo.js/0/0", the
code of function c as "foo.js/0/1", and so on. Then I just use these
named stubs in serialization/deserialization. Deserialization must
also be able to load and compile the script if it's not yet loaded
when a stub referencing it is encountered; you can see how this only
works in a larger framework that has the notion of actual identifiers
(i.e. URIs) for scripts' source code... I even go to the trouble of
storing MD5 hashes of the functions' code with the stub; if a hash
doesn't match the MD5 hash of the corresponding function on the
continuation stack at deserialization time, it fails early instead of
running a corrupted continuation (one that could now contain a bogus
return address in a callee stack frame because the caller's code
changed).
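
Roughly, the stub mechanism can be sketched with the standard replaceObject/resolveObject hooks. This is not the actual code from my system: FunctionCode, CodeStub, CodeRegistry and the two stream class names are invented for the illustration, and the load-and-compile-on-demand step is left out (an unknown name simply fails).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidObjectException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for a (non-serializable) compiled function body.
final class FunctionCode {
    final String source;
    FunctionCode(String source) { this.source = source; }

    byte[] md5() {
        try {
            return MessageDigest.getInstance("MD5").digest(source.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}

// What goes into the stream in place of a FunctionCode.
final class CodeStub implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name; // e.g. "foo.js/0/1"
    final byte[] md5;  // digest of the code at serialization time
    CodeStub(String name, byte[] md5) { this.name = name; this.md5 = md5; }
}

// Two-way map between registered names and code objects.
final class CodeRegistry {
    private final Map<String, FunctionCode> byName = new HashMap<>();
    private final Map<FunctionCode, String> byCode = new IdentityHashMap<>();

    void register(String name, FunctionCode code) {
        byName.put(name, code);
        byCode.put(code, name);
    }
    String nameOf(FunctionCode code) { return byCode.get(code); }
    FunctionCode lookup(String name) { return byName.get(name); }
}

final class StubbingOutputStream extends ObjectOutputStream {
    private final CodeRegistry registry;

    StubbingOutputStream(OutputStream out, CodeRegistry registry) throws IOException {
        super(out);
        this.registry = registry;
        enableReplaceObject(true); // without this, replaceObject is never called
    }

    @Override
    protected Object replaceObject(Object obj) throws IOException {
        if (!(obj instanceof FunctionCode)) return obj;
        FunctionCode code = (FunctionCode) obj;
        String name = registry.nameOf(code);
        if (name == null) throw new NotSerializableException("unregistered code object");
        return new CodeStub(name, code.md5());
    }
}

final class StubResolvingInputStream extends ObjectInputStream {
    private final CodeRegistry registry;

    StubResolvingInputStream(InputStream in, CodeRegistry registry) throws IOException {
        super(in);
        this.registry = registry;
        enableResolveObject(true); // without this, resolveObject is never called
    }

    @Override
    protected Object resolveObject(Object obj) throws IOException {
        if (!(obj instanceof CodeStub)) return obj;
        CodeStub stub = (CodeStub) obj;
        FunctionCode code = registry.lookup(stub.name);
        if (code == null) throw new InvalidObjectException("no code registered as " + stub.name);
        // Fail early if the code changed since the stream was written.
        if (!MessageDigest.isEqual(stub.md5, code.md5()))
            throw new InvalidObjectException("code changed since serialization: " + stub.name);
        return code;
    }
}
```

Both hooks see every object in the graph, so anything that isn't a code object or a stub is passed through untouched.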

But again, this is part of a larger framework ecosystem where various  
components cooperate to provide stubs, load and compile scripts during  
deserialization as needed, etc. Which only goes to show that stock  
serialization in itself is indeed insufficient for more serious uses...

Attila.

--
weblog: http://constc.blogspot.com
twitter: http://twitter.com/szegedi



>
>
> - Charlie


