On Sep 10, 2008, at 10:09 AM, Charles Oliver Nutter wrote:

> Attila Szegedi wrote:
>> Well, overriding resolveClass() seems like a good solution to me.
>> What problems do you have with that?
>
> I can give you one reason that it's not good enough, at least for the
> JRuby use case of needing to do our own initialization of objects as
> they're deserialized: I can't make others use our overridden OIS
> subclass, like app servers and so on.
That's true -- I specifically said this only works when it's your code doing the deserialization.

Yeah, serialization is seriously broken, and there's not much you can do to fix it. Some runtime objects in a JVM just don't lend themselves to being serialized; threads and class loaders are prime examples. And since a class loader is part of a class's identity, there you go...

Even if someone embarked on a project to create a new pair of serialization stream classes that could somehow represent class loader identity, it would be a very tough nut to crack. Still, here are some ideas:

* URLClassLoaders can be "serialized" by storing their URL lists. Of course, if the URLs aren't global, then deserialization could fail if the JAR files aren't in the same location. Further, you would end up with duplicate class loaders on multiple deserializations, unless you happened to have a global registry of sorts.

* A better idea would be to have a JNDI (javax.naming) or similar context that could be passed to the serialization/deserialization streams to bind class loaders to names known in the JVM instance, and have the ObjectStreamClass also contain the JNDI name of the class's loader. It'd turn into a headache for "anonymous" class loaders created to load generated code though, but...

* ... on the subject of generated code, I think a language runtime creating classes on the fly to hold compiled code could have those classes define writeReplace, so that their instances are replaced with a code generator object. The code generator would in turn need a readResolve method to generate the class, instantiate it, and return the instance. Maybe use a global cache keyed by the SHA-1 digest of the code to eliminate duplicates. I know, it's a bit scary to store the actual code of the class in a serialized stream, but if you think about it, the code is exactly the defining aspect of generated-code classes. Unless you can store an external reference to the source code instead.
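To make the writeReplace/readResolve idea a bit more concrete, here is a minimal sketch. All names in it (CompiledCode, CodeStub, StubDemo, compile) are hypothetical, and compile() is only a placeholder: it wraps the source text where a real runtime would generate and load a class. What it does show is the stub replacing the object on the way out, and the digest-keyed cache handing back the existing instance on the way in.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for an instance of a class generated at runtime. */
final class CompiledCode implements Serializable {
    private static final long serialVersionUID = 1L;

    // Global cache keyed by the hex SHA-1 digest of the code.
    private static final ConcurrentMap<String, CompiledCode> CACHE =
            new ConcurrentHashMap<>();

    private final String source;

    private CompiledCode(String source) {
        this.source = source;
    }

    /** Placeholder "compiler"; assumption: a real one would emit bytecode. */
    static CompiledCode compile(String source) {
        return CACHE.computeIfAbsent(sha1Hex(source), d -> new CompiledCode(source));
    }

    String source() {
        return source;
    }

    // Called by ObjectOutputStream: the stub is written instead of this object.
    private Object writeReplace() {
        return new CodeStub(source);
    }

    static String sha1Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(text.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }
}

/** The "code generator" object that actually travels in the stream. */
final class CodeStub implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String source;

    CodeStub(String source) {
        this.source = source;
    }

    // Called by ObjectInputStream: regenerate the code, or reuse a cached copy.
    private Object readResolve() {
        return CompiledCode.compile(source);
    }
}

final class StubDemo {
    /** Serializes and deserializes a value through an in-memory buffer. */
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Within a single JVM, StubDemo.roundTrip(CompiledCode.compile(src)) hands back the very same instance, because readResolve finds it in the cache; in a fresh JVM it would be regenerated from the code carried by the stub. Swapping the code for an external reference to the source would only change what the stub carries.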
That's exactly what I'm doing in a Rhino-based system I have, where we're serializing continuations. I can avoid serializing the objects that contain actual function code (reachable through the Function object), because I'm registering such code objects from all scripts in a map, and using that map in my serialization subclasses to replace them with named stubs. So, if you have a script loaded from "foo.js" that looks like this:

    function a() {
        var b = function() { .... }
        var c = function() { .... }
    }

then the code object of the script itself (which is a top-level function in JS) is registered as "foo.js", the code of function a is registered as "foo.js/0", the code of its nested function b as "foo.js/0/0", the code of function c as "foo.js/0/1", etc. Then I'm just using such named stubs in serialization/deserialization.

Deserialization must also be able to load and compile the script if it's not yet loaded when a stub referencing it is encountered; you can see how this only works in a larger framework that has the notion of actual identifiers (i.e. URIs) for scripts' source code...

I even go to the trouble of storing MD5 hashes of the functions' code with the stub. If a stored hash doesn't match the MD5 hash of the corresponding function on the continuation stack at deserialization time, deserialization fails early instead of running a corrupted continuation (one that could now contain a bogus return address in a callee stack frame because the caller's code changed).

But again, this is part of a larger framework ecosystem where various components cooperate to provide stubs, load and compile scripts during deserialization as needed, etc. Which only goes to show that stock serialization in itself is indeed insufficient for more serious uses...

Attila.

--
weblog: http://constc.blogspot.com
twitter: http://twitter.com/szegedi

> - Charlie

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "JVM Languages" group.
To post to this group, send email to jvm-languages@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/jvm-languages?hl=en
-~----------~----~----~----~------~----~------~--~---