Alan McKean wrote:
Here at GemStone we are investigating adding transparent object 'persistence by reachability' to JRuby. We have run JRuby in our JVM and have the persistence working. But we end up persisting too much because of the attachment of JRuby objects to the runtime. Whenever an object is saved, all objects reachable via its instance variables are also saved. When it can reach the runtime, we end up persisting way more than we want, and trying to persist some things that are not persistable (threads, for example). I realize that the coupling to the runtime is a problem in supporting serialization/deserialization as well.

It looks like the gateway to the runtime is via the metaclass instance variable in JRubyObject. Couldn't that be made transient? This would go a long way toward decoupling the runtime from the application objects for the purposes of serialization and persistence.
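To make the effect of the transient change concrete, here is a minimal sketch using plain Java serialization. None of these classes are JRuby's; SketchRuntime, SketchMetaClass, CoupledObject and DecoupledObject are hypothetical stand-ins, and the only assumption is that the runtime is reachable through a single metaclass field and holds something unserializable such as a thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for the runtime. It owns a Thread and is
// deliberately not Serializable.
class SketchRuntime {
    final Thread worker = new Thread(() -> { });
}

// Hypothetical stand-in for a metaclass: the object's only path to the runtime.
class SketchMetaClass implements Serializable {
    private static final long serialVersionUID = 1L;
    final SketchRuntime runtime;
    SketchMetaClass(SketchRuntime runtime) { this.runtime = runtime; }
}

// Coupled: the metaclass is an ordinary field, so serialization follows it
// all the way into the runtime.
class CoupledObject implements Serializable {
    private static final long serialVersionUID = 1L;
    final SketchMetaClass metaClass;
    final String payload;
    CoupledObject(SketchMetaClass metaClass, String payload) {
        this.metaClass = metaClass;
        this.payload = payload;
    }
}

// Decoupled: the metaclass field is transient, so serialization stops here.
class DecoupledObject implements Serializable {
    private static final long serialVersionUID = 1L;
    transient SketchMetaClass metaClass;
    final String payload;
    DecoupledObject(SketchMetaClass metaClass, String payload) {
        this.metaClass = metaClass;
        this.payload = payload;
    }
}

class TransientSketch {
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // True if the whole reachable graph can be written.
    static boolean canSerialize(Object o) {
        try {
            serialize(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes and immediately deserializes the object.
    static Object roundTrip(Object o) {
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(serialize(o)))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch the coupled object cannot be written at all, while the decoupled one round-trips with its data intact and a null metaclass field, which is exactly the reattachment question.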

I talked with the GemStone guys yesterday and the transient change seems like a trivial one to add. There are various interesting details for reconstituting objects, however...more below.

Reattaching persisted objects to the runtime would be more difficult. I am told that there have been ideas about how to do this voiced in the past, but the issue has been deferred until now. One suggestion for reattaching a persistent object to the runtime is to have an accessor that lazily initializes the runtime connection. On the first invocation after the object is loaded from the database or deserialized, it would restore the connection to the runtime. Both the overhead and the impact to the code base would be minimal.
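The lazy accessor idea might look roughly like the sketch below. It is an illustration only: LazyRuntime, LazyRuntimeLocator and LazyObject are hypothetical names, and the static locator is an assumption standing in for whatever mechanism would actually hand out a runtime.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a runtime.
class LazyRuntime {
    final String name;
    LazyRuntime(String name) { this.name = name; }
}

// Hypothetical lookup point; a single static slot is the simplest assumption.
class LazyRuntimeLocator {
    private static volatile LazyRuntime current;

    static void install(LazyRuntime runtime) { current = runtime; }

    static LazyRuntime lookup() {
        LazyRuntime r = current;
        if (r == null) {
            throw new IllegalStateException("no runtime installed");
        }
        return r;
    }
}

class LazyObject implements Serializable {
    private static final long serialVersionUID = 1L;

    // Not written out; null right after deserialization.
    private transient LazyRuntime runtime;
    final String payload;

    LazyObject(LazyRuntime runtime, String payload) {
        this.runtime = runtime;
        this.payload = payload;
    }

    // Restores the runtime link on first use, then caches it.
    LazyRuntime getRuntime() {
        LazyRuntime r = runtime;
        if (r == null) {
            r = LazyRuntimeLocator.lookup();
            runtime = r;
        }
        return r;
    }

    boolean isAttached() { return runtime != null; }
}

class LazySketch {
    // Serializes and immediately deserializes the object.
    static LazyObject roundTrip(LazyObject o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (LazyObject) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After the first call the cost is one null check per access. The sketch leaves open how the locator picks the right runtime when more than one exists.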

One problem is that we're dealing with two different serialization mechanisms in JRuby. We need to support Ruby's marshalling specification, where a Ruby runtime reads objects off a string and brings them into existence, and we want to support general Java serialization, where there may or may not be a Ruby runtime available.

Ruby marshalling as the top-level mechanism is an easy problem to solve...since we have a runtime doing the unmarshalling, we can reconstitute links to metaclass et al as objects are being created. If we want to support serializing arbitrary Java objects, that's possible too...we just add in custom marshal code to the Java wrapper types that calls out to Java serialization.
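As a rough illustration of marshal code that calls out to Java serialization, the sketch below embeds a Java-serialized payload inside a tagged, length-prefixed record. The record layout, the 'J' tag and the JavaWrapperMarshal class are all made up for this example; this is not Ruby's actual Marshal format, nor an existing JRuby API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class JavaWrapperMarshal {
    // Hypothetical type tag marking "wrapped Java object" in the outer stream.
    static final byte TAG_JAVA = 'J';

    // Writes: tag, payload length, then the Java-serialized payload.
    static byte[] dump(Serializable wrapped) {
        try {
            ByteArrayOutputStream payload = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(payload)) {
                out.writeObject(wrapped);
            }
            ByteArrayOutputStream record = new ByteArrayOutputStream();
            DataOutputStream data = new DataOutputStream(record);
            data.writeByte(TAG_JAVA);
            data.writeInt(payload.size());
            payload.writeTo(data);
            data.flush();
            return record.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the record back and hands the payload to Java deserialization.
    static Object load(byte[] record) {
        try {
            DataInputStream data =
                new DataInputStream(new ByteArrayInputStream(record));
            byte tag = data.readByte();
            if (tag != TAG_JAVA) {
                throw new IllegalArgumentException("unexpected tag: " + tag);
            }
            byte[] payload = new byte[data.readInt()];
            data.readFully(payload);
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(payload))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The point is only that the outer stream stays in charge, and the Java object travels as an opaque blob inside it.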

Java serialization as the top-level mechanism is significantly more complicated. Even if we mark unsafe fields of objects as transient (to avoid pulling in RubyClass/Ruby and friends) we need a way to put them back. We can reconstitute the raw data in an object, but we have no way to get an appropriate metaclass from an appropriate runtime. So there are a few thoughts here:

1. We could work to eliminate ties to a specific runtime. This makes Ruby more of a first-class JVM language, since Ruby objects are "just Java objects". This solves all serialization problems. However, Ruby objects have a link back to a JRuby runtime for a good reason: "just Java objects" don't have all the rich semantics we need for Ruby behavior. Even if we sever the tie completely, we still need a way to get at the runtime, threadcontexts, runtime-global state and so on. That means finding a new place to get at the runtime. Ideas: threadlocal, classloader-local...either way it means an additional potentially slow lookup, where now it's two field accesses and a couple of method calls.

2. We could provide a standard place for deserialized Ruby objects to "go get" a runtime, perhaps through a threadlocal or classloader-local location. This sounds a little ugly, since we're now tying JRuby runtime access to a mechanism that disallows having many JRuby runtimes shared across threads and classloaders. But since at this point we're now starting to deal more with Ruby as a top-level JVM language and less as a general-purpose Ruby runtime, it may be acceptable.
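A minimal sketch of the threadlocal variant of option 2, under loudly stated assumptions: TlRuntime, TlRuntimeHolder and TlObject are hypothetical names, and the private readObject hook is just one possible place to do the "go get" step during Java deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a runtime.
class TlRuntime {
    final String name;
    TlRuntime(String name) { this.name = name; }
}

// Hypothetical "standard place": one runtime per thread.
class TlRuntimeHolder {
    private static final ThreadLocal<TlRuntime> CURRENT = new ThreadLocal<>();

    static void set(TlRuntime runtime) { CURRENT.set(runtime); }
    static void clear() { CURRENT.remove(); }

    static TlRuntime get() {
        TlRuntime r = CURRENT.get();
        if (r == null) {
            throw new IllegalStateException("no runtime bound to this thread");
        }
        return r;
    }
}

class TlObject implements Serializable {
    private static final long serialVersionUID = 1L;

    private transient TlRuntime runtime;
    final String payload;

    TlObject(TlRuntime runtime, String payload) {
        this.runtime = runtime;
        this.payload = payload;
    }

    TlRuntime getRuntime() { return runtime; }

    // Standard Java serialization hook: restore the data fields, then attach
    // to whichever runtime is bound to the deserializing thread.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        runtime = TlRuntimeHolder.get();
    }
}

class TlSketch {
    static byte[] toBytes(TlObject o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static TlObject fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes))) {
            return (TlObject) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The object attaches to whatever runtime the deserializing thread has bound, not necessarily the one it was created in, and deserializing on a thread with nothing bound fails. That is the restriction on sharing runtimes across threads in concrete form.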

To make one thing perfectly clear though: we do want to fix this, and we're willing to make changes to JRuby to support it. There was a bit of handwaving when the subject of Java serialization came up in the past, largely because it was out of scope for 1.0. But it's most definitely in scope now, and we need a solution.

- Charlie
