Alan McKean wrote:
Here at GemStone we are investigating adding transparent object
'persistence by reachability' to JRuby. We have run JRuby in our JVM and
have the persistence working. But we end up persisting too much because
of the attachment of JRuby objects to the runtime. Whenever an object is
saved, all objects reachable via its instance variables are also saved.
When it can reach the runtime, we end up persisting way more than we
want, and trying to persist some things that are not persistable
(threads, for example). I realize that the coupling to the runtime is a
problem in supporting serialization/deserialization as well.
It looks like the gateway to the runtime is via the metaclass instance
variable in JRubyObject. Couldn't that be made transient? This would go
a long way toward decoupling the runtime from the application objects
for the purposes of serialization and persistence.
I talked with the GemStone guys yesterday and the transient change seems
like a trivial one to add. There are various interesting details for
reconstituting objects, however...more below.
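To show what the transient change buys us, here is a minimal sketch in plain Java. None of these are real JRuby classes: FakeRuntime, FakeMetaClass and FakeRubyObject are made-up stand-ins, and the assumption is simply that the metaclass field is the only path from an object to the runtime.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for the runtime: not Serializable, and it owns a thread.
class FakeRuntime {
    final Thread worker = new Thread(() -> { });
}

// Hypothetical stand-in for a metaclass that points back at the runtime.
class FakeMetaClass {
    final FakeRuntime runtime;
    final String name;

    FakeMetaClass(FakeRuntime runtime, String name) {
        this.runtime = runtime;
        this.name = name;
    }
}

// Hypothetical stand-in for an application object.
class FakeRubyObject implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: Java serialization skips this field, so the walk never
    // reaches FakeMetaClass, FakeRuntime, or the thread behind them.
    transient FakeMetaClass metaClass;

    // Ordinary state that should be saved.
    final String ivar;

    FakeRubyObject(FakeMetaClass metaClass, String ivar) {
        this.metaClass = metaClass;
        this.ivar = ivar;
    }
}

class TransientDemo {
    // Serialize to a byte array and read the object straight back.
    static FakeRubyObject roundTrip(FakeRubyObject obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (FakeRubyObject) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

Without the transient keyword, writeObject would fail with a NotSerializableException on FakeMetaClass. With it, the copy comes back with its data intact and a null metaClass, which is exactly the reattachment problem discussed below.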
Reattaching persisted objects to the runtime would be more difficult. I
am told that there have been ideas about how to do this voiced in the
past, but the issue has been deferred until now. One suggestion for
reattaching a persistent object to the runtime is to have an accessor
that lazily initializes the runtime connection. On the first invocation
after the object is loaded from the database or deserialized, it would
restore the connection to the runtime. Both the overhead and the impact
to the code base would be minimal.
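A rough sketch of that lazy accessor, under some loud assumptions: MetaClassRegistry is a hypothetical placeholder for "wherever the runtime can be found", and the object persists its metaclass name so it knows what to ask for later. Neither is how JRuby is laid out today.

```java
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical lookup point standing in for the runtime.
class MetaClassRegistry {
    private static final Map<String, Object> BY_NAME = new ConcurrentHashMap<>();

    static void define(String name, Object metaClass) {
        BY_NAME.put(name, metaClass);
    }

    static Object lookup(String name) {
        Object metaClass = BY_NAME.get(name);
        if (metaClass == null) {
            throw new IllegalStateException("no metaclass named " + name);
        }
        return metaClass;
    }
}

// Hypothetical application object that reattaches itself on first use.
class LazyObject implements Serializable {
    private static final long serialVersionUID = 1L;

    // Persisted: enough information to find the metaclass again.
    private final String metaClassName;

    // Not persisted: null right after loading from a database or stream.
    private transient Object metaClass;

    LazyObject(String metaClassName) {
        this.metaClassName = metaClassName;
    }

    // All access goes through here; the first call restores the link.
    // Racing threads may both do the lookup, but they get the same answer.
    Object getMetaClass() {
        Object mc = metaClass;
        if (mc == null) {
            mc = MetaClassRegistry.lookup(metaClassName);
            metaClass = mc;
        }
        return mc;
    }

    boolean isAttached() {
        return metaClass != null;
    }
}
```

The cost after the first call is one null check per access. The harder question, which the sketch dodges with a static registry, is where the lookup should go when there is more than one runtime.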
One problem is that we're dealing with two different serialization
mechanisms in JRuby. We need to support Ruby's marshalling
specification, where a Ruby runtime reads objects off a string and
brings them into existence, and we want to support general Java
serialization, where there may or may not be a Ruby runtime available.
Ruby marshalling as the top-level mechanism is an easy problem to
solve...since we have a runtime doing the unmarshalling, we can
reconstitute links to metaclass et al as objects are being created. If
we want to support serializing arbitrary Java objects, that's possible
too...we just add in custom marshal code to the Java wrapper types that
calls out to Java serialization.
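The "calls out to Java serialization" part could look roughly like the helper below. JavaPayloadMarshal is a hypothetical name, and how the resulting bytes get embedded in the Ruby marshal stream by the wrapper types is deliberately left out. The sketch only shows the Java side of the hand-off.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper a Java wrapper type could delegate to from its
// custom marshal code.
class JavaPayloadMarshal {
    // Turn a wrapped Java object into bytes using Java serialization.
    static byte[] dump(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("could not serialize wrapped object", e);
        }
    }

    // Rebuild the wrapped Java object from bytes produced by dump.
    static Object load(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not deserialize wrapped object", e);
        }
    }
}
```

Since the Ruby runtime is driving the unmarshal in this case, it can rewrap whatever load returns and attach the right metaclass itself.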
Java serialization as the top-level mechanism is significantly more
complicated. Even if we mark unsafe fields of objects as transient (to
avoid pulling in RubyClass/Ruby and friends) we need a way to put them
back. We can reconstitute the raw data in an object, but we have no way
to get an appropriate metaclass from an appropriate runtime. So there
are a few thoughts here:
1. We could work to eliminate ties to a specific runtime. This makes
Ruby more of a first-class JVM language, since Ruby objects are "just
Java objects". This solves all serialization problems. However, Ruby
objects have a link back to a JRuby runtime for a good reason: "just
Java objects" don't have all the rich semantics we need for Ruby
behavior. Even if we sever the tie completely, we still need a way to
get at the runtime, threadcontexts, runtime-global state and so on. That
means finding a new place to get at the runtime. Ideas: threadlocal,
classloader-local...either way it means an additional potentially slow
lookup, where now it's two field accesses and a couple method calls.
2. We could provide a standard place for deserialized Ruby objects to
"go get" a runtime, perhaps through a threadlocal or classloader-local
location. This sounds a little ugly, since we're now tying JRuby runtime
access to a mechanism that disallows having many JRuby runtimes shared
across threads and classloaders. But since at this point we're now
starting to deal more with Ruby as a top-level JVM language and less as
a general-purpose Ruby runtime, it may be acceptable.
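To make the threadlocal variant concrete, here is a sketch of a "go get a runtime" hook wired into Java deserialization. SketchRuntime, CurrentRuntime and ReattachingObject are all hypothetical names; the only real API used is the standard private readObject hook from java.io.Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a runtime.
class SketchRuntime {
    final String label;

    SketchRuntime(String label) {
        this.label = label;
    }
}

// Hypothetical "standard place" to find a runtime: one per thread.
class CurrentRuntime {
    private static final ThreadLocal<SketchRuntime> CURRENT = new ThreadLocal<>();

    static void set(SketchRuntime runtime) {
        CURRENT.set(runtime);
    }

    static void clear() {
        CURRENT.remove();
    }

    static SketchRuntime get() {
        SketchRuntime runtime = CURRENT.get();
        if (runtime == null) {
            throw new IllegalStateException("no runtime bound to this thread");
        }
        return runtime;
    }
}

// Hypothetical application object that reattaches during deserialization.
class ReattachingObject implements Serializable {
    private static final long serialVersionUID = 1L;

    final String ivar;
    transient SketchRuntime runtime; // never written to the stream

    ReattachingObject(SketchRuntime runtime, String ivar) {
        this.runtime = runtime;
        this.ivar = ivar;
    }

    // Standard serialization hook: restore normal fields, then pick up
    // whichever runtime is bound to the deserializing thread.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        runtime = CurrentRuntime.get();
    }
}

class ReattachDemo {
    static byte[] write(ReattachingObject obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("write failed", e);
        }
    }

    static ReattachingObject read(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ReattachingObject) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("read failed", e);
        }
    }
}
```

The object lands in whatever runtime the reading thread has bound, not the one it was written from. That is the ugliness mentioned above: a thread sees one runtime at a time, and every lookup pays for a ThreadLocal read instead of a field access.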
To make one thing perfectly clear though: we do want to fix this, and
we're willing to make changes to JRuby to support it. There was a bit of
handwaving when the subject of Java serialization came up in the past,
largely because it was out of scope for 1.0. But it's most definitely in
scope now, and we need a solution.
- Charlie