On Jun 26, 2012, at 11:39 AM, exabrial wrote:

> Thanks for pointing exactly where I need to look!
>
> I'm relieved to see that the underlying protocol isn't CORBA/IIOP. It looks
> like it's sort of a custom protocol. The request is encapsulated with a
> request type (auth, jndi, or ejb), then it's reading serialized Java objects
> with ObjectInputStream. Overall, it's probably pretty danged fast.
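For anyone following along, the framing described above amounts to roughly this sketch. The class and method names here are invented for illustration, not OpenEJB's actual code:

```java
import java.io.*;

// Hypothetical illustration of the framing described above: a one-byte
// request-type tag followed by ordinary Java serialization.
public class FramingSketch {
    static final byte REQ_AUTH = 0, REQ_JNDI = 1, REQ_EJB = 2;

    static byte[] writeRequest(byte type, Serializable payload) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        buf.write(type);                          // one-byte request type
        ObjectOutputStream out = new ObjectOutputStream(buf);
        out.writeObject(payload);                 // body is plain serialization
        out.flush();
        return buf.toByteArray();
    }

    static Object readRequest(byte[] bytes) throws IOException, ClassNotFoundException {
        ByteArrayInputStream buf = new ByteArrayInputStream(bytes);
        int type = buf.read();                    // dispatch on this: auth, jndi, or ejb
        ObjectInputStream in = new ObjectInputStream(buf);
        return in.readObject();
    }

    public static void main(String[] args) throws Exception {
        byte[] req = writeRequest(REQ_JNDI, "lookup:SomeBean");
        System.out.println(readRequest(req));     // prints lookup:SomeBean
    }
}
```

The point is just that a single tag byte routes the request, and everything after it is ordinary serialization.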
At one point the protocol included custom ObjectInputStream/ObjectOutputStream implementations I wrote that *were* faster. JVM optimizations put an end to that: instead of being 30% faster, they were actually slower :) So we peeled them out and reverted to the built-in JVM implementations.

But overall the protocol has been written with an intimate knowledge of serialization, and it attempts to avoid some of the chattier parts of it. For example, object serialization writes structure information about a class (once), then writes the data for that class (once per instance). Generally speaking, we've cut out the structure part of all our objects and get straight to the data-writing part. So "our" portion of the communication is incredibly small, leaving the rest for your objects.

It also supports sending a versioned list of server addresses for clustering support. The client sends the version number on every request. If the list has changed, the server sends back a new list & version with the regular response. In general we try to keep state and similar things boiled down to a byte or a long and only transmit "full" data when necessary.

If ObjectInputStream/ObjectOutputStream implementations weren't so expensive to maintain, I'd take another crack at writing a better one. Basically, an OOS or OIS will cache both class and instance data. The instance data is cached so that if you see a reference to the same object again, you just write its id instead of writing the entire object. Because there's instance data cached in the OOS and OIS instances, you have to throw them away and create new ones on each request. That unfortunately throws everything away, including the class descriptor data. So if you make 1000 requests using an object graph consisting of 30 objects, you're writing effectively constant data 29970 times more than you need to (30 class descriptors times 999 redundant requests).

The optimization would be to simply split OOS into two objects (two caches).
One to hold the class descriptor cache -- this object you keep and reuse on every request. And one to hold the instance cache -- this one you create on every request. Then communication would naturally compress. After the first few requests, you'd be done writing class descriptor data for the most part and only be writing instance data.

Anyway, I get way too into this stuff :) If you were looking for something fun to hack on, this would be one of those cool areas. Grabbing the OOS and OIS code from Harmony would be a great way to get started. Then it's just a matter of refactoring the code into a thread-safe outer class to hold the class descriptor cache, plus a factory method to create an ObjectOutputStream -- which is really just a non-static inner class that can reuse the class cache and has its own cache for instances.

> It's not modular however, but the design is beautifully simplistic; I'd hate
> to see it get trashed with pluggable handlers :(

Thanks very much :) I like to think "tight and simple" describes OpenEJB overall, but definitely the protocol is one of my favorite parts of the code.

-David
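P.S. For anyone who wants to play with the idea: a first cut of the two-cache split can actually be prototyped against the stock streams, no Harmony fork needed, via the protected writeClassDescriptor/readClassDescriptor hooks. A rough sketch -- all names here are invented, and this is not what OpenEJB ships:

```java
import java.io.*;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class TwoCacheDemo {

    /** Shared, thread-safe class-descriptor cache: kept for the life of the connection. */
    static class SharedDescriptorCache {
        private final ConcurrentHashMap<String, Integer> idsByName = new ConcurrentHashMap<>();
        private final CopyOnWriteArrayList<ObjectStreamClass> byId = new CopyOnWriteArrayList<>();

        private synchronized void register(ObjectStreamClass desc) {
            idsByName.computeIfAbsent(desc.getName(), name -> {
                byId.add(desc);
                return byId.size() - 1;
            });
        }

        /** Factory: per-request output stream, so the instance cache is fresh each time. */
        ObjectOutputStream newOutput(OutputStream out) throws IOException {
            return new ObjectOutputStream(out) {
                @Override
                protected void writeClassDescriptor(ObjectStreamClass desc) throws IOException {
                    Integer id = idsByName.get(desc.getName());
                    if (id != null) {
                        writeBoolean(true);   // seen before: a small id replaces the descriptor
                        writeInt(id);
                    } else {
                        writeBoolean(false);  // first sight: write the full descriptor
                        super.writeClassDescriptor(desc);
                        register(desc);
                    }
                }
            };
        }

        /** Factory: per-request input stream, mirroring the writer's cache order. */
        ObjectInputStream newInput(InputStream in) throws IOException {
            return new ObjectInputStream(in) {
                @Override
                protected ObjectStreamClass readClassDescriptor()
                        throws IOException, ClassNotFoundException {
                    if (readBoolean()) {
                        return byId.get(readInt());
                    }
                    ObjectStreamClass desc = super.readClassDescriptor();
                    register(desc);
                    return desc;
                }
            };
        }
    }

    static class Payload implements Serializable {
        final int n;
        Payload(int n) { this.n = n; }
    }

    public static void main(String[] args) throws Exception {
        SharedDescriptorCache cache = new SharedDescriptorCache();
        for (int request = 1; request <= 3; request++) {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            ObjectOutputStream out = cache.newOutput(wire);   // fresh instance cache
            out.writeObject(new Payload(request));
            out.flush();
            ObjectInputStream in = cache.newInput(new ByteArrayInputStream(wire.toByteArray()));
            Payload p = (Payload) in.readObject();
            // After the first request the descriptor is off the wire, so
            // later requests are noticeably smaller than the first.
            System.out.println("request " + request + ": " + wire.size() + " bytes (n=" + p.n + ")");
        }
    }
}
```

In a real client/server setup each side would keep its own cache; since both register descriptors in first-sight stream order, the ids stay in agreement without any extra coordination.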