Clarification inline.

Sent from my Samsung device.

---- Original message ----
From: Peter <j...@zeus.net.au>
Sent: 08/02/2017 05:35:57 am
To: dev@river.apache.org <dev@river.apache.org>
Subject: Re: OSGi - deserialization remote invocation strategy

Hi Nic,

I'm currently only considering OSGi server -> OSGi client. Mick's investigating all four options.

I'm not expecting the client's calling bundle to resolve everything, hence the stack: it gives us full visibility of the bundle of the class that was last resolved, so we can resolve its fields from its bundle. E.g. it might import packages the client does not.

The "exact version" thing (which only applies to the proxy bundle, as we expect the framework to load its deps) can be relaxed to compatible versions to increase class sharing if you think it helps. The proxy bundle doesn't export anything at the client, only the server; it just seems to make sense to keep the latest proxy communicating in case that last bug fix release addresses a security issue.

<clarification> That is, the same proxy codebase version as the server is using, assuming it's using the latest. The proxy bundle codebase might not have been published to a repository; it may be an httpmd URL, for example. </clarification>

All proxy classes are implementation-only classes. Because the proxy bundle manifest declares version import ranges, I'm expecting the framework to favour already loaded bundles to satisfy package import deps. If the client is matching service api with the correct import package version ranges (requirements defined by Entries), the proxy bundle should find that the service api and other imported packages are already loaded. E.g. the client may use the requirements with the resource service, or whatever the new bundle repository standard service is called now, to preload them. The client may also perform upgrades before downloading a service.

In the majority of cases I don't think there's going to be much state in the smart proxy that can't be loaded via the smart proxy bundle and its package imports, except for the odd handback, which the client bundle should have the opportunity to resolve before resorting to using an annotation.

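To make the point about version import ranges concrete, here is a minimal plain-Java sketch of checking already loaded package versions against an import range such as "[1.2,2.0)". It is only an illustration: `ImportRangeSketch` and its methods are hypothetical names, qualifiers are ignored, and preferring the highest matching version is an assumption of the sketch, since the actual choice belongs to the framework's resolver.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical helper; a plain-Java stand-in for an OSGi import version range. */
final class ImportRangeSketch {

    /** Parses "major.minor.micro"; missing parts default to 0, qualifiers are ignored. */
    static int[] parse(String version) {
        String[] parts = version.trim().split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < out.length && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    /** Compares two parsed versions component by component. */
    static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            int c = Integer.compare(a[i], b[i]);
            if (c != 0) return c;
        }
        return 0;
    }

    /** True if version lies in range, e.g. "[1.2,2.0)"; a bare "1.2" means 1.2 or later. */
    static boolean includes(String range, String version) {
        String r = range.trim();
        int[] v = parse(version);
        char open = r.charAt(0);
        if (open != '[' && open != '(') {
            return compare(v, parse(r)) >= 0;
        }
        char close = r.charAt(r.length() - 1);
        String[] bounds = r.substring(1, r.length() - 1).split(",");
        int lo = compare(v, parse(bounds[0]));
        int hi = compare(v, parse(bounds[1]));
        return (open == '[' ? lo >= 0 : lo > 0) && (close == ']' ? hi <= 0 : hi < 0);
    }

    /** Picks the highest already loaded version that satisfies the import range, if any. */
    static Optional<String> pickLoaded(String range, List<String> loadedVersions) {
        return loadedVersions.stream()
                .filter(v -> includes(range, v))
                .max((a, b) -> compare(parse(a), parse(b)));
    }
}
```

With a range of "[1.0,2.0)" and loaded versions 0.9, 1.1, 1.4.2 and 2.0, `pickLoaded` selects 1.4.2; an empty result would be the case where the client has to install something new.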
I'm not quite ready to agree it's too complex and it's unsolvable; I think we should at least explore it and understand it before we junk the idea of supporting OSGi.

Rather than utilise the Java2 class loading, I was planning to cast ClassLoaders to BundleReference where appropriate and utilise the Bundle.

I did notice your interpretation of what I've written is different from mine, so I think I need to put some effort into communicating more effectively.

I think your interpretation of codebase annotation "version is fixed" ignores that the annotation is only consulted after determining that the current class is not available in the Bundles currently participating in deserialization. It doesn't apply to resolved imported packages, as annotations aren't used for them at all.

For example, the first class we attempt to resolve during unmarshalling belongs to a smart proxy, and the client Bundle can't find the class. We ask the framework to load the proxy bundle from the codebase annotation; it does so and resolves all necessary package imports declared in its manifest. We now continue deserializing the smart proxy class fields with the visibility of the smart proxy's bundle. The smart proxy may contain fields referencing objects resolved from its imports; we ensure those classes deserialize their fields with the visibility of their own bundles.

Every time we can't resolve a class we first check if it's a handback or parameter from a preceding object in the graph, thus we walk our graph's bundle stack. If we still haven't resolved a class, only then do we load a bundle from its codebase annotation URL and check it can be cast to the field before assigning it. If it can't be cast to that field, we throw an exception.

In the case of a non-smart proxy, there is no codebase; classes will be loaded by the client bundle and deserialization will rely completely on its visibility.

I think OSGi will be a lot less dependent on annotations than, say, a standard environment.

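The resolution order described above (current bundle first, then walking back down the bundle stack, and only then the codebase annotation with a cast check) could be sketched roughly as follows. This is a plain-Java model under stated assumptions, not River or OSGi API: `BundleStackResolver`, `ClassSource` and the `codebaseLoader` function are hypothetical stand-ins for the real Bundle and framework calls, and the sketch pushes whichever source answered the lookup, whereas a real implementation would presumably push the bundle that actually defines the class.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical sketch of the bundle stack; not River or OSGi API. */
final class BundleStackResolver {

    /** Stand-in for an OSGi Bundle: loads a class by name or throws. */
    interface ClassSource {
        Class<?> load(String className) throws ClassNotFoundException;
    }

    private final Deque<ClassSource> stack = new ArrayDeque<>();
    private final Function<String, Optional<ClassSource>> codebaseLoader;

    BundleStackResolver(ClassSource clientBundle,
                        Function<String, Optional<ClassSource>> codebaseLoader) {
        this.codebaseLoader = codebaseLoader;
        stack.push(clientBundle); // the caller's bundle starts the stack
    }

    /** Resolves the class of an object about to be read and pushes its source. */
    Class<?> enter(String className, String annotation, Class<?> fieldType)
            throws ClassNotFoundException {
        ClassSource owner = null;
        Class<?> found = null;
        // Current bundle first, then walk back towards the client's bundle.
        for (ClassSource source : stack) {
            found = tryLoad(source, className);
            if (found != null) {
                owner = source;
                break;
            }
        }
        // Only now consult the codebase annotation, and check the cast.
        if (found == null && annotation != null) {
            owner = codebaseLoader.apply(annotation).orElse(null);
            found = owner == null ? null : tryLoad(owner, className);
            if (found != null && !fieldType.isAssignableFrom(found)) {
                throw new ClassCastException(className + " is not a " + fieldType.getName());
            }
        }
        if (found == null) {
            throw new ClassNotFoundException(className);
        }
        stack.push(owner);
        return found;
    }

    /** Call once the object and all its fields have been deserialized. */
    void exit() {
        if (stack.size() > 1) {
            stack.pop(); // never remove the client's bundle
        }
    }

    int depth() {
        return stack.size();
    }

    private static Class<?> tryLoad(ClassSource source, String className) {
        try {
            return source.load(className);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }
}
```

Each `enter` is paired with an `exit` once that object's fields are done, so the top of the stack always reflects the bundle of the object currently being deserialized.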
Still, I guess wiring may be an option, so, as Michael suggests, annotate objects with their wiring graphs.

What would we be considering if we hadn't been pre-exposed to codebase annotations?

Standard deserialization uses one classpath; each bundle has its own unique classpath.

Cheers,

Peter.

---- Original message ----
From: Niclas Hedhman <nic...@hedhman.org>
Sent: 08/02/2017 12:32:35 am
To: dev@river.apache.org
Subject: Re: OSGi - deserialization remote invocation strategy

TL;DR

1. It sounds awfully complex, because my gut says that it is not a solvable problem, especially since I don't see 4 distinct cases; Server(osgi)+Client(osgi), Server(osgi)+Client(plain), Server(plain)+Client(osgi) and Server(plain)+Client(plain), where the last one is what we currently have. I am not sure how many of those you are discussing.

2. For Server(osgi)+Client(plain), I don't think there is a generic solution, due to "uses" in OSGi terms. The same object graph can contain multiple versions of the same classes in non-hierarchical order, which to me seems to be incompatible with the Java2 classloading mechanism. Replicating this would effectively need (as Michal suggested) the OSGi framework to be booted up on the client (not impossible per se).

3. For Server(plain)+Client(osgi), I think that the 'easy' solution is to collapse all dependencies into a single bundle and load that bundle on the OSGi framework. The exception is the API classes, i.e. those that must have been present on the client to be able to use the service. Exactly how to figure that out could be complicated and I have no good answers.

4. For Server(osgi)+Client(osgi), the number of options goes up. In this space, Paremus has a lot of experience, and is perhaps willing to share a bit, without compromising the secret sauce? Either way, Michal's talk about "wiring" becomes important and that wiring should possibly be re-established on the client side.

The insistence on "must be exactly the same version" is to me a reflection of "we haven't cared about version management before", and I think it may not be in our best interest to load many nearly identical bundles just because they are a little off, say stuff like guava, commons-xyz, slf4j and many more common dependencies.

Peter wrote;
> This is why the bundle must be given first attempt to resolve an object's class and rely on the bundle dependency resolution process.
> OSGi must be allowed to wire up dependencies, we must avoid attempting to make decisions about compatibility and use the current bundle wires instead (our stack).

Well, not totally sure about that. The 'root object classloader' doesn't have visibility to serialized objects, and will fail if left to do it all by itself. And as soon as you delegate to another BundleClassLoader, you have made the resolution decision, not the framework. Michal's proposal to transfer the BundleWiring (available at runtime) from the server to the client makes it somewhat possible to do the delegation. And to make matters worse, it is quite common that packages are exported from more than one bundle, so the question is what is included in the BundleWiring coming across the wire.

HTH

On Tue, Feb 7, 2017 at 8:14 PM, Peter <j...@zeus.net.au> wrote:
> Proposed JERI OSGi class loading strategy during deserialization.
>
> Record caller context - this is the default bundle at the beginning of the stack. It is obtained by the InvocationHandler on the client side. The InvocationDispatcher on the server side has the calling context of the Remote implementation. The reflection dynamic proxy must be installed in the client's class loader, so the InvocationHandler knows exactly what it is; it will be passed to the MarshalInputStream. Any interfaces not found in the client's bundle can be safely shed. For a smart proxy the reflection proxy will be installed in the smart proxy loader.
> The smart proxy is obtained either via a reflection proxy or a MarshalledInstance. MarshalledInstance also passes in the caller's loader to the MarshalInputStream.
>
> The smart proxy classloader is not a child loader of the client's loader; instead it's a bundle that imports service api packages, with a version range that overlaps those already imported by the client.
>
> Both InvocationHandler and InvocationDispatcher utilise MarshalInputStream and MarshalOutputStream for marshalling parameters and return values.
>
> The codebase annotation bundle's manifest contains a list of package imports.
>
> Do we need to make a list of package imports for every new bundle that we load?
> Do we need to record the wiring and packages and their imports from the remote end?
>
> I don't think so, the bundles themselves contain this information, I think we just need to keep the view of available classes relevant to the current object being deserialized.
>
> Codebase Annotations are exact versions! They need to be, to allow the service to ensure the correct proxy codebase is used. Other proxy codebases will be installed in the client, possibly different versions, but these won't be visible through the resolved dependencies, because the proxy codebases only import packages at the client and OSGi restricts visibility to the current bundle's own classes and any imported packages.
> Instead of appending dependencies to the codebase annotation they'll need to be defined in the proxy's bundle manifest. Of course if an identical version of a proxy codebase bundle is already installed at the client, this will be used again.
>
> Because a bundle generally imports packages (importing entire bundles is discouraged in OSGi), there may be classes that aren't visible from those bundles, such as transitive imports, and also private packages that aren't exported. Private implementations need to be deserialized, but is it possible to do so safely, without causing package conflicts? Private implementation classes can be used as fields within an exported public object, but cannot and should not escape their private scope; doing so risks them being resolved to a bundle with the version of the remote end, instead of the locally resolved / wired package, causing ClassCastExceptions.
>
> Initial (naive) first pass strategy of class resolution (for each branch in the serialized object graph)?:
> 1. Try current bundle on the stack (which will be the caller's bundle if we haven't loaded any new bundles yet).
> 2. Then use the package name of a class to determine if the package is loaded by any of the bundles referenced by the caller's bundle imports (to handle any private implementation packages that aren't in the current imports). Is this a good idea? Or should we go straight to step 3 and let the framework resolve common classes? What if we use a different version to the client's imported bundle? Should we first compare our bundle annotation to the currently imported bundles and select one of those if it's a compatible version? Yes, this could be an application bundle, otherwise goto 3.
> 3. Load bundle from annotation (if already loaded, it will be an exact version match). Place the new bundle on top of the bundle stack, remove this bundle from the stack once all fields of this object have been deserialized, returning to the previous bundle context. We are relying on the current bundle to wire itself up to the same package versions of the client's bundle imports, for shared classes.
>    Classes that use different bundles will not be visible to the client, but will need to be visible to the current object's bundle.
> 4. Place a bundle reference on the stack when a new object is deserialized from the stream and remove it once all fields have been deserialized (we might need to remember stack depth).
> 5. Don't place non-bundle references on the stack. For example the system class loader or any other class loader; we want resolution to occur via the OSGi resolution process.
>
> What about a simpler strategy (again naive), where we don't attempt to resolve private implementation classes?
> 1. The calling class' bundle is given priority.
> 2. Load bundle from annotation (exact version), when not found in calling class.
> 3. No stack; what if an application bundle from the server is loaded that conflicts with an existing bundle resolved by the client?
> 4. What about walking back through the stack? Probably unnecessary, as the containing object will reference the class by a common interface; the outer object may not need to reference it at all. But what if the outer object passed it in during construction?
>
> Revised strategy:
> 1. Attempt to load from current bundle on stack (the stack begins with the client's Bundle; each node in the graph has its bundle added to the stack and is also removed after that node is completely deserialized).
> 2. If unsuccessful, walk back through the deserialized bundle reference stack and attempt to load the class. Why not start at the beginning of the stack? We are expecting bundles to wire up to currently loaded versions, but bundles can import different package versions for implementation, so it is safest to start with the current bundle and consult the parent if not found in the current bundle dependency graph, i.e. possibly passed in during object construction, or a handback implemented in the client, from an earlier invocation, or dependency injected.
> 3. The client is responsible for determining compatibility with the service api it's interested in from the Import Package Entries, prior to unmarshalling a service proxy.
> 4. If a bundle previously on the stack resolves a class, then this object's bundle reference is placed on the top of the stack; it is removed once the current object and all its fields have been completely deserialized.
> 5. Load bundle from annotation (exact version).
> 6. No attempt will be made to directly load from wired bundles, always rely on wires, otherwise we may utilise an incompatible package / bundle.
>
> Do we need a graph of the wiring from the remote end?
> During serialization (from the remote end) do we need to determine if a bundle has dependants and send some sort of version range information?
> When a class descriptor is read in from a stream, it contains information about the class's fields; the descriptor of its serializable supertype (if it exists) is also read in from the stream. Before any field objects are read in, the declared field types are visible from the bundle of the current object being deserialized. The objects that will be assigned to those field types must also resolve to those types. Hence bundles being resolved as part of deserialization must favour already resolved packages for imports.
> What if a bundle requires a specific package version? This is why the bundle must be given first attempt to resolve an object's class and rely on the bundle dependency resolution process.
> OSGi must be allowed to wire up dependencies, we must avoid attempting to make decisions about compatibility and use the current bundle wires instead (our stack).
>
> The BundleReference stack is designed to follow the wires (dependency links between bundles), to allow private classes to be resolved, as they're not visible from other bundles.
>
> We can't rely on annotations to resolve private classes, because we can't predict the way bundle dependencies are resolved in remote JVMs.
>
> General recommendations for OSGi:
> * The service should use as wide a version range as possible for service api.
> * It is better to create new service api in a new bundle than to evolve in a backward compatible manner, as an incremental change may not be compatible if additional classes and methods that the service proxy depends on are missing from the client.
> * Don't split packages.
> * Private implementation classes are ok, provided they remain within public exported classes and don't escape, otherwise they may not link up properly upon deserialization.
> * The proxy should minimise the package imports it uses.
> * There must be only one compatible service api version installed already in the client.
> * Duplicates of incompatible versions of service api are ok.
>
> The catch is, it may not be possible to build the bundle stack without some programming hooks in ObjectInputStream.
>
> Unfortunately we don't have any control over OIS; the necessary hooks could however be added to AtomicMarshalInputStream.
>
> Cheers,
>
> Peter.
>

--
Niclas Hedhman, Software Developer
http://polygene.apache.org <http://zest.apache.org> - New Energy for Java