Re: OSGi - deserialization remote invocation strategy

Peter Tue, 07 Feb 2017 12:15:11 -0800

Clarification inline.

---- Original message ----
From: Peter <j...@zeus.net.au>
Sent: 08/02/2017 05:35:57 am
To: dev@river.apache.org <dev@river.apache.org>
Subject: Re: OSGi - deserialization remote invocation strategy


Hi Nic,

I'm currently only considering OSGi server -> OSGi client.  Mick's 
investigating all four options.

I'm not expecting the client calling bundle to resolve everything, hence the 
stack: it gives us the full visibility of the bundle of the class that was last 
resolved, so we can resolve its fields from its bundle.  E.g. it might import 
packages the client does not.

The "exact version" thing (only applies to the proxy bundle as we expect the 
framework to load its deps) can be relaxed to compatible versions to increase 
class sharing if you think it helps.  The proxy bundle doesn't export anything 
at the client, only the server, it just seems to make sense to keep the latest 

<clarification>
That is, the same proxy codebase version as the server is using, assuming it's 
using the latest.  The proxy bundle codebase might not have been published to a 
repository; it may be an httpmd URL, for example.
</clarification>

proxy communicating in case that last bug fix release addresses a security 
issue.  All proxy classes are implementation only classes.

Because the proxy bundle manifest declares version import ranges, I'm expecting 
the framework to favour already loaded bundles to satisfy package import deps.
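
To make the version-range point concrete, here is a minimal Java sketch of how an import range such as [1.2,2.0) decides whether an already loaded package version can satisfy a proxy bundle's import. The class VersionRangeSketch and its methods are hypothetical and ignore version qualifiers; inside a real framework the resolver does this matching itself.

```java
/** Hypothetical sketch: does a version fall inside an OSGi-style range? */
final class VersionRangeSketch {

    /** Parses "major[.minor[.micro]]"; missing parts default to 0. */
    static int[] parse(String version) {
        String[] parts = version.trim().split("\\.");
        int[] v = new int[3];
        for (int i = 0; i < v.length && i < parts.length; i++) {
            v[i] = Integer.parseInt(parts[i]);
        }
        return v;
    }

    /** Compares major, then minor, then micro. */
    static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    /** True if version lies in a range such as "[1.2,2.0)" or "(1.0,1.5]". */
    static boolean includes(String range, String version) {
        String r = range.trim();
        boolean floorInclusive = r.charAt(0) == '[';
        boolean ceilingInclusive = r.charAt(r.length() - 1) == ']';
        String[] bounds = r.substring(1, r.length() - 1).split(",");
        int[] v = parse(version);
        int vsFloor = compare(v, parse(bounds[0]));
        int vsCeiling = compare(v, parse(bounds[1]));
        return (floorInclusive ? vsFloor >= 0 : vsFloor > 0)
            && (ceilingInclusive ? vsCeiling <= 0 : vsCeiling < 0);
    }
}
```

With a range of [1.2,2.0), an already loaded 1.3.1 satisfies the import, while 2.0.0 does not and would force a second copy of the package to be wired.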

If the client is matching service api with the correct import package version 
ranges (requirements defined by Entries), the proxy bundle should find that the 
service api and other imported packages are already loaded.  E.g. the client may 
pass the requirements to the resource service, or whatever the new bundle 
repository standard service is called now, to preload them.  The 
client may also perform upgrades before downloading a service.

In the majority of cases I don't think there's going to be much state in the 
smart proxy that can't be loaded via the smart proxy bundle and its package 
imports, except for the odd handback, which the client bundle should have the 
opportunity to resolve before resorting to using an annotation.

I'm not quite ready to agree it's too complex and unsolvable; I think we 
should at least explore it and understand it before we junk the idea of 
supporting OSGi.

Rather than utilise the Java2 class loading I was planning to cast 
ClassLoaders to BundleReference where appropriate and utilise the Bundle.
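
A minimal sketch of that cast, in Java. To keep it compilable without the OSGi jar, SketchBundle and SketchBundleReference are hypothetical stand-ins that mirror only the loadClass and getBundle methods of org.osgi.framework.Bundle and org.osgi.framework.BundleReference; LoaderToBundle is likewise an invented name.

```java
/** Stand-in for org.osgi.framework.Bundle (only the method used here). */
interface SketchBundle {
    Class<?> loadClass(String name) throws ClassNotFoundException;
}

/** Stand-in for org.osgi.framework.BundleReference. */
interface SketchBundleReference {
    SketchBundle getBundle();
}

final class LoaderToBundle {

    /**
     * Returns the bundle behind a class loader, or null when the loader is
     * not a bundle loader (system loader, plain URLClassLoader, ...).
     */
    static SketchBundle bundleOf(ClassLoader loader) {
        if (loader instanceof SketchBundleReference) {
            return ((SketchBundleReference) loader).getBundle();
        }
        return null; // nothing to consult through the framework
    }
}
```

A null result is the signal that the loader should not take part in bundle-based resolution.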

I did notice your interpretation of what I've written is different from 
mine, so I think I need to put some effort into communicating more effectively. 
I think your interpretation of codebase annotation "version is fixed" 
ignores that the annotation is only consulted after determining that the 
current class is not available in the Bundles currently participating in 
deserialization.  It doesn't apply to resolved imported packages, as 
annotations aren't used for them at all.

For example, the first class we attempt to resolve during unmarshalling belongs 
to a smart proxy, and the client Bundle can't find the class.  We ask the 
framework to load the proxy bundle from the codebase annotation; it does so and 
resolves all necessary package imports declared in its manifest.  We now continue 
deserializing the smart proxy class fields with the visibility of the smart 
proxy's bundle.  The smart proxy may contain fields referencing objects 
resolved from its imports, we ensure those classes deserialize their fields 
with the visibility of their own bundles.
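
The push and pop discipline described in this example can be sketched in Java as below. Everything here is a simplification for illustration: BundleStackWalk and Node are hypothetical names, bundles are represented by plain strings, and each node already knows which bundle defined its class, whereas in practice that is the outcome of class resolution.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of the bundle stack kept while reading an object graph. */
final class BundleStackWalk {

    /** An object in the stream, the bundle that defined its class, and its fields. */
    static final class Node {
        final String name;
        final String bundle;
        final List<Node> fields;

        Node(String name, String bundle, Node... fields) {
            this.name = name;
            this.bundle = bundle;
            this.fields = Arrays.asList(fields);
        }
    }

    private final Deque<String> stack = new ArrayDeque<>();

    /** Records "object <- bundle whose visibility was used to resolve it". */
    final List<String> trace = new ArrayList<>();

    BundleStackWalk(String clientBundle) {
        stack.push(clientBundle); // the caller's bundle starts the stack
    }

    void read(Node node) {
        trace.add(node.name + " <- " + stack.peek());
        stack.push(node.bundle); // fields see this object's bundle
        try {
            for (Node field : node.fields) {
                read(field);
            }
        } finally {
            stack.pop(); // back to the previous bundle context
        }
    }

    int depth() {
        return stack.size();
    }
}
```

Reading a proxy whose fields are an endpoint and a listener resolves the proxy with the client's bundle on top, and both fields with the proxy's bundle on top; once the walk finishes only the client bundle remains on the stack.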

Every time we can't resolve a class we first check if it's a handback or 
parameter from a preceding object in the graph; thus we walk our graph's bundle 
stack.

If we still haven't resolved a class, only then do we load a bundle from its 
codebase annotation URL and check it can be cast to the field before assigning 
it.  If it can't be cast to that field, we throw an exception.
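
The lookup order in the last three paragraphs can be sketched in Java as follows. This is an illustration under stated assumptions, not JERI code: StackResolver, ClassSource and the installer function are hypothetical, with ClassSource standing in for a Bundle and the installer for "ask the framework to install the bundle at this codebase URL".

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: current bundle, then the stack, then the annotation. */
final class StackResolver {

    /** Stand-in for a bundle: something that can load classes it can see. */
    interface ClassSource {
        Class<?> load(String name) throws ClassNotFoundException;
    }

    /** Convenience for demos: a source backed by a fixed name-to-class map. */
    static ClassSource of(Map<String, Class<?>> visible) {
        return name -> {
            Class<?> c = visible.get(name);
            if (c == null) {
                throw new ClassNotFoundException(name);
            }
            return c;
        };
    }

    private final Deque<ClassSource> stack = new ArrayDeque<>();
    private final Function<String, ClassSource> installer; // codebase URL -> bundle

    StackResolver(ClassSource client, Function<String, ClassSource> installer) {
        this.installer = installer;
        stack.push(client);
    }

    void push(ClassSource bundle) { stack.push(bundle); }

    void pop() { stack.pop(); }

    Class<?> resolve(String name, String codebase, Class<?> fieldType)
            throws ClassNotFoundException {
        // Iterates from the top of the stack (current bundle) down to the client.
        for (ClassSource bundle : stack) {
            try {
                return bundle.load(name);
            } catch (ClassNotFoundException e) {
                // not visible from this bundle; try the one beneath it
            }
        }
        if (codebase == null) {
            throw new ClassNotFoundException(name); // no annotation to fall back on
        }
        Class<?> loaded = installer.apply(codebase).load(name);
        if (fieldType != null && !fieldType.isAssignableFrom(loaded)) {
            throw new ClassCastException(
                name + " is not assignable to " + fieldType.getName());
        }
        return loaded;
    }
}
```

The annotation is consulted only once every bundle already taking part in deserialization has failed, which is why a handback implemented in the client is found by walking back down the stack rather than by installing anything new.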

In the case of a non smart proxy, there is no codebase; classes will be loaded 
by the client bundle and deserialization will rely completely on its visibility.

I think OSGi will be a lot less dependent on annotations than, say, a standard 
environment.

Still, I guess wiring may be an option, so as Michal suggests, annotate 
objects with their wiring graphs.

What would we be considering if we hadn't been pre exposed to codebase 
annotations?

Standard deserialization uses one classpath, each bundle has its own unique 
classpath.
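
For reference, the single hook standard deserialization offers for choosing where classes come from is ObjectInputStream.resolveClass. The Java sketch below redirects it to one supplied loader; LoaderAwareObjectInputStream is a hypothetical name, and a bundle-aware stream would need to vary the loader per object rather than fix it for the whole stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;

/** Hypothetical sketch: resolve every class in the stream via one given loader. */
final class LoaderAwareObjectInputStream extends ObjectInputStream {

    private final ClassLoader loader;

    LoaderAwareObjectInputStream(InputStream in, ClassLoader loader)
            throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            // false: do not initialize the class while merely resolving it
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // falls back to the default lookup, which also handles primitives
            return super.resolveClass(desc);
        }
    }
}
```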

Cheers,

Peter.

---- Original message ----
From: Niclas Hedhman <nic...@hedhman.org>
Sent: 08/02/2017 12:32:35 am
To: dev@river.apache.org
Subject: Re: OSGi - deserialization remote invocation strategy

TL;DR 
1. It sounds awfully complex, because my gut says that it is not a solvable 
problem, especially since I see 4 distinct cases: 
Server(osgi)+Client(osgi), Server(osgi)+Client(plain), 
Server(plain)+Client(osgi) and Server(plain)+Client(plain), where the last 
one is what we currently have. I am not sure how many of those you are 
discussing. 

2. For Server(osgi)+Client(plain), I don't think there is a generic 
solution, due to "uses" in OSGi terms. The same object graph can contain 
multiple versions of the same classes in non-hierarchical order, which to 
me seems to be incompatible with the Java2 classloading mechanism. 
Replicating this, would effectively need (as Michal suggested) the OSGi 
framework to be booted up on the client (not impossible per se). 

3. For Server(plain)+Client(osgi), I think that the 'easy' solution is to 
collapse all dependencies into a single bundle and load that bundle on the 
OSGi framework. The exception being the API classes, i.e. those that must 
have been present on the client to be able to use the service. Exactly how 
to figure that out could be complicated and I have no good answers. 

4. For Server(osgi)+Client(osgi), the number of options goes up. In this space, 
Paremus has a lot of experience, and is perhaps willing to share a bit, 
without compromising the secret sauce? Either way, Michal's talk about 
"wiring" becomes important and that wiring should possibly be 
re-established on the client side. The insistence on "must be exactly the 
same version" is to me a reflection of "we haven't cared about version 
management before", and I think it may not be in the best interest to load 
many nearly identical bundles just because they are a little off, say stuff 
like guava, commons-xyz, slf4j and many more common dependencies. 

Peter wrote; 
> This is why the bundle must be given first attempt to resolve an object's 
> class and rely on the bundle dependency resolution process. 
> OSGi must be allowed to wire up dependencies, we must avoid attempting to 
> make decisions about compatibility and use the current bundle wires 
> instead (our stack). 

Well, not totally sure about that. The 'root object classloader' doesn't 
have visibility to serialized objects, and will fail if left to do it all 
by itself. And as soon as you delegate to another BundleClassLoader, you 
have made the resolution decision, not the framework. Michal's proposal to 
transfer the BundleWiring (available in runtime) from the server to the 
client, makes it somewhat possible to do the delegation. And to make 
matters worse, it is quite common that packages are exported from more than 
one bundle, so the question is what is included in the BundleWiring coming 
across the wire. 


HTH 

On Tue, Feb 7, 2017 at 8:14 PM, Peter <j...@zeus.net.au> wrote: 

> Proposed JERI OSGi class loading strategy during deserialization. 
> 
> Record caller context - this is the default bundle at the beginning of the 
> stack.  It is obtained by the InvocationHandler on the 
> client side.  The InvocationDispatcher on the server side has the calling 
> context of the Remote 
> implementation.  The reflection dynamic proxy must be installed in the 
> client's class loader, so the 
> InvocationHandler knows exactly what it is, it will be passed to the 
> MarshalInputStream.  Any 
> interfaces not found in the client's bundle can be safely shed.  For a 
> smart proxy the reflection proxy will 
> be installed in the smart proxy loader.  The smart proxy is obtained 
> either via a reflection proxy or a MarshalledInstance. 
> MarshalledInstance also passes in the caller's loader to the 
> MarshalInputStream. 
> 
> The smart proxy classloader is not a child loader of the client's loader, 
> instead it's a bundle that imports 
> service api packages, with a version range that overlaps those already 
> imported by the client. 
> 
> Both InvocationHandler and InvocationDispatcher utilise MarshalInputStream 
> and MarshalOutputStream, for marshalling parameters and return values. 
> 
> The codebase annotation bundle's manifest contains a list of package 
> imports. 
> 
> Do we need to make a list of package imports for every new bundle that we 
> load? 
> Do we need to record the wiring and packages and their imports from the 
> remote end? 
> 
> I don't think so, the bundles themselves contain this information, I think 
> we just need to keep the view of available classes relevant to the current 
> object being deserialized. 
> 
> Codebase Annotations are exact versions!  They need to be to allow the 
> service to ensure the correct proxy codebase is used.  Other proxy 
> codebases will be installed in the client, possibly different versions, but 
> these won't be visible through the resolved dependencies, because the proxy 
> codebases only import packages at the client and OSGi restricts visibility 
> to the current bundle's own classes and any imported packages. 
> Instead of appending dependencies to the codebase annotation they'll need 
> to be defined in the proxy's bundle manifest.  Of course if an identical 
> version of a proxy codebase bundle is already installed at the client, this 
> will be used again. 
> 
> Because a bundle generally imports packages (importing entire bundles is 
> discouraged in OSGi), there may be classes that aren't visible from those 
> bundles, such as transitive imports, but also private packages that aren't 
> exported.  Private implementations need to be deserialized, but is it 
> possible to do so safely, without causing package conflicts?  Private 
> implementation classes can be used as fields within an exported public 
> object, but cannot and should not escape their private scope; doing so 
> risks them being resolved to a bundle with the version of the remote end, 
> instead of the locally resolved / wired package, causing 
> ClassCastExceptions. 
> 
> Initial (naive) first pass strategy of class resolution (for each branch 
> in the serialized object graph)?: 
> 1.    Try current bundle on the stack (which will be the caller's bundle if 
> we haven't loaded any new bundles yet). 
> 2.    Then use the package name of a class to determine if the package is 
> loaded by any of the bundles 
> referenced by the caller's bundle imports (to handle any private 
> implementation packages 
> that aren't in the current imports).  Is this a good idea? Or should we go 
> straight to step 3 
> and let the framework resolve common classes, what if we use a different 
> version to the 
> client's imported bundle?  Should we first compare our bundle annotation 
> to the currently 
> imported bundles and select one of those if it's a compatible version? 
> Yes, this could be an 
> application bundle, otherwise goto 3. 
> 3.    Load bundle from annotation (if already loaded, it will be an exact 
> version match).  Place the 
> new bundle on top of the bundle stack, remove this bundle from the stack 
> once all fields of 
> this object have been deserialized, returning to the previous bundle 
> context.  We are relying 
> on the current bundle to wire itself up to the same package versions of 
> the client's bundle 
> imports, for shared classes.  Classes that use different bundles will not 
> be visible to the client, 
> but will need to be visible to the current object's bundle. 
> 4.    Place a bundle reference on the stack when a new object is 
> deserialized from the stream and 
> remove it once all fields have been deserialized. (we might need to 
> remember stack depth). 
> 5.    Don't place non bundle references on the stack.  For example system 
> class loader or any 
> other class loader, we want resolution to occur via the OSGi resolution 
> process. 
> 
> What about a simpler strategy (again naive), where we don't attempt to 
> resolve private implementation classes? 
> 1.    The calling class' bundle is given priority. 
> 2.    Load bundle from annotation (exact version), when not found in 
> calling class. 
> 3.    No stack, what if an application bundle from server is loaded that 
> conflicts with an existing 
> bundle resolved by the client? 
> 4.    What about walking back through the stack?  Probably unnecessary, as 
> the containing object 
> will reference the class by a common interface, the outer object may not 
> need to reference 
> it at all.  But what if the outer object passed it in during construction? 
> 
> Revised strategy: 
> 1.    Attempt to load from current bundle on stack (the stack begins with 
> the client's Bundle; each 
> node in the graph has its bundle added to the stack, and removed again 
> after that node is completely deserialized). 
> 2.    If unsuccessful, walk back through deserialized bundle reference 
> stack and attempt to load class. 
> Why not start at the beginning of the stack?  We are expecting bundles to 
> wire up to 
> currently loaded versions, but bundles can import different package 
> versions for 
> implementation, safest to start with current bundle and consult parent if 
> not found in the current bundle 
> dependency graph, ie possibly passed in during object construction or a 
> handback 
> implemented in the client, from an earlier invocation or dependency 
> injected. 
> 3.    The client is responsible for determining compatibility with the 
> service api it's interested in 
> from the Import Package Entries, prior to unmarshalling a service proxy. 
> 4.    If a bundle previously on the stack resolves a class, then this 
> object's bundle reference is placed 
> on the top of the stack, it is removed once the current object and all 
> its fields have been completely deserialized. 
> 5.    Load bundle from annotation (exact version). 
> 6.    No attempt will be made to directly load from wired bundles, always 
> rely on wires, 
> otherwise we may utilise an incompatible package / bundle. 
> 
> Do we need a graph of the wiring from the remote end? 
> During serialization (from the remote end) do we need to determine if a 
> bundle has dependants and send some sort of version range information? 
> When a class descriptor is read in from a stream, it contains information 
> about the class's fields; the descriptor of its serializable supertype (if 
> one exists) is also read in from the stream.  Before any field objects are 
> read in, the declared field types 
> are visible from the bundle of the current object being deserialized.  The 
> objects that will be 
> assigned to those field types must also resolve to those types.  Hence 
> bundles being resolved as part 
> of deserialization must favour already resolved packages for imports. 
> What if a bundle requires a specific package version?  This is why the 
> bundle must be given first 
> attempt to resolve an object's class and rely on the bundle dependency 
> resolution process. 
> OSGi must be allowed to wire up dependencies, we must avoid attempting to 
> make decisions about 
> compatibility and use the current bundle wires instead (our stack). 
> 
> The BundleReference stack is designed to follow the wires (dependency 
> links between bundles), 
> to allow private classes to be resolved, as they're not visible from other 
> bundles. 
> 
> We can't rely on annotations to resolve private classes, because we can't 
> predict the way bundle 
> dependencies are resolved in remote JVMs. 
> 
> General recommendations for OSGi: 
> *    The service should use as wide a version range as possible for 
> service api. 
> *    It is better to create new service api in a new bundle than to evolve 
> in a backward compatible manner, as 
> an incremental change may not be compatible if additional classes and 
> methods are missing 
> from the client, that the service proxy depends on. 
> *    Don't split packages. 
> *    Private implementation classes are ok, provided they remain within 
> public exported classes and don't escape, otherwise 
> they may not link up properly upon deserialization. 
> *    The proxy should minimise the package imports it uses. 
> *    There must be only one compatible service api version installed 
> already in the client. 
> *    Duplicates of incompatible versions of service api are ok. 
> 
> The catch is, it may not be possible to build the bundle stack without 
> some programming hooks in ObjectInputStream. 
> 
> Unfortunately we don't have any control over OIS, the necessary hooks 
> could however be added to AtomicMarshalInputStream. 
> 
> Cheers, 
> 
> Peter. 
> 



--  
Niclas Hedhman, Software Developer 
http://polygene.apache.org <http://zest.apache.org> - New Energy for Java
