Re: Maven repository Entry was Re: Codebase service?

Dennis Reedy Tue, 25 May 2010 05:09:28 -0700

Hi Peter,

Thanks for the detailed reply; comments are interspersed below. Also, from a 
housekeeping point of view (if not already done), it would be great if Jira 
issues could be created for the items below.


Dennis

On May 25, 2010, at 2:05 AM, Peter Firmstone wrote:

> Hi Dennis,
> 
> Reasoning, and hopefully the whys, are below.

> 
> Dennis Reedy wrote:
>> Hi Peter,
>> 
>> I was hoping to take a step back for a second, perhaps it's just me that 
>> seems to have my head spinning of late on this list. I may have missed some 
>> things, but we've discussed many issues over the past week:
>> 
>> - How to advertise the DL jar(s) a service vends, allowing a client to 
>> download requisite jars that allow the jars to be loaded from a local 
>> (trusted) location
>>  
> Yes, we can use an Entry, or as Chris pointed out, if we annotate 
> MarshalledInstances using a new Maven URL schema we can extract that info 
> and make it available via MarshalledServiceItem (an abstract class that 
> extends ServiceItem).

I don't think a new Maven URL schema has actually been proposed? Why wouldn't 
we just use a String attribute in an Entry that is of the form 
groupId:artifactId:version:classifier?
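
To make the String-attribute idea concrete, here is a minimal sketch of parsing 
such a value. ArtifactCoordinates is a hypothetical name, not an existing River 
or Maven class, and the field order simply follows the form given above. In a 
real service the string would presumably sit in a public field of an Entry 
implementation; that part is left out so the sketch has no Jini dependencies.

```java
import java.util.Objects;

/** Hypothetical holder for a groupId:artifactId:version[:classifier] string. */
final class ArtifactCoordinates {
    final String groupId;
    final String artifactId;
    final String version;
    final String classifier; // null when the optional fourth part is absent

    private ArtifactCoordinates(String groupId, String artifactId,
                                String version, String classifier) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.version = version;
        this.classifier = classifier;
    }

    /** Splits on ':' and rejects missing, empty, or surplus parts. */
    static ArtifactCoordinates parse(String text) {
        Objects.requireNonNull(text, "text");
        // Limit -1 keeps trailing empty parts so "a:b:c:" is rejected below.
        String[] parts = text.trim().split(":", -1);
        if (parts.length < 3 || parts.length > 4) {
            throw new IllegalArgumentException("Expected 3 or 4 parts: " + text);
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty part in: " + text);
            }
        }
        return new ArtifactCoordinates(parts[0], parts[1], parts[2],
                parts.length == 4 ? parts[3] : null);
    }

    @Override
    public String toString() {
        String base = groupId + ":" + artifactId + ":" + version;
        return classifier == null ? base : base + ":" + classifier;
    }
}
```

A client could call ArtifactCoordinates.parse on the attribute value and hand 
the result to whatever provisioning code it uses.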

> 
>> - Given the capability above, the need for a codebase service may not be 
>> required
>>  
> Agreed
>> - Conventions on how to develop River services, as it relates to jar naming, 
>> packaging and what dependencies are between the various artifacts
>> - How to possibly move forward with utilizing Maven repositories and the 
>> implied capabilities of published artifacts
>> - The development of a maven archetype to allow a developer to easily create 
>> a working project in seconds
>>  
> Yes to all above.
>> Your attention to detail and the documentation of how class loader 
>> interactions with regards to security is great. I'd like to understand the 
>> requirements of what you have documented below, the urge to refactor 
>> MarshalledInstance, and why the new class loader hierarchy needs to be added 
>> to River.
>>  
> 
> The urge to refactor MarshalledInstance is to allow the URL annotation to be 
> requested directly and passed via StreamServiceRegistrar and combined with 
> delayed unmarshalling of proxies via MarshalledServiceItem, to allow the 
> client to provision and provide an alternate CodeSource if need be.

So this is related to the first bullet above, allowing a client to download 
requisite jars. I don't see why MarshalledInstance needs refactoring if we 
already have the jar(s)/artifact that can be provisioned. In the case of an 
artifact, it may not matter what the MarshalledInstance provides, because the 
artifact's location will most likely be in a repository.

> 
> StreamServiceRegistrar returns a ResultStream<ServiceItem>, so you have to 
> check with instanceof MarshalledServiceItem.
> 
> The new packaging Scheme

packaging scheme?

> can be applied to distributed objects also, provided we create an 
> implementation of CodebaseAccessClassLoader (contributed by Gregg to replace 
> RMIClassLoaderSpi) that performs or requests local Maven archive provisioning.

As I have pointed out earlier, you'll need more information on where to get 
the artifacts from, specifically the Maven repositories to access. I don't know 
if RMIClassLoaderSpi is the right place to put this added functionality or not 
at this time.
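
As a rough illustration of why the repository list matters: the coordinates 
only give a path relative to a repository root, so the client still needs the 
roots to try. The sketch below assumes the standard Maven 2 repository layout 
and a jar artifact; RepositoryLayout and its method names are hypothetical, and 
SNAPSHOT (timestamped) file names are not handled.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that maps coordinates onto repository URLs. */
final class RepositoryLayout {

    /** Builds group/path/artifactId/version/artifactId-version[-classifier].jar. */
    static String relativePath(String groupId, String artifactId,
                               String version, String classifier) {
        String suffix = (classifier == null || classifier.isEmpty())
                ? "" : "-" + classifier;
        String file = artifactId + "-" + version + suffix + ".jar";
        return groupId.replace('.', '/') + "/" + artifactId + "/"
                + version + "/" + file;
    }

    /** Returns one candidate location per configured repository, in order. */
    static List<URI> candidates(List<String> repositories, String groupId,
                                String artifactId, String version,
                                String classifier) {
        String path = relativePath(groupId, artifactId, version, classifier);
        List<URI> result = new ArrayList<>();
        for (String repository : repositories) {
            // Normalise the root so it always ends with exactly one slash.
            String base = repository.endsWith("/") ? repository : repository + "/";
            result.add(URI.create(base + path));
        }
        return result;
    }
}
```

Without the repositories argument there is nothing to resolve the path 
against, which is the extra information a service would have to advertise or a 
client would have to be configured with.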

> 
> The new ClassLoader hierarchy is needed, to solve class identity (fully 
> qualified runtime classname = class + ClassLoader), class visibility, 
> isolation and versioning problems, that PreferredClassProvider partially 
> solves.
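
The class identity point (runtime class = class name + defining ClassLoader) 
can be seen in a small, self-contained sketch. ClassIdentityDemo, Payload and 
IsolatingLoader are hypothetical names made up for this illustration; nothing 
here is River code. Each IsolatingLoader defines its own copy of Payload from 
the same bytes, so the resulting Class objects share a name but are distinct 
types. It assumes the class file is reachable as a resource of the parent 
loader.

```java
import java.io.IOException;
import java.io.InputStream;

final class ClassIdentityDemo {

    /** Trivial class that each isolating loader defines for itself. */
    static final class Payload { }

    /** Defines Payload locally; delegates every other class to the parent. */
    static final class IsolatingLoader extends ClassLoader {
        IsolatingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(Payload.class.getName())) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        /** Reads the class file bytes through the parent loader's resources. */
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads Payload through a brand new isolating loader. */
    static Class<?> loadIsolated() {
        try {
            ClassLoader parent = ClassIdentityDemo.class.getClassLoader();
            return new IsolatingLoader(parent).loadClass(Payload.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Two calls to loadIsolated() return classes with equal names that are not the 
same Class object, which is why an instance created under one loader cannot be 
cast to the "same" class under another.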

>> Perhaps I'm just missing some fundamental issues, but maybe we need to take 
>> some time and determine the whys before the hows? Is this direction 
>> fundamental to the OSGi direction that you're taking? If so, how does this 
>> impact non-OSGi based systems?
>>  
> The changes are OSGi agnostic, OSGi will live in the application space, so 
> while they benefit OSGi, they are independent of it, so the same benefits 
> will apply to other software and OSGi isn't required.
> 
> I realised that fundamentally OSGi uses ClassLoaders for isolating software 
> into components, so implementation classes aren't exposed outside of their 
> module, something which OSGi does very well, it also manages security 
> concerns very well.  Something else I realised, OSGi's use of ClassLoaders is 
> not optimum for distributed systems, there are difficulties determining the 
> correct ClassLoader during deserialization. OSGi wasn't designed with 
> Serialization in mind.  Distributed computing introduces another dimension, 
> like going from 2D to 3D,  in OSGi, you only have one bundle version 
> combination loaded (you can have many bundles of different versions but I 
> believe typically only one of each unique bundle instance, you can have the 
> same package version exported by differently versioned bundles). So how do 
> you determine the correct ClassLoader during unmarshalling?  In River we may 
> have many proxies using the same jar version, however we don't want the 
> proxy's implementation to get all tied up in the local application bundles, 
> we'd be allowing the smart proxy to pollute the local application space, some 
> parts of the local application could see the proxy implementation.
> 
> In our new ClassLoader tree, a smart proxy can have its own personal 
> ClassLoader,

As it would today through the class loader the RMIClassLoaderSpi returns, right?

> because the ContextClassLoader will be that of the proxy during 
> deserialization of the returned object, since it initiated the communication 
> with the remote Service host.  The reason a client's parameter implementation 
> cannot have its own ClassLoader, and must share with other clients that use 
> the same codebase and version, is that they have no link to the ClassLoader 
> at the remote Service host.  With only the Codebase and Version to go by, 
> since they didn't initiate the communication, there could otherwise be many 
> ClassLoaders containing that codebase version, and there's not enough 
> information to find the right one.  The last thing I want to do is require 
> the client to have an identity or location to deal with that deserialization 
> of parameters at the Service node.

If the client has the service-api.jar in its classpath, why are there issues 
surrounding the client's parameter implementation's class loader?

> 
> Rather than take, "how you use OSGi" and apply it to River, I decided to 
> understand why they solved their problems the way they did and learn from it. 
>  It is a very good solution to the problem they've solved.  However with our 
> solution we can solve the deserialization issue for distributed applications 
> utilising OSGi.
> 
> Currently River uses Permission grants based on ClassLoader, (so does OSGi), 
> what I realised was I needed a finer grained Permission grant and having many 
> ProtectionDomains inside one ClassLoader is about as fine as you can get.  
> Only one ClassLoader is used for the API space for class identity reasons, to 
> allow maximum sharing of API classes because you just can't control and 
> coordinate someone else's JVM's ClassLoader visibility, without overcoming 
> some serious trust issues (Simpler is better I don't even want to attempt to 
> solve them!). There is however one compromise with my approach.
> 
> By loading all API classes into the same ClassLoader, we cannot have 
> duplicate classes, so we must always load the latest API version, that must 
> not break backward compatibility. If the backward compatibility constraints 
> are hampering your design, it's simply better to deprecate a package and 
> append a number to change the package name.  (Or create a completely new API 
> jar)

If I understand correctly, I think this is the crux of the issue. I don't 
understand why you need to load all API classes with the same class loader. 
FWIW, in Rio we handle the loading (and unloading) of services with the 
following structure 
(http://www.rio-project.org/apidocs/org/rioproject/boot/package-summary.html#package_description):

                  AppCL
                    |
            CommonClassLoader (http:// URLs of common JARs)
                    +
                    |
                    +
            +-------+-------+----...---+
            |               |          |
        Service-1CL   Service-2CL  Service-nCL
        

AppCL - Contains the main() class of the container. Main-Class in manifest 
points to com.sun.jini.start.ServiceStarter
Classpath:  boot.jar, start.jar, jsk-platform.jar
Codebase: none

CommonClassLoader - Contains the common Rio and Jini technology classes (and 
other declared common platform JARs) to be made available to its children.
Classpath: Common JARs such as rio.jar
Codebase: Context dependent. The codebase returned is the codebase of the 
specific child CL that is the current context of the request.

Service-nCL - Contains the service specific implementation classes.
Classpath: serviceImpl.jar
Codebase: "serviceX-dl.jar rio-dl.jar jsk-lib-dl.jar"

Certainly not as sophisticated as OSGi (or what you are targeting), but it 
meets the requirements of allowing multiple service versions, applying security 
context per class loader using the same approach as ActivateWrapper, and allows 
the JVM to stay running. 
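
For readers who want to see the shape of that hierarchy in code, here is a 
minimal sketch using plain URLClassLoader instances. It is not Rio's actual 
implementation; ContainerLoaders and newServiceLoader are hypothetical names, 
and codebase annotation and security policy are left out entirely.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical sketch of an AppCL -> CommonClassLoader -> Service-nCL tree. */
final class ContainerLoaders {
    /** Loader that loaded the container itself (the AppCL role). */
    final ClassLoader app;
    /** Shared parent of every service loader (the CommonClassLoader role). */
    final URLClassLoader common;

    ContainerLoaders(URL[] commonJars) {
        this.app = ContainerLoaders.class.getClassLoader();
        this.common = new URLClassLoader(commonJars, app);
    }

    /** Creates an isolated loader for one service (the Service-nCL role). */
    URLClassLoader newServiceLoader(URL[] serviceJars) {
        return new URLClassLoader(serviceJars, common);
    }
}
```

Each service loader sees the common classes through its parent but not the 
classes of its siblings, so two versions of a service implementation can be 
loaded side by side. Closing a service loader and dropping all references to 
it is what would allow that service's classes to be collected while the JVM 
keeps running.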

> 
> org.some.thing
> org.some.thing2
> 
> The reason we version packages is so we don't have to rename them when they 
> break backward compatibility, this makes sense for implementations, but not 
> API.  If you're going to have long lived persistent objects they belong in the 
> API space, if you don't need to persist your objects, why not have an 
> interface and throwaway class implementations, this solves Serialization 
> exposing class internal state and evolution.  Extend the interface if you 
> want new methods.
> 
> If a JVM has been running a long time, a new API version may have been 
> released, clients using the old API functionality only, won't be able to see 
> or utilise the new functionality until we restart the JVM.  That is the 
> compromise.  But I figure it's not too bad a compromise once APIs have 
> stabilised and go into longer development cycles.  I can handle having to 
> restart my JVM once every 6 months.
> 
> I think Michael Warres got to the crux of the problem with his publication on 
> ClassLoader issues; my interpretation of what he said is that perhaps Java 
> should tear apart the multiple ClassLoader concerns of Security, Isolation 
> and Identity and start again.  I've chosen what appears to me to be the best 
> compromise based on Java ClassLoaders today.
> 
> So this new ClassLoader hierarchy should play nice with Maven,

I would suggest it has nothing to do with Maven. Maven is just used as a 
repository for us, and optionally as a way to build services.


> OSGi and other stuff too, because now the API is visible to everything below 
> in the ClassLoader hierarchy, while the implementations below, don't expose 
> themselves, instead, everything cooperates through the API.
> 
> OSGi can be used to synchronize ClassLoader visibility between two separate 
> JVMs, however that still requires the implementer to deal with deserialization 
> issues, with our solution, we won't have to worry much about ClassLoader 
> issues.  With Maven, we won't have to worry about lost codebases either.
> 
> Yep, it has been a bit of a head spin, needed your help to work out the 
> details before I forgot them.
> 
> There is one more detail, I'd like to include in the jar archive: a list of 
> permissions the jar needs.  I'd like to use the same format OSGi uses, 
> because it's been done before, why be different.  This is to solve the "what 
> grants does it need?" problem, so we can minimise permission grants.

Yes, I'm interested in this as well.
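
As a starting point for discussion, here is a minimal sketch of reading such a 
list. It assumes the OSGi local permissions convention as I understand it: a 
resource such as OSGI-INF/permissions.perm with one entry per line in the form 
(type "name" "actions"), where name and actions are optional and lines starting 
with # or // are comments. PermissionList and Declared are hypothetical names, 
and escaped quotes inside names are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical reader for a per-jar list of required permissions. */
final class PermissionList {

    /** One declared permission; name and actions may be null. */
    static final class Declared {
        final String type;
        final String name;
        final String actions;

        Declared(String type, String name, String actions) {
            this.type = type;
            this.name = name;
            this.actions = actions;
        }
    }

    // Matches: (fully.qualified.Type "name" "actions") with optional strings.
    private static final Pattern LINE = Pattern.compile(
            "\\(\\s*([\\w.$]+)(?:\\s+\"([^\"]*)\")?(?:\\s+\"([^\"]*)\")?\\s*\\)");

    /** Parses the text, skipping blank lines and comment lines. */
    static List<Declared> parse(String text) {
        List<Declared> result = new ArrayList<>();
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#") || line.startsWith("//")) {
                continue;
            }
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                throw new IllegalArgumentException(
                        "Unrecognised permission line: " + line);
            }
            result.add(new Declared(m.group(1), m.group(2), m.group(3)));
        }
        return result;
    }
}
```

A deployer could compare the parsed list against local policy before deciding 
which grants, if any, to give the jar.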
