Excuse the lag, I just got back from dinner with the family and put my
daughter to sleep.

Ok, let's take things line by line inline:

> >When the stages of bootstrapping are analyzed we realize that the
> >Repository itself must be used to bootstrap the kernel.  The repository
> >must give us the locally cached artifacts used to assemble the Kernel's
> >ClassLoaders.  In many respects it makes sense to ask the repository to
> >build a ClassLoader hierarchy on our behalf given any artifact.
> >
> 
> I like the idea of moving the ClassLoader construction as a service
> provided
> by the repository. It's nice!

Thanks!

> >This is where the power to query the repository for POM information
> >becomes critical.  When the repository is asked for a jar artifact it
> >should return the ClassLoader for that artifact and not a simple yes no
> >answer.  The ClassLoader regardless of the nesting scheme used should
> >provide for all runtime jar dependencies associated with original jar
> >artifact requested.
> >
> 
> I think that using a POM as the source of information for a
> ClassLoader definition is feasible - but I have reservations
> concerning directly accessing a POM.  Basically a POM contains
> build time and test time dependencies which are not necessarily
> the same as runtime dependencies.  Secondly, runtime
> dependencies can involve policy decisions that are not
> expressed in the POM model as it stands today.
> 

Right, the POM as I conceive of it will be much more than what it is today.
It will encompass the entire life-cycle of the SDP - putting something into
production and maintaining it there in a runtime state is as much a part of
a project as writing the code.  Deployment is another major aspect.  I could
go on for a while here and get OT but you get the gist.

Also, when I say query the POM, I'm thinking of the POM as a data model
within a directory.  It is the perfect place - I can't tell you how well the
two fit together.  Perfect!  The hierarchy and relationship modeling
capabilities make it a match made in heaven.  So when you query the POM you
execute an LDAP query.  In our case we're sprinkling properties files all
over the repository directories.  These properties would eventually become
attributes that can be queried in a directory.  For now, if we design this
properly, we can swap out the code that snarfs down these property files
with code that executes a directory query.
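
To make that swap painless we only need one seam.  Something like the
following would do it (just a sketch - the interface name and method are
made up, not part of any existing API):

    import java.util.Properties;

    /**
     * Source of POM attributes for an artifact.  The first implementation
     * snarfs down the properties files laid out in the repository; a later
     * one can execute a directory (LDAP) query instead.
     */
    public interface PomAttributeSource
    {
        Properties getAttributes( String group, String artifact, String version )
            throws Exception;
    }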

> However - we can use a POM to generate classloader construction
> criteria. I've been playing around with jelly to automate the
> generation of classloader criteria using dependency properties.
> 
> For example - the following dependency declaration includes a
> property named "avalon.classloader" which contains the name
> of the classloader category into which the dependency should
> be loaded.
> 
>     <dependency>
>       <groupId>avalon-util</groupId>
>       <artifactId>avalon-util-defaults</artifactId>
>       <version>1.0-dev</version>
>       <properties>
>         <avalon.classloader>impl</avalon.classloader>
>       </properties>
>     </dependency>
> 

The neat thing is we can define an LDAP schema for POM objects and start
extending them for Merlin specific attributes like the class loader category
tag of 'impl' you're using here.  That way you can interoperate with maven
and still add customizations without breaking away.  We can add more
attributes to a dependency ObjectClass definition.

Yes we can model all this as XML or get serious about it.  LDAP is serious
as a distributed directory.  Keep the jars and content on disk but keep the
POM in a directory.  This way referrals can be used between remote
repositories to interlink projects and dependencies like the web does with
XML.  It makes software interoperability a distributed, internet-based
activity.
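
Just to make it concrete, querying a directory-enabled POM could look
something like this (purely illustrative - the provider URL, the entry names
and the avalon-classloader attribute are assumptions and would need a real
schema behind them):

    import java.util.Hashtable;
    import javax.naming.NamingEnumeration;
    import javax.naming.directory.DirContext;
    import javax.naming.directory.InitialDirContext;
    import javax.naming.directory.SearchControls;
    import javax.naming.directory.SearchResult;

    public class PomQuery
    {
        public static void main( String[] args ) throws Exception
        {
            Hashtable env = new Hashtable();
            env.put( "java.naming.factory.initial",
                     "com.sun.jndi.ldap.LdapCtxFactory" );
            env.put( "java.naming.provider.url",
                     "ldap://repository.example.org/dc=avalon,dc=apache,dc=org" );
            DirContext ctx = new InitialDirContext( env );

            SearchControls controls = new SearchControls();
            controls.setSearchScope( SearchControls.SUBTREE_SCOPE );

            // find the dependencies of an artifact flagged for the impl loader
            NamingEnumeration results = ctx.search(
                "artifactId=avalon-util-defaults,groupId=avalon-util",
                "(avalon-classloader=impl)", controls );

            while ( results.hasMore() )
            {
                SearchResult entry = (SearchResult) results.next();
                System.out.println( entry.getName() );
            }
            ctx.close();
        }
    }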

> This approach assumes specific property values such as "api",
> "spi", "impl". Using this information its possible to generate
> the classloader criteria in the form of a flat properties file
> which keeps things small in terms of footprint.  This process
> could be packaged into a plugin for convenience. However,
> there is a disadvantage that we would be duplicating
> information that exists within the POM.
> 

You generate artifacts based on the POM all the time and upload them into
the repo, right?  Why not, for now, generate a set of properties within
property files?  Yeah the info is duplicated, but as soon as it is generated
it should not change - changes should require a version number increment to
the release.
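
For example the generated artifact could be as simple as something like this
(the keys and values are invented just to show the shape):

    # classloader criteria generated from the POM - illustrative only
    avalon.artifact.group = avalon-util
    avalon.artifact.name = avalon-util-defaults
    avalon.artifact.version = 1.0-dev
    avalon.classloader.api = avalon-framework-api:4.1.5
    avalon.classloader.spi = avalon-util-extension:1.0-dev
    avalon.classloader.impl = avalon-util-defaults:1.0-dev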

I like the idea of a maven plugin to handle this.  We can eventually
directory-enable the plugin so a deploy uploads the build byproducts to the
web server of the repository and adds a release entry to the POM log.  The
artifact is then registered with build and runtime attributes.

I totally see where you are going with the attribute flagging a jar artifact
as belonging to an api, spi, or impl group and using this to assemble the
respective ClassLoaders.
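
Roughly, once the jars for each category are sitting in the local cache, the
chain could be assembled like this (a sketch assuming the repository hands
back a URL list per category; the class and method names are made up):

    import java.net.URL;
    import java.net.URLClassLoader;

    public class LoaderChainSketch
    {
        /** api is the parent of spi, spi is the parent of impl */
        public static ClassLoader buildChain(
            URL[] apiJars, URL[] spiJars, URL[] implJars )
        {
            ClassLoader api  = new URLClassLoader( apiJars,
                ClassLoader.getSystemClassLoader() );
            ClassLoader spi  = new URLClassLoader( spiJars, api );
            ClassLoader impl = new URLClassLoader( implJars, spi );
            return impl;
        }
    }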


> 
> 
> >What implications does this have on the Repository?  The repository when
> >given an artifact must determine the chain of dependencies by querying a
> >POM.  The repository then uses this information to download the required
> >jars into the local cache in the appropriate structure.  The cached jars
> >are used to construct the ClassLoader with the appropriate parents for
> >the requested jar artifact.
> >
> >
> >The bottom line: the remote repository needs to become more intelligent.
> >For the time being we can mimic this intelligence by laying out some
> >descriptor artifacts within the repository and building in the logic
> >within the client API which effectively mimics a queriable (is that a
> >word?) repository implementation.  If we create a good repository SPI
> >then the fat repository implementation can be traded in later for a
> >thinner one that can talk to the repository in a better language.  Yeah I
> >think that's LDAP so what! ;-)
> >
> >
> 
> :-)
> 
> 
> >Let's watch our expression semantics and be absolutely clear.
> >Bootstrapping in the general sense for kernels and any other repository
> >dependent applications occurs through the repository.  The repository is
> >the general bootstrapping API.
> >
> >
> 
> +1
> 
> >Now bootstrapping the repository is a completely different endeavor
> >altogether, however it can benefit from the functionality used to make
> >the repository serve as the general bootstrapping API.  The difference
> >between the repository bootstrapping and the general bootstrapping
> >mechanism is the need for some seed information to get the process going.
> >This information can easily be embedded into the repository API or the
> >API jar.
> >
> 
> Yep.
> 
> >For the generic bootstrapping framework to be used by Merlin or any other
> >repository aware application we need to devise some conventions around
> >its use.  To think about the conventions we need let's start looking into
> >the use cases for Merlin.
> >
> >If Merlin is a repository aware application it resides within the
> >repository and so do its dependent artifacts.  The repository stores the
> >project information for Merlin as well as its dependencies.  The
> >dependency tree can be determined and a nested ClassLoader structure can
> >be assembled using this information to safely build and run the Merlin
> >kernel.  With regard to the way ClassLoaders work it is best to provide a
> >top level factory interface for your application and its embedding API.
> >This way anything using the top level factory automatically creates
> >objects that inherit the factory's ClassLoader.  The factory method
> >design pattern can elegantly be used to cross ClassLoaders separating the
> >API from the implementation classes.  Using this pattern a Factory
> >interface (repository aware application's embedding API part) would be
> >used to cross into the implementation ClassLoader to make calls against
> >the factory implementation (the repository aware application's embedding
> >implementation part).  The implementation factory then creates concrete
> >implementation products within the implementation ClassLoader so the
> >concrete product classes are isolated in the ClassLoader of the
> >implementation factory.
> >
> 
> OK so far.
> 
> >The repository aware application's embedding API must define a special
> >initial factory implementation that acts like a delegate to a factory
> >implementation.  This initial factory should expose a constructor that
> >takes arguments used to determine the underlying implementation and
> >perhaps even some (implementation specific) parameters to pass on to the
> >underlying implementation.
> >
> 
> Not following the above paragraph too well.

Some application like Merlin that depends on a repository will need to have
an InitialXYZFactory.  Like a JNDI InitialContext, this factory uses
parameters to determine which factory implementation to instantiate.  The
InitialXYZFactory will have a constructor defined.  The constructor may take
arguments that determine the factory to create using the classloader from
the repository.  Let me know if I still don't make sense - I know my babble
is not the clearest writing in the world.
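
The JNDI parallel, just to anchor the analogy - the environment properties
pick the implementation and the InitialContext delegates everything to it
(this is standard JNDI, not our API):

    import java.util.Hashtable;
    import javax.naming.Context;
    import javax.naming.InitialContext;

    public class JndiAnalogy
    {
        public static void main( String[] args ) throws Exception
        {
            Hashtable env = new Hashtable();
            env.put( Context.INITIAL_CONTEXT_FACTORY,
                     "com.sun.jndi.ldap.LdapCtxFactory" );
            env.put( Context.PROVIDER_URL, "ldap://localhost:389" );

            // the InitialContext is just a wrapper around the factory
            // implementation selected by the environment
            Context ctx = new InitialContext( env );
            ctx.close();
        }
    }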


> 
> 
> >
> >So for Merlin an InitialKernelFactory implements the KernelFactory
> >interface.  Users create an InitialKernelFactory (whose constructor
> >determines the factory implementation to use based on arguments).  The
> >InitialKernelFactory requests an implementation's ClassLoader from the
> >repository.
> >
> 
> Just to confirm:
> 
> 1. we use the repository to establish an api and spi loader
> 2. we locate (via some parameter) the name of the initial factory interface
> 3. we locate (via some parameter) the name of the initial factory
> implementation
> 4. we perhaps do some manipulation of parameters at this point
> 5. we invoke a creation request on the initial factory implementation
> 6. the initial factory implementation uses the repository to construct
> the impl classloader and load the operational factory together with the
> implementation specific parameters
> 7. creation method returns initial object
> 
> Does that sound right?

You're off just a little; let me restate it, but first here are the players
for an application that revolves around a Foo.  Put it this way: Foo is the
API protagonist with a bunch of ancillary helper classes and interfaces.
There are the following users and ClassLoader environments (a rough Java
sketch of the API-side types follows these lists):

Two kinds of people (perspectives) in all this:
====================================================================
* Foo API user
* Foo API developer

In API User's ClassLoader:
====================================================================
FooFactory -> factory interface
InitialFooFactory -> concrete class with exposed public constructor
Foo -> product interface
FooConfig -> configuration (interface is best but can be a class)

In IMPL ClassLoader (Returned by Repository):
====================================================================
DefaultFooFactory -> concrete factory class
DefaultFoo -> concrete product
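
Here's roughly what the API side could look like in code (method names and
signatures are illustrative, not a settled contract):

    /** the product interface */
    public interface Foo
    {
        // whatever the Foo application actually does
    }

    /** configuration criteria the API user can tune before creation */
    public interface FooConfig
    {
    }

    /** the factory interface implemented by both the InitialFooFactory
        and the concrete implementation factories */
    public interface FooFactory
    {
        FooConfig createDefaultConfig() throws Exception;

        Foo createFoo( FooConfig config ) throws Exception;
    }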


The Foo application API user is the most important so let's think about him
first.  So what does a Foo API user need to do to embed the Foo application?
(A usage sketch follows the steps below.)

1. Instantiate an InitialFooFactory providing whatever information is
required to allow this class to request a ClassLoader from the repository.

2. Call FooFactory interface methods on the InitialFooFactory instance,
usually in a dual-stage approach:

        a). Request a configuration object from the factory.  The factory
creates default FooConfig beans as well as Foo objects.
        b). Alter the FooConfig to meet the API user's needs and call the
create interface method on the InitialFooFactory instance to get a handle on
a Foo object.

3. Use the Foo object to your heart's content.
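
In code the user's side of the story might look something like this (the
property keys are invented for illustration):

    import java.util.Properties;

    public class FooUser
    {
        public static void main( String[] args ) throws Exception
        {
            // 1. seed the InitialFooFactory with enough to reach the repository
            Properties params = new Properties();
            params.setProperty( "foo.impl.artifact", "foo-impl:1.0" );
            params.setProperty( "foo.factory.classname", "DefaultFooFactory" );
            FooFactory factory = new InitialFooFactory( params );

            // 2a. get a default configuration, 2b. tune it, then create
            FooConfig config = factory.createDefaultConfig();
            Foo foo = factory.createFoo( config );

            // 3. use the Foo object to your heart's content
        }
    }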


Now we need to enable the Foo application API developer so that he can
enable the Foo API user.  To do so we provide him with the Repository API.
Below I talk about the things the Foo API developer must implement along
with an idea of the control flow.

1. The Foo API developer must build a concrete InitialFooFactory that
implements the FooFactory interface.  The InitialFooFactory is still an
API-based class and so it is not part of any implementation although it
implements FooFactory.

2. The Foo API developer must build a concrete FooFactory implementation,
DefaultFooFactory, which is part of the Foo application's implementation.
It creates DefaultFoo object instances that implement Foo.

3. So yes, from the above, the developer must also create a concrete
DefaultFoo class implementing the Foo interface.  More than one
implementation can exist, even within the same implementation archive,
although that is not a requirement.  So FooImpl_1 and FooImpl_2 can be
implemented for creation by FooFactoryImpl_1 and FooFactoryImpl_2
respectively and put into the same impl jar or sold separately.  It's all up
to the discretion of the Foo developers.

4. The Foo API developer uses the Repository API to first get a handle on a
Repository, using the same factory method pattern btw.  Practice what you
preach, right?  So the Foo API developer gets a handle on a Repository
instance and calls a method to get a ClassLoader back.  Here's the
signature; remember this is all going down inside the InitialFooFactory
constructor:

ClassLoader getArtifact( ArtifactDescriptor a_descriptor );

So the Repository interface must have this method.  We can name it
getArtifactLoader to keep the existing boolean-returning getArtifact method
on Repository as is.
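
So the interface would grow to something like this (the javadoc and the
exact signatures are my guess at it):

    public interface Repository
    {
        /** existing yes/no style check for an artifact */
        boolean getArtifact( ArtifactDescriptor descriptor );

        /** build and return a ClassLoader holding the artifact and all of
            its runtime dependencies, with api/spi parents in the chain */
        ClassLoader getArtifactLoader( ArtifactDescriptor descriptor );
    }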

5. So inside a constructor of the InitialFooFactory we use the Repository
API to get a Repository instance.  We then ask it for the ClassLoader.
Using the ClassLoader we instantiate the factory implementation chosen.  For
example the arguments to the InitialFooFactory may have selected the
instantiation of DefaultFooFactory rather than the FooFactoryImpl_1 or
FooFactoryImpl_2 factories.  So the ClassLoader is used to load and
instantiate the DefaultFooFactory.  Once the DefaultFooFactory is created
the InitialFooFactory is ready to delegate calls on its FooFactory methods
to the DefaultFooFactory instance the constructor code created.  From this
point on the InitialFooFactory is just a wrapper around the actual factory
implementation object.
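
Spelled out, the constructor flow is something like this (RepositoryFactory,
the property keys and the ArtifactDescriptor constructor are assumptions -
only the overall shape matters):

    import java.util.Properties;

    public class InitialFooFactory implements FooFactory
    {
        private final FooFactory m_delegate;

        public InitialFooFactory( Properties params ) throws Exception
        {
            // get a Repository via the Repository API (same factory pattern)
            Repository repository = RepositoryFactory.getRepository( params );

            // ask the repository for the implementation ClassLoader
            ArtifactDescriptor descriptor = new ArtifactDescriptor(
                params.getProperty( "foo.impl.artifact" ) );
            ClassLoader loader = repository.getArtifactLoader( descriptor );

            // load and instantiate the chosen factory, e.g. DefaultFooFactory
            String classname = params.getProperty( "foo.factory.classname" );
            m_delegate = (FooFactory) loader.loadClass( classname ).newInstance();
        }

        // from here on the initial factory is only a wrapper

        public FooConfig createDefaultConfig() throws Exception
        {
            return m_delegate.createDefaultConfig();
        }

        public Foo createFoo( FooConfig config ) throws Exception
        {
            return m_delegate.createFoo( config );
        }
    }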

6. The API developer must make sure that the DefaultFoo class is loaded on
creation by the DefaultFooFactory using the ClassLoader.  The DefaultFoo
Class instance is then used to newInstance the DefaultFoo objects if it has
a default constructor; otherwise a Constructor is used.
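
For completeness, the implementation side might look like this (names are
illustrative, and DefaultFooConfig is assumed to be a simple config bean
living in the impl jar):

    public class DefaultFooFactory implements FooFactory
    {
        public FooConfig createDefaultConfig() throws Exception
        {
            return new DefaultFooConfig();
        }

        public Foo createFoo( FooConfig config ) throws Exception
        {
            // load DefaultFoo through the impl ClassLoader so the concrete
            // product class stays isolated in that loader
            ClassLoader loader = getClass().getClassLoader();
            Class clazz = loader.loadClass( "DefaultFoo" );
            return (Foo) clazz.newInstance();
        }
    }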

> 
> And the reason that the factory is creating the impl classloader is
> because the contents of the impl classloader may be a function of the
> parameters that we provide to the initial factory.

The repository can determine the details of the ClassLoader to create based
on POM information.  All it needs to know is the ArtifactDescriptor of the
implementation jar you intend to use.  Using this unique key it can look up
the dependencies required, download them and build the impl ClassLoader.

> 
> >This request consists of determining the implementation to use which then
> >results in the creation of an instance of ArtifactDescriptor which is
> >part of the Repository API.  The call to the repository to get the
> >ClassLoader takes this descriptor as an argument.  The repository does
> >its magic: it queries for implementation dependencies, pulls down
> >dependency and implementation artifacts and builds the implementation
> >ClassLoader (with SPI and API parent ClassLoaders in a chain) based on
> >cached jar files.  The InitialKernelFactory then uses this ClassLoader to
> >instantiate the implementation's concrete Factory which it should know
> >how to do using reflection.  In Merlin's case this might be the
> >MerlinKernelFactory.  The InitialKernelFactory then delegates calls made
> >on the KernelFactory interface methods to the implementation factory
> >delegate (MerlinKernelFactory instance) that was instantiated within the
> >context of the Repository assembled ClassLoader.  Now users of the kernel
> >embedding API use the InitialKernelFactory as the pass through to tunnel
> >into the implementation ClassLoader and make calls against the
> >MerlinKernelFactory.  All factory products returned like Kernel objects
> >for example are based on the implementation chosen.  In the case of
> >Merlin it would be the MerlinKernel.
> >
> 
> I think I've got it.
> 
> >If we back off for a moment and look at the big picture we may have a
> >generalized Avalon Kernel embedding API that can be used for all Kernel
> >implementations: Merlin, Phoenix et cetera.  This all comes down to
> >agreeing upon a common Kernel interface and KernelFactory interface and
> >putting these interfaces into the framework API.  If not we certainly
> >have a generalized repository aware application bootstrapping API and
> >that's very valuable in itself.
> >
> 
> I think it's too early for a framework level definition of a
> kernel - but I do think the repository stuff is heading in the
> right direction to be a general container-side facility alongside
> framework and meta.

NP but Avalon should get there at some point.  I will leave that up to the
future.  For now I need to move and get this embedding done.  I'm getting
anxious.  The idea of firing up Merlin on the first InitialContext request
to bring up Eve is haunting me ;-).

> 
> >
> >This is a lot of babble for one email.  I will try to break this down
> >into understandable short and easy chunks.  I know I have not done a good
> >job in describing it.  I will also try to have some diagrams showing the
> >use cases and object interactions sequence charts.  Once the
> >documentation is complete the implementation will become very apparent.
> >I already began an implementation but stopped myself to document it and
> >get a consensus.
> >
> >Yeah if you're thinking this effort needs to be kept in a separate
> >development branch, then you are right.  I can also develop it within the
> >sandbox.
> >
> 
> Lets put it together in sandbox (because Steve is a CVS wimp).

Will do.  Then let's suspend development on the trunk for now, and move a
copy of just the repository group to the sandbox.  Do you want me to do this
or are you there by now?

> 
> >Steve I know I've been taking a while here but I think it's worth doing
> >this right the first time.  I don't want to come back to it and have to
> >redesign it.
> >
> 
> No problems here!
> 
> The factory handling side is a little fuzzy but I'm sure that will come
> together.  I'm more concerned about the question concerning dependency
> criteria resolution.  Do we use the POM directory or do we generate an
> artefact?  For the moment I'm leaning towards the generation of the
> artefact from the POM.

We have no choice until the POM lives inside a database and we can access
the database and query it through some line protocol like maybe hmmm LDAP!

Alex


