Re: Module isolation

Bryan Atsatt Mon, 18 Jun 2007 14:10:55 -0700

Stanley M. Ho wrote:

Hello Bryan,


Bryan Atsatt wrote:

...
But this approach doesn't really work very well. What happens when,
moments after releasing a module, another application is deployed that
needs the same module? It gets a different instance. And that could
easily lead to ClassCastExceptions and/or LinkageErrors.

How is it possible to know when it is safe to "free up resources"?


The use case I have is that sometimes we might want to make a module
temporarily unavailable (e.g. turning off a plugin in the IDE), without
shutting down the repository (possibly with hundreds/thousands of other
modules) or uninstalling the module. In this case, the container will
trigger not only the release of the existing module instance (so it will
have a chance to be GCed eventually), but it will also make the module
definition invisible (through visibility policy) from other modules.
Without the second part, it has the potential problem you described.


This use case seems to presume that the IDE can/will ensure that there
is only *one* consumer of the module: itself. What if a different plugin
has a dependency on it, and has already been resolved? Is the intention
that the IDE will be wired into the module system deeply enough to
manage this correctly?

Again, this is why we need either a real, general, isolation model, or
an access control model:

Containers, of any kind, must be able to explicitly control access to
"private" modules.


An application server, for example, will need to keep one application
from accessing the modules of another. And it must *know* that there is
no sharing, so that the application lifecycles can remain independent.

So we need some notion of a context. A purely private repository
instance is one (probably good) possibility. Another is the wrapper
Repository approach, but this requires definition copies (and management
of sharing content, lifecycle, etc).

I started this thread with another, explicit type (ModuleContext), but
it isn't obvious how to use such a beast correctly.

An access control model would also work, where all parties can share
Repository instances, but we somehow discriminate between *callers* to
return different results.
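
As a rough illustration of the access control idea above (a shared
repository that answers differently depending on who asks), here is a
minimal sketch. Everything in it is an assumption made for illustration:
CallerAwareRepo, install() and find() are hypothetical names, callers
are identified by plain strings, and nothing here is taken from the
draft API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical shared repository that filters lookups by caller. */
final class CallerAwareRepo {
    // Module name -> callers allowed to see it.
    // An empty set means the module is visible to every caller.
    private final Map<String, Set<String>> visibility = new HashMap<>();

    /** Registers a module; with no callers listed it is public. */
    synchronized void install(String module, String... allowedCallers) {
        visibility.put(module, Set.of(allowedCallers));
    }

    /** Returns the module name only if this caller may see it. */
    synchronized Optional<String> find(String caller, String module) {
        Set<String> allowed = visibility.get(module);
        if (allowed == null) {
            return Optional.empty(); // not installed at all
        }
        boolean visible = allowed.isEmpty() || allowed.contains(caller);
        return visible ? Optional.of(module) : Optional.empty();
    }
}
```

With this shape, a module installed for "app1" only is simply absent
from the results that "app2" receives, while both callers still share
one repository instance.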

We may even want to support some mixture of private + access control.

Regardless of what approach we take, the releaseModule() idea is too
simplistic. Having originally created the detach() method, thinking
along similar lines as you are with the plugin case, I do understand the
idea; I just no longer think it is sufficient :^).

The only "safe" time to release a module is when there are *zero*
importers, and, even then, you must hide/release atomically, ensuring
that no new imports spring up during the operation.
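
The condition above (zero importers, with hide and release performed as
one atomic step) can be sketched as follows. This is only an
illustration under stated assumptions: ReleaseGuard and its methods are
hypothetical names, modules are identified by plain strings, and a
single lock stands in for whatever synchronization a real
implementation would use.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical bookkeeping for importer counts and hidden modules. */
final class ReleaseGuard {
    private final Map<String, Integer> importerCounts = new HashMap<>();
    private final Set<String> hidden = new HashSet<>();

    /** Records a new importer; refused once the module is hidden. */
    synchronized boolean tryImport(String module) {
        if (hidden.contains(module)) {
            return false;
        }
        importerCounts.merge(module, 1, Integer::sum);
        return true;
    }

    /** Drops one importer; the entry disappears when the count hits zero. */
    synchronized void dropImport(String module) {
        importerCounts.computeIfPresent(module, (k, v) -> v > 1 ? v - 1 : null);
    }

    /**
     * Hides the module only if it has no importers. The check and the
     * hide happen under the same lock as tryImport(), so no new import
     * can appear between them.
     */
    synchronized boolean tryRelease(String module) {
        if (importerCounts.getOrDefault(module, 0) > 0) {
            return false;
        }
        hidden.add(module);
        return true;
    }
}
```

The sketch refuses the release instead of forcing it, which is the
conservative reading of "safe": an existing importer keeps the module
alive until it goes away.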


Note that I don't think this is a common thing many developers want to
do. In fact, I think we should discourage most developers from calling
releaseModule() because of the potential consequences. On the other
hand, we shouldn't preclude this use case either. If you have a better
suggestion to address this use case, I would like to hear it.

Well, I agree in theory. But... I am struggling to understand how we
provide an isolation model. If:

- There is a 1:1 relation for ModuleDefinition<->Module instance, and
- Isolation requires separate Module instances, then
- Isolation requires separate ModuleDefinition instances

If this is our isolation model, then how does a ModuleSystem instance
support this? Clearly, it would need to keep a mapping from each
ModuleDefinition to its Module instance. How is this simpler, or better?

In your current model (just as in my detach() model), there is still a
1:1 relation *at any given moment*, at least from the perspective of the
definition.

Are you thinking that the ModuleSystem would have to keep track of
released modules?


The ModuleSystem instance would have a <ModuleDefinition, Module>
mapping, and this should be a very simple thing to support and maintain.
Also, the ModuleSystem needs to maintain this information anyway to
avoid instantiating multiple Module instances for a given
ModuleDefinition, or instantiating a Module for a ModuleDefinition that
has been uninstalled or belongs to a repository which has been shut
down. And yes, the ModuleSystem would have another map to
keep track of all the ModuleDefinitions that are no longer
usable/instantiated. Having the information centralized in one place has
other benefits too, e.g. if we want to find out what the outstanding
module instances are for module definitions in all repositories, the
ModuleSystem can provide the answer easily.


Sure. (And the ModuleSystem could not use strong references to hold
released Module instances, else it would prevent GC.)
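
To make the bookkeeping under discussion concrete, here is a minimal
sketch of a tracker that keeps a definition-to-instance map plus a
second map of released instances held only through weak references.
Definition, Instance and InstanceTracker are hypothetical stand-ins
invented for this example, not types from the draft API.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a module definition. */
final class Definition {
    final String name;
    Definition(String name) { this.name = name; }
}

/** Hypothetical stand-in for a module instance. */
final class Instance {
    final Definition definition;
    Instance(Definition definition) { this.definition = definition; }
}

/** Hypothetical tracker for live and released instances. */
final class InstanceTracker {
    // At most one live instance per definition at any given moment.
    private final Map<Definition, Instance> live = new HashMap<>();
    // Released instances are only weakly referenced, so the tracker
    // itself never keeps them from being garbage collected.
    private final Map<Definition, WeakReference<Instance>> released = new HashMap<>();

    /** Returns the live instance, creating it on first use. */
    synchronized Instance getInstance(Definition d) {
        return live.computeIfAbsent(d, Instance::new);
    }

    /** Moves the live instance, if any, to the weakly held map. */
    synchronized void release(Definition d) {
        Instance old = live.remove(d);
        if (old != null) {
            released.put(d, new WeakReference<>(old));
        }
    }

    synchronized boolean isLive(Definition d) {
        return live.containsKey(d);
    }

    synchronized boolean wasReleased(Definition d) {
        return released.containsKey(d);
    }
}
```

Note that a lookup after release() yields a fresh instance, which is
exactly the situation described at the top of this thread: two
consumers can end up holding different instances of the same module.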

My view is that ModuleSystem needs to keep track of various runtime
information related to ModuleDefinition and Module anyway, so I don't
see clear benefit in moving part of that information into other classes.


Other than Module instances, what other "runtime information" is there
to keep track of? Caches of exported packages?

Understand here that I am *not* focused on the idea of caching Module
instances on ModuleDefinition; I do understand the desire to avoid
polluting a stateless type.

I am trying to figure out if we *could* do so, as a means of pinning
down precisely what our model is.

For example, if we were to eliminate the releaseModule() method (in
favor of some more complete mechanism), then there really is always a
1:1 for Module:ModuleDefinition, and the model is simple and obvious.
(And therefore a field cache *could* be used).

But this approach means that the wrapper must be tightly coupled to the
wrappee. While I can imagine this holding true in some cases, it
certainly doesn't seem like the common case.

Why should an applet container, for example, be required to know the
*type* of ModuleDefinition subclasses in the repository?

Providing some sort of copy() operation on the base class eliminates
this kind of coupling.


Perhaps I still don't fully understand how you will want to create a new
repository for isolation, and why having the knowledge of the
ModuleDefinition subclasses is not acceptable in this context.

As I previously hinted, having some sort of copy() operation in the base
class is not feasible because many module definitions do not make
sense to clone. Cloning also implies that the underlying lifetime of
the ModuleDefinition (and the ModuleDefinitionContent) from two
different repositories could be arbitrarily tied, and I don't think it
makes sense unless the repositories have the same owner.


I completely agree that cloning raises concurrency and lifecycle issues.
But these don't go away just because you know the actual type of a
ModuleDefinition!

I am not in any way wedded to the copy idea. I am just trying to find
*some* solution that enables private Module instances; the copy idea was
actually yours :^).

In the case of an applet container, my expectation is that the container
will construct some kind of AppletURLRepository for each codebase with
some ModuleDefinition subclass, and each ModuleDefinition is
instantiated with a custom ModuleDefinitionContent implementation with
the applet cache as the backing store. In other words, the applet
container or the AppletURLRepository already has knowledge about the
ModuleDefinition subclass. If the applet container needs a new
repository for isolation, then it would construct another
AppletURLRepository, and this new AppletURLRepository could construct
each ModuleDefinition using the existing ModuleDefinitionContent instance.


So, in effect, we have 100% *private repository* instances.

I've been thinking that we need an intermediate somewhere between a
shared/public repo instance and an entirely private one, but... that now
strikes me as too fuzzy, and I can't see a real use case :^)

So an application server would have to create, say, a private
LocalRepository instance to hold the modules of a single application.
And it would have to ensure that no other application could get its
grubby paws on that repository instance.

Ok. That works for me. And it eliminates the need for cloning AND for
releaseModule():

1. Any given Repository instance is either 100% shared or 100% private,
with *no* in-between.

2. The lifecycle of a shared repository instance is that of the process.

3. The lifecycle of a private repository instance is entirely up to the
creator of that instance.

4. The lifecycle of a ModuleDefinition/Module is at most that of the
enclosing Repository instance, and at least is bounded by
install/uninstall (no finer granularity).


This seems like a clean simple model: private Modules via private
Repositories.
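
The four rules above can be sketched in a few lines. This is an
illustration only, under stated assumptions: RepoSketch and all of its
methods are hypothetical, module instances are plain Objects, and for
brevity the instance is created eagerly at install time.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a repository; not the draft API. */
final class RepoSketch {
    private final boolean shared;
    private final Map<String, Object> instances = new HashMap<>();
    private boolean open = true;

    private RepoSketch(boolean shared) { this.shared = shared; }

    // Rule 1: an instance is either fully shared or fully private.
    static RepoSketch newShared() { return new RepoSketch(true); }
    static RepoSketch newPrivate() { return new RepoSketch(false); }

    /** Rule 4: a module's life begins at install (one instance per name). */
    synchronized void install(String name) {
        requireOpen();
        instances.putIfAbsent(name, new Object());
    }

    /** Returns the instance, or null if the module is not installed. */
    synchronized Object find(String name) {
        requireOpen();
        return instances.get(name);
    }

    /** Rule 4: uninstall is the finest-grained way to end a module's life. */
    synchronized void uninstall(String name) {
        requireOpen();
        instances.remove(name);
    }

    /** Rules 2 and 3: only a private repository can be shut down. */
    synchronized void shutdown() {
        if (shared) {
            throw new UnsupportedOperationException(
                "a shared repository lives as long as the process");
        }
        instances.clear();
        open = false;
    }

    private void requireOpen() {
        if (!open) {
            throw new IllegalStateException("repository is shut down");
        }
    }
}
```

Two applications that each hold their own private RepoSketch get
distinct instances of a module with the same name, and shutting one
repository down has no effect on the other.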

And a private repository can work for the Applet, IDE plugin or EE
application cases just fine.

It does leave open the issue of dependencies *within* a private
repository. The simple model would be to treat the entire repository as
atomic, with any change requiring a new Repository instance. This is
probably too simplistic, however.

In an EE app, web-modules are supposed to be isolated from each other
and from other parts of the app (ejb, connectors, etc.). So this
requires either a further partitioning of the app into multiple
repositories, or some form of access control.

Further, it is possible to re-start or re-deploy/re-start only a single
web-module, *without* restarting the rest of the app. The re-start case
could use releaseModule(), though a "real" stop method would be
preferable. But this is a very special case in which the specs
essentially dictate the possible dependencies between modules. In the
general case where the dependencies are not dictated, releaseModule() is
problematic.

The re-deploy case would use uninstall/install (but I would still like a
stop method!).


If you *really* want the releaseModule() functionality, I would suggest
that we introduce a PrivateRepository type, and support release *only*
on that type.
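
One way to read this suggestion is as a type-level restriction: the
base repository type has no release operation at all, and only the
private subtype exposes one. The sketch below assumes hypothetical
names throughout (BaseRepo, PrivateRepo, getModule) and represents
module instances as plain Objects; only releaseModule() echoes a name
used in this thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical base type: lookup only, no way to release a module. */
class BaseRepo {
    protected final Map<String, Object> modules = new HashMap<>();

    /** Returns the single instance for this name, creating it if needed. */
    synchronized Object getModule(String name) {
        return modules.computeIfAbsent(name, n -> new Object());
    }
}

/** Hypothetical private type: the only place release is exposed. */
final class PrivateRepo extends BaseRepo {
    /** Drops the current instance; returns false if there was none. */
    synchronized boolean releaseModule(String name) {
        return modules.remove(name) != null;
    }
}
```

Code that only sees a BaseRepo reference cannot call releaseModule(),
so shared repositories are protected by the compiler and the risk stays
with whoever created the PrivateRepo.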

// Bryan


- Stanley
