Re: Module isolation

Bryan Atsatt Wed, 20 Jun 2007 16:13:46 -0700

Stanley M. Ho wrote:

Hi Bryan,


Sorry, I was sidetracked in the last two days, will follow up other
threads soon.

Bryan Atsatt wrote:

This use case seems to presume that the IDE can/will ensure that there
is only *one* consumer of the module: itself. What if a different plugin
has a dependency on it, and has already been resolved? Is the intention
that the IDE will be wired into the module system deeply enough to
manage this correctly?


It won't affect existing plugins that have also been initialized with the
dependency. My assumption is that the IDE should know what it's doing
when using the APIs.

An application server, for example, will need to keep one application
from accessing the modules of another. And it must *know* that there is
no sharing, so that the application lifecycles can remain independent.

So we need some notion of a context. A purely private repository
instance is one (probably good) possibility. Another is the wrapper
Repository approach, but this requires definition copies (and management
of sharing content, lifecycle, etc).


The way I see it is that the repository is the context that you want to
use for isolation. The APIs allow developers only to walk up the parent
repository instance from a child repository instance, but not vice
versa. If the repository instance is only used in a specific domain
(e.g. webapp) and the repository instance is hidden from other
applications (e.g. other webapps), this would effectively hide the
modules in that repository and the repository could be considered
private. Whether such a repository is constructed using the wrapper
approach or reusing existing ModuleDefinitionContent is an
implementation detail. Do you agree?
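
A minimal sketch of the repository-as-context idea described above. The class name SketchRepository, its methods, and the parent-first lookup order are assumptions made for illustration; this is not the draft Repository API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a repository; not the draft Repository API.
class SketchRepository {
    private final SketchRepository parent;   // null for a root repository
    private final Map<String, String> definitions = new HashMap<>();

    SketchRepository(SketchRepository parent) {
        this.parent = parent;
    }

    // Navigation is upward only; there is deliberately no getChildren().
    SketchRepository getParent() {
        return parent;
    }

    void install(String name, String definition) {
        definitions.put(name, definition);
    }

    // Assumed lookup order: parent chain first, then this repository.
    // Returns null when the name is not visible from here.
    String find(String name) {
        if (parent != null) {
            String found = parent.find(name);
            if (found != null) {
                return found;
            }
        }
        return definitions.get(name);
    }
}
```

With one shared parent and one child per webapp, each child sees the parent's definitions, but neither a sibling nor the parent can reach what a child holds, provided the child reference itself is never handed out.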


Yes, but it is a very critical detail! The wrapper approach, if
required, drags in the copy, concurrency and lifecycle issues; all of
which need solutions. I am groping here for a model in which none of
this is required under normal circumstances.

I think the purely private repository model works fine, where the
definitions and module instances are also private.

And, as I said in the app server thread, using the private repository
approach for web-modules literally means a repository containing a
*single definition*. This may be an acceptable solution, but I want to
ensure that everyone is aware of this implication.

Regardless of what approach we take, the releaseModule() idea is too
simplistic. Having originally created the detach() method, thinking
along similar lines as you are with the plugin case, I do understand the
idea; I just no longer think it is sufficient :^).

The only "safe" time to release a module is when there are *zero*
importers, and, even then, you must hide/release atomically, ensuring
that no new imports spring up during the operation.


Releasing a module simply means that its reference is released from the
module system's cache; it does not affect any existing importer. Also,
hide and release do not need to happen atomically - I think hiding it
first would be sufficient to prevent new imports from being resolved, but
I'll need to double check it with the RI.


I understand what release does :^). The issue is not the release itself,
but the potential subsequent instantiation of a Module from that same
definition. This instance is now a duplicate, with duplicate classes,
which can easily lead to failures: the original instance may still be in
use.
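
The duplication concern can be illustrated with a small sketch. All names here (SketchDefinition, SketchModule, SketchModuleSystem) are hypothetical, and the map-based cache is an assumption about how a module system might track instances, not a description of the RI.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical types for illustration only.
class SketchDefinition {
    final String name;

    SketchDefinition(String name) {
        this.name = name;
    }
}

class SketchModule {
    final SketchDefinition definition;

    SketchModule(SketchDefinition definition) {
        this.definition = definition;
    }
}

class SketchModuleSystem {
    // Assumed cache: one Module instance per definition until released.
    private final Map<SketchDefinition, SketchModule> cache = new HashMap<>();

    synchronized SketchModule getModule(SketchDefinition def) {
        return cache.computeIfAbsent(def, SketchModule::new);
    }

    // Drops only the cache entry; existing holders keep their reference.
    synchronized void releaseModule(SketchDefinition def) {
        cache.remove(def);
    }
}
```

After releaseModule(), the next getModule() call for the same definition creates a second instance while the first may still be referenced by its importers. In a real module system each instance would carry its own class loader, which is where the duplicate classes would come from.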


Note that I'm not arguing the releaseModule() approach is perfect, but I
do think this is a use case that we can ignore; I would welcome better
suggestions for handling this.

Other than Module instances, what other "runtime information" is there
to keep track of? Caches of exported packages?


The Module instances that have been initialized, the ModuleDefinitions
that are being instantiated and the corresponding Module instances that
are being initialized (could be triggered simultaneously from multiple
threads), and the ModuleDefinitions that have been disabled (e.g. the
repository has been shut down and no new Module instance should be created).


Right. But only the first of these is long lived, the others are
transient, needed only during resolution. So again, if releaseModule()
either did not exist, *or* if we did not keep references to released
instances, a simple field cache could be used. Just trying to see
through some of the fog here :^)

For example, if we were to eliminate the releaseModule() method (in
favor of some more complete mechanism), then there really is always a
1:1 for Module:ModuleDefinition, and the model is simple and obvious.
(And therefore a field cache *could* be used).


To keep this discussion focused, I think we could assume
Module:ModuleDefinition is 1:1 for now. releaseModule() is just a
special case. That said, even if the relationship is 1:1, it still
doesn't mean we *should* put the state in ModuleDefinition, see above. ;-)


Sure. (OTOH, it *would* make ModuleSystem a bit simpler by eliminating
the need for a silly mapping, and would bake the 1:1 model into the
design. And stashing a field like this isn't much different than an
immutable String having a hash field that is lazily stored. But clearly
if you want to track released/stopped instances, this becomes more
involved! I'm going to shut up about this aspect now ;^)
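
As a rough sketch of the field-cache idea, assuming a strict 1:1 relationship and no release: the names FieldCachedDefinition and FieldCachedModule are hypothetical, and the synchronization is one possible choice, not something the thread specifies.

```java
// Hypothetical types for illustration only.
class FieldCachedModule {
    final FieldCachedDefinition definition;

    FieldCachedModule(FieldCachedDefinition definition) {
        this.definition = definition;
    }
}

class FieldCachedDefinition {
    private final String name;
    // Lazily stored, in the spirit of String's cached hash value.
    private FieldCachedModule module;

    FieldCachedDefinition(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Synchronized so that concurrent callers never create two instances.
    synchronized FieldCachedModule getModule() {
        if (module == null) {
            module = new FieldCachedModule(this);
        }
        return module;
    }
}
```

Unlike String's hash, which can safely be recomputed by racing threads, creating a module twice is not harmless, so the sketch takes a lock instead of relying on a benign race.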

So, in effect, we have 100% *private repository* instances.

I've been thinking that we need an intermediate somewhere between a
shared/public repo instance and an entirely private one, but... that now
strikes me as too fuzzy, and I can't see a real use case :^)

So an application server would have to create, say, a private
LocalRepository instance to hold the modules of a single application.
And it would have to ensure that no other application could get its
grubby paws on that repository instance.

Ok. That works for me. And it eliminates the need for cloning AND for
releaseModule():

1. Any given Repository instance is either 100% shared or 100% private,
with *no* in-between.

2. The lifecycle of a shared repository instance is that of the process.

3. The lifecycle of a private repository instance is entirely up to the
creator of that instance.


I agree that distinguishing the repository instances for sharing or
private usages is easier to understand, but the notion of shared and
private really depends on the usage context, and it's not an attribute
of the repository itself. Suppose there is a repository with two child
repositories; this repository would be considered "shared" from the
perspective of the modules in the child repositories, but this
repository might still be considered private from the other applications
in the same JVM.

I think your first three points can be combined as follows:

1. Any given Repository instance could be used for sharing or private
purpose.

2. The lifetime of a repository instance is managed by its creator.


Yes, but this doesn't go far enough. For example, the JRE will create
the system repository, and, by this rule, could shut it down at any
time. Clearly this would cause major havoc. Unless we have a complete
lifecycle model for Modules, so that it is possible to ensure that no
active user exists, such "global" modules must live for the lifetime of
the process. Otherwise, we risk collision failures.

4. The lifecycle of a ModuleDefinition/Module is at most that of the
enclosing Repository instance, and at least is bounded by
install/uninstall (no finer granularity).


Yes.


But releaseModule() violates #4. Use of that method enables unlimited
numbers of Module instances from the same definition, during the
lifetime of the enclosing repository.

It does leave open the issue of dependencies *within* a private
repository. The simple model would be to treat the entire repository as
atomic, with any change requiring a new Repository instance. This is
probably too simplistic, however.

In an EE app, web-modules are supposed to be isolated from each other
and from other parts of the app (ejb, connectors, etc.). So this
requires either a further partitioning of the app into multiple
repositories, or some form of access control.


Each web-module should probably be in its own repository if isolation is
needed. There is no access control for ModuleDefinitions - accessibility
of a ModuleDefinition is the same as visibility of ModuleDefinition;
this is also a very different issue. I think we should focus our
discussion on using repository instance for isolation unless we find
this approach insufficient.


Sure, again, as long as a single module repository is deemed an
acceptable solution.

Further, it is possible to re-start or re-deploy/re-start only a single
web-module, *without* restarting the rest of the app. The re-start case
could use releaseModule(), though a "real" stop method would be
preferable. But this is a very special case in which the specs
essentially dictate the possible dependencies between modules. In the
general case where the dependencies are not dictated, releaseModule() is
problematic.
The re-deploy case would use uninstall/install (but would still like
stop!).


Whether the solution is stop() or releaseModule(), it still shows that
we need to support the use case of long-lived repository with
short-lived modules. Do you agree?


Sure.

If you *really* want the releaseModule() functionality, I would suggest
that we introduce a PrivateRepository type, and support release *only*
on that type.


The notion of private or shared depends on the usage context of the
repository instance, so it won't be appropriate to surface the concept
at the API level.


If the semantics of releaseModule() limit its use to very special
circumstances, I think it would be appropriate to capture this in the API.

In the spec, you rule out the use of this method against any java.* or
bootstrap repository module.

But is it ever appropriate to call releaseModule() on a Module from the
system repository? Or any public, global repository?

I was simply exploring the possibility that this functionality only
makes sense on a private repository instance. But I don't care if we
generalize that to say that the creator should be able to control this
behavior.

If so, I'd like that to be manifest in the API somehow. I don't really
care how. As another possibility, why not expose this as an attribute of
Repository, just as we do read-only status?

   public abstract boolean supportsRelease();

And then document ModuleSystem.releaseModule() as a no-op or failure
when the underlying repo returns false.
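
One way this could look, as a sketch only. Apart from supportsRelease(), every name below (ReleaseAwareRepository, GuardedModuleSystem, the String-keyed cache) is hypothetical, and the sketch picks the "failure" option by throwing rather than silently doing nothing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical repository type carrying the proposed attribute.
abstract class ReleaseAwareRepository {
    public abstract boolean supportsRelease();
}

// Hypothetical module system; modules are plain Objects keyed by name.
class GuardedModuleSystem {
    private final Map<String, Object> modules = new HashMap<>();

    synchronized Object getModule(String name) {
        return modules.computeIfAbsent(name, n -> new Object());
    }

    // Fails when the owning repository does not allow release.
    synchronized void releaseModule(ReleaseAwareRepository owner, String name) {
        if (!owner.supportsRelease()) {
            throw new UnsupportedOperationException(
                    "repository does not support release: " + name);
        }
        modules.remove(name);
    }
}
```

A shared or system repository would return false from supportsRelease(), while a repository created privately by an application server could return true.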


Bryan, there are many topics in this thread, but I think what you really
want is to discuss the notion of module isolation. As we have discussed so
far, I think we all agree that the repository is the appropriate context to
make module isolation possible. Could you summarize any outstanding
design concern you have with this approach in a paragraph or two?


My only remaining design concern (on the topic of isolation) is the
semantics of releaseModule(). As long as we nail down the *correct*
usage of this method, I'll be happy.

Since the current design does not provide a means to determine or
control use of released modules, class duplication failures and/or large
memory leaks remain a strong possibility. As I've said from the very
beginning, we have solved this problem in our application server by
choosing to disable the class loaders of "released" modules, and it has
proven extremely useful. I would like to see this supported here as an
optional behavior.
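
A sketch of what disabling a released module's class loader might look like. The thread does not say how the application server implements this, so the class name DisableableClassLoader, the disable() method, and the choice of IllegalStateException are all assumptions.

```java
// Hypothetical loader that refuses further loads once its module is released.
class DisableableClassLoader extends ClassLoader {
    private volatile boolean disabled;

    DisableableClassLoader(ClassLoader parent) {
        super(parent);
    }

    // Called when the owning module is released.
    void disable() {
        disabled = true;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        if (disabled) {
            // Fail fast instead of letting a stale module keep loading classes.
            throw new IllegalStateException(
                    "module released; loader disabled, cannot load " + name);
        }
        return super.loadClass(name, resolve);
    }
}
```

Classes that were loaded before disable() remain usable; only new load requests through this loader fail, which makes continued use of a released module visible early rather than surfacing later as a duplicate-class failure.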


Thanks,
- Stanley
