Re: [External] : Re: Inconsistency with service loading by layer or by class loader

Ron Pressler Wed, 18 Dec 2024 07:47:45 -0800

Here’s why we find your suggestions/descriptions so difficult to wrap our heads 
around:

There are many situations in which a program is made up of multiple components 
and you want to make sure the components are all in place before starting the 
program, i.e. you want what we call “reliable configuration”. This capability 
wasn’t available in the past, but modules are the feature that offers it now. 
*To those users who want reliable configuration* (either for the program as a 
whole in the common case, or for a “user program” in the dynamic container 
case, which would then be represented as a module layer) the platform offers 
that with the modules feature.
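
As a rough illustration of what "reliable configuration" means in API terms, the sketch below resolves a root module against the boot layer's configuration and reports whether resolution succeeds. The class name `ReliableConfigSketch` and the module name `com.example.missing` are made up for this example; the point is only that a missing module is reported when the configuration is resolved, before any code in it runs.

```java
import java.lang.module.Configuration;
import java.lang.module.FindException;
import java.lang.module.ModuleFinder;
import java.util.Set;

class ReliableConfigSketch {
    /** Returns true if the named root module can be resolved on top of the boot layer. */
    static boolean resolves(String rootModule) {
        // Empty finders: a module can only be found through the parent (boot) configuration.
        ModuleFinder empty = ModuleFinder.of();
        try {
            Configuration cf = ModuleLayer.boot().configuration()
                    .resolve(empty, empty, Set.of(rootModule));
            return cf != null;
        } catch (FindException e) {
            // Resolution fails up front when a root (or one of its dependences) is absent.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.base resolves: " + resolves("java.base"));
        // "com.example.missing" is a hypothetical module name that is not expected to exist.
        System.out.println("com.example.missing resolves: " + resolves("com.example.missing"));
    }
}
```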

The scenario you’ve described is one that effectively disables reliable 
configuration. That is what everything you’ve brought up is predicated on. I 
don’t doubt you have your reasons for not wanting reliable configuration, but 
that only raises the question, why do you want to use the reliable 
configuration feature only to disable reliable configuration? You say you want 
to be “a good citizen” to those users who want modules’ benefits, but then 
proceed to disable one of their main benefits. You say, “the point of all of 
this is to define modules that are bound late”, but given that one of modules’ 
main goals is to figure out the configuration *early*, the obvious question is, 
what is it that you’re trying to accomplish *with modules*? We’re not being 
difficult; this is really unclear to us.

A valid answer could be that modules also offer strong encapsulation, you want 
to use the feature for that aspect only, and that is what you want to offer 
your users: strong encapsulation only.

Maybe it is a good idea to separate the concerns and support the use of modules 
in a way that offers strong encapsulation and not reliable configuration, but 
that is a big-picture discussion. If the answer is that we should support this, 
then this is not something we should do by adding some API points that would 
allow you, in your own special way, to disable reliable configuration. Rather, 
it should be done right (maybe add a concept of a DynamicModuleLayer that lets 
you add modules to it willy-nilly?).
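
For contrast with the hypothetical DynamicModuleLayer, here is a minimal sketch of how things stand with the existing API, as far as I understand it: a layer's set of modules is fixed by the Configuration it is created from, so adding modules later means defining another layer. The class name `ChildLayerSketch` is invented for the example, and the configuration is deliberately empty so that the code runs without any extra module artifacts.

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleFinder;
import java.util.List;
import java.util.Set;

class ChildLayerSketch {
    /** Defines a new, empty child layer on top of the boot layer. */
    static ModuleLayer newEmptyChildLayer() {
        ModuleLayer boot = ModuleLayer.boot();
        // Empty finders and no roots: the resulting configuration contains no modules.
        Configuration cf = boot.configuration()
                .resolve(ModuleFinder.of(), ModuleFinder.of(), Set.of());
        // The Controller can adjust modules already in the layer (addReads, addExports,
        // addOpens), but it offers no way to add further modules to the layer.
        ModuleLayer.Controller controller = ModuleLayer.defineModulesWithOneLoader(
                cf, List.of(boot), ClassLoader.getSystemClassLoader());
        return controller.layer();
    }

    public static void main(String[] args) {
        ModuleLayer child = newEmptyChildLayer();
        System.out.println("modules in child layer: " + child.modules().size());
        System.out.println("parent is boot layer: " + child.parents().contains(ModuleLayer.boot()));
    }
}
```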

But before making such a big conceptual decision about modules — which 
currently do tie together both concerns of strong encapsulation and reliable 
configuration — we need to know that doing one without the other is, indeed, an 
important use case to support. You’ve given two reasons:

1. You said you need to dynamically add service providers to a user 
application, but didn’t explain why or offer a concrete example.

2. You claimed that putting an entire user application in a single layer would 
come at “a heavy performance cost” but only offered a hypothesis to support 
that claim. The hypothesis is not sufficient because the O(N) scanning 
algorithm is only done once, and its duration is likely to be related to other 
initialisation work. I.e. a 1000-module user application will probably have a 
longer time-to-first-response than a 100-module user application even if it 
were loaded lazily, and it’s unclear just how much overhead the initial scan 
adds.
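
For readers following along, the two lookup styles named in the subject line can be sketched as below. This is only an illustration of the two existing `ServiceLoader.load` entry points, not of anything proposed in the thread; `ServiceLookupSketch` and its nested `Greeter` interface are hypothetical names, and since nothing provides `Greeter`, both counts are expected to be zero.

```java
import java.util.ServiceLoader;

class ServiceLookupSketch {
    /** Hypothetical service interface, used only for this illustration. */
    public interface Greeter {
        String greet();
    }

    /** Counts providers located by module layer (the layer and its ancestors). */
    static long countByLayer(ModuleLayer layer) {
        // stream() lists providers lazily without instantiating them.
        return ServiceLoader.load(layer, Greeter.class).stream().count();
    }

    /** Counts providers located by class loader (and its parent loaders). */
    static long countByClassLoader(ClassLoader loader) {
        return ServiceLoader.load(Greeter.class, loader).stream().count();
    }

    public static void main(String[] args) {
        System.out.println("by layer: " + countByLayer(ModuleLayer.boot()));
        System.out.println("by class loader: "
                + countByClassLoader(ClassLoader.getSystemClassLoader()));
    }
}
```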

In short, what you’re effectively proposing is that we support a 
strong-encapsulation-only use case, which is fine, but before we can go there 
we need more concrete information about why you cannot live with reliable 
configuration. Saying that convincing us of the importance of the 
strong-encapsulation-only use case is a waste of time so instead we should just 
add a couple of API points, please and thank you, is not how we want to do 
things.

— Ron

> On 18 Dec 2024, at 15:09, David Lloyd <david.ll...@redhat.com> wrote:
> 
> 
> 
> On Wed, Dec 18, 2024 at 3:55 AM Alan Bateman <alan.bate...@oracle.com> wrote:
> On 17/12/2024 17:21, David Lloyd wrote:
>> :
>> 
>> I was using it as more of an example about how a thing may be possible and 
>> allowed by the platform, and thus is achievable, yet not specifically 
>> presented with a convenient API. That said, I've opened 
>> https://bugs.openjdk.org/browse/JDK-8346439 as a way to continue the 
>> discussion, framed as a specific feature request which covers what we need, 
>> and it does in fact include an `addUses` method.
> 
> We decided in 2017 to not add these methods. I'm trying to see if there is 
> any new insight that would motivate adding these methods now.
> 
> ML.Controller::addUses would be a no-op for an automatic module so this 
> method will only add a "uses" edge for an explicit module. 
> 
> If the module has been compiled with references to the service class and is 
> calling ServiceLoader.load with that service class then its module descriptor 
> should have the appropriate `uses` in the module-info already. Has the module 
> author neglected to add this, didn't test, and the ML.Controller method will 
> be used to fix this?
> 
> No. Since we are late-binding all modules, every module we would load would 
> start with no `requires`, and we use `addReads` on the controller to wire in 
> the dependencies when the module is lazily linked. This means that any `uses` 
> declarations present on the descriptor which refer to packages not found 
> within the module itself will trigger a validation error, so we must strip 
> them out as well. In my prototype, I have to generate a method stub in the 
> target module to call `addUses` for this purpose; so, it is already possible 
> for me to do this but it would be nice to be able to do it non-stupidly.
> 
> This lazy-linking design already seems to work very well, with reasonably 
> fast startup and correct linking, if you ignore service loading. Right now I 
> have to force service-providing *and* service-using modules to be unnamed for 
> services to load (albeit incompletely, since again the `provides` method 
> would not work in this case).
> 
> The other scenario, and the motivation for Module::addUses, is where the 
> service is not known at compile-time, maybe code in the module is doing 
> service loading on behalf of another module. In that case, code in the module 
> itself should be calling Module::addUses method to add the transient `uses` 
> edge. Maybe the module author is not calling Module::addUses and the 
> ML.Controller method will be used to fix that?
> 
> That is correct, as far as it goes. But only because we have to define the 
> modules with descriptors that do not include the `uses`.
> 
> ML.Controller::addProvides is also puzzling. A service provider module can 
> only be compiled if the provider class is in the module and the service class 
> is accessible to it. Has the module author neglected to add the `provides` 
> and the ML.Controller method will be used to fix this?
> 
> It's the same thing. The module descriptor can only reference classes found 
> within packages that are in the module itself or a dependency, or else validation 
> will fail. Since we have no `requires` (because we cannot have eager graph 
> resolution), the set of packages is reduced to only those of the module 
> itself. Thus this mechanism cannot be used to declare a service provider if 
> the provider API exists outside of the module (this is the common case 
> AFAICT).
> 
> Or maybe this is about instrumentation or code generation where the 
> container adds a provider implementation to the module? In that case, why 
> didn't the container augment the module-info at the same time? Maybe the code 
> generation to add the provider implementation happens after the module has 
> been loaded?
> 
> The point of all of this is to define modules that are bound late, so that we 
> can continue to:
> 
> - Resolve only the parts of the module graph that are actually used at run 
> time
> - Resolve modules only when they are used
> - Have short or long cycles in the module dependency graph
> - Have multiple versions of a given module in the module dependency graph
> - Isolate modules from each other so that each module "sees" only the base 
> layer (well, ideally, only `java.base`, but that isn't possible AFAIK) and 
> its own dependency set (which may include other modules from the base layer 
> as well as modules from sibling layers)
> - Dynamically add more modules to the graph during run time (and remove them 
> too, at least if they exist in islands that can be safely unloaded)
> - Ensure that upon loading/usage, each module is correct from a 
> local/relative point of view, rather than a global point of view, much like 
> classes
> 
> Finally, just to say that your prototype addProvides doesn't specify any 
> validation. It looks like it can be used to add any random class and 
> implementation class. If a method were to be added then it would minimally 
> need to check that the implementation class is in the module and that it 
> extends the service class.
> 
> Yes and no. (I assume you imply that if the implementation provides a 
> `provider` method, then it's the method return type that would need to be 
> checked).
> 
> Firstly, the services are actually internally registered by class *name* 
> rather than class object, which seems weaker than necessary (maybe to avoid a 
> strong class reference?) and might allow any validation to be tricked or 
> bypassed somehow: there's only an imperfect guarantee that the service and 
> provider will actually be the *same* as what was registered. This seems to 
> undermine any validation, though we could just do it anyway I suppose.
> 
> Secondly, any layer-per-module architecture must be able to define providers 
> outside of their own module, otherwise there is no way to find these 
> providers. When loading a service from a module, unless the module shares a 
> layer with its implementations, the module itself must be told where its 
> providers are so that service loading can be done by layer. An alternative 
> would be to allow a module *layer*'s provider set to be manipulated instead 
> (which is essentially what this method does in effect anyway - basically, 
> just drop the `module` argument), which would be an OK alternative from our 
> POV; it would just be a bit oddly asymmetrical with respect to `addUses` 
> then. But that might be a good way to satisfy the "letter of the law".
> 
> I tend to believe that the principles behind the restriction of requiring 
> service implementations to live in the same module as the `provides` 
> declaration are really only applicable to static, JDK-managed application 
> layers (those on the boot path, or those created via e.g. 
> `java.lang.module.ModuleFinder#of(Path...)`). Our app server module system, 
> which was conceived back in 2010 when Java 6 was the latest Java, relies on 
> the ability to dynamically resolve `META-INF/services` files so that every 
> module (which each has its own isolated class loader) has its own view of 
> what service providers were available for *all* other modules, which works 
> very well. Resolving and binding whole groups of modules at once to determine 
> the service graph is essentially infeasible for architectures like this. This 
> is somewhat analogous to supporting e.g. `URLClassLoader` for simple 
> applications versus non-hierarchical class loading in advanced application 
> containers.
> 
> -- 
> - DML • he/him
