gharris1727 commented on code in PR #18403:
URL: https://github.com/apache/kafka/pull/18403#discussion_r1907898181
##########
connect/runtime/src/test/java/org/apache/kafka/connect/runtime/isolation/TestPlugins.java:
##########
@@ -251,7 +252,11 @@ public enum TestPlugin {
/**
* A ServiceLoader discovered plugin which subclasses another plugin
which is present on the classpath
*/
-
SUBCLASS_OF_CLASSPATH_OVERRIDE_POLICY(TestPackage.SUBCLASS_OF_CLASSPATH,
"test.plugins.SubclassOfClasspathOverridePolicy");
+
SUBCLASS_OF_CLASSPATH_OVERRIDE_POLICY(TestPackage.SUBCLASS_OF_CLASSPATH,
"test.plugins.SubclassOfClasspathOverridePolicy"),
+ /**
+ * A plugin which is part of the classpath by default. This packages
it as a separate jar which is used to test plugin isolation from the classpath
plugin.
+ */
+ CLASSPATH_CONVERTER(TestPackage.CLASSPATH_CONVERTER,
"org.apache.kafka.connect.converters.ByteArrayConverter");
Review Comment:
Mark this as not included by default.
##########
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/ReflectionScanner.java:
##########
@@ -76,8 +78,21 @@ private static <T> String versionFor(Class<? extends T>
pluginKlass) throws Refl
@Override
protected PluginScanResult scanPlugins(PluginSource source) {
+ // By default, java and classgraph uses parent first classloading,
hence if a plugin is loaded by the classpath
+ // loader, and then by an isolated plugin loader, the default
precedence will always load the classpath version.
+ // This breaks isolation and hence connect uses isolated plugin
loaders, which are child first classloaders.
+ // Therefore, we override the classloader order to be child first, so
that the isolated plugin loader is used first.
+ // In addition, we need to explicitly specify the full classloader
order, as classgraph only scans the classes available
+ // in the classloaders and not the entire parent chain. Due to this
reason if a plugin is extending a class present
+ // in classpath/application it will not be able to find the parent
class unless we explicitly specify the classloader order.
+ List<ClassLoader> classLoaderOrder = new ArrayList<>();
+ ClassLoader cl = source.loader();
+ while (cl != null) {
+ classLoaderOrder.add(cl);
+ cl = cl.getParent();
+ }
Review Comment:
But I can see that your reproduction case fails with the old code and
succeeds with the new code, why is that?
I can see after the initial `ClassGraph#scan()` call, it actually always
finds the PluginClassLoader isolated copy of ByteArrayConverter, even with the
trunk implementation. The difference happens inside of
`ClassInfoList#loadClasses` and `ClassGraphClassLoader#loadClass`. Here's what
I was seeing there:
#### Trunk
`environmentClassLoaderDelegationOrder`
* PlatformClassLoader
* **AppClassLoader**
* FilteringClassLoader (?)
* URLClassLoader (?)
* **PluginClassLoader**
#### Loader and parents
`overrideClassLoaders`
* **PluginClassLoader**
* **AppClassLoader**
* PlatformClassLoader
So the order of the specified classloaders _does_ change the classloading
order of the plugins, even though it doesn't change the scanning behavior.
I think the current implementation is satisfactory as a workaround for the
behavior, but we should follow-up with ClassGraph and try and use their
ClassLoaderHandler infrastructure to get the right classloader sorting. I
explored circumventing the ClassGraphClassLoader entirely and calling
`Class.forName(..., ..., source.loader())`, but its a bit clunky because we
have to also include a bunch of error handling.
##########
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/ReflectionScanner.java:
##########
@@ -76,8 +78,21 @@ private static <T> String versionFor(Class<? extends T>
pluginKlass) throws Refl
@Override
protected PluginScanResult scanPlugins(PluginSource source) {
+ // By default, java and classgraph uses parent first classloading,
hence if a plugin is loaded by the classpath
+ // loader, and then by an isolated plugin loader, the default
precedence will always load the classpath version.
+ // This breaks isolation and hence connect uses isolated plugin
loaders, which are child first classloaders.
+ // Therefore, we override the classloader order to be child first, so
that the isolated plugin loader is used first.
+ // In addition, we need to explicitly specify the full classloader
order, as classgraph only scans the classes available
+ // in the classloaders and not the entire parent chain. Due to this
reason if a plugin is extending a class present
+ // in classpath/application it will not be able to find the parent
class unless we explicitly specify the classloader order.
+ List<ClassLoader> classLoaderOrder = new ArrayList<>();
+ ClassLoader cl = source.loader();
+ while (cl != null) {
+ classLoaderOrder.add(cl);
+ cl = cl.getParent();
+ }
Review Comment:
Okay, thanks for the explanation, I see that ClassGraph is going outside the
classloader hierarchy for scanning, and just uses a list of jars. Here's what I
was seeing:
#### Trunk
```
.addClassLoader(source.loader())
```
* Gradle wrapper
* **Plugin**
* Gradle worker
* build/ class directories
* **connect-runtime**
* build/ lib jars
* Gradle caches (dependencies)
#### Single Loader
```
.overrideClassLoaders(source.loader())
```
* **Plugin**
#### Loader and parents
```
List<ClassLoader> classLoaderOrder = new ArrayList<>();
ClassLoader cl = source.loader();
while (cl != null) {
classLoaderOrder.add(cl);
cl = cl.getParent();
}
...
.overrideClassLoaders(classLoaderOrder.toArray(new ClassLoader[0]))
```
* Runtime modules
* **Plugin**
* Gradle worker
* build/ class directories
* **connect-runtime**
* build/ lib jars
* Gradle caches (dependencies)
It doesn't appear that the order in which we specify the classloaders has an
effect on the order in which scanning actually takes place. And I think that
makes sense given the ClassLoaderHandler stuff; It's applying a deterministic
ordering to generate these lists of jars.
Look at this:
https://github.com/classgraph/classgraph/blob/6f9012f2a193ebfefe4a4384e7642820e7aab0f5/src/main/java/nonapi/io/github/classgraph/classloaderhandler/README
> Note that URLClassLoader subclasses do not need a custom
ClassLoaderHandler (unless they need to override the delegation order, as with
SpringBootRestartClassLoaderHandler), URLClassLoader subclasses are handled
automatically by ClassGraph.
This sounds like it applies to us, because we have a child-first/parent-last
delegation order, but we don't have a special handler telling ClassGraph about
it. Maybe we can pursue upstreaming a handler to make the ordering work
properly.
##########
connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/ReflectionScanner.java:
##########
@@ -76,8 +78,22 @@ private static <T> String versionFor(Class<? extends T>
pluginKlass) throws Refl
@Override
protected PluginScanResult scanPlugins(PluginSource source) {
+ // By default, java and classgraph uses parent first classloading,
hence if a plugin is loaded by the classpath
+ // loader, and then by an isolated plugin loader, the default
precedence will always load the classpath version.
+ // This breaks isolation and hence connect uses isolated plugin
loaders, which are child first classloaders.
+ // Therefore, we override the classloader order to be child first, so
that the isolated plugin loader is used first.
+ // In addition, we need to explicitly specify the full classloader
order, to force classgraph to scan classes in
+ // the class path. This does not happen by default as it uses
reflections to obtain access to classpath URLs from
+ // the application classloader, which can fail with illegal access
exceptions.
+ List<ClassLoader> classLoaderOrder = new ArrayList<>();
+ ClassLoader cl = source.loader();
+ while (cl != null) {
+ classLoaderOrder.add(cl);
+ cl = cl.getParent();
+ }
ClassGraph classGraphBuilder = new ClassGraph()
- .addClassLoader(source.loader())
+ .addClassLoader(source.loader())
+ .overrideClassLoaders(classLoaderOrder.toArray(new
ClassLoader[0]))
Review Comment:
nit: make this a private method that returns the classloader array.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]