lhotari commented on pull request #9638: URL: https://github.com/apache/pulsar/pull/9638#issuecomment-785232699
> This is true for ThreadRuntime currently but not for ProcessRuntime / kubernetes runtijme. ThreadRuntime is also not doing the right thing since Ideally user provided code is isolated on completely separate classloader that only share common interfaces with the framework classloader. @jerrypeng Thanks for pointing this out. I spent a few hours investigating this to get a better understanding how the classloaders are configured in Pulsar Functions. I found the documentation of the classloader hierarchy for Process Runtime in JavaInstanceMain class: https://github.com/apache/pulsar/blob/00463adfd5c59958479b9b3a8e26140ab07fd8c1/pulsar-functions/runtime-all/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceMain.java#L34-L47 There's also logic that impacts how the classpaths get configured: https://github.com/apache/pulsar/blob/64971778d88a8dab6e6e50563e9c30d5425c8cc1/pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/runtime/RuntimeUtils.java#L286-L297 By default `functionInstanceClassPath` is always null/empty, for both Process Runtime and also for Kubernetes Runtime. [Therefore `pulsar.functions.instance.classpath` will always be set to `System.getProperty("java.class.path")` by default](https://github.com/apache/pulsar/blob/64971778d88a8dab6e6e50563e9c30d5425c8cc1/pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/runtime/RuntimeUtils.java#L286-L297). Because of the above logic where `pulsar.functions.instance.classpath` will always contain the classpath of the process that launches the processes, it leads to the same behavior for thread, process and kubernetes runtimes, where the **Pulsar provided dependencies will always get loaded from the parent classloader of the "user classloader"**. I don't see any actual classloading differences between the thread runtime, process runtime and kubernetes runtime when the default configuration is used (`functionInstanceClassPath` is left unconfigured in `conf/functions_worker.yml`) . I might be missing something, so I'd be happy to learn more about this area in Pulsar. > > While I am all for decreasing the sizes of connectors. This will have other consequences. There are integration tests which [test both the process runtime](https://github.com/apache/pulsar/blob/master/tests/integration/src/test/java/org/apache/pulsar/tests/integration/functions/PulsarFunctionsProcessTest.java) and the thread runtime for [some connectors](https://github.com/apache/pulsar/blob/a5b3cf6fe694b57adffc7e9a69c8520f3d85b2e3/tests/integration/src/test/java/org/apache/pulsar/tests/integration/functions/PulsarFunctionsTest.java#L79-L90). It seems like a major gap that not all connectors are tested with the thread runtime and the process runtime. You are right that there will be unknown consequences as long as there isn't test coverage for both scenarios. As explained in the previous observations of the thread vs. process/k8s runtime (no actual difference in classloading with default settings), the changes made in this PR are safe in general. @jerrypeng WDYT? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
