jerrypeng edited a comment on pull request #9638: URL: https://github.com/apache/pulsar/pull/9638#issuecomment-790809949
@lhotari `functionInstanceClsLoader`, defined in `JavaInstanceMain`: https://github.com/apache/pulsar/blob/master/pulsar-functions/runtime-all/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceMain.java#L87 does NOT load the user's function JARs. It is supposed to load the Pulsar Function framework JARs, so it does load all of the Pulsar platform dependencies.

The root classloader, which contains only the interfaces through which the user-defined function interacts with the framework, is defined here: https://github.com/apache/pulsar/blob/master/pulsar-functions/runtime-all/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceMain.java#L99 and passed into the `JavaInstanceStarter` here: https://github.com/apache/pulsar/blob/master/pulsar-functions/runtime-all/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceMain.java#L99

The root classloader is subsequently passed into the `ThreadRuntimeFactory` here: https://github.com/apache/pulsar/blob/master/pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/runtime/JavaInstanceStarter.java#L213

The `ThreadRuntime` will use the root classloader, which contains only those few interfaces, as the parent classloader of the user code function classloader. Thus, the classloader that loads the user function JARs will not contain all the dependencies of Pulsar.

By the way, `ThreadRuntime` is used by both `ProcessRuntime` and `KubernetesRuntime` underneath, but don't confuse this with actually configuring the worker to use `ThreadRuntime`.
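
To make the parent/child relationship concrete, here is a minimal, self-contained sketch of the general idea. It is not the Pulsar code and does not claim to mirror how the real root classloader is built. `SharedApi`, `FrameworkInternal`, `InterfaceOnlyClassLoader`, and the helper methods are hypothetical names invented for illustration. It assumes Java 9+ (for `Set.of`).

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Set;

public class ClassLoaderIsolationSketch {

    /** Hypothetical stand-in for the few interfaces shared by framework and user code. */
    public interface SharedApi {}

    /** Hypothetical stand-in for a framework dependency that user code should not see. */
    public static class FrameworkInternal {}

    /** Parent loader that exposes only an allow-list of classes from the framework loader. */
    static final class InterfaceOnlyClassLoader extends ClassLoader {
        private final ClassLoader frameworkLoader;
        private final Set<String> allowed;

        InterfaceOnlyClassLoader(ClassLoader frameworkLoader, Set<String> allowed) {
            super(null); // null parent: only bootstrap (JDK) classes are inherited
            this.frameworkLoader = frameworkLoader;
            this.allowed = allowed;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            if (allowed.contains(name)) {
                // Reuse the framework's copy so both sides agree on the type.
                return frameworkLoader.loadClass(name);
            }
            throw new ClassNotFoundException(name);
        }
    }

    /** User-code loader whose parent exposes only the shared interface. */
    public static ClassLoader newIsolatedUserLoader() {
        ClassLoader framework = ClassLoaderIsolationSketch.class.getClassLoader();
        ClassLoader root = new InterfaceOnlyClassLoader(
                framework, Set.of(SharedApi.class.getName()));
        return new URLClassLoader(new URL[0], root); // user JAR URLs would go here
    }

    /** User-code loader whose parent is the full framework loader (no isolation). */
    public static ClassLoader newSharedUserLoader() {
        return new URLClassLoader(
                new URL[0], ClassLoaderIsolationSketch.class.getClassLoader());
    }

    /** Returns true if the given loader can resolve the class name. */
    public static boolean visible(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String api = SharedApi.class.getName();
        String internal = FrameworkInternal.class.getName();

        ClassLoader isolated = newIsolatedUserLoader();
        System.out.println("isolated sees SharedApi: " + visible(isolated, api));
        System.out.println("isolated sees FrameworkInternal: " + visible(isolated, internal));

        ClassLoader shared = newSharedUserLoader();
        System.out.println("shared sees FrameworkInternal: " + visible(shared, internal));
    }
}
```

With the allow-list parent, the user-code loader resolves `SharedApi` but not `FrameworkInternal`; with the full framework loader as parent, it resolves both. That is the difference in what user code can see, and therefore in which dependency versions it is forced to share with the platform.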
When `ThreadRuntime` is configured as the runtime to be used by the worker, this root classloader will not be set and will default to `Thread.currentThread().getContextClassLoader()`, which contains all of Pulsar's dependencies: https://github.com/apache/pulsar/blob/master/pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/runtime/thread/ThreadRuntimeFactory.java#L91 However, this is not the case for `ProcessRuntime` and `KubernetesRuntime`, and this is the difference between `ThreadRuntime` and the other two runtimes.

In general, for a platform that supports third-party plugins or executes user-submitted code, it is best if classpaths are isolated and transitive dependencies are not shared across platform and user code. Sharing them will cause a lot of dependency versioning issues and limit which dependency versions user-submitted code can use.

@lhotari I appreciate your effort to solve the issues with testing and to understand the Pulsar Function code. Perhaps we can find another solution here? Looking forward to working with you in the Pulsar community!

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
