lhotari commented on pull request #9638:
URL: https://github.com/apache/pulsar/pull/9638#issuecomment-785232699


   > This is true for ThreadRuntime currently but not for ProcessRuntime / 
kubernetes runtijme. ThreadRuntime is also not doing the right thing since 
Ideally user provided code is isolated on completely separate classloader that 
only share common interfaces with the framework classloader.
   
   @jerrypeng  Thanks for pointing this out. I spent a few hours investigating 
this to get a better understanding how the classloaders are configured in 
Pulsar Functions. 
   
   I found the documentation of the classloader hierarchy for Process Runtime 
in JavaInstanceMain class: 
https://github.com/apache/pulsar/blob/00463adfd5c59958479b9b3a8e26140ab07fd8c1/pulsar-functions/runtime-all/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceMain.java#L34-L47
   
   There's also logic that impacts how the classpaths get configured:
   
https://github.com/apache/pulsar/blob/64971778d88a8dab6e6e50563e9c30d5425c8cc1/pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/runtime/RuntimeUtils.java#L286-L297
   
   By default `functionInstanceClassPath` is always null/empty, for both 
Process Runtime and also for Kubernetes Runtime. [Therefore 
`pulsar.functions.instance.classpath` will always be set to 
`System.getProperty("java.class.path")` by 
default](https://github.com/apache/pulsar/blob/64971778d88a8dab6e6e50563e9c30d5425c8cc1/pulsar-functions/runtime/src/main/java/org/apache/pulsar/functions/runtime/RuntimeUtils.java#L286-L297).
   
   Because of the above logic where `pulsar.functions.instance.classpath` will 
always contain the classpath of the process that launches the processes, it 
leads to the same behavior for thread, process and kubernetes runtimes, where 
the **Pulsar provided dependencies will always get loaded from the parent 
classloader of the "user classloader"**. I don't see any actual classloading 
differences between the thread runtime, process runtime and kubernetes runtime 
when the default configuration is used (`functionInstanceClassPath` is left 
unconfigured in `conf/functions_worker.yml`) . I might be missing something, so 
I'd be happy to learn more about this area in Pulsar.
   
   > 
   > While I am all for decreasing the sizes of connectors. This will have 
other consequences.
   
   There are integration tests which [test both the process 
runtime](https://github.com/apache/pulsar/blob/master/tests/integration/src/test/java/org/apache/pulsar/tests/integration/functions/PulsarFunctionsProcessTest.java)
 and the thread runtime for [some 
connectors](https://github.com/apache/pulsar/blob/a5b3cf6fe694b57adffc7e9a69c8520f3d85b2e3/tests/integration/src/test/java/org/apache/pulsar/tests/integration/functions/PulsarFunctionsTest.java#L79-L90).
 It seems like a major gap that not all connectors are tested with the thread 
runtime and the process runtime. You are right that there will be unknown 
consequences as long as there isn't test coverage for both scenarios.
   
   As explained in the previous observations of the thread vs. process/k8s 
runtime (no actual difference in classloading with default settings), the 
changes made in this PR are safe in general.
   @jerrypeng WDYT?
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to