Note sure the example is atomic enough but in https://github.com/Talend/component-runtime/blob/master/component-runtime-manager/src/main/java/org/talend/sdk/component/runtime/manager/finder/StandaloneContainerFinder.java#L40 the "instance()" is a singleton used by all the runtime of the framework.
Deserialization happens in https://github.com/Talend/component-runtime/blob/master/component-runtime-impl/src/main/java/org/talend/sdk/component/runtime/serialization/SerializableService.java#L26 and serialization is about creating this object in a write replace. Then the runtime is switching its classloader (runner for beam) as in https://github.com/Talend/component-runtime/blob/master/component-runtime-impl/src/main/java/org/talend/sdk/component/runtime/base/LifecycleImpl.java#L60 asap and resets it once done to not break its environment for reused jvms case. If we take the case of an IO, the io would lazily creates its defined classloader from its spec and use some reference counting logic to destroy it when needed in its teardown for instance. The io then does the classloader switch in its callbacks (setup/teardown/process/bundle hooks etc). Le mer. 6 juin 2018 23:33, Lukasz Cwik <[email protected]> a écrit : > Romain, can you point to an example of a global singleton registry that > does this right for class loading (it may allow people to work towards such > an effort)? > > On Tue, Jun 5, 2018 at 10:06 PM Romain Manni-Bucau <[email protected]> > wrote: > >> It is actually very localised in runner code where beam should reset the >> classloader when the deserialization happens and then the runner owns the >> classloader all the way in evaluators. >> >> If IO change the classloader they must indeed handle it too and patch the >> deserialization too. >> >> Here again (we mentionned it multiple times in other threads) beam misses >> a global singleton registry where you can register a "service" to look it >> up based of a serialization configuration and a lifecycle allowing to close >> the classloader in all instances without hacks. >> >> >> Le mar. 5 juin 2018 23:50, Kenneth Knowles <[email protected]> a écrit : >> >>> Perhaps we can also adopt a practice of making our own APIs explicitly >>> pass a Classloader when appropriate so we only have to set this when we are >>> entering code that does not have good hygiene. It might actually be nice to >>> have a lightweight static analysis to forbid bad methods in our code. >>> >>> Kenn >>> >>> On Mon, Jun 4, 2018 at 3:43 PM Lukasz Cwik <[email protected]> wrote: >>> >>>> I totally agree, but there are so many Java APIs (including ours) that >>>> messed this up so everyone lives with the same hack. >>>> >>>> On Mon, Jun 4, 2018 at 3:41 PM Andrew Pilloud <[email protected]> >>>> wrote: >>>> >>>>> It seems like a terribly fragile way to pass arguments but my tests >>>>> pass when I wrap the JDBC path into Beam pipeline execution with that >>>>> pattern. >>>>> >>>>> Thanks! >>>>> >>>>> Andrew >>>>> >>>>> On Mon, Jun 4, 2018 at 3:20 PM Lukasz Cwik <[email protected]> wrote: >>>>> >>>>>> It is a common mistake for APIs to not include a way to specify which >>>>>> class loader to use when doing something like deserializing an instance >>>>>> of >>>>>> a class via the ObjectInputStream. This common issue also affects Apache >>>>>> Beam (SerializableCoder, PipelineOptionsFactory, ...) and the way that >>>>>> typical Java APIs have gotten around this is to use to the thread context >>>>>> class loader (TCCL) as the way to plumb this additional attribute >>>>>> through. >>>>>> So Apache Beam is meant to in all places honor the TCCL if it has been >>>>>> set >>>>>> as most Java libraries (not all) do the same hack. >>>>>> >>>>>> In most environments the TCCL is not set and we are working with a >>>>>> single class loader. It turns out that in more complicated environments >>>>>> (like when loading a JDBC driver, or JNDI, or an application server, ...) >>>>>> this usually doesn't work without each caller knowing what class loading >>>>>> context they should be in. A common work around for most scenarios is to >>>>>> always set the TCCL to the current classes class loader like so before >>>>>> invoking any APIs that do class loading so you don't propagate the TCCL >>>>>> of >>>>>> the caller along since they may have set it for some other reason: >>>>>> >>>>>> ClassLoader originalClassLoader = >>>>>> Thread.currentThread().getContextClassLoader();try { >>>>>> >>>>>> Thread.currentThread().setContextClassLoader(getClass().getClassLoader()); >>>>>> // call some API that uses reflection without taking ClassLoader >>>>>> param} finally { >>>>>> Thread.currentThread().setContextClassLoader(originalClassLoader);} >>>>>> >>>>>> >>>>>> >>>>>> On Mon, Jun 4, 2018 at 1:57 PM Andrew Pilloud <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> I'm having class loading issues that go away when I revert the >>>>>>> changes in our use of Class.forName added in >>>>>>> https://github.com/apache/beam/pull/4674. The problem I'm having is >>>>>>> that the typical JDBC GUI (SqlWorkbench/J, SQuirreL SQL) creates an >>>>>>> isolated class loader to load our library. Things work if we call >>>>>>> Class.forName with the default class loader >>>>>>> [getClass().getClassLoader() or >>>>>>> no argument] but not if we use the thread context class loader >>>>>>> [Thread.currentThread().getContextClassLoader() or >>>>>>> ReflectHelpers.findClassLoader()]. Why is using the default class loader >>>>>>> not the right thing to do? How can I fix this problem? >>>>>>> >>>>>>> See this integration test for an example: >>>>>>> https://github.com/apilloud/beam/blob/directrunner/sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/JdbcIT.java#L44 >>>>>>> >>>>>>> https://scans.gradle.com/s/iquqinhns2ymi/tests/slmg6ytuuqlus-akh5xpgshj32k >>>>>>> >>>>>>> Andrew >>>>>>> >>>>>>
