Romain, can you point to an example of a global singleton registry that does this right for class loading (it may allow people to work towards such an effort)?
On Tue, Jun 5, 2018 at 10:06 PM Romain Manni-Bucau <[email protected]> wrote: > It is actually very localised in runner code where beam should reset the > classloader when the deserialization happens and then the runner owns the > classloader all the way in evaluators. > > If IO change the classloader they must indeed handle it too and patch the > deserialization too. > > Here again (we mentionned it multiple times in other threads) beam misses > a global singleton registry where you can register a "service" to look it > up based of a serialization configuration and a lifecycle allowing to close > the classloader in all instances without hacks. > > > Le mar. 5 juin 2018 23:50, Kenneth Knowles <[email protected]> a écrit : > >> Perhaps we can also adopt a practice of making our own APIs explicitly >> pass a Classloader when appropriate so we only have to set this when we are >> entering code that does not have good hygiene. It might actually be nice to >> have a lightweight static analysis to forbid bad methods in our code. >> >> Kenn >> >> On Mon, Jun 4, 2018 at 3:43 PM Lukasz Cwik <[email protected]> wrote: >> >>> I totally agree, but there are so many Java APIs (including ours) that >>> messed this up so everyone lives with the same hack. >>> >>> On Mon, Jun 4, 2018 at 3:41 PM Andrew Pilloud <[email protected]> >>> wrote: >>> >>>> It seems like a terribly fragile way to pass arguments but my tests >>>> pass when I wrap the JDBC path into Beam pipeline execution with that >>>> pattern. >>>> >>>> Thanks! >>>> >>>> Andrew >>>> >>>> On Mon, Jun 4, 2018 at 3:20 PM Lukasz Cwik <[email protected]> wrote: >>>> >>>>> It is a common mistake for APIs to not include a way to specify which >>>>> class loader to use when doing something like deserializing an instance of >>>>> a class via the ObjectInputStream. This common issue also affects Apache >>>>> Beam (SerializableCoder, PipelineOptionsFactory, ...) and the way that >>>>> typical Java APIs have gotten around this is to use to the thread context >>>>> class loader (TCCL) as the way to plumb this additional attribute through. >>>>> So Apache Beam is meant to in all places honor the TCCL if it has been set >>>>> as most Java libraries (not all) do the same hack. >>>>> >>>>> In most environments the TCCL is not set and we are working with a >>>>> single class loader. It turns out that in more complicated environments >>>>> (like when loading a JDBC driver, or JNDI, or an application server, ...) >>>>> this usually doesn't work without each caller knowing what class loading >>>>> context they should be in. A common work around for most scenarios is to >>>>> always set the TCCL to the current classes class loader like so before >>>>> invoking any APIs that do class loading so you don't propagate the TCCL of >>>>> the caller along since they may have set it for some other reason: >>>>> >>>>> ClassLoader originalClassLoader = >>>>> Thread.currentThread().getContextClassLoader();try { >>>>> >>>>> Thread.currentThread().setContextClassLoader(getClass().getClassLoader()); >>>>> // call some API that uses reflection without taking ClassLoader >>>>> param} finally { >>>>> Thread.currentThread().setContextClassLoader(originalClassLoader);} >>>>> >>>>> >>>>> >>>>> On Mon, Jun 4, 2018 at 1:57 PM Andrew Pilloud <[email protected]> >>>>> wrote: >>>>> >>>>>> I'm having class loading issues that go away when I revert the >>>>>> changes in our use of Class.forName added in >>>>>> https://github.com/apache/beam/pull/4674. The problem I'm having is >>>>>> that the typical JDBC GUI (SqlWorkbench/J, SQuirreL SQL) creates an >>>>>> isolated class loader to load our library. Things work if we call >>>>>> Class.forName with the default class loader [getClass().getClassLoader() >>>>>> or >>>>>> no argument] but not if we use the thread context class loader >>>>>> [Thread.currentThread().getContextClassLoader() or >>>>>> ReflectHelpers.findClassLoader()]. Why is using the default class loader >>>>>> not the right thing to do? How can I fix this problem? >>>>>> >>>>>> See this integration test for an example: >>>>>> https://github.com/apilloud/beam/blob/directrunner/sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/JdbcIT.java#L44 >>>>>> >>>>>> https://scans.gradle.com/s/iquqinhns2ymi/tests/slmg6ytuuqlus-akh5xpgshj32k >>>>>> >>>>>> Andrew >>>>>> >>>>>
