Perhaps we can also adopt a practice of making our own APIs explicitly pass a Classloader when appropriate so we only have to set this when we are entering code that does not have good hygiene. It might actually be nice to have a lightweight static analysis to forbid bad methods in our code.
Kenn On Mon, Jun 4, 2018 at 3:43 PM Lukasz Cwik <lc...@google.com> wrote: > I totally agree, but there are so many Java APIs (including ours) that > messed this up so everyone lives with the same hack. > > On Mon, Jun 4, 2018 at 3:41 PM Andrew Pilloud <apill...@google.com> wrote: > >> It seems like a terribly fragile way to pass arguments but my tests pass >> when I wrap the JDBC path into Beam pipeline execution with that pattern. >> >> Thanks! >> >> Andrew >> >> On Mon, Jun 4, 2018 at 3:20 PM Lukasz Cwik <lc...@google.com> wrote: >> >>> It is a common mistake for APIs to not include a way to specify which >>> class loader to use when doing something like deserializing an instance of >>> a class via the ObjectInputStream. This common issue also affects Apache >>> Beam (SerializableCoder, PipelineOptionsFactory, ...) and the way that >>> typical Java APIs have gotten around this is to use to the thread context >>> class loader (TCCL) as the way to plumb this additional attribute through. >>> So Apache Beam is meant to in all places honor the TCCL if it has been set >>> as most Java libraries (not all) do the same hack. >>> >>> In most environments the TCCL is not set and we are working with a >>> single class loader. It turns out that in more complicated environments >>> (like when loading a JDBC driver, or JNDI, or an application server, ...) >>> this usually doesn't work without each caller knowing what class loading >>> context they should be in. A common work around for most scenarios is to >>> always set the TCCL to the current classes class loader like so before >>> invoking any APIs that do class loading so you don't propagate the TCCL of >>> the caller along since they may have set it for some other reason: >>> >>> ClassLoader originalClassLoader = >>> Thread.currentThread().getContextClassLoader();try { >>> >>> Thread.currentThread().setContextClassLoader(getClass().getClassLoader()); >>> // call some API that uses reflection without taking ClassLoader param} >>> finally { >>> Thread.currentThread().setContextClassLoader(originalClassLoader);} >>> >>> >>> >>> On Mon, Jun 4, 2018 at 1:57 PM Andrew Pilloud <apill...@google.com> >>> wrote: >>> >>>> I'm having class loading issues that go away when I revert the changes >>>> in our use of Class.forName added in >>>> https://github.com/apache/beam/pull/4674. The problem I'm having is >>>> that the typical JDBC GUI (SqlWorkbench/J, SQuirreL SQL) creates an >>>> isolated class loader to load our library. Things work if we call >>>> Class.forName with the default class loader [getClass().getClassLoader() or >>>> no argument] but not if we use the thread context class loader >>>> [Thread.currentThread().getContextClassLoader() or >>>> ReflectHelpers.findClassLoader()]. Why is using the default class loader >>>> not the right thing to do? How can I fix this problem? >>>> >>>> See this integration test for an example: >>>> https://github.com/apilloud/beam/blob/directrunner/sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/JdbcIT.java#L44 >>>> >>>> https://scans.gradle.com/s/iquqinhns2ymi/tests/slmg6ytuuqlus-akh5xpgshj32k >>>> >>>> Andrew >>>> >>>