Perhaps we can also adopt a practice of making our own APIs explicitly pass
a Classloader when appropriate so we only have to set this when we are
entering code that does not have good hygiene. It might actually be nice to
have a lightweight static analysis to forbid bad methods in our code.

Kenn

On Mon, Jun 4, 2018 at 3:43 PM Lukasz Cwik <lc...@google.com> wrote:

> I totally agree, but there are so many Java APIs (including ours) that
> messed this up so everyone lives with the same hack.
>
> On Mon, Jun 4, 2018 at 3:41 PM Andrew Pilloud <apill...@google.com> wrote:
>
>> It seems like a terribly fragile way to pass arguments but my tests pass
>> when I wrap the JDBC path into Beam pipeline execution with that pattern.
>>
>> Thanks!
>>
>> Andrew
>>
>> On Mon, Jun 4, 2018 at 3:20 PM Lukasz Cwik <lc...@google.com> wrote:
>>
>>> It is a common mistake for APIs to not include a way to specify which
>>> class loader to use when doing something like deserializing an instance of
>>> a class via the ObjectInputStream. This common issue also affects Apache
>>> Beam (SerializableCoder, PipelineOptionsFactory, ...) and the way that
>>> typical Java APIs have gotten around this is to use to the thread context
>>> class loader (TCCL) as the way to plumb this additional attribute through.
>>> So Apache Beam is meant to in all places honor the TCCL if it has been set
>>> as most Java libraries (not all) do the same hack.
>>>
>>> In most environments the TCCL is not set and we are working with a
>>> single class loader. It turns out that in more complicated environments
>>> (like when loading a JDBC driver, or JNDI, or an application server, ...)
>>> this usually doesn't work without each caller knowing what class loading
>>> context they should be in. A common work around for most scenarios is to
>>> always set the TCCL to the current classes class loader like so before
>>> invoking any APIs that do class loading so you don't propagate the TCCL of
>>> the caller along since they may have set it for some other reason:
>>>
>>> ClassLoader originalClassLoader = 
>>> Thread.currentThread().getContextClassLoader();try {
>>>     
>>> Thread.currentThread().setContextClassLoader(getClass().getClassLoader());
>>>     // call some API that uses reflection without taking ClassLoader param} 
>>> finally {
>>>     Thread.currentThread().setContextClassLoader(originalClassLoader);}
>>>
>>>
>>>
>>> On Mon, Jun 4, 2018 at 1:57 PM Andrew Pilloud <apill...@google.com>
>>> wrote:
>>>
>>>> I'm having class loading issues that go away when I revert the changes
>>>> in our use of Class.forName added in
>>>> https://github.com/apache/beam/pull/4674. The problem I'm having is
>>>> that the typical JDBC GUI (SqlWorkbench/J, SQuirreL SQL) creates an
>>>> isolated class loader to load our library. Things work if we call
>>>> Class.forName with the default class loader [getClass().getClassLoader() or
>>>> no argument] but not if we use the thread context class loader
>>>> [Thread.currentThread().getContextClassLoader() or
>>>> ReflectHelpers.findClassLoader()]. Why is using the default class loader
>>>> not the right thing to do? How can I fix this problem?
>>>>
>>>> See this integration test for an example:
>>>> https://github.com/apilloud/beam/blob/directrunner/sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/JdbcIT.java#L44
>>>>
>>>> https://scans.gradle.com/s/iquqinhns2ymi/tests/slmg6ytuuqlus-akh5xpgshj32k
>>>>
>>>> Andrew
>>>>
>>>

Reply via email to