(answered inline)

Le mer. 13 juin 2018 à 18:42, Lukasz Cwik <lc...@google.com> a écrit :

> Thanks for the example Romain.
>
> I took a look through it and was wondering whether it is only the root
> objects in the deserialization tree that need to implement
> SerializableService?
> Do lots of things need to implement SerializableService typically?
>

yes, on the one entering the (de)serialization process must handle that


> What do you do with types that you don't control (for example do you
> create wrapper types)?
>

Like beam classes? ;)

You can instrument their bytecode like in
https://github.com/Talend/component-runtime/blob/master/component-runtime-beam/src/main/java/org/talend/sdk/component/runtime/beam/transformer/BeamIOTransformer.java#L208
(sorry i dont use bytebuddy but directly asm). This is quite easy to do as
soon as you have a classloader for these classes or - if you reuse the jvm
classloader - a javaagent or equivalent.

If you don't want/can't change the bytecode then you manipulate a proxy
instead of the real instance (a wrapper):
https://github.com/Talend/component-runtime/blob/master/component-runtime-manager/src/main/java/org/talend/sdk/component/runtime/manager/asm/ProxyGenerator.java#L118
.


>
> On Wed, Jun 6, 2018 at 9:56 PM Romain Manni-Bucau <rmannibu...@gmail.com>
> wrote:
>
>> Note sure the example is atomic enough but in
>> https://github.com/Talend/component-runtime/blob/master/component-runtime-manager/src/main/java/org/talend/sdk/component/runtime/manager/finder/StandaloneContainerFinder.java#L40
>> the "instance()" is a singleton used by all the runtime of the framework.
>>
>> Deserialization happens in
>> https://github.com/Talend/component-runtime/blob/master/component-runtime-impl/src/main/java/org/talend/sdk/component/runtime/serialization/SerializableService.java#L26
>> and serialization is about creating this object in a write replace. Then
>> the runtime is switching its classloader (runner for beam) as in
>> https://github.com/Talend/component-runtime/blob/master/component-runtime-impl/src/main/java/org/talend/sdk/component/runtime/base/LifecycleImpl.java#L60
>> asap and resets it once done to not break its environment for reused jvms
>> case.
>>
>> If we take the case of an IO, the io would lazily creates its defined
>> classloader from its spec and use some reference counting logic to destroy
>> it when needed in its teardown for instance. The io then does the
>> classloader switch in its callbacks (setup/teardown/process/bundle hooks
>> etc).
>>
>>
>> Le mer. 6 juin 2018 23:33, Lukasz Cwik <lc...@google.com> a écrit :
>>
>>> Romain, can you point to an example of a global singleton registry that
>>> does this right for class loading (it may allow people to work towards such
>>> an effort)?
>>>
>>> On Tue, Jun 5, 2018 at 10:06 PM Romain Manni-Bucau <
>>> rmannibu...@gmail.com> wrote:
>>>
>>>> It is actually very localised in runner code where beam should reset
>>>> the classloader when the deserialization happens and then the runner owns
>>>> the classloader all the way in evaluators.
>>>>
>>>> If IO change the classloader they must indeed handle it too and patch
>>>> the deserialization too.
>>>>
>>>> Here again (we mentionned it multiple times in other threads) beam
>>>> misses a global singleton registry where you can register a "service" to
>>>> look it up based of a serialization configuration and a lifecycle allowing
>>>> to close the classloader in all instances without hacks.
>>>>
>>>>
>>>> Le mar. 5 juin 2018 23:50, Kenneth Knowles <k...@google.com> a écrit :
>>>>
>>>>> Perhaps we can also adopt a practice of making our own APIs explicitly
>>>>> pass a Classloader when appropriate so we only have to set this when we 
>>>>> are
>>>>> entering code that does not have good hygiene. It might actually be nice 
>>>>> to
>>>>> have a lightweight static analysis to forbid bad methods in our code.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Mon, Jun 4, 2018 at 3:43 PM Lukasz Cwik <lc...@google.com> wrote:
>>>>>
>>>>>> I totally agree, but there are so many Java APIs (including ours)
>>>>>> that messed this up so everyone lives with the same hack.
>>>>>>
>>>>>> On Mon, Jun 4, 2018 at 3:41 PM Andrew Pilloud <apill...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> It seems like a terribly fragile way to pass arguments but my tests
>>>>>>> pass when I wrap the JDBC path into Beam pipeline execution with that
>>>>>>> pattern.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Andrew
>>>>>>>
>>>>>>> On Mon, Jun 4, 2018 at 3:20 PM Lukasz Cwik <lc...@google.com> wrote:
>>>>>>>
>>>>>>>> It is a common mistake for APIs to not include a way to specify
>>>>>>>> which class loader to use when doing something like deserializing an
>>>>>>>> instance of a class via the ObjectInputStream. This common issue also
>>>>>>>> affects Apache Beam (SerializableCoder, PipelineOptionsFactory, ...) 
>>>>>>>> and
>>>>>>>> the way that typical Java APIs have gotten around this is to use to the
>>>>>>>> thread context class loader (TCCL) as the way to plumb this additional
>>>>>>>> attribute through. So Apache Beam is meant to in all places honor the 
>>>>>>>> TCCL
>>>>>>>> if it has been set as most Java libraries (not all) do the same hack.
>>>>>>>>
>>>>>>>> In most environments the TCCL is not set and we are working with a
>>>>>>>> single class loader. It turns out that in more complicated environments
>>>>>>>> (like when loading a JDBC driver, or JNDI, or an application server, 
>>>>>>>> ...)
>>>>>>>> this usually doesn't work without each caller knowing what class 
>>>>>>>> loading
>>>>>>>> context they should be in. A common work around for most scenarios is 
>>>>>>>> to
>>>>>>>> always set the TCCL to the current classes class loader like so before
>>>>>>>> invoking any APIs that do class loading so you don't propagate the 
>>>>>>>> TCCL of
>>>>>>>> the caller along since they may have set it for some other reason:
>>>>>>>>
>>>>>>>> ClassLoader originalClassLoader = 
>>>>>>>> Thread.currentThread().getContextClassLoader();try {
>>>>>>>>     
>>>>>>>> Thread.currentThread().setContextClassLoader(getClass().getClassLoader());
>>>>>>>>     // call some API that uses reflection without taking ClassLoader 
>>>>>>>> param} finally {
>>>>>>>>     Thread.currentThread().setContextClassLoader(originalClassLoader);}
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jun 4, 2018 at 1:57 PM Andrew Pilloud <apill...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I'm having class loading issues that go away when I revert the
>>>>>>>>> changes in our use of Class.forName added in
>>>>>>>>> https://github.com/apache/beam/pull/4674. The problem I'm having
>>>>>>>>> is that the typical JDBC GUI (SqlWorkbench/J, SQuirreL SQL) creates an
>>>>>>>>> isolated class loader to load our library. Things work if we call
>>>>>>>>> Class.forName with the default class loader 
>>>>>>>>> [getClass().getClassLoader() or
>>>>>>>>> no argument] but not if we use the thread context class loader
>>>>>>>>> [Thread.currentThread().getContextClassLoader() or
>>>>>>>>> ReflectHelpers.findClassLoader()]. Why is using the default class 
>>>>>>>>> loader
>>>>>>>>> not the right thing to do? How can I fix this problem?
>>>>>>>>>
>>>>>>>>> See this integration test for an example:
>>>>>>>>> https://github.com/apilloud/beam/blob/directrunner/sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/JdbcIT.java#L44
>>>>>>>>>
>>>>>>>>> https://scans.gradle.com/s/iquqinhns2ymi/tests/slmg6ytuuqlus-akh5xpgshj32k
>>>>>>>>>
>>>>>>>>> Andrew
>>>>>>>>>
>>>>>>>>

Reply via email to