Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-19 Thread Jan Lukavský
c.

Jan

On 9/3/19 3:38 PM, Till Rohrmann wrote:

I see the problem Jan. What about the following proposal: Instead

of

using
the LocalEnvironment for local tests you always use the
RemoteEnvironment
but when testing it locally you spin up a MiniCluster in the same
process
and initialize the RemoteEnvironment with
`MiniCluster#getRestAddress`.
That way you would always submit a jar with the generated classes
and,
hence, not having to set the context class loader.

The contract of the LocalEnvironment is indeed that all classes it

is

supposed t execute must be present when being started.

Cheers,
Till

On Tue, Sep 3, 2019 at 2:27 PM guaishushu1...@163.com <
guaishushu1...@163.com> wrote:



guaishushu1...@163.com

From: guaishushu1...@163.com
Date: 2019-09-03 20:25
To: dev
Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager
is not
using context classloader




guaishushu1...@163.com
From: guaishushu1...@163.com
Date: 2019-09-03 20:23
To: dev
Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager
is not
using context classloader
guaishushu1...@163.com
From: Jan Lukavský
Date: 2019-09-03 20:17
To: dev
Subject: Re: ClassLoader created by BlobLibraryCacheManager is not
using
context classloader
Hi Till,
the use-case is pretty much simple - I have a REPL shell in

groovy,

which generates classes at runtime. The actual hierarchy is
therefore
system class loader -> application classloader -> repl classloader
(GroovyClassLoader actually)
now, when a terminal (sink) operation in the shell occurs, I'm
able to
build a jar, which I can submit to remote cluster (in distributed
case).
But - during testing -  I run the code using local flink. There
is no
(legal) way of adding this new runtime generated jar to local
flink. As
I said, I have a hackish solution which works on JDK <= 8,
because it
uses reflection to call addURL on the application classloader (and
thefore "pretends", that the generated jar was there all the time
from
the JVM startup). This breaks on JDK >= 9. It might be possible
to work
around this somehow, but I think that the reason why
LocalEnvironment is
not having a way to add jars (as in case of RemoteEnvironment) is
that
is assumes, that you actually have all of the on classpath when
using
local runner. I think that this implies that it either has to use
context classloader (to be able to work on top of any
classloading user
might have), or is wrong and would need be fixed, so that
LocalEnvironment would accept files to "stage" - which would mean
adding
them to a class loader (probably URLClassLoader with the

application

class loader as parent).
Or, would you see any other option?
Jan
On 9/3/19 2:00 PM, Till Rohrmann wrote:

Hi Jan,

I've talked with Aljoscha and Stephan offline and we concluded
that we
would like to avoid the usage of context class loaders if

possible.

The
reason for this is that using the context class loader can easily
mess up
an otherwise clear class loader hierarchy which makes it hard to
reason
about and to debug class loader issues.

Given this, I think it would help to better understand the exact
problem
you are trying to solve by using the context class loader.
Usually the
usage of the context class loader points towards an API

deficiency

which

we

might be able to address differently.

Cheers,
Till

On Mon, Sep 2, 2019 at 11:32 AM Aljoscha Krettek

I’m not saying we can’t change that code to use the context

class

loader.

I’m just not sure whether this might break other things.

Best,
Aljoscha


On 2. Sep 2019, at 11:24, Jan Lukavský 

wrote:

Essentially, the class loader of Flink should be present in
parent

hierarchy of context class loader. If FlinkUserCodeClassLoader
doesn't

use

context class loader, then it is actually impossible to use a
hierarchy
like this:

 system class loader -> application class loader ->
user-defined class

loader (defines some UDFs to be used in flink program)

Flink now uses the application class loader, and so the UDFs
fail to

deserialize on local flink, because the user-defined class
loader is
bypassed. Moreover, there is no way to add additional classpath
elements
for LocalEnvironment (as opposed to RemoteEnvironment). I'm able
to hack
this by calling addURL method on the application class loader
(which is
terribly hackish), but that works only on JDK <= 8. No sensible

workaround

is available for JDK >= 9.

Alternative solution would be to enable adding jars to class
loader

when

using LocalEnvironment, but that looks a little odd.

Jan

On 9/2/19 11:02 AM, Aljoscha Krettek wrote:

Hi,

I actually don’t know whether that change would be ok.

FlinkUserCodeClassLoader has taken
FlinkUserCodeClassLoader.class.getClassLoader() as the parent

ClassLoader

before my change. See:


https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-19 Thread Stephan Ewen
gt; >>> to execute as if it had all classes on its class loader, then it
> > >>> currently breaks this contract. :-)
> > >>>
> > >>> Jan
> > >>>
> > >>> On 9/3/19 3:45 PM, Jan Lukavský wrote:
> > >>>> Hi Till,
> > >>>>
> > >>>> hmm, that sounds it might work. I would have to incorporate this
> > >>>> (either as default, or on demand) into Apache Beam. Would you see
> any
> > >>>> disadvantages of this approach? Would you suggest to make this
> default
> > >>>> behavior for local beam FlinkRunner? I can introduce a configuration
> > >>>> option to turn this behavior on, but that would bring additional
> > >>>> maintenance burden, etc., etc.
> > >>>>
> > >>>> Jan
> > >>>>
> > >>>> On 9/3/19 3:38 PM, Till Rohrmann wrote:
> > >>>>> I see the problem Jan. What about the following proposal: Instead
> of
> > >>>>> using
> > >>>>> the LocalEnvironment for local tests you always use the
> > >>>>> RemoteEnvironment
> > >>>>> but when testing it locally you spin up a MiniCluster in the same
> > >>>>> process
> > >>>>> and initialize the RemoteEnvironment with
> > >>>>> `MiniCluster#getRestAddress`.
> > >>>>> That way you would always submit a jar with the generated classes
> > >>>>> and,
> > >>>>> hence, not having to set the context class loader.
> > >>>>>
> > >>>>> The contract of the LocalEnvironment is indeed that all classes it
> is
> > >>>>> supposed t execute must be present when being started.
> > >>>>>
> > >>>>> Cheers,
> > >>>>> Till
> > >>>>>
> > >>>>> On Tue, Sep 3, 2019 at 2:27 PM guaishushu1...@163.com <
> > >>>>> guaishushu1...@163.com> wrote:
> > >>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> guaishushu1...@163.com
> > >>>>>>
> > >>>>>> From: guaishushu1...@163.com
> > >>>>>> Date: 2019-09-03 20:25
> > >>>>>> To: dev
> > >>>>>> Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager
> > >>>>>> is not
> > >>>>>> using context classloader
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> guaishushu1...@163.com
> > >>>>>> From: guaishushu1...@163.com
> > >>>>>> Date: 2019-09-03 20:23
> > >>>>>> To: dev
> > >>>>>> Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager
> > >>>>>> is not
> > >>>>>> using context classloader
> > >>>>>> guaishushu1...@163.com
> > >>>>>> From: Jan Lukavský
> > >>>>>> Date: 2019-09-03 20:17
> > >>>>>> To: dev
> > >>>>>> Subject: Re: ClassLoader created by BlobLibraryCacheManager is not
> > >>>>>> using
> > >>>>>> context classloader
> > >>>>>> Hi Till,
> > >>>>>> the use-case is pretty much simple - I have a REPL shell in
> groovy,
> > >>>>>> which generates classes at runtime. The actual hierarchy is
> > >>>>>> therefore
> > >>>>>> system class loader -> application classloader -> repl classloader
> > >>>>>> (GroovyClassLoader actually)
> > >>>>>> now, when a terminal (sink) operation in the shell occurs, I'm
> > >>>>>> able to
> > >>>>>> build a jar, which I can submit to remote cluster (in distributed
> > >>>>>> case).
> > >>>>>> But - during testing -  I run the code using local flink. There
> > >>>>>> is no
> > >>>>>> (legal) way of adding this new runtime generated jar to local
> > >>>>>> flink. As
> > >>>>>> I said, I have a hackish solution which works on JDK <= 8,
> >

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-10 Thread Till Rohrmann
;> maintenance burden, etc., etc.
> >>>>
> >>>> Jan
> >>>>
> >>>> On 9/3/19 3:38 PM, Till Rohrmann wrote:
> >>>>> I see the problem Jan. What about the following proposal: Instead of
> >>>>> using
> >>>>> the LocalEnvironment for local tests you always use the
> >>>>> RemoteEnvironment
> >>>>> but when testing it locally you spin up a MiniCluster in the same
> >>>>> process
> >>>>> and initialize the RemoteEnvironment with
> >>>>> `MiniCluster#getRestAddress`.
> >>>>> That way you would always submit a jar with the generated classes
> >>>>> and,
> >>>>> hence, not having to set the context class loader.
> >>>>>
> >>>>> The contract of the LocalEnvironment is indeed that all classes it is
> >>>>> supposed t execute must be present when being started.
> >>>>>
> >>>>> Cheers,
> >>>>> Till
> >>>>>
> >>>>> On Tue, Sep 3, 2019 at 2:27 PM guaishushu1...@163.com <
> >>>>> guaishushu1...@163.com> wrote:
> >>>>>
> >>>>>>
> >>>>>>
> >>>>>> guaishushu1...@163.com
> >>>>>>
> >>>>>> From: guaishushu1...@163.com
> >>>>>> Date: 2019-09-03 20:25
> >>>>>> To: dev
> >>>>>> Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager
> >>>>>> is not
> >>>>>> using context classloader
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> guaishushu1...@163.com
> >>>>>> From: guaishushu1...@163.com
> >>>>>> Date: 2019-09-03 20:23
> >>>>>> To: dev
> >>>>>> Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager
> >>>>>> is not
> >>>>>> using context classloader
> >>>>>> guaishushu1...@163.com
> >>>>>> From: Jan Lukavský
> >>>>>> Date: 2019-09-03 20:17
> >>>>>> To: dev
> >>>>>> Subject: Re: ClassLoader created by BlobLibraryCacheManager is not
> >>>>>> using
> >>>>>> context classloader
> >>>>>> Hi Till,
> >>>>>> the use-case is pretty much simple - I have a REPL shell in groovy,
> >>>>>> which generates classes at runtime. The actual hierarchy is
> >>>>>> therefore
> >>>>>> system class loader -> application classloader -> repl classloader
> >>>>>> (GroovyClassLoader actually)
> >>>>>> now, when a terminal (sink) operation in the shell occurs, I'm
> >>>>>> able to
> >>>>>> build a jar, which I can submit to remote cluster (in distributed
> >>>>>> case).
> >>>>>> But - during testing -  I run the code using local flink. There
> >>>>>> is no
> >>>>>> (legal) way of adding this new runtime generated jar to local
> >>>>>> flink. As
> >>>>>> I said, I have a hackish solution which works on JDK <= 8,
> >>>>>> because it
> >>>>>> uses reflection to call addURL on the application classloader (and
> >>>>>> thefore "pretends", that the generated jar was there all the time
> >>>>>> from
> >>>>>> the JVM startup). This breaks on JDK >= 9. It might be possible
> >>>>>> to work
> >>>>>> around this somehow, but I think that the reason why
> >>>>>> LocalEnvironment is
> >>>>>> not having a way to add jars (as in case of RemoteEnvironment) is
> >>>>>> that
> >>>>>> is assumes, that you actually have all of the on classpath when
> >>>>>> using
> >>>>>> local runner. I think that this implies that it either has to use
> >>>>>> context classloader (to be able to work on top of any
> >>>>>> classloading user
> >>>>>> might have), or is wrong and would need be fixed, so that
> >>>>>> LocalEnvironment would accept files to "stage" - which would mean

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-05 Thread Jan Lukavský

Hi Till and Aljoscha,

I was investigating the other options, but unfortunately all of them 
look a little complicated (although possible, of course). But before 
going into a more complicated solutions, I'd like to know what issues do 
you actually see with using the context class loader. I can think of one 
difficulty - if (for whatever reason), the context class loader doesn't 
contain (in itself or as a parent) class loader that loaded flink core 
classes, that would probably cause troubles. So, what about a solution 
that we take as parent class loader of FlinkUserCodeClassLoaders a class 
loader that is:


 a) context class loader of current thread, if it either is actually 
class loader of flink core classes, or if it contains this class loader 
in its parent hierarchy, or


 b) class loader of flink core classes

That way, class loader of flink core classes would always be in parent 
hierarchy of FlinkUserCodeClassLoaders. Would that solve the issues you 
see? It works for me.


Jan

On 9/3/19 4:52 PM, Jan Lukavský wrote:

Answers inline.

On 9/3/19 4:01 PM, Till Rohrmann wrote:

How so? Does your REPL add the generated classes to the system class
loader? I assume the system class loader is used to load the Flink 
classes.
No, it does not. It cannot on JDK >= 9 (or would have to hack into 
jdk.internal.loader.ClassLoaders, which I don't want to :)). It just 
creates another class loader, and is able to create a jar from 
generated files. The jar is used for remote execution.


Ideally, what you would like to have is the option to provide the parent
class loader which is used load user code to the LocalEnvironment. 
This one

could then be forwarded to the TaskExecutor where it is used to generate
the user code class loader. But this is a bigger effort.
I'm not sure how this differs from using context classloader? Maybe 
there is subtle difference in that this is a little more explicit. On 
the other hand, users normally do not modify class loaders, so the 
practical impact is IMHO negligible. But maybe this opens another 
possibility - we probably could add optional ClassLoader parameter to 
LocalEnvironment, with default value of 
FlinkRunner.class.getClassLoader()? That might be a good compromise.


The downside to this approach is that it requires you to create a jar 
file

and to submit it via a REST call. The upside is that it is closer to the
production setting.


Yes, a REPL has to do that anyway to support distributed computing, so 
this is not an issue.


Jan



Cheers,
Till

On Tue, Sep 3, 2019 at 3:47 PM Jan Lukavský  wrote:


On the other hand, if you say, that the contract of LocalEnvironment is
to execute as if it had all classes on its class loader, then it
currently breaks this contract. :-)

Jan

On 9/3/19 3:45 PM, Jan Lukavský wrote:

Hi Till,

hmm, that sounds it might work. I would have to incorporate this
(either as default, or on demand) into Apache Beam. Would you see any
disadvantages of this approach? Would you suggest to make this default
behavior for local beam FlinkRunner? I can introduce a configuration
option to turn this behavior on, but that would bring additional
maintenance burden, etc., etc.

Jan

On 9/3/19 3:38 PM, Till Rohrmann wrote:

I see the problem Jan. What about the following proposal: Instead of
using
the LocalEnvironment for local tests you always use the
RemoteEnvironment
but when testing it locally you spin up a MiniCluster in the same
process
and initialize the RemoteEnvironment with 
`MiniCluster#getRestAddress`.
That way you would always submit a jar with the generated classes 
and,

hence, not having to set the context class loader.

The contract of the LocalEnvironment is indeed that all classes it is
supposed t execute must be present when being started.

Cheers,
Till

On Tue, Sep 3, 2019 at 2:27 PM guaishushu1...@163.com <
guaishushu1...@163.com> wrote:




guaishushu1...@163.com

From: guaishushu1...@163.com
Date: 2019-09-03 20:25
To: dev
Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager 
is not

using context classloader




guaishushu1...@163.com
From: guaishushu1...@163.com
Date: 2019-09-03 20:23
To: dev
Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager 
is not

using context classloader
guaishushu1...@163.com
From: Jan Lukavský
Date: 2019-09-03 20:17
To: dev
Subject: Re: ClassLoader created by BlobLibraryCacheManager is not
using
context classloader
Hi Till,
the use-case is pretty much simple - I have a REPL shell in groovy,
which generates classes at runtime. The actual hierarchy is 
therefore

system class loader -> application classloader -> repl classloader
(GroovyClassLoader actually)
now, when a terminal (sink) operation in the shell occurs, I'm 
able to

build a jar, which I can submit to remote cluster (in distributed
case).
But - during testing -  I run the code using local flink. There 
is no
(legal) way of adding this new runtime generated jar to local 
fli

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread Jan Lukavský

Answers inline.

On 9/3/19 4:01 PM, Till Rohrmann wrote:

How so? Does your REPL add the generated classes to the system class
loader? I assume the system class loader is used to load the Flink classes.
No, it does not. It cannot on JDK >= 9 (or would have to hack into 
jdk.internal.loader.ClassLoaders, which I don't want to :)). It just 
creates another class loader, and is able to create a jar from generated 
files. The jar is used for remote execution.


Ideally, what you would like to have is the option to provide the parent
class loader which is used load user code to the LocalEnvironment. This one
could then be forwarded to the TaskExecutor where it is used to generate
the user code class loader. But this is a bigger effort.
I'm not sure how this differs from using context classloader? Maybe 
there is subtle difference in that this is a little more explicit. On 
the other hand, users normally do not modify class loaders, so the 
practical impact is IMHO negligible. But maybe this opens another 
possibility - we probably could add optional ClassLoader parameter to 
LocalEnvironment, with default value of 
FlinkRunner.class.getClassLoader()? That might be a good compromise.


The downside to this approach is that it requires you to create a jar file
and to submit it via a REST call. The upside is that it is closer to the
production setting.


Yes, a REPL has to do that anyway to support distributed computing, so 
this is not an issue.


Jan



Cheers,
Till

On Tue, Sep 3, 2019 at 3:47 PM Jan Lukavský  wrote:


On the other hand, if you say, that the contract of LocalEnvironment is
to execute as if it had all classes on its class loader, then it
currently breaks this contract. :-)

Jan

On 9/3/19 3:45 PM, Jan Lukavský wrote:

Hi Till,

hmm, that sounds it might work. I would have to incorporate this
(either as default, or on demand) into Apache Beam. Would you see any
disadvantages of this approach? Would you suggest to make this default
behavior for local beam FlinkRunner? I can introduce a configuration
option to turn this behavior on, but that would bring additional
maintenance burden, etc., etc.

Jan

On 9/3/19 3:38 PM, Till Rohrmann wrote:

I see the problem Jan. What about the following proposal: Instead of
using
the LocalEnvironment for local tests you always use the
RemoteEnvironment
but when testing it locally you spin up a MiniCluster in the same
process
and initialize the RemoteEnvironment with `MiniCluster#getRestAddress`.
That way you would always submit a jar with the generated classes and,
hence, not having to set the context class loader.

The contract of the LocalEnvironment is indeed that all classes it is
supposed t execute must be present when being started.

Cheers,
Till

On Tue, Sep 3, 2019 at 2:27 PM guaishushu1...@163.com <
guaishushu1...@163.com> wrote:




guaishushu1...@163.com

From: guaishushu1...@163.com
Date: 2019-09-03 20:25
To: dev
Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not
using context classloader




guaishushu1...@163.com
From: guaishushu1...@163.com
Date: 2019-09-03 20:23
To: dev
Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not
using context classloader
guaishushu1...@163.com
From: Jan Lukavský
Date: 2019-09-03 20:17
To: dev
Subject: Re: ClassLoader created by BlobLibraryCacheManager is not
using
context classloader
Hi Till,
the use-case is pretty much simple - I have a REPL shell in groovy,
which generates classes at runtime. The actual hierarchy is therefore
system class loader -> application classloader -> repl classloader
(GroovyClassLoader actually)
now, when a terminal (sink) operation in the shell occurs, I'm able to
build a jar, which I can submit to remote cluster (in distributed
case).
But - during testing -  I run the code using local flink. There is no
(legal) way of adding this new runtime generated jar to local flink. As
I said, I have a hackish solution which works on JDK <= 8, because it
uses reflection to call addURL on the application classloader (and
thefore "pretends", that the generated jar was there all the time from
the JVM startup). This breaks on JDK >= 9. It might be possible to work
around this somehow, but I think that the reason why
LocalEnvironment is
not having a way to add jars (as in case of RemoteEnvironment) is that
is assumes, that you actually have all of the on classpath when using
local runner. I think that this implies that it either has to use
context classloader (to be able to work on top of any classloading user
might have), or is wrong and would need be fixed, so that
LocalEnvironment would accept files to "stage" - which would mean
adding
them to a class loader (probably URLClassLoader with the application
class loader as parent).
Or, would you see any other option?
Jan
On 9/3/19 2:00 PM, Till Rohrmann wrote:

Hi Jan,

I've talked with Aljoscha and Stephan offline and we concluded that we
would like to avoid t

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread Till Rohrmann
How so? Does your REPL add the generated classes to the system class
loader? I assume the system class loader is used to load the Flink classes.

Ideally, what you would like to have is the option to provide the parent
class loader which is used load user code to the LocalEnvironment. This one
could then be forwarded to the TaskExecutor where it is used to generate
the user code class loader. But this is a bigger effort.

The downside to this approach is that it requires you to create a jar file
and to submit it via a REST call. The upside is that it is closer to the
production setting.

Cheers,
Till

On Tue, Sep 3, 2019 at 3:47 PM Jan Lukavský  wrote:

> On the other hand, if you say, that the contract of LocalEnvironment is
> to execute as if it had all classes on its class loader, then it
> currently breaks this contract. :-)
>
> Jan
>
> On 9/3/19 3:45 PM, Jan Lukavský wrote:
> > Hi Till,
> >
> > hmm, that sounds it might work. I would have to incorporate this
> > (either as default, or on demand) into Apache Beam. Would you see any
> > disadvantages of this approach? Would you suggest to make this default
> > behavior for local beam FlinkRunner? I can introduce a configuration
> > option to turn this behavior on, but that would bring additional
> > maintenance burden, etc., etc.
> >
> > Jan
> >
> > On 9/3/19 3:38 PM, Till Rohrmann wrote:
> >> I see the problem Jan. What about the following proposal: Instead of
> >> using
> >> the LocalEnvironment for local tests you always use the
> >> RemoteEnvironment
> >> but when testing it locally you spin up a MiniCluster in the same
> >> process
> >> and initialize the RemoteEnvironment with `MiniCluster#getRestAddress`.
> >> That way you would always submit a jar with the generated classes and,
> >> hence, not having to set the context class loader.
> >>
> >> The contract of the LocalEnvironment is indeed that all classes it is
> >> supposed t execute must be present when being started.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Tue, Sep 3, 2019 at 2:27 PM guaishushu1...@163.com <
> >> guaishushu1...@163.com> wrote:
> >>
> >>>
> >>>
> >>>
> >>> guaishushu1...@163.com
> >>>
> >>> From: guaishushu1...@163.com
> >>> Date: 2019-09-03 20:25
> >>> To: dev
> >>> Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not
> >>> using context classloader
> >>>
> >>>
> >>>
> >>>
> >>> guaishushu1...@163.com
> >>> From: guaishushu1...@163.com
> >>> Date: 2019-09-03 20:23
> >>> To: dev
> >>> Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not
> >>> using context classloader
> >>> guaishushu1...@163.com
> >>> From: Jan Lukavský
> >>> Date: 2019-09-03 20:17
> >>> To: dev
> >>> Subject: Re: ClassLoader created by BlobLibraryCacheManager is not
> >>> using
> >>> context classloader
> >>> Hi Till,
> >>> the use-case is pretty much simple - I have a REPL shell in groovy,
> >>> which generates classes at runtime. The actual hierarchy is therefore
> >>> system class loader -> application classloader -> repl classloader
> >>> (GroovyClassLoader actually)
> >>> now, when a terminal (sink) operation in the shell occurs, I'm able to
> >>> build a jar, which I can submit to remote cluster (in distributed
> >>> case).
> >>> But - during testing -  I run the code using local flink. There is no
> >>> (legal) way of adding this new runtime generated jar to local flink. As
> >>> I said, I have a hackish solution which works on JDK <= 8, because it
> >>> uses reflection to call addURL on the application classloader (and
> >>> thefore "pretends", that the generated jar was there all the time from
> >>> the JVM startup). This breaks on JDK >= 9. It might be possible to work
> >>> around this somehow, but I think that the reason why
> >>> LocalEnvironment is
> >>> not having a way to add jars (as in case of RemoteEnvironment) is that
> >>> is assumes, that you actually have all of the on classpath when using
> >>> local runner. I think that this implies that it either has to use
> >>> context classloader (to be able to work on top of any classloading user
> >>> might have), or is wrong and would need be fixed

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread Jan Lukavský
On the other hand, if you say, that the contract of LocalEnvironment is 
to execute as if it had all classes on its class loader, then it 
currently breaks this contract. :-)


Jan

On 9/3/19 3:45 PM, Jan Lukavský wrote:

Hi Till,

hmm, that sounds it might work. I would have to incorporate this 
(either as default, or on demand) into Apache Beam. Would you see any 
disadvantages of this approach? Would you suggest to make this default 
behavior for local beam FlinkRunner? I can introduce a configuration 
option to turn this behavior on, but that would bring additional 
maintenance burden, etc., etc.


Jan

On 9/3/19 3:38 PM, Till Rohrmann wrote:
I see the problem Jan. What about the following proposal: Instead of 
using
the LocalEnvironment for local tests you always use the 
RemoteEnvironment
but when testing it locally you spin up a MiniCluster in the same 
process

and initialize the RemoteEnvironment with `MiniCluster#getRestAddress`.
That way you would always submit a jar with the generated classes and,
hence, not having to set the context class loader.

The contract of the LocalEnvironment is indeed that all classes it is
supposed t execute must be present when being started.

Cheers,
Till

On Tue, Sep 3, 2019 at 2:27 PM guaishushu1...@163.com <
guaishushu1...@163.com> wrote:





guaishushu1...@163.com

From: guaishushu1...@163.com
Date: 2019-09-03 20:25
To: dev
Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not
using context classloader




guaishushu1...@163.com
From: guaishushu1...@163.com
Date: 2019-09-03 20:23
To: dev
Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not
using context classloader
guaishushu1...@163.com
From: Jan Lukavský
Date: 2019-09-03 20:17
To: dev
Subject: Re: ClassLoader created by BlobLibraryCacheManager is not 
using

context classloader
Hi Till,
the use-case is pretty much simple - I have a REPL shell in groovy,
which generates classes at runtime. The actual hierarchy is therefore
system class loader -> application classloader -> repl classloader
(GroovyClassLoader actually)
now, when a terminal (sink) operation in the shell occurs, I'm able to
build a jar, which I can submit to remote cluster (in distributed 
case).

But - during testing -  I run the code using local flink. There is no
(legal) way of adding this new runtime generated jar to local flink. As
I said, I have a hackish solution which works on JDK <= 8, because it
uses reflection to call addURL on the application classloader (and
thefore "pretends", that the generated jar was there all the time from
the JVM startup). This breaks on JDK >= 9. It might be possible to work
around this somehow, but I think that the reason why 
LocalEnvironment is

not having a way to add jars (as in case of RemoteEnvironment) is that
is assumes, that you actually have all of the on classpath when using
local runner. I think that this implies that it either has to use
context classloader (to be able to work on top of any classloading user
might have), or is wrong and would need be fixed, so that
LocalEnvironment would accept files to "stage" - which would mean 
adding

them to a class loader (probably URLClassLoader with the application
class loader as parent).
Or, would you see any other option?
Jan
On 9/3/19 2:00 PM, Till Rohrmann wrote:

Hi Jan,

I've talked with Aljoscha and Stephan offline and we concluded that we
would like to avoid the usage of context class loaders if possible. 
The
reason for this is that using the context class loader can easily 
mess up
an otherwise clear class loader hierarchy which makes it hard to 
reason

about and to debug class loader issues.

Given this, I think it would help to better understand the exact 
problem

you are trying to solve by using the context class loader. Usually the
usage of the context class loader points towards an API deficiency 
which

we

might be able to address differently.

Cheers,
Till

On Mon, Sep 2, 2019 at 11:32 AM Aljoscha Krettek 
wrote:


I’m not saying we can’t change that code to use the context class

loader.

I’m just not sure whether this might break other things.

Best,
Aljoscha


On 2. Sep 2019, at 11:24, Jan Lukavský  wrote:

Essentially, the class loader of Flink should be present in parent
hierarchy of context class loader. If FlinkUserCodeClassLoader 
doesn't

use
context class loader, then it is actually impossible to use a 
hierarchy

like this:
   system class loader -> application class loader -> 
user-defined class

loader (defines some UDFs to be used in flink program)

Flink now uses the application class loader, and so the UDFs fail to

deserialize on local flink, because the user-defined class loader is
bypassed. Moreover, there is no way to add additional classpath 
elements
for LocalEnvironment (as opposed to RemoteEnvironment). I'm able 
to hack
this by calling addURL method on the application class loader 
(which is

terribly hackis

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread Jan Lukavský

Hi Till,

hmm, that sounds it might work. I would have to incorporate this (either 
as default, or on demand) into Apache Beam. Would you see any 
disadvantages of this approach? Would you suggest to make this default 
behavior for local beam FlinkRunner? I can introduce a configuration 
option to turn this behavior on, but that would bring additional 
maintenance burden, etc., etc.


Jan

On 9/3/19 3:38 PM, Till Rohrmann wrote:

I see the problem Jan. What about the following proposal: Instead of using
the LocalEnvironment for local tests you always use the RemoteEnvironment
but when testing it locally you spin up a MiniCluster in the same process
and initialize the RemoteEnvironment with `MiniCluster#getRestAddress`.
That way you would always submit a jar with the generated classes and,
hence, not having to set the context class loader.

The contract of the LocalEnvironment is indeed that all classes it is
supposed t execute must be present when being started.

Cheers,
Till

On Tue, Sep 3, 2019 at 2:27 PM guaishushu1...@163.com <
guaishushu1...@163.com> wrote:





guaishushu1...@163.com

From: guaishushu1...@163.com
Date: 2019-09-03 20:25
To: dev
Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not
using context classloader




guaishushu1...@163.com
From: guaishushu1...@163.com
Date: 2019-09-03 20:23
To: dev
Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not
using context classloader
guaishushu1...@163.com
From: Jan Lukavský
Date: 2019-09-03 20:17
To: dev
Subject: Re: ClassLoader created by BlobLibraryCacheManager is not using
context classloader
Hi Till,
the use-case is pretty much simple - I have a REPL shell in groovy,
which generates classes at runtime. The actual hierarchy is therefore
system class loader -> application classloader -> repl classloader
(GroovyClassLoader actually)
now, when a terminal (sink) operation in the shell occurs, I'm able to
build a jar, which I can submit to remote cluster (in distributed case).
But - during testing -  I run the code using local flink. There is no
(legal) way of adding this new runtime generated jar to local flink. As
I said, I have a hackish solution which works on JDK <= 8, because it
uses reflection to call addURL on the application classloader (and
thefore "pretends", that the generated jar was there all the time from
the JVM startup). This breaks on JDK >= 9. It might be possible to work
around this somehow, but I think that the reason why LocalEnvironment is
not having a way to add jars (as in case of RemoteEnvironment) is that
is assumes, that you actually have all of the on classpath when using
local runner. I think that this implies that it either has to use
context classloader (to be able to work on top of any classloading user
might have), or is wrong and would need be fixed, so that
LocalEnvironment would accept files to "stage" - which would mean adding
them to a class loader (probably URLClassLoader with the application
class loader as parent).
Or, would you see any other option?
Jan
On 9/3/19 2:00 PM, Till Rohrmann wrote:

Hi Jan,

I've talked with Aljoscha and Stephan offline and we concluded that we
would like to avoid the usage of context class loaders if possible. The
reason for this is that using the context class loader can easily mess up
an otherwise clear class loader hierarchy which makes it hard to reason
about and to debug class loader issues.

Given this, I think it would help to better understand the exact problem
you are trying to solve by using the context class loader. Usually the
usage of the context class loader points towards an API deficiency which

we

might be able to address differently.

Cheers,
Till

On Mon, Sep 2, 2019 at 11:32 AM Aljoscha Krettek 
wrote:


I’m not saying we can’t change that code to use the context class

loader.

I’m just not sure whether this might break other things.

Best,
Aljoscha


On 2. Sep 2019, at 11:24, Jan Lukavský  wrote:

Essentially, the class loader of Flink should be present in parent

hierarchy of context class loader. If FlinkUserCodeClassLoader doesn't

use

context class loader, then it is actually impossible to use a hierarchy
like this:

   system class loader -> application class loader -> user-defined class

loader (defines some UDFs to be used in flink program)

Flink now uses the application class loader, and so the UDFs fail to

deserialize on local flink, because the user-defined class loader is
bypassed. Moreover, there is no way to add additional classpath elements
for LocalEnvironment (as opposed to RemoteEnvironment). I'm able to hack
this by calling addURL method on the application class loader (which is
terribly hackish), but that works only on JDK <= 8. No sensible

workaround

is available for JDK >= 9.

Alternative solution would be to enable adding jars to class loader

when

using LocalEnvironment, but that looks a little odd.

Jan

On 9/2/19 11:0

Re: Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread Till Rohrmann
I see the problem Jan. What about the following proposal: Instead of using
the LocalEnvironment for local tests you always use the RemoteEnvironment
but when testing it locally you spin up a MiniCluster in the same process
and initialize the RemoteEnvironment with `MiniCluster#getRestAddress`.
That way you would always submit a jar with the generated classes and,
hence, not having to set the context class loader.

The contract of the LocalEnvironment is indeed that all classes it is
supposed t execute must be present when being started.

Cheers,
Till

On Tue, Sep 3, 2019 at 2:27 PM guaishushu1...@163.com <
guaishushu1...@163.com> wrote:

>
>
>
>
> guaishushu1...@163.com
>
> From: guaishushu1...@163.com
> Date: 2019-09-03 20:25
> To: dev
> Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not
> using context classloader
>
>
>
>
> guaishushu1...@163.com
> From: guaishushu1...@163.com
> Date: 2019-09-03 20:23
> To: dev
> Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not
> using context classloader
> guaishushu1...@163.com
> From: Jan Lukavský
> Date: 2019-09-03 20:17
> To: dev
> Subject: Re: ClassLoader created by BlobLibraryCacheManager is not using
> context classloader
> Hi Till,
> the use-case is pretty much simple - I have a REPL shell in groovy,
> which generates classes at runtime. The actual hierarchy is therefore
> system class loader -> application classloader -> repl classloader
> (GroovyClassLoader actually)
> now, when a terminal (sink) operation in the shell occurs, I'm able to
> build a jar, which I can submit to remote cluster (in distributed case).
> But - during testing -  I run the code using local flink. There is no
> (legal) way of adding this new runtime generated jar to local flink. As
> I said, I have a hackish solution which works on JDK <= 8, because it
> uses reflection to call addURL on the application classloader (and
> thefore "pretends", that the generated jar was there all the time from
> the JVM startup). This breaks on JDK >= 9. It might be possible to work
> around this somehow, but I think that the reason why LocalEnvironment is
> not having a way to add jars (as in case of RemoteEnvironment) is that
> is assumes, that you actually have all of the on classpath when using
> local runner. I think that this implies that it either has to use
> context classloader (to be able to work on top of any classloading user
> might have), or is wrong and would need be fixed, so that
> LocalEnvironment would accept files to "stage" - which would mean adding
> them to a class loader (probably URLClassLoader with the application
> class loader as parent).
> Or, would you see any other option?
> Jan
> On 9/3/19 2:00 PM, Till Rohrmann wrote:
> > Hi Jan,
> >
> > I've talked with Aljoscha and Stephan offline and we concluded that we
> > would like to avoid the usage of context class loaders if possible. The
> > reason for this is that using the context class loader can easily mess up
> > an otherwise clear class loader hierarchy which makes it hard to reason
> > about and to debug class loader issues.
> >
> > Given this, I think it would help to better understand the exact problem
> > you are trying to solve by using the context class loader. Usually the
> > usage of the context class loader points towards an API deficiency which
> we
> > might be able to address differently.
> >
> > Cheers,
> > Till
> >
> > On Mon, Sep 2, 2019 at 11:32 AM Aljoscha Krettek 
> > wrote:
> >
> >> I’m not saying we can’t change that code to use the context class
> loader.
> >> I’m just not sure whether this might break other things.
> >>
> >> Best,
> >> Aljoscha
> >>
> >>> On 2. Sep 2019, at 11:24, Jan Lukavský  wrote:
> >>>
> >>> Essentially, the class loader of Flink should be present in parent
> >> hierarchy of context class loader. If FlinkUserCodeClassLoader doesn't
> use
> >> context class loader, then it is actually impossible to use a hierarchy
> >> like this:
> >>>   system class loader -> application class loader -> user-defined class
> >> loader (defines some UDFs to be used in flink program)
> >>> Flink now uses the application class loader, and so the UDFs fail to
> >> deserialize on local flink, because the user-defined class loader is
> >> bypassed. Moreover, there is no way to add additional classpath elements
> >> for LocalEnvironment (as opposed to RemoteEnvironment). I'm able to hack
> >> this by calling addURL method on the application

Re: Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread guaishushu1...@163.com




guaishushu1...@163.com
 
From: guaishushu1...@163.com
Date: 2019-09-03 20:25
To: dev
Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not using 
context classloader
 
 
 
 
guaishushu1...@163.com
From: guaishushu1...@163.com
Date: 2019-09-03 20:23
To: dev
Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not using 
context classloader
guaishushu1...@163.com
From: Jan Lukavský
Date: 2019-09-03 20:17
To: dev
Subject: Re: ClassLoader created by BlobLibraryCacheManager is not using 
context classloader
Hi Till,
the use-case is pretty much simple - I have a REPL shell in groovy, 
which generates classes at runtime. The actual hierarchy is therefore
system class loader -> application classloader -> repl classloader 
(GroovyClassLoader actually)
now, when a terminal (sink) operation in the shell occurs, I'm able to 
build a jar, which I can submit to remote cluster (in distributed case). 
But - during testing -  I run the code using local flink. There is no 
(legal) way of adding this new runtime generated jar to local flink. As 
I said, I have a hackish solution which works on JDK <= 8, because it 
uses reflection to call addURL on the application classloader (and 
thefore "pretends", that the generated jar was there all the time from 
the JVM startup). This breaks on JDK >= 9. It might be possible to work 
around this somehow, but I think that the reason why LocalEnvironment is 
not having a way to add jars (as in case of RemoteEnvironment) is that 
is assumes, that you actually have all of the on classpath when using 
local runner. I think that this implies that it either has to use 
context classloader (to be able to work on top of any classloading user 
might have), or is wrong and would need be fixed, so that 
LocalEnvironment would accept files to "stage" - which would mean adding 
them to a class loader (probably URLClassLoader with the application 
class loader as parent).
Or, would you see any other option?
Jan
On 9/3/19 2:00 PM, Till Rohrmann wrote:
> Hi Jan,
>
> I've talked with Aljoscha and Stephan offline and we concluded that we
> would like to avoid the usage of context class loaders if possible. The
> reason for this is that using the context class loader can easily mess up
> an otherwise clear class loader hierarchy which makes it hard to reason
> about and to debug class loader issues.
>
> Given this, I think it would help to better understand the exact problem
> you are trying to solve by using the context class loader. Usually the
> usage of the context class loader points towards an API deficiency which we
> might be able to address differently.
>
> Cheers,
> Till
>
> On Mon, Sep 2, 2019 at 11:32 AM Aljoscha Krettek 
> wrote:
>
>> I’m not saying we can’t change that code to use the context class loader.
>> I’m just not sure whether this might break other things.
>>
>> Best,
>> Aljoscha
>>
>>> On 2. Sep 2019, at 11:24, Jan Lukavský  wrote:
>>>
>>> Essentially, the class loader of Flink should be present in parent
>> hierarchy of context class loader. If FlinkUserCodeClassLoader doesn't use
>> context class loader, then it is actually impossible to use a hierarchy
>> like this:
>>>   system class loader -> application class loader -> user-defined class
>> loader (defines some UDFs to be used in flink program)
>>> Flink now uses the application class loader, and so the UDFs fail to
>> deserialize on local flink, because the user-defined class loader is
>> bypassed. Moreover, there is no way to add additional classpath elements
>> for LocalEnvironment (as opposed to RemoteEnvironment). I'm able to hack
>> this by calling addURL method on the application class loader (which is
>> terribly hackish), but that works only on JDK <= 8. No sensible workaround
>> is available for JDK >= 9.
>>> Alternative solution would be to enable adding jars to class loader when
>> using LocalEnvironment, but that looks a little odd.
>>> Jan
>>>
>>> On 9/2/19 11:02 AM, Aljoscha Krettek wrote:
>>>> Hi,
>>>>
>>>> I actually don’t know whether that change would be ok.
>> FlinkUserCodeClassLoader has taken
>> FlinkUserCodeClassLoader.class.getClassLoader() as the parent ClassLoader
>> before my change. See:
>> https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.java
>> <
>> https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.java
>>> .
>>>> I have the feeling that this might be on purpose beca

Re: Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread guaishushu1...@163.com




guaishushu1...@163.com
 
From: guaishushu1...@163.com
Date: 2019-09-03 20:23
To: dev
Subject: Re: Re: ClassLoader created by BlobLibraryCacheManager is not using 
context classloader
 
 
 
 
guaishushu1...@163.com
From: Jan Lukavský
Date: 2019-09-03 20:17
To: dev
Subject: Re: ClassLoader created by BlobLibraryCacheManager is not using 
context classloader
Hi Till,
the use-case is pretty much simple - I have a REPL shell in groovy, 
which generates classes at runtime. The actual hierarchy is therefore
system class loader -> application classloader -> repl classloader 
(GroovyClassLoader actually)
now, when a terminal (sink) operation in the shell occurs, I'm able to 
build a jar, which I can submit to remote cluster (in distributed case). 
But - during testing -  I run the code using local flink. There is no 
(legal) way of adding this new runtime generated jar to local flink. As 
I said, I have a hackish solution which works on JDK <= 8, because it 
uses reflection to call addURL on the application classloader (and 
thefore "pretends", that the generated jar was there all the time from 
the JVM startup). This breaks on JDK >= 9. It might be possible to work 
around this somehow, but I think that the reason why LocalEnvironment is 
not having a way to add jars (as in case of RemoteEnvironment) is that 
is assumes, that you actually have all of the on classpath when using 
local runner. I think that this implies that it either has to use 
context classloader (to be able to work on top of any classloading user 
might have), or is wrong and would need be fixed, so that 
LocalEnvironment would accept files to "stage" - which would mean adding 
them to a class loader (probably URLClassLoader with the application 
class loader as parent).
Or, would you see any other option?
Jan
On 9/3/19 2:00 PM, Till Rohrmann wrote:
> Hi Jan,
>
> I've talked with Aljoscha and Stephan offline and we concluded that we
> would like to avoid the usage of context class loaders if possible. The
> reason for this is that using the context class loader can easily mess up
> an otherwise clear class loader hierarchy which makes it hard to reason
> about and to debug class loader issues.
>
> Given this, I think it would help to better understand the exact problem
> you are trying to solve by using the context class loader. Usually the
> usage of the context class loader points towards an API deficiency which we
> might be able to address differently.
>
> Cheers,
> Till
>
> On Mon, Sep 2, 2019 at 11:32 AM Aljoscha Krettek 
> wrote:
>
>> I’m not saying we can’t change that code to use the context class loader.
>> I’m just not sure whether this might break other things.
>>
>> Best,
>> Aljoscha
>>
>>> On 2. Sep 2019, at 11:24, Jan Lukavský  wrote:
>>>
>>> Essentially, the class loader of Flink should be present in parent
>> hierarchy of context class loader. If FlinkUserCodeClassLoader doesn't use
>> context class loader, then it is actually impossible to use a hierarchy
>> like this:
>>>   system class loader -> application class loader -> user-defined class
>> loader (defines some UDFs to be used in flink program)
>>> Flink now uses the application class loader, and so the UDFs fail to
>> deserialize on local flink, because the user-defined class loader is
>> bypassed. Moreover, there is no way to add additional classpath elements
>> for LocalEnvironment (as opposed to RemoteEnvironment). I'm able to hack
>> this by calling addURL method on the application class loader (which is
>> terribly hackish), but that works only on JDK <= 8. No sensible workaround
>> is available for JDK >= 9.
>>> Alternative solution would be to enable adding jars to class loader when
>> using LocalEnvironment, but that looks a little odd.
>>> Jan
>>>
>>> On 9/2/19 11:02 AM, Aljoscha Krettek wrote:
>>>> Hi,
>>>>
>>>> I actually don’t know whether that change would be ok.
>> FlinkUserCodeClassLoader has taken
>> FlinkUserCodeClassLoader.class.getClassLoader() as the parent ClassLoader
>> before my change. See:
>> https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.java
>> <
>> https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.java
>>> .
>>>> I have the feeling that this might be on purpose because we want to
>> have the ClassLoader of the Flink Framework components be the parent
>> ClassLoader, but I could be wrong. Maybe Stephan would be most appropriate
>>

Re: Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread guaishushu1...@163.com




guaishushu1...@163.com
 
From: Jan Lukavský
Date: 2019-09-03 20:17
To: dev
Subject: Re: ClassLoader created by BlobLibraryCacheManager is not using 
context classloader
Hi Till,
 
the use-case is pretty much simple - I have a REPL shell in groovy, 
which generates classes at runtime. The actual hierarchy is therefore
 
 system class loader -> application classloader -> repl classloader 
(GroovyClassLoader actually)
 
now, when a terminal (sink) operation in the shell occurs, I'm able to 
build a jar, which I can submit to remote cluster (in distributed case). 
But - during testing -  I run the code using local flink. There is no 
(legal) way of adding this new runtime generated jar to local flink. As 
I said, I have a hackish solution which works on JDK <= 8, because it 
uses reflection to call addURL on the application classloader (and 
thefore "pretends", that the generated jar was there all the time from 
the JVM startup). This breaks on JDK >= 9. It might be possible to work 
around this somehow, but I think that the reason why LocalEnvironment is 
not having a way to add jars (as in case of RemoteEnvironment) is that 
is assumes, that you actually have all of the on classpath when using 
local runner. I think that this implies that it either has to use 
context classloader (to be able to work on top of any classloading user 
might have), or is wrong and would need be fixed, so that 
LocalEnvironment would accept files to "stage" - which would mean adding 
them to a class loader (probably URLClassLoader with the application 
class loader as parent).
 
Or, would you see any other option?
 
Jan
 
 
On 9/3/19 2:00 PM, Till Rohrmann wrote:
> Hi Jan,
>
> I've talked with Aljoscha and Stephan offline and we concluded that we
> would like to avoid the usage of context class loaders if possible. The
> reason for this is that using the context class loader can easily mess up
> an otherwise clear class loader hierarchy which makes it hard to reason
> about and to debug class loader issues.
>
> Given this, I think it would help to better understand the exact problem
> you are trying to solve by using the context class loader. Usually the
> usage of the context class loader points towards an API deficiency which we
> might be able to address differently.
>
> Cheers,
> Till
>
> On Mon, Sep 2, 2019 at 11:32 AM Aljoscha Krettek 
> wrote:
>
>> I’m not saying we can’t change that code to use the context class loader.
>> I’m just not sure whether this might break other things.
>>
>> Best,
>> Aljoscha
>>
>>> On 2. Sep 2019, at 11:24, Jan Lukavský  wrote:
>>>
>>> Essentially, the class loader of Flink should be present in parent
>> hierarchy of context class loader. If FlinkUserCodeClassLoader doesn't use
>> context class loader, then it is actually impossible to use a hierarchy
>> like this:
>>>   system class loader -> application class loader -> user-defined class
>> loader (defines some UDFs to be used in flink program)
>>> Flink now uses the application class loader, and so the UDFs fail to
>> deserialize on local flink, because the user-defined class loader is
>> bypassed. Moreover, there is no way to add additional classpath elements
>> for LocalEnvironment (as opposed to RemoteEnvironment). I'm able to hack
>> this by calling addURL method on the application class loader (which is
>> terribly hackish), but that works only on JDK <= 8. No sensible workaround
>> is available for JDK >= 9.
>>> Alternative solution would be to enable adding jars to class loader when
>> using LocalEnvironment, but that looks a little odd.
>>> Jan
>>>
>>> On 9/2/19 11:02 AM, Aljoscha Krettek wrote:
>>>> Hi,
>>>>
>>>> I actually don’t know whether that change would be ok.
>> FlinkUserCodeClassLoader has taken
>> FlinkUserCodeClassLoader.class.getClassLoader() as the parent ClassLoader
>> before my change. See:
>> https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.java
>> <
>> https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.java
>>> .
>>>> I have the feeling that this might be on purpose because we want to
>> have the ClassLoader of the Flink Framework components be the parent
>> ClassLoader, but I could be wrong. Maybe Stephan would be most appropriate
>> for answering this.
>>>> Best,
>>>> Aljoscha
>>>>
>>>>> On 30. Aug 2019, at 16:28, Till Rohrmann  wrote:
>>>>

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread Jan Lukavský

Hi Till,

the use-case is pretty much simple - I have a REPL shell in groovy, 
which generates classes at runtime. The actual hierarchy is therefore


 system class loader -> application classloader -> repl classloader 
(GroovyClassLoader actually)


now, when a terminal (sink) operation in the shell occurs, I'm able to 
build a jar, which I can submit to remote cluster (in distributed case). 
But - during testing -  I run the code using local flink. There is no 
(legal) way of adding this new runtime generated jar to local flink. As 
I said, I have a hackish solution which works on JDK <= 8, because it 
uses reflection to call addURL on the application classloader (and 
thefore "pretends", that the generated jar was there all the time from 
the JVM startup). This breaks on JDK >= 9. It might be possible to work 
around this somehow, but I think that the reason why LocalEnvironment is 
not having a way to add jars (as in case of RemoteEnvironment) is that 
is assumes, that you actually have all of the on classpath when using 
local runner. I think that this implies that it either has to use 
context classloader (to be able to work on top of any classloading user 
might have), or is wrong and would need be fixed, so that 
LocalEnvironment would accept files to "stage" - which would mean adding 
them to a class loader (probably URLClassLoader with the application 
class loader as parent).


Or, would you see any other option?

Jan


On 9/3/19 2:00 PM, Till Rohrmann wrote:

Hi Jan,

I've talked with Aljoscha and Stephan offline and we concluded that we
would like to avoid the usage of context class loaders if possible. The
reason for this is that using the context class loader can easily mess up
an otherwise clear class loader hierarchy which makes it hard to reason
about and to debug class loader issues.

Given this, I think it would help to better understand the exact problem
you are trying to solve by using the context class loader. Usually the
usage of the context class loader points towards an API deficiency which we
might be able to address differently.

Cheers,
Till

On Mon, Sep 2, 2019 at 11:32 AM Aljoscha Krettek 
wrote:


I’m not saying we can’t change that code to use the context class loader.
I’m just not sure whether this might break other things.

Best,
Aljoscha


On 2. Sep 2019, at 11:24, Jan Lukavský  wrote:

Essentially, the class loader of Flink should be present in parent

hierarchy of context class loader. If FlinkUserCodeClassLoader doesn't use
context class loader, then it is actually impossible to use a hierarchy
like this:

  system class loader -> application class loader -> user-defined class

loader (defines some UDFs to be used in flink program)

Flink now uses the application class loader, and so the UDFs fail to

deserialize on local flink, because the user-defined class loader is
bypassed. Moreover, there is no way to add additional classpath elements
for LocalEnvironment (as opposed to RemoteEnvironment). I'm able to hack
this by calling addURL method on the application class loader (which is
terribly hackish), but that works only on JDK <= 8. No sensible workaround
is available for JDK >= 9.

Alternative solution would be to enable adding jars to class loader when

using LocalEnvironment, but that looks a little odd.

Jan

On 9/2/19 11:02 AM, Aljoscha Krettek wrote:

Hi,

I actually don’t know whether that change would be ok.

FlinkUserCodeClassLoader has taken
FlinkUserCodeClassLoader.class.getClassLoader() as the parent ClassLoader
before my change. See:
https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.java
<
https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.java

.

I have the feeling that this might be on purpose because we want to

have the ClassLoader of the Flink Framework components be the parent
ClassLoader, but I could be wrong. Maybe Stephan would be most appropriate
for answering this.

Best,
Aljoscha


On 30. Aug 2019, at 16:28, Till Rohrmann  wrote:

Hi Jan,

this looks to me like a bug for which you could create a JIRA and PR

to fix it. Just to make sure, I've pulled in Aljoscha who is the author of
this change to check with him whether we are forgetting something.

Cheers,
Till

On Fri, Aug 30, 2019 at 3:44 PM Jan Lukavský 
je...@seznam.cz>> wrote:

Hi,

I have come across an issue with classloading in Flink's MiniCluster.
The issue is that when I run local flink job from a thread, that has a
non-default context classloader (for whatever reason), this classloader
is not taken into account when classloading user defined functions.

This

is due to [1]. Is this behavior intentional, or can I file a JIRA and
use Thread.currentThread.getContextClassLoader() there? I have

validated

that it fixes issues I'm facing.

Jan

[1]


https://github.com/apache/flink/blob/ce

Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-03 Thread Till Rohrmann
Hi Jan,

I've talked with Aljoscha and Stephan offline and we concluded that we
would like to avoid the usage of context class loaders if possible. The
reason for this is that using the context class loader can easily mess up
an otherwise clear class loader hierarchy which makes it hard to reason
about and to debug class loader issues.

Given this, I think it would help to better understand the exact problem
you are trying to solve by using the context class loader. Usually the
usage of the context class loader points towards an API deficiency which we
might be able to address differently.

Cheers,
Till

On Mon, Sep 2, 2019 at 11:32 AM Aljoscha Krettek 
wrote:

> I’m not saying we can’t change that code to use the context class loader.
> I’m just not sure whether this might break other things.
>
> Best,
> Aljoscha
>
> > On 2. Sep 2019, at 11:24, Jan Lukavský  wrote:
> >
> > Essentially, the class loader of Flink should be present in parent
> hierarchy of context class loader. If FlinkUserCodeClassLoader doesn't use
> context class loader, then it is actually impossible to use a hierarchy
> like this:
> >
> >  system class loader -> application class loader -> user-defined class
> loader (defines some UDFs to be used in flink program)
> >
> > Flink now uses the application class loader, and so the UDFs fail to
> deserialize on local flink, because the user-defined class loader is
> bypassed. Moreover, there is no way to add additional classpath elements
> for LocalEnvironment (as opposed to RemoteEnvironment). I'm able to hack
> this by calling addURL method on the application class loader (which is
> terribly hackish), but that works only on JDK <= 8. No sensible workaround
> is available for JDK >= 9.
> >
> > Alternative solution would be to enable adding jars to class loader when
> using LocalEnvironment, but that looks a little odd.
> >
> > Jan
> >
> > On 9/2/19 11:02 AM, Aljoscha Krettek wrote:
> >> Hi,
> >>
> >> I actually don’t know whether that change would be ok.
> FlinkUserCodeClassLoader has taken
> FlinkUserCodeClassLoader.class.getClassLoader() as the parent ClassLoader
> before my change. See:
> https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.java
> <
> https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.java
> >.
> >>
> >> I have the feeling that this might be on purpose because we want to
> have the ClassLoader of the Flink Framework components be the parent
> ClassLoader, but I could be wrong. Maybe Stephan would be most appropriate
> for answering this.
> >>
> >> Best,
> >> Aljoscha
> >>
> >>> On 30. Aug 2019, at 16:28, Till Rohrmann  wrote:
> >>>
> >>> Hi Jan,
> >>>
> >>> this looks to me like a bug for which you could create a JIRA and PR
> to fix it. Just to make sure, I've pulled in Aljoscha who is the author of
> this change to check with him whether we are forgetting something.
> >>>
> >>> Cheers,
> >>> Till
> >>>
> >>> On Fri, Aug 30, 2019 at 3:44 PM Jan Lukavský  je...@seznam.cz>> wrote:
> >>> Hi,
> >>>
> >>> I have come across an issue with classloading in Flink's MiniCluster.
> >>> The issue is that when I run local flink job from a thread, that has a
> >>> non-default context classloader (for whatever reason), this classloader
> >>> is not taken into account when classloading user defined functions.
> This
> >>> is due to [1]. Is this behavior intentional, or can I file a JIRA and
> >>> use Thread.currentThread.getContextClassLoader() there? I have
> validated
> >>> that it fixes issues I'm facing.
> >>>
> >>> Jan
> >>>
> >>> [1]
> >>>
> https://github.com/apache/flink/blob/ce557839d762b5f1ec92aa1885fd3d2ae33d0d0b/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/BlobLibraryCacheManager.java#L280
> <
> https://github.com/apache/flink/blob/ce557839d762b5f1ec92aa1885fd3d2ae33d0d0b/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/BlobLibraryCacheManager.java#L280
> >
> >>>
> >>
>
>


Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-02 Thread Aljoscha Krettek
I’m not saying we can’t change that code to use the context class loader. I’m 
just not sure whether this might break other things.

Best,
Aljoscha

> On 2. Sep 2019, at 11:24, Jan Lukavský  wrote:
> 
> Essentially, the class loader of Flink should be present in parent hierarchy 
> of context class loader. If FlinkUserCodeClassLoader doesn't use context 
> class loader, then it is actually impossible to use a hierarchy like this:
> 
>  system class loader -> application class loader -> user-defined class loader 
> (defines some UDFs to be used in flink program)
> 
> Flink now uses the application class loader, and so the UDFs fail to 
> deserialize on local flink, because the user-defined class loader is 
> bypassed. Moreover, there is no way to add additional classpath elements for 
> LocalEnvironment (as opposed to RemoteEnvironment). I'm able to hack this by 
> calling addURL method on the application class loader (which is terribly 
> hackish), but that works only on JDK <= 8. No sensible workaround is 
> available for JDK >= 9.
> 
> Alternative solution would be to enable adding jars to class loader when 
> using LocalEnvironment, but that looks a little odd.
> 
> Jan
> 
> On 9/2/19 11:02 AM, Aljoscha Krettek wrote:
>> Hi,
>> 
>> I actually don’t know whether that change would be ok. 
>> FlinkUserCodeClassLoader has taken 
>> FlinkUserCodeClassLoader.class.getClassLoader() as the parent ClassLoader 
>> before my change. See: 
>> https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.java
>>  
>> .
>> 
>> I have the feeling that this might be on purpose because we want to have the 
>> ClassLoader of the Flink Framework components be the parent ClassLoader, but 
>> I could be wrong. Maybe Stephan would be most appropriate for answering this.
>> 
>> Best,
>> Aljoscha
>> 
>>> On 30. Aug 2019, at 16:28, Till Rohrmann  wrote:
>>> 
>>> Hi Jan,
>>> 
>>> this looks to me like a bug for which you could create a JIRA and PR to fix 
>>> it. Just to make sure, I've pulled in Aljoscha who is the author of this 
>>> change to check with him whether we are forgetting something.
>>> 
>>> Cheers,
>>> Till
>>> 
>>> On Fri, Aug 30, 2019 at 3:44 PM Jan Lukavský >> > wrote:
>>> Hi,
>>> 
>>> I have come across an issue with classloading in Flink's MiniCluster.
>>> The issue is that when I run local flink job from a thread, that has a
>>> non-default context classloader (for whatever reason), this classloader
>>> is not taken into account when classloading user defined functions. This
>>> is due to [1]. Is this behavior intentional, or can I file a JIRA and
>>> use Thread.currentThread.getContextClassLoader() there? I have validated
>>> that it fixes issues I'm facing.
>>> 
>>> Jan
>>> 
>>> [1]
>>> https://github.com/apache/flink/blob/ce557839d762b5f1ec92aa1885fd3d2ae33d0d0b/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/BlobLibraryCacheManager.java#L280
>>>  
>>> 
>>> 
>> 



Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-02 Thread Jan Lukavský
Essentially, the class loader of Flink should be present in parent 
hierarchy of context class loader. If FlinkUserCodeClassLoader doesn't 
use context class loader, then it is actually impossible to use a 
hierarchy like this:


 system class loader -> application class loader -> user-defined class 
loader (defines some UDFs to be used in flink program)


Flink now uses the application class loader, and so the UDFs fail to 
deserialize on local flink, because the user-defined class loader is 
bypassed. Moreover, there is no way to add additional classpath elements 
for LocalEnvironment (as opposed to RemoteEnvironment). I'm able to hack 
this by calling addURL method on the application class loader (which is 
terribly hackish), but that works only on JDK <= 8. No sensible 
workaround is available for JDK >= 9.


Alternative solution would be to enable adding jars to class loader when 
using LocalEnvironment, but that looks a little odd.


Jan

On 9/2/19 11:02 AM, Aljoscha Krettek wrote:

Hi,

I actually don’t know whether that change would be ok. FlinkUserCodeClassLoader has 
taken FlinkUserCodeClassLoader.class.getClassLoader() as the parent ClassLoader 
before my change. See: 
https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.java
 
.

I have the feeling that this might be on purpose because we want to have the 
ClassLoader of the Flink Framework components be the parent ClassLoader, but I 
could be wrong. Maybe Stephan would be most appropriate for answering this.

Best,
Aljoscha


On 30. Aug 2019, at 16:28, Till Rohrmann  wrote:

Hi Jan,

this looks to me like a bug for which you could create a JIRA and PR to fix it. 
Just to make sure, I've pulled in Aljoscha who is the author of this change to 
check with him whether we are forgetting something.

Cheers,
Till

On Fri, Aug 30, 2019 at 3:44 PM Jan Lukavský mailto:je...@seznam.cz>> wrote:
Hi,

I have come across an issue with classloading in Flink's MiniCluster.
The issue is that when I run local flink job from a thread, that has a
non-default context classloader (for whatever reason), this classloader
is not taken into account when classloading user defined functions. This
is due to [1]. Is this behavior intentional, or can I file a JIRA and
use Thread.currentThread.getContextClassLoader() there? I have validated
that it fixes issues I'm facing.

Jan

[1]
https://github.com/apache/flink/blob/ce557839d762b5f1ec92aa1885fd3d2ae33d0d0b/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/BlobLibraryCacheManager.java#L280
 






Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-09-02 Thread Aljoscha Krettek
Hi,

I actually don’t know whether that change would be ok. FlinkUserCodeClassLoader 
has taken FlinkUserCodeClassLoader.class.getClassLoader() as the parent 
ClassLoader before my change. See: 
https://github.com/apache/flink/blob/release-1.2/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoader.java
 
.

I have the feeling that this might be on purpose because we want to have the 
ClassLoader of the Flink Framework components be the parent ClassLoader, but I 
could be wrong. Maybe Stephan would be most appropriate for answering this.

Best,
Aljoscha

> On 30. Aug 2019, at 16:28, Till Rohrmann  wrote:
> 
> Hi Jan,
> 
> this looks to me like a bug for which you could create a JIRA and PR to fix 
> it. Just to make sure, I've pulled in Aljoscha who is the author of this 
> change to check with him whether we are forgetting something.
> 
> Cheers,
> Till
> 
> On Fri, Aug 30, 2019 at 3:44 PM Jan Lukavský  > wrote:
> Hi,
> 
> I have come across an issue with classloading in Flink's MiniCluster. 
> The issue is that when I run local flink job from a thread, that has a 
> non-default context classloader (for whatever reason), this classloader 
> is not taken into account when classloading user defined functions. This 
> is due to [1]. Is this behavior intentional, or can I file a JIRA and 
> use Thread.currentThread.getContextClassLoader() there? I have validated 
> that it fixes issues I'm facing.
> 
> Jan
> 
> [1] 
> https://github.com/apache/flink/blob/ce557839d762b5f1ec92aa1885fd3d2ae33d0d0b/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/BlobLibraryCacheManager.java#L280
>  
> 
> 



Re: ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-08-30 Thread Till Rohrmann
Hi Jan,

this looks to me like a bug for which you could create a JIRA and PR to fix
it. Just to make sure, I've pulled in Aljoscha who is the author of this
change to check with him whether we are forgetting something.

Cheers,
Till

On Fri, Aug 30, 2019 at 3:44 PM Jan Lukavský  wrote:

> Hi,
>
> I have come across an issue with classloading in Flink's MiniCluster.
> The issue is that when I run local flink job from a thread, that has a
> non-default context classloader (for whatever reason), this classloader
> is not taken into account when classloading user defined functions. This
> is due to [1]. Is this behavior intentional, or can I file a JIRA and
> use Thread.currentThread.getContextClassLoader() there? I have validated
> that it fixes issues I'm facing.
>
> Jan
>
> [1]
>
> https://github.com/apache/flink/blob/ce557839d762b5f1ec92aa1885fd3d2ae33d0d0b/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/BlobLibraryCacheManager.java#L280
>
>


ClassLoader created by BlobLibraryCacheManager is not using context classloader

2019-08-30 Thread Jan Lukavský

Hi,

I have come across an issue with classloading in Flink's MiniCluster. 
The issue is that when I run local flink job from a thread, that has a 
non-default context classloader (for whatever reason), this classloader 
is not taken into account when classloading user defined functions. This 
is due to [1]. Is this behavior intentional, or can I file a JIRA and 
use Thread.currentThread.getContextClassLoader() there? I have validated 
that it fixes issues I'm facing.


Jan

[1] 
https://github.com/apache/flink/blob/ce557839d762b5f1ec92aa1885fd3d2ae33d0d0b/flink-runtime/src/main/java/org/apache/flink/runtime/execution/librarycache/BlobLibraryCacheManager.java#L280