Re: Class Cache

2017-08-01 Thread Mike Accola
Thank you to those of you who replied.

I can't really explain why, but yesterday afternoon, out of the blue, this 
problem disappeared.  Everything was working fine for a while, until some 
different strange behavior came up this afternoon.  I don't believe any 
code changes I made should affect this.

I suspect this is all related to some kind of caching somewhere somehow.

I was already putting mylib2.jar into flink's lib directory.  I was doing 
this because mylib2.jar has some native methods and I had had better luck 
loading them that way.

Now I am running into a problem where my application runs successfully the 
first time.  I then turn around and run the same application a 2nd time (it 
should get the exact same results), except this 2nd time I get a 
ClassNotFoundException for one of the classes in mylib2.jar.  I've 
temporarily removed references to the class that uses the native library 
from the code, just to rule out that this is related to any native loading 
problems.  If I run stop-local.sh and start-local.sh, I get the same result:  
it works the first time I run, but I get the ClassNotFoundException the 2nd 
time.

I am running flink 1.3.0 on linux (RHEL 6.8). I am not using yarn.  Just 
plain, out of the box local mode.

Is it possible that the cache in this blob server is getting corrupted? Is 
there a way to tell?  Is there a way to disable the blob server?

Any other ideas on things to look at?
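
One thing that might help narrow this down is checking which class loader 
can actually see the missing class.  The sketch below is a hypothetical 
diagnostic helper (ClassVisibilityProbe is a made-up name, not a Flink 
API).  It assumes that jars in flink's lib directory end up on the JVM 
classpath (system loader), while the job jar and --classpath entries are 
served by a per-job user-code loader; that assumption is worth verifying 
against your Flink version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical diagnostic helper; not part of Flink. */
final class ClassVisibilityProbe {

    /** Returns true if the named class can be resolved through the given loader. */
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Checks the class against the loaders that usually matter inside a user function. */
    static Map<String, Boolean> probe(String className, Class<?> caller) {
        Map<String, Boolean> result = new LinkedHashMap<>();
        result.put("caller's loader", isVisible(className, caller.getClassLoader()));
        result.put("thread context loader",
                isVisible(className, Thread.currentThread().getContextClassLoader()));
        result.put("system loader",
                isVisible(className, ClassLoader.getSystemClassLoader()));
        return result;
    }

    public static void main(String[] args) {
        // Pass the fully qualified name of the class from mylib2.jar as the argument.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(probe(name, ClassVisibilityProbe.class));
    }
}
```

Calling probe(...) from inside the ProcessFunction on both the first and 
the 2nd run, and comparing the two maps, should show whether the class 
disappears from one particular loader between runs.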






From:   Ufuk Celebi <u...@apache.org>
To: dev@flink.apache.org
Date:   07/31/2017 04:13 PM
Subject:    Re: Class Cache



Hey Mike!

Thanks for the detailed information about your setup. I'm also puzzled
by this...

(1) Which version of Flink are you using? We recently merged some
changes to the JAR distribution components, which might cause this
(although I think that's unlikely).

(2) As a temporary workaround you could try putting mylib2.jar into
the /lib folder and not pulling it in via --classpath. After doing
this you would need to stop/start the cluster and resubmit the job.

– Ufuk


On Mon, Jul 31, 2017 at 10:17 PM, Stephan Ewen <se...@apache.org> wrote:
> Hi Mike!
>
> Flink does in fact cache jar files in the "blob server". But these are
> cached subject to the following conditions:
>
>   - No caching across "sessions", meaning start/stop of the
> cluster/jobmanager. If you run the per-job-yarn setup, the job does not
> cache anything.
>
>   - Files are cached under a content hash, meaning as soon as the contents
> change, the artifact is not reused. So if you actually change the jar
> file, no caching should happen.
>
> I cannot really explain what you are observing and have never seen that
> myself...
>
> Stephan
>
>
> On Mon, Jul 31, 2017 at 9:00 PM, Mike Accola <macc...@us.ibm.com> wrote:
>
>> No, I did not explicitly  create an uber-jar.  The mylib1.jar is very
>> light. It only contains my main application class (including
>> ProcessFunction).
>>
>> I have been specifying --classpath option on my flink run command to pull
>> in the mylib2.jar.
>>
>> Plus, I have been rebuilding mylib1.jar frequently just to be safe and it
>> hasn't made a difference.
>>
>>
>> Mike Accola
>> macc...@us.ibm.com
>>
>>
>>
>>
>>
>> From:   Eron Wright <eronwri...@gmail.com>
>> To: dev@flink.apache.org
>> Date:   07/31/2017 01:47 PM
>> Subject:    Re: Class Cache
>>
>>
>>
>> A Flink program is typically packaged as an 'uber-jar' containing its
>> dependencies.  The Flink quickstart project illustrates this (see the use
>> of the shading plugin in pom.xml).   Based on your description, the classes
>> of mylib2.jar were copied into mylib1.jar when the latter was built.  Try
>> rebuilding mylib1.jar to effect the change.
>>
>> -Eron
>>
>> On Mon, Jul 31, 2017 at 11:18 AM, Mike Accola <macc...@us.ibm.com> wrote:
>>
>> > Are classes cached somewhere in flink?  I am running in a very basic,
>> > local environment on Linux (start-local.sh).  I've somehow gotten my
>> > environment into a strange state that I don't understand.  I feel like I
>> > am overlooking something simple, but I've checked everything I can think
>> > of.
>> >
>> > My main flink application with a ProcessFunction is embedded in
>> > mylib1.jar.  Within my ProcessFunction I use another class that is
>> > embedded in mylib2.jar.
>> >
>> > When I made changes to a function in mylib2.jar and rebuilt the jar, I
>> > realized the changes weren't taking effect.  In fact, I then deleted
>> > mylib2.jar entirely and my application still worked.  I can't figure out
>> > where my application is 
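
A note on Stephan's point above that jars are cached under a content hash: 
the sketch below only illustrates the general idea of content-addressed 
caching.  The class name ContentHashKey and the choice of SHA-256 are 
assumptions made for illustration; the thread does not say which hash 
function the blob server uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustration of content-addressed cache keys; not Flink's actual implementation. */
final class ContentHashKey {

    /** Derives a cache key from the bytes of an artifact (SHA-256 is an assumption). */
    static String keyFor(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(content);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-ins for the bytes of two different builds of the same jar.
        byte[] build1 = "mylib2 build 1".getBytes(StandardCharsets.UTF_8);
        byte[] build2 = "mylib2 build 2".getBytes(StandardCharsets.UTF_8);
        // Identical bytes map to the same key, so a cached copy can be reused.
        System.out.println(keyFor(build1).equals(keyFor(build1.clone())));
        // Changed bytes map to a different key, so the old cached copy is not picked up.
        System.out.println(keyFor(build1).equals(keyFor(build2)));
    }
}
```

Under this scheme a rebuilt jar with different contents gets a new key, 
which is why stale code served from such a cache would be surprising.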

Re: Class Cache

2017-07-31 Thread Mike Accola
No, I did not explicitly create an uber-jar.  The mylib1.jar is very 
light. It only contains my main application class (including 
ProcessFunction). 

I have been specifying the --classpath option on my flink run command to 
pull in mylib2.jar. 

Plus, I have been rebuilding mylib1.jar frequently just to be safe and it 
hasn't made a difference.


Mike Accola 
macc...@us.ibm.com





From:   Eron Wright <eronwri...@gmail.com>
To: dev@flink.apache.org
Date:   07/31/2017 01:47 PM
Subject:    Re: Class Cache



A Flink program is typically packaged as an 'uber-jar' containing its
dependencies.  The Flink quickstart project illustrates this (see the use
of the shading plugin in pom.xml).   Based on your description, the classes
of mylib2.jar were copied into mylib1.jar when the latter was built.  Try
rebuilding mylib1.jar to effect the change.

-Eron

On Mon, Jul 31, 2017 at 11:18 AM, Mike Accola <macc...@us.ibm.com> wrote:

> Are classes cached somewhere in flink?  I am running in a very basic,
> local environment on Linux (start-local.sh).  I've somehow gotten my
> environment into a strange state that I don't understand.  I feel like I
> am overlooking something simple, but I've checked everything I can think
> of.
>
> My main flink application with a ProcessFunction is embedded in
> mylib1.jar.  Within my ProcessFunction I use another class that is
> embedded in mylib2.jar.
>
> When I made changes to a function in mylib2.jar and rebuilt the jar, I
> realized the changes weren't taking effect.  In fact, I then deleted
> mylib2.jar entirely and my application still worked.  I can't figure out
> where my application is picking up the function contained in mylib2.jar.  I
> have checked any temp directories, library paths, etc.  I have repeatedly
> stopped/started my flink environment just to be safe.
>
> I tried adding -verbose:class to env.java.opts.  It output a lot of class
> loading info to the stdout log, but there were no references to my class
> in mylib2.jar.
>
> This has to be caching this code somehow whether it is in flink or in the
> jvm.  Any ideas what could be happening or how to debug this further?
>
> Thanks
>
>
>






Class Cache

2017-07-31 Thread Mike Accola
Are classes cached somewhere in flink?  I am running in a very basic, 
local environment on Linux (start-local.sh).  I've somehow gotten my 
environment into a strange state that I don't understand.  I feel like I 
am overlooking something simple, but I've checked everything I can think 
of.

My main flink application with a ProcessFunction is embedded in 
mylib1.jar.  Within my ProcessFunction I use another class that is 
embedded in mylib2.jar.

When I made changes to a function in mylib2.jar and rebuilt the jar, I 
realized the changes weren't taking effect.  In fact, I then deleted 
mylib2.jar entirely and my application still worked.  I can't figure out 
where my application is picking up the function contained in mylib2.jar. I 
have checked any temp directories, library paths, etc.  I have repeatedly 
stopped/started my flink environment just to be safe. 

I tried adding -verbose:class to env.java.opts.  It output a lot of class 
loading info to the stdout log, but there were no references to my class 
in mylib2.jar.

This has to be caching this code somehow whether it is in flink or in the 
jvm.  Any ideas what could be happening or how to debug this further?

Thanks
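
For the "where is my application picking up the function" question, one 
option besides -verbose:class is to ask the class itself where it came 
from.  This is a minimal sketch using only standard JDK calls; ClassOrigin 
is a hypothetical helper name, and the class you pass in would be the one 
from mylib2.jar.

```java
import java.security.CodeSource;

/** Hypothetical helper that reports the jar/directory and loader a class came from. */
final class ClassOrigin {

    static String describe(Class<?> type) {
        // The code source is the jar or directory the class bytes were read from;
        // it can be null for classes loaded by the bootstrap loader.
        CodeSource source = type.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "<no code source>"
                : source.getLocation().toString();
        ClassLoader loader = type.getClassLoader();
        return type.getName() + " loaded from " + location
                + " by " + (loader == null ? "bootstrap loader" : loader.toString());
    }

    public static void main(String[] args) {
        System.out.println(describe(ClassOrigin.class));
        System.out.println(describe(String.class));
    }
}
```

Logging describe(...) for the mylib2.jar class from inside the 
ProcessFunction should show the exact jar path being used, which may point 
at a copy that was overlooked.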




Re: Using native library in Flink

2017-07-19 Thread Mike Accola
Timo/Eron -

Thank you for the responses. To answer a few of your questions:

- For now, I am just running in a simple, local environment 
  (start-local.sh).
- I have this entry in the flink-conf.yaml file:
  env.java.opts: "-Djava.library.path=/myPathWithTheLibrary"
  From looking at the logs, it looks like the JVM is picking up this 
  setting (plus, if I remove the setting, things don't work at all).
- I am loading the library within the processElement() method of my 
  ProcessFunction class.

I applied Eron's example of adding the library to my jar file and then 
extracting/loading.  This  seems to work for me.  So at least I have a 
workaround for now (thank you!).  However, it really seems like this is a 
hack that I should not have to do.  I am running on a single system and 
the java.library.path is an absolute path on this system.  I'd love to 
figure out why this is happening and a better way to get around it.

One thing I've noted:  In the job manager logs, it appears 
java.library.path is getting set as expected.  But if I call 
System.getProperty("java.library.path") within my processElement method to 
check the property, the results are erratic:  Sometimes I see the value 
from my flink-conf.yaml.  Other times I see something totally different 
that appears to be the jvm default.  More confusing is that these 
different values for java.library.path do NOT seem to correlate with 
whether the library loads successfully or not.  If I run this same 
application twice in succession, am I running in different processes or 
JVMs?

Please reply if anyone has suggestions on other things to try.

--Mike
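
One way to answer the "different processes or JVMs?" question empirically 
is to log an identity snapshot from inside processElement on each run.  
The sketch below uses only standard JDK calls; JvmIdentity is a 
hypothetical helper name.  The JVM name returned by the runtime MXBean is 
implementation-specific (commonly of the form pid@hostname), so treat it 
as an opaque identifier.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

/** Hypothetical helper that summarizes which JVM and class loader the caller runs in. */
final class JvmIdentity {

    static String snapshot() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        return String.format(
                "jvm=%s started=%d contextLoader=%s@%x java.library.path=%s",
                runtime.getName(),       // implementation-specific, often "pid@hostname"
                runtime.getStartTime(),  // JVM start time in epoch milliseconds
                loader == null ? "bootstrap" : loader.getClass().getName(),
                System.identityHashCode(loader),
                System.getProperty("java.library.path"));
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

If two runs print the same jvm and started values but a different 
contextLoader hash, they share one JVM and differ only in class loader; if 
the jvm value changes, the code ran in a different process.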





From:   Eron Wright <eronwri...@gmail.com>
To: dev@flink.apache.org
Date:   07/18/2017 04:40 PM
Subject:    Re: Using native library in Flink



The solution mentioned by Timo works well with a standalone Flink cluster
but might not work with a YARN or Mesos cluster.  An alternative is to have
your Java library contain the native library within itself, and to extract
it to a temporary directory before calling `System.loadLibrary(...)`.
Note that you lose the advantages of using the native OS's packaging system
(e.g. security patches, dependency management).   The TensorFlow Java
library demonstrates the technique:

https://github.com/tensorflow/tensorflow/blob/v1.2.1/tensorflow/java/src/main/java/org/tensorflow/NativeLibrary.java


-Eron
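
A minimal sketch of the extract-then-load technique described above, under 
some assumptions: NativeExtractor, the resource path and the file name are 
hypothetical, and the extracted file is loaded with System.load, which 
takes an absolute file path (System.loadLibrary instead resolves a bare 
library name against java.library.path).  This is not a copy of the 
TensorFlow code linked above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: copies a bundled native library to a temp dir and loads it. */
final class NativeExtractor {

    /** Copies the stream into a fresh temporary directory under the given file name. */
    static Path extractToTemp(InputStream in, String fileName) {
        try {
            Path dir = Files.createTempDirectory("native-libs");
            Path target = dir.resolve(fileName);
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            // deleteOnExit runs in reverse registration order: file first, then directory
            dir.toFile().deleteOnExit();
            target.toFile().deleteOnExit();
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Extracts a classpath resource (e.g. "/native/libdummy2native.so") and loads it. */
    static void loadFromClasspath(String resourcePath, String fileName) {
        try (InputStream in = NativeExtractor.class.getResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new IllegalArgumentException("Resource not found: " + resourcePath);
            }
            Path extracted = extractToTemp(in, fileName);
            System.load(extracted.toAbsolutePath().toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as NativeExtractor.loadFromClasspath("/native/libdummy2native.so", 
"libdummy2native.so") would typically sit in a static initializer of the 
class that declares the native methods; the resource path here is only an 
example layout.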

On Tue, Jul 18, 2017 at 8:02 AM, Timo Walther <twal...@apache.org> wrote:

> Hi Mike,
>
> do you run Flink locally or in a cluster? You have to make sure that the VM
> argument -Djava.library.path is set for all Flink JVMs. Job Manager and
> Task Managers might run in separate JVMs. Also make sure that the library
> is accessible from all nodes. I don't know what happens if the file is
> accessed by multiple processes/threads at the same time. It might also be
> important where you put the static { ... } loading. It should be in the
> Function, because these classes get deserialized on the TaskManager.
>
> I hope this helps.
>
> Timo
>
>
> On 17.07.17 at 21:30, Mike Accola wrote:
>
>> I am a new Flink user just trying to learn a little bit.  I am trying to
>> incorporate an existing C++ library into a new Flink application.  I am
>> seeing some strange behavior when trying to link in the native (C++)
>> library using java via JNI.
>>
>> I am running this on Linux (RHEL6)
>>
>> I can run my application once without error.  Sometimes it will run
>> successfully a 2nd or 3rd time.  However, eventually on a subsequent run,
>> I get an exception about the native library not being found:
>>
>> java.lang.UnsatisfiedLinkError: no dummy2native in java.library.path
>>  at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
>>  at java.lang.Runtime.loadLibrary0(Runtime.java:870)
>>  at java.lang.System.loadLibrary(System.java:1122)
>>  at com.att.flink.tdata.spss.TinyLoader.loadNative(Dummy2.java:10)
>>
>> For debugging purposes for now, my native library does not have any
>> external references.  It really contains 1 method that essentially does
>> nothing.
>>
>> The behavior seems to indicate that there is some kind of cleanup being
>> done that "unloads" the native library.  I suspect this is somehow related
>> to Flink's implementation of its library cache manager, but I have not
>> been able to prove this yet.
>>
>> A few more details:
>>
>> - I have a c++ library libdummy2native.so that contains a method that can
>> be invoked via JNI.
>> - I have a jar containing a class, called Dummy2.  The Dummy2 constructor
>> will invoke the JNI method.
>> - The libdummy2native.so library is loaded with System.loadLibrary() like
>> this:
>>   static {System.loadLibr

Using native library in Flink

2017-07-17 Thread Mike Accola
I am a new Flink user just trying to learn a little bit.  I am trying to 
incorporate an existing C++ library into a new Flink application.  I am 
seeing some strange behavior when trying to link in the native (C++) 
library using java via JNI.
 
I am running this on Linux (RHEL6)
 
I can run my application once without error.  Sometimes it will run 
successfully a 2nd or 3rd time.  However, eventually on a subsequent run, 
I get an exception about the native library not being found:
 
java.lang.UnsatisfiedLinkError: no dummy2native in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
at java.lang.Runtime.loadLibrary0(Runtime.java:870)
at java.lang.System.loadLibrary(System.java:1122)
at com.att.flink.tdata.spss.TinyLoader.loadNative(Dummy2.java:10)
 
For debugging purposes for now, my native library does not have any 
external references.  It really contains 1 method that essentially does 
nothing.
 
The behavior seems to indicate that there is some kind of cleanup being 
done that "unloads" the native library.  I suspect this is somehow related 
to Flink's implementation of its library cache manager, but I have not 
been able to prove this yet.
 
A few more details:
 
- I have a c++ library libdummy2native.so that contains a method that can 
be invoked via JNI.
- I have a jar containing a class, called Dummy2.  The Dummy2 constructor 
will invoke the JNI method.
- The libdummy2native.so library is loaded with System.loadLibrary() like 
this:
 static {System.loadLibrary("dummy2native"); }
- In my simple Flink application, I have extended the ProcessFunction 
class.  Within this class, I have overridden the processElement method, 
which declares a Dummy2 object.
- The Dummy2 class can be called and invoked without error when used in a 
standalone java program.
 
Any thoughts or ideas on what to try next would be appreciated. Initially, 
I'd be happy to be able to just explain this behavior.  I will worry about 
fixing it afterwards.
 
Thanks.
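
As a possible next debugging step, the loadLibrary call could be wrapped so 
that a failure reports what the JVM was actually asked to search.  This is 
only a sketch: LibraryPathCheck is a hypothetical helper, and it merely 
checks whether the platform-specific file name (System.mapLibraryName 
turns "dummy2native" into "libdummy2native.so" on Linux) exists in any 
directory of the given search path.  It does not explain why a load fails.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical diagnostic wrapper around System.loadLibrary. */
final class LibraryPathCheck {

    /** Lists files on the given search path that match the mapped library name. */
    static List<File> findCandidates(String libName, String libraryPath) {
        List<File> found = new ArrayList<>();
        if (libraryPath == null) {
            return found;
        }
        String fileName = System.mapLibraryName(libName);
        for (String dir : libraryPath.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                found.add(candidate);
            }
        }
        return found;
    }

    /** Loads the library; on failure, prints the current search path and any matches. */
    static void loadWithDiagnostics(String libName) {
        try {
            System.loadLibrary(libName);
        } catch (UnsatisfiedLinkError e) {
            String path = System.getProperty("java.library.path");
            System.err.println("java.library.path at failure: " + path);
            System.err.println("matching files on that path: " + findCandidates(libName, path));
            System.err.println("loader of calling class: " + LibraryPathCheck.class.getClassLoader());
            throw e;
        }
    }
}
```

Replacing the static block with 
static { LibraryPathCheck.loadWithDiagnostics("dummy2native"); } would 
then show, at the moment of the failure, whether the file was on the path 
the JVM reported.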