Re: Off heap memory issue

2017-12-13 Thread Piotr Nowojski
Hi, OOMs from Metaspace usually mean that classes loaded from your user jars cannot be unloaded, i.e. something is still holding a reference into the user-code classloader: https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/debugging_classloading.html#unloading-of-dynamically-loaded-classes
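A minimal, hypothetical sketch of the leak pattern behind this (the names `LeakSketch`, `GLOBAL_REGISTRY` and `onJobStart` are illustrative, not Flink APIs): some structure that outlives the job, for example a static registry in a shared library or something like `java.sql.DriverManager`, keeps a strong reference to an object whose class was loaded by the per-job user-code classloader, so that classloader and its classes in Metaspace can never be collected:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a classloader leak. GLOBAL_REGISTRY stands in for
// any long-lived structure above the user-code classloader (a driver
// registry, a static cache in a shared library, a lingering thread, ...).
public class LeakSketch {
    // Lives for the whole TaskManager lifetime, not per job.
    static final List<Object> GLOBAL_REGISTRY = new ArrayList<>();

    // Called from user code on job start; nothing ever removes the entry,
    // so the referenced object, and through it the user-code classloader
    // and its classes in Metaspace, stays reachable forever.
    static void onJobStart(Object userCodeObject) {
        GLOBAL_REGISTRY.add(userCodeObject);
    }

    public static void main(String[] args) {
        for (int job = 0; job < 200; job++) {
            onJobStart(new Object()); // stands in for a user-code instance
        }
        System.out.println("leaked registrations: " + GLOBAL_REGISTRY.size());
    }
}
```

The fix is always the same shape: deregister on job shutdown, so the entry (and the classloader behind it) becomes unreachable.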

Re: Off heap memory issue

2017-12-12 Thread Javier Lopez
Hi Piotr, We found out what the problem in the workers was. After setting a value for -XX:MaxMetaspaceSize we started to get OOM exceptions from the Metaspace. We then looked at how Flink manages the user classes here
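For reference, such a cap can be passed to the TaskManager JVMs via `env.java.opts` in `flink-conf.yaml` (the 256m value is an arbitrary example, not a recommendation; `-XX:+TraceClassUnloading` is a JDK 8 flag):

```yaml
# flink-conf.yaml -- example only; pick a limit that fits your deployment.
# Capping Metaspace makes a classloader leak fail fast with a clear
# "OutOfMemoryError: Metaspace" instead of silently growing off-heap usage.
# The extra flags log class loading/unloading, useful for confirming that
# user classes are (not) being unloaded between jobs.
env.java.opts: "-XX:MaxMetaspaceSize=256m -verbose:class -XX:+TraceClassUnloading"
```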

Re: Off heap memory issue

2017-11-15 Thread Piotr Nowojski
Hi, I have been able to observe some off-heap memory “issues” by submitting the Kafka job provided by Javier Lopez (in a different mailing thread). TL;DR: there was no memory leak; the memory pools “Metaspace” and “Compressed Class Space” just grow in size over time and are only rarely garbage collected
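The growth of those two pools can be watched from inside the JVM with the standard `java.lang.management` API (a small sketch; the pool names are HotSpot-specific, and “Compressed Class Space” only exists when compressed class pointers are enabled):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

// Prints the usage of the non-heap pools that were growing in the tests.
// Run it periodically (or expose it as a metric) to see whether the pools
// shrink after a full GC or only ever grow.
public class NonHeapPools {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            if (name.contains("Metaspace") || name.contains("Compressed Class Space")) {
                System.out.printf("%s: used=%,d committed=%,d%n",
                        name, pool.getUsage().getUsed(), pool.getUsage().getCommitted());
            }
        }
    }
}
```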

Re: Off heap memory issue

2017-11-13 Thread Flavio Pompermaier
Unfortunately the issue I've opened [1] was not a problem of Flink; it was just caused by an ever-increasing job plan. So no help from that... Let's hope to find out the real source of the problem. Maybe using -Djdk.nio.maxCachedBufferSize could help (but I didn't try it yet). Best, Flavio [1]

Re: Off heap memory issue

2017-10-18 Thread Kien Truong
Hi, We saw a similar issue in one of our jobs due to a ByteBuffer memory leak [1]. We fixed it using the solution in the article, setting -Djdk.nio.maxCachedBufferSize. This option is available from Java 8u102 onwards. Best regards, Kien [1] http://www.evanjones.ca/java-bytebuffer-leak.html
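The mechanism from the linked article, sketched: when a *heap* ByteBuffer is written to a channel, NIO copies it through a per-thread *direct* buffer that stays cached at the largest size ever requested on that thread; `-Djdk.nio.maxCachedBufferSize=<bytes>` stops oversized buffers from being cached. A minimal illustration (file name and sizes are arbitrary):

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class HeapBufferWrite {
    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("nio-cache-demo", ".bin");
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            // A heap buffer: NIO cannot hand its backing array to the OS,
            // so it copies the data into a per-thread direct buffer of at
            // least this capacity. Without jdk.nio.maxCachedBufferSize that
            // direct buffer stays cached for the life of the thread, which
            // looks like an off-heap leak in thread-pooled servers.
            ByteBuffer heap = ByteBuffer.allocate(8 * 1024 * 1024);
            while (heap.hasRemaining()) {
                ch.write(heap);
            }
        }
        System.out.println("wrote " + Files.size(tmp) + " bytes");
        Files.delete(tmp);
    }
}
```

Passing a direct buffer yourself, or capping the cache with the flag, avoids the retained allocation.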

Re: Off heap memory issue

2017-10-18 Thread Flavio Pompermaier
We also faced the same problem, but the number of jobs we can run before restarting the cluster depends on the volume of data shuffled around the network. We even had problems with a single job, and in order to avoid OOM issues we had to add some configuration to limit Netty memory usage,
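For reference, in the Flink 1.3 era the network stack's (Netty's) buffer memory was bounded mainly by two keys; the exact key names vary between Flink versions, so treat these values as an example, not a recommendation:

```yaml
# flink-conf.yaml (Flink 1.3-era keys; check the docs for your version).
# Total network buffer memory per TaskManager is roughly
# numberOfBuffers * segment-size.
taskmanager.network.numberOfBuffers: 2048
taskmanager.memory.segment-size: 32768
```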

Re: Off heap memory issue

2017-10-18 Thread Javier Lopez
Hi Robert, Sorry for the late reply. We did a lot of tests trying to identify whether the problem was in our custom sources/sinks. We figured out that none of our custom components was causing this problem. We came up with a small test and realized that the Flink nodes run out of non-heap JVM memory

Re: Off heap memory issue

2017-08-30 Thread Robert Metzger
Hi Javier, I'm not aware of such issues with Flink, but if you could give us some more details on your setup, I might get some more ideas on what to look for. Are you using the RocksDBStateBackend? (RocksDB does some JNI allocations that could potentially leak memory.) Also, are you passing

Off heap memory issue

2017-08-28 Thread Javier Lopez
Hi all, we are starting a lot of Flink jobs (streaming), and after we have started 200 or more jobs we see that the non-heap memory in the TaskManagers increases a lot, to the point of killing the instances. We found out that every time we start a new job, the committed non-heap memory increases
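That committed non-heap figure can be sampled in-process with the standard JMX memory bean (a minimal sketch; logging it after each job submission makes the per-job growth visible, and the same number is what jconsole's non-heap graph shows):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Snapshot of committed non-heap memory (Metaspace, code cache, etc.).
// Log it after each job submission to quantify the growth described above.
public class NonHeapSnapshot {
    static long committedNonHeapBytes() {
        MemoryUsage nonHeap = ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage();
        return nonHeap.getCommitted();
    }

    public static void main(String[] args) {
        System.out.println("committed non-heap: " + committedNonHeapBytes() + " bytes");
    }
}
```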