Hi everyone, 

since upgrading to Flink 1.1.3 we have been seeing frequent TaskManager 
failures caused by PermGen OutOfMemoryErrors. Monitoring the PermGen size on 
one of the TaskManagers shows that each job (new submissions as well as 
restarts) adds a few MB that are never collected. Eventually the OOME occurs. 
This affects all our jobs, streaming and batch, on YARN 2.4 as well as in 
standalone mode. 
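
For anyone who wants to reproduce the measurement, here is a minimal sketch of 
how the class-metadata pool can be read from inside a JVM with the standard 
java.lang.management API. The class name ClassMetadataPoolMonitor and the 
pool-name matching are my own assumptions (HotSpot calls the pool 
"PS Perm Gen", "CMS Perm Gen" or similar up to Java 7 and "Metaspace" from 
Java 8 on); it is not something Flink provides.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public class ClassMetadataPoolMonitor {

    /** Assumed naming: "... Perm Gen" on HotSpot up to Java 7, "Metaspace" from Java 8 on. */
    static boolean isClassMetadataPool(String poolName) {
        return poolName.contains("Perm Gen") || poolName.equals("Metaspace");
    }

    /** Returns one line per class-metadata pool with its current usage. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (!isClassMetadataPool(pool.getName())) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            long usedMb = usage.getUsed() / (1024L * 1024L);
            // getMax() returns -1 when no maximum is defined for the pool
            String max = usage.getMax() < 0
                    ? "undefined"
                    : (usage.getMax() / (1024L * 1024L)) + " MB";
            lines.add(pool.getName() + ": used " + usedMb + " MB, max " + max);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```

Calling snapshot() before and after each job submission should show whether 
the used value keeps growing from job to job.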

On Flink 1.0.2 this was not a problem, but I will investigate it further.

Our assumption is that Flink somehow keeps a reference to one of the classes 
shipped in our jar, which prevents the whole class loader from being garbage 
collected. Our jars do not include any Flink dependencies (those are declared 
compileOnly), but of course they do include many other libraries.
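
To illustrate the mechanism we suspect, here is a small self-contained sketch. 
It is not Flink code: FRAMEWORK_CACHE is a hypothetical stand-in for any 
long-lived structure outside the job, and the dynamic proxy is only a 
convenient way to get a class defined by a throwaway class loader without 
needing a jar. A single cached object is enough to keep its class, and with it 
the whole class loader, reachable.

```java
import java.lang.ref.WeakReference;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

public class ClassLoaderPinDemo {

    // Hypothetical long-lived cache living outside the job's class loader.
    static final List<Object> FRAMEWORK_CACHE = new ArrayList<Object>();

    /** Simulates one job: creates a fresh class loader and leaks one object into the cache. */
    static WeakReference<ClassLoader> runJob() {
        ClassLoader jobLoader =
                new URLClassLoader(new URL[0], ClassLoaderPinDemo.class.getClassLoader());
        InvocationHandler noop = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) {
                return null;
            }
        };
        // The proxy class is defined by jobLoader, so this object's class belongs to it.
        Object userObject =
                Proxy.newProxyInstance(jobLoader, new Class<?>[] {Runnable.class}, noop);
        FRAMEWORK_CACHE.add(userObject); // the "leak"
        // Only a weak reference escapes, so the cache entry is the sole strong path.
        return new WeakReference<ClassLoader>(jobLoader);
    }

    public static void main(String[] args) {
        WeakReference<ClassLoader> ref = runJob();
        System.gc();
        System.out.println("still reachable while cached: " + (ref.get() != null));

        FRAMEWORK_CACHE.clear();
        System.gc(); // only a hint to the JVM, collection is not guaranteed
        System.out.println("collected after clearing cache: " + (ref.get() == null));
    }
}
```

While the cache holds the object, the weak reference can never be cleared. 
After the cache is emptied the loader becomes eligible for collection, 
although System.gc() is only a request, so the second line may vary between 
runs and JVMs.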

Any ideas anyone? 

Cheers and thank you, 

Konstantin 

sent from my phone. Plz excuse brevity and tpyos.
---
Konstantin Knauf *konstantin.kn...@tngtech.com * +49-174-3413182
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
