Rohit Singh created FLINK-9080:
----------------------------------
Summary: Flink Scheduler goes OOM, suspecting a memory leak
Key: FLINK-9080
URL: https://issues.apache.org/jira/browse/FLINK-9080
Project: Flink
Issue Type: Bug
Components: JobManager
Affects Versions: 1.4.0
Reporter: Rohit Singh
Attachments: Classloaded vs unloaded.png, Top Level packages.JPG, Top
level classes.JPG
Running Flink version 1.4.0 on Mesos. The scheduler runs alongside the
JobManager in a single container, whereas the TaskManagers run in separate
containers.
A couple of jobs were running continuously, and the Flink scheduler was working
properly along with the TaskManagers. Due to some change in the data, one of
the jobs started failing continuously. In the meantime, there was a surge in
the Flink scheduler's memory usage, and it eventually died of an OOM.
A memory dump analysis was done. The findings were as follows:
!Top Level packages.JPG!!Top level classes.JPG!!Classloaded vs unloaded.png!
* The majority of the top-level packages retaining heap pointed to
Flinkuserclassloader, glassfish (the Jersey library), and Finalizer classes
(see the top-level packages image).
* The top-level classes belonged to Flinkuserclassloader (see the top-level
classes image).
* Very few classes were unloaded compared with the number loaded (see the
attached image), in spite of adding the JVM options
-XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -
* There were also custom classes that were duplicated during subsequent
class uploads.
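To track the loaded vs. unloaded class counts over time without taking a heap
dump, something like the following minimal sketch could be run inside the
affected JVM. It uses only the standard java.lang.management API. The class
name ClassLoadStats and the snapshot() helper are made up for illustration and
are not part of Flink.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Flink: reads the JVM-wide class loading counters.
final class ClassLoadStats {

    /** Returns {total loaded since JVM start, unloaded since JVM start, currently loaded}. */
    static long[] snapshot() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return new long[] {
            bean.getTotalLoadedClassCount(),
            bean.getUnloadedClassCount(),
            bean.getLoadedClassCount()
        };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.printf(
            "total loaded=%d, unloaded=%d, currently loaded=%d%n", s[0], s[1], s[2]);
    }
}
```

If the currently loaded count keeps growing across job restarts while the
unloaded count stays flat, that would be consistent with user-code class
loaders not being released, though it does not by itself show what is holding
them.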
Please find attached all the images from the heap dump. Can you suggest some
pointers on how to overcome this issue?