[
https://issues.apache.org/jira/browse/FLINK-19005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183928#comment-17183928
]
ShenDa edited comment on FLINK-19005 at 8/25/20, 10:58 AM:
-----------------------------------------------------------
[~chesnay] I'd like to understand how you concluded from the dump files that
the class leak is caused by java.sql.DriverManager. I still have no idea how
to locate the root cause.
BTW, I tried several more times to reproduce the metaspace OOM with the
WordCount job. This time Flink ran fine and no metaspace OOM occurred, so that
was my mistake.
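For anyone repeating this reproduction, here is a minimal sketch of how used metaspace could be read between job submissions. It assumes a HotSpot JVM, where the memory pool is named "Metaspace"; the class name MetaspaceProbe is made up for illustration, and the code only reports on the JVM it runs in, so it would have to be called from inside the TaskManager process to be meaningful.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of Flink.
public class MetaspaceProbe {

    /**
     * Returns the bytes currently used by the pool named "Metaspace"
     * (the HotSpot name; an assumption for other JVMs), or -1 if no
     * such pool is found or its usage is unavailable.
     */
    static long usedMetaspaceBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1L : usage.getUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        // Log this after each job run; a value that keeps climbing and
        // never drops after GC would match the growth described in this issue.
        System.out.println("Metaspace used (bytes): " + usedMetaspaceBytes());
    }
}
```

Comparing the value after the first run with the value after several runs should show whether metaspace is really growing per execution.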
was (Author: dadashen):
[~chesnay] I'd like to understand how you analyzed from the dump files that
the class leak is caused by java.sql.DriverManager. I still have no idea how
to locate the root cause.
BTW, I tried several more times to reproduce the metaspace OOM with the
WordCount job. This time Flink ran fine and no metaspace OOM occurred, so that
was my mistake.
> used metaspace grow on every execution
> --------------------------------------
>
> Key: FLINK-19005
> URL: https://issues.apache.org/jira/browse/FLINK-19005
> Project: Flink
> Issue Type: Bug
> Components: Client / Job Submission, Runtime / Configuration,
> Runtime / Coordination
> Affects Versions: 1.11.1
> Reporter: Guillermo Sánchez
> Assignee: Chesnay Schepler
> Priority: Major
> Attachments: heap_dump_after_10_executions.zip,
> heap_dump_after_1_execution.zip, heap_dump_echo_lee.tar.xz
>
>
> Hi!
> I'm running a Flink 1.11.1 cluster, where I execute batch jobs built with
> the DataSet API.
> I submit these jobs every day to calculate daily data.
> With every execution, the cluster's used metaspace increases by 7 MB and is
> never released.
> This ends in an OutOfMemoryError caused by Metaspace every 15 days, and I
> need to restart the cluster to clear the metaspace.
> taskmanager.memory.jvm-metaspace.size is set to 512mb.
> Any idea what could be causing this metaspace growth and why it is not
> released?
>
> ================================================
> === Summary ======================================
> ================================================
> Case 1, reported by [~gestevez]:
> * Flink 1.11.1
> * Java 11
> * Maximum Metaspace size set to 512mb
> * Custom Batch job, submitted daily
> * Requires restart every 15 days after an OOM
> Case 2, reported by [~Echo Lee]:
> * Flink 1.11.0
> * Java 11
> * G1GC
> * WordCount Batch job, submitted every second / every 5 minutes
> * TaskExecutor eventually fails with OOM
> Case 3, reported by [~DaDaShen]:
> * Flink 1.11.0
> * Java 11
> * WordCount Batch job, submitted every 5 seconds
> * growing Metaspace, eventually OOM
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)