I went through the JM & TM logs but could not find any useful clues.
The exception is actually thrown by kafka-producer-network-thread.
Maybe @Qingsheng could also take a look?
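
One general JVM point that may be relevant here (an assumption on my side,
not something the logs confirm): a thread that is still alive keeps its
context class loader strongly reachable, so a user-code class loader cannot
be collected while such a thread lingers. Below is a minimal, self-contained
Java sketch of that mechanism. The class name ClassLoaderPinning, its helper
methods, and the thread name are hypothetical and only for illustration; it
does not use any Flink or Kafka API.

```java
import java.lang.ref.WeakReference;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.CountDownLatch;

final class ClassLoaderPinning {

    // Starts a daemon thread that blocks until 'stop' is released. Its context
    // class loader is set to 'loader', as it could be for a thread created by
    // user code. The thread name is made up for this sketch.
    static Thread startLingeringThread(ClassLoader loader, CountDownLatch stop) {
        Thread t = new Thread(() -> {
            try {
                stop.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "lingering-network-thread");
        t.setContextClassLoader(loader);
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Requests a GC (only a hint to the JVM) and reports whether the loader
    // behind the weak reference has survived it.
    static boolean isPinned(WeakReference<ClassLoader> ref) {
        System.gc();
        return ref.get() != null;
    }

    public static void main(String[] args) throws Exception {
        CountDownLatch stop = new CountDownLatch(1);
        ClassLoader loader =
                new URLClassLoader(new URL[0], ClassLoaderPinning.class.getClassLoader());
        WeakReference<ClassLoader> ref = new WeakReference<>(loader);

        Thread t = startLingeringThread(loader, stop);
        loader = null; // drop our own strong reference

        // The live thread still references the loader, so it cannot be collected.
        System.out.println("pinned while thread is alive: " + isPinned(ref));

        stop.countDown();
        t.join();
        t = null; // a finished Thread object may still hold its context class loader

        // Collection is now possible but not guaranteed by System.gc().
        System.out.println("pinned after thread exit: " + isPinned(ref));
    }
}
```

If something like this is happening in the job, a thread dump of the TM taken
after a job has finished should still show threads that were created by that
job's code.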


Best,
Yangze Guo

On Thu, Apr 8, 2021 at 10:39 AM 太平洋 <495635...@qq.com> wrote:
>
> I have configured it to 512M, but the problem still exists. The memory size 
> is still 256M now.
> The TM and JM logs are attached.
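
To see which metaspace limit a JVM is actually running with, the standard
java.lang.management API can be queried from inside that JVM. The following
is a minimal sketch under some assumptions: it relies on the memory pool
being named "Metaspace" (as on HotSpot JVMs), and the class name
MetaspaceCheck and the helper formatMax are hypothetical. It only reports on
the JVM it runs in, so for a TM the same calls would have to be made from
code running inside the TM process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

final class MetaspaceCheck {

    private static final long MIB = 1024L * 1024L;

    // Formats a pool limit in MiB. The JVM reports -1 when no limit is defined.
    static String formatMax(long maxBytes) {
        return maxBytes < 0 ? "unlimited" : (maxBytes / MIB) + " MiB";
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Assumption: the pool name contains "Metaspace" (HotSpot naming).
            if (!pool.getName().contains("Metaspace")) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            System.out.println(pool.getName()
                    + ": used=" + (usage.getUsed() / MIB) + " MiB"
                    + ", max=" + formatMax(usage.getMax()));
        }
    }
}
```

If the reported max does not match the configured value, the setting has
probably not reached the TM process.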
>
> Looking forward to your reply.
>
> ------------------ Original Message ------------------
> From: "Yangze Guo" <karma...@gmail.com>;
> Sent: Tuesday, April 6, 2021, 6:35 PM
> To: "太平洋"<495635...@qq.com>;
> Cc: "user"<user@flink.apache.org>;"guowei.mgw"<guowei....@gmail.com>;
> Subject: Re: period batch job lead to OutOfMemoryError: Metaspace problem
>
> > I have tried this method, but the problem still exists.
> How much memory did you configure for it?
>
> > is 21 instances of "org.apache.flink.util.ChildFirstClassLoader" normal
> Not quite sure about it. AFAIK, each job has its own classloader.
> Multiple tasks of the same job in the same TM share that
> classloader. The classloader is removed once no task of the job is
> running on the TM any more. A classloader without references will
> eventually be cleaned up by GC. Could you share the JM and TM logs for
> further analysis?
> I'll also involve @Guowei Ma in this thread.
>
>
> Best,
> Yangze Guo
>
> On Tue, Apr 6, 2021 at 6:05 PM 太平洋 <495635...@qq.com> wrote:
> >
> > I have tried this method, but the problem still exists.
> > According to the heap dump analysis, there are 21 instances of 
> > "org.apache.flink.util.ChildFirstClassLoader". Is that normal?
> >
> >
> > ------------------ Original Message ------------------
> > From: "Yangze Guo" <karma...@gmail.com>;
> > Sent: Tuesday, April 6, 2021, 4:32 PM
> > To: "太平洋"<495635...@qq.com>;
> > Cc: "user"<user@flink.apache.org>;
> > Subject: Re: period batch job lead to OutOfMemoryError: Metaspace problem
> >
> > I think you can try to increase the JVM metaspace option for
> > TaskManagers through taskmanager.memory.jvm-metaspace.size. [1]
> >
> > [1] 
> > https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/memory/mem_trouble/#outofmemoryerror-metaspace
> >
> > Best,
> > Yangze Guo
> >
> >
> > On Tue, Apr 6, 2021 at 4:22 PM 太平洋 <495635...@qq.com> wrote:
> > >
> > > batch job:
> > > The job reads data from S3 via SQL, applies some operators, and writes 
> > > the results to ClickHouse and Kafka.
> > > After some time, the TaskManager exits with OutOfMemoryError: Metaspace.
> > >
> > > env:
> > > flink version: 1.12.2
> > > task-manager slot count: 5
> > > deployment: standalone Kubernetes, session mode
> > > dependencies:
> > >
> > >     <dependency>
> > >       <groupId>org.apache.flink</groupId>
> > >       <artifactId>flink-connector-kafka_2.11</artifactId>
> > >       <version>${flink.version}</version>
> > >     </dependency>
> > >     <dependency>
> > >       <groupId>com.google.code.gson</groupId>
> > >       <artifactId>gson</artifactId>
> > >       <version>2.8.5</version>
> > >     </dependency>
> > >     <dependency>
> > >       <groupId>org.apache.flink</groupId>
> > >       <artifactId>flink-connector-jdbc_2.11</artifactId>
> > >       <version>${flink.version}</version>
> > >     </dependency>
> > >     <dependency>
> > >       <groupId>ru.yandex.clickhouse</groupId>
> > >       <artifactId>clickhouse-jdbc</artifactId>
> > >       <version>0.3.0</version>
> > >     </dependency>
> > >     <dependency>
> > >       <groupId>org.apache.flink</groupId>
> > >       <artifactId>flink-parquet_2.11</artifactId>
> > >       <version>${flink.version}</version>
> > >     </dependency>
> > >     <dependency>
> > >       <groupId>org.apache.flink</groupId>
> > >       <artifactId>flink-json</artifactId>
> > >       <version>${flink.version}</version>
> > >     </dependency>
> > >
> > >
> > > heap dump1:
> > >
> > > Leak Suspects
> > >
> > >   Problem Suspect 1
> > >
> > > 21 instances of "org.apache.flink.util.ChildFirstClassLoader", loaded by 
> > > "sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0" occupy 29,656,880 
> > > (41.16%) bytes.
> > >
> > > Biggest instances:
> > >
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73ca2a1e8 - 1,474,760 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d2af820 - 1,474,168 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cdcaa10 - 1,474,160 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cf6aab0 - 1,474,160 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d1111d8 - 1,474,160 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d2bb108 - 1,474,128 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73de202e0 - 1,474,120 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73dadc778 - 1,474,112 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d5f70e8 - 1,474,064 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d93aa38 - 1,474,064 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e179638 - 1,474,064 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73dc80418 - 1,474,056 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73dfcda60 - 1,474,056 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e4bcd38 - 1,474,056 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d6006e8 - 1,474,032 (2.05%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73c7d2ad8 - 1,461,944 (2.03%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73ca1bb98 - 1,460,752 (2.03%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73bf203f0 - 1,460,744 (2.03%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e3284a8 - 1,445,232 (2.01%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e65de00 - 1,445,232 (2.01%) bytes.
> > >
> > >
> > >
> > > Keywords
> > > org.apache.flink.util.ChildFirstClassLoader
> > > sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0
> > >
> > >   Problem Suspect 2
> > >
> > > 34,407 instances of "org.apache.flink.core.memory.HybridMemorySegment", 
> > > loaded by "sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0" occupy 
> > > 7,707,168 (10.70%) bytes.
> > >
> > > Keywords
> > > org.apache.flink.core.memory.HybridMemorySegment
> > > sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0
> > >
> > > Details »
> > >
> > >
> > >
> > > heap dump2:
> > >
> > > Leak Suspects
> > >
> > >   Problem Suspect 1
> > >
> > > 21 instances of "org.apache.flink.util.ChildFirstClassLoader", loaded by 
> > > "sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0" occupy 26,061,408 
> > > (30.68%) bytes.
> > >
> > > Biggest instances:
> > >
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e9e9930 - 1,474,224 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73edce0b8 - 1,474,224 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73f1ad7d0 - 1,474,168 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73f3e5118 - 1,474,168 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73f5d3fe0 - 1,474,168 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73ebd8d28 - 1,474,160 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73efc00c0 - 1,474,160 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e2251a8 - 1,474,136 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cc24af0 - 1,474,064 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cdca3e0 - 1,474,064 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cf6f860 - 1,474,064 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d114768 - 1,474,064 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73ca6f878 - 1,474,056 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d2b7640 - 1,474,056 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d2c1d80 - 1,474,040 (1.74%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73c7e2868 - 1,469,720 (1.73%) bytes.
> > > > org.apache.flink.util.ChildFirstClassLoader @ 0x73bf34a98 - 1,460,808 (1.72%) bytes.
> > >
> > >
> > >
> > > Keywords
> > > org.apache.flink.util.ChildFirstClassLoader
> > > sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0
> > >
> > >   Problem Suspect 2
> > >
> > > 4 instances of 
> > > "org.apache.flink.streaming.runtime.tasks.OneInputStreamTask", loaded by 
> > > "sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0" occupy 11,644,200 
> > > (13.71%) bytes.
> > >
> > > Biggest instances:
> > >
> > > > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask @ 0x73e2d0cb0 - 4,364,536 (5.14%) bytes.
> > > > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask @ 0x73d62fb88 - 3,643,576 (4.29%) bytes.
> > > > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask @ 0x73dae0270 - 3,635,952 (4.28%) bytes.
> > >
> > >
> > >
> > > Keywords
> > > sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0
> > > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask
> > >
> > >
