I went through the JM and TM logs but could not find any useful clues. The exception is actually thrown by the kafka-producer-network-thread. Maybe @Qingsheng could also take a look?
Best,
Yangze Guo

On Thu, Apr 8, 2021 at 10:39 AM 太平洋 <495635...@qq.com> wrote:
>
> I have configured it to 512M, but the problem still exists. Now the memory size is
> still 256M.
> The TM and JM logs are attached.
>
> Looking forward to your reply.
>
> ------------------ Original Message ------------------
> From: "Yangze Guo" <karma...@gmail.com>;
> Sent: Tuesday, April 6, 2021, 6:35 PM
> To: "太平洋"<495635...@qq.com>;
> Cc: "user"<user@flink.apache.org>;"guowei.mgw"<guowei....@gmail.com>;
> Subject: Re: period batch job lead to OutOfMemoryError: Metaspace problem
>
> > I have tried this method, but the problem still exists.
> How much memory did you configure for it?
>
> > is 21 instances of "org.apache.flink.util.ChildFirstClassLoader" normal
> Not quite sure about it. AFAIK, each job has its own classloader.
> Multiple tasks of the same job in the same TM share the same
> classloader. The classloader is removed once no more tasks of the job
> are running on the TM. A classloader without references will eventually
> be cleaned up by GC. Could you share the JM and TM logs for further analysis?
> I'll also involve @Guowei Ma in this thread.
>
>
> Best,
> Yangze Guo
>
> On Tue, Apr 6, 2021 at 6:05 PM 太平洋 <495635...@qq.com> wrote:
> >
> > I have tried this method, but the problem still exists.
> > According to the heap dump analysis, is 21 instances of
> > "org.apache.flink.util.ChildFirstClassLoader" normal?
> >
> >
> > ------------------ Original Message ------------------
> > From: "Yangze Guo" <karma...@gmail.com>;
> > Sent: Tuesday, April 6, 2021, 4:32 PM
> > To: "太平洋"<495635...@qq.com>;
> > Cc: "user"<user@flink.apache.org>;
> > Subject: Re: period batch job lead to OutOfMemoryError: Metaspace problem
> >
> > I think you can try to increase the JVM metaspace size for
> > TaskManagers through taskmanager.memory.jvm-metaspace.size. [1]
> >
> > [1]
> > https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/memory/mem_trouble/#outofmemoryerror-metaspace
> >
> > Best,
> > Yangze Guo
> >
> >
> > On Tue, Apr 6, 2021 at 4:22 PM 太平洋 <495635...@qq.com> wrote:
> > >
> > > batch job:
> > > Read data from S3 via SQL, apply some operators, and write the data to
> > > ClickHouse and Kafka.
> > > After some time, the task manager quits with OutOfMemoryError: Metaspace.
> > >
> > > env:
> > > flink version: 1.12.2
> > > task-manager slot count: 5
> > > deployment: standalone kubernetes session mode
> > > dependencies:
> > >
> > > <dependency>
> > >   <groupId>org.apache.flink</groupId>
> > >   <artifactId>flink-connector-kafka_2.11</artifactId>
> > >   <version>${flink.version}</version>
> > > </dependency>
> > > <dependency>
> > >   <groupId>com.google.code.gson</groupId>
> > >   <artifactId>gson</artifactId>
> > >   <version>2.8.5</version>
> > > </dependency>
> > > <dependency>
> > >   <groupId>org.apache.flink</groupId>
> > >   <artifactId>flink-connector-jdbc_2.11</artifactId>
> > >   <version>${flink.version}</version>
> > > </dependency>
> > > <dependency>
> > >   <groupId>ru.yandex.clickhouse</groupId>
> > >   <artifactId>clickhouse-jdbc</artifactId>
> > >   <version>0.3.0</version>
> > > </dependency>
> > > <dependency>
> > >   <groupId>org.apache.flink</groupId>
> > >   <artifactId>flink-parquet_2.11</artifactId>
> > >   <version>${flink.version}</version>
> > > </dependency>
> > > <dependency>
> > >   <groupId>org.apache.flink</groupId>
> > >   <artifactId>flink-json</artifactId>
> > >   <version>${flink.version}</version>
> > > </dependency>
> > >
> > >
> > > heap dump1:
> > >
> > > Problem Suspect 1
> > >
> > > 21 instances of "org.apache.flink.util.ChildFirstClassLoader", loaded by
> > > "sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0" occupy 29,656,880
> > > (41.16%) bytes.
> > >
> > > Biggest instances:
> > >
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73ca2a1e8 - 1,474,760 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d2af820 - 1,474,168 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cdcaa10 - 1,474,160 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cf6aab0 - 1,474,160 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d1111d8 - 1,474,160 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d2bb108 - 1,474,128 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73de202e0 - 1,474,120 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73dadc778 - 1,474,112 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d5f70e8 - 1,474,064 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d93aa38 - 1,474,064 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e179638 - 1,474,064 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73dc80418 - 1,474,056 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73dfcda60 - 1,474,056 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e4bcd38 - 1,474,056 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d6006e8 - 1,474,032 (2.05%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73c7d2ad8 - 1,461,944 (2.03%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73ca1bb98 - 1,460,752 (2.03%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73bf203f0 - 1,460,744 (2.03%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e3284a8 - 1,445,232 (2.01%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e65de00 - 1,445,232 (2.01%) bytes.
> > >
> > > Keywords
> > > org.apache.flink.util.ChildFirstClassLoader
> > > sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0
> > >
> > > Problem Suspect 2
> > >
> > > 34,407 instances of "org.apache.flink.core.memory.HybridMemorySegment",
> > > loaded by "sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0" occupy
> > > 7,707,168 (10.70%) bytes.
> > >
> > > Keywords
> > > org.apache.flink.core.memory.HybridMemorySegment
> > > sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0
> > >
> > >
> > > heap dump2:
> > >
> > > Problem Suspect 1
> > >
> > > 21 instances of "org.apache.flink.util.ChildFirstClassLoader", loaded by
> > > "sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0" occupy 26,061,408
> > > (30.68%) bytes.
> > >
> > > Biggest instances:
> > >
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e9e9930 - 1,474,224 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73edce0b8 - 1,474,224 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73f1ad7d0 - 1,474,168 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73f3e5118 - 1,474,168 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73f5d3fe0 - 1,474,168 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73ebd8d28 - 1,474,160 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73efc00c0 - 1,474,160 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73e2251a8 - 1,474,136 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cc24af0 - 1,474,064 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cdca3e0 - 1,474,064 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73cf6f860 - 1,474,064 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d114768 - 1,474,064 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73ca6f878 - 1,474,056 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d2b7640 - 1,474,056 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73d2c1d80 - 1,474,040 (1.74%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73c7e2868 - 1,469,720 (1.73%) bytes.
> > > org.apache.flink.util.ChildFirstClassLoader @ 0x73bf34a98 - 1,460,808 (1.72%) bytes.
> > >
> > > Keywords
> > > org.apache.flink.util.ChildFirstClassLoader
> > > sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0
> > >
> > > Problem Suspect 2
> > >
> > > 4 instances of
> > > "org.apache.flink.streaming.runtime.tasks.OneInputStreamTask", loaded by
> > > "sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0" occupy 11,644,200
> > > (13.71%) bytes.
> > >
> > > Biggest instances:
> > >
> > > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask @ 0x73e2d0cb0 - 4,364,536 (5.14%) bytes.
> > > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask @ 0x73d62fb88 - 3,643,576 (4.29%) bytes.
> > > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask @ 0x73dae0270 - 3,635,952 (4.28%) bytes.
> > >
> > > Keywords
> > > sun.misc.Launcher$AppClassLoader @ 0x73b2d42e0
> > > org.apache.flink.streaming.runtime.tasks.OneInputStreamTask
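
A side note on the classloader question in this thread: a classloader can only be collected (and its metaspace freed) once nothing references it, and a thread that is still alive keeps its context classloader reachable. The sketch below is a plain-JDK diagnostic, not a Flink API. The class name ClassLoaderPins, the method threadsPinning, and the thread name demo-network-thread are made up for illustration, and the empty URLClassLoader only stands in for a per-job user-code classloader. It lists the live threads whose context classloader is a given loader. Whether a lingering thread is what keeps the 21 ChildFirstClassLoader instances alive in this case is an assumption that the logs do not confirm.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

public class ClassLoaderPins {

    /** Returns the names of live threads whose context classloader is {@code target}. */
    public static List<String> threadsPinning(ClassLoader target) {
        List<String> names = new ArrayList<>();
        // getAllStackTraces() only reports threads that are currently alive.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getContextClassLoader() == target) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for a job's user-code classloader (empty classpath).
        URLClassLoader jobLoader =
                new URLClassLoader(new URL[0], ClassLoaderPins.class.getClassLoader());

        // A background thread that outlives the "job" and still points at its loader.
        Thread lingering = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException ignored) {
                // Interrupted: let the thread finish.
            }
        }, "demo-network-thread");
        lingering.setDaemon(true);
        lingering.setContextClassLoader(jobLoader);
        lingering.start();

        System.out.println("pinned by: " + threadsPinning(jobLoader));

        // Once the thread has terminated it no longer shows up as holding the loader.
        lingering.interrupt();
        lingering.join();
        System.out.println("pinned by: " + threadsPinning(jobLoader));
    }
}
```

Run against a real TaskManager, the same idea would have to be applied from inside the JVM (or by inspecting the thread list in the heap dump) with the actual ChildFirstClassLoader instances as the target. The first call here should report the demo thread and the second should report none.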