Given the JCMD numbers above, do you think I should be OK with 2GB of metaspace? They were captured on that one node while all jobs were running on the cluster.
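As a rough sanity check on that question, the arithmetic can be sketched in a few lines of Java. The figures are the "Metaspace used" value from the jcmd snapshot quoted in this thread and the 1024m and 2048m values of `taskmanager.memory.jvm-metaspace.size`; the class and method names are made up for illustration. It works out to roughly 95% of the 1024m limit in use at snapshot time, and a bit under 48% if the limit were 2048m. A single snapshot is not a peak, so this only suggests headroom and does not prove 2GB is enough.

```java
import java.util.Locale;

public class MetaspaceHeadroom {
    /** Fraction of the limit currently in use (both values in K, as jcmd prints them). */
    static double utilization(long usedK, long limitK) {
        return (double) usedK / limitK;
    }

    /** Remaining room under the limit, in K. */
    static long headroomK(long usedK, long limitK) {
        return limitK - usedK;
    }

    public static void main(String[] args) {
        // "Metaspace used" from the jcmd GC.heap_info snapshot in this thread.
        long usedK = 998_460L;
        // Current and proposed jvm-metaspace.size values, in megabytes.
        long[] limitsM = {1024L, 2048L};
        for (long limitM : limitsM) {
            long limitK = limitM * 1024L;
            System.out.println(String.format(Locale.ROOT,
                    "limit %dm: %.1f%% used, %d K headroom",
                    limitM, 100.0 * utilization(usedK, limitK), headroomK(usedK, limitK)));
        }
    }
}
```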
I set it to 2GB, but none of the above numbers indicated a max of 2GB metaspace.

On Mon., Dec. 27, 2021, 10:47 a.m. John Smith, <java.dev....@gmail.com> wrote:

> Ok all settings above are for smaller dev cluster and I'm experimenting to
> set metasize to 2GB. It runs same jobs as production just less volume in
> terms of data.
>
> The below snapshot of JCMD are of a slightly bigger task manager and the
> active cluster... It also once in a while does metaspace so thinking
> updating metaspace to 2GB. This is what started the actual investigation.
>
> taskmanager.memory.flink.size: 10240m
> taskmanager.memory.jvm-metaspace.size: 1024m <------ Up to 2GB.
> taskmanager.numberOfTaskSlots: 12
>
> jcmd 2128 GC.heap_info
> 2128:
>  garbage-first heap   total 5111808K, used 2530277K [0x0000000688800000, 0x0000000688a04e00, 0x00000007c0800000)
>   region size 2048K, 810 young (1658880K), 4 survivors (8192K)
>  Metaspace       used 998460K, capacity 1022929K, committed 1048576K, reserved 1972224K
>   class space    used 112823K, capacity 121063K, committed 126024K, reserved 1048576K
>
> On Mon, 27 Dec 2021 at 10:27, John Smith <java.dev....@gmail.com> wrote:
>
>> Yes standalone cluster. 3 zoo, 3 job, 3 tasks.
>>
>> The task managers have taskslots at double core. So 2*4
>>
>> I think metaspace of 2GB is ok. I'll try to get some jcmd stats.
>>
>> The jobs are fairly straightforward ETL they read from Kafka, do some
>> json parsing, using vertx.io json parser and either Insert to apache
>> ignite cache or jdbc db.
>>
>>
>> On Sun., Dec. 26, 2021, 8:46 p.m. Xintong Song, <tonysong...@gmail.com>
>> wrote:
>>
>>> Hi John,
>>>
>>> Sounds to me you have a Flink standalone cluster deployed directly on
>>> physical hosts. If that is the case, use `t.m.flink.size` instead of
>>> `t.m.process.size`. The latter does not limit the overall memory
>>> consumption of the processes, and is only used for calculating how much
>>> non-JVM memory the process should leave in a containerized setup, which
>>> does no good in a non-containerized setup.
>>>
>>> When running into a Metaspace OOM, the standard solution is to increase
>>> `t.m.jvm-metaspace.size`. If this is impractical due to the physical
>>> limitations, you may also try to decrease `taskmanager.numberOfTaskSlots`.
>>> If you have multiple jobs submitted to a shared Flink cluster, decreasing
>>> the number of slots in a task manager should also reduce the amount of
>>> classes loaded by the JVM, thus requiring less metaspace.
>>>
>>> Thank you~
>>>
>>> Xintong Song
>>>
>>>
>>>
>>> On Mon, Dec 27, 2021 at 9:08 AM John Smith <java.dev....@gmail.com>
>>> wrote:
>>>
>>>> Ok I tried taskmanager.memory.process.size: 7168m
>>>>
>>>> It's worse, the task manager can barely start before it throws
>>>> java.lang.OutOfMemoryError: Metaspace
>>>>
>>>> I will try...
>>>> taskmanager.memory.flink.size: 5120m
>>>> taskmanager.memory.jvm-metaspace.size: 2048m
>>>>
>>>>
>>>> On Sun, 26 Dec 2021 at 19:46, John Smith <java.dev....@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi running Flink 1.10
>>>>>
>>>>> I have
>>>>>
>>>>> taskmanager.memory.flink.size: 6144m
>>>>> taskmanager.memory.jvm-metaspace.size: 1024m
>>>>> taskmanager.numberOfTaskSlots: 8
>>>>> parallelism.default: 1
>>>>>
>>>>> 1- The host has a physical ram of 8GB. I'm better off just to
>>>>> configure "taskmanager.memory.process.size" as 7GB and let flink figure it
>>>>> out?
>>>>> 2- Is there a way for me to calculate how much metaspace my jobs
>>>>> require or are using?
>>>>>
>>>>> 2021-12-24 04:53:32,511 ERROR
>>>>> org.apache.flink.runtime.util.FatalExitExceptionHandler - FATAL:
>>>>> Thread 'flink-akka.actor.default-dispatcher-86' produced an uncaught
>>>>> exception. Stopping the process...
>>>>> java.lang.OutOfMemoryError: Metaspace
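On question 2 in the original post (how to see how much metaspace is in use), one option besides `jcmd` is the standard `java.lang.management` API, sketched below. This is only an illustration under a few assumptions: the pool name "Metaspace" is what HotSpot JVMs report and may differ on other JVMs, the class name `MetaspaceProbe` is made up, and the code only sees the JVM it runs in, so to observe a task manager it would have to run inside that process (or be read remotely over JMX).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class MetaspaceProbe {
    /**
     * Returns the usage of the memory pool named "Metaspace" in the current JVM,
     * or null if no pool with that name exists (assumption: HotSpot naming).
     */
    static MemoryUsage metaspaceUsage() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                return pool.getUsage();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        MemoryUsage usage = metaspaceUsage();
        if (usage == null) {
            System.out.println("No memory pool named Metaspace on this JVM");
            return;
        }
        // getMax() returns -1 when no maximum is defined (no -XX:MaxMetaspaceSize).
        System.out.println("used bytes:      " + usage.getUsed());
        System.out.println("committed bytes: " + usage.getCommitted());
        System.out.println("max bytes:       " + usage.getMax());
    }
}
```

Logging these values periodically while all jobs are running would give a trend instead of the single point a `jcmd` snapshot provides.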