These metrics should also be available via REST. You can check the original design doc [1] for which metrics the UI is using.
Thank you~

Xintong Song

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-102%3A+Add+More+Metrics+to+TaskManager

On Tue, Apr 13, 2021 at 9:08 PM Alexis Sarda-Espinosa <alexis.sarda-espin...@microfocus.com> wrote:

> Hi Xintong,
>
> Thanks for the info. Is there any way to access these metrics outside of
> the UI? I suppose Flink’s reporters might provide them, but will they also
> be available through the REST interface (or another interface)?
>
> Regards,
> Alexis.
>
> *From:* Xintong Song <tonysong...@gmail.com>
> *Sent:* Tuesday, 13 April 2021 14:30
> *To:* Alexis Sarda-Espinosa <alexis.sarda-espin...@microfocus.com>
> *Cc:* user@flink.apache.org
> *Subject:* Re: Clarification about Flink's managed memory and metric
> monitoring
>
> Hi Alexis,
>
> First of all, I strongly recommend not to look into the JVM metrics. These
> metrics are fetched directly from JVM and do not well correspond to Flink's
> memory configurations. They were introduced a long time ago and are
> preserved mostly for compatibility. IMO, they bring more confusion than
> convenience. In Flink-1.12, there is a newly designed TM metrics page in
> the web ui, which clearly shows how the metrics correspond to Flink's
> memory configurations (if any).
>
> Concerning your questions.
>
> 1. Yes, increasing framework/task off-heap memory sizes should increase
>    the direct memory capacity. Increasing the network memory size should
>    also do that.
>
> 2. When 'state.backend.rocksdb.memory.managed' is true, RocksDB uses
>    managed memory. Managed memory is not measured by any JVM metrics. It's
>    not managed by JVM, meaning that it's not limited by
>    '-XX:MaxDirectMemorySize' and is not controlled by the garbage
>    collectors.
>
> Thank you~
>
> Xintong Song
>
> On Tue, Apr 13, 2021 at 7:53 PM Alexis Sarda-Espinosa <alexis.sarda-espin...@microfocus.com> wrote:
>
> Hello,
>
> I have a Flink TM configured with taskmanager.memory.managed.size: 1372m.
> There is a streaming job using RocksDB for checkpoints, so I assume some of
> this memory will indeed be used.
>
> I was looking at the metrics exposed through the REST interface, and I
> queried some of them:
>
> /taskmanagers/c3c960d79c1eb2341806bfa2b2d66328/metrics?get=Status.JVM.Memory.Heap.Committed,Status.JVM.Memory.NonHeap.Committed,Status.JVM.Memory.Direct.MemoryUsed | jq
>
> [
>   {
>     "id": "Status.JVM.Memory.Heap.Committed",
>     "value": "1652031488"
>   },
>   {
>     "id": "Status.JVM.Memory.NonHeap.Committed",
>     "value": "234291200" > 223 MiB
>   },
>   {
>     "id": "Status.JVM.Memory.Direct.MemoryUsed",
>     "value": "375015427" > 358 MiB
>   },
>   {
>     "id": "Status.JVM.Memory.Direct.TotalCapacity",
>     "value": "375063552" > 358 MiB
>   }
> ]
>
> I presume direct memory is being used by Flink and its networking stack,
> as well as by the JVM itself. To be sure:
>
> 1. Increasing "taskmanager.memory.framework.off-heap.size" or
>    "taskmanager.memory.task.off-heap.size" should increase
>    Status.JVM.Memory.Direct.TotalCapacity, right?
>
> 2. I presume the native memory used by RocksDB cannot be tracked with
>    these JVM metrics even if "state.backend.rocksdb.memory.managed" is
>    true, right?
>
> Based on this question:
> https://stackoverflow.com/questions/30622818/off-heap-native-heap-direct-memory-and-native-memory,
> I imagine Flink/RocksDB either allocates memory completely independently of
> the JVM, or it uses unsafe. Since the documentation
> (https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/memory/mem_setup_tm.html#managed-memory)
> states that "Managed memory is managed by Flink and is allocated as native
> memory (off-heap)", I thought this native memory might show up as part of
> direct memory tracking, but I guess it doesn’t.
>
> Regards,
> Alexis.
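
For anyone who wants to script the REST query shown in the thread, here is a minimal Python sketch. It only builds the request path and converts the returned byte counts to MiB, using the metric names and sample values quoted above. The helper names (`metrics_path`, `parse_metrics`, `to_mib`, `fetch_metrics`) are made up for illustration, and the base URL `http://localhost:8081` in the docstring is an assumption about where the JobManager REST endpoint is listening. As noted in the thread, these JVM metrics do not cover managed memory.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

MIB = 1024 * 1024


def metrics_path(taskmanager_id, metric_ids):
    """Build the REST path for a TaskManager metrics query (hypothetical helper)."""
    return f"/taskmanagers/{quote(taskmanager_id)}/metrics?get={','.join(metric_ids)}"


def parse_metrics(body):
    """Turn the JSON array returned by the endpoint into {metric id: bytes}."""
    return {entry["id"]: int(entry["value"]) for entry in json.loads(body)}


def to_mib(num_bytes):
    """Convert a byte count to MiB, rounded to the nearest integer."""
    return round(num_bytes / MIB)


def fetch_metrics(base_url, taskmanager_id, metric_ids):
    """Query a live cluster; base_url (e.g. http://localhost:8081) is an assumption."""
    with urlopen(base_url + metrics_path(taskmanager_id, metric_ids)) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Sample response copied from the thread, so the sketch runs without a cluster.
SAMPLE = """[
  {"id": "Status.JVM.Memory.Heap.Committed", "value": "1652031488"},
  {"id": "Status.JVM.Memory.NonHeap.Committed", "value": "234291200"},
  {"id": "Status.JVM.Memory.Direct.MemoryUsed", "value": "375015427"},
  {"id": "Status.JVM.Memory.Direct.TotalCapacity", "value": "375063552"}
]"""

if __name__ == "__main__":
    for metric_id, value in parse_metrics(SAMPLE).items():
        print(f"{metric_id}: {value} bytes (~{to_mib(value)} MiB)")
```

Against a running cluster, `fetch_metrics` would be called with the TaskManager ID and the same comma-separated metric names used in the query above.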