[ https://issues.apache.org/jira/browse/FLINK-35489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853858#comment-17853858 ]
Rui Fan commented on FLINK-35489:
---------------------------------
Merged to main (1.9.0) via: c3e94aec0c6f081a99b7467bd0bcee551b841600
> Metaspace size can be too little after autotuning change memory setting
> -----------------------------------------------------------------------
>
> Key: FLINK-35489
> URL: https://issues.apache.org/jira/browse/FLINK-35489
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Affects Versions: 1.8.0
> Reporter: Nicolas Fraison
> Assignee: Nicolas Fraison
> Priority: Major
> Labels: pull-request-available
> Fix For: kubernetes-operator-1.9.0
>
>
> We have enabled the autotuning feature on one of our Flink jobs with the
> configuration below:
> {code:java}
> # Autoscaler configuration
> job.autoscaler.enabled: "true"
> job.autoscaler.stabilization.interval: 1m
> job.autoscaler.metrics.window: 10m
> job.autoscaler.target.utilization: "0.8"
> job.autoscaler.target.utilization.boundary: "0.1"
> job.autoscaler.restart.time: 2m
> job.autoscaler.catch-up.duration: 10m
> job.autoscaler.memory.tuning.enabled: true
> job.autoscaler.memory.tuning.overhead: 0.5
> job.autoscaler.memory.tuning.maximize-managed-memory: true{code}
> During a scale-down, the autotuning decided to give all the memory to the JVM
> (with the heap scaled by a factor of 2), setting
> taskmanager.memory.managed.size to 0b.
> Here is the config that was computed by the autotuning for a TM running on a
> 4GB pod:
> {code:java}
> taskmanager.memory.network.max: 4063232b
> taskmanager.memory.network.min: 4063232b
> taskmanager.memory.jvm-overhead.max: 433791712b
> taskmanager.memory.task.heap.size: 3699934605b
> taskmanager.memory.framework.off-heap.size: 134217728b
> taskmanager.memory.jvm-metaspace.size: 22960020b
> taskmanager.memory.framework.heap.size: "0 bytes"
> taskmanager.memory.flink.size: 3838215565b
> taskmanager.memory.managed.size: 0b {code}
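
To make the numbers above easier to read, here is a small sketch that adds up the quoted components. It assumes that the Flink memory is the sum of task heap, framework heap, framework off-heap, network, and managed memory (with task off-heap at 0, which is not listed in the config), and that the JVM overhead sits at its configured maximum. The dictionary keys are shorthand chosen for this sketch, not Flink option names.

```python
# Values copied from the tuned configuration quoted above, in bytes.
# The keys are shorthand for this sketch, not real Flink option names.
tuned = {
    "task_heap": 3699934605,
    "framework_heap": 0,
    "framework_off_heap": 134217728,
    "network": 4063232,
    "managed": 0,
    "metaspace": 22960020,
    "jvm_overhead_max": 433791712,
}

REPORTED_FLINK_SIZE = 3838215565  # taskmanager.memory.flink.size from the quote
MIB = 1024 * 1024
GIB = 1024 * MIB


def flink_size(cfg):
    """Sum the components assumed to make up the Flink memory."""
    # Assumption: task off-heap is 0 because it does not appear in the config.
    return (
        cfg["task_heap"]
        + cfg["framework_heap"]
        + cfg["framework_off_heap"]
        + cfg["network"]
        + cfg["managed"]
    )


def process_size(cfg):
    """Flink memory plus metaspace plus JVM overhead (assumed at its max)."""
    return flink_size(cfg) + cfg["metaspace"] + cfg["jvm_overhead_max"]


if __name__ == "__main__":
    print("flink size matches report:", flink_size(tuned) == REPORTED_FLINK_SIZE)
    print("process size in bytes:", process_size(tuned))
    print("bytes over 4 GiB:", process_size(tuned) - 4 * GIB)
    print("metaspace in MiB:", round(tuned["metaspace"] / MIB, 2))
    print("heap share of process:", round(tuned["task_heap"] / process_size(tuned), 3))
```

Under these assumptions the components add up exactly to the reported flink.size, and the whole process comes to 4 GiB plus one byte. Metaspace is left with roughly 21.9 MiB, far below Flink's documented default of 256 MiB, which is consistent with the issue title.
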
> This led to issues starting the TM because we rely on a javaagent that
> allocates memory outside of the JVM (it relies on some C bindings).
> Tuning the overhead or disabling scale-down-compensation.enabled could
> have helped for that particular event, but this can lead to other issues, as
> it could result in too small a heap size being computed.
> It would be useful to be able to set a minimum memory.managed.size to be
> taken into account by the autotuning.
> What do you think about this? Do you think that some other specific config
> should have been applied to avoid this issue?
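
The request for a minimum can be pictured as a clamp applied to a tuning proposal. The sketch below is purely illustrative: `apply_minimums`, its dictionary keys, and the floor values are hypothetical and are not the operator's autotuning logic or the merged fix. It raises each component to its floor and takes the shortfall out of the heap, so the total budget stays the same.

```python
MIB = 1024 * 1024


def apply_minimums(proposed, minimums, heap_key="task_heap"):
    """Raise each component to its floor, funding the difference from the heap.

    Hypothetical helper for illustration only; not part of Flink or the
    Kubernetes operator.
    """
    result = dict(proposed)
    for key, floor in minimums.items():
        shortfall = floor - result[key]
        if shortfall > 0:
            result[key] = floor
            result[heap_key] -= shortfall
    if result[heap_key] < 0:
        raise ValueError("minimums do not fit in the memory budget")
    return result


# Heap, managed, and metaspace sizes from the tuned config in the report (bytes).
proposed = {"task_heap": 3699934605, "managed": 0, "metaspace": 22960020}

# Floors picked arbitrarily for this example.
minimums = {"managed": 256 * MIB, "metaspace": 128 * MIB}

clamped = apply_minimums(proposed, minimums)

if __name__ == "__main__":
    for name, size in clamped.items():
        print(name, size)
```

With these example floors the heap shrinks by the amount handed to managed memory and metaspace, while the sum of the three components is unchanged.
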
>
> Edit: see this comment, which leads to the metaspace issue:
> https://issues.apache.org/jira/browse/FLINK-35489?focusedCommentId=17850694&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17850694
--
This message was sent by Atlassian Jira
(v8.20.10#820010)