[jira] [Updated] (FLINK-34152) Tune TaskManager memory of autoscaled jobs

Maximilian Michels (Jira) Wed, 28 Feb 2024 09:14:13 -0800


     [ 
https://issues.apache.org/jira/browse/FLINK-34152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Maximilian Michels updated FLINK-34152:
---------------------------------------
    Summary: Tune TaskManager memory of autoscaled jobs  (was: Tune 
TaskManager memory)

> Tune TaskManager memory of autoscaled jobs
> ------------------------------------------
>
>                 Key: FLINK-34152
>                 URL: https://issues.apache.org/jira/browse/FLINK-34152
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Autoscaler, Kubernetes Operator
>            Reporter: Maximilian Michels
>            Assignee: Maximilian Michels
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: kubernetes-operator-1.8.0
>
>
> The current autoscaling algorithm adjusts the parallelism of the job task 
> vertices according to the processing needs. By adjusting the parallelism, we 
> systematically scale the amount of CPU for a task. At the same time, we also 
> indirectly change the amount of memory tasks have at their disposal. However, 
> this causes two problems:
>  # Memory is overprovisioned: On scale-up we may add more memory than we 
> actually need. Even on scale-down, the memory/CPU ratio can still be off 
> and too much memory is used.
>  # Memory is underprovisioned: For stateful jobs, we risk running into 
> OutOfMemoryErrors on scale-down. Even before running out of memory, too 
> little memory can reduce the effectiveness of the scaling.
> We lack the capability to tune memory proportionally to the processing needs. 
> In the same way that we measure CPU usage and size the tasks accordingly, we 
> need to evaluate memory usage and adjust the heap memory size.
> https://docs.google.com/document/d/19GXHGL_FvN6WBgFvLeXpDABog2H_qqkw1_wrpamkFSc/edit
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
