[jira] [Updated] (FLINK-34538) Tune Flink config of autoscaled jobs

2024-03-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-34538:
---
Labels: pull-request-available  (was: )

> Tune Flink config of autoscaled jobs
> 
>
> Key: FLINK-34538
> URL: https://issues.apache.org/jira/browse/FLINK-34538
> Project: Flink
>  Issue Type: New Feature
>  Components: Autoscaler, Kubernetes Operator
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: pull-request-available
>
> Umbrella issue to tackle tuning the Flink configuration as part of Flink 
> Autoscaling.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34538) Tune Flink config of autoscaled jobs

2024-02-28 Thread Maximilian Michels (Jira)



Maximilian Michels updated FLINK-34538:
---
Description: Umbrella issue to tackle tuning the Flink configuration as 
part of Flink Autoscaling.  (was: Umbrella issue to tackle)

> Tune Flink config of autoscaled jobs
> 
>
> Key: FLINK-34538
> URL: https://issues.apache.org/jira/browse/FLINK-34538
> Project: Flink
>  Issue Type: New Feature
>  Components: Autoscaler, Kubernetes Operator
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>
> Umbrella issue to tackle tuning the Flink configuration as part of Flink 
> Autoscaling.





[jira] [Updated] (FLINK-34538) Tune Flink config of autoscaled jobs

2024-02-28 Thread Maximilian Michels (Jira)



Maximilian Michels updated FLINK-34538:
---
Labels:   (was: pull-request-available)

> Tune Flink config of autoscaled jobs
> 
>
> Key: FLINK-34538
> URL: https://issues.apache.org/jira/browse/FLINK-34538
> Project: Flink
>  Issue Type: New Feature
>  Components: Autoscaler, Kubernetes Operator
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>
> Umbrella issue to tackle





[jira] [Updated] (FLINK-34538) Tune Flink config of autoscaled jobs

2024-02-28 Thread Maximilian Michels (Jira)



Maximilian Michels updated FLINK-34538:
---
Description: Umbrella issue to tackle  (was: The current autoscaling 
algorithm adjusts the parallelism of the job task vertices according to the 
processing needs. By adjusting the parallelism, we systematically scale the 
amount of CPU for a task. At the same time, we also indirectly change the 
amount of memory tasks have at their disposal. However, there are some problems 
with this.
 # Memory is overprovisioned: On scale up we may add more memory than we 
actually need. Even on scale down, the memory/CPU ratio can still be off and 
too much memory is used.
 # Memory is underprovisioned: For stateful jobs, we risk running into 
OutOfMemoryErrors on scale down. Even before running out of memory, too little 
memory can have a negative impact on the effectiveness of the scaling.

We lack the capability to tune memory proportionally to the processing needs. 
In the same way that we measure CPU usage and size the tasks accordingly, we 
need to evaluate memory usage and adjust the heap memory size.

[https://docs.google.com/document/d/19GXHGL_FvN6WBgFvLeXpDABog2H_qqkw1_wrpamkFSc/edit])

> Tune Flink config of autoscaled jobs
> 
>
> Key: FLINK-34538
> URL: https://issues.apache.org/jira/browse/FLINK-34538
> Project: Flink
>  Issue Type: New Feature
>  Components: Autoscaler, Kubernetes Operator
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.8.0
>
>
> Umbrella issue to tackle





[jira] [Updated] (FLINK-34538) Tune Flink config of autoscaled jobs

2024-02-28 Thread Maximilian Michels (Jira)



Maximilian Michels updated FLINK-34538:
---
Fix Version/s: (was: kubernetes-operator-1.8.0)

> Tune Flink config of autoscaled jobs
> 
>
> Key: FLINK-34538
> URL: https://issues.apache.org/jira/browse/FLINK-34538
> Project: Flink
>  Issue Type: New Feature
>  Components: Autoscaler, Kubernetes Operator
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: pull-request-available
>
> Umbrella issue to tackle





[jira] [Updated] (FLINK-34538) Tune Flink config of autoscaled jobs

2024-02-28 Thread Maximilian Michels (Jira)



Maximilian Michels updated FLINK-34538:
---
Summary: Tune Flink config of autoscaled jobs  (was: Tune memory of 
autoscaled jobs)

> Tune Flink config of autoscaled jobs
> 
>
> Key: FLINK-34538
> URL: https://issues.apache.org/jira/browse/FLINK-34538
> Project: Flink
>  Issue Type: New Feature
>  Components: Autoscaler, Kubernetes Operator
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.8.0
>
>
> The current autoscaling algorithm adjusts the parallelism of the job task 
> vertices according to the processing needs. By adjusting the parallelism, we 
> systematically scale the amount of CPU for a task. At the same time, we also 
> indirectly change the amount of memory tasks have at their disposal. However, 
> there are some problems with this.
>  # Memory is overprovisioned: On scale up we may add more memory than we 
> actually need. Even on scale down, the memory/CPU ratio can still be off 
> and too much memory is used.
>  # Memory is underprovisioned: For stateful jobs, we risk running into 
> OutOfMemoryErrors on scale down. Even before running out of memory, too 
> little memory can have a negative impact on the effectiveness of the scaling.
> We lack the capability to tune memory proportionally to the processing needs. 
> In the same way that we measure CPU usage and size the tasks accordingly, we 
> need to evaluate memory usage and adjust the heap memory size.
> [https://docs.google.com/document/d/19GXHGL_FvN6WBgFvLeXpDABog2H_qqkw1_wrpamkFSc/edit]
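
The idea of sizing heap memory from measured usage, as described above, can be sketched roughly as follows. This is a minimal illustration only, not the autoscaler's actual algorithm: the function name `recommend_heap_mb`, the headroom percentage, and the lower and upper bounds are all hypothetical values chosen for the example.

```python
def recommend_heap_mb(observed_peak_heap_mb: int,
                      headroom_percent: int = 20,
                      min_heap_mb: int = 256,
                      max_heap_mb: int = 8192) -> int:
    """Suggest a heap size from observed peak usage (hypothetical helper).

    Adds a safety margin on top of the measured peak, then clamps the
    result so it never drops below a floor (to reduce the risk of
    OutOfMemoryErrors) or exceeds a ceiling (to limit overprovisioning).
    All defaults are assumptions made for this sketch.
    """
    if observed_peak_heap_mb < 0 or headroom_percent < 0:
        raise ValueError("usage and headroom must be non-negative")
    if min_heap_mb > max_heap_mb:
        raise ValueError("min_heap_mb must not exceed max_heap_mb")

    # Integer ceiling of observed * (1 + headroom), avoiding float rounding.
    target = (observed_peak_heap_mb * (100 + headroom_percent) + 99) // 100

    # Clamp into the allowed range.
    return max(min_heap_mb, min(max_heap_mb, target))


if __name__ == "__main__":
    for peak in (100, 1000, 10000):
        print(peak, "->", recommend_heap_mb(peak))
```

The clamp mirrors the two failure modes named in the description: the upper bound limits overprovisioning, while the lower bound and the headroom guard against underprovisioning on scale down.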


