[
https://issues.apache.org/jira/browse/FLINK-36557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sai Sharath Dandi updated FLINK-36557:
--------------------------------------
Description:
The KubernetesJobAutoScalerContext is
[cached|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkResourceContext.java#L59]
in the FlinkResourceContext and reused. If the JobAutoScalerContext is
initialized before the job reaches the Running state, the cached context keeps
the stale job status and the autoscaler may never trigger; see this
[link|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-autoscaler/src/main/java/org/apache/flink/autoscaler/JobAutoScalerImpl.java#L98]
We need to either refresh the JobAutoScalerContext, similar to the standalone
[implementation|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-autoscaler-standalone/src/main/java/org/apache/flink/autoscaler/standalone/StandaloneAutoscalerExecutor.java#L127],
or have the autoscaler module itself refresh the job status.
was:
The KubernetesJobAutoScalerContext is
[cached|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkResourceContext.java#L59]
in the FlinkResourceContext and reused. If the JobAutoscalerContext is
initialized before the job reaches Running state, it can cause the autoscaler
to not trigger -
[link|[https://github.com/apache/flink-kubernetes-operator/blob/main/flink-autoscaler/src/main/java/org/apache/flink/autoscaler/JobAutoScalerImpl.java#L98].]
We need to either refresh the AutoScalerContext similar to the standalone
[implementation|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-autoscaler-standalone/src/main/java/org/apache/flink/autoscaler/standalone/StandaloneAutoscalerExecutor.java#L127]
or the autoscaler module itself needs to refresh the job status
> Stale Autoscaler Context in Kubernetes Operator
> -----------------------------------------------
>
> Key: FLINK-36557
> URL: https://issues.apache.org/jira/browse/FLINK-36557
> Project: Flink
> Issue Type: Improvement
> Components: Autoscaler, Kubernetes Operator
> Reporter: Sai Sharath Dandi
> Priority: Minor
>
> The KubernetesJobAutoScalerContext is
> [cached|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkResourceContext.java#L59]
> in the FlinkResourceContext and reused. If the JobAutoScalerContext is
> initialized before the job reaches the Running state, the cached context keeps
> the stale job status and the autoscaler may never trigger; see this
> [link|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-autoscaler/src/main/java/org/apache/flink/autoscaler/JobAutoScalerImpl.java#L98]
>
> We need to either refresh the JobAutoScalerContext, similar to the standalone
> [implementation|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-autoscaler-standalone/src/main/java/org/apache/flink/autoscaler/standalone/StandaloneAutoscalerExecutor.java#L127],
> or have the autoscaler module itself refresh the job status.
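
The failure mode described in the issue can be illustrated with a minimal, self-contained sketch. It does not use the operator's real classes: `Context`, `CachingHolder`, `RefreshingHolder`, and `scale` are hypothetical stand-ins, and the sketch assumes the autoscaler skips any job whose context does not report a Running status, as the description implies.

```java
import java.util.function.Supplier;

public class StaleContextSketch {
    enum JobStatus { CREATED, RUNNING }

    /** Hypothetical stand-in for an autoscaler context: a snapshot of the job status. */
    static final class Context {
        final JobStatus jobStatus;
        Context(JobStatus jobStatus) { this.jobStatus = jobStatus; }
    }

    /** Hypothetical autoscaler entry point: assumed to skip jobs that are not RUNNING. */
    static boolean scale(Context ctx) {
        return ctx.jobStatus == JobStatus.RUNNING;
    }

    /** Builds the context once and reuses it, so the status snapshot can go stale. */
    static final class CachingHolder {
        private final Supplier<JobStatus> statusSource;
        private Context cached;
        CachingHolder(Supplier<JobStatus> statusSource) { this.statusSource = statusSource; }
        Context get() {
            if (cached == null) {
                cached = new Context(statusSource.get());
            }
            return cached;
        }
    }

    /** Rebuilds the context on every call, so it always sees the current status. */
    static final class RefreshingHolder {
        private final Supplier<JobStatus> statusSource;
        RefreshingHolder(Supplier<JobStatus> statusSource) { this.statusSource = statusSource; }
        Context get() {
            return new Context(statusSource.get());
        }
    }

    /** Returns { cached context triggers, refreshed context triggers }. */
    static boolean[] runScenario() {
        JobStatus[] live = { JobStatus.CREATED };           // mutable "cluster" state
        CachingHolder caching = new CachingHolder(() -> live[0]);
        RefreshingHolder refreshing = new RefreshingHolder(() -> live[0]);

        caching.get();                                      // first use, job not yet running
        live[0] = JobStatus.RUNNING;                        // job reaches Running afterwards

        return new boolean[] { scale(caching.get()), scale(refreshing.get()) };
    }

    public static void main(String[] args) {
        boolean[] result = runScenario();
        System.out.println("cached context triggers: " + result[0]);     // false
        System.out.println("refreshed context triggers: " + result[1]);  // true
    }
}
```

The second holder corresponds to the first proposed fix (rebuild the context on each pass); the alternative would be for `scale` itself to re-query the status instead of trusting the snapshot.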
