[ https://issues.apache.org/jira/browse/FLINK-32170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725789#comment-17725789 ]
Maximilian Michels commented on FLINK-32170:
--------------------------------------------
Yes, this is the prerequisite. If we kept an in-memory copy of the job topology
after the job leaves the RUNNING phase, it should be easy to assert this.
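
A minimal sketch of how this could look, under stated assumptions: the class
and method names are hypothetical and not taken from the operator code base,
the topology is simplified to a set of vertex IDs, and the one-minute limit is
only the example value from the issue description. Collected metrics are kept
across a restart only if the interruption was short and the topology copy held
in memory still matches.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; none of these names exist in the Flink Kubernetes Operator. */
class ResumableMetricCollector {
    private final Duration maxInterruption;
    private final List<Double> samples = new ArrayList<>();
    // In-memory copy of the topology (assumed here: vertex IDs), kept after leaving RUNNING.
    private Set<String> lastTopology;
    // Time at which the job left RUNNING; null while the job is running.
    private Instant leftRunningAt;

    ResumableMetricCollector(Duration maxInterruption) {
        this.maxInterruption = maxInterruption;
    }

    /** Records one metric sample together with the topology it was observed on. */
    void onSample(Set<String> topology, double value) {
        lastTopology = Set.copyOf(topology);
        samples.add(value);
    }

    /** Called when the job leaves RUNNING; only the first call per interruption counts. */
    void onJobLeftRunning(Instant now) {
        if (leftRunningAt == null) {
            leftRunningAt = now;
        }
    }

    /**
     * Called when the job is RUNNING again.
     * Returns true if the existing samples were kept, false if they were discarded.
     */
    boolean onJobRunning(Set<String> topology, Instant now) {
        boolean keep = true;
        if (leftRunningAt != null) {
            boolean shortGap =
                    Duration.between(leftRunningAt, now).compareTo(maxInterruption) <= 0;
            boolean sameTopology = topology.equals(lastTopology);
            keep = shortGap && sameTopology;
            if (!keep) {
                samples.clear();
            }
        }
        leftRunningAt = null;
        lastTopology = Set.copyOf(topology);
        return keep;
    }

    int sampleCount() {
        return samples.size();
    }
}
```

With a limit of Duration.ofMinutes(1), a restart that returns to RUNNING after
30 seconds with the same vertex IDs keeps the samples; a longer gap or a
changed topology clears them and collection starts over.
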
> Continue metric collection on intermittent job restarts
> -------------------------------------------------------
>
> Key: FLINK-32170
> URL: https://issues.apache.org/jira/browse/FLINK-32170
> Project: Flink
> Issue Type: Improvement
> Components: Autoscaler, Kubernetes Operator
> Reporter: Maximilian Michels
> Priority: Major
>
> If the underlying infrastructure is unstable, e.g. because of Kubernetes pod
> evictions, jobs will sometimes restart. Each restart also restarts the
> autoscaler's metric collection and discards any existing metrics. If the
> interruption is short, e.g. less than one minute, we could consider
> resuming metric collection once the job is back in the RUNNING state.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)