Maximilian Michels created FLINK-33710:
------------------------------------------

             Summary: Autoscaler redeploys pipeline for a NOOP parallelism 
change
                 Key: FLINK-33710
                 URL: https://issues.apache.org/jira/browse/FLINK-33710
             Project: Flink
          Issue Type: Bug
          Components: Autoscaler, Kubernetes Operator
    Affects Versions: kubernetes-operator-1.7.0, kubernetes-operator-1.6.0
            Reporter: Maximilian Michels
            Assignee: Maximilian Michels
             Fix For: kubernetes-operator-1.8.0


The operator supports two modes to apply autoscaler changes:

# Use the internal Flink config {{pipeline.jobvertex-parallelism-overrides}} 
# Make use of Flink's Rescale API 

For (1), the operator has to generate a string for the Flink config from the 
map of overrides. This string must be deterministic for a given map, but 
currently it is not.

Consider the following observed log:

{noformat}
  >>> Event  | Info    | SPECCHANGED     | SCALE change(s) detected (Diff: 
FlinkDeploymentSpec[flinkConfiguration.pipeline.jobvertex-parallelism-overrides 
: 
92542d1280187bd464274368a5f86977:3,9f979ed859083299d29f281832cb5be0:1,84881d7bda0dc3d44026e37403420039:1,1652184ffd0522859c7840a24936847c:1
 -> 
9f979ed859083299d29f281832cb5be0:1,84881d7bda0dc3d44026e37403420039:1,92542d1280187bd464274368a5f86977:3,1652184ffd0522859c7840a24936847c:1]),
 starting reconciliation. 
{noformat}

The overrides are identical, but their order differs, so the spec diff reports 
a change and triggers a redeploy. This does not seem to happen often, but 
deterministic string generation (e.g. sorting by key) is required to prevent 
such NOOP updates.
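
A minimal sketch of the "sorting by key" idea, assuming the overrides are held 
as a map from vertex id to parallelism and serialized as comma-separated 
{{id:parallelism}} pairs, as in the log above. The class and method names are 
hypothetical and are not taken from the operator code base.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only (not the operator's actual class).
final class ParallelismOverrides {

    private ParallelismOverrides() {}

    // Serializes the overrides as "vertexId:parallelism" pairs joined by commas.
    // Copying into a TreeMap sorts the entries by key, so two maps with equal
    // content always yield the same string, whatever their iteration order.
    static String serialize(Map<String, String> overrides) {
        return new TreeMap<>(overrides).entrySet().stream()
                .map(e -> e.getKey() + ":" + e.getValue())
                .collect(Collectors.joining(","));
    }
}
```

With this approach, the two orderings shown in the log would serialize to the 
same string, so the spec diff would be empty and no reconciliation would start.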



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
