Morning all,

I've recently been testing the Flink k8s operator (flink.apache.org/v1beta1).
The jobs start up and run perfectly fine, but their status in ArgoCD is not
yet what it should be. Some details:

When I describe the flinkdeployment I'm currently testing, the following
appears in the events:

  Type    Reason         Age   From                  Message
  ----    ------         ----  ----                  -------
  Normal  Submit         22m   JobManagerDeployment  Starting deployment
  Normal  StatusChanged  21m   Job                   Job status changed from RECONCILING to CREATED
  Normal  StatusChanged  20m   Job                   Job status changed from CREATED to RUNNING

In addition, the reconciliation timestamp and the state are as follows:

    Reconciliation Timestamp:  1670581014190
    State:                     DEPLOYED
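
For readability, the reconciliation timestamp looks like epoch milliseconds.
Assuming that is the case, a minimal Python sketch to convert it to UTC (the
helper name millis_to_utc is just for illustration):

```python
from datetime import datetime, timedelta, timezone


def millis_to_utc(epoch_millis: int) -> str:
    """Convert epoch milliseconds to an ISO-8601 UTC string.

    Assumes the value counts milliseconds since 1970-01-01T00:00:00Z.
    Integer timedelta arithmetic avoids float rounding.
    """
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    moment = epoch + timedelta(milliseconds=epoch_millis)
    return moment.isoformat(timespec="milliseconds")


# Value taken from the status output above.
print(millis_to_utc(1670581014190))  # 2022-12-09T10:16:54.190+00:00
```

Under that assumption, the last reconciliation was on 2022-12-09 at 10:16:54 UTC.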

From what I've read in the docs, the flinkdeployment is not considered
healthy until its state is STABLE, right?

- DEPLOYED : The resource is deployed/submitted to Kubernetes, but it’s not
yet considered to be stable and might be rolled back in the future
- STABLE : The resource deployment is considered to be stable and won’t be
rolled back

The jobs have been running for some hours already; one of them throws some
exceptions, but these don't cause downtime. What does it take for a job to
reach the STABLE state rather than just DEPLOYED? Could that be the cause of
the Processing... health status in ArgoCD, or is it just that, internally in
k8s, the Flink operator can't really tell that the pods are running well?
