gyfora opened a new pull request, #311:
URL: https://github.com/apache/flink-kubernetes-operator/pull/311
## What is the purpose of the change
Introduce histogram metrics for tracking how long resource lifecycle state
transitions take between the following states:
- CREATED
- SUSPENDED
- UPGRADING
- DEPLOYED
- STABLE
- ROLLING_BACK
- ROLLED_BACK
- FAILED
New metrics (`STATE_NAME` stands for each of the states listed above):
```
FlinkDeployment.Lifecycle.Transition.ResumeTimeSeconds: count=0, min=0, max=0, mean=NaN, ...
FlinkDeployment.Lifecycle.Transition.SuspendTimeSeconds: count=1, min=2, max=2, mean=2.0, ...
FlinkDeployment.Lifecycle.Transition.UpgradeTimeSeconds: count=1, min=33, max=33, mean=33.0, ...
FlinkDeployment.Lifecycle.Transition.StabilizationTimeSeconds: count=1, min=29, max=29, mean=29.0, ...
FlinkDeployment.Lifecycle.Transition.RollbackTimeSeconds: count=0, min=0, max=0, mean=NaN, ...
FlinkDeployment.Lifecycle.Transition.SubmissionTimeSeconds: count=1, min=1, max=1, mean=1.0, ...
FlinkDeployment.Lifecycle.State.STATE_NAME.Count: 0
```
## Brief change log
- Introduce ResourceLifecycleState derived from the resource status
- Add mechanism to track ResourceLifecycleState transitions
- Create histogram metrics for select transitions
- Add count metrics for each state
- Add tests
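
For readers who want a feel for the tracking mechanism, here is a minimal, self-contained sketch of how time spent between two lifecycle states could be measured. It is not the code in this PR: `LifecycleTransitionSketch`, `Transition`, `Tracker`, and `exampleTransitions()` are hypothetical names, the from/to pairs attached to each metric name are assumptions made for illustration, and a plain list of samples stands in for a real histogram.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LifecycleTransitionSketch {

    // State names copied from the list in this description.
    enum State { CREATED, SUSPENDED, UPGRADING, DEPLOYED, STABLE, ROLLING_BACK, ROLLED_BACK, FAILED }

    // Hypothetical pairing of a metric name with a from/to state pair.
    static final class Transition {
        final String metricName;
        final State from;
        final State to;

        Transition(String metricName, State from, State to) {
            this.metricName = metricName;
            this.from = from;
            this.to = to;
        }
    }

    // ASSUMPTION: these from/to pairs are illustrative, not the PR's definitions.
    static List<Transition> exampleTransitions() {
        return List.of(
                new Transition("SubmissionTimeSeconds", State.UPGRADING, State.DEPLOYED),
                new Transition("StabilizationTimeSeconds", State.DEPLOYED, State.STABLE));
    }

    static final class Tracker {
        private final List<Transition> tracked;
        // Stand-in for histograms: raw duration samples (seconds) per metric name.
        private final Map<String, List<Long>> samples = new HashMap<>();
        private State current;
        private Instant enteredAt;

        Tracker(List<Transition> tracked) {
            this.tracked = tracked;
        }

        // Call with the state derived from the resource status on each observation.
        void onStateChange(State newState, Instant now) {
            if (newState == current) {
                return; // same state observed again: keep the original entry time
            }
            for (Transition t : tracked) {
                if (t.from == current && t.to == newState) {
                    long seconds = Duration.between(enteredAt, now).getSeconds();
                    samples.computeIfAbsent(t.metricName, k -> new ArrayList<>()).add(seconds);
                }
            }
            current = newState;
            enteredAt = now;
        }

        List<Long> samples(String metricName) {
            return samples.getOrDefault(metricName, List.of());
        }
    }

    public static void main(String[] args) {
        Instant t0 = Instant.EPOCH;
        Tracker tracker = new Tracker(exampleTransitions());
        tracker.onStateChange(State.UPGRADING, t0);
        tracker.onStateChange(State.DEPLOYED, t0.plusSeconds(1));
        tracker.onStateChange(State.STABLE, t0.plusSeconds(30));
        System.out.println(tracker.samples("SubmissionTimeSeconds"));    // [1]
        System.out.println(tracker.samples("StabilizationTimeSeconds")); // [29]
    }
}
```

This sketch only measures transitions between directly consecutive states. Measuring a transition that passes through intermediate states would require remembering entry times for earlier states as well.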
## Verifying this change
New unit tests, plus manual verification on minikube.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API (i.e., are there any changes to the `CustomResourceDescriptors`): no
- Core observer or reconciler logic that is regularly executed: no
## Documentation
- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented? **[TODO]**