pgrefviau opened a new pull request, #926:
URL: https://github.com/apache/flink-kubernetes-operator/pull/926

   ## What is the purpose of the change
   
   This PR adds new metrics that track the current value of different 
states/statuses at the resource level. Metrics already exist for some of these 
statuses/states, but they represent namespace-wide or system-wide counts 
rather than per-resource gauges that indicate whether a given 
deployment/session job is in a particular state.
   
   Other statuses/states that were not yet tracked through a dedicated metric 
(e.g. job status) now have a resource-level gauge and a namespace-level 
counter.
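   
   To illustrate the distinction between a per-resource state gauge and a namespace-level count, here is a minimal, dependency-free sketch. It is not the code in this PR: `StateGaugeSketch`, `DeploymentState`, `ResourceStateHolder`, `stateGauges` and `countInState` are hypothetical names, a plain `Supplier<Integer>` stands in for a real metric gauge, and the 1/0 encoding is an assumption about how "is in a particular state" could be reported.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class StateGaugeSketch {

    // Hypothetical stand-in for a status enum such as JobManagerDeploymentStatus.
    enum DeploymentState { READY, DEPLOYING, MISSING, ERROR }

    /** Holds the most recently observed state of a single resource. */
    static final class ResourceStateHolder {
        private volatile DeploymentState current;

        ResourceStateHolder(DeploymentState initial) {
            this.current = initial;
        }

        void update(DeploymentState next) {
            this.current = next;
        }

        DeploymentState get() {
            return current;
        }
    }

    /**
     * Resource level: one gauge per enum value. Each gauge reports 1 while the
     * resource is in that state and 0 otherwise (assumed encoding).
     */
    static Map<DeploymentState, Supplier<Integer>> stateGauges(ResourceStateHolder holder) {
        Map<DeploymentState, Supplier<Integer>> gauges = new EnumMap<>(DeploymentState.class);
        for (DeploymentState state : DeploymentState.values()) {
            gauges.put(state, () -> holder.get() == state ? 1 : 0);
        }
        return gauges;
    }

    /** Namespace level: how many of the given resources are currently in the given state. */
    static long countInState(Collection<ResourceStateHolder> resources, DeploymentState state) {
        return resources.stream().filter(r -> r.get() == state).count();
    }

    public static void main(String[] args) {
        ResourceStateHolder a = new ResourceStateHolder(DeploymentState.DEPLOYING);
        ResourceStateHolder b = new ResourceStateHolder(DeploymentState.READY);
        List<ResourceStateHolder> namespace = new ArrayList<>(List.of(a, b));

        Map<DeploymentState, Supplier<Integer>> gaugesForA = stateGauges(a);
        a.update(DeploymentState.READY); // gauges read the holder lazily, so they follow the update

        System.out.println("a READY gauge: " + gaugesForA.get(DeploymentState.READY).get());
        System.out.println("READY in namespace: " + countInState(namespace, DeploymentState.READY));
    }
}
```

   With this shape, a dashboard can answer "is this specific deployment currently in state X?" from the resource-level gauge, while the namespace-level value only answers "how many resources are in state X?".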
   
   ## Brief change log
   
   Summary of the changes for each state/status:
   - `JobManagerDeploymentStatus`: state gauge added at resource-level 
(FlinkDeployment only)
   - `JobStatus`: status gauge added at resource-level (FlinkDeployment only), 
status counter at namespace-level
   - `ResourceLifecycleState`: state gauge added at resource-level
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
   - Updated test cases for resource lifecycle metrics and Flink deployment 
metrics to account for new resource-level metrics
     - Also added utility methods to test classes to reduce duplicated test 
logic
   - Changes were deployed and tested using our own fork/instance of the 
operator
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., are there any changes to the 
`CustomResourceDescriptors`: no
     - Core observer or reconciler logic that is regularly executed: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? no
     
   N.B. While these changes might not represent a full-on "feature", I'm 
planning to update the documentation that generates 
[this](https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/metrics-logging/)
 page. However, I've held off doing this as part of this initial commit until 
the naming and implementation are settled. Once they are, I can update the 
documentation accordingly.
   


