james-kan-shopify opened a new pull request, #1052:
URL: https://github.com/apache/flink-kubernetes-operator/pull/1052

   ## What is the purpose of the change
   
   Jira Issue: https://issues.apache.org/jira/browse/FLINK-38787
   
   This pull request adds lifecycle metrics for 
`FlinkBlueGreenDeployment` resources so that operators can monitor blue-green 
deployment state transitions and their timing. This gives observability into 
the deployment pipeline, helping to identify bottlenecks and track deployment 
health. The implementation closely mirrors the existing `FlinkDeployment` 
metrics implementation for consistency. 
   
   Note: Most lines of code introduced are for test files! 
   
   ## Brief change log
   
   1. **Real-time State Distribution Tracking**
      - Namespace-level gauges showing the current number of deployments in 
each blue-green state and each Flink job status.
   
   2. **Lifecycle Transition Timing**
      - Histogram metrics measuring duration of key transitions (initial 
deployment, blue-to-green, green-to-blue) and time spent in each state, 
available at system and namespace levels.
   
   3. **Historical Failure Tracking**
      - Accumulating counter that increments on each transition to the 
`FAILING` state, enabling failure-rate calculation and long-term reliability 
monitoring.
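
   To illustrate the bookkeeping behind items 2 and 3, here is a minimal, 
dependency-free sketch. It is not the code in this PR: the class name 
`LifecycleTrackerSketch`, its methods, and the state names other than `FAILING` 
are hypothetical, and a plain list of samples stands in for a real histogram. 
The namespace-level gauges from item 1 are not shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not the operator's actual tracker class. */
class LifecycleTrackerSketch {
    private String currentState;   // last observed state, null before the first observation
    private long enteredAtMillis;  // timestamp at which currentState was entered
    private long failures;         // accumulating count of transitions into FAILING

    // Stand-in for per-state histograms: raw duration samples keyed by state name.
    private final Map<String, List<Long>> timeInStateMillis = new HashMap<>();

    /** Records an observed state; repeated observations of the same state are ignored. */
    void onState(String newState, long nowMillis) {
        if (newState.equals(currentState)) {
            return;
        }
        if (currentState != null) {
            // Leaving a state: record how long the resource spent in it.
            timeInStateMillis
                    .computeIfAbsent(currentState, k -> new ArrayList<>())
                    .add(nowMillis - enteredAtMillis);
        }
        if ("FAILING".equals(newState)) {
            failures++;
        }
        currentState = newState;
        enteredAtMillis = nowMillis;
    }

    long failureCount() {
        return failures;
    }

    List<Long> samplesFor(String state) {
        return timeInStateMillis.getOrDefault(state, List.of());
    }

    public static void main(String[] args) {
        LifecycleTrackerSketch tracker = new LifecycleTrackerSketch();
        tracker.onState("ACTIVE_BLUE", 0L);                 // assumed state name
        tracker.onState("TRANSITIONING_TO_GREEN", 5_000L);  // assumed state name
        tracker.onState("FAILING", 8_000L);
        System.out.println(tracker.samplesFor("ACTIVE_BLUE")); // prints [5000]
        System.out.println(tracker.failureCount());            // prints 1
    }
}
```

   Passing the timestamp in explicitly, rather than reading the system clock, 
keeps a sketch like this deterministic under test, which is one way the 
rollback and intermediate-state scenarios could be exercised.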
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
     - Added `BlueGreenLifecycleMetricsTest` to verify histogram creation, 
namespace isolation, and metric registration
   
     - Added `BlueGreenResourceLifecycleMetricTrackerTest` to verify state 
transition timing, rollback scenarios, and intermediate state recording
   
     - Tests cover initial deployment, blue-to-green transitions, green-to-blue 
transitions, and failed transition rollbacks
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
   
     - The public API, i.e., are there any changes to the 
`CustomResourceDescriptors`: no
   
     - Core observer or reconciler logic that is regularly executed: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? yes
   
     - If yes, how is the feature documented? docs


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
