james-kan-shopify opened a new pull request, #1052: URL: https://github.com/apache/flink-kubernetes-operator/pull/1052
## What is the purpose of the change

Jira Issue: https://issues.apache.org/jira/browse/FLINK-38787

This pull request adds lifecycle metrics support for `FlinkBlueGreenDeployment` resources, enabling operators to monitor blue-green deployment state transitions and timing. This provides observability into the deployment pipeline, helping identify bottlenecks and track deployment health. The implementation closely mirrors the existing `FlinkDeployment` metrics implementation to ensure consistency.

Note: Most of the added lines of code are in test files.

## Brief change log

1. **Real-time State Distribution Tracking** - Namespace-level gauges showing the current count of deployments in each blue-green state and Flink job status.
2. **Lifecycle Transition Timing** - Histogram metrics measuring the duration of key transitions (initial deployment, blue-to-green, green-to-blue) and the time spent in each state, available at system and namespace levels.
3. **Historical Failure Tracking** - Accumulating counter that increments on each transition to the FAILING state, enabling failure rate calculation and long-term reliability monitoring.

## Verifying this change

This change added tests and can be verified as follows:

- Added `BlueGreenLifecycleMetricsTest` to verify histogram creation, namespace isolation, and metric registration
- Added `BlueGreenResourceLifecycleMetricTrackerTest` to verify state transition timing, rollback scenarios, and intermediate state recording
- Tests cover initial deployment, blue-to-green transitions, green-to-blue transitions, and failed transition rollbacks

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., are there any changes to the `CustomResourceDescriptors`: no
- Core observer or reconciler logic that is regularly executed: no

## Documentation

- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented?
docs
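
To make the change log above more concrete, here is a minimal, self-contained sketch of the three ideas it describes: counting resources per state (the gauge), recording time spent in each state (the histogram samples), and counting transitions into a failing state (the counter). It is not the code from this pull request. The class `BlueGreenLifecycleSketch`, its `State` enum values (other than `FAILING`, which the description mentions), and all method names are hypothetical, and the sketch uses plain Java collections rather than the Flink metrics API.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical sketch: none of these names are taken from the operator code base.
final class BlueGreenLifecycleSketch {

    // Illustrative states only; the operator's real state enum may differ.
    enum State { INITIALIZING, ACTIVE_BLUE, TRANSITIONING_TO_GREEN, ACTIVE_GREEN, FAILING }

    private final LongSupplier clockMillis; // injectable clock so timing is testable
    private final Map<State, List<Duration>> timeInState = new EnumMap<>(State.class);
    private State current;
    private long enteredAtMillis;
    private long failureCount;

    BlueGreenLifecycleSketch(State initial, LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
        this.current = initial;
        this.enteredAtMillis = clockMillis.getAsLong();
    }

    // Records how long the previous state lasted and counts entries into FAILING.
    void onStateChange(State next) {
        if (next == current) {
            return; // no transition, nothing to record
        }
        long now = clockMillis.getAsLong();
        timeInState.computeIfAbsent(current, s -> new ArrayList<>())
                .add(Duration.ofMillis(now - enteredAtMillis));
        if (next == State.FAILING) {
            failureCount++; // accumulating counter, never reset
        }
        current = next;
        enteredAtMillis = now;
    }

    // Duration samples that a histogram for the given state would receive.
    List<Duration> samplesFor(State state) {
        return timeInState.getOrDefault(state, List.of());
    }

    long failureCount() {
        return failureCount;
    }

    State currentState() {
        return current;
    }

    // Gauge-style view: how many tracked resources are currently in each state.
    static Map<State, Long> countByState(List<BlueGreenLifecycleSketch> trackers) {
        Map<State, Long> counts = new EnumMap<>(State.class);
        for (BlueGreenLifecycleSketch t : trackers) {
            counts.merge(t.currentState(), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] now = {0L}; // fake clock, advanced by hand
        BlueGreenLifecycleSketch tracker =
                new BlueGreenLifecycleSketch(State.INITIALIZING, () -> now[0]);
        now[0] = 5_000L;
        tracker.onStateChange(State.ACTIVE_BLUE);
        now[0] = 65_000L;
        tracker.onStateChange(State.TRANSITIONING_TO_GREEN);
        now[0] = 80_000L;
        tracker.onStateChange(State.FAILING);
        System.out.println("time in INITIALIZING: " + tracker.samplesFor(State.INITIALIZING));
        System.out.println("failures: " + tracker.failureCount());
        System.out.println("state counts: " + countByState(List.of(tracker)));
    }
}
```

Injecting the clock as a `LongSupplier` is a design choice made for this sketch so that transition durations can be asserted deterministically; it is an assumption about how such timing could be tested, not a statement about how the tests in this pull request are written.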
