Oh, FWIW, I do have operator HA enabled with 2 replicas running, but in my examples there, I am curl-ing the leader flink operator pod.
On Mon, May 22, 2023 at 2:47 PM Andrew Otto <o...@wikimedia.org> wrote:

> Hello!
>
> I'm doing some grafana+prometheus dashboarding for
> flink-kubernetes-operator. Reading metrics docs
> <https://stackoverflow.com/a/61795256>, I see that I have nice per k8s
> namespace lifecycle current count gauge metrics in Prometheus.
>
> Using kubectl, I can see that I have one FlinkDeployment in my namespace:
>
> # kubectl -n stream-enrichment-poc get flinkdeployments
> NAME             JOB STATUS   LIFECYCLE STATE
> flink-app-main   RUNNING      STABLE
>
> But, prometheus is reporting that I have 2 FlinkDeployments in the STABLE
> state.
>
> # curl -s <pod_ip>:<prom_port> | grep flink_k8soperator_namespace_Lifecycle_State_STABLE_Count
> flink_k8soperator_namespace_Lifecycle_State_STABLE_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",} 2.0
>
> I'm not sure why I see 2.0 reported.
> flink_k8soperator_namespace_JmDeploymentStatus_READY_Count reports only
> one FlinkDeployment.
>
> # curl <pod_ip>:<prom_port>/metrics | grep flink_k8soperator_namespace_JmDeploymentStatus_READY_Count
> flink_k8soperator_namespace_JmDeploymentStatus_READY_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",} 1.0
>
> Is it possible that
> flink_k8soperator_namespace_Lifecycle_State_STABLE_Count is being
> reported as an incrementing counter instead of a gauge?
>
> Thanks
> -Andrew Otto
> Wikimedia Foundation
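
One way to narrow down the counter-vs-gauge question above is to look at how the metric is declared in the scrape output itself. The Prometheus text exposition format allows a `# TYPE <name> <type>` comment line per metric, so if the operator's endpoint emits those lines, the declared type can be read directly. Below is a minimal Python sketch of that check. The helper names `metric_types` and `sample_values` are made up for illustration, and the `SAMPLE` text is a hand-written stand-in (with labels shortened), not real operator output; in particular, the `gauge` declaration in it is an assumption, not something confirmed in this thread.

```python
import re


def metric_types(exposition_text):
    """Return {metric_name: declared_type} from '# TYPE <name> <type>' lines."""
    types = {}
    for line in exposition_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            types[parts[2]] = parts[3]
    return types


def sample_values(exposition_text, metric_name):
    """Return the values of every sample line for exactly this metric name."""
    values = []
    for line in exposition_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line looks like: name{label="v",...} value [timestamp]
        name = re.split(r"[{\s]", line, maxsplit=1)[0]
        if name != metric_name:
            continue
        # The value is the first token after the closing brace (or after the name).
        rest = line[line.rfind("}") + 1:] if "}" in line else line[len(name):]
        values.append(float(rest.split()[0]))
    return values


# Hypothetical scrape text for illustration only (NOT real operator output).
SAMPLE = """\
# TYPE flink_k8soperator_namespace_Lifecycle_State_STABLE_Count gauge
flink_k8soperator_namespace_Lifecycle_State_STABLE_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",} 2.0
"""

NAME = "flink_k8soperator_namespace_Lifecycle_State_STABLE_Count"
print(metric_types(SAMPLE).get(NAME, "no TYPE line found"))
print(sample_values(SAMPLE, NAME))
```

To run it against the real endpoint, the output of the curl command from the message could be saved to a file and passed in place of `SAMPLE`. A declared type of `gauge` would not by itself explain the 2.0, but it would rule out the metric being exported as a counter, and comparing `sample_values` across a few scrapes would show whether the number keeps climbing.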