Also!  I do have 2 FlinkDeployments deployed with this operator, but they
are in different namespaces. Each of the per-namespace metrics reports 2
FlinkDeployments, even though kubectl shows only one in each namespace.

Actually...we just tried to deploy a change (enabling some checkpointing)
that caused one of our FlinkDeployments to fail.  Now, both namespace
STABLE_Counts each report 1.

# curl -s <pod_ip>:<prom_port> | grep flink_k8soperator_namespace_Lifecycle_State_STABLE_Count
flink_k8soperator_namespace_Lifecycle_State_STABLE_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",} 1.0
flink_k8soperator_namespace_Lifecycle_State_STABLE_Count{resourcetype="FlinkDeployment",resourcens="rdf_streaming_updater",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",} 1.0

It looks like this metric may be reporting a global count rather than a
per-namespace one.
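
One way to test the global-count idea is to compare the scraped values with
the counts kubectl shows. This is only a rough sketch: the helper names
(counts_by_namespace, looks_global) are made up for illustration, and the
sample text is reconstructed from the earlier state described in this thread
(both namespaces reporting 2.0), not copied from a real scrape. The expected
per-namespace counts have to be filled in by hand from kubectl, keyed the way
the resourcens label spells them (underscores instead of dashes).

```python
import re

METRIC = "flink_k8soperator_namespace_Lifecycle_State_STABLE_Count"

# Matches one sample line of the Prometheus text format: name{labels} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')

# Assumption: reconstructed sample, with the value on the same line as the labels.
SAMPLE = """\
flink_k8soperator_namespace_Lifecycle_State_STABLE_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",} 2.0
flink_k8soperator_namespace_Lifecycle_State_STABLE_Count{resourcetype="FlinkDeployment",resourcens="rdf_streaming_updater",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",} 2.0
"""


def counts_by_namespace(text, metric):
    """Return {resourcens: value} for every sample of `metric` in `text`."""
    result = {}
    for line in text.splitlines():
        m = SAMPLE_RE.match(line.strip())
        if m is None or m.group("name") != metric:
            continue  # skip comments, blank lines, and other metrics
        labels = dict(LABEL_RE.findall(m.group("labels")))
        result[labels.get("resourcens", "")] = float(m.group("value"))
    return result


def looks_global(reported, actual):
    """True if every namespace reports the total across all namespaces."""
    total = sum(actual.values())
    return len(actual) > 1 and all(reported.get(ns) == total for ns in actual)


if __name__ == "__main__":
    reported = counts_by_namespace(SAMPLE, METRIC)
    # Assumption: counts read off `kubectl get flinkdeployments` by hand.
    actual = {"stream_enrichment_poc": 1, "rdf_streaming_updater": 1}
    print(reported, looks_global(reported, actual))
```

If looks_global returns True both before and after one deployment fails, that
would be consistent with a global count, though it does not prove it.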



On Mon, May 22, 2023 at 2:56 PM Andrew Otto <o...@wikimedia.org> wrote:

> Oh, FWIW, I do have operator HA enabled with 2 replicas running, but in my
> examples there, I am curl-ing the leader flink operator pod.
>
>
>
> On Mon, May 22, 2023 at 2:47 PM Andrew Otto <o...@wikimedia.org> wrote:
>
>> Hello!
>>
>> I'm doing some grafana+prometheus dashboarding for
>> flink-kubernetes-operator.  Reading the metrics docs
>> <https://stackoverflow.com/a/61795256>, I see that there are per-k8s-namespace
>> gauge metrics in Prometheus for the current count in each lifecycle state.
>>
>> Using kubectl, I can see that I have one FlinkDeployment in my namespace:
>>
>> # kubectl -n stream-enrichment-poc get flinkdeployments
>> NAME             JOB STATUS   LIFECYCLE STATE
>> flink-app-main   RUNNING      STABLE
>>
>> But, prometheus is reporting that I have 2 FlinkDeployments in the STABLE
>> state.
>>
>> # curl -s <pod_ip>:<prom_port>  | grep flink_k8soperator_namespace_Lifecycle_State_STABLE_Count
>> flink_k8soperator_namespace_Lifecycle_State_STABLE_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",} 2.0
>>
>> I'm not sure why I see 2.0 reported.
>> flink_k8soperator_namespace_JmDeploymentStatus_READY_Count reports only
>> one FlinkDeployment.
>>
>> # curl <pod_ip>:<prom_port>/metrics | grep flink_k8soperator_namespace_JmDeploymentStatus_READY_Count
>> flink_k8soperator_namespace_JmDeploymentStatus_READY_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",} 1.0
>>
>> Is it possible that
>> flink_k8soperator_namespace_Lifecycle_State_STABLE_Count is being
>> reported as an incrementing counter instead of a gauge?
>>
>> Thanks
>> -Andrew Otto
>>  Wikimedia Foundation
>>
>>
