wmedvede commented on code in PR #540:
URL:
https://github.com/apache/incubator-kie-kogito-serverless-operator/pull/540#discussion_r1804792648
##########
test/e2e/testdata/workflows/prometheus/k8s_deployment/01-sonataflow_platform.yaml:
##########
@@ -0,0 +1,21 @@
+# Copyright 2024 Apache Software Foundation (ASF)
Review Comment:
License header not good. See latest fixes in main for the headers.
##########
internal/controller/profiles/dev/states_dev.go:
##########
@@ -142,6 +151,14 @@ func (e *ensureRunningWorkflowState) Do(ctx
context.Context, workflow *operatora
return ctrl.Result{RequeueAfter: constants.RequeueAfterIsRunning},
objs, nil
}
+func (e *ensureRunningWorkflowState) ensureServiceMonitor(ctx context.Context,
workflow *operatorapi.SonataFlow, pl *operatorapi.SonataFlowPlatform)
(client.Object, error) {
+ if monitoring.IsMonitoringEnabled(pl) {
+ serviceMonitor, _, err := e.ensurers.serviceMonitor.Ensure(ctx,
workflow)
Review Comment:
@jianrongzhang89 The other day we chatted that the devmode image is not
incorporating the metrics extensions, Okay, we have PR waiting to be merged for
that, so, in the short term, the proper extension will be there, and the
metrics available for the dev profile.
But the question is: what happens in these situations, where a WF image do
not incorporate the metrics extensions?
The ServiceMonitor is created, and produce log errors while trying to
connect with it? can we see any failing condition in the corresponding Status?
Any other undesired bad effect along the Prometheus execution path?
Then, thinking aloud, and about the gitops profile, preview profile, or any
other situation where the user provides it own workflow image, maybe in a
complementary PR, the images verification work by @treblereel, could be
extended a bit, to also try to determine if a particular WF image has the
metrics extension present. And, if not, we can skip creating that
ServiceMonitor and simply log a warning instead of creating a "failing"
ServiceMonitor.
Here we might have different alternatives.
If spec.monitoring.enabled == true , and WF has not metric extensions
present.
1) we can fail-fast the WF deployment saying that it's not prepared for
monitoing
2) we can log a message, skip the creation of the ServiceMonitor, and
proceed with the WF deployment.
3) or always create the ServiceMonitor even when we know it'll in a failing
situation.
4) any other
does it make sense @jianrongzhang89 @ricardozanini ?
##########
test/e2e/testdata/workflows/prometheus/k8s_deployment/02-sonataflow_greetings.yaml:
##########
@@ -0,0 +1,37 @@
+# Copyright 2023 Red Hat, Inc. and/or its affiliates
Review Comment:
License header not good. See latest fixes in main for the headers.
##########
internal/controller/profiles/common/object_creators.go:
##########
@@ -439,10 +443,37 @@ func UserPropsConfigMapCreator(workflow
*operatorapi.SonataFlow) (client.Object,
// ManagedPropsConfigMapCreator creates an empty ConfigMap to hold the
external application properties
func ManagedPropsConfigMapCreator(workflow *operatorapi.SonataFlow, platform
*operatorapi.SonataFlowPlatform) (client.Object, error) {
-
props, err := properties.ApplicationManagedProperties(workflow,
platform)
if err != nil {
return nil, err
}
return workflowproj.CreateNewManagedPropsConfigMap(workflow, props), nil
}
+
+// ServiceMonitorCreator is an ObjectsCreator for Service Monitor for the
workflow service.
+func ServiceMonitorCreator(workflow *operatorapi.SonataFlow) (client.Object,
error) {
+ lbl := workflowproj.GetMergedLabels(workflow)
+ spec := &prometheus.ServiceMonitorSpec{
+ Selector: metav1.LabelSelector{
+ MatchLabels: map[string]string{
Review Comment:
If we compare the pattern applied at the ServiceCreator we have that:
1) the Service use workflowproj.GetMergedLabels(workflow) for it's selector
2) the Service Labels are == workflowproj.GetMergedLabels(workflow)
For the ServiceMonitorCreator I see a different pattern.
1) The ServiceMonitor use two individual labels for it's selector
2) The ServiceMonitor Labels: are == workflowproj.GetMergedLabels(workflow)
Maybe we can unify the pattern and use
workflowproj.GetMergedLabels(workflow) for the ServiceMonitor selector too.
##########
test/e2e/testdata/workflows/prometheus/k8s_deployment/kustomization.yaml:
##########
@@ -0,0 +1,20 @@
+# Copyright 2024 Apache Software Foundation (ASF)
Review Comment:
same comment
##########
internal/controller/monitoring/monitoring.go:
##########
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package monitoring
+
+import (
+ operatorapi
"github.com/apache/incubator-kie-kogito-serverless-operator/api/v1alpha08"
+ "github.com/apache/incubator-kie-kogito-serverless-operator/utils"
+ "k8s.io/client-go/rest"
+)
+
+const (
+ prometheusGroup = "monitoring.coreos.com"
+)
+
+func GetPrometheusAvailability(cfg *rest.Config) (bool, error) {
Review Comment:
It looks like the prometheus operator can only be installed in a
per-namespace basis.
See screenshot below.
My question is: is the query above GetPrometheusAvailability is namespace
sensitive?
Also, if I'm not wrong, It looks like in the sonataflow_platformcontroller
we ask this function only to print a message saying if prometheus is available
or not, be we don't do the query when we create the ServiceMonitor for a
particular worklow.
So, it might happen that:
1) in namespace1 (without prometheus deployed) we still mark the platform
with spec.monitoring.enabled. (we'll get a warning in the logs)
2) we deploy a WF in namespace1
3) we still try to create the ServiceMonitor in namesapce1, not good.
@jianrongzhang89 I am correct?

##########
test/testdata/grafana.yaml:
##########
@@ -0,0 +1,26 @@
+# Copyright 2024 Apache Software Foundation (ASF)
Review Comment:
same here
##########
test/testdata/prometheus.yaml:
##########
@@ -0,0 +1,69 @@
+# Copyright 2024 Apache Software Foundation (ASF)
Review Comment:
same here
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]