kevdoran commented on a change in pull request #3257: NIFI-5435 Prometheus /metrics http endpoint for monitoring integration URL: https://github.com/apache/nifi/pull/3257#discussion_r272283341
########## File path: nifi-nar-bundles/nifi-prometheus-bundle/nifi-prometheus-reporting-task/src/main/java/org/apache/nifi/reporting/prometheus/api/PrometheusMetricsUtil.java ########## @@ -0,0 +1,134 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.nifi.reporting.prometheus.api; + +import java.util.Collection; +import java.util.Map; + +import org.apache.nifi.controller.status.ProcessGroupStatus; +import org.apache.nifi.controller.status.ProcessorStatus; + +import com.yammer.metrics.core.VirtualMachineMetrics; + +import io.prometheus.client.CollectorRegistry; +import io.prometheus.client.Gauge; + +public class PrometheusMetricsUtil { + private static final CollectorRegistry NIFI_REGISTRY = new CollectorRegistry(); + private static final CollectorRegistry JVM_REGISTRY = new CollectorRegistry(); + + private static final Gauge AMOUNT_FLOWFILES_TOTAL = Gauge.build() + .name("process_group_amount_flowfiles_total") + .help("Total number of FlowFiles in ProcessGroup") + .labelNames("status", "application", "process_group") + .register(NIFI_REGISTRY); + + private static final Gauge AMOUNT_BYTES_TOTAL = Gauge.build() + .name("process_group_amount_bytes_total") + .help("Total number of Bytes in ProcessGroup") + .labelNames("status", "application", "process_group") + .register(NIFI_REGISTRY); + + private static final Gauge AMOUNT_THREADS_TOTAL = Gauge.build() + .name("process_group_amount_threads_total") + .help("Total amount of threads in ProcessGroup") + .labelNames("status", "application", "process_group") + .register(NIFI_REGISTRY); + + private static final Gauge SIZE_CONTENT_TOTAL = Gauge.build() + .name("process_group_size_content_total") + .help("Total size of content in ProcessGroup") + .labelNames("status", "application", "process_group") + .register(NIFI_REGISTRY); + + private static final Gauge AMOUNT_ITEMS = Gauge.build() + .name("process_group_amount_items") + .help("Total amount of items in ProcessGroup") + .labelNames("status", "application", "process_group") + .register(NIFI_REGISTRY); + + private static final Gauge PROCESSOR_COUNTERS = Gauge.build() + .name("processor_counters") + .help("Counters exposed by Processors") + .labelNames("processor_name", "counter_name", "processor_id") + .register(NIFI_REGISTRY); + + private static final Gauge JVM_HEAP = Gauge.build() + .name("jvm_heap_stats") + .help("The JVM heap stats") + .labelNames("status") + .register(JVM_REGISTRY); + + private static final Gauge JVM_THREAD = Gauge.build() + .name("jvm_thread_stats") + .help("The JVM thread stats") + .labelNames("status") + .register(JVM_REGISTRY); + + private static final Gauge JVM_STATUS = Gauge.build() + .name("jvm_general_stats") + .help("The JVM general stats") + .labelNames("status") Review comment: Questions and Comments: 1. Instead of using application=nifi as a label, I think it would be better to give all metrics a `nifi_` prefix so that they are easily located in a multi-application prometheus. I would include this prefix for JVM metrics as well, i.e., `nifi_jvm_` 2. As stated in my other comment, an `instance` label is missing. I think it should be included on by NiFi and JVM metrics, correct? 3. I am not comfortable with the approach of using the metric `name` as a category and the `status` label value as the actual metric. Please refer to and follow (within reason) the naming best practices set forth in the [Prometheus Documentation for Metric and Label Naming](https://prometheus.io/docs/practices/naming/). In particular: _A single metric name should represent the same logical thing being measured_. The other naming best practices in the link above apply as well. 4. For process_group label values, you are using process group names, which do not need to be unique. For example, if I create a process group named "NiFi Flow", its metrics get combined with the root process group's metrics. I think it would be better to use process group ID, which is guaranteed to be unique and cannot change. alternatively, you could use a "fully qualified" process group name, which includes the names of the PGs in its hierarchy, but I'm not sure how easy that will be to create each time... ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
