asafm commented on code in PR #17041:
URL: https://github.com/apache/pulsar/pull/17041#discussion_r959266056


##########
pulsar-broker/src/test/java/org/apache/pulsar/broker/stats/PrometheusMetricsTest.java:
##########
@@ -1557,49 +1557,40 @@ public void testMetadataStoreStats() throws Exception {
         String metricsStr = output.toString();
         Multimap<String, Metric> metricsMap = parseMetrics(metricsStr);
 
-        Collection<Metric> getOpsFailed = 
metricsMap.get("pulsar_metadata_store_get_failed" + "_total");
-        Collection<Metric> delOpsFailed = 
metricsMap.get("pulsar_metadata_store_del_failed" + "_total");
-        Collection<Metric> putOpsFailed = 
metricsMap.get("pulsar_metadata_store_put_failed" + "_total");
-
-        Collection<Metric> getOpsLatency = 
metricsMap.get("pulsar_metadata_store_get_latency_ms" + "_sum");
-        Collection<Metric> delOpsLatency = 
metricsMap.get("pulsar_metadata_store_del_latency_ms" + "_sum");
-        Collection<Metric> putOpsLatency = 
metricsMap.get("pulsar_metadata_store_put_latency_ms" + "_sum");
-
+        Collection<Metric> opsFailed = 
metricsMap.get("pulsar_metadata_store_ops_failed" + "_total");
+        Collection<Metric> opsLatency = 
metricsMap.get("pulsar_metadata_store_ops_latency_ms" + "_sum");
         Collection<Metric> putBytes = 
metricsMap.get("pulsar_metadata_store_put_bytes" + "_total");
 
-        Assert.assertTrue(getOpsFailed.size() > 1);
-        Assert.assertTrue(delOpsFailed.size() > 1);
-        Assert.assertTrue(putOpsFailed.size() > 1);
-        Assert.assertTrue(getOpsLatency.size() > 1);
-        Assert.assertTrue(delOpsLatency.size() > 1);
-        Assert.assertTrue(putOpsLatency.size() > 1);
+        Assert.assertTrue(opsFailed.size() > 1);

Review Comment:
   I personally like [AssertJ](https://assertj.github.io/): 
`assertThat(opsFailed.size()).isGreaterThen(1);`
   When the test fails, you see the value of `size()`



##########
pulsar-broker/src/test/java/org/apache/pulsar/broker/stats/PrometheusMetricsTest.java:
##########
@@ -1557,49 +1557,40 @@ public void testMetadataStoreStats() throws Exception {
         String metricsStr = output.toString();
         Multimap<String, Metric> metricsMap = parseMetrics(metricsStr);
 
-        Collection<Metric> getOpsFailed = 
metricsMap.get("pulsar_metadata_store_get_failed" + "_total");
-        Collection<Metric> delOpsFailed = 
metricsMap.get("pulsar_metadata_store_del_failed" + "_total");
-        Collection<Metric> putOpsFailed = 
metricsMap.get("pulsar_metadata_store_put_failed" + "_total");
-
-        Collection<Metric> getOpsLatency = 
metricsMap.get("pulsar_metadata_store_get_latency_ms" + "_sum");
-        Collection<Metric> delOpsLatency = 
metricsMap.get("pulsar_metadata_store_del_latency_ms" + "_sum");
-        Collection<Metric> putOpsLatency = 
metricsMap.get("pulsar_metadata_store_put_latency_ms" + "_sum");
-
+        Collection<Metric> opsFailed = 
metricsMap.get("pulsar_metadata_store_ops_failed" + "_total");
+        Collection<Metric> opsLatency = 
metricsMap.get("pulsar_metadata_store_ops_latency_ms" + "_sum");
         Collection<Metric> putBytes = 
metricsMap.get("pulsar_metadata_store_put_bytes" + "_total");
 
-        Assert.assertTrue(getOpsFailed.size() > 1);
-        Assert.assertTrue(delOpsFailed.size() > 1);
-        Assert.assertTrue(putOpsFailed.size() > 1);
-        Assert.assertTrue(getOpsLatency.size() > 1);
-        Assert.assertTrue(delOpsLatency.size() > 1);
-        Assert.assertTrue(putOpsLatency.size() > 1);
+        Assert.assertTrue(opsFailed.size() > 1);
+        Assert.assertTrue(opsLatency.size() > 1);
         Assert.assertTrue(putBytes.size() > 1);
 
-        for (Metric m : getOpsFailed) {
-            Assert.assertEquals(m.tags.get("cluster"), "test");
-            Assert.assertTrue(m.tags.get("name").startsWith("metadata-store"));
-        }
-        for (Metric m : delOpsFailed) {
-            Assert.assertEquals(m.tags.get("cluster"), "test");
-            Assert.assertTrue(m.tags.get("name").startsWith("metadata-store"));
-        }
-        for (Metric m : putOpsFailed) {
-            Assert.assertEquals(m.tags.get("cluster"), "test");
-            Assert.assertTrue(m.tags.get("name").startsWith("metadata-store"));
-        }
-        for (Metric m : getOpsLatency) {
-            Assert.assertEquals(m.tags.get("cluster"), "test");
-            Assert.assertTrue(m.tags.get("name").startsWith("metadata-store"));
-            Assert.assertTrue(m.value > 0);
-        }
-        for (Metric m : delOpsLatency) {
+        for (Metric m : opsFailed) {
             Assert.assertEquals(m.tags.get("cluster"), "test");
             Assert.assertTrue(m.tags.get("name").startsWith("metadata-store"));
+            if (m.tags.get("type").equals("get")) {
+                Assert.assertTrue(m.value >= 0);

Review Comment:
   If you're not using assertJ, I would at least import static `Assert`



##########
pulsar-metadata/src/main/java/org/apache/pulsar/metadata/impl/stats/MetadataStoreStats.java:
##########
@@ -0,0 +1,109 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pulsar.metadata.impl.stats;
+
+import io.prometheus.client.Counter;
+import io.prometheus.client.Histogram;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+public final class MetadataStoreStats implements AutoCloseable {
+    private static final double[] BUCKETS = new double[]{1, 3, 5, 10, 20, 50, 
100, 200, 500, 1000, 2000, 5000};
+    private static final String OPS_TYPE_LABEL_NAME = "type";
+    private static final String METADATA_STORE_LABEL_NAME = "name";
+    private static final String OPS_TYPE_GET = "get";
+    private static final String OPS_TYPE_DEL = "del";
+    private static final String OPS_TYPE_PUT = "put";
+    protected static final String PREFIX = "pulsar_metadata_store_";
+
+    private static final Histogram OPS_SUCCEED = Histogram
+            .build(PREFIX + "ops_latency", "-")
+            .unit("ms")
+            .buckets(BUCKETS)
+            .labelNames(METADATA_STORE_LABEL_NAME, OPS_TYPE_LABEL_NAME)
+            .register();
+    private static final Counter OPS_FAILED = Counter
+            .build(PREFIX + "ops_failed", "-")
+            .labelNames(METADATA_STORE_LABEL_NAME, OPS_TYPE_LABEL_NAME)
+            .register();
+    private static final Counter PUT_BYTES = Counter
+            .build(PREFIX + "put_", "-")

Review Comment:
   Based on Prometheus code you need to change `put_` to `put`:
   ```java
       unit = b.unit;
       if (!unit.isEmpty() && !name.endsWith("_" + unit)) {
         name += "_" + unit;
       }
   ```



##########
pulsar-metadata/src/main/java/org/apache/pulsar/metadata/impl/stats/MetadataStoreStats.java:
##########
@@ -24,48 +24,32 @@
 
 public final class MetadataStoreStats implements AutoCloseable {
     private static final double[] BUCKETS = new double[]{1, 3, 5, 10, 20, 50, 
100, 200, 500, 1000, 2000, 5000};
-    private static final String LABEL_NAME = "name";
+    private static final String OPS_TYPE_LABEL_NAME = "type";
+    private static final String METADATA_STORE_LABEL_NAME = "name";
+    private static final String OPS_TYPE_GET = "get";
+    private static final String OPS_TYPE_DEL = "del";
+    private static final String OPS_TYPE_PUT = "put";
     protected static final String PREFIX = "pulsar_metadata_store_";
 
-    private static final Histogram GET_OPS_SUCCEED = Histogram
-            .build(PREFIX + "get_latency", "-")
+    private static final Histogram OPS_SUCCEED = Histogram

Review Comment:
   I would camelCase this - although it's static, it's not a constant, it's an 
actual object, so `opsSucceededHistogram`



##########
pulsar-metadata/src/main/java/org/apache/pulsar/metadata/impl/AbstractMetadataStore.java:
##########
@@ -126,6 +131,9 @@ public CompletableFuture<Boolean> asyncReload(String key, 
Boolean oldValue,
                         }
                     }
                 });
+
+        this.metadataStoreName = poolName + "-" + ID.getAndIncrement();
+        this.metadataStoreStats = new MetadataStoreStats(metadataStoreName);

Review Comment:
   `this.` seem redundant here. It's relevant to all places in this PR.
   



##########
pulsar-metadata/src/main/java/org/apache/pulsar/metadata/impl/AbstractMetadataStore.java:
##########
@@ -126,6 +131,9 @@ public CompletableFuture<Boolean> asyncReload(String key, 
Boolean oldValue,
                         }
                     }
                 });
+
+        this.metadataStoreName = poolName + "-" + ID.getAndIncrement();

Review Comment:
   I don't get this at all.
   1. Do we have more than one metadata store in Pulsar Broker?
   2. Why a name is derived from a thread pool name?
   3. Why the name is actually an auto-generated name?



##########
pulsar-metadata/src/main/java/org/apache/pulsar/metadata/impl/stats/MetadataStoreStats.java:
##########
@@ -0,0 +1,109 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pulsar.metadata.impl.stats;
+
+import io.prometheus.client.Counter;
+import io.prometheus.client.Histogram;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+public final class MetadataStoreStats implements AutoCloseable {
+    private static final double[] BUCKETS = new double[]{1, 3, 5, 10, 20, 50, 
100, 200, 500, 1000, 2000, 5000};
+    private static final String OPS_TYPE_LABEL_NAME = "type";
+    private static final String METADATA_STORE_LABEL_NAME = "name";
+    private static final String OPS_TYPE_GET = "get";
+    private static final String OPS_TYPE_DEL = "del";
+    private static final String OPS_TYPE_PUT = "put";
+    protected static final String PREFIX = "pulsar_metadata_store_";
+
+    private static final Histogram OPS_SUCCEED = Histogram
+            .build(PREFIX + "ops_latency", "-")
+            .unit("ms")
+            .buckets(BUCKETS)
+            .labelNames(METADATA_STORE_LABEL_NAME, OPS_TYPE_LABEL_NAME)
+            .register();
+    private static final Counter OPS_FAILED = Counter
+            .build(PREFIX + "ops_failed", "-")
+            .labelNames(METADATA_STORE_LABEL_NAME, OPS_TYPE_LABEL_NAME)
+            .register();
+    private static final Counter PUT_BYTES = Counter
+            .build(PREFIX + "put_", "-")
+            .unit("bytes")
+            .labelNames(METADATA_STORE_LABEL_NAME)
+            .register();
+
+    private final Histogram.Child getOpsSucceedChild;
+    private final Histogram.Child delOpsSucceedChild;
+    private final Histogram.Child putOpsSucceedChild;
+    private final Counter.Child getOpsFailedChild;
+    private final Counter.Child delOpsFailedChild;
+    private final Counter.Child putOpsFailedChild;
+    private final Counter.Child putBytesChild;
+    private final String metadataStoreName;
+    private final AtomicBoolean closed = new AtomicBoolean(false);
+
+    public MetadataStoreStats(String metadataStoreName) {
+        this.metadataStoreName = metadataStoreName;
+
+        this.getOpsSucceedChild = OPS_SUCCEED.labels(metadataStoreName, 
OPS_TYPE_GET);
+        this.delOpsSucceedChild = OPS_SUCCEED.labels(metadataStoreName, 
OPS_TYPE_DEL);
+        this.putOpsSucceedChild = OPS_SUCCEED.labels(metadataStoreName, 
OPS_TYPE_PUT);
+        this.getOpsFailedChild = OPS_FAILED.labels(metadataStoreName, 
OPS_TYPE_GET);
+        this.delOpsFailedChild = OPS_FAILED.labels(metadataStoreName, 
OPS_TYPE_DEL);
+        this.putOpsFailedChild = OPS_FAILED.labels(metadataStoreName, 
OPS_TYPE_PUT);
+        this.putBytesChild = PUT_BYTES.labels(metadataStoreName);
+    }
+
+    public void recordGetOpsLatency(long millis) {
+        this.getOpsSucceedChild.observe(millis);
+    }
+
+    public void recordDelOpsLatency(long millis) {
+        this.delOpsSucceedChild.observe(millis);
+    }
+
+    public void recordPutOpsLatency(long millis, int bytes) {
+        this.putOpsSucceedChild.observe(millis);
+        this.putBytesChild.inc(bytes);
+    }
+
+    public void recordGetOpsFailed() {
+        this.getOpsFailedChild.inc();
+    }
+
+    public void recordDelOpsFailed() {
+        this.delOpsFailedChild.inc();
+    }
+
+    public void recordPutOpsFailed() {
+        this.putOpsFailedChild.inc();
+    }
+
+    @Override
+    public void close() throws Exception {
+        if (this.closed.compareAndSet(false, true)) {

Review Comment:
   I wouldn't bother with this check - it's a developer bug to call close twice



##########
pulsar-metadata/src/main/java/org/apache/pulsar/metadata/impl/stats/MetadataStoreStats.java:
##########
@@ -76,12 +60,12 @@ public final class MetadataStoreStats implements 
AutoCloseable {
     public MetadataStoreStats(String metadataStoreName) {
         this.metadataStoreName = metadataStoreName;
 
-        this.getOpsSucceedChild = GET_OPS_SUCCEED.labels(metadataStoreName);
-        this.delOpsSucceedChild = DEL_OPS_SUCCEED.labels(metadataStoreName);
-        this.puttOpsSucceedChild = PUT_OPS_SUCCEED.labels(metadataStoreName);
-        this.getOpsFailedChild = GET_OPS_FAILED.labels(metadataStoreName);
-        this.delOpsFailedChild = DEL_OPS_FAILED.labels(metadataStoreName);
-        this.putOpsFailedChild = PUT_OPS_FAILED.labels(metadataStoreName);
+        this.getOpsSucceedChild = OPS_SUCCEED.labels(metadataStoreName, 
OPS_TYPE_GET);

Review Comment:
   why `this.` prefix? There is no collision in constructor arguments.



##########
pulsar-metadata/src/main/java/org/apache/pulsar/metadata/impl/stats/MetadataStoreStats.java:
##########
@@ -0,0 +1,109 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pulsar.metadata.impl.stats;
+
+import io.prometheus.client.Counter;
+import io.prometheus.client.Histogram;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+public final class MetadataStoreStats implements AutoCloseable {
+    private static final double[] BUCKETS = new double[]{1, 3, 5, 10, 20, 50, 
100, 200, 500, 1000, 2000, 5000};
+    private static final String OPS_TYPE_LABEL_NAME = "type";
+    private static final String METADATA_STORE_LABEL_NAME = "name";
+    private static final String OPS_TYPE_GET = "get";
+    private static final String OPS_TYPE_DEL = "del";
+    private static final String OPS_TYPE_PUT = "put";
+    protected static final String PREFIX = "pulsar_metadata_store_";
+
+    private static final Histogram OPS_SUCCEED = Histogram
+            .build(PREFIX + "ops_latency", "-")
+            .unit("ms")
+            .buckets(BUCKETS)
+            .labelNames(METADATA_STORE_LABEL_NAME, OPS_TYPE_LABEL_NAME)
+            .register();
+    private static final Counter OPS_FAILED = Counter

Review Comment:
   I think we're missing a key stat here - latency for failed ops. From my 
experience, this is sometimes key to figuring out production issues. I would 
record latency for this and not just counter



##########
pulsar-metadata/src/main/java/org/apache/pulsar/metadata/impl/stats/MetadataStoreStats.java:
##########
@@ -24,48 +24,32 @@
 
 public final class MetadataStoreStats implements AutoCloseable {
     private static final double[] BUCKETS = new double[]{1, 3, 5, 10, 20, 50, 
100, 200, 500, 1000, 2000, 5000};
-    private static final String LABEL_NAME = "name";
+    private static final String OPS_TYPE_LABEL_NAME = "type";
+    private static final String METADATA_STORE_LABEL_NAME = "name";
+    private static final String OPS_TYPE_GET = "get";
+    private static final String OPS_TYPE_DEL = "del";
+    private static final String OPS_TYPE_PUT = "put";
     protected static final String PREFIX = "pulsar_metadata_store_";
 
-    private static final Histogram GET_OPS_SUCCEED = Histogram
-            .build(PREFIX + "get_latency", "-")
+    private static final Histogram OPS_SUCCEED = Histogram
+            .build(PREFIX + "ops_latency", "-")
             .unit("ms")
             .buckets(BUCKETS)
-            .labelNames(LABEL_NAME)
+            .labelNames(METADATA_STORE_LABEL_NAME, OPS_TYPE_LABEL_NAME)

Review Comment:
   Can you explain the meaning of the metadata store name? I haven't seen in 
the implementation of metadata stores any name.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to