codelipenghui commented on code in PR #22058:
URL: https://github.com/apache/pulsar/pull/22058#discussion_r1510510254


##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BrokerService.java:
##########
@@ -403,15 +411,41 @@ public BrokerService(PulsarService pulsar, EventLoopGroup 
eventLoopGroup) throws
         this.defaultServerBootstrap = defaultServerBootstrap();
 
         this.pendingLookupRequests = 
ObserverGauge.build("pulsar_broker_lookup_pending_requests", "-")
-                .supplier(() -> 
pulsar.getConfig().getMaxConcurrentLookupRequest()
-                        - lookupRequestSemaphore.get().availablePermits())
+                .supplier(this::getPendingLookupRequest)
                 .register();
+        this.pendingLookupRequestsCounter = 
pulsar.getOpenTelemetry().getMeter()
+                .gaugeBuilder("pulsar.broker.lookup.pending.request.usage")
+                .ofLongs()
+                .setDescription("The number of pending lookup requests in the 
broker. "
+                        + "When it reaches threshold 
\"maxConcurrentLookupRequest\" defined in broker.conf, "
+                        + "new requests are rejected.")
+                .buildWithCallback(measurement -> 
measurement.record(getPendingLookupRequest()));
+        this.pendingLookupRequestsLimit = pulsar.getOpenTelemetry().getMeter()
+                .gaugeBuilder("pulsar.broker.lookup.pending.request.limit")
+                .ofLongs()
+                .setDescription("The maximum number of pending lookup requests 
in the broker. "
+                        + "Equal to \"maxConcurrentLookupRequest\" defined in 
broker.conf.")
+                .buildWithCallback(
+                        measurement -> 
measurement.record(pulsar.getConfig().getMaxConcurrentLookupRequest()));
 
         this.pendingTopicLoadRequests = ObserverGauge.build(
-                "pulsar_broker_topic_load_pending_requests", "-")
-                .supplier(() -> 
pulsar.getConfig().getMaxConcurrentTopicLoadRequest()
-                        - topicLoadRequestSemaphore.get().availablePermits())
+                        "pulsar_broker_topic_load_pending_requests", "-")
+                .supplier(this::getPendingTopicLoadRequests)
                 .register();
+        this.pendingTopicLoadRequestsCounter = 
pulsar.getOpenTelemetry().getMeter()
+                .gaugeBuilder("pulsar.broker.topic.load.pending.request.usage")

Review Comment:
   @asafm 
   
   Oh, I thought the namespace means Pulsar namespace before.
   
   1. So I understand correctly, we will have a couple of metric namespaces:
   
   - pulsar.broker.*
   - pulsar.tenant.*
   - pulsar.namespace.*
   - pulsar.topic.*
   - pulsar.subscription.*
   
   Is it correct? And, `pulsar.broker.topic.load.concurrent.*` also looks good 
to me.
   
   2. But I think it should be covered by metrics for other APIs? For example, 
deleting a topic will validate the ownership of the topic. The the requested 
broker is not the owner of the topic, the broker will return 307 to the HTTP 
client. I think we should have metrics for topic deletion API. For topic 
lookup, we usually pay more attention to the client lookup requests and the 
metadata operations from the lookup request (We will skip the metadata 
operation if the broker cached metadata), not the internal calls to the broker 
cached metadata (validate topic ownership). And for the metadata operation, we 
will have metadata driver level metrics.
   
   And, at least we need `lookup.request` for the client side lookup requests. 
Otherwise, we don't know how many lookup `operation` is from client side lookup 
requests or internal topic ownership validation.



##########
pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BrokerService.java:
##########
@@ -403,15 +411,41 @@ public BrokerService(PulsarService pulsar, EventLoopGroup 
eventLoopGroup) throws
         this.defaultServerBootstrap = defaultServerBootstrap();
 
         this.pendingLookupRequests = 
ObserverGauge.build("pulsar_broker_lookup_pending_requests", "-")
-                .supplier(() -> 
pulsar.getConfig().getMaxConcurrentLookupRequest()
-                        - lookupRequestSemaphore.get().availablePermits())
+                .supplier(this::getPendingLookupRequest)
                 .register();
+        this.pendingLookupRequestsCounter = 
pulsar.getOpenTelemetry().getMeter()
+                .gaugeBuilder("pulsar.broker.lookup.pending.request.usage")
+                .ofLongs()
+                .setDescription("The number of pending lookup requests in the 
broker. "
+                        + "When it reaches threshold 
\"maxConcurrentLookupRequest\" defined in broker.conf, "
+                        + "new requests are rejected.")
+                .buildWithCallback(measurement -> 
measurement.record(getPendingLookupRequest()));
+        this.pendingLookupRequestsLimit = pulsar.getOpenTelemetry().getMeter()
+                .gaugeBuilder("pulsar.broker.lookup.pending.request.limit")
+                .ofLongs()
+                .setDescription("The maximum number of pending lookup requests 
in the broker. "
+                        + "Equal to \"maxConcurrentLookupRequest\" defined in 
broker.conf.")
+                .buildWithCallback(
+                        measurement -> 
measurement.record(pulsar.getConfig().getMaxConcurrentLookupRequest()));
 
         this.pendingTopicLoadRequests = ObserverGauge.build(
-                "pulsar_broker_topic_load_pending_requests", "-")
-                .supplier(() -> 
pulsar.getConfig().getMaxConcurrentTopicLoadRequest()
-                        - topicLoadRequestSemaphore.get().availablePermits())
+                        "pulsar_broker_topic_load_pending_requests", "-")
+                .supplier(this::getPendingTopicLoadRequests)
                 .register();
+        this.pendingTopicLoadRequestsCounter = 
pulsar.getOpenTelemetry().getMeter()
+                .gaugeBuilder("pulsar.broker.topic.load.pending.request.usage")

Review Comment:
   @asafm 
   
   Oh, I thought the namespace means Pulsar namespace before.
   
   1. So I understand correctly, we will have a couple of metric namespaces:
   
   - pulsar.broker.*
   - pulsar.tenant.*
   - pulsar.namespace.*
   - pulsar.topic.*
   - pulsar.subscription.*
   
   Is it correct? And, `pulsar.broker.topic.load.concurrent.*` also looks good 
to me.
   
   2. But I think it should be covered by metrics for other APIs? For example, 
deleting a topic will validate the ownership of the topic. The the requested 
broker is not the owner of the topic, the broker will return 307 to the HTTP 
client. I think we should have metrics for topic deletion API. For topic 
lookup, we usually pay more attention to the client lookup requests and the 
metadata operations from the lookup request (We will skip the metadata 
operation if the broker cached metadata), not the internal calls to the broker 
cached metadata (validate topic ownership). And for the metadata operation, we 
will have metadata driver level metrics.
   
   And, at least we still need `lookup.request` for the client side lookup 
requests. Otherwise, we don't know how many lookup `operation` is from client 
side lookup requests or internal topic ownership validation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to