rdhabalia opened a new issue, #20265:
URL: https://github.com/apache/pulsar/issues/20265
# Motivation
Today, Pulsar exposes a topic's stats and internal stats over the HTTP admin
API. User applications consume this data, and so do internal Pulsar components
such as Pulsar Functions, to derive the state of their applications.
For example, an application may need to check a topic's backlog, a
subscription's state (readPosition, list of subscriptions),
numberOfEntriesSinceFirstNotAckedMessage, etc., either to bootstrap itself or
to manage its resiliency and state dynamically. Applications can retrieve this
stats information through the broker's admin HTTP APIs.
However, stats retrieval over the HTTP API does not scale well. When a large
number of application nodes poll it, they can overload brokers, sometimes
making a broker unresponsive and degrading admin API performance. It also
becomes difficult when Pulsar is deployed in the cloud behind an SNI proxy and
applications need to fetch large volumes of stats periodically over a separate
HTTP port. For both scalability and accessibility, it would be better if
applications could fetch stats over the same binary protocol they already use.
Therefore, there are multiple use cases where producer/consumer applications
need topic stats through the client library over the binary protocol. This PIP
introduces a client API that lets producers and consumers access a topic's
stats/internal-stats information, which applications can use as needed.
# Client library changes
The client library will allow both producers and consumers to retrieve stats
through a `TopicStatsProvider`, which fetches stats/internal-stats for both
persistent and non-persistent topics. A producer or consumer of a partition can
reuse its existing connection to the broker to retrieve the topic's stats.
Pulsar also allows an application to access multiple partitions/topics through
a single producer/consumer, so the producer/consumer should provide a generic
API that returns stats for one or many topics/partitions served by it.
Therefore, the producer/consumer will have an additional API that returns a
`TopicStatsProvider`, which allows the application to fetch stats/internal-stats
of all topics/partitions served by that producer/consumer.
```
// Producer.java / Consumer.java
/**
 * Returns {@link TopicStatsProvider} to retrieve stats/internal-stats of the
 * topic.
 */
TopicStatsProvider getTopicStatsProvider();
```
```
// TopicStatsProvider.java
import java.util.concurrent.CompletableFuture;

/**
 * TopicStatsProvider provides an API to access a topic's stats and
 * internal-stats information.
 */
public interface TopicStatsProvider {
    /**
     * @return the topic stats
     */
    CompletableFuture<TopicStatsInfo> getStats();

    /**
     * @return the internal topic stats
     */
    CompletableFuture<TopicInternalStatsInfo> getInternalStats();
}
```
```
// TopicStatsInfo.java
import java.util.Map;

/**
 * TopicStatsInfo contains the stats information of all partitions of a given
 * topic. It allows the client to access stats of each individual partition
 * from the partition stats map.
 */
public class TopicStatsInfo {
    private Map<String, TopicStats> partitions;
}

// TopicInternalStatsInfo.java
import java.util.Map;

/**
 * TopicInternalStatsInfo contains the internal-stats information of a topic.
 * It allows the client to access stats of partitioned and non-partitioned
 * topics.
 */
public class TopicInternalStatsInfo {
    private Map<String, PersistentTopicInternalStats> partitions;
}
```
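As a rough illustration of how an application might consume this API, the sketch below sums the backlog across all partitions returned by `getStats()`. It is not part of the proposal: the nested `TopicStats`, `TopicStatsInfo`, and `TopicStatsProvider` types are minimal stand-ins, and the `backlogSize` field and the direct access to `partitions` are assumptions made so the example compiles on its own.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

class BacklogCheckSketch {
    // Stand-in for Pulsar's TopicStats; only the field this sketch reads (assumed name).
    static final class TopicStats {
        final long backlogSize;
        TopicStats(long backlogSize) { this.backlogSize = backlogSize; }
    }

    // Stand-in for the proposed TopicStatsInfo; exposing the map directly is an assumption.
    static final class TopicStatsInfo {
        final Map<String, TopicStats> partitions;
        TopicStatsInfo(Map<String, TopicStats> partitions) { this.partitions = partitions; }
    }

    // Mirrors the proposed interface, reduced to getStats().
    interface TopicStatsProvider {
        CompletableFuture<TopicStatsInfo> getStats();
    }

    // Sums the backlog over every partition served by the producer/consumer.
    static CompletableFuture<Long> totalBacklog(TopicStatsProvider provider) {
        return provider.getStats().thenApply(info ->
                info.partitions.values().stream()
                        .mapToLong(stats -> stats.backlogSize)
                        .sum());
    }

    public static void main(String[] args) {
        Map<String, TopicStats> partitions = new HashMap<>();
        partitions.put("persistent://public/default/t-partition-0", new TopicStats(10));
        partitions.put("persistent://public/default/t-partition-1", new TopicStats(32));
        // A fake provider standing in for producer.getTopicStatsProvider().
        TopicStatsProvider fake =
                () -> CompletableFuture.completedFuture(new TopicStatsInfo(partitions));
        System.out.println(totalBacklog(fake).join()); // prints 42
    }
}
```

Because the result is a `CompletableFuture`, an application could chain this check into its startup logic without blocking the thread that owns the producer or consumer.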
We would like to create a generic stats protocol between client and broker, so
that the broker can send any type of stats (e.g. stats or internal-stats) by
serializing it into a specific format, and the client can deserialize it into
the appropriate type and return it to the client application.
Therefore, this PIP will introduce a new wire-protocol command for `TopicStats`:
the client sends the stats type and topic partition to the broker, and the
broker sends back a serialized stats response, which the client deserializes
according to the stats type.
```
message CommandTopicStats {
    enum StatsType {
        STATS = 0;
        STATS_INTERNAL = 1;
    }
    required uint64 request_id = 1;
    required string topic_name = 2;
    required StatsType stats_type = 3;
}

message CommandTopicStatsResponse {
    required uint64 request_id = 1;
    optional ServerError error_code = 2;
    optional string error_message = 3;
    optional string stats_json = 4;
}
```
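The `request_id` field is what lets a client match each `CommandTopicStatsResponse` to the request that produced it when several are in flight on one connection. A minimal sketch of that correlation follows. The `StatsRequestTracker` class and its method names are hypothetical, and treating a non-null error message as failure is a simplification of the `error_code`/`error_message` pair; the actual client implementation may differ.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: tracks in-flight stats requests by request_id.
class StatsRequestTracker {
    // A request id paired with the future that will receive stats_json.
    static final class Pending {
        final long requestId;
        final CompletableFuture<String> statsJson;
        Pending(long requestId, CompletableFuture<String> statsJson) {
            this.requestId = requestId;
            this.statsJson = statsJson;
        }
    }

    private final AtomicLong nextRequestId = new AtomicLong();
    private final ConcurrentHashMap<Long, CompletableFuture<String>> pending =
            new ConcurrentHashMap<>();

    // Allocates a fresh request_id and registers a future for its response.
    Pending newRequest() {
        long id = nextRequestId.getAndIncrement();
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(id, future);
        return new Pending(id, future);
    }

    // Called when a response arrives; errorMessage == null means success.
    void onResponse(long requestId, String errorMessage, String statsJson) {
        CompletableFuture<String> future = pending.remove(requestId);
        if (future == null) {
            return; // unknown or already-completed request; ignore
        }
        if (errorMessage != null) {
            future.completeExceptionally(new IllegalStateException(errorMessage));
        } else {
            future.complete(statsJson);
        }
    }
}
```

The caller would then deserialize the completed `stats_json` string into the type that matches the `StatsType` it originally requested.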
# Broker changes
The broker will handle the stats command for a partition and return the
stats/internal-stats response for that partition. Before responding, the broker
checks that the connected client is authorized to access the topic's stats, and
returns the stats only once the client is successfully authorized.
This PIP will also limit the number of concurrent stats requests that clients
and brokers handle, to protect brokers against a large number of such
requests.
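The proposal does not say how this limit is enforced. One common approach, shown here purely as an assumption, is a semaphore that rejects a request immediately when too many are already in flight. The `StatsRequestLimiter` class and its error message are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical limiter: caps the number of in-flight stats requests.
class StatsRequestLimiter {
    private final Semaphore permits;

    StatsRequestLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    // Runs the request if a permit is free; otherwise fails fast without queueing.
    CompletableFuture<String> submit(Supplier<CompletableFuture<String>> request) {
        if (!permits.tryAcquire()) {
            CompletableFuture<String> rejected = new CompletableFuture<>();
            rejected.completeExceptionally(
                    new IllegalStateException("too many concurrent stats requests"));
            return rejected;
        }
        try {
            // Release the permit when the request finishes, successfully or not.
            return request.get().whenComplete((result, error) -> permits.release());
        } catch (RuntimeException e) {
            permits.release(); // the request never started; give the permit back
            throw e;
        }
    }
}
```

Failing fast rather than queueing keeps a burst of stats calls from piling up behind a slow broker, at the cost of requiring callers to retry.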
[Sample prototype with new API support](https://github.com/apache/pulsar/compare/master...rdhabalia:stats_proto?expand=1)