asafm commented on PR #15558: URL: https://github.com/apache/pulsar/pull/15558#issuecomment-1216433375
@marksilcox I'm compiling a document detailing exactly how the metrics system works in Pulsar today, and while doing so I've come to understand how Pulsar Function metrics work. The bad news: they also violate the rule that all samples of a metric name must be grouped together. You will only hit this when you scrape the metrics of a Pulsar Function Worker process, or of a Pulsar Broker that runs the worker embedded.

The code is in `FunctionsStatsGenerator`. It iterates over the function runtimes on the worker. For example, if this worker runs function A with a parallelism of 2 in Process mode, it spins up two operating system processes, each running `JavaInstanceStarter`, which executes the function. Each process runs a Prometheus `HTTPServer` that exposes its metrics on a defined metrics port. Every process exposes the same metric names, so when `FunctionsStatsGenerator` aggregates them, it simply dumps each process's output, one after the other, into the same stream, and the same metric name ends up in several separate groups.

This will probably require a separate PR with a different fix.
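To make the grouping problem concrete, here is a minimal, self-contained sketch. It is not the code in `FunctionsStatsGenerator`: the class `MetricGrouper`, its methods, and the sample metric lines are hypothetical and exist only for illustration. It assumes plain counter/gauge samples in the Prometheus text format and ignores the `_bucket`/`_sum`/`_count` suffixes of histograms and summaries. It shows that concatenating the output of two instance processes splits a metric name across separate groups, and that regrouping the lines by metric name is one possible way to avoid that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of Pulsar.
final class MetricGrouper {

    // Returns the metric name of a "# HELP"/"# TYPE" line or of a sample line.
    static String metricName(String line) {
        if (line.startsWith("# HELP ") || line.startsWith("# TYPE ")) {
            return line.split("\\s+", 4)[2];
        }
        int end = 0;
        while (end < line.length() && line.charAt(end) != '{'
                && !Character.isWhitespace(line.charAt(end))) {
            end++;
        }
        return line.substring(0, end);
    }

    // What the comment describes: each process's output appended to one stream.
    static String naiveConcat(List<String> outputs) {
        return String.join("", outputs);
    }

    // Sketch of a fix: collect lines per metric name, then emit one group per name.
    static String mergeGrouped(List<String> outputs) {
        Map<String, List<String>> byName = new LinkedHashMap<>();
        for (String output : outputs) {
            for (String line : output.split("\n")) {
                if (line.trim().isEmpty()) {
                    continue;
                }
                List<String> group =
                        byName.computeIfAbsent(metricName(line), k -> new ArrayList<>());
                // Keep a single copy of identical HELP/TYPE lines.
                if (line.startsWith("# ") && group.contains(line)) {
                    continue;
                }
                group.add(line);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (List<String> group : byName.values()) {
            for (String line : group) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // True if every metric name appears in exactly one contiguous run of lines.
    static boolean isGrouped(String exposition) {
        Set<String> finished = new HashSet<>();
        String current = null;
        for (String line : exposition.split("\n")) {
            if (line.trim().isEmpty()) {
                continue;
            }
            String name = metricName(line);
            if (name.equals(current)) {
                continue;
            }
            if (current != null) {
                finished.add(current);
            }
            if (finished.contains(name)) {
                return false;
            }
            current = name;
        }
        return true;
    }

    public static void main(String[] args) {
        // Illustrative output of two instance processes of the same function.
        String p0 = "# TYPE pulsar_function_received_total counter\n"
                + "pulsar_function_received_total{instance_id=\"0\"} 5\n"
                + "# TYPE pulsar_function_processed_successfully_total counter\n"
                + "pulsar_function_processed_successfully_total{instance_id=\"0\"} 4\n";
        String p1 = p0.replace("instance_id=\"0\"", "instance_id=\"1\"");
        List<String> outputs = Arrays.asList(p0, p1);
        System.out.println("naive grouped?  " + isGrouped(naiveConcat(outputs)));
        System.out.println("merged grouped? " + isGrouped(mergeGrouped(outputs)));
        System.out.print(mergeGrouped(outputs));
    }
}
```

Buffering every line before writing costs memory in proportion to the total output size, so a real fix might prefer a different approach; this sketch only demonstrates the invariant that the combined output has to satisfy.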
