xBis7 opened a new pull request, #3878:
URL: https://github.com/apache/ozone/pull/3878
## What changes were proposed in this pull request?
On the OM Prometheus endpoint, the DecayRpcScheduler per-user summary exposes
the username in the metric name. This makes these values almost impossible to
monitor, because every time a new user shows up we need to register a new
metric name.
For example, for a user with the username `hadoop`, the metric name
`org_apache_hadoop_ipc_decay_rpc_scheduler_call_volume` becomes
`org_apache_hadoop_ipc_decay_rpc_scheduler_caller_hadoop_volume`.
The proposed solution is to remove the username from the metric name and
expose it in a `username` tag instead.
This metric comes from `DecayRpcScheduler` in `hadoop-common-3.3.4.jar`, more
specifically from:
```
Metrics2Util.NameValuePair entry =
(Metrics2Util.NameValuePair)topNCallers.poll();
String topCaller = "Caller(" + entry.getName() + ")";
String topCallerVolume = topCaller + ".Volume";
String topCallerPriority = topCaller + ".Priority";
rb.addCounter(Interns.info(topCallerVolume, topCallerVolume),
entry.getValue());
```
The name has the format `Caller(username).MetricType`, e.g.
`Caller(hadoop).Volume`. The cleanest way to deal with this seems to be to
filter the metric in `PrometheusMetricsSink`.
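As a rough illustration of that filtering step, the sketch below splits a name of the form `Caller(username).MetricType` into its username and metric type, so the username could be emitted as a tag. The class `CallerMetricName`, its `Parsed` holder, and the regex are hypothetical names for this example and are not taken from the patch; the assumption that only `Volume` and `Priority` occur as metric types is based on the snippet above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the patch: recovers the username and the
// metric type from a DecayRpcScheduler per-caller metric name.
class CallerMetricName {

    // Matches names such as "Caller(hadoop).Volume" or "Caller(hadoop).Priority".
    // Assumption: Volume and Priority are the only per-caller metric types.
    private static final Pattern CALLER_METRIC =
        Pattern.compile("^Caller\\((.+)\\)\\.(Volume|Priority)$");

    /** The two pieces recovered from a per-caller metric name. */
    static final class Parsed {
        final String username;
        final String metricType;

        Parsed(String username, String metricType) {
            this.username = username;
            this.metricType = metricType;
        }
    }

    /** Returns the parsed pieces, or empty if the name is not a per-caller metric. */
    static Optional<Parsed> parse(String metricName) {
        Matcher m = CALLER_METRIC.matcher(metricName);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new Parsed(m.group(1), m.group(2)));
    }

    public static void main(String[] args) {
        // Prints "Volume username=hadoop"
        parse("Caller(hadoop).Volume").ifPresent(p ->
            System.out.println(p.metricType + " username=" + p.username));
    }
}
```

Names that do not match, such as other scheduler metrics, come back as an empty `Optional`, so a sink could leave them untouched.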
## What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-7394
## How was this patch tested?
This patch was tested manually, with a docker cluster and the OM `/prom`
endpoint.
To test it, add the following to `docker-config` in `compose/ozone`:
```
CORE-SITE.XML_ipc.9862.callqueue.impl=org.apache.hadoop.ipc.FairCallQueue
CORE-SITE.XML_ipc.9862.scheduler.impl=org.apache.hadoop.ipc.DecayRpcScheduler
CORE-SITE.XML_ipc.9862.scheduler.priority.levels=2
CORE-SITE.XML_ipc.9862.backoff.enable=true
CORE-SITE.XML_ipc.9862.faircallqueue.multiplexer.weights=99,1
CORE-SITE.XML_ipc.9862.decay-scheduler.thresholds=90
OZONE-SITE.XML_ozone.om.address=0.0.0.0:9862
```
Then run:
```
$ export COMPOSE_FILE=docker-compose.yaml:monitoring.yaml
$ docker-compose up --scale datanode=3 -d
$ docker exec -it ozone_s3g_1 bash
bash-4.2$ export AWS_ACCESS_KEY=test AWS_SECRET_KEY=pass
bash-4.2$ ozone freon s3bg -t 1 -n 10
```
In your browser, go to `http://localhost:9874/prom`; you should see:
```
# TYPE org_apache_hadoop_ipc_decay_rpc_scheduler_priority counter
org_apache_hadoop_ipc_decay_rpc_scheduler_priority{context="ipc.9862",hostname="e32f2e3bddb9",username="hadoop"} 1
...
...
...
# TYPE org_apache_hadoop_ipc_decay_rpc_scheduler_volume counter
org_apache_hadoop_ipc_decay_rpc_scheduler_volume{context="ipc.9862",hostname="e32f2e3bddb9",username="hadoop"} 21
```
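The manual check above could also be approximated in code. The sketch below scans text scraped from `/prom` and reports any DecayRpcScheduler sample whose metric name still embeds a caller. The class `PromOutputCheck` and its method are hypothetical names for this example, and the `caller_` prefix is assumed from the old metric name quoted earlier in this description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical check, not part of the patch: finds samples whose metric name
// still contains a caller, e.g. "..._caller_hadoop_volume".
class PromOutputCheck {

    private static final String OLD_STYLE_PREFIX =
        "org_apache_hadoop_ipc_decay_rpc_scheduler_caller_";

    /** Returns the sample lines whose metric name starts with the old-style prefix. */
    static List<String> offendingLines(String expositionText) {
        List<String> bad = new ArrayList<>();
        for (String line : expositionText.split("\n")) {
            String trimmed = line.trim();
            // Skip blank lines and "# TYPE" / "# HELP" comment lines.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            // The metric name ends at the label block, or at the value if there are no labels.
            int brace = trimmed.indexOf('{');
            int space = trimmed.indexOf(' ');
            int end = brace >= 0 ? brace : (space >= 0 ? space : trimmed.length());
            String name = trimmed.substring(0, end);
            if (name.startsWith(OLD_STYLE_PREFIX)) {
                bad.add(trimmed);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        String scraped =
            "# TYPE org_apache_hadoop_ipc_decay_rpc_scheduler_volume counter\n"
            + "org_apache_hadoop_ipc_decay_rpc_scheduler_volume"
            + "{context=\"ipc.9862\",username=\"hadoop\"} 21\n";
        System.out.println("old-style samples: " + offendingLines(scraped).size());
    }
}
```

With the patch applied, the expectation is that the returned list is empty for the OM endpoint.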