Marton Greber has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21723 )

Change subject: Add Prometheus HTTP service discovery
......................................................................


Patch Set 3:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG@14
PS3, Line 14: servers. This returns the Kudu Master and TServers grouped, 
according
> BTW, how is it going to work when Kudu webserver end-points are secured usi
Yes you are correct, with --webserver_require_spnego Prometheus wont work.
Created: KUDU-3674 Implement the ability for endpoints to skip auth.

My plan is there to first create an option to be able to skip auth for 
Prometheus endpoints. Then we can look into adding supported auth mechanism for 
Prometheus.

Let me know what you think about this.


http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG@20
PS3, Line 20:    {
            :       "targets":[
            :          "127.0.0.1:8765"
            :       ],
            :       "labels":{
            :          "job":"Kudu",
            :          "group":"masters"
            :       }
            :    },
            :    {
            :       "targets":[
            :          "127.0.0.1:9871",
            :          "127.0.0.1:9875",
            :          "127.0.0.1:9873"
            :       ],
            :       "labels":{
            :          "job":"Kudu",
            :          "group":"tservers"
            :       }
            :    }
> What's the idea behind this type of grouping the entities in output in addi
The structure of the json is just a means for prometheus to setup targets and 
associated labels.
Regarding the targets, these two produce the equivalent results:

config 1:
[
  { "targets": ["127.0.0.1:8765"], "labels": { … } },
  { "targets": ["127.0.0.1:8767"], "labels": { … } },
  { "targets": ["127.0.0.1:8769"], "labels": { … } }
]

config 2:
[
  { "targets": ["127.0.0.1:8765","127.0.0.1:8767","127.0.0.1:8769"],
    "labels": { … } }
]

In my previous revision I used the structure of  "config 2" for masters.
Now in the latest revision the whole /prometheus-sd follows "config 1". The 
reason was to make it have somewhat unified structure and avoid confusion.

Regarding the labels:
Labels provide a way to filter metrics in Prometheus when looking at data. Eg.: 
in the latest revision i've added the "location" label to Tserver targets. Then 
in Prometheus I can do the following query to get 
kudu_tserver_rpc_incoming_queue_time for "/rack1":
kudu_tserver_rpc_incoming_queue_time{location="/rack1"}.
These label names can be arbitrary, Prometheus parses them and even provides 
nice autocomplete when querying. Maybe it's worth to add a blurb about labels 
we expose in followup patch/blogpost.

For the meta_ labels: SD plugins can expose specific labels prefixed with "__" 
(double underscores). These are by default hidden (eg don't come up in 
Prometheus GUI). With relabeling these can be tweaked to have labels matching 
users requirements. The relabeling config is done in the scrape config.

What is interesting for us is the "__scheme__" hidden label. By default 
Prometheus uses http for the targets given by the SD endpoint. For https the 
"__scheme__" needs to be set to https. (The current revision does this part.)


http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG@58
PS3, Line 58: http
> does this work with https?
Good question!
Prometheus can access https service discovery endpoint, and also the targets 
that are given by the discovery endpoint. For this end users need to add 
<tls_config> to the scrape config: 
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config

Good thing that you mentioned this, as I think this is useful for the 
Prometheus article as well, thanks!


http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG@67
PS3, Line 67: Promethes
> typo: Prometheus
removed this part. done.


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc
File src/kudu/master/master-test.cc:

http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc@3838
PS3, Line 3838:
> nit: remove the empty line?
Done


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc@3841
PS3, Line 3841: const char* const
> nit: this might be 'constexpr const char* const ...'
Done


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc@3869
PS3, Line 3869: Substitute("$0/prometheus-sd", addr)
> nit: why not to include the "/prometheus-sd" suffix into the URL generated
Done


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc
File src/kudu/master/master_path_handlers.cc:

http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@880
PS3, Line 880:     JsonError(s, output);
> Is it better to response an empty json? Can prometheus handles an error res
This is a very good catch! Thank you! Changed it to return status 200 and empty 
body []. This follows the SD reference [1]:

"The SD endpoint must answer with an HTTP 200 response, with the HTTP Header 
Content-Type: application/json. The answer must be UTF-8 formatted. If no 
targets should be transmitted, HTTP 200 must also be emitted, with an empty 
list []. Target lists are unordered."

[1] https://prometheus.io/docs/prometheus/latest/http_sd/


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@895
PS3, Line 895: const HostPortPB&
> nit: why not to use 'const auto&' here as well?
Done


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@911
PS3, Line 911:     if (!reg.http_addresses().empty()) {
> Should we log a warning if the list of HTTP addresses is empty for a tablet
Done


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@936
PS3, Line 936:   jw.String("Kudu");
> Should we include clusterID here as well to distinguish between different K
Done, moreover added location info.


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@938
PS3, Line 938: masters
> Just curious: is there any convention on the naming of the group and prefix
AFAIK there is no convention for the labels.


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@952
PS3, Line 952:   jw.String("Kudu");
> Ditto
Done


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/util/test_util.cc
File src/kudu/util/test_util.cc:

http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/util/test_util.cc@721
PS3, Line 721: prometheus-sd
> Should this be also added into the PeriodicWebUIChecker, so this end-point
Yep makes sense, done.



--
To view, visit http://gerrit.cloudera.org:8080/21723
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I931aa72a7567c0dde43d7b7ed53a2dd0fa8bc1fe
Gerrit-Change-Number: 21723
Gerrit-PatchSet: 3
Gerrit-Owner: Marton Greber <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber <[email protected]>
Gerrit-Reviewer: Wang Xixu <[email protected]>
Gerrit-Reviewer: Zoltan Chovan <[email protected]>
Gerrit-Reviewer: Zoltan Martonka <[email protected]>
Gerrit-Comment-Date: Fri, 11 Jul 2025 14:20:47 +0000
Gerrit-HasComments: Yes

Reply via email to