Marton Greber has posted comments on this change. ( http://gerrit.cloudera.org:8080/21723 )
Change subject: Add Prometheus HTTP service discovery ...................................................................... Patch Set 3: (14 comments) http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG@14 PS3, Line 14: servers. This returns the Kudu Master and TServers grouped, according > BTW, how is it going to work when Kudu webserver end-points are secured usi Yes you are correct, with --webserver_require_spnego Prometheus wont work. Created: KUDU-3674 Implement the ability for endpoints to skip auth. My plan is there to first create an option to be able to skip auth for Prometheus endpoints. Then we can look into adding supported auth mechanism for Prometheus. Let me know what you think about this. http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG@20 PS3, Line 20: { : "targets":[ : "127.0.0.1:8765" : ], : "labels":{ : "job":"Kudu", : "group":"masters" : } : }, : { : "targets":[ : "127.0.0.1:9871", : "127.0.0.1:9875", : "127.0.0.1:9873" : ], : "labels":{ : "job":"Kudu", : "group":"tservers" : } : } > What's the idea behind this type of grouping the entities in output in addi The structure of the json is just a means for prometheus to setup targets and associated labels. Regarding the targets, these two produce the equivalent results: config 1: [ { "targets": ["127.0.0.1:8765"], "labels": { … } }, { "targets": ["127.0.0.1:8767"], "labels": { … } }, { "targets": ["127.0.0.1:8769"], "labels": { … } } ] config 2: [ { "targets": ["127.0.0.1:8765","127.0.0.1:8767","127.0.0.1:8769"], "labels": { … } } ] In my previous revision I used the structure of "config 2" for masters. Now in the latest revision the whole /prometheus-sd follows "config 1". The reason was to make it have somewhat unified structure and avoid confusion. Regarding the labels: Labels provide a way to filter metrics in Prometheus when looking at data. Eg.: in the latest revision i've added the "location" label to Tserver targets. Then in Prometheus I can do the following query to get kudu_tserver_rpc_incoming_queue_time for "/rack1": kudu_tserver_rpc_incoming_queue_time{location="/rack1"}. These label names can be arbitrary, Prometheus parses them and even provides nice autocomplete when querying. Maybe it's worth to add a blurb about labels we expose in followup patch/blogpost. For the meta_ labels: SD plugins can expose specific labels prefixed with "__" (double underscores). These are by default hidden (eg don't come up in Prometheus GUI). With relabeling these can be tweaked to have labels matching users requirements. The relabeling config is done in the scrape config. What is interesting for us is the "__scheme__" hidden label. By default Prometheus uses http for the targets given by the SD endpoint. For https the "__scheme__" needs to be set to https. (The current revision does this part.) http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG@58 PS3, Line 58: http > does this work with https? Good question! Prometheus can access https service discovery endpoint, and also the targets that are given by the discovery endpoint. For this end users need to add <tls_config> to the scrape config: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config Good thing that you mentioned this, as I think this is useful for the Prometheus article as well, thanks! http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG@67 PS3, Line 67: Promethes > typo: Prometheus removed this part. done. http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc File src/kudu/master/master-test.cc: http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc@3838 PS3, Line 3838: > nit: remove the empty line? Done http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc@3841 PS3, Line 3841: const char* const > nit: this might be 'constexpr const char* const ...' Done http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc@3869 PS3, Line 3869: Substitute("$0/prometheus-sd", addr) > nit: why not to include the "/prometheus-sd" suffix into the URL generated Done http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc File src/kudu/master/master_path_handlers.cc: http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@880 PS3, Line 880: JsonError(s, output); > Is it better to response an empty json? Can prometheus handles an error res This is a very good catch! Thank you! Changed it to return status 200 and empty body []. This follows the SD reference [1]: "The SD endpoint must answer with an HTTP 200 response, with the HTTP Header Content-Type: application/json. The answer must be UTF-8 formatted. If no targets should be transmitted, HTTP 200 must also be emitted, with an empty list []. Target lists are unordered." [1] https://prometheus.io/docs/prometheus/latest/http_sd/ http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@895 PS3, Line 895: const HostPortPB& > nit: why not to use 'const auto&' here as well? Done http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@911 PS3, Line 911: if (!reg.http_addresses().empty()) { > Should we log a warning if the list of HTTP addresses is empty for a tablet Done http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@936 PS3, Line 936: jw.String("Kudu"); > Should we include clusterID here as well to distinguish between different K Done, moreover added location info. http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@938 PS3, Line 938: masters > Just curious: is there any convention on the naming of the group and prefix AFAIK there is no convention for the labels. http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@952 PS3, Line 952: jw.String("Kudu"); > Ditto Done http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/util/test_util.cc File src/kudu/util/test_util.cc: http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/util/test_util.cc@721 PS3, Line 721: prometheus-sd > Should this be also added into the PeriodicWebUIChecker, so this end-point Yep makes sense, done. -- To view, visit http://gerrit.cloudera.org:8080/21723 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I931aa72a7567c0dde43d7b7ed53a2dd0fa8bc1fe Gerrit-Change-Number: 21723 Gerrit-PatchSet: 3 Gerrit-Owner: Marton Greber <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Attila Bukor <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Marton Greber <[email protected]> Gerrit-Reviewer: Wang Xixu <[email protected]> Gerrit-Reviewer: Zoltan Chovan <[email protected]> Gerrit-Reviewer: Zoltan Martonka <[email protected]> Gerrit-Comment-Date: Fri, 11 Jul 2025 14:20:47 +0000 Gerrit-HasComments: Yes
