Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/21723 )
Change subject: Add Prometheus HTTP service discovery ...................................................................... Patch Set 3: (11 comments) http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG@14 PS3, Line 14: servers. This returns the Kudu Master and TServers grouped, according BTW, how is it going to work when Kudu webserver end-points are secured using SPNEGO? It seems Prometheus doesn't support authentication via SPNEGO for its HTTP(S) SD point [1], so this isn't going to work in a kerberized Kudu cluster where --webserver_require_spnego is set. I guess the same is true for the /metrics_prometheus endpoints: I'm not sure it's possible to configure Prometheus to use GSSAPI/Kerberos authn when scrapping metrics. Maybe it makes sense to file a JIRA ticket to handle this issue. As an option, we might consider adding configuration flags to allow using different authn mechanisms for Prometheus endpoints even if other end-points are protected with SPNEGO. [1] https://prometheus.io/docs/prometheus/latest/http_sd/#writing-http-service-discovery http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG@20 PS3, Line 20: { : "targets":[ : "127.0.0.1:8765" : ], : "labels":{ : "job":"Kudu", : "group":"masters" : } : }, : { : "targets":[ : "127.0.0.1:9871", : "127.0.0.1:9875", : "127.0.0.1:9873" : ], : "labels":{ : "job":"Kudu", : "group":"tservers" : } : } What's the idea behind this type of grouping the entities in output in addition to grouping masters and tservers by type of a Kudu server? As I understand, an alternative would be adding a single entry for each of the server endpoint (master or tserver), and then being able to add labels per server: e.g., location, rack, and other per-server tags in the future if needed. Just curious what it would turn into in Prometheus and Graphana with different approaches and what each of them can give us. Also, I saw a mention about labels prefixed with 'meta_', so people can perform transformations in Prometheus by some standard means. I didn't use it or explore myself, but maybe you could shed more light on this if you played with more advanced topics of Prometheus+Graphana use cases. http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc File src/kudu/master/master-test.cc: http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc@3838 PS3, Line 3838: nit: remove the empty line? http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc@3841 PS3, Line 3841: const char* const nit: this might be 'constexpr const char* const ...' http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc@3869 PS3, Line 3869: Substitute("$0/prometheus-sd", addr) nit: why not to include the "/prometheus-sd" suffix into the URL generated in the line above? http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc File src/kudu/master/master_path_handlers.cc: http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@895 PS3, Line 895: const HostPortPB& nit: why not to use 'const auto&' here as well? http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@911 PS3, Line 911: if (!reg.http_addresses().empty()) { Should we log a warning if the list of HTTP addresses is empty for a tablet server? http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@936 PS3, Line 936: jw.String("Kudu"); Should we include clusterID here as well to distinguish between different Kudu clusters? Otherwise, how to distinguish between metrics collected from different Kudu clusters if the same Prometheus instance is used to collected metrics from them? http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@938 PS3, Line 938: masters Just curious: is there any convention on the naming of the group and prefix for the corresponding metrics? E.g., the current code adds 'kudu_master_' and 'kudu_tserver_' prefix to the master's and tserver's metrics correspondingly (see MetricEntity::WriteAsPrometheus). http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@952 PS3, Line 952: jw.String("Kudu"); Ditto http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/util/test_util.cc File src/kudu/util/test_util.cc: http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/util/test_util.cc@721 PS3, Line 721: prometheus-sd Should this be also added into the PeriodicWebUIChecker, so this end-point automatically checked by several other existing tests? -- To view, visit http://gerrit.cloudera.org:8080/21723 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I931aa72a7567c0dde43d7b7ed53a2dd0fa8bc1fe Gerrit-Change-Number: 21723 Gerrit-PatchSet: 3 Gerrit-Owner: Marton Greber <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Attila Bukor <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Marton Greber <[email protected]> Gerrit-Reviewer: Zoltan Chovan <[email protected]> Gerrit-Reviewer: Zoltan Martonka <[email protected]> Gerrit-Comment-Date: Tue, 27 Aug 2024 19:54:41 +0000 Gerrit-HasComments: Yes
