Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21723 )

Change subject: Add Prometheus HTTP service discovery
......................................................................


Patch Set 3:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG@14
PS3, Line 14: servers. This returns the Kudu Master and TServers grouped, 
according
> As a second thought if customers are running Knox I think they can do some
Yes, I think Knox might be a good option to explore in this context.

Also, it's possible to enable TLS client authentication in 'tls_config' [3] and 
basic HTTP authn there as well [1] in the 'http_config' sub-section of the 
'scape_config' section [2].  However, it necessary to implement the 
customization of authn per end-point in the squeasel webserver, IIUC.

[1] https://prometheus.io/docs/prometheus/latest/configuration/https/
[2] 
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
[3] 
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc
File src/kudu/master/master-test.cc:

http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc@3883
PS3, Line 3883: } // namespace master
Does it make sense to add a test to ensure the reported list of targets is 
empty when master hasn't yet initialized?  That's just to make sure we know 
it's HTTP 200 and an empty  array, not HTTP error code from the server.


http://gerrit.cloudera.org:8080/#/c/21723/8/src/kudu/master/master-test.cc
File src/kudu/master/master-test.cc:

http://gerrit.cloudera.org:8080/#/c/21723/8/src/kudu/master/master-test.cc@4003
PS8, Line 4003:
nit: put these into separate lines, otherwise clang-tidy isn't happy


http://gerrit.cloudera.org:8080/#/c/21723/8/src/kudu/master/master_path_handlers.h
File src/kudu/master/master_path_handlers.h:

http://gerrit.cloudera.org:8080/#/c/21723/8/src/kudu/master/master_path_handlers.h@70
PS8, Line 70: r addresses su
nit: wrong indent?


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc
File src/kudu/master/master_path_handlers.cc:

http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@936
PS3, Line 936:   jw.String("Kudu");
> Done, moreover added location info.
Great.  Now, should different clusters be scraped by different Prometheus jobs? 
 Imagine 2 Kudu clusters served by the same Prometheus instance: one Kudu 
cluster have 5 nodes, and another Kudu cluster has 250 nodes.


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@1010
PS3, Line 1010:   server->RegisterJsonPathHandler(
              :       "/prometheus-sd", "Prometheus SD",
              :       [this](const Webserver::WebRequest& req, 
Webserver::PrerenderedWebResponse* resp) {
              :         this->HandlePrometheusServiceDiscovery(req, resp);
              :       },
              :       false /*is_on_nav_bar*/);
Probably, that's a question for a separate changelist, but should this endpoint 
be kept open when SPENGO authn is used to secure the embedded webserver?  From 
reading the Prometheus docs [1], I'm not quite sure whether the SD URL should 
be always kept open, or it's possible to require one of the supported authn 
options for the SD endpoint?


```The URL to the HTTP SD is not considered secret. The authentication and any 
API keys should be passed with the appropriate authentication mechanisms. 
Prometheus supports TLS authentication, basic authentication, OAuth2, and 
authorization headers.```

[1] https://prometheus.io/docs/prometheus/latest/http_sd/


http://gerrit.cloudera.org:8080/#/c/21723/8/src/kudu/master/master_path_handlers.cc
File src/kudu/master/master_path_handlers.cc:

http://gerrit.cloudera.org:8080/#/c/21723/8/src/kudu/master/master_path_handlers.cc@889
PS8, Line 889:     JsonError(s, output);
nit: should any of attributes mot be empty?  If any, consider adding 
corresponding DCHECK() for those


http://gerrit.cloudera.org:8080/#/c/21723/8/src/kudu/master/master_path_handlers.cc@917
PS8, Line 917:   // Create Prometheus service discovery JSON.
Some tablet servers might considered down after they have registered with 
masters (i.e., haven't heard from them for some time).  Should that be somehow 
reflected in the scrape entries for Prometheus?



--
To view, visit http://gerrit.cloudera.org:8080/21723
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I931aa72a7567c0dde43d7b7ed53a2dd0fa8bc1fe
Gerrit-Change-Number: 21723
Gerrit-PatchSet: 3
Gerrit-Owner: Marton Greber <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Gabriella Lotz <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber <[email protected]>
Gerrit-Reviewer: Wang Xixu <[email protected]>
Gerrit-Reviewer: Zoltan Chovan <[email protected]>
Gerrit-Reviewer: Zoltan Martonka <[email protected]>
Gerrit-Comment-Date: Tue, 15 Jul 2025 07:42:05 +0000
Gerrit-HasComments: Yes

Reply via email to