Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21723 )

Change subject: Add Prometheus HTTP service discovery
......................................................................


Patch Set 3:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG@14
PS3, Line 14: servers. This returns the Kudu Master and TServers grouped, 
according
BTW, how is it going to work when Kudu webserver end-points are secured using 
SPNEGO?  It seems Prometheus doesn't support authentication via SPNEGO for its 
HTTP(S) SD point [1], so this isn't going to work in a kerberized Kudu cluster 
where --webserver_require_spnego is set.

I guess the same is true for the /metrics_prometheus endpoints: I'm not sure 
it's possible to configure Prometheus to use GSSAPI/Kerberos authn when 
scrapping metrics.

Maybe it makes sense to file a JIRA ticket to handle this issue.  As an option, 
we might consider adding configuration flags to allow using different authn 
mechanisms for Prometheus endpoints even if other end-points are protected with 
SPNEGO.

[1] 
https://prometheus.io/docs/prometheus/latest/http_sd/#writing-http-service-discovery


http://gerrit.cloudera.org:8080/#/c/21723/3//COMMIT_MSG@20
PS3, Line 20:    {
            :       "targets":[
            :          "127.0.0.1:8765"
            :       ],
            :       "labels":{
            :          "job":"Kudu",
            :          "group":"masters"
            :       }
            :    },
            :    {
            :       "targets":[
            :          "127.0.0.1:9871",
            :          "127.0.0.1:9875",
            :          "127.0.0.1:9873"
            :       ],
            :       "labels":{
            :          "job":"Kudu",
            :          "group":"tservers"
            :       }
            :    }
What's the idea behind this type of grouping the entities in output in addition 
to grouping masters and tservers by type of a Kudu server?

As I understand, an alternative would be adding a single entry for each of the 
server endpoint (master or tserver), and then being able to add labels per 
server: e.g., location, rack, and other per-server tags in the future if needed.

Just curious what it would turn into in Prometheus and Graphana with different 
approaches and what each of them can give us.

Also, I saw a mention about labels prefixed with 'meta_', so people can perform 
transformations in Prometheus by some standard means.  I didn't use it or 
explore myself, but maybe you could shed more light on this if you played with 
more advanced topics of Prometheus+Graphana use cases.


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc
File src/kudu/master/master-test.cc:

http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc@3838
PS3, Line 3838:
nit: remove the empty line?


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc@3841
PS3, Line 3841: const char* const
nit: this might be 'constexpr const char* const ...'


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master-test.cc@3869
PS3, Line 3869: Substitute("$0/prometheus-sd", addr)
nit: why not to include the "/prometheus-sd" suffix into the URL generated in 
the line above?


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc
File src/kudu/master/master_path_handlers.cc:

http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@895
PS3, Line 895: const HostPortPB&
nit: why not to use 'const auto&' here as well?


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@911
PS3, Line 911:     if (!reg.http_addresses().empty()) {
Should we log a warning if the list of HTTP addresses is empty for a tablet 
server?


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@936
PS3, Line 936:   jw.String("Kudu");
Should we include clusterID here as well to distinguish between different Kudu 
clusters?  Otherwise, how to distinguish between metrics collected from 
different Kudu clusters if the same Prometheus instance is used to collected 
metrics from them?


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@938
PS3, Line 938: masters
Just curious: is there any convention on the naming of the group and prefix for 
the corresponding metrics?  E.g., the current code adds 'kudu_master_' and 
'kudu_tserver_' prefix to the master's and tserver's metrics correspondingly 
(see  MetricEntity::WriteAsPrometheus).


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/master/master_path_handlers.cc@952
PS3, Line 952:   jw.String("Kudu");
Ditto


http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/util/test_util.cc
File src/kudu/util/test_util.cc:

http://gerrit.cloudera.org:8080/#/c/21723/3/src/kudu/util/test_util.cc@721
PS3, Line 721: prometheus-sd
Should this be also added into the PeriodicWebUIChecker, so this end-point 
automatically checked by several other existing tests?



--
To view, visit http://gerrit.cloudera.org:8080/21723
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I931aa72a7567c0dde43d7b7ed53a2dd0fa8bc1fe
Gerrit-Change-Number: 21723
Gerrit-PatchSet: 3
Gerrit-Owner: Marton Greber <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber <[email protected]>
Gerrit-Reviewer: Zoltan Chovan <[email protected]>
Gerrit-Reviewer: Zoltan Martonka <[email protected]>
Gerrit-Comment-Date: Tue, 27 Aug 2024 19:54:41 +0000
Gerrit-HasComments: Yes

Reply via email to