Hi, On Thu, 2019-11-07 at 04:08 -0500, Frank Ch. Eigler wrote: > This webapi extensions allows admins to hook up debuginfod to a > prometheus-compatible monitoring system for general situational > statistics. The metrics are simple enough that local curl requests > can give a user a sense of what's going on. The metrics are > documented as unstable with respect to future versions. > +.SS /metrics > + > +This endpoint returns a Prometheus formatted text/plain dump of a > +variety of statistics about the operation of the debuginfod server. > +The exact set of metrics and their meanings may change in future > +versions. Caution: configuration information (path names, versions) > +may be disclosed.
Could you also add a reference to the Prometheus Exposition format. I see it is already in a comment in the code. Best to also add it as See also in the docs. > .SH DATA MANAGEMENT > > debuginfod stores its index in an sqlite database in a densely > packed > @@ -291,7 +299,8 @@ a denial-of-service in terms of RAM, CPU, disk > I/O, or network I/O. > If this is a problem, users are advised to install debuginfod with a > HTTPS reverse-proxy front-end that enforces site policies for > firewalling, authentication, integrity, authorization, and load > -control. > +control. The \fI/metrics\fP webapi endpoint is probably not > +appropriate for disclosure to the public. So, should there be an option to turn it off? > When relaying queries to upstream debuginfods, debuginfod \fBdoes not\fP > include any particular security features. It trusts that the binaries > diff --git a/debuginfod/debuginfod.cxx b/debuginfod/debuginfod.cxx > index 7c7a0c5d7ef5..efe8c80fa081 100644 > --- a/debuginfod/debuginfod.cxx > +++ b/debuginfod/debuginfod.cxx > @@ -72,6 +72,7 @@ extern "C" { > #include <cstring> > #include <vector> > #include <set> > +#include <map> > #include <string> > #include <iostream> > #include <iomanip> > @@ -98,6 +99,14 @@ using namespace std; > #include <sys/syscall.h> > #endif > > +#ifdef __linux__ > +#define gettid() syscall(SYS_gettid) > +#else > +#define gettid() pthread_self() > +#endif You might want to rename this since newer glibc might expose gettid(). The rest of the code looks good as far as I can see. But I would suggest you add a command line option to disable the metrics, which would not install the metrics handler and make the metrics update functions noops. > diff --git a/tests/ChangeLog b/tests/ChangeLog > index 3d50ee8623ee..156a693f8886 100644 > --- a/tests/ChangeLog > +++ b/tests/ChangeLog > @@ -1,3 +1,8 @@ > +2019-11-07 Frank Ch. Eigler <f...@redhat.com> > + > + * run-debuginfod-find.sh: Test debuginfod metrics via curl. > + Fix federated testing, asserted by metrics. > + > 2019-11-06 Frank Ch. Eigler <f...@redhat.com> > > * run-debuginfod-find.sh: Test debuginfod -L mode. Drop > diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh > index 644901073d75..4c3e3cb306c2 100755 > --- a/tests/run-debuginfod-find.sh > +++ b/tests/run-debuginfod-find.sh > @@ -181,7 +181,8 @@ sleep 3 > > # have clients contact the new server > export DEBUGINFOD_URLS=http://localhost:$PORT2 > -testrun ${abs_builddir}/debuginfod_build_id_find -e F/prog 1 > +rm -rf $DEBUGINFOD_CACHE_PATH > +testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID > > # confirm that first server can't resolve symlinked info in L/ but second can > BUILDID=`env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../src/readelf \ > @@ -202,6 +203,16 @@ export DEBUGINFOD_URLS="BAD http://localhost:$PORT1 > localhost:$PORT1 http://loca > > testrun ${abs_builddir}/debuginfod_build_id_find -e F/prog2 1 > > +######################################################################## > + > +# Fetch some metrics, if curl program is installed > +if which curl 2>/dev/null; then > + curl http://localhost:$PORT1/badapi > + curl http://localhost:$PORT1/metrics > + curl http://localhost:$PORT2/metrics > + curl http://localhost:$PORT1/metrics | grep -q > 'http_responses_total.*result.*error' > + curl http://localhost:$PORT2/metrics | grep -q > 'http_responses_total.*result.*upstream' > +fi > > ######################################################################## I think it is better to check with: if type curl >/dev/null 2>&1; then Which avoid executing which, which might not be installed... Cheers, Mark