Chaskiel,

> On Mar 29, 2021, at 9:11 AM, Chaskiel Grundman <[email protected]> wrote:
> 
> Does anyone have tools for capturing, aggregating, analyzing or displaying 
> info from xstats? Not so much operation timing, though that may also be 
> helpful.

There are some patches under review on gerrit to make OpenAFS metrics more 
machine-readable:

https://gerrit.openafs.org/#/c/14359/   xstat: Add the xstat_fs_test -format option
https://gerrit.openafs.org/#/c/14358/   rxdebug: Add rxdebug -raw option

I am using these patches to collect OpenAFS cell metrics into collectd and
display charts with Graphite.
collectd works fine at the scale I need, although it does seem to drop an
observation once in a while.
It can also handle aggregation automatically if you configure it to do so.
Graphite works fine too, but it is a little long in the tooth; if I were doing
this today I would try Grafana or one of the other alternatives.

Other sites have had success using Splunk to display their charts.
I've used Splunk to troubleshoot OpenAFS performance problems and it was 
extremely useful.  But I have not set it up myself, so I can't give any
guidance on how difficult it is to set up compared to graphite or grafana.

I am currently pulling my stats remotely by running all my xstat and rxdebug
scripts from a central collector machine, but this is not ideal.
If you are doing this at any kind of scale, I would recommend collecting the
stats locally (e.g. xstat_fs_test localhost) on each OpenAFS machine - either
via a cron job or a bos bnode script - and then pushing the stats to your
collector.
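
As a rough illustration of the "collect locally, then push" approach, here is
a minimal Python sketch that sends already-parsed values to a Graphite (carbon)
listener using its plaintext protocol, which takes one "path value timestamp"
line per metric. The "openafs.fs1" prefix and the metric names are made up for
illustration, and parsing the xstat_fs_test output is deliberately left out,
since that depends on the output format you end up using.

```python
import socket
import time


def graphite_lines(prefix, metrics, timestamp=None):
    """Format metrics as Graphite plaintext lines: '<path> <value> <timestamp>'."""
    ts = int(time.time() if timestamp is None else timestamp)
    # Sort by name so the output order is stable from run to run.
    return [f"{prefix}.{name} {value} {ts}" for name, value in sorted(metrics.items())]


def push_to_graphite(lines, host, port=2003, timeout=5.0):
    """Send the lines to a carbon plaintext listener (2003 is carbon's default port)."""
    payload = "".join(line + "\n" for line in lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


# Hypothetical usage, run locally on each fileserver from cron or a bnode script:
#   lines = graphite_lines("openafs.fs1", {"calls_fetchdata": 1234})
#   push_to_graphite(lines, "collector.example.com")
```

If you feed collectd rather than Graphite directly, the same idea applies, but
the wire format is different.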

Be aware that xstat_fs_test uses a new ephemeral port on each invocation;
this in turn results in a new peer on the fileserver.  It's not horrible
overhead, even if you are collecting every 60 seconds; but if you are
concerned about it, you can reduce the impact by letting xstat_fs_test run
continuously.  That is, instead of invoking it periodically with the -onceonly
option, invoke it once and specify a -frequency and a -period.  That way it
will reuse the same ephemeral port over and over, creating only a single peer
on the fileserver.
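
To make the difference between the two invocation styles concrete, here is a
small Python sketch that builds both command lines and reads the output of a
long-running instance line by line. The helper names are my own, the
collection ID is only a placeholder, and the units noted in the comments
(-frequency in seconds, -period in minutes) are from my reading of the
xstat_fs_test man page, so please check them against your version.

```python
import subprocess


def xstat_argv(fileserver, collection_ids, frequency=None, period=None):
    """Build an xstat_fs_test command line.

    With neither frequency nor period given, add -onceonly (probe once, exit).
    Otherwise run continuously: -frequency in seconds, -period in minutes.
    """
    argv = ["xstat_fs_test", "-fsname", fileserver, "-collID"]
    argv += [str(c) for c in collection_ids]
    if frequency is None and period is None:
        argv.append("-onceonly")
    else:
        if frequency is not None:
            argv += ["-frequency", str(frequency)]
        if period is not None:
            argv += ["-period", str(period)]
    return argv


def stream_output(argv):
    """Start the command once and yield its output lines as they arrive."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")


# One new ephemeral port (and fileserver peer) per run:
#   xstat_argv("localhost", [2])
# One port reused for a day of 60-second probes:
#   xstat_argv("localhost", [2], frequency=60, period=1440)
```

One caveat with the long-running form: a program writing to a pipe may buffer
its output, so lines could arrive in bursts rather than once per probe.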

The rx_peer issue is moot for the rxdebug command; although it also uses
a new ephemeral port for each invocation, the rxdebug packets are handled by
the rx stack directly, without requiring an rx_call, rx_connection, or rx_peer.

Regards,
--
Mark Vitale
[email protected]



_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info
