Re: [Gluster-devel] Feature: FOP Statistics JSON Dumps

Richard Wareing Tue, 22 Sep 2015 14:55:40 -0700

Hey Ben,

So the UI for it is simply to read it from /var/lib/glusterd/stats.  For 
example for gNFSd you can simply do this:


cat /var/lib/glusterd/stats/glusterfs_nfsd.dump


To see the output.  The reason we favor this "procfs" style interface is that:

1. There are 0 dependencies on CLIs which can hang.
2. All dumps are independent of one another: if gNFSd on that host is having 
issues, this should not prevent glusterfsd from sending us stats.
3. The output can be sent to an analytics/alarm engine of your choice.  Or 
simply run grep w/ "watch" in a loop to watch "live" when doing debugging.

Since we have this feature, we never actually use "profile" at all: there's 
really no need since you have the data 24x7 on 5 second intervals. You only 
need to enable diagnostics.latency-measurement and diagnostics.count-fop-hits, 
and set diagnostics.ios-dump-interval to a non-zero value, and the data will 
land in /var/lib/glusterd/stats/<daemon>.dump .
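
As a rough sketch of the setup step, the three options can be turned into "gluster volume set" invocations. The option names come from this mail; the volume name "myvol", the helper name, and the assumption that the interval is given in seconds are illustrative only.

```python
def ios_dump_commands(volume, interval_sec=5):
    """Build `gluster volume set` argument lists for the three options.

    Assumption: diagnostics.ios-dump-interval takes a value in seconds.
    Nothing is executed here; the caller decides how to run the commands.
    """
    if interval_sec <= 0:
        raise ValueError("interval must be non-zero for dumps to be written")
    options = [
        ("diagnostics.latency-measurement", "on"),
        ("diagnostics.count-fop-hits", "on"),
        ("diagnostics.ios-dump-interval", str(interval_sec)),
    ]
    return [["gluster", "volume", "set", volume, key, value]
            for key, value in options]


# "myvol" is a hypothetical volume name.
for argv in ios_dump_commands("myvol"):
    print(" ".join(argv))
```

The printed lines can be pasted into a shell on a node where the gluster CLI is available.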

Bug is updated w/ example output, but here's a teaser:


{
*SNIP*
"gluster.nfsd.inter.fop.removexattr.latency_ave_usec": "0.00",
"gluster.nfsd.inter.fop.removexattr.latency_min_usec": "0.00",
"gluster.nfsd.inter.fop.removexattr.latency_max_usec": "0.00",
"gluster.nfsd.inter.fop.opendir.per_sec": "2.60",
"gluster.nfsd.inter.fop.opendir.latency_ave_usec": "1658.92",
"gluster.nfsd.inter.fop.opendir.latency_min_usec": "715.00",
"gluster.nfsd.inter.fop.opendir.latency_max_usec": "7179.00",
"gluster.nfsd.inter.fop.fsyncdir.per_sec": "0.00",
"gluster.nfsd.inter.fop.fsyncdir.latency_ave_usec": "0.00",
"gluster.nfsd.inter.fop.fsyncdir.latency_min_usec": "0.00",
"gluster.nfsd.inter.fop.fsyncdir.latency_max_usec": "0.00",
"gluster.nfsd.inter.fop.access.per_sec": "43.19",
"gluster.nfsd.inter.fop.access.latency_ave_usec": "323.51",
"gluster.nfsd.inter.fop.access.latency_min_usec": "144.00",
"gluster.nfsd.inter.fop.access.latency_max_usec": "6639.00",
"gluster.nfsd.inter.fop.create.per_sec": "0.00",
*SNIP*
}
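
For feeding this into an analytics or alarm engine, the flat keys above can be regrouped per FOP. The following is only a sketch: it assumes the complete dump is valid JSON with string values, as in the excerpt, and the function name and sample text are made up for illustration.

```python
import json
from collections import defaultdict


def group_fop_stats(dump_text):
    """Regroup flat '<prefix>.fop.<name>.<metric>' keys as {fop: {metric: float}}."""
    grouped = defaultdict(dict)
    for key, value in json.loads(dump_text).items():
        _prefix, sep, rest = key.partition(".fop.")
        if not sep:
            continue  # not a per-FOP counter
        fop, _, metric = rest.partition(".")
        if not metric:
            continue
        try:
            grouped[fop][metric] = float(value)
        except ValueError:
            continue  # skip anything that is not numeric
    return dict(grouped)


# A few lines taken from the excerpt above, wrapped as a complete JSON object.
SAMPLE = """{
"gluster.nfsd.inter.fop.opendir.per_sec": "2.60",
"gluster.nfsd.inter.fop.opendir.latency_ave_usec": "1658.92",
"gluster.nfsd.inter.fop.access.per_sec": "43.19",
"gluster.nfsd.inter.fop.access.latency_ave_usec": "323.51"
}"""

stats = group_fop_stats(SAMPLE)
for fop, metrics in sorted(stats.items()):
    print(fop, metrics)
```

In real use the text would be read from /var/lib/glusterd/stats/glusterfs_nfsd.dump instead of SAMPLE.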

There are also aggregate counters, which track from process birth to death, 
and these are exported as well.

Richard


________________________________________
From: Ben England [[email protected]]
Sent: Tuesday, September 22, 2015 11:04 AM
To: Richard Wareing
Cc: [email protected]
Subject: Re: [Gluster-devel] Feature: FOP Statistics JSON Dumps

Richard, what's great about your patch (besides lockless counters) is:

- JSON easier to parse (particularly in python).  Compare to parsing "gluster 
volume profile" output, which is much more difficult.  This will enable tools 
to display profiling data in a user-friendly way.  Would be nice if you 
attached a sample output to the bz 1261700.

- client side capture - io-stats translator is at the top of the translator 
stack so we would see latencies just like the application sees them.  "gluster 
volume profile" provides server-side latencies but this can be deceptive and 
fails to report "user experience" latencies.

I'm not that clear on the UI for it, would be nice if "gluster volume " command 
could be set up to automatically poll this data at a fixed rate like many other 
perf utilities (example: iostat), so that user could capture a Gluster profile 
over time with a single command; at present the support team has to give them a 
script to do it.  This would make it trivial for a user to share what their 
application is doing from a Gluster perspective, as well as how Gluster is 
performing from the client's perspective.    /usr/sbin/gluster utility can run 
on the client now since it is in gluster-cli RPM right?

So in other words it would be great to replace this:

gluster volume profile $volume_name start
gluster volume profile $volume_name info > /tmp/past
for min in `seq 1 $sample_count` ; do
  sleep $sample_interval
  gluster volume profile $volume_name info
done > gvp.log
gluster volume profile $volume_name stop

With this:

gluster volume profile $volume_name $sample_interval $sample_count > gvp.log

And be able to run this command on the client to use your patch there.
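
Until such a command exists, the same fixed-rate sampling can be approximated by reading a dump file directly. This is a sketch only: it assumes the dump file described elsewhere in this thread is complete JSON when read, and the function name and parameters are hypothetical.

```python
import json
import time


def sample_dump(path, interval_sec, count, sleep=time.sleep):
    """Read the dump file `count` times, pausing `interval_sec` between reads."""
    samples = []
    for i in range(count):
        if i:
            sleep(interval_sec)
        try:
            with open(path) as fh:
                samples.append(json.load(fh))
        except ValueError:
            # Assumption: a read may race with a rewrite of the file;
            # an unparsable sample is skipped rather than aborting the run.
            continue
    return samples


# Example (path as given in the thread for gNFSd):
#   samples = sample_dump("/var/lib/glusterd/stats/glusterfs_nfsd.dump", 5, 12)
```

The `sleep` parameter exists only so the loop can be exercised without waiting.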

thx

-ben

----- Original Message -----
> From: "Richard Wareing" <[email protected]>
> To: [email protected]
> Sent: Wednesday, September 9, 2015 10:24:54 PM
> Subject: [Gluster-devel] Feature: FOP Statistics JSON Dumps
>
> Hey all,
>
> I just uploaded a clean patch for our FOP statistics dump feature @
> https://bugzilla.redhat.com/show_bug.cgi?id=1261700 .
>
> Patches cleanly to v3.6.x/v3.7.x release branches, also includes io-stats
> support for intel arch atomic operations (ifdef'd for portability) such that
> you can collect data 24x7 with a negligible latency hit in the IO path.
> We've been using this for quite some time and there appeared to have been
> some interest at the dev summit to have this in mainline; so here it is.
>
> Take a look, and I hope you find it useful.
>
> Richard
_______________________________________________
Gluster-devel mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-devel
