Ah, thank you! That's a huge help. My preference, of course, would be to
use documented calls, but I'm already down that rabbit hole calling
nsd_ds directly because the SNMP agent chokes and dies a horrible death
with 3.5k nodes and the number of NSDs we have.
On 7/7/16 10:16 PM, Sven Oehme wrote:
Hi,
this is an undocumented mmpmon call, so you are on your own, but here is
the correct description:
_n_
IP address of the node responding. This is the address by which GPFS
knows the node.
_nn_
The name by which GPFS knows the node.
_rc_
The reason/error code. In this case, the reply value is 0 (OK).
_t_
Current time of day in seconds (absolute seconds since Epoch (1970)).
_tu_
Microseconds part of the current time of day.
_cl_
The name of the cluster that owns the file system.
_fs_
The name of the file system for which data are being presented.
_d_
The number of disks in the file system.
_br_
Total number of bytes read from disk (not counting those read from cache).
_bw_
Total number of bytes written, to both disk and cache.
_c_
The total number of read operations supplied from cache.
_r_
The total number of read operations supplied from disk.
_w_
The total number of write operations, to both disk and cache.
_oc_
Count of open() call requests serviced by GPFS.
_cc_
Number of close() call requests serviced by GPFS.
_rdc_
Number of application read requests serviced by GPFS.
_wc_
Number of application write requests serviced by GPFS.
_dir_
Number of readdir() call requests serviced by GPFS.
_iu_
Number of inode updates to disk.
_irc_
Number of inode reads.
_idc_
Number of inode deletions.
_icc_
Number of inode creations.
_bc_
Number of bytes read from the cache.
_sch_
Number of stat cache hits.
_scm_
Number of stat cache misses.
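Given the `_key_ value` layout of these response lines, a quick sketch of a parser (just an illustration, not an official tool; the function name `parse_mmpmon` is made up here) could look like:

```python
def parse_mmpmon(line):
    """Parse an mmpmon response line of the form
    '_record_ _key_ value _key_ value ...' into (record, fields).

    Heuristic: any whitespace-separated token wrapped in underscores
    is treated as a key; the next token is its value. This matches
    the gfis output shown in this thread, but is not guaranteed for
    every mmpmon record type.
    """
    tokens = line.split()
    record = tokens[0].strip("_")  # e.g. 'mmpmon::gfis'
    fields = {}
    key = None
    for tok in tokens[1:]:
        if len(tok) > 2 and tok.startswith("_") and tok.endswith("_"):
            key = tok.strip("_")
        elif key is not None:
            fields[key] = tok
            key = None
    return record, fields
```

Counters such as `_br_` or `_sch_` come back as strings and would need an `int()` conversion before doing any arithmetic on them.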
On Thu, Jul 7, 2016 at 7:09 PM, Aaron Knister <[email protected]> wrote:
Does anyone know what the fields in the mmpmon gfis output indicate?
# socat /var/mmfs/mmpmon/mmpmonSocket -
_event_ newconnection _t_ 1467937547 _tu_ 372882 _n_ 10.101.11.1
_node_ local_node
mmpmon gfis
_response_ begin mmpmon gfis
_mmpmon::gfis_ _n_ 10.101.11.1 _nn_ lorej001 _rc_ 0 _t_ 1467937550
_tu_ 518265 _cl_ disguise-gpfs _fs_ thome _d_ 5 _br_ 0 _bw_ 0 _c_ 0
_r_ 0 _w_ 0 _oc_ 0 _cc_ 0 _rdc_ 0 _wc_ 0 _dir_ 0 _iu_ 0 _irc_
Here's my best guess:
_d_ number of disks in the filesystem
_br_ bytes read from disk
_bw_ bytes written to disk
_c_ cache ops
_r_ read ops
_w_ write ops
_oc_ open() calls
_cc_ close() calls
_rdc_ read() calls
_wc_ write() calls
_dir_ readdir calls
_iu_ inode update count
_irc_ inode read count
_idc_ inode delete count
_icc_ inode create count
_bc_ bytes read from cache
_sch_ stat cache hits
_scm_ stat cache misses
This is all because the mmpmon fs_io_s command doesn't give me any way
I can find to distinguish block/stat cache hits from cache misses,
which makes it harder to pinpoint misbehaving applications on the
system.
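If the field guesses above hold (`_c_` reads served from cache, `_r_` reads from disk, `_sch_`/`_scm_` stat cache hits/misses), the hit ratios fall out of simple arithmetic. A sketch, assuming the counters have already been parsed into a dict of strings:

```python
def cache_hit_ratios(fields):
    """Compute (block_read_hit_ratio, stat_cache_hit_ratio) from
    parsed gfis counters. Returns None for a ratio whose
    denominator is zero (no activity yet).

    Assumes: c = reads from cache, r = reads from disk,
    sch/scm = stat cache hits/misses -- per the descriptions
    in this thread, which are for an undocumented call.
    """
    c = int(fields.get("c", 0))
    r = int(fields.get("r", 0))
    sch = int(fields.get("sch", 0))
    scm = int(fields.get("scm", 0))
    read_ratio = c / (c + r) if (c + r) else None
    stat_ratio = sch / (sch + scm) if (sch + scm) else None
    return read_ratio, stat_ratio
```

Sampling these per node and flagging nodes whose ratios drop sharply would be one way to spot the misbehaving applications.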
-Aaron
--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776