I don’t run the GUI in production, so I can’t comment on those issues specifically. I have been running a federated collector cluster for some time and it’s been working as expected. I’ve been using the Zimon-Grafana bridge code to look at GPFS performance stats.
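In case it helps with the collector question quoted below: federation isn't active/passive or shared storage; each collector simply lists its peers in its own config, and a query against any one of them returns the merged data. Roughly, the relevant bit lives in /opt/IBM/zimon/ZIMonCollector.cfg (hostnames here are placeholders, and the exact syntax should be checked against the docs for your release):

    # /opt/IBM/zimon/ZIMonCollector.cfg (excerpt) - federate two collectors
    peers = {
        host = "collector1.example.com"
        port = "9085"
    }, {
        host = "collector2.example.com"
        port = "9085"
    }

With something like that in place, the GUI or the Grafana bridge should only need to be pointed at one of the collectors.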
The other part of this is the mmhealth/mmsysmonitor process that reports events. It's been problematic for me, especially in larger clusters (400+ nodes). The mmsysmonitor process overloads the master node (the cluster manager) with too many "heartbeats" and ends up causing lots of issues and log messages. Evidently this is something IBM is aware of (at the 4.2.2-2 level) and they have fixes coming in 4.2.3 PTF1. I ended up disabling the cluster-wide collection of health stats to prevent the cluster manager issues (a rough sketch of the relevant commands follows the quoted message below). However, be aware that CES depends on the mmhealth data, so tinkering with the config may cause other issues if you use CES.

Bob Oesterlin
Sr Principal Storage Engineer, Nuance

From: <[email protected]> on behalf of "David D. Johnson" <[email protected]>
Reply-To: gpfsug main discussion list <[email protected]>
Date: Wednesday, May 17, 2017 at 6:58 AM
To: gpfsug main discussion list <[email protected]>
Subject: [EXTERNAL] Re: [gpfsug-discuss] GPFS GUI

So how are multiple collectors supposed to work? Active/Passive? Failover pairs? Shared storage? Better not be on GPFS… Maybe there is a place in the GUI config to tell it to keep track of multiple collectors, but I gave up looking and turned off the second collector service and removed it from the candidates.
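As a follow-up to the mmhealth note above, a rough sketch of where to start looking. The exact knob for turning off only the cluster-wide aggregation differs by release, so verify against your level before changing anything:

    # Overall health state as aggregated on the cluster manager
    mmhealth cluster show

    # Per-node view and recent events on the node you're logged into
    mmhealth node show
    mmhealth node eventlog

    # Stop/start the system health monitor on a node (e.g. while testing)
    mmsysmoncontrol stop
    mmsysmoncontrol start

Keep in mind the CES caveat above: CES relies on this monitoring data, so don't leave the monitor stopped on CES nodes.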
