Hi,

let me start with a recommendation before I explain how the cluster state is built. Starting with 4.2.1, please use the mmhealth command instead of the mmces state/events commands. The mmces state/events commands will be deprecated in future releases.

mmhealth node show      -> shows the node state for all components (incl. CES)
mmhealth node show CES  -> shows the CES components only
mmhealth cluster show   -> shows the cluster state
Now to your problem:

The Spectrum Scale health monitoring is done by a daemon which runs on each cluster node. This daemon monitors the state of all Spectrum Scale components on the local system and, based on the resulting monitoring events, compiles a local system state (shown by mmhealth node show). By having decentralized monitoring we reduce the monitoring overhead and increase resiliency against network glitches.

In order to show a cluster-wide state view we have to consolidate the events from all cluster nodes on a single node. The health monitoring daemon running on the cluster manager takes on the role (CSM) of receiving events from all nodes through RPC calls and compiling the cluster state (shown by mmhealth cluster show).

There can be cases where the (async) event forwarding to the CSM is delayed or dropped because of network delays, high system load, cluster manager failover or split-brain cases. Those cases should resolve automatically after some time, when the event is resent.

Summary: the cluster state might be temporarily out of sync (eventually consistent); to get the current state you should refer to mmhealth node show. If the problem does not resolve automatically, restarting the monitoring daemon will force a re-sync. Please open a PMR for the 5.0 issue too if the problem persists.
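To illustrate how a node view and the cluster view can disagree for a while, here is a small toy model in Python. It is only a sketch of the idea described above and not the actual monitoring daemon: the class names, the `link_up` flag and the rule that the cluster is FAILED if any node reported FAILED are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for the per-node health monitor."""
    name: str
    state: str = "HEALTHY"
    pending: list = field(default_factory=list)  # events not yet delivered

    def set_state(self, new_state):
        # The local state changes immediately; the event is queued for the CSM.
        self.state = new_state
        self.pending.append(new_state)

    def forward(self, csm, link_up=True):
        # If delivery fails, the events stay queued so they can be resent later.
        if not link_up:
            return False
        for event in self.pending:
            csm.receive(self.name, event)
        self.pending.clear()
        return True


class ClusterStateManager:
    """Hypothetical stand-in for the CSM role on the cluster manager."""

    def __init__(self):
        self.view = {}  # last state received per node

    def receive(self, node_name, state):
        self.view[node_name] = state

    def cluster_state(self):
        # Assumed aggregation rule, for illustration only.
        return "FAILED" if "FAILED" in self.view.values() else "HEALTHY"


def simulate():
    csm = ClusterStateManager()
    node = Node("ces01")
    snapshots = []

    node.set_state("FAILED")
    node.forward(csm)                 # event arrives, both views agree
    snapshots.append((node.state, csm.cluster_state()))

    node.set_state("HEALTHY")
    node.forward(csm, link_up=False)  # event is dropped, views diverge
    snapshots.append((node.state, csm.cluster_state()))

    node.forward(csm)                 # resend, views converge again
    snapshots.append((node.state, csm.cluster_state()))
    return snapshots


if __name__ == "__main__":
    for node_state, cluster_state in simulate():
        print(f"node={node_state:8} cluster={cluster_state}")
```

The middle snapshot corresponds to the situation in the report: the node already shows HEALTHY while the consolidated view still shows FAILED, until the queued event is delivered.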
Kind regards

Mathias Dietz

Spectrum Scale Development - Release Lead Architect (4.2.x)
Spectrum Scale RAS Architect
---------------------------------------------------------------------------
IBM Deutschland
Am Weiher 24
65451 Kelsterbach
Phone: +49 70342744105
Mobile: +49-15152801035
E-Mail: [email protected]
-----------------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martina Koederitz, Management: Dirk Wittkopp
Registered office: Böblingen / Registration court: Amtsgericht Stuttgart, HRB 243294

From: "Ernst Heinz (ID SD)" <[email protected]>
To: "[email protected]" <[email protected]>
Date: 01/16/2018 06:09 PM
Subject: [gpfsug-discuss] GPFS GA 5.0.0.0: mmces commands with inconsistent output
Sent by: [email protected]

Hello to all peers and gurus

For about two weeks we have had GPFS GA 5.0.0.0 running in our test environment.
Today I've seen the following behavior on our Spectrum Scale test cluster, which slightly surprised me.
Checking the status of the cluster in different ways:

[root@testnas13ces01 idsd_erh_t1]# mmces state cluster
CLUSTER            AUTH    BLOCK     NETWORK  AUTH_OBJ  NFS     OBJ       SMB     CES
testnas13.ethz.ch  FAILED  DISABLED  HEALTHY  DISABLED  DEPEND  DISABLED  DEPEND  FAILED

[root@testnas13ces01 idsd_erh_t1]# mmces state show -a
NODE              AUTH     BLOCK     NETWORK  AUTH_OBJ  NFS      OBJ       SMB      CES
testnas13ces01-i  HEALTHY  DISABLED  HEALTHY  DISABLED  HEALTHY  DISABLED  HEALTHY  HEALTHY
testnas13ces02-i  HEALTHY  DISABLED  HEALTHY  DISABLED  HEALTHY  DISABLED  HEALTHY  HEALTHY

Does any of you have an explanation for this?
Has anyone else seen behavior like this?

By the way, we have a similar view on one of our clusters on GPFS 4.2.3.4 (open PMR: 30218.112.848).

I would be very grateful for any kind of response.

Kind regards
Heinz

===============================================================
Heinz Ernst
ID-Systemdienste
WEC C 16
Weinbergstrasse 11
CH-8092 Zurich
[email protected]
Phone: +41 44 633 84 48
Mobile: +41 79 216 15 50
===============================================================

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
