A GPFS cluster shouldn't rely on DNS for internal communication, management, 
or data traffic.

As a starting point, make sure the IPs and names of all manager/quorum nodes 
and clients have *unique* entries in the hosts files of all other nodes in the 
cluster, matching the names under which they were joined and licensed in the 
first place. If you issue 'mmlscluster' on the cluster manager, the server and 
client entries it lists should be used to build the common hosts file for all 
nodes involved.
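
As a rough illustration of the step above, here is a minimal Python sketch that 
turns saved 'mmlscluster' output into hosts-file lines and refuses duplicate 
IPs or names. The column layout it assumes (node number, daemon node name, IP 
address, admin node name, optional designation), the sample text, and the 
function names are my own assumptions for illustration, so check them against 
the real output of your cluster. It only uses the daemon node name, since the 
IP column belongs to that name.

```python
import ipaddress

# Hypothetical sample of the node table; real output has more header lines.
SAMPLE_OUTPUT = """
 Node  Daemon node name      IP address  Admin node name       Designation
---------------------------------------------------------------------------
   1   nsd01.example.org     10.0.0.1    nsd01.example.org     quorum-manager
   2   client01.example.org  10.0.0.21   client01.example.org
"""


def parse_mmlscluster(text):
    """Return (ip, daemon_node_name) pairs found in the node table."""
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed row layout: number, daemon name, IP, admin name, [designation]
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        try:
            ip = ipaddress.ip_address(fields[2])
        except ValueError:
            continue  # third column is not an IP, so this is not a node row
        nodes.append((str(ip), fields[1]))
    return nodes


def build_hosts_lines(nodes):
    """Build hosts-file lines, raising ValueError on a duplicate IP or name."""
    seen_ips, seen_names = set(), set()
    lines = []
    for ip, fqdn in nodes:
        short = fqdn.split(".")[0]
        names = [fqdn] if short == fqdn else [fqdn, short]
        if ip in seen_ips:
            raise ValueError(f"duplicate IP address: {ip}")
        for name in names:
            if name in seen_names:
                raise ValueError(f"duplicate host name: {name}")
        seen_ips.add(ip)
        seen_names.update(names)
        lines.append(f"{ip}\t{' '.join(names)}")
    return lines


if __name__ == "__main__":
    print("\n".join(build_hosts_lines(parse_mmlscluster(SAMPLE_OUTPUT))))
```

The resulting lines would still need to be reviewed and merged by hand with 
the localhost entries before being distributed to every node.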

Also, all nodes should share a common NTP configuration, pointing to the same 
*internal* NTP server, reachable via a name/IP that is also in the hosts file.

And obviously, you need a stable network, Ethernet or InfiniBand (IB). Have a 
good monitoring tool in place to rule out the network as a possible culprit. In 
the particular case of IB, check that the fabric managers are doing their jobs 
properly.

And keep an eye on the 'tail -f /var/mmfs/gen/mmfslog' output on the managers 
and on the nodes being expelled for other clues.
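
If it helps while reading those logs, the following Python sketch flags host 
names in which a block of domain labels is repeated back to back, like the 
nodename.domain.domain.domain form in the original report. It makes no 
assumption about the GPFS log line format: it simply looks for dotted tokens 
in each line and skips purely numeric ones such as IP addresses. The function 
names and the sample lines in the usage note are hypothetical.

```python
import re

# Dotted tokens made of letters, digits and hyphens (host names, but also IPs).
DOTTED_TOKEN = re.compile(r"\b[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+\b")


def repeated_block(name):
    """Return the label block repeated back to back in name, or None."""
    labels = name.rstrip(".").lower().split(".")
    n = len(labels)
    for size in range(1, n // 2 + 1):
        for start in range(0, n - 2 * size + 1):
            first = labels[start:start + size]
            second = labels[start + size:start + 2 * size]
            if first == second:
                return ".".join(first)
    return None


def find_suspect_names(lines):
    """Map each suspicious host name found in lines to its repeated block."""
    suspects = {}
    for line in lines:
        for token in DOTTED_TOKEN.findall(line):
            # Skip IP addresses and other all-numeric dotted tokens.
            if all(label.isdigit() for label in token.split(".")):
                continue
            block = repeated_block(token)
            if block is not None:
                suspects[token] = block
    return suspects
```

For example, calling find_suspect_names on the two made-up lines 
"node node7.hpc.example.org.example.org is not responding" and 
"10.0.0.0 node8.hpc.example.org ok" reports only the first name, with 
"example.org" as the repeated block. Feeding it the lines of the log file on a 
manager and on an expelled node is one way to see which names are affected.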

Jaime



On 5/9/2020 06:25:28, TURNER Aaron wrote:
Dear All,

We are intermittently seeing, with no obvious pattern so far, GPFS nodes 
reporting that they are rejecting nodes with names of the form:

nodename.domain.domain.domain....

Resolving the IP address present in the logs with the standard command-line 
DNS tools does not repeat the domain, and so far the problem seems isolated to 
GPFS.

Ultimately the nodes are rejected as not responding on the network.

Has anyone seen this sort of behaviour before?

Regards

Aaron Turner
The University of Edinburgh is a charitable body, registered in Scotland, with 
registration number SC005336.

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss


---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477