This:
“2018-06-14 08:04:06.341564 CET ib_rdma_nic_unrecognized ERROR IB RDMA NIC 
mlx5_0/1 was not recognized”

It looks like you are telling GPFS to use an MLX card that doesn’t exist on the 
node. This is set with verbsPorts. It’s probably not your issue here, but you 
would be better off using nodeclasses and assigning the config option only to 
those nodeclasses that have the correct card installed. I’d also encourage you 
to use a fabric number: we do this even when there is only one fabric in the 
cluster, as we’ve added other fabrics over time and across multiple locations.
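
As a rough sketch of what I mean (the nodeclass name, node names and fabric 
number below are made up, so substitute your own, and check which device and 
port each node really has before applying anything). It only prints the 
commands unless you set DRY_RUN=0:

```shell
#!/bin/sh
# Hypothetical values: adjust all of these to your cluster.
NODECLASS="ib_mlx5"        # made-up nodeclass name
NODES="node01,node02"      # made-up nodes that really have the card
DEVICE="mlx5_0"            # device name as reported in your log line
PORT=1
FABRIC=1                   # assumed fabric number

# A verbsPorts entry has the form device/port/fabric.
VERBS_PORTS="${DEVICE}/${PORT}/${FABRIC}"

# Print each command by default; run it for real only when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

run mmcrnodeclass "$NODECLASS" -N "$NODES"
run mmchconfig "verbsPorts=${VERBS_PORTS}" -N "$NODECLASS"
run mmlsconfig verbsPorts
```

As far as I know a verbsPorts change only takes effect once the GPFS daemon 
has been restarted on the affected nodes, so plan for that.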

Have you tried using mmnetverify at all? It’s been getting better in the newer 
releases. As well as testing connectivity between nodes, it will give you a 
good indication of whether you have a comms issue caused by something like 
name resolution.
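
Something along these lines is where I would start. The operation names are 
as I remember them from the mmnetverify man page, so treat them as an 
assumption and check them against your release. Again this only prints the 
commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Print each command by default; run it for real only when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi; }

# Name resolution and ping checks from the local node to every cluster node.
run mmnetverify resolution ping --target-nodes all

# Broader connectivity checks, run from all nodes against all nodes.
run mmnetverify connectivity -N all --target-nodes all
```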

Simon

From: <[email protected]> on behalf of 
"[email protected]" <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Friday, 15 June 2018 at 16:16
To: "[email protected]" <[email protected]>
Subject: Re: [gpfsug-discuss] Thousands of CLOSE_WAIT connections

2018-06-14 08:04:06.341564 CET ib_rdma_nic_unrecognized ERROR IB RDMA NIC 
mlx5_0/1 was not recognized
