On 17/01/2024 15:21, Ward Poelmans wrote:
On 17/01/2024 16:11, Ryan Novosielski wrote:
We have at various points run into nodes not using RDMA, just because of
a minor misconfiguration, and suddenly hundreds of megabytes a second of
storage traffic were going over a network designed for administration.
You can use verbsRdmaFailBackTCPIfNotAvailable=no for that. If RDMA is
not working on a node configured for it, GPFS will refuse to start.
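As a rough illustration of checking for that setting, here is a minimal
Python sketch that scans configuration text for it. The two-column
"name value" layout and the helper names (parse_config,
rdma_fallback_disabled) are assumptions made for this example and not
part of GPFS; only the verbsRdmaFailBackTCPIfNotAvailable=no setting
comes from the message above.

```python
def parse_config(text):
    """Parse 'name value' lines into a dict; lines without a value are skipped."""
    settings = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


def rdma_fallback_disabled(settings):
    """True only if RDMA is enabled and the TCP fallback is explicitly 'no'."""
    return (
        settings.get("verbsRdma") == "enable"
        and settings.get("verbsRdmaFailBackTCPIfNotAvailable") == "no"
    )


# Hypothetical sample input; the layout is an assumption, not captured output.
sample = """\
verbsRdma enable
verbsRdmaFailBackTCPIfNotAvailable no
"""
print(rdma_fallback_disabled(parse_config(sample)))
```

A node whose configuration lacks the second line would be reported as
still able to fall back to TCP, which is the case described above.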
Interesting. Note that we run GPFS exclusively over Ethernet, and the
idea was still to run it over Ethernet but with RDMA.
We took the decision a long time ago now to make use of the fact that we
have fancy pants Ethernet switches and put the admin traffic over the
same physical Ethernet link but on a separate VLAN which we then
prioritise with QoS.
Consequently, if something were to go wrong with the RDMA and it fell
back to TCP it would still be going over the same physical link :-)
JAB.
--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org