I'd be inclined to look at something like:

ibqueryerrors -s PortXmitWait,LinkDownedCounter,PortXmitDiscards,PortRcvRemotePhysicalErrors -c

and see whether you have a high number of symbol errors. If so, a cable 
might need replugging or replacing.
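
To see which peers the RDMA errors point at before chasing cables, something along these lines might help. It is only a rough sketch: the function name count_rdma_errors is made up for illustration, it assumes each log entry sits on a single line in the format quoted below, and it counts every "[E] VERBS RDMA" line (so an error plus its "closed connection" line count as two).

```shell
# Hypothetical helper: tally "[E] VERBS RDMA" log lines per peer.
# Reads GPFS log text on stdin and prints "<count> <ip> <hostname>",
# highest count first. Assumes one log entry per line.
count_rdma_errors() {
    grep -F '[E] VERBS RDMA' \
        | sed -n 's/.* to \([0-9.][0-9.]*\) (\([^)]*\)) on .*/\1 \2/p' \
        | sort \
        | uniq -c \
        | sort -rn \
        | awk '{print $1, $2, $3}'   # normalise the padding uniq -c adds
}

# Example (the log path is an assumption; adjust for your installation):
# count_rdma_errors < /var/adm/ras/mmfs.log.latest
```

If one or two peers dominate the output, those are the ports worth checking first in the ibqueryerrors report.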

Simon

From: [email protected] on behalf of "J. Eric Wonderley" [email protected]
Reply-To: [email protected]
Date: Tuesday, 17 January 2017 at 21:16
To: [email protected]
Subject: [gpfsug-discuss] rmda errors scatter thru gpfs logs

I frequently see messages like these in my logs:
Tue Jan 17 11:25:49.731 2017: [E] VERBS RDMA rdma write error IBV_WC_REM_ACCESS_ERR to 10.51.10.5 (cl005) on mlx5_0 port 1 fabnum 0 vendor_err 136
Tue Jan 17 11:25:49.732 2017: [E] VERBS RDMA closed connection to 10.51.10.5 (cl005) on mlx5_0 port 1 fabnum 0 due to RDMA write error IBV_WC_REM_ACCESS_ERR index 23

Any ideas on the cause?

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
