I ran a netperf test over a Mellanox 23108 4X HCA against r2720; after a while the HCA stopped accepting packets.
I am fairly sure this is a bug in the driver, not in IPoIB: rmmod ib_ipoib didn't help, but after removing ib_mthca and restarting, the driver worked again. It's easy to reproduce on my 4-way Intel-based systems.

I saw the messages below in /var/log/messages when this happened.

Jun 29 12:13:53 elm3a238 kernel: ib_mthca 0000:04:00.0: Port change to down for port 1
Jun 29 12:13:55 elm3a238 kernel: ib_mthca 0000:04:00.0: Port change to active for port 1

After that, the HCA no longer accepted ARP reply packets.

For now I am adding a permanent ARP entry to avoid this problem, and that seems to work fine. Has anybody else seen this problem?
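
For anyone who wants to try the same workaround, here is a rough sketch of how the permanent entry could be added with iproute2. The message does not say which command was actually used, so this is only one possible way to do it. The helper name, the interface name ib0, the IP address, and the hardware address are all made-up placeholders; the only firm assumption is that an IPoIB link-layer address is 20 bytes long.

```python
import re

# An IPoIB link-layer address is 20 bytes, written as colon-separated hex.
_LLADDR_RE = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){19}$")


def permanent_neigh_cmd(ip, lladdr, dev="ib0"):
    """Build (but do not run) an iproute2 command for a permanent neighbor entry.

    The function name and the default device "ib0" are illustrative only.
    """
    if not _LLADDR_RE.match(lladdr):
        raise ValueError("expected a 20-byte colon-separated hardware address")
    # "replace" creates the entry or overwrites an existing one;
    # "nud permanent" keeps the kernel from expiring or re-resolving it.
    return ["ip", "neigh", "replace", ip,
            "lladdr", lladdr, "dev", dev, "nud", "permanent"]


if __name__ == "__main__":
    # Placeholder peer address and hardware address, not taken from a real system.
    cmd = permanent_neigh_cmd(
        "192.168.0.2",
        "00:00:04:04:fe:80:00:00:00:00:00:00:00:02:c9:02:00:00:00:01",
    )
    print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) as root. The peer's real hardware address can be read from "ip neigh show dev ib0" on a system where ARP still resolves.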

Thanks
Shirley Ma
IBM Linux Technology Center
15300 SW Koll Parkway
Beaverton, OR 97006-6063
Phone(Fax): (503) 578-7638

_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general
