Your message dated Fri, 30 Apr 2021 21:36:25 +0200
with message-id <E1lcYvn-001MJ1-M1@hullmann.westfalen.local>
and subject line Closing this bug
has caused the Debian Bug report #795060,
regarding Latest Wheezy backport kernel prefers Infiniband mlx4_en over 
mlx4_ib, breaks existing installs
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ow...@bugs.debian.org
immediately.)


-- 
795060: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=795060
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
--- Begin Message ---
Package: linux-image-3.16.0-0.bpo.4-amd64
Version: 3.16.7-ckt11-1+deb8u2~bpo70+1
Severity: Critical


Hello,

We have a 2-node Supermicro chassis (2028TP-DC0FR) with an onboard
Mellanox ConnectX-3 HBA in production since last year.
Both nodes are directly connected with a QSFP FDR cable.
We use IPoIB (for DRBD) and thus load the mlx4_ib module and all the
assorted other ones in /etc/modules at boot time. 
These are Wheezy machines, currently with the 3.16.7-ckt2-1~bpo70+1 kernel.

Last week we got another (identical) one of these chassis and I installed
Wheezy as well (we need pacemaker, which is sorely lacking in Jessie).
This was with the 3.16.7-ckt11-1+deb8u2~bpo70+1 kernel and unlike in the
past it proceeded to load the mlx4_en module automatically, created an
eth2: interface and the ib0: interface was nowhere to be found.

This was not only very unexpected; I was also under the impression that
mlx4_en and mlx4_ib could be used in parallel. Yet even though mlx4_ib was
loaded, it did not work (the /sys/class/net/ib0 entry was not created).

Booting into the stock Wheezy 3.2 kernel (which we also run on older
machines with ConnectX-2 HBAs) resulted in the expected behavior, IB
interface, no Ethernet. 

I'm also not seeing this on several other machines we use for Ceph with the
current Jessie kernel, but to be fair they use slightly different (QDR,
not FDR) ConnectX-3 HBAs.

After doing a fake-install (blacklisting didn't work) like this:
---
echo "install mlx4_en /bin/true" > /etc/modprobe.d/mlx4_en.conf 
depmod -a
update-initramfs -u
---
and rebooting I have IB running on 3.16.0-0.bpo.4-amd64 again as well.
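
A minimal sketch for checking the result after such a reboot. It is not
part of the original report: the helper name check_iface is hypothetical,
and it assumes the IPoIB interface is named ib0 and that the interface
created by mlx4_en was eth2, as described above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): check whether a
# network interface exists under a sysfs-style directory.
# $1 = interface name (default ib0), $2 = directory (default /sys/class/net)
check_iface() {
    iface="${1:-ib0}"
    net_dir="${2:-/sys/class/net}"
    if [ -e "$net_dir/$iface" ]; then
        echo "$iface present"
    else
        echo "$iface missing"
        return 1
    fi
}

# Possible use after the reboot (interface names are assumptions):
#   check_iface ib0        # IPoIB interface should be back
#   check_iface eth2       # should be missing if mlx4_en stayed unloaded
#   lsmod | grep '^mlx4'   # shows which mlx4 modules are actually loaded
```

The directory argument exists only so the check can be tried against a
scratch directory instead of the live /sys tree.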

Given that the previous version works as expected and that Jessie is
doing the "right" thing as well, I'd consider this a critical bug.

Had I rebooted the older production cluster with 500,000 users on it into
this kernel, the results would not have been pretty.

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer                
ch...@gol.com           Global OnLine Japan/Fusion Communications
http://www.gol.com/

--- End Message ---
--- Begin Message ---
This bug was filed for a very old kernel. If you can reproduce it with
- the current version in unstable/testing, or
- the latest kernel from buster-backports,
please reopen the bug; see https://www.debian.org/Bugs/server-control

--- End Message ---
