Hi Jens,
I highly recommend you go through the pain to upgrade the kernel on your
GPU cluster to something modern, like 4.15.0-91-generic. There were quite
a few regressions around the 4.15.0-56 to 4.15.0-58 mark, as we merged a
lot of upstream stable patches in at that time.
4.15.0-91 is pretty
Hi Jeff,
Hmm, I didn't get notified by Launchpad about your answer :(((. Anyway,
I tried another machine with 4.15.0-91-generic and indeed, it seems to be
fixed.
Now the problem is that our GPU machines are running 4.15.0-58-generic
and cannot be upgraded because all the NVIDIA stuff is very picky
Hi Jens,
As the fix landed in 4.15.0-59, I would expect that you would likely still
see issues in 4.15.0-58. The current Bionic GA kernel is 4.15.0.91.83 in
-updates. You should try an updated kernel and see if that resolves the issue.
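A minimal sketch of how one might check a machine against the version mentioned above, assuming the fix first appears in ABI 4.15.0-59 as stated in this message. The names parse_release, has_fix and FIXED_ABI are hypothetical helpers, not part of any Ubuntu tool, and the comparison is only meaningful within the Bionic 4.15.0 series. Other messages in this thread report that further patches may be needed on newer hardware, so a passing check is a hint, not a guarantee.

```python
import re

# Assumption taken from the message above: the fix landed in 4.15.0-59.
FIXED_ABI = (4, 15, 0, 59)


def parse_release(release):
    """Turn a string such as '4.15.0-58-generic' into (4, 15, 0, 58)."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in m.groups())


def has_fix(release, fixed=FIXED_ABI):
    """Return True if a 4.15.0-series release is at or past the fixed ABI."""
    version = parse_release(release)
    if version[:3] != fixed[:3]:
        # Other kernel series are out of scope for this simple comparison.
        raise ValueError(f"not a 4.15.0-series kernel: {release!r}")
    return version >= fixed


for release in ("4.15.0-58-generic", "4.15.0-91-generic"):
    print(release, "has fix:", has_fix(release))
```

On a live machine the release string could come from `uname -r` or Python's platform.release().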
--
You received this bug notification because you
We use 'Linux kino6 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41
UTC 2019 x86_64 x86_64 x86_64 GNU/Linux' (Ubuntu 18.04.3 LTS) and see
'hw csum failure' messages all the time:
[ +28.297139] kino6_0: hw csum failure
[ +0.003607] CPU: 12 PID: 0 Comm: swapper/12 Tainted: P O
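To get a feel for how often the message appears, one could count the 'hw csum failure' lines per interface in saved kernel log text. The sketch below is only an illustration: count_csum_failures and PATTERN are hypothetical names, the regular expression assumes the line layout shown in the excerpt above, and the sample text is just those two quoted lines.

```python
import re
from collections import Counter

# Assumes lines shaped like: "[ +28.297139] kino6_0: hw csum failure"
PATTERN = re.compile(r"\]\s*(\S+): hw csum failure")


def count_csum_failures(log_text):
    """Count 'hw csum failure' lines per interface name in kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        m = PATTERN.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


sample = """[ +28.297139] kino6_0: hw csum failure
[ +0.003607] CPU: 12 PID: 0 Comm: swapper/12 Tainted: P O
"""

for device, hits in count_csum_failures(sample).items():
    print(device, hits)
```

In practice the text would come from captured dmesg output rather than an inline string.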
Hi @mohamadh,
Are you using the in-tree drivers that Ubuntu supplies?
I have seen some customers using out-of-tree MOFED drivers from the
Mellanox website that were out of date, and who were able to reproduce
this on newer kernels like 4.15.0-66. Upon updating to newer MOFED
drivers or
Hi,
I see that the issue still reproduces on newer kernels > 4.15.0-69;
to fix it, the kernel should get all of the following upstream patches:
net/mlx5e: Rx, Fix checksum calculation for new hardware -->
db849faa9bef993a1379dc510623f750a72fa7ce
net/mlx5e: Rx, Check ip headers sanity -->
The patches can be found in Eoan as well, so closing it as Fix Released.
** Changed in: linux (Ubuntu)
Status: Incomplete => Fix Released
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
Hi Matthew,
I was going through some kernel bugs and it looks like this one has already
been released, but just didn't get updated automatically as its LP number
is not directly mentioned in the changelog.
Marking it as Fix Released.
Hope this helps!
cheers,
Mauricio
** Changed in: linux (Ubuntu
The customer installed 4.15.0-59 from -proposed to a machine with
Mellanox Ethernet CX4LX cards, using the mlx5_core kernel module.
Checksums are now calculated correctly and the kernel splat does not
occur when the devices are brought up.
Marking this as verified.
** Tags added: