Public bug reported:

Several different systems, both VMs and bare metal, running Ubuntu 20.04
LTS show random panics. This happens with different kernels: the generic
Ubuntu kernel 5.4.0-91.81, 5.11-hwe, and a mainline 5.15 build from
kernel.ubuntu.com.

The crash dumps all show panics in the SLUB memory management code.
Analyzing the kdump shows invalid free list pointers:

CACHE             OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE  NAME
kmem: kmalloc-256: slab: ffffefc73e319900 invalid freepointer: d28996eab0548a12
ffff8a863dc06f40      256      15454     76096   1189    16k  kmalloc-256
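
A quick sanity check on the addresses above, as a minimal sketch: it assumes 48-bit virtual addresses (4-level paging on x86-64), where a valid pointer must have bits 63..47 all equal. The helper name is_canonical is made up for illustration. Note that if the kernel is built with CONFIG_SLAB_FREELIST_HARDENED, stored free pointers are obfuscated, so a raw value read from the slab may look non-canonical for that reason alone.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if addr is a sign-extended (canonical) 64-bit address.

    Assumption: va_bits=48, i.e. 4-level paging. Bits 63..(va_bits-1)
    must then be all zeros or all ones.
    """
    top = addr >> (va_bits - 1)                # bits 63..47 for va_bits=48
    all_ones = (1 << (64 - va_bits + 1)) - 1   # 17 set bits for va_bits=48
    return top == 0 or top == all_ones


if __name__ == "__main__":
    # Values copied from the kmem output in this report.
    values = {
        "cache":       0xFFFF8A863DC06F40,
        "slab":        0xFFFFEFC73E319900,
        "freepointer": 0xD28996EAB0548A12,
    }
    for name, value in values.items():
        print(f"{name:12s} {value:016x} canonical={is_canonical(value)}")
```

Under that assumption, the cache and slab addresses pass the check and the reported freepointer does not.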

The kernel log shows warnings about wrong slab cache:

[148148.037307] cache_from_obj: Wrong slab cache. jbd2_journal_handle but object is from kmalloc-256
[148148.037348] WARNING: CPU: 20 PID: 4141624 at mm/slab.h:521 kmem_cache_free+0x260/0x2b0
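
To compare the affected hosts, the warnings can be tallied by cache pair. This is only a sketch: the function name tally_wrong_cache is hypothetical, and the pattern is derived from the single message quoted above, so other kernel versions may word it differently. It tolerates a line break after "but", since mail clients may wrap the log line.

```python
import re
from collections import Counter

# Pattern based on the one warning quoted in this report (an assumption,
# not taken from the kernel source). \s+ also matches a wrapped line.
WRONG_CACHE = re.compile(
    r"cache_from_obj: Wrong slab cache\. "
    r"(?P<expected>\S+) but\s+object is from (?P<actual>\S+)"
)


def tally_wrong_cache(log_text: str) -> Counter:
    """Count (expected cache, actual cache) pairs found in a kernel log."""
    return Counter(
        (m["expected"], m["actual"]) for m in WRONG_CACHE.finditer(log_text)
    )


if __name__ == "__main__":
    sample = (
        "[148148.037307] cache_from_obj: Wrong slab cache. "
        "jbd2_journal_handle but \nobject is from kmalloc-256\n"
    )
    for (expected, actual), count in tally_wrong_cache(sample).items():
        print(f"{count}x freed to {expected}, object belongs to {actual}")
```

Feeding it the saved output of dmesg or the journal from each machine would show whether kmalloc-256 is always the cache involved.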

This shows up when there are several hundred MB per second of backup
traffic via ceph (I think using rbd). Sometimes it takes only minutes
for the system to panic.

Hardware:
    HP DL380 Gen10
    AMD EPYC 7502 32-Core Processor
    256GB RAM

Firmware and microcode are up to date.

I attached one backtrace, the warning message, and the panic.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "Panic 1"
   https://bugs.launchpad.net/bugs/1952425/+attachment/5543591/+files/out

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1952425

Title:
  SLUB freelist corruption with ceph

Status in linux package in Ubuntu:
  New
