apport information

** Attachment added: "ProcInterrupts.txt"
   
https://bugs.launchpad.net/bugs/2023143/+attachment/5678292/+files/ProcInterrupts.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2023143

Title:
  Memory leak on large server

Status in linux package in Ubuntu:
  New

Bug description:
  Hi,
  We are trying to diagnose a kernel memory leak on a production Ubuntu 22.04.2 LTS server.
  We have tried several official Ubuntu kernels: 5.15aws, 5.19aws and now even 6.2.0-1004-aws (all Ubuntu signed):
  ```
  # cat /proc/version_signature
  Ubuntu 6.2.0-1004.4-aws 6.2.6
  ```

  This is a production server, so we'll appreciate any and all help diagnosing and solving this issue!
   
  The server is a u-12tb1.112xlarge instance with 12TB RAM, and it is losing 1TB+ of memory a day to a kernel leak.
  For example, currently with an uptime of 3.5 days, we have only 1.8Ti available, yet RSS+slabs is only 4.1TB.

  All active processes together take about 4TB of RAM (`ps -eo rss | awk 'BEGIN {x=0} {x = x + $1} END {print x}'` gives 4088636708, in kB).
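
  As a cross-check on units: `ps` reports RSS in KiB, so the sum above can be converted as in the Python sketch below. This is only an illustration, not part of the original diagnosis. The helper names are hypothetical, and summing per-process RSS counts shared pages more than once, so the figure is an upper bound.

```python
def sum_rss_kib(ps_output: str) -> int:
    """Sum the numeric lines of `ps -eo rss` output (KiB), skipping the header."""
    total = 0
    for line in ps_output.splitlines():
        token = line.strip()
        if token.isdigit():  # the "RSS" header line is ignored
            total += int(token)
    return total


def kib_to_tib(kib: int) -> float:
    """Convert KiB to binary terabytes (TiB)."""
    return kib / 2**30


def kib_to_tb(kib: int) -> float:
    """Convert KiB to decimal terabytes (TB)."""
    return kib * 1024 / 1e12


if __name__ == "__main__":
    reported_kib = 4088636708  # value quoted in this report
    print(f"{kib_to_tib(reported_kib):.2f} TiB, {kib_to_tb(reported_kib):.2f} TB")
```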

  From slabtop we see that about 100GB is consumed by slab (`slabtop -o -s t | head`):
  ```
   Active / Total Objects (% used)    : 303580174 / 332642344 (91.3%)
   Active / Total Slabs (% used)      : 6697552 / 6697552 (100.0%)
   Active / Total Caches (% used)     : 158 / 215 (73.5%)
   Active / Total Size (% used)       : 112801663.93K / 121442845.45K (92.9%)
   Minimum / Average / Maximum Object : 0.01K / 0.36K / 16.00K

    OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
  67537280 59696907  88%    0.03K 527635      128   2110540K kmalloc-32
  65247564 65241398  99%    0.31K 1279364       51  20469824K arc_buf_hdr_t_full
  58270446 58040685  99%    0.10K 747057       78   5976456K abd_t
  16697268 13731405  82%    0.38K 397554       42   6360864K dmu_buf_impl_t
  15982912 10366686  64%    0.50K 249733       64   7991456K kmalloc-512
  14975616 11605380  77%    0.06K 233994       64    935976K kmalloc-64
  ```

  In /proc/meminfo:
  ```
  MemTotal:       12656421408 kB
  MemFree:        1975976204 kB
  MemAvailable:   1968415088 kB
  Buffers:         1087956 kB
  Cached:         101168004 kB
  SwapCached:     17912340 kB
  Active:         101022084 kB
  Inactive:       4129984264 kB
  Active(anon):   94623216 kB
  Inactive(anon): 4104673512 kB
  Active(file):    6398868 kB
  Inactive(file): 25310752 kB
  Unevictable:      338908 kB
  Mlocked:          332132 kB
  SwapTotal:      4294967292 kB
  SwapFree:       3500705532 kB
  Zswap:                 0 kB
  Zswapped:              0 kB
  Dirty:              2908 kB
  Writeback:             0 kB
  AnonPages:      4123489132 kB
  Mapped:          3761620 kB
  Shmem:          70756156 kB
  KReclaimable:   10319220 kB
  Slab:           122355620 kB
  SReclaimable:   10319220 kB
  SUnreclaim:     112036400 kB
  KernelStack:     1793296 kB
  PageTables:     21748556 kB
  SecPageTables:         0 kB
  NFS_Unstable:          0 kB
  Bounce:                0 kB
  WritebackTmp:          0 kB
  CommitLimit:    10623177996 kB
  Committed_AS:   6775476544 kB
  VmallocTotal:   34359738367 kB
  VmallocUsed:    296984480 kB
  VmallocChunk:          0 kB
  Percpu:          1326080 kB
  HardwareCorrupted:     0 kB
  AnonHugePages:  1630980096 kB
  ShmemHugePages:        0 kB
  ShmemPmdMapped:        0 kB
  FileHugePages:         0 kB
  FilePmdMapped:         0 kB
  HugePages_Total:       0
  HugePages_Free:        0
  HugePages_Rsvd:        0
  HugePages_Surp:        0
  Hugepagesize:       2048 kB
  Hugetlb:               0 kB
  DirectMap4k:     2056036 kB
  DirectMap2M:    40935424 kB
  DirectMap1G:    12814647296 kB
  ```
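
  To make the gap concrete, the figures above can be fed into a rough accounting sketch in Python. The choice of fields counted as "accounted for" is an assumption for illustration, not the kernel's exact bookkeeping, and the names `ACCOUNTED_FIELDS`, `parse_meminfo` and `unaccounted_kb` are hypothetical. `VmallocUsed` is deliberately left out because it can overlap with other fields.

```python
# Values copied from the /proc/meminfo output in this report (kB).
REPORTED_KB = {
    "MemTotal": 12656421408,
    "MemFree": 1975976204,
    "AnonPages": 4123489132,
    "Cached": 101168004,  # page cache, includes Shmem
    "Buffers": 1087956,
    "SwapCached": 17912340,
    "Slab": 122355620,
    "KernelStack": 1793296,
    "PageTables": 21748556,
    "Percpu": 1326080,
}

# Assumption: these fields roughly cover the memory the kernel can attribute.
ACCOUNTED_FIELDS = (
    "AnonPages", "Cached", "Buffers", "SwapCached",
    "Slab", "KernelStack", "PageTables", "Percpu",
)


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def unaccounted_kb(meminfo: dict) -> int:
    """Memory in use (MemTotal - MemFree) minus the accounted fields, in kB."""
    in_use = meminfo["MemTotal"] - meminfo["MemFree"]
    return in_use - sum(meminfo[name] for name in ACCOUNTED_FIELDS)


if __name__ == "__main__":
    gap = unaccounted_kb(REPORTED_KB)
    print(f"unaccounted: {gap} kB ({gap / 2**30:.2f} TiB)")
```

  On a live system the same function could be applied to `parse_meminfo(open("/proc/meminfo").read())`, sampled periodically to see whether the gap grows over time.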

  It's not a tmpfs/shm fs issue either:
  ```
  df -h | grep -E 'tmpfs|shm'
  tmpfs                                               256G   70G  187G  27% /dev/shm
  tmpfs                                               256G  3.4M  256G   1% /run
  tmpfs                                               5.0M     0  5.0M   0% /run/lock
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10102
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/1002
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10030
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10194
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10200
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10136
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10198
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10143
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10188
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10124
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10174
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10165
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10197
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10183
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10033
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10023
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10133
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10185
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10201
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/1004
  tmpfs                                               8.0G   24K  8.0G   1% /run/user/10014
  ```
  --- 
  ProblemType: Bug
  AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access '/dev/snd/': No such file or directory
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.20.11-0ubuntu82.5
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  CRDA: N/A
  CasperMD5CheckResult: unknown
  DistroRelease: Ubuntu 22.04
  Ec2AMI: ami-08c40ec9ead489470
  Ec2AMIManifest: (unknown)
  Ec2AvailabilityZone: us-east-1d
  Ec2InstanceType: u-12tb1.112xlarge
  Ec2Kernel: unavailable
  Ec2Ramdisk: unavailable
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lspci: Error: [Errno 2] No such file or directory: 'lspci'
  Lspci-vt: Error: [Errno 2] No such file or directory: 'lspci'
  Lsusb: Error: command ['lsusb'] failed with exit code 1:
  Lsusb-t:
   
  Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
  MachineType: Amazon EC2 u-12tb1.112xlarge
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   LC_CTYPE=C.UTF-8
   TERM=xterm-256color
   PATH=(custom, no user)
   LANG=C.UTF-8
   SHELL=/bin/bash
  ProcFB:
   
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-6.2.0-1004-aws root=PARTUUID=cbb5015f-ca94-467b-91ae-cce97828a042 ro quiet mitigations=off console=tty1 console=ttyS0 nvme_core.io_timeout=4294967295 panic=-1
  ProcVersionSignature: Ubuntu 6.2.0-1004.4-aws 6.2.6
  RelatedPackageVersions:
   linux-restricted-modules-6.2.0-1004-aws N/A
   linux-backports-modules-6.2.0-1004-aws  N/A
   linux-firmware                          N/A
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  Tags:  jammy ec2-images
  Uname: Linux 6.2.0-1004-aws x86_64
  UnreportableReason: This report is about a package that is not installed.
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: N/A
  _MarkForUpload: False
  dmi.bios.date: 10/16/2017
  dmi.bios.release: 1.0
  dmi.bios.vendor: Amazon EC2
  dmi.bios.version: 1.0
  dmi.board.asset.tag: i-0b8914fe51e3d7555
  dmi.board.vendor: Amazon EC2
  dmi.chassis.asset.tag: Amazon EC2
  dmi.chassis.type: 1
  dmi.chassis.vendor: Amazon EC2
  dmi.modalias: dmi:bvnAmazonEC2:bvr1.0:bd10/16/2017:br1.0:svnAmazonEC2:pnu-12tb1.112xlarge:pvr:rvnAmazonEC2:rn:rvr:cvnAmazonEC2:ct1:cvr:sku:
  dmi.product.name: u-12tb1.112xlarge
  dmi.sys.vendor: Amazon EC2

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2023143/+subscriptions

