I have also reproduced this with @amalmostafa's updated script, this time
using a separate data disk. I see no segfault and no EXT4 errors. The
performance regression is still present, but it is smaller than in my
previous tests.

```
### Ubuntu 20.04 with 5.4 kernel and data disk

ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdb
fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, 
ioengine=libaio, iodepth=128
...
fio-3.16
Starting 8 processes
Jobs: 8 (f=8): [W(8)][100.0%][w=1162MiB/s][w=298k IOPS][eta 00m:00s]
fiojob1: (groupid=0, jobs=8): err= 0: pid=2391: Tue Nov 21 10:34:57 2023
  write: IOPS=284k, BW=1108MiB/s (1162MB/s)(320GiB/295713msec); 0 zone resets
    slat (nsec): min=751, max=115304k, avg=8630.05, stdev=124263.77
    clat (nsec): min=391, max=239001k, avg=3598764.23, stdev=2429948.87
     lat (usec): min=72, max=239002, avg=3607.70, stdev=2428.75
    clat percentiles (usec):
     |  1.00th=[  668],  5.00th=[ 1434], 10.00th=[ 1778], 20.00th=[ 2212],
     | 30.00th=[ 2573], 40.00th=[ 2900], 50.00th=[ 3261], 60.00th=[ 3654],
     | 70.00th=[ 4080], 80.00th=[ 4686], 90.00th=[ 5669], 95.00th=[ 6587],
     | 99.00th=[ 9110], 99.50th=[10945], 99.90th=[26608], 99.95th=[43779],
     | 99.99th=[83362]
   bw (  MiB/s): min=  667, max= 1341, per=99.98%, avg=1107.88, stdev=13.07, 
samples=4728
   iops        : min=170934, max=343430, avg=283618.08, stdev=3346.84, 
samples=4728
  lat (nsec)   : 500=0.01%, 750=0.01%, 1000=0.01%
  lat (usec)   : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%
  lat (usec)   : 250=0.04%, 500=0.43%, 750=0.66%, 1000=0.51%
  lat (msec)   : 2=13.23%, 4=53.26%, 10=31.18%, 20=0.53%, 50=0.11%
  lat (msec)   : 100=0.03%, 250=0.01%
  cpu          : usr=3.10%, sys=7.36%, ctx=1105263, majf=0, minf=102
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwts: total=0,83886080,0,8 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
  WRITE: bw=1108MiB/s (1162MB/s), 1108MiB/s-1108MiB/s (1162MB/s-1162MB/s), 
io=320GiB (344GB), run=295713-295713msec

Disk stats (read/write):
  sdb: ios=96/33256749, merge=0/50606838, ticks=19/30836895, in_queue=971080, 
util=100.00%
ubuntu@cloudimg:~$ 
ubuntu@cloudimg:~$ uname --kernel-release
5.4.0-164-generic
ubuntu@cloudimg:~$ cat /etc/cloud/build.info 
build_name: server
serial: 20231011
ubuntu@cloudimg:~$ cat /etc/os-release 
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

### Ubuntu 20.04 with 5.15 kernel and data disk

ubuntu@cloudimg:~$ uname --kernel-release 
5.15.0-89-generic
ubuntu@cloudimg:~$ cat /etc/os-release 
NAME="Ubuntu"
VERSION="20.04.6 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.6 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
ubuntu@cloudimg:~$ cat /etc/cloud/build.info 
build_name: server
serial: 20231011

ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdb
fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, 
ioengine=libaio, iodepth=128
...
fio-3.16
Starting 8 processes
Jobs: 8 (f=8): [W(8)][100.0%][w=1071MiB/s][w=274k IOPS][eta 00m:00s]
fiojob1: (groupid=0, jobs=8): err= 0: pid=1008: Tue Nov 21 12:19:56 2023
  write: IOPS=258k, BW=1007MiB/s (1056MB/s)(320GiB/325284msec); 0 zone resets
    slat (nsec): min=931, max=36726k, avg=7936.15, stdev=120427.45
    clat (nsec): min=1963, max=155870k, avg=3959799.65, stdev=2129472.51
     lat (usec): min=55, max=155872, avg=3968.10, stdev=2128.87
    clat percentiles (usec):
     |  1.00th=[  562],  5.00th=[ 1319], 10.00th=[ 1811], 20.00th=[ 2376],
     | 30.00th=[ 2835], 40.00th=[ 3294], 50.00th=[ 3720], 60.00th=[ 4113],
     | 70.00th=[ 4621], 80.00th=[ 5211], 90.00th=[ 6390], 95.00th=[ 7439],
     | 99.00th=[10159], 99.50th=[11863], 99.90th=[19268], 99.95th=[23462],
     | 99.99th=[36439]
   bw (  KiB/s): min=715896, max=1190887, per=100.00%, avg=1031613.99, 
stdev=9298.50, samples=5200
   iops        : min=178974, max=297721, avg=257903.26, stdev=2324.62, 
samples=5200
  lat (usec)   : 2=0.01%, 20=0.01%, 50=0.01%, 100=0.01%, 250=0.01%
  lat (usec)   : 500=0.46%, 750=1.58%, 1000=0.60%
  lat (msec)   : 2=10.64%, 4=43.72%, 10=41.91%, 20=0.99%, 50=0.08%
  lat (msec)   : 100=0.01%, 250=0.01%
  cpu          : usr=2.85%, sys=7.94%, ctx=959401, majf=0, minf=106
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwts: total=0,83886080,0,8 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
  WRITE: bw=1007MiB/s (1056MB/s), 1007MiB/s-1007MiB/s (1056MB/s-1056MB/s), 
io=320GiB (344GB), run=325284-325284msec

Disk stats (read/write):
  sdb: ios=46/38116780, merge=0/45751626, ticks=3/33604063, in_queue=33604068, 
util=100.00%


### Ubuntu 22.04 with 5.15 kernel and data disk

ubuntu@cloudimg:~$ uname --kernel-release
5.15.0-87-generic
ubuntu@cloudimg:~$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
ubuntu@cloudimg:~$ cat /etc/cloud/build.info 
build_name: server
serial: 20231026
ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdb
fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, 
ioengine=libaio, iodepth=128
...
fio-3.28
Starting 8 processes
Jobs: 8 (f=8): [W(8)][100.0%][w=921MiB/s][w=236k IOPS][eta 00m:00s]
fiojob1: (groupid=0, jobs=8): err= 0: pid=2202: Tue Nov 21 12:28:43 2023
  write: IOPS=256k, BW=1001MiB/s (1050MB/s)(320GiB/327226msec); 0 zone resets
    slat (nsec): min=901, max=35355k, avg=7934.72, stdev=121719.65
    clat (usec): min=3, max=142689, avg=3983.12, stdev=2144.58
     lat (usec): min=96, max=142691, avg=3991.39, stdev=2144.04
    clat percentiles (usec):
     |  1.00th=[  570],  5.00th=[ 1336], 10.00th=[ 1827], 20.00th=[ 2409],
     | 30.00th=[ 2868], 40.00th=[ 3294], 50.00th=[ 3720], 60.00th=[ 4146],
     | 70.00th=[ 4621], 80.00th=[ 5276], 90.00th=[ 6390], 95.00th=[ 7504],
     | 99.00th=[10421], 99.50th=[12256], 99.90th=[19006], 99.95th=[22938],
     | 99.99th=[33162]
   bw (  KiB/s): min=684728, max=1214915, per=100.00%, avg=1026686.75, 
stdev=9974.31, samples=5226
   iops        : min=171182, max=303728, avg=256671.00, stdev=2493.58, 
samples=5226
  lat (usec)   : 4=0.01%, 20=0.01%, 50=0.01%, 100=0.01%, 250=0.01%
  lat (usec)   : 500=0.42%, 750=1.56%, 1000=0.62%
  lat (msec)   : 2=10.20%, 4=44.19%, 10=41.81%, 20=1.11%, 50=0.08%
  lat (msec)   : 100=0.01%, 250=0.01%
  cpu          : usr=2.88%, sys=7.72%, ctx=959556, majf=0, minf=113
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwts: total=0,83886080,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
  WRITE: bw=1001MiB/s (1050MB/s), 1001MiB/s-1001MiB/s (1050MB/s-1050MB/s), 
io=320GiB (344GB), run=327226-327226msec

Disk stats (read/write):
  sdb: ios=150/38103383, merge=0/45772366, ticks=40/33843512, 
in_queue=33843555, util=100.00%

```
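
For a rough sense of scale, the relative change between these runs can be
computed from the `WRITE: bw=` summary lines above. The following is a
minimal sketch and not part of the test script; the helper name `pct_change`
and the dictionary labels are made up for illustration, and the bandwidth
values are copied from the output above.

```python
# Bandwidth (MiB/s) copied from the "WRITE: bw=" summary lines above.
runs = {
    "20.04 / 5.4.0-164": 1108,
    "20.04 / 5.15.0-89": 1007,
    "22.04 / 5.15.0-87": 1001,
}


def pct_change(baseline: float, value: float) -> float:
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


baseline = runs["20.04 / 5.4.0-164"]
for label, bw in runs.items():
    print(f"{label}: {bw} MiB/s ({pct_change(baseline, bw):+.1f}% vs 5.4)")
```

With these inputs, both 5.15 runs come out roughly 9 to 10% below the 5.4
run on the separate data disk.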

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2042564

Title:
  Performance regression in the 5.15 Ubuntu 20.04 kernel compared to 5.4
  Ubuntu 20.04 kernel

Status in linux package in Ubuntu:
  New
Status in linux source package in Focal:
  New

Bug description:
  We in the Canonical Public Cloud team have received a report from our
  colleagues at Google regarding a potential performance regression with
  the 5.15 kernel vs the 5.4 kernel on Ubuntu 20.04. Their tests were
  performed using the linux-gkeop and linux-gkeop-5.15 kernels.

  I have verified with the generic Ubuntu 20.04 5.4 linux-generic and
  the Ubuntu 20.04 5.15 linux-generic-hwe-20.04 kernels.

  The tests were run using `fio`.

  fio commands:

  * 4k initwrite: `fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdc`
  * 4k overwrite: `fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdc`
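
As a sanity check on these parameters, the expected amount of data and the
number of writes can be worked out by hand. This is a small sketch that
assumes fio interprets `40G` as 40 GiB and `4k` as 4096 bytes, which is
consistent with the `4096B` block size and `io=320GiB` shown in the fio
output in this report; the variable names are illustrative only.

```python
GIB = 1024 ** 3

numjobs = 8           # --numjobs=8
filesize = 40 * GIB   # --filesize=40G, per job (assumed to mean 40 GiB)
blocksize = 4096      # --blocksize=4k

total_bytes = numjobs * filesize
total_writes = total_bytes // blocksize

print(f"total data:   {total_bytes // GIB} GiB")
print(f"total writes: {total_writes}")
```

These values agree with the `io=320GiB` and `issued rwts: total=0,83886080,...`
lines in the fio output, so each run wrote the same amount of data.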

  
  My reproducer was to launch an Ubuntu 20.04 cloud image locally with QEMU.
  The results are below:

  Using the 5.4 kernel:

  ```
  ubuntu@cloudimg:~$ uname --kernel-release
  5.4.0-164-generic

  ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sda
  fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=libaio, iodepth=128
  ...
  fio-3.16
  Starting 8 processes
  Jobs: 8 (f=8): [W(8)][99.6%][w=925MiB/s][w=237k IOPS][eta 00m:01s] 
  fiojob1: (groupid=0, jobs=8): err= 0: pid=2443: Thu Nov  2 09:15:22 2023
    write: IOPS=317k, BW=1237MiB/s (1297MB/s)(320GiB/264837msec); 0 zone resets
      slat (nsec): min=628, max=37820k, avg=7207.71, stdev=101058.61
      clat (nsec): min=457, max=56099k, avg=3222240.45, stdev=1707823.38
       lat (usec): min=23, max=56100, avg=3229.78, stdev=1705.80
      clat percentiles (usec):
       |  1.00th=[  775],  5.00th=[ 1352], 10.00th=[ 1647], 20.00th=[ 2024],
       | 30.00th=[ 2343], 40.00th=[ 2638], 50.00th=[ 2933], 60.00th=[ 3261],
       | 70.00th=[ 3654], 80.00th=[ 4146], 90.00th=[ 5014], 95.00th=[ 5932],
       | 99.00th=[ 8979], 99.50th=[10945], 99.90th=[18220], 99.95th=[22676],
       | 99.99th=[32113]
     bw (  MiB/s): min=  524, max= 1665, per=100.00%, avg=1237.72, stdev=20.42, 
samples=4232
     iops        : min=134308, max=426326, avg=316855.16, stdev=5227.36, 
samples=4232
    lat (nsec)   : 500=0.01%, 750=0.01%, 1000=0.01%
    lat (usec)   : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%
    lat (usec)   : 250=0.05%, 500=0.54%, 750=0.37%, 1000=0.93%
    lat (msec)   : 2=17.40%, 4=58.02%, 10=22.01%, 20=0.60%, 50=0.07%
    lat (msec)   : 100=0.01%
    cpu          : usr=3.29%, sys=7.45%, ctx=1262621, majf=0, minf=103
    IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
       submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
       complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.1%
       issued rwts: total=0,83886080,0,8 short=0,0,0,0 dropped=0,0,0,0
       latency   : target=0, window=0, percentile=100.00%, depth=128

  Run status group 0 (all jobs):
    WRITE: bw=1237MiB/s (1297MB/s), 1237MiB/s-1237MiB/s (1297MB/s-1297MB/s), 
io=320GiB (344GB), run=264837-264837msec

  Disk stats (read/write):
    sda: ios=36/32868891, merge=0/50979424, ticks=5/27498602, in_queue=1183124, 
util=100.00%
  ```

  
  After upgrading to the linux-generic-hwe-20.04 kernel and rebooting:

  ```
  ubuntu@cloudimg:~$ uname --kernel-release
  5.15.0-88-generic

  ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sda
  fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=libaio, iodepth=128
  ...
  fio-3.16
  Starting 8 processes
  Jobs: 1 (f=1): [_(7),W(1)][100.0%][w=410MiB/s][w=105k IOPS][eta 00m:00s]
  fiojob1: (groupid=0, jobs=8): err= 0: pid=1438: Thu Nov  2 09:46:49 2023
    write: IOPS=155k, BW=605MiB/s (634MB/s)(320GiB/541949msec); 0 zone resets
      slat (nsec): min=660, max=325426k, avg=10351.04, stdev=232438.50
      clat (nsec): min=1100, max=782743k, avg=6595008.67, stdev=6290570.04
       lat (usec): min=86, max=782748, avg=6606.08, stdev=6294.03
      clat percentiles (usec):
       |  1.00th=[   914],  5.00th=[  2180], 10.00th=[  2802], 20.00th=[  3556],
       | 30.00th=[  4178], 40.00th=[  4817], 50.00th=[  5538], 60.00th=[  6259],
       | 70.00th=[  7177], 80.00th=[  8455], 90.00th=[ 10683], 95.00th=[ 13566],
       | 99.00th=[ 26870], 99.50th=[ 34866], 99.90th=[ 63177], 99.95th=[ 80217],
       | 99.99th=[145753]
     bw (  KiB/s): min=39968, max=1683451, per=100.00%, avg=619292.10, 
stdev=26377.19, samples=8656
     iops        : min= 9990, max=420862, avg=154822.58, stdev=6594.34, 
samples=8656
    lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
    lat (usec)   : 100=0.01%, 250=0.01%, 500=0.05%, 750=0.48%, 1000=0.65%
    lat (msec)   : 2=2.79%, 4=23.00%, 10=60.93%, 20=10.08%, 50=1.83%
    lat (msec)   : 100=0.16%, 250=0.02%, 500=0.01%, 1000=0.01%
    cpu          : usr=3.27%, sys=7.39%, ctx=1011754, majf=0, minf=93
    IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
       submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
       complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.1%
       issued rwts: total=0,83886080,0,8 short=0,0,0,0 dropped=0,0,0,0
       latency   : target=0, window=0, percentile=100.00%, depth=128

  Run status group 0 (all jobs):
    WRITE: bw=605MiB/s (634MB/s), 605MiB/s-605MiB/s (634MB/s-634MB/s), 
io=320GiB (344GB), run=541949-541949msec

  Disk stats (read/write):
    sda: ios=264/31713991, merge=0/52167896, ticks=127/57278442, 
in_queue=57278609, util=99.95%
  ```

  I have shared the results with xnox. The important data points are
  `bw=1237MiB/s` with the 5.4 kernel and only `bw=605MiB/s` with the
  5.15 kernel.

  Attached are the test results initially reported by our Google
  colleagues.
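
To put a number on the gap between the two runs, the aggregate write
bandwidth can be pulled out of the fio summary lines and compared. This is
only a sketch: the regular expression and the helper name `write_bw_mib` are
assumptions made for illustration, the summary strings are copied from the
output above, and only the MiB/s unit seen in these runs is handled.

```python
import re

# Summary lines copied from the two fio runs above.
summary_5_4 = "WRITE: bw=1237MiB/s (1297MB/s), 1237MiB/s-1237MiB/s (1297MB/s-1297MB/s),"
summary_5_15 = "WRITE: bw=605MiB/s (634MB/s), 605MiB/s-605MiB/s (634MB/s-634MB/s),"

BW_RE = re.compile(r"WRITE: bw=(\d+(?:\.\d+)?)MiB/s")


def write_bw_mib(summary: str) -> float:
    """Extract the aggregate write bandwidth in MiB/s from a fio summary line."""
    match = BW_RE.search(summary)
    if match is None:
        raise ValueError(f"no MiB/s write bandwidth found in: {summary!r}")
    return float(match.group(1))


old = write_bw_mib(summary_5_4)
new = write_bw_mib(summary_5_15)
change = (new - old) / old * 100.0
print(f"5.4: {old:.0f} MiB/s, 5.15: {new:.0f} MiB/s, change: {change:+.1f}%")
```

For these two runs the change works out to roughly -51%, i.e. the 5.15
kernel reached about half the write bandwidth of the 5.4 kernel.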

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2042564/+subscriptions


