I primarily test by building webkitgtk [1], and I experience the same loss of 
system responsiveness whether / is ext4 or Btrfs. But I do see a difference in 
top and iotop.
https://drive.google.com/open?id=12jpQeskPsvHmfvDjWSPOwIWSz09JIUlk

This is an extreme case of refaulting: the system is out of memory and
swap, and since kswapd and the btrfs threads are using a lot of CPU, I'm
guessing the faults are a mix of anonymous pages and file pages. At this
point the system is really lost, which is why the UX is the same with
ext4 and btrfs; but behind the scenes it does seem more is going on.
There might be other workloads which aren't as extreme and would expose
the difference. Two possible sources of the heavy CPU use by btrfs
threads: decompression and checksumming. If it's true there is near
constant reclaim happening, a refault is not just a simple minimum 4K
read but rather a whole compressed extent, which on Btrfs covers up to
128K of data; that extent is then decompressed, and the read also
requires reading the csum tree and computing the csum to compare.
Ordinarily this is cheap, but in this situation it's possibly resulting
in a lot of extra congestion. This is the limit of my knowledge, so it's
just speculation.
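
To put a rough number on the 4K versus 128K point, here is a
back-of-the-envelope sketch. It assumes 4 KiB pages and the worst case
where every refaulted page sits in a different, full-size 128 KiB
compressed extent. The function names are made up for illustration, and
the figures count uncompressed data; the actual on-disk read is smaller
by whatever the compression ratio is.

```python
# Assumed sizes: 4 KiB pages and 128 KiB compressed extents (worst case).
PAGE_SIZE = 4 * 1024
COMPRESSED_EXTENT = 128 * 1024


def uncompressed_bytes_handled(faulted_pages, extent_size=COMPRESSED_EXTENT):
    """Uncompressed bytes produced if each faulted page needs its own extent."""
    return faulted_pages * extent_size


def amplification(faulted_pages, page_size=PAGE_SIZE, extent_size=COMPRESSED_EXTENT):
    """Ratio of uncompressed bytes handled to bytes actually needed."""
    needed = faulted_pages * page_size
    return uncompressed_bytes_handled(faulted_pages, extent_size) / needed


if __name__ == "__main__":
    pages = 1000  # hypothetical number of refaulted file pages
    print(uncompressed_bytes_handled(pages))  # bytes decompressed, worst case
    print(amplification(pages))               # 32.0
```

So under these assumptions each refaulted file page can cost up to 32
times as much decompression work as the page itself is worth, before the
csum tree lookups are counted.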

Btrfs write amplification is a known issue (the wandering trees
problem), but that does not appear to be the issue in this example.

It might be that this problem is better dealt with by cgroupsv2, to
protect certain tasks from reclaim and thus reduce the problem on any
file system. But Btrfs alone (for now) does have more sophisticated
cgroupsv2 IO isolation control as well.
https://www.spinics.net/lists/cgroups/msg24743.html

The upstream GNOME and KDE developers are aware of the loss of responsiveness 
problem and have done quite a lot of preliminary work in GNOME 3.34 with more 
work on the way.
https://blogs.gnome.org/benzea/2019/10/01/gnome-3-34-is-now-managed-using-systemd/

You can take advantage of this cgroupsv2 work today by running
resource-hungry tasks as a systemd user unit in Fedora 31.
https://blogs.gnome.org/benzea/2019/10/01/gnome-3-34-is-now-managed-using-systemd/#comment-14833
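
As a minimal sketch of what that could look like, the snippet below
builds a systemd-run command line that starts a job in its own transient
scope under the user manager. This is not necessarily the exact
invocation from the linked comment. The helper name, the "make -j 10"
build command and the 6G MemoryHigh value are placeholders I picked for
illustration; --user, --scope and -p are systemd-run options, and
MemoryHigh= is a cgroupsv2 resource-control setting.

```python
import shlex


def systemd_run_argv(command, memory_high=None):
    """Build argv to run `command` in a transient user scope (hypothetical helper)."""
    argv = ["systemd-run", "--user", "--scope"]
    if memory_high is not None:
        # MemoryHigh= throttles the unit and pushes it into reclaim above this limit.
        argv += ["-p", f"MemoryHigh={memory_high}"]
    # "--" separates systemd-run options from the command being launched.
    return argv + ["--"] + list(command)


if __name__ == "__main__":
    # Placeholder build command and limit; adjust both for the real workload.
    argv = systemd_run_argv(["make", "-j", "10"], memory_high="6G")
    print(shlex.join(argv))
    # To actually launch it: subprocess.run(argv, check=True)
```

The helper only assembles the argument list, so it can be inspected
before anything is started.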

I expect in the next 6-12 months (it's a guesstimate) there will be
additional work in GNOME to protect the user session or what I vaguely
call the "GUI stack" from reclaim, and thus improve its responsiveness
at the expense of the resource hungry process.


[1] Use the first two lines; set -j to the RAM size in GiB plus 2, e.g.
if you have 8G RAM, use -j 10; more jobs make the problem happen faster.
https://trac.webkit.org/wiki/BuildingGtk
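
For anyone who wants to script that rule of thumb, here is a small
sketch that derives the -j value from the MemTotal line of
/proc/meminfo. The function name is made up, and rounding to the nearest
GiB is my assumption, since a nominal 8G machine reports a bit less than
8 GiB usable.

```python
def jobs_for_ram(meminfo_text):
    """Return RAM in GiB (rounded to nearest) plus 2, from /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # MemTotal is reported in kB (KiB)
            return round(kib / (1024 * 1024)) + 2
    raise ValueError("MemTotal not found")


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print(jobs_for_ram(f.read()))
```

With a sample line such as "MemTotal: 8052636 kB" this returns 10,
matching the 8G example above.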

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1833281

Title:
  System freeze when memory is put on SWAP in Linux >4.10.x

Status in Linux:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  I'm reporting this since it's reproducible 70% of the time.
  Summary:

  In different circumstances, when the system starts to swap out RAM,
  even small amounts, the system becomes completely unusable and the
  screen freezes up: no mouse movement, and no TTY or SSH access can be
  made. Only the SysRq keys seem to do something (only reboot; REISUB
  has worked so far, but OOM is useless since the memory/swap is not
  even full).

  The disk I/O LED is stuck at 100% in ALL the following cases when this
  happens.

  So far:
  - This happens even when only ZRAM is enabled, and no swap partition is used.
  - Happens when ZSWAP is used with a swap partition
  - Happens also when a partition without zram or zswap is used
  - Maybe it's AMD specific? 

  However, I'm not experiencing this on my laptop using the same tests.

  My laptop is an Intel one, while my desktop is an AMD Ryzen platform.

  Here are the specs:
  CPU: AMD Ryzen 5 1600 no OC
  GPU: AMD RX 580 8GB
  SSD: Crucial MX500 500GB
  MOBO: MSI B350M Grenade
  RAM: 8GB HyperX Kingston 2667Mhz

  Ubuntu version: 18.04 LTS, backports repo enabled
  Kernel version: 4.18.0-18, official ubuntu repo
  Bios settings: Default

  Additional info: I'm not 100% sure, but I noticed that when using the
  5.0.0-17 generic kernel, the lockups still seem to happen, but they
  recover eventually. This happened only a few times though...

  But the system will always be frozen for at least 30 seconds, unlike
  my Intel laptop, where the freezes do not occur.

  The SSD model is the same (I bought two of these), and both machines
  have the same amount of RAM.

  On my laptop the freezes do not occur at all. Swapping memory, even
  large quantities like 1GB or more, does not produce any issues.

  Tests made:
  For testing this behaviour I tried:

  - Compiling the chromium-browser source code (takes up a lot of system
  RAM)

  - Used the "stress" command, using a specific amount of memory to
  decide how much will be swapped, and here I noticed that even small
  quantities like a couple of megabytes will cause the system to freeze
  70% of the time

  Example: "stress --vm 1 --vm-bytes=7G"

  What should happen:
  I expect system slowdowns when swapping out memory since I do not
  have enough RAM, but, as when using Windows or my laptop with the same
  Linux version, not a completely unusable environment. The swap
  partition is in both cases on an SSD.

  Reproducibility: 70% of the time

  Additional info again:
  I'm not sure this is due to any hardware failure; my SSD health is
  fine, as are my CPU and RAM. As I said, swapping in Windows works
  fine...
  --- 
  ProblemType: Bug
  ApportVersion: 2.20.9-0ubuntu7.6
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC1:  haru       2076 F.... pulseaudio
   /dev/snd/controlC2:  haru       2076 F.... pulseaudio
   /dev/snd/controlC0:  haru       2076 F.... pulseaudio
  CurrentDesktop: communitheme:ubuntu:GNOME
  DistroRelease: Ubuntu 18.04
  IwConfig:
   enp24s0   no wireless extensions.
   
   lo        no wireless extensions.
  MachineType: Micro-Star International Co., Ltd. MS-7A37
  Package: linux (not installed)
  ProcFB: 0 amdgpudrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/@/boot/vmlinuz-5.0.0-17-generic 
root=UUID=75d45574-7169-4653-aea3-9f95087f0806 ro rootflags=subvol=@ quiet 
splash vt.handoff=1
  ProcVersionSignature: Ubuntu 5.0.0-17.18~18.04.1-generic 5.0.8
  RelatedPackageVersions:
   linux-restricted-modules-5.0.0-17-generic N/A
   linux-backports-modules-5.0.0-17-generic  N/A
   linux-firmware                            1.173.6
  RfKill:
   0: hci0: Bluetooth
        Soft blocked: no
        Hard blocked: no
  Tags:  bionic
  Uname: Linux 5.0.0-17-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: sudo video
  WifiSyslog:
   
  _MarkForUpload: True
  dmi.bios.date: 01/22/2019
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: 1.K0
  dmi.board.asset.tag: To be filled by O.E.M.
  dmi.board.name: B350M MORTAR (MS-7A37)
  dmi.board.vendor: Micro-Star International Co., Ltd.
  dmi.board.version: 1.0
  dmi.chassis.asset.tag: To be filled by O.E.M.
  dmi.chassis.type: 4
  dmi.chassis.vendor: Micro-Star International Co., Ltd.
  dmi.chassis.version: 1.0
  dmi.modalias: 
dmi:bvnAmericanMegatrendsInc.:bvr1.K0:bd01/22/2019:svnMicro-StarInternationalCo.,Ltd.:pnMS-7A37:pvr1.0:rvnMicro-StarInternationalCo.,Ltd.:rnB350MMORTAR(MS-7A37):rvr1.0:cvnMicro-StarInternationalCo.,Ltd.:ct4:cvr1.0:
  dmi.product.family: To be filled by O.E.M.
  dmi.product.name: MS-7A37
  dmi.product.sku: To be filled by O.E.M.
  dmi.product.version: 1.0
  dmi.sys.vendor: Micro-Star International Co., Ltd.
