I primarily test by building webkitgtk [1], and I experience the same loss of 
system responsiveness whether / is ext4 or Btrfs. But I do see a difference in 
top and iotop.
https://drive.google.com/open?id=12jpQeskPsvHmfvDjWSPOwIWSz09JIUlk

This is an extreme case of refaulting: the system is out of both memory
and swap, and since kswapd and the btrfs threads are using a lot of CPU,
I'm guessing the faults are a mix of anonymous pages and file pages. At
this point the system is effectively lost, which is why the UX is the
same with ext4 and btrfs; but behind the scenes it does seem that more
is going on with btrfs. There might be other workloads that aren't as
extreme and would therefore expose the difference. Two possible sources
of the heavy CPU use by the btrfs threads are decompression and
checksumming. If it's true that reclaim is happening nearly constantly,
a refault is not a simple 4K minimum read but a read of a whole
compressed extent, and Btrfs compressed extents cover up to 128K of
data. That extent then has to be decompressed, and the csum tree has to
be read and a csum computed on the read for comparison. Ordinarily this
is cheap, but in this situation it is possibly causing a lot of extra
congestion. This is the limit of my knowledge, so it's just speculation.
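
To put rough numbers on that guess, here is a minimal sketch that uses only
the 4K page size and 128K compressed extent size mentioned above. The
compression ratio is a made-up parameter for illustration, not a
measurement, and the function name is hypothetical.

```python
# Back-of-the-envelope cost of one refault inside a compressed extent.
PAGE_SIZE = 4 * 1024       # data actually needed by one page fault
EXTENT_SIZE = 128 * 1024   # data covered by one compressed extent (from the text)


def refault_cost(compression_ratio: float = 0.5) -> dict:
    """Estimate the work behind a single 4 KiB refault.

    compression_ratio is a hypothetical on-disk / uncompressed size ratio.
    """
    if not 0 < compression_ratio <= 1:
        raise ValueError("compression_ratio must be in (0, 1]")
    return {
        # Assumption: the whole compressed extent is read from disk.
        "bytes_read_from_disk": int(EXTENT_SIZE * compression_ratio),
        # The whole extent is decompressed to get at one page.
        "bytes_decompressed": EXTENT_SIZE,
        "pages_decompressed_per_page_needed": EXTENT_SIZE // PAGE_SIZE,
    }


if __name__ == "__main__":
    print(refault_cost())
```

With these sizes, one faulted page can mean decompressing 32 pages' worth
of data, which would fit the CPU time seen in the btrfs threads if the
speculation above is right.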

Btrfs write amplification is a known issue (the wandering trees problem),
but that does not appear to be the issue in this example.

It might be that this problem is better dealt with by using cgroupsv2 to
protect certain tasks from reclaim, which would reduce the problem on any
file system. But Btrfs alone (for now) also has more sophisticated
cgroupsv2 IO isolation control.
https://www.spinics.net/lists/cgroups/msg24743.html

The upstream GNOME and KDE developers are aware of the loss of responsiveness 
problem and have done quite a lot of preliminary work in GNOME 3.34 with more 
work on the way.
https://blogs.gnome.org/benzea/2019/10/01/gnome-3-34-is-now-managed-using-systemd/

You can take advantage of this cgroupsv2 work today by running
resource-hungry tasks as a systemd user unit in Fedora 31.
https://blogs.gnome.org/benzea/2019/10/01/gnome-3-34-is-now-managed-using-systemd/#comment-14833
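
As a rough illustration of what that can look like, the sketch below only
assembles a systemd-run command line for a transient user scope with a
MemoryHigh= limit. The 6G value and the "make -j10" build command are
placeholders I picked, not recommendations from the linked comment, and it
assumes the memory controller is delegated to the user manager.

```python
import shlex


def build_systemd_run(command, memory_high="6G"):
    """Return argv that runs `command` in a transient user scope.

    memory_high is a hypothetical throttling threshold; tune it to the machine.
    """
    return [
        "systemd-run",
        "--user",    # talk to the per-user systemd instance
        "--scope",   # run the command in a transient scope unit
        "-p", f"MemoryHigh={memory_high}",  # cgroup v2 memory.high for the scope
        *command,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_systemd_run(["make", "-j10"])))
```

The idea is that the build gets throttled and reclaimed from first when it
crosses the limit, rather than the rest of the session.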

I expect in the next 6-12 months (it's a guesstimate) there will be
additional work in GNOME to protect the user session or what I vaguely
call the "GUI stack" from reclaim, and thus improve its responsiveness
at the expense of the resource hungry process.


[1] The first two lines at the link below; set -j to your RAM in GiB plus 2,
e.g. if you have 8G RAM, use -j 10. More jobs make the problem happen faster.
https://trac.webkit.org/wiki/BuildingGtk
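
For anyone reproducing this, a small sketch of that -j rule of thumb
follows. It reads MemTotal from /proc/meminfo; rounding up to a whole GiB
is my assumption, since an "8G" machine reports somewhat less than 8 GiB.
The function names are made up for this example.

```python
import math

KIB_PER_GIB = 1024 * 1024


def ram_gib_from_meminfo(meminfo_text):
    """Parse MemTotal (reported in kB) and round up to whole GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])
            return math.ceil(kib / KIB_PER_GIB)
    raise ValueError("MemTotal not found")


def jobs_for_ram(ram_gib):
    """Job count from the footnote: RAM in GiB plus 2."""
    return ram_gib + 2


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print(f"-j {jobs_for_ram(ram_gib_from_meminfo(f.read()))}")
```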

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1833281

Title:
  System freeze when memory is put on SWAP in Linux >4.10.x

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1833281/+subscriptions

