Hi All,
With recent contributions to Hurd IRQ management I was finally able to
run GNU/Hurd on my vintage x86_64 hardware in order to stress test it
with stress-ng. I've been running similar tests on virtual machines for
the last 6 months or so and was interested in how stable they would be
on standalone hardware.
It was immediately obvious that swapping performance under intensive
paging was much worse than on the virtual machine. That in itself is
not surprising, but performance was so poor that system lockups (which
also occur on virtual machines) were almost immediate. In fact, I have
not been able to run this single 2-minute test case to completion
without the kernel ending in a 'system lock' (waiting on a page-in):
# stress-ng -t 2m --metrics --vm 32 --vm-bytes 1800M --mmap 32
--mmap-bytes 1800M --page-in
My machine has a traditional rotating disc and 4GB of RAM. Running the
above on a similarly sized virtual machine uses around 1.3G of swap and
succeeds approximately 90% or more of the time. My suspicion is that
the longer page-in times of a real disc (rather than a possibly cached
virtual disc) result in a greater likelihood of system lock. I
concluded that in order to make meaningful observations of this kind of
system behaviour on actual hardware, I needed to improve the swapping
performance.
The current page replacement policy in gnumach/vm_page is documented
within the source. It describes a policy of preferring to page out
external pages (mmap) over internal pages (anonymous memory) in order
to minimise use of the default pager, which is described as unreliable.
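In effect the documented behaviour amounts to something like the
following (a conceptual sketch only; the helper names are made up and
are not the real gnumach functions):

#include <stddef.h>

struct page;                                    /* opaque here */

/* Assumed helpers for the sketch, not actual gnumach functions. */
struct page *take_inactive_external_page(void);
struct page *take_inactive_internal_page(void);

struct page *current_policy_choose_victim(void)
{
    /* External (file-backed) pages are reclaimed first... */
    struct page *victim = take_inactive_external_page();

    /* ...and anonymous memory, which goes through the default pager,
       is only touched when no external page is available. */
    if (victim == NULL)
        victim = take_inactive_internal_page();

    return victim;
}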
I've been stress testing GNU/Hurd for quite some time now and have seen
many system freezes, but I do not recall any that were definitely
caused by the default pager. The most common underlying cause is a
request for an external page that cannot make progress, due either to a
deadlock elsewhere or to assertion failures within the ext2fs or
storeio servers.
I have recently spent some time developing alternative page replacement
implementations of varying complexity. One of the simplest of these
(referred to from here on as 'My_patch') results in very significant
performance improvements generally, and is sufficient to allow the
stress test case above to complete most of the time. Before I offer
this as a patch series, I'd like to present the performance
improvements it delivers and describe how it achieves them.
I've benchmarked the following 2 test cases:
1) SNG10
This is simply ten iterations of the 2-minute stress-ng test case shown
above. Whilst it is a good way to drive the system into a heavy paging
state, it doesn't really represent anything that would normally be run
on a machine.
2) TCM3
This is a test case closer to a normal workload. I looked for some
heavily templated C++ code that results in large compiler process
sizes. Specifically, I used MatrixSine.cpp, which is included as
example code in the libeigen package. Running 3 concurrent compilations
results in around 500M of swap usage on my 4GB test machines:
# /usr/bin/x86_64-gnu-g++-14 -I/usr/include/eigen3 -g -O2 -o
matrix_sine_1 MatrixSine.cpp &
# /usr/bin/x86_64-gnu-g++-14 -I/usr/include/eigen3 -g -O2 -o
matrix_sine_2 MatrixSine.cpp &
# /usr/bin/x86_64-gnu-g++-14 -I/usr/include/eigen3 -g -O2 -o
matrix_sine_3 MatrixSine.cpp &
These are the various machine configurations used with each test case:
1) VMHURD_REL: VM using Hurd (GNU-Mach 1.8+git20250731-8 amd64)
   4096M RAM (3610M post boot), 2.8G swap
2) VMHURD_PAT: VM using Hurd (GNU-Mach 1.8+git20250731-8 amd64 + 'My_patch')
   4096M RAM (3610M post boot), 2.8G swap
3) VMLINX: VM using Debian (6.12.48+deb13-amd64)
   3920M RAM (run with maxcpus=1 and has 3610M post boot), 4G swap
4) HWHURD_REL: Advent hardware using Hurd (GNU-Mach 1.8+git20250731-8 amd64)
   4096M RAM (3374M available after boot), 10G swap
5) HWHURD_PAT: Advent hardware using Hurd (GNU-Mach 1.8+git20250731-8
   amd64 + 'My_patch')
   4096M RAM (3374M available after boot), 10G swap
6) HWLINX: Advent hardware using Debian (6.12.48+deb13-amd64)
   4096M RAM (run with maxcpus=1 and has 3325M available after boot)
I have a number of my own local glibc, gnumach and hurd patches that fix
various bugs exposed by the stress tests but which I have not yet
submitted for merging. These do not affect swap performance.
These figures show averages for a number of runs of TCM3:
VMHURD_REL/TCM3: 11m12s (pagein=2225294, pageout=1972821)
VMHURD_PAT/TCM3:  3m07s (pagein= 179883, pageout= 281279)
HWHURD_REL/TCM3: Unable to complete any test case
HWHURD_PAT/TCM3:  8m59s (pagein= 256466, pageout= 373059)
HWLINX/TCM3:      2m12s (pagein=  66796, pageout= 236674)
The VMLINX times are significantly shorter than VMHURD_PAT, but due to
differences in virtual machine optimisations it doesn't seem meaningful
to report them. The figures above do show, however, that on hardware,
even with my patched kernel, Linux is around 4 times faster than
GNU/Hurd in this test case.
The stress-ng test case metrics give an indication of the number of mmap
and vm operations completed. Here are the averaged totals for a number
of test cases:
VMHURD_REL/SNG10: (mmap)  50.1, (vm) 206217
VMHURD_PAT/SNG10: (mmap) 327.1, (vm) 840666
HWHURD_REL/SNG10: Unable to complete any test run
HWHURD_PAT/SNG10: (mmap) 183,   (vm) 560018
TCM3 completes over 3 times faster with 'My_patch', and SNG10 completes
approximately 4 times as many stress-ng operations per iteration.
All 'My_patch' actually does is remove the restriction of always
prioritising external pages for eviction. Quite a few lines of code are
changed, but almost all of the changes are trivial. They result in 2
main behavioural differences (a rough sketch in code follows the list):
1) The vm_page code currently always attempts to find an external page
before looking for an internal one. I have changed a number of
functions so that they are told explicitly whether to choose external
or internal pages.
2) The vm_page code currently counts active and inactive pages, with
each count covering both external and internal pages. I have changed
the code to maintain separate counts for active_internal,
active_external, inactive_internal and inactive_external pages.
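As a rough sketch of both changes (the identifiers below are
illustrative, not the exact gnumach names):

#include <stddef.h>

struct page {
    struct page *next;
    int external;           /* 1 = file-backed (mmap), 0 = anonymous */
};

/* Change 2: separate counts per page type rather than combined
   active/inactive totals. */
static unsigned long active_external_count, active_internal_count;
static unsigned long inactive_external_count, inactive_internal_count;

static struct page *inactive_queue;      /* head of the inactive list */

/* Change 1: the caller states explicitly which type of page it wants,
   instead of the scan always preferring external pages. */
static struct page *pull_inactive_page(int want_external)
{
    struct page **link = &inactive_queue;

    for (struct page *p = *link; p != NULL; link = &p->next, p = *link) {
        if (p->external != want_external)
            continue;
        *link = p->next;                 /* unlink the chosen page */
        if (want_external)
            inactive_external_count--;
        else
            inactive_internal_count--;
        return p;
    }
    return NULL;                 /* no page of the requested type */
}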
The final patch in the series uses an extremely unsophisticated
algorithm to decide which type of page to evict next: it keeps choosing
external pages until they represent fewer than 1 in 25 of all (active
plus inactive) pages, at which point it chooses internal pages. I do
not propose this as a long-term strategy, simply as a starting point
for a more meaningful eviction policy. It is, frankly, ludicrously
simplistic, but nevertheless seemingly effective, at least for these
test cases.
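Continuing with the illustrative names from the sketch above (again,
this is just how I would express the threshold, not the literal patch
code), the decision boils down to:

/* Keep evicting external pages until they fall below 1 in 25 of all
   active + inactive pages, then switch to internal (anonymous)
   pages. */
static int choose_external_for_eviction(void)
{
    unsigned long external = active_external_count
                             + inactive_external_count;
    unsigned long total = external + active_internal_count
                          + inactive_internal_count;

    if (total == 0)
        return 0;

    /* external * 25 >= total  <=>  external/total >= 1/25 */
    return external * 25 >= total;
}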
There are many parts of the current implementation that negatively
affect performance. I have some speculative changes that reduce the
HWHURD_PAT/TCM3 time from the roughly 9m above to around 5m, but those
can be discussed later if appropriate.
I'd welcome feedback on whether 'My_patch' should be submitted for
consideration.
Regards,
Mike.