On Jul 27, 2025, at 00:33, Mark Millard <mark...@yahoo.com> wrote:

> On Jul 23, 2025, at 01:42, Mark Millard <mark...@yahoo.com> wrote:
>
>> In a context with RAM+SWAP = 704 GiBytes (192 GiBytes being RAM,
>> 512 GiBytes being SWAP) doing poudriere bulk -Ca builds at some
>> point ends up with reports like:
>>
>> swp_pager_getswapspace(22): failed
>>
>> and:
>>
>> was killed: failed to reclaim memory
>>
>> for 12 builders, MAKE_JOBS_NUMBER=3 , TMPFS_BLACKLIST
>> in use, 32 FreeBSD cpus, etc.
>>
>> For example:
>>
>> . . .
>> Jul 22 10:17:27 7950X3D-ZFS kernel: pid 62915 (scc_16815), jid 780, uid 0: exited on signal 11 (core dumped)
>> Jul 22 21:38:10 7950X3D-ZFS kernel: ue0: link state changed to DOWN
>> Jul 22 21:38:10 7950X3D-ZFS kernel: ue0: link state changed to UP
>> Jul 22 21:38:29 7950X3D-ZFS kernel: swap_pager: out of swap space
>> Jul 22 21:38:29 7950X3D-ZFS kernel: swp_pager_getswapspace(22): failed
>> Jul 22 21:39:11 7950X3D-ZFS kernel: pid 15059 (dot), jid 780, uid 0, was killed: failed to reclaim memory
>> Jul 22 21:43:38 7950X3D-ZFS kernel: swap_pager: out of swap space
>> Jul 22 21:43:38 7950X3D-ZFS kernel: swp_pager_getswapspace(14): failed
>> Jul 22 21:44:04 7950X3D-ZFS kernel: pid 15049 (dot), jid 780, uid 0, was killed: failed to reclaim memory
>> Jul 22 21:56:39 7950X3D-ZFS kernel: swap_pager: out of swap space
>> Jul 22 21:56:39 7950X3D-ZFS kernel: swp_pager_getswapspace(15): failed
>> Jul 22 21:57:12 7950X3D-ZFS kernel: pid 15045 (dot), jid 780, uid 0, was killed: failed to reclaim memory
>>
>> I've not figured out a way to track down such messages
>> back to the relevant log file for the builds that were
>> killed. Neither the pid, nor the jid appear in
>> the log files. Similarly, nothing in /var/log/messages
>> identifies the poudriere Job Id or other such.
>>
>> (I've never happened to be actively monitoring when
>> the issue happened. So I've always ended up looking at
>> it after the fact.)
>>
>> It would be nice to be able to identify what specific
>> packages to try to rebuild for these --and to investigate
>> why the SWAP usage that had stayed under 2 GiByte ended
>> up reaching 512 GiBytes during that period.
>
> A panic from the activity during another bulk -Ca
> test led to the dump providing enough context to
> track down the package that was being built that
> got the issue and what it was running that, in
> turn, had the problem memory usage:
>
> [2D:01:22:29] [06] [00:00:00] Building graphics/sdl2_gpu | sdl2_gpu-0.12.0
>
> was using:
>
> UID   PID  PPID  C PRI NI       VSZ      RSS MWCHAN STAT TT    TIME COMMAND
> . . .
>   0 79229 40923  4  59  0     23524     4148 wait   D    -  0:00.00 [sh]
>   0 79230 79229  5  59  0     14208      172 wait   Ds   -  0:00.01 [make]
>   0 79233 79230  4  59  0     14668      176 wait   D    -  0:00.00 [sh]
>   0 79234 79233  5  59  0     14668      176 wait   D    -  0:00.00 [sh]
>   0 79235 79234 12   0  0     16284      356 select D    -  0:00.01 [ninja]
>   0 79236 79235 28  59  0    223048     1052 uwait  D    -  0:00.44 [doxygen]
>   0 79272 79236 25  59  0 157589964 41424308 pfault D    -  3:25.33 [dot]
>   0 79279 79236 31  59  0 157601740 41513520 pfault D    -  3:23.41 [dot]
>   0 79289 79236 14  59  0 157589964 41361600 pfault D    -  3:22.72 [dot]
>   0 79301 79236 18  49  0 157667276 41208476 pfault D    -  3:24.32 [dot]
> . . .
>
> Part of the context was the /06/ text in:
> . . .
> root dot 79301 0 /usr/local/poudriere/data/.m/main-ZNV4-bulk_a-alt/06/dev 20 crw-rw-rw- null r
> root dot 79289 0 /usr/local/poudriere/data/.m/main-ZNV4-bulk_a-alt/06/dev 20 crw-rw-rw- null r
> . . .
> root dot 79279 0 /usr/local/poudriere/data/.m/main-ZNV4-bulk_a-alt/06/dev 20 crw-rw-rw- null r
> . . .
> root dot 79272 0 /usr/local/poudriere/data/.m/main-ZNV4-bulk_a-alt/06/dev 20 crw-rw-rw- null r
> . . .
> root doxygen 79236 0 /usr/local/poudriere/data/.m/main-ZNV4-bulk_a-alt/06/dev 20 crw-rw-rw- null r
> . . .
>
> It identifies the [06] builder and the "Building" notice had made it to
> the disk before the panic happened.
> Then I could check the Makefile to see
> whether doxygen was used, and it was. The graphics/sdl2_gp historical build logs
> suggest problems exist.
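
The matching step described above (open-file path, to builder number, to the last "Building" notice for that builder) could be sketched roughly as follows. This is only an illustration in Python: the function names are made up, and both regular expressions assume that the path layout and the status-line layout look like the samples quoted above, which may not hold for other poudriere configurations.

```python
import re

# Assumption: the builder number is the two-digit directory that follows the
# master jail name under ".m", as in the quoted paths (.../.m/<name>/06/dev).
BUILDER_IN_PATH = re.compile(r"/\.m/[^/]+/(\d{2})(?:/|$)")

# Assumption: status lines look like the quoted
# "[2D:01:22:29] [06] [00:00:00] Building graphics/sdl2_gpu | sdl2_gpu-0.12.0".
BUILDING_LINE = re.compile(
    r"\[(\d{2})\]\s+\[[^\]]*\]\s+Building\s+(\S+)\s+\|\s+(\S+)"
)


def builder_from_path(path):
    """Return the builder number embedded in a path, or None if absent."""
    match = BUILDER_IN_PATH.search(path)
    return match.group(1) if match else None


def last_building_by_builder(status_lines):
    """Map builder number -> (origin, package) from the latest notice seen."""
    latest = {}
    for line in status_lines:
        match = BUILDING_LINE.search(line)
        if match:
            builder, origin, package = match.groups()
            latest[builder] = (origin, package)  # later lines overwrite earlier
    return latest


if __name__ == "__main__":
    path = "/usr/local/poudriere/data/.m/main-ZNV4-bulk_a-alt/06/dev"
    notices = [
        "[2D:01:22:29] [06] [00:00:00] Building graphics/sdl2_gpu | sdl2_gpu-0.12.0",
    ]
    builder = builder_from_path(path)
    print(builder, last_building_by_builder(notices).get(builder))
```

Keeping only the latest notice per builder mirrors the manual reasoning: the last "Building" line that reached the disk for a builder is the package it was working on when things went wrong.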
Dumb typo, missing the "u" in "gpu", so: graphics/sdl2_gpu

===
Mark Millard
marklmi at yahoo.com
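
For the after-the-fact case from the original report, a small scan of the kernel messages for "was killed" entries at least collects the pid, command name, and jid in one place. The following is a rough Python sketch, not an existing tool: the function names are hypothetical, the pattern is derived only from the log lines quoted in this thread, and it assumes each message sits on a single line in the log (the quoted copies are wrapped by the mail client).

```python
import re
from collections import Counter

# Pattern derived from the quoted messages, e.g.
# "Jul 22 21:39:11 7950X3D-ZFS kernel: pid 15059 (dot), jid 780, uid 0,
#  was killed: failed to reclaim memory" (one line in the actual log).
KILL_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"pid\s+(?P<pid>\d+)\s+\((?P<comm>[^)]+)\),\s+jid\s+(?P<jid>\d+),\s+"
    r"uid\s+(?P<uid>\d+),\s+was\s+killed:\s+(?P<reason>.+)$"
)


def find_kills(log_lines):
    """Return one dict per 'was killed' kernel message found in log_lines."""
    kills = []
    for line in log_lines:
        match = KILL_LINE.match(line.strip())
        if match:
            entry = match.groupdict()
            for key in ("pid", "jid", "uid"):
                entry[key] = int(entry[key])
            kills.append(entry)
    return kills


def count_by_command(kills):
    """Count kills per (command name, jid) pair."""
    return Counter((entry["comm"], entry["jid"]) for entry in kills)


if __name__ == "__main__":
    sample = [
        "Jul 22 21:38:29 7950X3D-ZFS kernel: swap_pager: out of swap space",
        "Jul 22 21:39:11 7950X3D-ZFS kernel: pid 15059 (dot), jid 780, uid 0, "
        "was killed: failed to reclaim memory",
    ]
    for entry in find_kills(sample):
        print(entry["stamp"], entry["pid"], entry["comm"], entry["jid"])
```

This does not by itself tie a kill to a package, since the pid and jid do not appear in the build logs, but the command name (dot, in the quoted lines) narrows down which ports are worth checking.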