After doing a bit more research I stumbled upon this:
Looks like systemd imposes another, smaller limit on the number of
processes that a user can run:
It would have been nice if the 'fork' man page mentioned that this could
be a cause for failure. :(
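
For anyone who wants to check this on their own machine, here is a minimal sketch. It assumes the systemd limit is enforced through the cgroup pids controller and that the per-user slice lives at one of the two paths below (cgroup v1 and cgroup v2 layouts); both paths and both function names are assumptions for illustration, not something taken from this report.

```python
import os
from pathlib import Path


def parse_pids_max(text):
    """Return the limit as an int, or None when the cgroup is unlimited ("max")."""
    value = text.strip()
    return None if value == "max" else int(value)


def user_slice_pids_max(uid=None):
    """Return (path, limit) for the user's slice, or None if no file is readable."""
    uid = os.getuid() if uid is None else uid
    # Assumed locations: cgroup v1 pids hierarchy first, then the v2 unified one.
    candidates = [
        Path(f"/sys/fs/cgroup/pids/user.slice/user-{uid}.slice/pids.max"),
        Path(f"/sys/fs/cgroup/user.slice/user-{uid}.slice/pids.max"),
    ]
    for path in candidates:
        try:
            return str(path), parse_pids_max(path.read_text())
        except OSError:
            continue  # missing or unreadable on this system, try the next one
    return None
```

Calling user_slice_pids_max() returns None when neither file exists, which only means this sketch guessed the wrong location, not that no limit applies.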
Call to fork/clone fails with EAGAIN (before encountering resource limits)
Status in linux package in Ubuntu:
I wrote a test program that forks processes until the fork calls start
to fail. It forks around 12000 processes and then the fork calls
start failing with EAGAIN. According to the fork man page, there are
four conditions that could cause EAGAIN to be returned:
- the RLIMIT_NPROC soft resource limit, which limits the number of processes
and threads for a real user ID, was reached
- the kernel's system-wide limit on the number of processes and threads,
/proc/sys/kernel/threads-max, was reached
- the maximum number of PIDs, /proc/sys/kernel/pid_max, was reached
- the caller is operating under the SCHED_DEADLINE scheduling policy and does
not have the reset-on-fork flag set
On my machine:
- Before running the program, ~250 processes / ~500 threads are running (as
determined by ps)
- RLIMIT_NPROC (soft and hard) is 31616
- threads-max is 63233
- pid_max is 32768
- the program runs with the SCHED_NORMAL scheduling policy (so, not
SCHED_DEADLINE)
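
The values listed above can be collected in one place. This is a rough sketch rather than the attached program; the helper names (read_int, fork_limits) are made up for illustration, and the fallback value 6 for SCHED_DEADLINE is taken from linux/sched.h in case the os module does not export the constant.

```python
import os
import resource

# SCHED_DEADLINE is 6 in <linux/sched.h>; fall back to that if os lacks it.
SCHED_DEADLINE = getattr(os, "SCHED_DEADLINE", 6)


def read_int(path):
    """Read a single integer from a /proc file."""
    with open(path) as f:
        return int(f.read().strip())


def fork_limits():
    """Gather the limits the fork man page names as EAGAIN causes."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    policy = os.sched_getscheduler(0)  # 0 means the calling process
    return {
        "rlimit_nproc_soft": soft,  # resource.RLIM_INFINITY if unlimited
        "rlimit_nproc_hard": hard,
        "threads_max": read_int("/proc/sys/kernel/threads-max"),
        "pid_max": read_int("/proc/sys/kernel/pid_max"),
        "sched_policy": policy,
        "is_sched_deadline": policy == SCHED_DEADLINE,
    }
```

If fork fails well below every number in this dictionary, as described here, the limit being hit is one that this list does not cover.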
It seems strange that the fork calls fail after ~12000 forks (they
should fail at 31616). Some more technical details:
- Reproducible on Ubuntu 16.04.1 running with kernel 4.4.0-36-generic.
- Reproducible when tested with mainline kernel 4.8.0-040800rc6-generic
- Doesn't occur on Ubuntu 12.04 running with kernel 3.2.0-23-generic
- Monitoring thread usage, it appears to fail at exactly the 12,500 thread
- From using strace, it looks like clone is the syscall actually being used
behind the scenes (should have the same EAGAIN error semantics, from the clone
man page)
- From using systemtap and ftrace, it looks like copy_process in _do_fork
returns an error when this case is hit. Maybe from sched_trace? It's hard to
tell - the ftrace output doesn't seem complete.
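
The report does not say how thread usage was monitored. One possible way, sketched here under the assumption that /proc is mounted and readable, is to sum the Threads field of /proc/<pid>/status over all processes owned by a given real user ID. The function names are hypothetical.

```python
import os


def parse_status(text):
    """Extract (real_uid, thread_count) from the text of /proc/<pid>/status."""
    uid = threads = None
    for line in text.splitlines():
        if line.startswith("Uid:"):
            uid = int(line.split()[1])  # first number is the real UID
        elif line.startswith("Threads:"):
            threads = int(line.split()[1])
    return uid, threads


def count_user_threads(uid=None):
    """Sum the thread counts of every process owned by uid (default: caller)."""
    uid = os.getuid() if uid is None else uid
    total = 0
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(f"/proc/{entry}/status") as f:
                owner, threads = parse_status(f.read())
        except OSError:
            continue  # process exited while we were scanning
        if owner == uid and threads is not None:
            total += threads
    return total
```

Sampling count_user_threads() while the fork test runs shows the count at which fork starts failing, which can then be compared against the limits above.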
I'm attaching the test fork program I've been using, which has some
code to also print the aforementioned values.
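
The attachment is not reproduced in this message. As a stand-in, here is a minimal sketch of the same idea: fork idle children until fork fails, record the errno, then clean up. It is not the attached program; the function name and the max_children safety cap are assumptions added for illustration.

```python
import errno
import os
import signal


def fork_until_failure(max_children=100_000):
    """Fork idle children until fork() fails or max_children is reached.

    Returns (children_created, errno_or_None).
    """
    children = []
    failure = None
    try:
        while len(children) < max_children:
            try:
                pid = os.fork()
            except OSError as exc:
                failure = exc.errno  # EAGAIN is the case described here
                break
            if pid == 0:
                # Child: sleep until the parent kills it, and never
                # fall through into the parent's loop or cleanup code.
                try:
                    signal.pause()
                finally:
                    os._exit(0)
            children.append(pid)
    finally:
        # Parent: kill and reap every child so none is left behind.
        for pid in children:
            os.kill(pid, signal.SIGKILL)
        for pid in children:
            os.waitpid(pid, 0)
    return len(children), failure
```

Running it without a cap, for example count, err = fork_until_failure(), and comparing err against errno.EAGAIN should reproduce the behaviour described above on an affected system. Be aware that it briefly exhausts the user's process quota, so other programs owned by the same user may fail to start while it runs.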
DistroRelease: Ubuntu 16.04
Package: linux-image-4.4.0-36-generic 4.4.0-36.55
ProcVersionSignature: Ubuntu 4.4.0-36.55-generic 4.4.16
Uname: Linux 4.4.0-36-generic x86_64
USER PID ACCESS COMMAND
/dev/snd/controlC1: shockwave 3454 F.... pulseaudio
/dev/snd/controlC0: shockwave 3454 F.... pulseaudio
Date: Thu Sep 15 11:07:16 2016
InstallationDate: Installed on 2016-09-12 (3 days ago)
InstallationMedia: Ubuntu 16.04.1 LTS "Xenial Xerus" - Release amd64
lo no wireless extensions.
eno1 no wireless extensions.
MachineType: Dell Inc. Precision T1600
ProcFB: 0 nouveaufb
root=UUID=e15cb9c6-c1a4-4313-9067-340edb6098a1 ro quiet splash vt.handoff=7
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.vendor: Dell Inc.
dmi.board.vendor: Dell Inc.
dmi.chassis.vendor: Dell Inc.
dmi.product.name: Precision T1600
dmi.sys.vendor: Dell Inc.