[Touch-packages] [Bug 1629226] Re: systemd's service killed by cgroup controller pids
Regarding the original report: this is a simple program which keeps the
maximum allowed number of children running. It does not get killed by
cgroups; only the fork() call fails:

---
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define MASTER_SLEEP_NS 100L
#define CHILD_SLEEP_S 5

int main(void)
{
    pid_t pid;
    struct timespec master_sleep = {0, MASTER_SLEEP_NS};

    for (;;) {
        pid = fork();
        if (pid < 0) {
            /* fork() failed, e.g. with EAGAIN at the task limit */
            perror("fork failed:");
            nanosleep(&master_sleep, NULL);
        }
        if (pid == 0) {
            /* child: stay alive for a while, then exit */
            sleep(CHILD_SLEEP_S);
            exit(0);
        }
        nanosleep(&master_sleep, NULL);
        /* collect exited children */
        while (waitpid(-1, NULL, WNOHANG) > 0)
            ;
    }
}
---
[Unit]
Description=Reproducer
After=multi-user.target

[Service]
ExecStart=/home/user/reproducer
Type=simple
TasksMax=512

[Install]
WantedBy=multi-user.target
---
● reproducer.service - Reproducer
   Loaded: loaded (/etc/systemd/system/reproducer.service; disabled; vendor preset: enabled)
   Active: active (running) since Mon 2017-05-22 13:16:50 UTC; 3min 3s ago
 Main PID: 11778 (reproducer)
    Tasks: 512 (limit: 512)
   Memory: 55.4M
      CPU: 1min 2.794s
   CGroup: /system.slice/reproducer.service
           ├─11778 /home/rbalint/reproducer
           ├─18144 /home/rbalint/reproducer
           ...
           ├─26763 /home/rbalint/reproducer
           ├─26764 /home/rbalint/reproducer
           ├─26765 /home/rbalint/reproducer
           └─26766 /home/rbalint/reproducer

May 22 13:20:14 zesty-test reproducer[11778]: fork failed:: Resource temporarily unavailable
May 22 13:20:14 zesty-test reproducer[11778]: fork failed:: Resource temporarily unavailable
May 22 13:20:14 zesty-test reproducer[11778]: fork failed:: Resource temporarily unavailable
---

Bash, on the other hand, kills itself after a few failing forks:

● reproducer.service - Reproducer
   Loaded: loaded (/etc/systemd/system/reproducer.service; disabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Mon 2017-05-22 13:22:38 UTC; 3s ago
  Process: 14281 ExecStart=/home/rbalint/reproducer.sh (code=exited, status=0/SUCCESS)
 Main PID: 14287 (code=exited, status=254)
      CPU: 639ms

May 22 13:22:35 zesty-test reproducer.sh[14281]: /home/rbalint/reproducer.sh: fork: retry: Resource temporarily unavailable
May 22 13:22:35 zesty-test reproducer.sh[14281]: /home/rbalint/reproducer.sh: fork: retry: Resource temporarily unavailable
May 22 13:22:35 zesty-test reproducer.sh[14281]: /home/rbalint/reproducer.sh: fork: retry: Resource temporarily unavailable
May 22 13:22:35 zesty-test reproducer.sh[14281]: /home/rbalint/reproducer.sh: fork: retry: Resource temporarily unavailable
May 22 13:22:35 zesty-test reproducer.sh[14281]: /home/rbalint/reproducer.sh: fork: retry: Resource temporarily unavailable
May 22 13:22:35 zesty-test reproducer.sh[14281]: /home/rbalint/reproducer.sh: fork: retry: Resource temporarily unavailable
May 22 13:22:38 zesty-test reproducer.sh[14281]: /home/rbalint/reproducer.sh: fork: Interrupted system call
May 22 13:22:38 zesty-test systemd[1]: reproducer.service: Main process exited, code=exited, status=254/n/a
May 22 13:22:38 zesty-test systemd[1]: reproducer.service: Unit entered failed state.

http://sources.debian.net/src/bash/4.4-5/jobs.c/?hl=1919#L1919

  /* Create the child, handle severe errors.  Retry on EAGAIN. */
  while ((pid = fork ()) < 0 && errno == EAGAIN && forksleep < FORKSLEEP_MAX)
    {
      /* bash-4.2 */
      sigprocmask (SIG_SETMASK, &oset, (sigset_t *)NULL);
      /* If we can't create any children, try to reap some dead ones. */
      waitchld (-1, 0);

      errno = EAGAIN;           /* restore errno */
      sys_error ("fork: retry");
      RESET_SIGTERM;

      if (sleep (forksleep) != 0)
        break;
      forksleep <<= 1;

      if (interrupt_state)
        break;
      sigprocmask (SIG_SETMASK, &set, (sigset_t *)NULL);
    }
...
  if (pid < 0)
    {
      sys_error ("fork");

      /* Kill all of the processes in the current pipeline. */
      terminate_current_pipeline ();

      /* Discard the current pipeline, if any. */
      if (the_pipeline)
        kill_current_pipeline ();

      last_command_exit_value = EX_NOEXEC;
      throw_to_top_level ();    /* Reset signals, etc. */
    }
...

I believe this is by design and I think this approach is reasonable. A
shell should not try to keep itself alive by forking new processes when
it has already hit system limits a few times. There are other tools
available for implementing servers with worker pools that adapt to
system limits which are not defined in advance.

If you know the number of workers needed in advance, I suggest setting
TasksMax high enough, or to infinity if you don't want to rely on
cgroup fork limits.

** Changed in: bash (Ubuntu)
       Status: In Progress => Invalid

** Summary changed:

- systemd's service killed by cgroup controller pids
+ Bash exits after a few failed fork()-s
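For scripts that know roughly how many workers they need, one way to stay under a task limit is to fork in bounded batches instead of starting everything at once. The following is only a minimal sketch and not something taken from this bug report: `run_in_batches` is a hypothetical helper name, and the caller is assumed to pick a batch size that fits under the unit's TasksMax.

```sh
#!/bin/sh
# Hypothetical helper (not from the bug report): run the given command
# `total` times in the background, but never more than `batch` at once.
# Usage: run_in_batches TOTAL BATCH COMMAND [ARGS...]
run_in_batches() {
    total=$1
    batch=$2
    shift 2
    started=0
    while [ "$started" -lt "$total" ]; do
        in_batch=0
        while [ "$in_batch" -lt "$batch" ] && [ "$started" -lt "$total" ]; do
            "$@" &
            in_batch=$((in_batch + 1))
            started=$((started + 1))
        done
        # Reap the whole batch before forking again. Note that `wait`
        # without arguments waits for all background children of this
        # shell, not only the ones started by this helper.
        wait
    done
}

# Example: four short-lived workers, at most two running at a time.
run_in_batches 4 2 echo "worker started"
```

With a limit such as TasksMax=512, a call like `run_in_batches 2048 256 some_worker` would keep the number of simultaneously forked children well below the limit, at the cost of waiting for the slowest worker in each batch.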
[Touch-packages] [Bug 1629226] Re: systemd's service killed by cgroup controller pids
** Changed in: bash (Ubuntu)
       Status: Triaged => In Progress

--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to bash in Ubuntu.
https://bugs.launchpad.net/bugs/1629226

Title:
  systemd's service killed by cgroup controller pids

Status in The Ubuntu-power-systems project:
  New
Status in bash package in Ubuntu:
  In Progress

Bug description:
  Problem Description
  ===
  I wrote a simple systemd service that forks child processes
  aggressively, but the service quickly failed:

  % sudo systemctl status reproducer.service
  ● reproducer.service - Reproducer of systemd services killed by ips
     Loaded: loaded (/etc/systemd/system/reproducer.service; disabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Fri 2016-03-18 06:58:37 CDT; 2min 43s ago
    Process: 5103 ExecStart=/home/hpt/reproducer/reproducer.sh (code=exited, status=0/SUCCESS)
   Main PID: 5105 (code=exited, status=254)

  Mar 18 06:58:36 pinelp3 reproducer.sh[5103]: /home/hpt/reproducer/reproducer.sh: fork: Resource temporarily unavailable
  Mar 18 06:58:36 pinelp3 reproducer.sh[5103]: /home/hpt/reproducer/reproducer.sh: fork: Resource temporarily unavailable
  Mar 18 06:58:37 pinelp3 reproducer.sh[5103]: /home/hpt/reproducer/reproducer.sh: fork: Resource temporarily unavailable
  Mar 18 06:58:37 pinelp3 reproducer.sh[5103]: /home/hpt/reproducer/reproducer.sh: fork: Resource temporarily unavailable
  Mar 18 06:58:37 pinelp3 reproducer.sh[5103]: /home/hpt/reproducer/reproducer.sh: fork: Resource temporarily unavailable
  Mar 18 06:58:37 pinelp3 reproducer.sh[5103]: /home/hpt/reproducer/reproducer.sh: fork: Resource temporarily unavailable
  Mar 18 06:58:37 pinelp3 systemd[1]: reproducer.service: Main process exited, code=exited, status=254/n/a
  Mar 18 06:58:37 pinelp3 reproducer.sh[5103]: /home/hpt/reproducer/reproducer.sh: fork: Resource temporarily unavailable
  Mar 18 06:58:37 pinelp3 systemd[1]: reproducer.service: Unit entered failed state.
  Mar 18 06:58:37 pinelp3 systemd[1]: reproducer.service: Failed with result 'exit-code'.

  The default task limit of systemd services is 512. It looks like the
  service is terminated by the kernel's pids cgroup controller. I think
  this isn't correct: failing to fork child processes should not cause
  the parent to die.

  % cat /etc/systemd/system/reproducer.service
  [Unit]
  Description=Reproducer of systemd services killed by ips
  After=multi-user.target

  [Service]
  ExecStart=/home/hpt/reproducer/reproducer.sh
  Type=forking

  [Install]
  WantedBy=multi-user.target

  % cat /home/hpt/reproducer/reproducer.sh
  #!/bin/bash

  foo()
  {
      #exec sh -c "echo $1: \$\$;sleep 60"
      echo $1:
      sleep 60
  }

  bar()
  {
      c=1
      while true
      do
          for ((i=1;i<=2048;i++))
          do
              foo $c &
              ((c++))
          done
          wait
          c=1
      done
  }

  # main
  bar &
  disown -a
  exit 0

  ---uname output---
  Linux pinelp3 4.4.0-12-generic #28-Ubuntu SMP Wed Mar 9 00:40:38 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux

  Machine Type = IBM,8408-E8E,lpar

  Steps to Reproduce
  1. Install the simple service from "Problem Description".
  2. sudo systemctl start reproducer.service
  3. Wait 2~3 minutes.

  == Comment: #3 - Vaishnavi Bhat - 2016-03-22 11:21:55 ==
  From the machine,
  root@pinelp3:~# ulimit -a
  core file size          (blocks, -c) 0
  data seg size           (kbytes, -d) unlimited
  scheduling priority             (-e) 0
  file size               (blocks, -f) unlimited
  pending signals                 (-i) 48192
  max locked memory       (kbytes, -l) 64
  max memory size         (kbytes, -m) unlimited
  open files                      (-n) 1024
  pipe size            (512 bytes, -p) 8
  POSIX message queues     (bytes, -q) 819200
  real-time priority              (-r) 0
  stack size              (kbytes, -s) 8192
  cpu time               (seconds, -t) unlimited
  max user processes              (-u) 48192
  virtual memory          (kbytes, -v) unlimited
  file locks                      (-x) unlimited

  root@pinelp3:~# ps aux | wc -l    --> While the service is running
  1084
  root@pinelp3:~# ps aux | wc -l    "
  1084
  root@pinelp3:~# ps aux | wc -l    "
  1084
  root@pinelp3:~# ps aux | wc -l    "
  1084
  root@pinelp3:~# ps aux | wc -l    --> While the service is not running
  572

  root@pinelp3:~# free -m    --> While service is running
          total   used   free   shared   buff/cache   available
  Mem:    12117    628    459       22        11029
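The two `ps aux | wc -l` readings quoted in the report can be compared directly. The short sketch below only redoes that arithmetic with the numbers from the report; the variable names are illustrative, and it assumes that nothing else on the machine changed between the two measurements (the `ps` header line is counted in both readings, so it cancels out).

```sh
#!/bin/sh
# Line counts of `ps aux` as quoted in the report.
running=1084   # while reproducer.service is running
idle=572       # while reproducer.service is not running

# Extra processes present while the service runs.
extra=$((running - idle))
echo "extra processes while running: $extra"
```

The difference is 512, which is consistent with the service sitting at the default task limit of 512 mentioned in the description.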
[Touch-packages] [Bug 1629226] Re: systemd's service killed by cgroup controller pids
** Changed in: bash (Ubuntu)
   Importance: Undecided => Medium

** Changed in: bash (Ubuntu)
       Status: New => Triaged

** Changed in: bash (Ubuntu)
    Milestone: None => ubuntu-17.05

** Changed in: bash (Ubuntu)
     Assignee: Taco Screen team (taco-screen-team) => Balint Reczey (rbalint)
[Touch-packages] [Bug 1629226] Re: systemd's service killed by cgroup controller pids
** Also affects: ubuntu-power-systems
   Importance: Undecided
       Status: New