Hi,

We have during boot received the "Supervising process..." warning from systemd 
related to chronyd.service. This is not always happening, but when it happens 
systemd receives SIGCHLD from grand-parent (22955) before the parent (22956). 
See logs below:


Oct 25 10:34:55.104980 axis-accc8ed1c728 systemd[22955]: chronyd.service: 
Executing: /usr/sbin/chronyd -u chronyd -f /run/chrony/chronyd.conf
Oct 25 10:34:55.117999 axis-accc8ed1c728 chronyd[22957]: chronyd version 4.2 
starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP -SCFILTER -SIGND +ASYNCDNS +NTS 
-SECHASH +IPV6 +DEBUG)
Oct 25 10:34:55.120781 axis-accc8ed1c728 chronyd[22957]: Frequency -8.172 +/- 
1.366 ppm read from /var/lib/chrony/drift
Oct 25 10:34:55.124304 axis-accc8ed1c728 systemd[1]: systemd-journald.service: 
Received EPOLLHUP on stored fd 82 (stored), closing.
Oct 25 10:34:55.126460 axis-accc8ed1c728 systemd[1]: Received SIGCHLD from PID 
22955 (chronyd).
Oct 25 10:34:55.126708 axis-accc8ed1c728 systemd[1]: Child 22955 (chronyd) died 
(code=exited, status=0/SUCCESS)
Oct 25 10:34:55.126920 axis-accc8ed1c728 systemd[1]: chronyd.service: Child 
22955 belongs to chronyd.service.
Oct 25 10:34:55.127000 axis-accc8ed1c728 systemd[1]: chronyd.service: Control 
process exited, code=exited, status=0/SUCCESS (success)
Oct 25 10:34:55.127027 axis-accc8ed1c728 systemd[1]: chronyd.service: Got final 
SIGCHLD for state start.
Oct 25 10:34:55.127160 axis-accc8ed1c728 systemd[1]: chronyd.service: 
Potentially unsafe symlink chain, will now retry with relaxed checks: 
/run/chrony/chronyd.pid
Oct 25 10:34:55.127571 axis-accc8ed1c728 systemd[1]: chronyd.service: New main 
PID 22957 belongs to service, we are happy.
Oct 25 10:34:55.127598 axis-accc8ed1c728 systemd[1]: chronyd.service: Main PID 
loaded: 22957
Oct 25 10:34:55.127759 axis-accc8ed1c728 systemd[1]: Custom log in 
process-util.c fnc pid_is_my_child(): pid: 22957, ppid: 22956, cached_pid: 1.
Oct 25 10:34:55.127785 axis-accc8ed1c728 systemd[1]: chronyd.service: 
Supervising process 22957 which is not our child. We'll most likely not notice 
when it exits.
Oct 25 10:34:55.127964 axis-accc8ed1c728 systemd[1]: chronyd.service: Changed 
start -> running
Oct 25 10:34:55.128006 axis-accc8ed1c728 systemd[1]: chronyd.service: Job 
117032 chronyd.service/start finished, result=done
Oct 25 10:34:55.128053 axis-accc8ed1c728 systemd[1]: Started NTP client/server.
...
Oct 25 10:34:55.158173 axis-accc8ed1c728 systemd[1]: Received SIGCHLD from PID 
22956 (chronyd).
Oct 25 10:34:55.158436 axis-accc8ed1c728 systemd[1]: Child 22956 (chronyd) died 
(code=exited, status=0/SUCCESS)
Oct 25 10:34:55.158679 axis-accc8ed1c728 systemd[1]: chronyd.service: Child 
22956 belongs to chronyd.service.


The chronyd does two forks. In the normal case the parent will die first and 
then the grand-parent will die. This behavior is according to the SysV Daemons 
implementation https://www.freedesktop.org/software/systemd/man/daemon.html 
However, it seems scheduling for parent and grand-parent can vary and result in 
a different behavior.


When systemd receives the SIGCHLD from the grand-parent (22955) it start to run 
service_load_pid_file() and set the new main PID, which also later triggers the 
warning "Supervising process...", but shouldn't systemd wait for the parent 
(22956) to the main PID to die before considering the service as started? Now 
chronyd is started before the parent (22956) has died, indicated by the log 
entry "Started NTP client/server." above. In 
https://www.freedesktop.org/software/systemd/man/systemd.service.html it is 
only mentioning that when the parent process exits then the unit is started. It 
doesn't mention grand-parent when type=forking and the service binary is doing 
double fork().


Testing with systemd 251.3.


Best Regards,

Christopher Wong

Reply via email to