Hi folks,

I am seeing anomalous behavior when running a dhclient process inside a Docker container on a vanilla Ubuntu 16.04 host. The service enters the "deactivating" state and stays stuck there forever. I have attached a minimal reproduction of the issue to this mail.

Working logic:
- There is a sample trial@.service unit which invokes the `trial` binary with the option passed to the systemd service via the @ option.
- The valid options are sleep and dhclient_<interface_name>.
- Based on the input, the binary invokes either a long-lived sleep process or a dhclient process on the given interface_name.
- The binary then spawns the `kill_trial.sh` script. The script sleeps for 20 seconds and then kills the parent `trial` binary. The kill signal is SIGKILL in the trial example. In the real world, this could be a SIGSEGV indicating a crash in the parent process.
- If the trial binary was started for the sleep process, things work fine and the service goes into the "failed" state as expected.
- However, in the dhclient case, the service is stuck in the "deactivating" state if the underlying host OS is Ubuntu 16.04. This works well if the host is running Ubuntu 20.04.
- We have set TimeoutStopSec to infinity because, in real-world deployments, core collection after a crash takes a varying amount of time depending on the memory configuration of the host.

Steps to reproduce:

# tar -xf minimal_repro.tar -C minimal_repro/
# cd minimal_repro/
# docker build -t trial .
# docker rm -f trial
# docker run -it -d --net=host --privileged -v /sys/fs/cgroup:/sys/fs/cgroup:ro --name trial trial
# docker exec -it trial bash
# systemctl start trial@dhclient_eth1.service
# #Keep monitoring trial@dhclient_eth1.service -- issue should be seen within 20-30 seconds on Ubuntu 16.04 host
# systemctl status trial@dhclient_eth1.service
● trial@dhclient_eth1.service - Trial
   Loaded: loaded (/etc/systemd/system/trial@.service; static; vendor preset: enabled)
   Active: deactivating (stop-sigterm) (Result: signal) since Mon 2021-06-07 13:19:12 UTC; 1min 11s ago
  Process: 55 ExecStartPre=/bin/bash /etc/systemd/system/trial_service_script.sh pre_start dhclient_eth1 (code=exited, status=0/SUCCESS)
  Process: 56 ExecStart=/bin/bash /etc/systemd/system/trial_service_script.sh start dhclient_eth1 (code=killed, signal=KILL)
 Main PID: 56 (code=killed, signal=KILL)
    Tasks: 0 (limit: 38590)
   Memory: 588.0K
   CGroup: /docker/903fca0cee1387b7c2113a36ee5efdb3a25edd1e60584fe5da5d0c5b5ffd8241/system.slice/system-trial.slice/trial@dhclient_eth1.service
# #NOTE: `Active: deactivating` -- in stuck state
# #Running `systemctl daemon-reload` forces the service to go to failed state

# systemctl start trial@sleep.service
# #Keep monitoring trial@sleep.service -- would be killed in 20-30 seconds and goes into failed state as expected
# systemctl status trial@sleep.service
● trial@sleep.service - Trial
   Loaded: loaded (/etc/systemd/system/trial@.service; static; vendor preset: enabled)
   Active: failed (Result: signal) since Mon 2021-06-07 13:38:19 UTC; 21s ago
  Process: 113 ExecStartPre=/bin/bash /etc/systemd/system/trial_service_script.sh pre_start sleep (code=exited, status=0/SUCCESS)
  Process: 114 ExecStart=/bin/bash /etc/systemd/system/trial_service_script.sh start sleep (code=killed, signal=KILL)
  Process: 129 ExecStopPost=/bin/bash /etc/systemd/system/trial_service_script.sh post_stop sleep (code=exited, status=0/SUCCESS)
 Main PID: 114 (code=killed, signal=KILL)

Please advise on what can help us alleviate the issue.

Thanks,
Aravindhan

Regards,
Aravindhan Krishnan...
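
P.S. For readers who cannot open the attachment, the flow described under "Working logic" can be sketched roughly as below. This is only an illustration and not the contents of minimal_repro.tar: the function names (child_command, kill_after_delay, run_trial) are made up for this sketch, and the exact sleep and dhclient command lines are assumptions.

```shell
#!/bin/bash
# Hypothetical sketch of the reproducer flow; NOT the attached code.
# Function names and the exact child command lines are assumptions.

# Map the instance option (the part after "@") to the child command line.
child_command() {
    case "$1" in
        sleep)
            # Assumed long-lived sleep; relies on GNU sleep accepting "infinity".
            echo "sleep infinity" ;;
        dhclient_?*)
            # Strip the "dhclient_" prefix to obtain the interface name.
            echo "dhclient ${1#dhclient_}" ;;
        *)
            echo "unsupported option: $1" >&2
            return 1 ;;
    esac
}

# Stand-in for kill_trial.sh: wait 20 seconds, then SIGKILL the given PID.
kill_after_delay() {
    ( sleep 20; kill -KILL "$1" ) &
}

# Stand-in for the `trial` binary: start the child, arm the killer, then block.
run_trial() {
    local cmd
    cmd=$(child_command "$1") || return 1
    $cmd &                   # sleep or dhclient, started in the background
    kill_after_delay "$$"    # $$ is the PID of this script, i.e. the parent
    wait
}
```

Calling `run_trial dhclient_eth1` at the end of such a script would start dhclient on eth1 and have the script killed with SIGKILL about 20 seconds later, which is the situation in which the unit gets stuck in "deactivating" on the 16.04 host.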
minimal_repro.tar
Description: Unix tar archive
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel