Re: [systemd-devel] Zombie process still exists after stopping gdm.service

2015-04-22 Thread Lennart Poettering
On Tue, 21.04.15 13:25, Daniel Drake (dr...@endlessm.com) wrote:

 There's a comment in unit_kill_context() which looks relevant here:
 
 /* FIXME: For now, we will not wait for the
  * cgroup members to die, simply because
  * cgroup notification is unreliable. It
  * doesn't work at all in containers, and
  * outside of containers it can be confused
  * easily by leaving directories in the
  * cgroup. */
 
 /* wait_for_exit = true; */

This is indeed the key to the issue.

As soon as we move to using the new sane behaviour kernel cgroup API
we can fix this properly, and wait for the children correctly. As long
as that is not the case, though, we send SIGKILL immediately after the
SIGTERM...

I am a bit unwilling to document the precise behaviour, since the
current behaviour is really just a stop-gap until we have ported things
over to the new kernel API and this works as intended.

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Zombie process still exists after stopping gdm.service

2015-04-21 Thread Daniel Drake
On Mon, Apr 20, 2015 at 6:29 PM, Lennart Poettering
lenn...@poettering.net wrote:
 Sure, we don't want to keep track of which processes we already
 killed, to distinguish them from the processes newly created in the
 time between our sending of SIGTERM and receiving SIGCHLD for the main
 process.

 We assume that if we get SIGCHLD for the main process that the daemon
 is down, and everything that is left over then is auxiliary stuff we
 can kill.

OK, doesn't sound unreasonable. Once we get to the end of this topic,
I'll submit a documentation patch to make that a bit clearer.

So, of the 3 signals (TERM, TERM, KILL) sent to gdm-simple-slave
within a total time of 0.01s, we have good explanations for the first
2.

The 3rd one (KILL) is still suspicious to me though. It is sent 0.4ms
after the preceding SIGTERM. Here is what happens in the code:

1. gdm's main process exits due to the first SIGTERM. systemd becomes
aware in service_sigchld_event(), and responds as follows:

    case SERVICE_STOP_SIGTERM:
    case SERVICE_STOP_SIGKILL:
            if (!control_pid_good(s))
                    service_enter_stop_post(s, f);

2. Inside service_enter_stop_post(), there is no command to execute, so we call:
    service_enter_signal(s, SERVICE_FINAL_SIGTERM, SERVICE_SUCCESS);

3. service_enter_signal calls unit_kill_context() to send the second
SIGTERM. Looking at what happens inside unit_kill_context(): there is
no main process, nor control process, so we go straight to the cgroup
killing. The cgroup kill happens without error, and we reach the end
of the function:

return wait_for_exit;

wait_for_exit was not modified from its initial value (false) during
the course of the function, so false is returned here.

4. Back in service_enter_signal, since unit_kill_context returned
false, we do not arm the timer. Without hesitation systemd goes
directly and sends SIGKILL.

    } else if (state == SERVICE_FINAL_SIGTERM)
            service_enter_signal(s, SERVICE_FINAL_SIGKILL, SERVICE_SUCCESS);


I can understand that once the main PID goes away, systemd feels
welcome to get heavy handed with the remaining processes. But doing
SIGTERM and then immediately SIGKILL just a fraction of a millisecond later
seems strange - why not go straight for the SIGKILL?

There's a comment in unit_kill_context() which looks relevant here:

/* FIXME: For now, we will not wait for the
 * cgroup members to die, simply because
 * cgroup notification is unreliable. It
 * doesn't work at all in containers, and
 * outside of containers it can be confused
 * easily by leaving directories in the
 * cgroup. */

/* wait_for_exit = true; */

If that were uncommented, the above behaviour would be different.
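
The control flow in steps 3 and 4 can be modelled in a few lines of C.
This is a toy model under stated assumptions, not systemd's code:
model_kill_context() and model_enter_signal() are hypothetical
stand-ins for unit_kill_context() and service_enter_signal(), and the
wait_for_exit parameter plays the role of the commented-out assignment.

```c
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the service states named in this thread. */
enum model_state {
        MODEL_FINAL_SIGTERM,
        MODEL_FINAL_SIGKILL,
        MODEL_WAITING,  /* timer armed, waiting for processes to exit */
        MODEL_DEAD
};

struct model_trace {
        int signals[4];    /* signals "sent" to the cgroup, in order */
        size_t n_signals;
        bool timer_armed;
};

/* Hypothetical stand-in for unit_kill_context(): record the signal and
 * report whether the caller should wait for the processes to exit. */
static bool model_kill_context(struct model_trace *t, int sig, bool wait_for_exit) {
        if (t->n_signals < 4)
                t->signals[t->n_signals++] = sig;
        return wait_for_exit;
}

/* Hypothetical stand-in for service_enter_signal(): if there is nothing
 * to wait for, fall straight through to the next, harsher state. */
enum model_state model_enter_signal(struct model_trace *t, enum model_state state,
                                    bool wait_for_exit) {
        int sig = state == MODEL_FINAL_SIGTERM ? SIGTERM : SIGKILL;

        if (model_kill_context(t, sig, wait_for_exit)) {
                t->timer_armed = true;
                return MODEL_WAITING;
        }
        if (state == MODEL_FINAL_SIGTERM)
                return model_enter_signal(t, MODEL_FINAL_SIGKILL, wait_for_exit);
        return MODEL_DEAD;
}
```

In this model, entering MODEL_FINAL_SIGTERM with wait_for_exit == false
records SIGTERM and SIGKILL back to back without arming the timer, which
matches what I see. With wait_for_exit == true only the SIGTERM is
recorded and the timer is armed.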

Daniel


Re: [systemd-devel] Zombie process still exists after stopping gdm.service

2015-04-20 Thread Lennart Poettering
On Mon, 20.04.15 13:16, Daniel Drake (dr...@endlessm.com) wrote:

 On Mon, Apr 20, 2015 at 9:04 AM, Lennart Poettering
 lenn...@poettering.net wrote:
  maybe the main gdm process is not the one waiting, but a worker
  process is, and the main process kills the worker process without the
  worker process handling that nicely?
 
 Not really. I removed all the process-killing code from gdm and the
 problem is still there.
 
 I have stepped through and I think that systemd is being too
 aggressive. Still running with the default KillMode=cgroup, here is
 what happens:
 
 1. service_enter_stop() is entered which calls:
 service_enter_signal(s, SERVICE_STOP_SIGTERM, 
 SERVICE_SUCCESS);
 
 2. service_enter_signal sends SIGTERM to all gdm processes.

No, if you use KillMode=mixed (as you say you do) it will only send
SIGTERM to the main process of gdm.

 3. gdm simple-slave's signal handler triggers, which causes the
 mainloop to exit, and it starts to kill and wait for the X server
 death. I'm not exactly sure why, but quitting the glib mainloop also
 causes the signal handler to be destroyed, so sigaction() is called
 here to return SIGTERM to its default behaviour.
 
 4. Moments later we arrive in systemd's service_sigchld_event(),
 presumably because the main gdm process exited due to SIGTERM.
 s->main_pid == pid.

If PID 1 gets the SIGCHLD for the main process then it assumes the
service has finished correctly, and will kill the rest that might remain.

 7. To make things even worse, after sending the SIGTERMs,
 service_enter_signal hits:
 } else if (state == SERVICE_FINAL_SIGTERM)
 service_enter_signal(s, SERVICE_FINAL_SIGKILL,
 SERVICE_SUCCESS);

Hmm? If we managed to kill something we'll arm the timeout and wait
for SIGCHLD or cgroup empty or similar.

These shortcuts only take place if we couldn't kill anything because
there was nothing. And hence the second killing will have no effect
either, but at least we go through the state engine...

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] Zombie process still exists after stopping gdm.service

2015-04-20 Thread Marcos Mello
Daniel Drake drake at endlessm.com writes:

 
 So, moments after sending 2 SIGTERMs, SIGKILL is sent to all gdm
 processes. There does not seem to be any consideration of giving the
 process some time to respond to SIGTERMs, nor the fact that I have
 hacked gdm.service to have SendSIGKILL=no as an experiment.
 

I noticed that too with SendSIGKILL=no.

http://lists.freedesktop.org/archives/systemd-devel/2015-March/029933.html
http://lists.freedesktop.org/archives/systemd-devel/2015-April/030196.html

Squid is not a good example of how a daemon should behave though.

--
Marcos



Re: [systemd-devel] Zombie process still exists after stopping gdm.service

2015-04-20 Thread Lennart Poettering
On Mon, 20.04.15 18:13, Daniel Drake (dr...@endlessm.com) wrote:

  3. gdm simple-slave's signal handler triggers, which causes the
  mainloop to exit, and it starts to kill and wait for the X server
  death. I'm not exactly sure why, but quitting the glib mainloop also
  causes the signal handler to be destroyed, so sigaction() is called
  here to return SIGTERM to its default behaviour.
 
  4. Moments later we arrive in systemd's service_sigchld_event(),
  presumably because the main gdm process exited due to SIGTERM.
  s->main_pid == pid.
 
  If PID 1 gets the SIGCHLD for the main process then it assumes the
  service has finished correctly, and will kill the rest that might remain.
 
 Even if we already killed the rest just a few milliseconds ago (in
 #2)?

Sure, we don't want to keep track of which processes we already
killed, to distinguish them from the processes newly created in the
time between our sending of SIGTERM and receiving SIGCHLD for the main
process.

We assume that if we get SIGCHLD for the main process that the daemon
is down, and everything that is left over then is auxiliary stuff we
can kill. 

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] Zombie process still exists after stopping gdm.service

2015-04-20 Thread Daniel Drake
On Mon, Apr 20, 2015 at 6:04 PM, Lennart Poettering
lenn...@poettering.net wrote:
 I have stepped through and I think that systemd is being too
 aggressive. Still running with the default KillMode=cgroup, here is
 what happens:

 1. service_enter_stop() is entered which calls:
 service_enter_signal(s, SERVICE_STOP_SIGTERM, 
 SERVICE_SUCCESS);

 2. service_enter_signal sends SIGTERM to all gdm processes.

 No, if you use KillMode=mixed (as you say you do) it will only send
 SIGTERM to the main process of gdm.

Only bleeding edge gdm has KillMode=mixed. I'm using a slightly older
version which has the default KillMode=cgroup. Sorry for the
confusion.

 3. gdm simple-slave's signal handler triggers, which causes the
 mainloop to exit, and it starts to kill and wait for the X server
 death. I'm not exactly sure why, but quitting the glib mainloop also
 causes the signal handler to be destroyed, so sigaction() is called
 here to return SIGTERM to its default behaviour.

 4. Moments later we arrive in systemd's service_sigchld_event(),
 presumably because the main gdm process exited due to SIGTERM.
 s->main_pid == pid.

 If PID 1 gets the SIGCHLD for the main process then it assumes the
 service has finished correctly, and will kill the rest that might remain.

Even if we already killed the rest just a few milliseconds ago (in #2)?

 7. To make things even worse, after sending the SIGTERMs,
 service_enter_signal hits:
 } else if (state == SERVICE_FINAL_SIGTERM)
 service_enter_signal(s, SERVICE_FINAL_SIGKILL,
 SERVICE_SUCCESS);

 Hmm? if we managed to kill something we'll arm the timeout and wait
 for sigchld or cgroup empty or similar.

 These shortcuts only take place if we couldn't kill anything because
 there was nothing. And hence the second killing will have no effect
 either, but at least we go through the state engine...

I added logging to sys_kill at the kernel level, and I definitely
observe systemctl stop gdm causing PID 1 to kill gdm-simple-slave 3
times (TERM, TERM, KILL) within the space of a few milliseconds.
I will look closer tomorrow to explain in more detail what is going on
at the code level.

Thanks for your help!
Daniel


Re: [systemd-devel] Zombie process still exists after stopping gdm.service

2015-04-20 Thread Lennart Poettering
On Sun, 19.04.15 09:34, Andrei Borzenkov (arvidj...@gmail.com) wrote:

 On Fri, 17 Apr 2015 14:04:18 -0600,
 Daniel Drake dr...@endlessm.com wrote:
 
  Hi,
  
  I'm investigating why systemctl stop gdm; Xorg usually fails. The
  new X process complains that X is still running.
  
  Here's what I think is happening:
  
  1. systemd sends SIGTERM to gdm to stop the service
  
  2. gdm exits - it has a simple SIGTERM handler which just quits the
  mainloop without doing any cleanup (as far as I can see, it doesn't
  make any attempt to kill the child X server)
  
  3. X exits because of PR_SET_PDEATHSIG (i.e. it's set to be
  automatically killed when the parent goes away). The killed process
  enters defunct state and is reparented to PID 1, presumably also
  moving it out of the gdm cgroup.
  
 
 No, it remains in cgroup. Otherwise systemd service management would
 not be possible at all ...
 
  4. systemd notes that gdm's cgroup is empty and decides that gdm is
  now successfully stopped.
  
 
 I looked at display-manager.service here and it sets KillMode=process.
 That is a better explanation for your observation.

Hmm, it does? It does not on Fedora. Also display-manager.service is
just an alias to gdm.service on Fedora.

Daniel, can you check with systemctl cat gdm what your distro
configures there?

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] Zombie process still exists after stopping gdm.service

2015-04-20 Thread Lennart Poettering
On Fri, 17.04.15 14:04, Daniel Drake (dr...@endlessm.com) wrote:

 I'm investigating why systemctl stop gdm; Xorg usually fails. The
 new X process complains that X is still running.

Have you checked what precisely fails? What's the error message you
are getting? What is the actual failing routine?

 Here's what I think is happening:
 
 1. systemd sends SIGTERM to gdm to stop the service
 
 2. gdm exits - it has a simple SIGTERM handler which just quits the
 mainloop without doing any cleanup (as far as I can see, it doesn't
 make any attempt to kill the child X server)
 
 3. X exits because of PR_SET_PDEATHSIG (i.e. it's set to be
 automatically killed when the parent goes away). The killed process
 enters defunct state and is reparented to PID 1, presumably also
 moving it out of the gdm cgroup.

zombie processes indeed do not belong to any cgroup anymore. 

 4. systemd notes that gdm's cgroup is empty and decides that gdm is
 now successfully stopped.

Note that SIGCHLD is processed with higher priority than cgroup empty
events by systemd. This means that if both are queued, SIGCHLD and
reaping of the PIDs should always happen first.

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] Zombie process still exists after stopping gdm.service

2015-04-20 Thread Daniel Drake
On Mon, Apr 20, 2015 at 8:24 AM, Lennart Poettering
lenn...@poettering.net wrote:
 On Sun, 19.04.15 09:34, Andrei Borzenkov (arvidj...@gmail.com) wrote:

 On Fri, 17 Apr 2015 14:04:18 -0600,
 Daniel Drake dr...@endlessm.com wrote:

  Hi,
 
  I'm investigating why systemctl stop gdm; Xorg usually fails. The
  new X process complains that X is still running.
 
  Here's what I think is happening:
 
  1. systemd sends SIGTERM to gdm to stop the service
 
  2. gdm exits - it has a simple SIGTERM handler which just quits the
  mainloop without doing any cleanup (as far as I can see, it doesn't
  make any attempt to kill the child X server)
 
  3. X exits because of PR_SET_PDEATHSIG (i.e. it's set to be
  automatically killed when the parent goes away). The killed process
  enters defunct state and is reparented to PID 1, presumably also
  moving it out of the gdm cgroup.
 

 No, it remains in cgroup. Otherwise systemd service management would
 not be possible at all ...

  4. systemd notes that gdm's cgroup is empty and decides that gdm is
  now successfully stopped.
 

 I looked at display-manager.service here and it sets KillMode=process.
 That is a better explanation for your observation.

 Hmm, it does? It does not on Fedora. Also display-manager.service is
 just an alias to gdm.service on Fedora.

 Daniel, can you check with systemctl cat gdm what your distro
 configures there?

gdm git does have KillMode=mixed, but the slightly old gdm I'm running
here also does not have any KillMode assignment.

I'm investigating further at the moment. I've found a mistake in what
I wrote earlier - when gdm receives SIGTERM it *does* do a
kill/waitpid() on the child X server.
However the process seems to disappear before waitpid() returns -
currently trying to understand why. Ideas welcome.

Thanks for the help.
Daniel


Re: [systemd-devel] Zombie process still exists after stopping gdm.service

2015-04-20 Thread Lennart Poettering
On Mon, 20.04.15 08:54, Daniel Drake (dr...@endlessm.com) wrote:

 gdm git does have KillMode=mixed, but the slightly old gdm I'm running
 here also does not have any KillMode assignment.

KillMode=mixed means that systemd will SIGKILL all cgroup member
processes before stop returns.

 
 I'm investigating further at the moment. I've found a mistake in what
 I wrote earlier - when gdm receives SIGTERM it *does* do a
 kill/waitpid() on the child X server.
 However the process seems to disappear before waitpid() returns -
 currently trying to understand why. Ideas welcome.

maybe the main gdm process is not the one waiting, but a worker
process is, and the main process kills the worker process without the
worker process handling that nicely?

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] Zombie process still exists after stopping gdm.service

2015-04-20 Thread Daniel Drake
On Mon, Apr 20, 2015 at 9:04 AM, Lennart Poettering
lenn...@poettering.net wrote:
 maybe the main gdm process is not the one waiting, but a worker
 process is, and the main process kills the worker process without the
 worker process handling that nicely?

Not really. I removed all the process-killing code from gdm and the
problem is still there.

I have stepped through and I think that systemd is being too
aggressive. Still running with the default KillMode=cgroup, here is
what happens:

1. service_enter_stop() is entered which calls:
service_enter_signal(s, SERVICE_STOP_SIGTERM, SERVICE_SUCCESS);

2. service_enter_signal sends SIGTERM to all gdm processes.

3. gdm simple-slave's signal handler triggers, which causes the
mainloop to exit, and it starts to kill and wait for the X server
death. I'm not exactly sure why, but quitting the glib mainloop also
causes the signal handler to be destroyed, so sigaction() is called
here to return SIGTERM to its default behaviour.

4. Moments later we arrive in systemd's service_sigchld_event(),
presumably because the main gdm process exited due to SIGTERM.
s->main_pid == pid. We respond as follows:

    case SERVICE_STOP_SIGTERM:
    case SERVICE_STOP_SIGKILL:
            if (!control_pid_good(s))
                    service_enter_stop_post(s, f);

5. Inside service_enter_stop_post(), there is no command to execute, so we call:
    service_enter_signal(s, SERVICE_FINAL_SIGTERM, SERVICE_SUCCESS);

6. service_enter_signal causes all remaining gdm processes to receive
SIGTERM again, only moments after the previous one. As gdm
simple-slave now has the default SIGTERM handler (instant death), it
dies, before it has finished the X server cleanup :(

7. To make things even worse, after sending the SIGTERMs,
service_enter_signal hits:
    } else if (state == SERVICE_FINAL_SIGTERM)
            service_enter_signal(s, SERVICE_FINAL_SIGKILL, SERVICE_SUCCESS);

So, moments after sending 2 SIGTERMs, SIGKILL is sent to all gdm
processes. There does not seem to be any consideration of giving the
process some time to respond to SIGTERMs, nor the fact that I have
hacked gdm.service to have SendSIGKILL=no as an experiment.
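
The effect in steps 3 and 6 can be reproduced outside of gdm. Here is a
minimal C sketch, assuming that a handler which is removed after its
first invocation behaves like the glib case described above. It uses
SA_RESETHAND to get that effect, which is not necessarily what glib
does, and all function names are hypothetical. The child pretends to
clean up until it has handled two SIGTERMs; the parent sends two
SIGTERMs and reports how the child ended.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int notify_fd = -1;
static volatile sig_atomic_t term_count = 0;

/* Counts SIGTERMs and tells the parent that one was handled. */
static void on_term(int sig) {
        ssize_t r;
        (void) sig;
        term_count++;
        r = write(notify_fd, "t", 1);
        (void) r;
}

/* Child: "clean up" until two SIGTERMs have been handled, then exit. */
static void cleanup_child(int fd, bool reset_handler) {
        struct sigaction sa;
        sigset_t block, wait_mask;
        ssize_t r;

        notify_fd = fd;
        /* Keep SIGTERM blocked except inside sigsuspend() to avoid races. */
        sigemptyset(&block);
        sigaddset(&block, SIGTERM);
        sigprocmask(SIG_BLOCK, &block, &wait_mask);
        sigdelset(&wait_mask, SIGTERM);

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_term;
        sigemptyset(&sa.sa_mask);
        /* SA_RESETHAND restores the default action after the first signal. */
        sa.sa_flags = reset_handler ? SA_RESETHAND : 0;
        sigaction(SIGTERM, &sa, NULL);

        r = write(fd, "r", 1); /* tell the parent the handler is installed */
        (void) r;
        while (term_count < 2)
                sigsuspend(&wait_mask);
        _exit(0);
}

/* Returns 1 if the second SIGTERM killed the child, 0 if the child
 * exited on its own, and -1 on a setup error. */
int second_sigterm_is_fatal(bool reset_handler) {
        int fds[2], status = 0;
        char b;
        pid_t pid;

        if (pipe(fds) < 0)
                return -1;
        pid = fork();
        if (pid < 0)
                return -1;
        if (pid == 0) {
                close(fds[0]);
                cleanup_child(fds[1], reset_handler); /* does not return */
        }
        close(fds[1]);

        /* Send each SIGTERM only after the child acknowledged the last step. */
        if (read(fds[0], &b, 1) != 1 || kill(pid, SIGTERM) < 0 ||
            read(fds[0], &b, 1) != 1 || kill(pid, SIGTERM) < 0) {
                kill(pid, SIGKILL);
                waitpid(pid, NULL, 0);
                close(fds[0]);
                return -1;
        }
        if (waitpid(pid, &status, 0) != pid) {
                close(fds[0]);
                return -1;
        }
        close(fds[0]);
        return WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM;
}
```

With reset_handler set to true the second SIGTERM hits the default
action and the child should die mid-cleanup. With false the handler
stays installed and the child should get to exit by itself.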

Daniel


Re: [systemd-devel] Zombie process still exists after stopping gdm.service

2015-04-19 Thread Andrei Borzenkov
On Fri, 17 Apr 2015 14:04:18 -0600,
Daniel Drake dr...@endlessm.com wrote:

 Hi,
 
 I'm investigating why systemctl stop gdm; Xorg usually fails. The
 new X process complains that X is still running.
 
 Here's what I think is happening:
 
 1. systemd sends SIGTERM to gdm to stop the service
 
 2. gdm exits - it has a simple SIGTERM handler which just quits the
 mainloop without doing any cleanup (as far as I can see, it doesn't
 make any attempt to kill the child X server)
 
 3. X exits because of PR_SET_PDEATHSIG (i.e. it's set to be
 automatically killed when the parent goes away). The killed process
 enters defunct state and is reparented to PID 1, presumably also
 moving it out of the gdm cgroup.
 

No, it remains in cgroup. Otherwise systemd service management would
not be possible at all ...

 4. systemd notes that gdm's cgroup is empty and decides that gdm is
 now successfully stopped.
 

I looked at display-manager.service here and it sets KillMode=process.
That is a better explanation for your observation.

 5. systemctl returns and now Xorg is launched immediately. Xorg reads
 the PID of the old Xorg process from /tmp, and notices that that PID
 is still in use (it is still an unreaped zombie) because kill()
 doesn't return an error. Xorg aborts thinking that it is already
 running.
 
 6. Moments later, systemd reaps the zombie. Oops, too late.
 
 
 Does that make sense?
 I wonder how it is best to fix this. Is it a bug that systemd decided
 that gdm.service had stopped before it had reaped zombie processes
 that originally belonged to gdm?
 
 Is it a gdm bug that killing gdm doesn't make any attempt to reap X
 before going away itself? (they chose PR_SET_PDEATHSIG to do something
 similar, but maybe we have to argue that it is not quite sufficient)
 
 Thanks
 Daniel


[systemd-devel] Zombie process still exists after stopping gdm.service

2015-04-17 Thread Daniel Drake
Hi,

I'm investigating why systemctl stop gdm; Xorg usually fails. The
new X process complains that X is still running.

Here's what I think is happening:

1. systemd sends SIGTERM to gdm to stop the service

2. gdm exits - it has a simple SIGTERM handler which just quits the
mainloop without doing any cleanup (as far as I can see, it doesn't
make any attempt to kill the child X server)

3. X exits because of PR_SET_PDEATHSIG (i.e. it's set to be
automatically killed when the parent goes away). The killed process
enters defunct state and is reparented to PID 1, presumably also
moving it out of the gdm cgroup.

4. systemd notes that gdm's cgroup is empty and decides that gdm is
now successfully stopped.

5. systemctl returns and now Xorg is launched immediately. Xorg reads
the PID of the old Xorg process from /tmp, and notices that that PID
is still in use (it is still an unreaped zombie) because kill()
doesn't return an error. Xorg aborts thinking that it is already
running.

6. Moments later, systemd reaps the zombie. Oops, too late.
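
Step 5 can be reproduced in isolation. The following is a minimal C
sketch, not Xorg's actual lock-file code: it assumes the check boils
down to kill(pid, 0), and the names pid_in_use() and
zombie_blocks_pid_until_reaped() are made up for illustration. It forks
a child that exits at once, waits until the child is a zombie without
reaping it, and probes the PID before and after waitpid().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the kind of "is this PID in use?" probe a
 * lock-file check might perform. Signal 0 delivers nothing, it only
 * tests whether the PID exists. */
static int pid_in_use(pid_t pid) {
        if (kill(pid, 0) == 0)
                return 1;
        return errno == EPERM; /* exists, but owned by someone else */
}

/* Returns 1 if the exited child still looked "in use" while it was an
 * unreaped zombie and no longer did after being reaped, 0 if not, and
 * -1 on a setup error. */
int zombie_blocks_pid_until_reaped(void) {
        siginfo_t info;
        int as_zombie, after_reap;
        pid_t pid = fork();

        if (pid < 0)
                return -1;
        if (pid == 0)
                _exit(0); /* child: exit right away */

        /* Block until the child has exited, but leave it unreaped. */
        if (waitid(P_PID, (id_t) pid, &info, WEXITED | WNOWAIT) < 0)
                return -1;
        as_zombie = pid_in_use(pid);

        /* Reap the zombie; only now does the PID go away. */
        if (waitpid(pid, NULL, 0) != pid)
                return -1;
        after_reap = pid_in_use(pid);

        return as_zombie && !after_reap;
}
```

On Linux I would expect zombie_blocks_pid_until_reaped() to return 1:
the PID keeps answering kill() for as long as nobody has reaped the
zombie.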


Does that make sense?
I wonder how it is best to fix this. Is it a bug that systemd decided
that gdm.service had stopped before it had reaped zombie processes
that originally belonged to gdm?

Is it a gdm bug that killing gdm doesn't make any attempt to reap X
before going away itself? (they chose PR_SET_PDEATHSIG to do something
similar, but maybe we have to argue that it is not quite sufficient)
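
To make the PR_SET_PDEATHSIG point concrete, here is a minimal C sketch
for Linux. It is not gdm's code; xserver_stand_in() and
pdeathsig_fires_after_parent_exit() are hypothetical names, and the
pipes exist only so that the outcome can be observed. A "parent" forks
a "child" that arms PR_SET_PDEATHSIG and then exits without waiting for
it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;

/* Runs in the orphaned child when the parent-death signal arrives. */
static void on_parent_death(int sig) {
        ssize_t r;
        (void) sig;
        r = write(report_fd, "T", 1);
        (void) r;
        _exit(0);
}

/* Child (stand-in for the X server): arm the death signal, then idle. */
static void xserver_stand_in(int ready_fd, int out_fd) {
        struct sigaction sa;
        ssize_t r;

        report_fd = out_fd;
        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_parent_death;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGTERM, &sa, NULL) < 0 ||
            prctl(PR_SET_PDEATHSIG, (unsigned long) SIGTERM) < 0)
                _exit(1);
        r = write(ready_fd, "r", 1); /* tell the parent we are armed */
        (void) r;
        for (;;)
                pause();
}

/* Returns 1 if the child got SIGTERM after its parent exited, 0 if it
 * did not, and -1 on a setup error. */
int pdeathsig_fires_after_parent_exit(void) {
        int report[2], ready[2];
        char b = 0;
        ssize_t n;
        pid_t parent, child;

        if (pipe(report) < 0)
                return -1;
        parent = fork();
        if (parent < 0)
                return -1;
        if (parent == 0) {
                /* Parent (stand-in for gdm). */
                close(report[0]);
                if (pipe(ready) < 0)
                        _exit(1);
                child = fork();
                if (child < 0)
                        _exit(1);
                if (child == 0) {
                        close(ready[0]);
                        xserver_stand_in(ready[1], report[1]); /* no return */
                }
                close(ready[1]);
                n = read(ready[0], &b, 1); /* wait until the child is armed */
                _exit(n == 1 ? 0 : 1);     /* exit without reaping the child */
        }
        close(report[1]);
        if (waitpid(parent, NULL, 0) != parent) {
                close(report[0]);
                return -1;
        }
        n = read(report[0], &b, 1); /* "T" if the death signal fired */
        close(report[0]);
        return n == 1 && b == 'T';
}
```

The signal is only sent once the parent is already gone, so the parent
can never waitpid() for the child. Whoever inherits the orphan has to
reap it later.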

Thanks
Daniel