Re: [systemd-devel] [Patch 0/2] logind: make sure that closed sessions will be cleaned
Hi list, On Wed, Jan 22, 2014 at 10:24:24AM +0100, Djalal Harouni wrote: On Thu, Jan 16, 2014 at 11:19:08AM +0100, Djalal Harouni wrote: On Thu, Jan 16, 2014 at 06:01:58AM +0100, Zbigniew Jędrzejewski-Szmek wrote: Indeed, in a container (without your patches), sessions remain in closing state. But with your patches, systemd --user instance is started and killed immediately during login. Not good either :) Ok, ouch! It seems that the systemd --user instance is killed without my patches! Systemd git will confirm it. Ok, after some thoughts and source code analysis, I came to the conclusion that in order to correctly clean session and user state files, we need: a) Improve session_check_gc() and user_check_gc() by using the session and user states. This implies b) b) Make sure that the state of the session and the user is always correctly set. Just to let you know that I've a v2 series, a more complete one. I'll clean it and send it, thanks! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [Patch 0/2] logind: make sure that closed sessions will be cleaned
On Wed, Jan 22, 2014 at 09:51:05AM +, Colin Guthrie wrote: 'Twas brillig, and Djalal Harouni at 22/01/14 09:24 did gyre and gimble: On Thu, Jan 16, 2014 at 11:19:08AM +0100, Djalal Harouni wrote: On Thu, Jan 16, 2014 at 06:01:58AM +0100, Zbigniew Jędrzejewski-Szmek wrote: Indeed, in a container (without your patches), sessions remain in closing state. But with your patches, systemd --user instance is started and killed immediately during login. Not good either :) Ok, ouch! It seems that the systemd --user instance is killed without my patches! Systemd git will confirm it. Ok, after some thoughts and source code analysis, I came to the conclusion that in order to correctly clean session and user state files, we need: a) Improve session_check_gc() and user_check_gc() by using the session and user states. This implies b) b) Make sure that the state of the session and the user is always correctly set. FWIW, I have some backported patches in Mageia that are not (yet) in the fedora package. As you mentioned Fedora earlier I figure that's what you are using: Yes I'm doing tests with Fedora 20 systemd and systemd from git. Namely a partial backport of cc377381 Patch is here: http://svnweb.mageia.org/packages/cauldron/systemd/current/SOURCES/0511-logind-Partial-backport-of-cc377381.patch?revision=566087view=markuppathrev=566087 It might be useful to apply to Fedora too. EDIT: Perhaps not, see below. Ok, yes that patch is needed in case the gc is being run while the service and slice jobs are in the starting or stoping process: do not garbage-collect while we are in the middel of: user_start() = user_start_slice() user_start_service() and user_stop() = user_stop_slice() user_stop_service() Before and after the gc should run, this for sure prevents races! Note: after the jobs are finished they will put the user in the gc queue which will trigger the gc again. Ok (currently I'm only reporting to systemd upstream). Ohh I just remembered that Lennart added some logic to wait for the service to finish startup before notifying the client that he can make logins, this is perhaps not related but it should interest you: commit dd9b67aa3e94 http://cgit.freedesktop.org/systemd/systemd/commit/?id=dd9b67aa3e9476b3a4b3e231006eea6d108c841f I'll discuss some notes about a) and follow with patches that will try to honore b). I'm sending this to have some feedback before spending more time on it, ahh yes debugging this beast is hard! Please note that I did not take a look at seats logic, only sessions and users using getty and ssh logins. Please feel free to correct me! a) Improve session_check_gc() and user_check_gc() * user_check_gc() has an if (u-sessions) check! this should be updated to check if there are sessions that are not in the closing state. If *all* the sessions are in the closing state then check each session to see if there are still running processes in the background: screen,... If the session is closing, it has to, by definition, still have processes. If not, then the session wouldn't be closing, but actually Of course. fully closed and disappeared... Yes that's precisely where the bug is, currently fully closed sessions are not removed they will not disappear, all processes have been closed but sessions are still there. Testing a systemd git will show that getty or ssh sessions will persist after logouts unless you issue a D-Bus TerminateSession() or TerminateUser() which is probably the case of display managers or you will have to manuall do:$ loginctl terminate-{session|user} ... I'm trying to trigger the gc in a sane manner in pam_systemd, but first the patches I sent will close various session and user states races. And yeh I got the concept of how to deal with sessions that still have processes running in a background. Perhaps add the logind.conf KillUserProcesses check in the gc, if not set then we prevent the gc from collecting sessions. Care must be taken to honor KillUserProcesses definition of the logind.conf man page. That said, it's perhaps this check here that actually *prevents* this final state from being reached due to GC not being run! So this could fix the chicken+egg problem (I don't know the code, so can only really talk in generalities!) Yes, you are speaking about the if (u-sessions) in the user_check_gc(), yes sure it prevents some cases, just think about sessions that are in the closing state without any process in the background they should be put in the gc, but this check will prevent this *and* this is also related to sessions that are not put in the gc cause their scope will always be running and will always be around: src/login/logind-session.c:session_check_gc() if (s-scope manager_unit_is_active(s-manager, s-scope)) So in order to remove this check we need to make the session state more stable, fix what I've
Re: [systemd-devel] [Patch 0/2] logind: make sure that closed sessions will be cleaned
On Thu, Jan 16, 2014 at 11:19:08AM +0100, Djalal Harouni wrote: On Thu, Jan 16, 2014 at 06:01:58AM +0100, Zbigniew Jędrzejewski-Szmek wrote: Indeed, in a container (without your patches), sessions remain in closing state. But with your patches, systemd --user instance is started and killed immediately during login. Not good either :) Ok, ouch! It seems that the systemd --user instance is killed without my patches! Systemd git will confirm it. Ok, after some thoughts and source code analysis, I came to the conclusion that in order to correctly clean session and user state files, we need: a) Improve session_check_gc() and user_check_gc() by using the session and user states. This implies b) b) Make sure that the state of the session and the user is always correctly set. I'll discuss some notes about a) and follow with patches that will try to honore b). I'm sending this to have some feedback before spending more time on it, ahh yes debugging this beast is hard! Please note that I did not take a look at seats logic, only sessions and users using getty and ssh logins. Please feel free to correct me! a) Improve session_check_gc() and user_check_gc() * user_check_gc() has an if (u-sessions) check! this should be updated to check if there are sessions that are not in the closing state. If *all* the sessions are in the closing state then check each session to see if there are still running processes in the background: screen,... If so then abort the user_check_gc() or depend on a per-session function that will perhaps do the kill-session-processes. * session_check_gc() should check if the session is in the closing state and if there are still running processes in the background, then decide to abort or force the killsession-processes But it should not do the following check: if (s-scope manager_unit_is_active(s-manager, s-scope)) Since the scope will always be around, the same was applied in the commit 63966da86 for user_check_gc(), do not check the user manager! b) Make sure that the state of the session and the user is always correctly set. Not to mention that some logind clients depend on it. So to make a) we must honor that b) is stable, and after an extensive source code analysis, it seems that the state of users and sessions is not always correct. I'll list some notes/bugs here: * It seems that in systemd v204 after a logout from a session where processes are still running like screen the state files will show: systemd 204 session state file: user state file: Active = 1State = active State = closing ACTIVE_SESSIONS = X Where for systemd git: session state file: user state file: Active = 0State = closing State = closing ACTIVE_SESSIONS = Not to mention that for systemd git, the session and user state files will show the same even if there are no background processes and they will not be cleaned (using getty or ssh logins) which is is the object of this thread! This is a bug, not fixed yet. * To track the state of sessions the fifo_fd should be used since the event manager will call session_dispatch_fifo() to remove it on eof. Using the session_is_active() *before* the fifo_fd check is not stable since it will return true before the fifo_fd was even created! and also before the session scope and user serivce are created, the session-seat-active will be set earlier before the fifo_fd creation and will be the last one to be unset after fifo_fd removal. And currently this affects src/login/logind-user.c:user_get_state() logic since it checks session_is_active() before calling session_get_state() which performs the fifo_fd check before the session_is_active() So user_get_state() and session_get_state() results might differ! This is a bug, not fixed yet, user_get_state() should count on session_get_state(). * It seems there are other problems with the session and user states, different variables are used to check the state and later these variables are updated at different places! The following patches will describe the problem and try to make the state more stable. The design was do not add extra flags to 'User' and 'Session' structs unless this is the only way. The logic is already there, and it needs a littel synchronization. Thanks! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [Patch 0/2] logind: make sure that closed sessions will be cleaned
'Twas brillig, and Djalal Harouni at 22/01/14 09:24 did gyre and gimble: On Thu, Jan 16, 2014 at 11:19:08AM +0100, Djalal Harouni wrote: On Thu, Jan 16, 2014 at 06:01:58AM +0100, Zbigniew Jędrzejewski-Szmek wrote: Indeed, in a container (without your patches), sessions remain in closing state. But with your patches, systemd --user instance is started and killed immediately during login. Not good either :) Ok, ouch! It seems that the systemd --user instance is killed without my patches! Systemd git will confirm it. Ok, after some thoughts and source code analysis, I came to the conclusion that in order to correctly clean session and user state files, we need: a) Improve session_check_gc() and user_check_gc() by using the session and user states. This implies b) b) Make sure that the state of the session and the user is always correctly set. FWIW, I have some backported patches in Mageia that are not (yet) in the fedora package. As you mentioned Fedora earlier I figure that's what you are using: Namely a partial backport of cc377381 Patch is here: http://svnweb.mageia.org/packages/cauldron/systemd/current/SOURCES/0511-logind-Partial-backport-of-cc377381.patch?revision=566087view=markuppathrev=566087 It might be useful to apply to Fedora too. EDIT: Perhaps not, see below. I'll discuss some notes about a) and follow with patches that will try to honore b). I'm sending this to have some feedback before spending more time on it, ahh yes debugging this beast is hard! Please note that I did not take a look at seats logic, only sessions and users using getty and ssh logins. Please feel free to correct me! a) Improve session_check_gc() and user_check_gc() * user_check_gc() has an if (u-sessions) check! this should be updated to check if there are sessions that are not in the closing state. If *all* the sessions are in the closing state then check each session to see if there are still running processes in the background: screen,... If the session is closing, it has to, by definition, still have processes. If not, then the session wouldn't be closing, but actually fully closed and disappeared... That said, it's perhaps this check here that actually *prevents* this final state from being reached due to GC not being run! So this could fix the chicken+egg problem (I don't know the code, so can only really talk in generalities!) If so then abort the user_check_gc() or depend on a per-session function that will perhaps do the kill-session-processes. * session_check_gc() should check if the session is in the closing state and if there are still running processes in the background, then decide to abort or force the killsession-processes But it should not do the following check: if (s-scope manager_unit_is_active(s-manager, s-scope)) Since the scope will always be around, the same was applied in the commit 63966da86 for user_check_gc(), do not check the user manager! Interesting. It seems that 63966da86 suggests my patch above shouldn't be needed and in older code, the old if (u-slice_job || u-service_job) check should just be dropped. b) Make sure that the state of the session and the user is always correctly set. Not to mention that some logind clients depend on it. So to make a) we must honor that b) is stable, and after an extensive source code analysis, it seems that the state of users and sessions is not always correct. I'll list some notes/bugs here: * It seems that in systemd v204 after a logout from a session where processes are still running like screen the state files will show: systemd 204 session state file: user state file: Active = 1 State = active State = closing ACTIVE_SESSIONS = X Where for systemd git: session state file: user state file: Active = 0 State = closing State = closing ACTIVE_SESSIONS = Not to mention that for systemd git, the session and user state files will show the same even if there are no background processes and they will not be cleaned (using getty or ssh logins) which is is the object of this thread! This is a bug, not fixed yet. So basically the problem is that the gc is simply not triggered often enough to tidy up these old closing sessions with no processes left? * To track the state of sessions the fifo_fd should be used since the event manager will call session_dispatch_fifo() to remove it on eof. Using the session_is_active() *before* the fifo_fd check is not stable since it will return true before the fifo_fd was even created! and also before the session scope and user serivce are created, the session-seat-active will be set earlier before the fifo_fd creation and will be the last one to be unset after fifo_fd removal. And currently this affects
Re: [systemd-devel] [Patch 0/2] logind: make sure that closed sessions will be cleaned
On Thu, Jan 16, 2014 at 06:01:58AM +0100, Zbigniew Jędrzejewski-Szmek wrote: On Sun, Jan 12, 2014 at 02:07:32AM +0100, Djalal Harouni wrote: On Sat, Jan 11, 2014 at 10:26:13PM +0100, Zbigniew Jędrzejewski-Szmek wrote: On Fri, Jan 03, 2014 at 02:19:19PM +0100, Djalal Harouni wrote: On logout pam_systemd should ensures the following: If the last concurrent session of a user ends, the $XDG_RUNTIME_DIR directory and all its contents are removed, too. from manpage. Using git HEAD, and a simple systemd-nspawn test will show that the above is not ensured and the sessions will stay! I can't reproduce this (with todays git). In the examples below, I understand that you're logging in through getty. Can you test with current git and/or provide a complete recipe to reproduce this? Yes through getty, and I guess this issue will also be visible for ssh/remote sessions, or sessions where TerminateSession() is not called. Thank you for the recipe. This helps. Indeed, in a container (without your patches), sessions remain in closing state. But with your patches, systemd --user instance is started and killed immediately during login. Not good either :) Ok, ouch! With just the first patch, session still remain as closing. Ok, thanks for the input, I'll work on it soon Also, there seems to be a regression with Fedora installs with yum: I installed a fresh one, and there was no /var/run - /run symlink, the first boot was mostly broken. - https://bugzilla.redhat.com/show_bug.cgi?id=1053983 Oh yes, I forgot about that, yes I fixed it in my install! Indeed, the bug is in yum, totally forget to report it, sorry :-/ Zbyszek Thanks! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [Patch 0/2] logind: make sure that closed sessions will be cleaned
On Sun, Jan 12, 2014 at 02:07:32AM +0100, Djalal Harouni wrote: On Sat, Jan 11, 2014 at 10:26:13PM +0100, Zbigniew Jędrzejewski-Szmek wrote: On Fri, Jan 03, 2014 at 02:19:19PM +0100, Djalal Harouni wrote: On logout pam_systemd should ensures the following: If the last concurrent session of a user ends, the $XDG_RUNTIME_DIR directory and all its contents are removed, too. from manpage. Using git HEAD, and a simple systemd-nspawn test will show that the above is not ensured and the sessions will stay! I can't reproduce this (with todays git). In the examples below, I understand that you're logging in through getty. Can you test with current git and/or provide a complete recipe to reproduce this? Yes through getty, and I guess this issue will also be visible for ssh/remote sessions, or sessions where TerminateSession() is not called. Thank you for the recipe. This helps. Indeed, in a container (without your patches), sessions remain in closing state. But with your patches, systemd --user instance is started and killed immediately during login. Not good either :) With just the first patch, session still remain as closing. Also, there seems to be a regression with Fedora installs with yum: I installed a fresh one, and there was no /var/run - /run symlink, the first boot was mostly broken. - https://bugzilla.redhat.com/show_bug.cgi?id=1053983 Zbyszek ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [Patch 0/2] logind: make sure that closed sessions will be cleaned
On Fri, Jan 03, 2014 at 02:19:19PM +0100, Djalal Harouni wrote: On logout pam_systemd should ensures the following: If the last concurrent session of a user ends, the $XDG_RUNTIME_DIR directory and all its contents are removed, too. from manpage. Using git HEAD, and a simple systemd-nspawn test will show that the above is not ensured and the sessions will stay! I can't reproduce this (with todays git). In the examples below, I understand that you're logging in through getty. Can you test with current git and/or provide a complete recipe to reproduce this? Zbyszek A simple systemd-nspawn test: 1) login as user X 2) logout 3) login as user Y 4) loginctl (will list session of user X) In this example we are session c4: -bash-4.2# loginctl list-sessions SESSIONUID USER SEAT 1 1000 tixxdz seat0 c1 1000 tixxdz seat0 c2 0 root seat0 c3 1000 tixxdz seat0 c4 0 root seat0 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [Patch 0/2] logind: make sure that closed sessions will be cleaned
On Sat, Jan 11, 2014 at 10:26:13PM +0100, Zbigniew Jędrzejewski-Szmek wrote: On Fri, Jan 03, 2014 at 02:19:19PM +0100, Djalal Harouni wrote: On logout pam_systemd should ensures the following: If the last concurrent session of a user ends, the $XDG_RUNTIME_DIR directory and all its contents are removed, too. from manpage. Using git HEAD, and a simple systemd-nspawn test will show that the above is not ensured and the sessions will stay! I can't reproduce this (with todays git). In the examples below, I understand that you're logging in through getty. Can you test with current git and/or provide a complete recipe to reproduce this? Yes through getty, and I guess this issue will also be visible for ssh/remote sessions, or sessions where TerminateSession() is not called. Ok, testing both a fedora 20 systemd and git! Using the Testing in a Container from: www.freedesktop.org/wiki/Software/systemd/VirtualizedTesting/ On a fedora 19 system, I installed a Fedora 20: yum --releasever=20 --installroot=$HOME/fedora-tree-20 ... $ cat /home/tixxdz/fedora-tree-20/etc/redhat-release Fedora release 20 (Heisenbug) I set the root password, disable pam_loginuid.so and boot as shown in the Testing in a Container doc 1) login as root 2) logout 3) login as root (again) -bash-4.2# loginctl SESSIONUID USER SEAT 1 0 root seat0 c1 0 root seat0 2 sessions listed. As you can see the closed sessions are not cleaned. So I confirm the bug from testing a Fedora 20 container. Now from git 8042e377b8bb8f:! I just do: $ ./autogen.sh ./configure --disable-kdbus --libdir=/usr/lib64 $ make -j8 # DESTDIR=/home/tixxdz/fedora-tree make install # systemd-nspawn -bD /home/tixxdz/fedora-tree/ 3 Please note that: /usr/lib64 is needed since it will not be set automatically. It will be set by distributions and packagers of systemd So I make sure that pam_systemd modules are in /usr/lib64/security instead of /usr/lib/security, since only the ones that are in /usr/lib64/security are loaded. Hmm a guess, perhaps the old pam_systemd that got installed by the systemd rpm package is being loaded instead of the new compiled one! *Coupled* with graphical display managers who perhaps kill this bug by forcing a TerminateSession() vs getty logins! So using today git 8042e377b8 I also confirm the same bug. Please Zbyszek let me know if the pervious Testing in a Container using a Fedora 20 (a new systemd version), shows something for you? or perhaps check the timestamp of your *assumed right* /usr/lib64/security/pam_systemd.so after make install ? Thanks Zbyszek! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [Patch 0/2] logind: make sure that closed sessions will be cleaned
ping? (Please if I'm missing something let me know) On Fri, Jan 03, 2014 at 02:19:19PM +0100, Djalal Harouni wrote: On logout pam_systemd should ensures the following: If the last concurrent session of a user ends, the $XDG_RUNTIME_DIR directory and all its contents are removed, too. from manpage. Using git HEAD, and a simple systemd-nspawn test will show that the above is not ensured and the sessions will stay! A simple systemd-nspawn test: 1) login as user X 2) logout 3) login as user Y 4) loginctl (will list session of user X) In this example we are session c4: -bash-4.2# loginctl list-sessions SESSIONUID USER SEAT 1 1000 tixxdz seat0 c1 1000 tixxdz seat0 c2 0 root seat0 c3 1000 tixxdz seat0 c4 0 root seat0 -bash-4.2# loginctl show-session --property=State 1 c1 c2 c3 c4 State=closing State=closing State=closing State=closing State=active As shown only session c4 is active, all the others are dead sessions. To close the dead sessions and clean things up, a dbus TerminateSession()=session_stop() must be issued... Please note that I'm running without pam_loginuid.so, due to another bug related to audit: https://bugzilla.redhat.com/show_bug.cgi?id=966807 Anyway, after some debugging: It seems that after ReleaseSession() which is called by pam_systemd, the user,session and seat state files will also still be available. The garbage-collector will miss them! In src/login/logind.c:manager_gc() the while loops will never be entered. The user slice units will start, then the match_job_removed() and co signals on these units will call session_add_to_gc_queue() and user_add_to_gc_queue() to push to gc_queue when done, in the mean time the manager_gc() will consume the gc_queue and remove them IOW *just* before and after the ReleaseSession() the manager {session|user}_gc_queue queues might be empty, hence session, users and seats will never be cleaned! the user's slice will still be alive... To fix this, I'm attaching two patches and I can say that they are related to each other from the perspective of the described bug, and at the same time they are independent of each other from a general perspective! 1) Make method_release_session() call session_add_to_gc_queue() This will ensure that the released session is in the gc_queue. This change gives the garbage collector a chance to collect sessions, and should not affect the logind behaviour and other display managers, the session_check_gc() is able to detect if the session is still valid. The thing is that from a quick git log the method_release_session() never called session_add_to_gc_queue(), so this bug was introduced or made *visible* by another change... (not sure) 2) As in commit 63966da86d8e, in function session_check_gc() the session manager will always be around so don't check it in order to garbage-collect the session. Thanks! ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [Patch 0/2] logind: make sure that closed sessions will be cleaned
On Wed, Jan 08, 2014 at 05:12:23PM +0100, Djalal Harouni wrote: ping? (Please if I'm missing something let me know) No. I think that nobody had time to review this. Zbyszek ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [Patch 0/2] logind: make sure that closed sessions will be cleaned
On Wed, Jan 08, 2014 at 05:29:05PM +0100, Zbigniew Jędrzejewski-Szmek wrote: On Wed, Jan 08, 2014 at 05:12:23PM +0100, Djalal Harouni wrote: ping? (Please if I'm missing something let me know) No. I think that nobody had time to review this. Zbyszek Ok, thank you Zbyszek! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel