Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)
On Thu, 15.01.15 09:39, Colin Guthrie (gm...@colin.guthr.ie) wrote:

Ross Lagerwall wrote on 14/01/15 22:41:

On Mon, Jan 12, 2015 at 09:04:35PM +0300, Andrei Borzenkov wrote:

On Mon, 12 Jan 2015 10:34:07 + Colin Guthrie co...@mageia.org wrote:

Anyway, assuming the process is in the .mount unit cgroup, should systemd detect the umount and kill the processes accordingly, and if

It does not do it currently. It only starts killing if (u)mount times out. Otherwise, if umount is successful, it goes to the stopped state immediately. Although it probably should, even for the sake of userspace helpers.

not, should calling systemctl status on .mount units show processes even if it's in an inactive state?

I believe something very similar (not only for mount units) was reported recently, but I do not have the reference handy. I mean, processes belonging to a stopped unit (e.g. with KillMode=none) are not displayed.

This commit is probably needed: http://cgit.freedesktop.org/systemd/systemd/commit/?id=dab5bf859900c0abdbf78c584e4aed42a19768cd

That indeed looks like a likely candidate. I'll try breaking things again and checking status output with this patch applied.

Did this fix things for you? Are all processes invoked by the mount unit terminated cleanly now on umount?

Lennart

--
Lennart Poettering, Red Hat

___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)
Steve Dickson wrote on 15/01/15 00:50:

Hello,

On 01/12/2015 04:43 PM, Colin Guthrie wrote:

But FWIW, your check for whether systemctl is installed via calling systemctl --help is IMO not very neat. If you're using bash here anyway, you might as well just do a: if [ -d /sys/fs/cgroup/systemd ]; then type check, or if you want to be super sure you could do: if mountpoint -q /sys/fs/cgroup/systemd; then. This is a simple trick to detect if systemd is running. If systemctl is then not found, I'd be pretty surprised (but your code would continue if the call fails anyway, so this should be safe). This avoids one more fork. Technically you could avoid calling systemctl start by calling systemctl is-active first, but to be honest this isn't really needed.

I took Michael's advice and used 'test -d /run/systemd/system'

Seems best indeed, yes. Thanks (to both!) :) Although if the script is in bash I'd use if [ -d ... rather than if test -d ... as (and bash experts (Harald?) can correct me here if I'm wrong) I believe [ is a bash builtin (even if it is a binary in /usr/bin/), whereas it would have to fork out to run test.

That seems to work OK (from a practical perspective things worked OK and I got my mount) but is obviously suboptimal, especially when the mount point is unmounted. In my case, I called umount but the rpc.statd process was still running:

What is the expectation? Should the umount bring down rpc.statd?

If it started its own instance, it should IMO kill it again on umount, but I was more talking about the systemd context here.
If the start-statd thing had done its job correctly we wouldn't have gotten into this situation in the first place (as rpc-statd.service would have been started and contained its own rpc.statd process happily!). I don't really know how it should work on non-systemd systems, as in that case I presume start-statd is called for each mount there (forgive me if I'm wrong) and thus you'd likely end up with lots of rpc.statd processes running, especially if you do lots of mount/umount cycles on a given share. Perhaps all you need to do is some very, very minimal fallback support here? e.g. checking the pid file and that the process of that pid is an rpc.statd process, and only actually starting it if it's not already running?

Well, there is code in rpc.statd, sm-notify and mount.nfs that checks to see if an rpc.statd is already running... But the code appears to be a bit racy, since in a very few environments multiple rpc.statds are being started up...

Yeah, I actually doubted myself the other day when I made a suggestion regarding doing some code to make sure only one was running... I later remembered that rpc.statd had a pid file and thus must have this stuff sort of built in (and I remember seeing messages along the lines of "rpc.statd is already running"). I guess the reason I got two was due to the extreme parallelism that systemd offers on boot. My two mounts (with the faulty start-statd) must have come in at almost the same time and triggered the race in rpc.statd startup, and I got two processes. I don't suppose there is much we can do about that other than teaching rpc.statd to be less racy, but to be honest, this should be avoided with systemd now (thanks to the fixed start-statd) and other inits probably won't trigger the race condition, so, practically speaking, it's probably one to quietly ignore...
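For what it's worth, a minimal version of that pidfile check might look like the sketch below. This is purely illustrative: the function name and pidfile path are my own assumptions (the real checks in rpc.statd, sm-notify and mount.nfs are in C, and the pidfile location varies by distro), and note that this check-then-start pattern still has exactly the race window described above:

```shell
#!/bin/sh
# statd_running PIDFILE - succeed if PIDFILE names a live process.
# A real implementation should also verify the process is actually
# rpc.statd (e.g. via /proc/PID/comm), and take a lock to close the
# window between this check and the subsequent start.
statd_running() {
    [ -r "$1" ] && kill -0 "$(cat "$1")" 2>/dev/null
}

# Hypothetical fallback use in a non-systemd start-statd:
# statd_running /var/run/rpc.statd.pid || exec rpc.statd --no-notify
```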
at least for this list :D

For systemd systems, generally all that would happen is you'd have a lot of redundant calls to systemctl start, but they should generally be harmless.

Well, the environment I just described, where multiple statds were getting started, was using systemd to do the start-ups.

On systemd systems, it should all be fine, yes. It's only really a problem on non-systemd systems now that start-statd is working properly!

FWIW, I think there are a number of issues with the systemd units upstream. If you're interested in some feedback, here is a small snippet. I'm happy to do some patches for you if you'd be willing to apply them.

Always... But I would like to have this conversation with the NFS community at linux-...@vger.kernel.org. Maybe you could post your ideas there? In a new thread?

Sure I will do! Main issues are:

1. nfs-utils.service really is an abuse. It should be an nfs-utils.target (the comment inside alludes to the fact this is known), and it's just that you want systemctl restart nfs-utils to just work as a command. I can see the desire, but it's still an abuse. Perhaps someone would be willing to write a patch that does expansion to .service or .target should the unit type not be specified? Dunno how hard it would be tho'

Well, we did make the nfs-client a target but the nfs-server was made a service... I remember bringing this point up at the time... but I forget what was said.

The mists of time hide all ;)

2. The way nfs-blkmap.service/target
Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)
Ross Lagerwall wrote on 14/01/15 22:41:

On Mon, Jan 12, 2015 at 09:04:35PM +0300, Andrei Borzenkov wrote:

On Mon, 12 Jan 2015 10:34:07 + Colin Guthrie co...@mageia.org wrote:

Anyway, assuming the process is in the .mount unit cgroup, should systemd detect the umount and kill the processes accordingly, and if

It does not do it currently. It only starts killing if (u)mount times out. Otherwise, if umount is successful, it goes to the stopped state immediately. Although it probably should, even for the sake of userspace helpers.

not, should calling systemctl status on .mount units show processes even if it's in an inactive state?

I believe something very similar (not only for mount units) was reported recently, but I do not have the reference handy. I mean, processes belonging to a stopped unit (e.g. with KillMode=none) are not displayed.

This commit is probably needed: http://cgit.freedesktop.org/systemd/systemd/commit/?id=dab5bf859900c0abdbf78c584e4aed42a19768cd

That indeed looks like a likely candidate. I'll try breaking things again and checking status output with this patch applied.

Thanks!

Col
--
Colin Guthrie gmane(at)colin.guthr.ie http://colin.guthr.ie/
Day Job: Tribalogic Limited http://www.tribalogic.net/
Open Source: Mageia Contributor http://www.mageia.org/
PulseAudio Hacker http://www.pulseaudio.org/
Trac Hacker http://trac.edgewall.org/
Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)
On 15.01.2015 10:28, Colin Guthrie wrote:

Although if the script is in bash I'd use if [ -d ... rather than if test -d ... as (and bash experts (Harald?) can correct me here if I'm wrong) I believe [ is a bash builtin (even if it is a binary in /usr/bin/), whereas it would have to fork out to run test.

If you don't write /usr/bin/test, then bash will use the builtin test.

$ help test
test: test [expr]
    Evaluate conditional expression.

so [ -d dir ] and test -d dir are basically equal.
Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)
On 15/01/15 09:28, Colin Guthrie wrote:

Although if the script is in bash I'd use if [ -d ... rather than if test -d ... as (and bash experts (Harald?) can correct me here if I'm wrong) I believe [ is a bash builtin (even if it is a binary in /usr/bin/), whereas it would have to fork out to run test.

bash, dash, zsh and busybox sh all have both test and [ as builtins, at least as they're configured on Debian (and presumably Ubuntu too). No idea about more obscure shells like ksh, but there's really no good reason to implement one and not the other. GNU coreutils provides /usr/bin/test and /usr/bin/[ as a fallback, but they'd rarely be used. Here's how you tell:

smcv@archetype:~$ type [
[ is a shell builtin
smcv@archetype:~$ type test
test is a shell builtin
smcv@archetype:~$ type dd # a random non-builtin for comparison
dd is /bin/dd

S
Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)
Hello,

On 01/12/2015 04:43 PM, Colin Guthrie wrote:

But FWIW, your check for whether systemctl is installed via calling systemctl --help is IMO not very neat. If you're using bash here anyway, you might as well just do a: if [ -d /sys/fs/cgroup/systemd ]; then type check, or if you want to be super sure you could do: if mountpoint -q /sys/fs/cgroup/systemd; then. This is a simple trick to detect if systemd is running. If systemctl is then not found, I'd be pretty surprised (but your code would continue if the call fails anyway, so this should be safe). This avoids one more fork. Technically you could avoid calling systemctl start by calling systemctl is-active first, but to be honest this isn't really needed.

I took Michael's advice and used 'test -d /run/systemd/system'

That seems to work OK (from a practical perspective things worked OK and I got my mount) but is obviously suboptimal, especially when the mount point is unmounted. In my case, I called umount but the rpc.statd process was still running:

What is the expectation? Should the umount bring down rpc.statd?

If it started its own instance, it should IMO kill it again on umount, but I was more talking about the systemd context here.

If the start-statd thing had done its job correctly we wouldn't have gotten into this situation in the first place (as rpc-statd.service would have been started and contained its own rpc.statd process happily!). I don't really know how it should work on non-systemd systems, as in that case I presume start-statd is called for each mount there (forgive me if I'm wrong) and thus you'd likely end up with lots of rpc.statd processes running, especially if you do lots of mount/umount cycles on a given share. Perhaps all you need to do is some very, very minimal fallback support here? e.g. checking the pid file and that the process of that pid is an rpc.statd process, and only actually starting it if it's not already running?
Well, there is code in rpc.statd, sm-notify and mount.nfs that checks to see if an rpc.statd is already running... But the code appears to be a bit racy, since in a very few environments multiple rpc.statds are being started up...

For systemd systems, generally all that would happen is you'd have a lot of redundant calls to systemctl start, but they should generally be harmless.

Well, the environment I just described, where multiple statds were getting started, was using systemd to do the start-ups.

FWIW, I think there are a number of issues with the systemd units upstream. If you're interested in some feedback, here is a small snippet. I'm happy to do some patches for you if you'd be willing to apply them.

Always... But I would like to have this conversation with the NFS community at linux-...@vger.kernel.org. Maybe you could post your ideas there? In a new thread?

Main issues are:

1. nfs-utils.service really is an abuse. It should be an nfs-utils.target (the comment inside alludes to the fact this is known), and it's just that you want systemctl restart nfs-utils to just work as a command. I can see the desire, but it's still an abuse. Perhaps someone would be willing to write a patch that does expansion to .service or .target should the unit type not be specified? Dunno how hard it would be tho'

Well, we did make the nfs-client a target but the nfs-server was made a service... I remember bringing this point up at the time... but I forget what was said.

2. The way nfs-blkmap.service/target interact seems really non-standard. The fact that nfs-blkmap.service has no [Install] section will make it report oddly in systemctl status (it will not say enabled or disabled but static). The use of Requisite= to tie it to its target is, well, creative. Personally, I'd have it as a

I see your point...

3. rpc-svcgssd.service and nfs-mountd.service have two PartOf= directives. This could lead to a very strange state where e.g.
nfs-server.service is stopped, and thus the stop job is propagated to both these units, but they are actually still needed by NFS clients (I believe), as you also list it as part of nfs-utils.service (which, as I mentioned already, is really an abuse of a service).

At this point rpc-svcgssd.service is not even being used, at least in the distros I work with... The point being: use BindsTo= instead of PartOf=?

4. Numerous units make use of the /etc/sysconfig/* tree. This is very much discouraged for upstream units; the official mechanism for tweaking units is to put a drop-in file in /etc/systemd/system/theservice.service.d/my-overrides.conf

Like it or not, people expect things to be in /etc/sysconfig/. From a distro standpoint that would be a very hard thing to change. But... if there was a seamless way to make that change... That would be interesting...

In these files you can tweak a single line, typically the Exec= line, or add an Environment= line, without altering anything else.
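To make the drop-in mechanism concrete, here is a sketch. The helper function, unit name, and the override contents are illustrative assumptions, not taken from nfs-utils; on a real system the root directory would be /etc/systemd/system and you would run systemctl daemon-reload afterwards:

```shell
#!/bin/sh
# write_dropin ROOT UNIT - create ROOT/UNIT.d/my-overrides.conf with an
# Environment= override. ROOT is a parameter only so the sketch is easy
# to try out; normally it is /etc/systemd/system.
write_dropin() {
    dir="$1/$2.d"
    mkdir -p "$dir"
    cat > "$dir/my-overrides.conf" <<'EOF'
[Service]
Environment=STATDARGS=--no-notify
EOF
}

# On a real system (unit name and variable are hypothetical examples):
# write_dropin /etc/systemd/system rpc-statd.service
# systemctl daemon-reload
```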
[systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)
Hi,

Looking into a thoroughly broken nfs-utils package here, I noticed a quirk in systemctl status and in umount behaviour.

In the latest nfs-utils there is a helper binary shipped upstream called /usr/sbin/start-statd (I'll send a separate mail talking about this infrastructure with the subject: "Running system services required for certain filesystems"). It sets the PATH to /sbin:/usr/sbin then tries to run systemctl (something that is already broken here as systemctl is in bin, not sbin) to start statd.service (again this seems to be broken, as the unit appears to be called nfs-statd.service upstream... go figure). Either way, we call the service nfs-lock.service here (for legacy reasons).

If this command fails (which it does for us for two reasons) it runs rpc.statd --no-notify directly. This binary then runs in the context of the .mount unit and thus in the .mount cgroup.

That seems to work OK (from a practical perspective things worked OK and I got my mount) but is obviously suboptimal, especially when the mount point is unmounted.
In my case, I called umount but the rpc.statd process was still running:

[root@jimmy nfs-utils]$ pscg | grep 3256
3256 rpcuser 4:devices:/system.slice/mnt-media-scratch.mount,1:name=systemd:/system.slice/mnt-media-scratch.mount rpc.statd --no-notify
[root@jimmy nfs-utils]$ systemctl status mnt-media-scratch.mount
● mnt-media-scratch.mount - /mnt/media/scratch
   Loaded: loaded (/etc/fstab)
   Active: inactive (dead) since Mon 2015-01-12 09:58:52 GMT; 1min 12s ago
    Where: /mnt/media/scratch
     What: marley.rasta.guthr.ie:/mnt/media/scratch
     Docs: man:fstab(5)
           man:systemd-fstab-generator(8)

Jan 07 14:55:13 jimmy mount[3216]: /usr/sbin/start-statd: line 8: systemctl: command not found
Jan 07 14:55:14 jimmy rpc.statd[3256]: Version 1.3.0 starting
Jan 07 14:55:14 jimmy rpc.statd[3256]: Flags: TI-RPC
[root@jimmy nfs-utils]$

As you can see, the mount is dead but the process is still running, and the systemctl status output does not correctly show the status of binaries running in the cgroup. When the mount is active, the process does actually exist in this unit's context (provided systemd is used to do the mount - if you call the mount /path command separately, the rpc.statd process can end up in weird cgroups - such as your user session!).

Anyway, assuming the process is in the .mount unit cgroup, should systemd detect the umount and kill the processes accordingly, and if not, should calling systemctl status on .mount units show processes even if it's in an inactive state?

This is with 217 with a few cherry picks on top, so this might have been addressed by now.

Cheers

Col
--
Colin Guthrie colin(at)mageia.org http://colin.guthr.ie/
Day Job: Tribalogic Limited http://www.tribalogic.net/
Open Source: Mageia Contributor http://www.mageia.org/
PulseAudio Hacker http://www.pulseaudio.org/
Trac Hacker http://trac.edgewall.org/
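One way to spot this kind of leftover process is to read the unit's cgroup.procs file directly, which is essentially what systemctl status does when it lists processes. A minimal sketch, assuming the cgroup-v1 layout shown in the pscg output above (/sys/fs/cgroup/systemd/system.slice/<unit>/); the function name and the base-directory parameter are mine, added only so the path is not hard-coded:

```shell
#!/bin/sh
# cgroup_pids BASE UNIT - print any PIDs still attached to UNIT's cgroup.
# BASE would be /sys/fs/cgroup/systemd on a real cgroup-v1 system; it is
# a parameter here purely to keep the sketch self-contained.
cgroup_pids() {
    procs="$1/system.slice/$2/cgroup.procs"
    [ -r "$procs" ] && cat "$procs"
}

# e.g. cgroup_pids /sys/fs/cgroup/systemd mnt-media-scratch.mount
```

Any PID printed after the unit has gone inactive is exactly the situation reported here: a process systemd no longer tracks but that still sits in the unit's cgroup.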
Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)
Hello

On 01/12/2015 05:34 AM, Colin Guthrie wrote:

Hi, Looking into a thoroughly broken nfs-utils package here I noticed a quirk in systemctl status and in umount behaviour. In the latest nfs-utils there is a helper binary shipped upstream called /usr/sbin/start-statd (I'll send a separate mail talking about this infrastructure with the subject: "Running system services required for certain filesystems"). It sets the PATH to /sbin:/usr/sbin then tries to run systemctl (something that is already broken here as systemctl is in bin, not sbin) to start statd.service (again this seems to be broken, as the unit appears to be called nfs-statd.service upstream... go figure).

The PATH problem has been fixed in the latest nfs-utils.

Either way we call the service nfs-lock.service here (for legacy reasons).

With the latest nfs-utils, rpc-statd.service is now called from start-statd. But yes, I did symbolically link nfs-lock.service to rpc-statd.service when I moved to the upstream systemd scripts.

If this command fails (which it does for us for two reasons) it runs rpc.statd --no-notify directly. This binary then runs in the context of the .mount unit and thus in the .mount cgroup.

What are the two reasons rpc.statd --no-notify fails?

That seems to work OK (from a practical perspective things worked OK and I got my mount) but is obviously suboptimal, especially when the mount point is unmounted. In my case, I called umount but the rpc.statd process was still running:

What is the expectation? Should the umount bring down rpc.statd?
[root@jimmy nfs-utils]$ pscg | grep 3256
3256 rpcuser 4:devices:/system.slice/mnt-media-scratch.mount,1:name=systemd:/system.slice/mnt-media-scratch.mount rpc.statd --no-notify
[root@jimmy nfs-utils]$ systemctl status mnt-media-scratch.mount
● mnt-media-scratch.mount - /mnt/media/scratch
   Loaded: loaded (/etc/fstab)
   Active: inactive (dead) since Mon 2015-01-12 09:58:52 GMT; 1min 12s ago
    Where: /mnt/media/scratch
     What: marley.rasta.guthr.ie:/mnt/media/scratch
     Docs: man:fstab(5)
           man:systemd-fstab-generator(8)

Jan 07 14:55:13 jimmy mount[3216]: /usr/sbin/start-statd: line 8: systemctl: command not found
Jan 07 14:55:14 jimmy rpc.statd[3256]: Version 1.3.0 starting
Jan 07 14:55:14 jimmy rpc.statd[3256]: Flags: TI-RPC
[root@jimmy nfs-utils]$

Again, this is fixed with the latest nfs-utils... A question: why are you using v3 mounts? With v4 all this goes away.

steved.

As you can see, the mount is dead but the process is still running, and the systemctl status output does not correctly show the status of binaries running in the cgroup. When the mount is active, the process does actually exist in this unit's context (provided systemd is used to do the mount - if you call the mount /path command separately, the rpc.statd process can end up in weird cgroups - such as your user session!). Anyway, assuming the process is in the .mount unit cgroup, should systemd detect the umount and kill the processes accordingly, and if not, should calling systemctl status on .mount units show processes even if it's in an inactive state? This is with 217 with a few cherry picks on top, so this might have been addressed by now.

Cheers

Col
Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)
2015-01-12 22:43 GMT+01:00 Colin Guthrie gm...@colin.guthr.ie:

But FWIW, your check for whether systemctl is installed via calling systemctl --help is IMO not very neat. If you're using bash here anyway, you might as well just do a: if [ -d /sys/fs/cgroup/systemd ]; then type check or if you want to be super sure you could do: if mountpoint -q /sys/fs/cgroup/systemd; then

The canonical way to check if systemd is the active PID 1 is [1]:

test -d /run/systemd/system

[1] http://www.freedesktop.org/software/systemd/man/sd_booted.html
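In shell, that check is just a directory test. A small sketch; the wrapper function is my own naming (mirroring sd_booted(3)), and the directory argument is overridable only to make the sketch self-contained, since systemd creates /run/systemd/system early at boot:

```shell
#!/bin/sh
# sd_booted_sh [DIR] - succeed if DIR exists. With no argument it tests
# /run/systemd/system, whose presence means systemd is running as PID 1.
sd_booted_sh() {
    [ -d "${1:-/run/systemd/system}" ]
}

# Hypothetical use in a mount helper:
# if sd_booted_sh; then systemctl start rpc-statd.service; fi
```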
Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)
On Mon, 12 Jan 2015 10:34:07 + Colin Guthrie co...@mageia.org wrote:

Anyway, assuming the process is in the .mount unit cgroup, should systemd detect the umount and kill the processes accordingly, and if

It does not do it currently. It only starts killing if (u)mount times out. Otherwise, if umount is successful, it goes to the stopped state immediately. Although it probably should, even for the sake of userspace helpers.

not, should calling systemctl status on .mount units show processes even if it's in an inactive state?

I believe something very similar (not only for mount units) was reported recently, but I do not have the reference handy. I mean, processes belonging to a stopped unit (e.g. with KillMode=none) are not displayed.

This is with 217 with a few cherry picks on top, so this might have been addressed by now.

Cheers

Col