Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)

2015-01-28 Thread Lennart Poettering
On Thu, 15.01.15 09:39, Colin Guthrie (gm...@colin.guthr.ie) wrote:

 Ross Lagerwall wrote on 14/01/15 22:41:
  On Mon, Jan 12, 2015 at 09:04:35PM +0300, Andrei Borzenkov wrote:
  On Mon, 12 Jan 2015 10:34:07 +
  Colin Guthrie co...@mageia.org writes:
 
 
  Anyway, assuming the process is in the .mount unit cgroup, should
  systemd detect the umount and kill the processes accordingly, and if
 
  It does not do it currently. It only starts killing if (u)mount times
  out. Otherwise if umount is successful it goes to stopped state
  immediately. Although it probably should, even for the sake of user
  space helpers.
 
  not, should calling systemctl status on .mount units show processes
  even if it's in an inactive state?
 
 
  I believe something very similar (not only for mount units) was
  reported recently, but I do not have reference handy. I mean, processes
  belonging to stopped unit (e.g. with KillMode=none) are not displayed.
 
  
  This commit is probably needed:
  
  http://cgit.freedesktop.org/systemd/systemd/commit/?id=dab5bf859900c0abdbf78c584e4aed42a19768cd
 
 That indeed looks like a likely candidate. I'll try breaking things
 again and checking status output with this patch applied.

Did this fix things for you? Are all processes invoked by the mount
unit terminated cleanly now on umount?

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)

2015-01-15 Thread Colin Guthrie
Steve Dickson wrote on 15/01/15 00:50:
 Hello,
 
 On 01/12/2015 04:43 PM, Colin Guthrie wrote:

 But FWIW, your check for whether systemctl is installed via calling
 systemctl --help is IMO not very neat.

 If you're using bash here anyway, you might as well just do a:

 if [ -d /sys/fs/cgroup/systemd ]; then

 type check or if you want to be super sure you could do:

 if mountpoint -q /sys/fs/cgroup/systemd; then

 This is a simple trick to detect if systemd is running. If systemctl is
 then not found, then I'd be pretty surprised (but your code would
 continue if the call fails anyway, so this should be safe).

 This avoids one more fork.

 Technically you could avoid calling systemctl start by calling
 systemctl is-active first, but to be honest this isn't really needed.
 I took Michael's advice and used 'test -d /run/systemd/system'

Seems best indeed yes. Thanks (to both!) :)

Although if the script is in bash I'd use  if [ -d ... rather than if
test -d ... as (and bash experts (Harald?) can correct me here if I'm
wrong) I believe [ is a bash built in (even if it is a binary in
/usr/bin/), whereas it would have to fork out to run test.
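
To put the three detection variants mentioned above side by side, a minimal shell sketch (only the /run/systemd/system test strictly checks that systemd is running as PID 1; the cgroup checks merely test that the hierarchy is mounted):

# a: systemd's named cgroup hierarchy exists
[ -d /sys/fs/cgroup/systemd ] && echo "check a: systemd detected"
# b: stricter - verify it is actually a mountpoint (this one forks to run mountpoint)
mountpoint -q /sys/fs/cgroup/systemd && echo "check b: systemd detected"
# c: the sd_booted(3) test - systemd as PID 1 creates this directory early on
[ -d /run/systemd/system ] && echo "check c: systemd detected"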


 That seems to work OK (from a practical perspective things worked OK and
 I got my mount) but is obviously suboptimal, especially when the mount
 point is unmounted.

 In my case, I called umount but the rpc.statd process was still running:
 What is the expectation? That the umount should bring down rpc.statd?

 If it started its own instance, it should IMO kill it again on umount,
 but I was more talking about the systemd context here. If the
 start-statd thing had done its job correctly we wouldn't have gotten
 into this situation in the first place (as rpc-statd.service would have
 been started and contained its own rpc.statd process happily!)

 I don't really know how it should work on non-systemd systems as in that
 case I presume start-statd is called for each mount there (forgive me if
 I'm wrong) and thus you'd likely end up with lots of rpc.statd processes
 running, especially if you do lots of mount/umount cycles on a given
 share. Perhaps all you need to do is some very, very minimal fallback
 support here? e.g. checking the pid file and that the process of that
 pid is an rpc.statd process and only actually start it if it's not
 already running?
 Well, there is code in rpc.statd, sm-notify and mount.nfs that checks
 to see if a rpc.statd is already running... But the code appears
 to be a bit racy since in a very few environments, multiple rpc.statds
 are being started up...

Yeah, I actually doubted myself the other day when I made a suggestion
regarding doing some code to make sure only one was running... I later
remembered that rpc.statd had a pid file and thus must have this stuff
sort of built in (and I remember seeing messages along the lines of
rpc.statd is already running).

I guess the reason I got two was due to the extreme parallelism that
systemd offers on boot. My two mounts (with the faulty start-statd) must
have come in at almost the same time and triggered the race in rpc.statd
startup and I got two processes.

I don't suppose there is much we can do about that other than teaching
rpc.statd to be less racy, but to be honest, this should be avoided with
systemd now (thanks to the fixed start-statd) and other inits probably
won't trigger the race condition, so, practically speaking, it's
probably one to quietly ignore... at least for this list :D
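
A minimal sketch of the kind of pid-file fallback check suggested earlier in this thread, purely illustrative (the pid-file path is an assumption, and as discussed above a check like this is still racy without real locking):

#!/bin/bash
# illustrative fallback: only start a private statd if one isn't already running
PIDFILE=/var/run/rpc.statd.pid     # assumed default location
if [ -s "$PIDFILE" ] && read pid < "$PIDFILE" &&
   [ "$(cat /proc/"$pid"/comm 2>/dev/null)" = "rpc.statd" ]; then
    :   # an rpc.statd already owns the pid file, nothing to do
else
    # NB: two mounts racing through this window can still both start one
    rpc.statd --no-notify
fi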

 For systemd systems generally all that would happen is you'd have a lot
 of redundant calls to systemctl start, but they should generally be
 harmless.
 Well, in the environment I just described, multiple statds are getting
 started even while systemd is being used to do the start-ups.

On systemd systems, it should all be fine, yes. It's only really a
problem on non-systemd systems, now that start-statd is working
properly!


 FWIW, I think there are a number of issues with the systemd units
 upstream. If you're interested in some feedback here is a small snippet.
 I'm happy to do some patches for you if you'd be willing to apply them.
 Always... But I would like to have this conversation with the
 NFS community at linux-...@vger.kernel.org. Maybe you could post
 your ideas there? In a new thread?

Sure I will do!

 Main issues are:

 1. nfs-utils.service really is an abuse. It should be an nfs-utils.target
 (the comment inside alludes to the fact this is known, and it's just that
 you want systemctl restart nfs-utils to just work as a command). I
 can see the desire, but it's still an abuse. Perhaps someone would be
 willing to write a patch that does expansion to .service or .target
 should the unit type not be specified? Dunno how hard it would be tho'
 Well we did make the nfs-client a target but the nfs-server was made
 a service... I remember bringing this point up at the time... but I forget
 what was said.

The mists of time hide all ;)


 2. The way nfs-blkmap.service/target 

Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)

2015-01-15 Thread Colin Guthrie
Ross Lagerwall wrote on 14/01/15 22:41:
 On Mon, Jan 12, 2015 at 09:04:35PM +0300, Andrei Borzenkov wrote:
 On Mon, 12 Jan 2015 10:34:07 +
 Colin Guthrie co...@mageia.org writes:


 Anyway, assuming the process is in the .mount unit cgroup, should
 systemd detect the umount and kill the processes accordingly, and if

 It does not do it currently. It only starts killing if (u)mount times
 out. Otherwise if umount is successful it goes to stopped state
 immediately. Although it probably should, even for the sake of user
 space helpers.

 not, should calling systemctl status on .mount units show processes
 even if it's in an inactive state?


 I believe something very similar (not only for mount units) was
 reported recently, but I do not have reference handy. I mean, processes
 belonging to stopped unit (e.g. with KillMode=none) are not displayed.

 
 This commit is probably needed:
 
 http://cgit.freedesktop.org/systemd/systemd/commit/?id=dab5bf859900c0abdbf78c584e4aed42a19768cd

That indeed looks like a likely candidate. I'll try breaking things
again and checking status output with this patch applied.

Thanks!

Col


-- 

Colin Guthrie
gmane(at)colin.guthr.ie
http://colin.guthr.ie/

Day Job:
  Tribalogic Limited http://www.tribalogic.net/
Open Source:
  Mageia Contributor http://www.mageia.org/
  PulseAudio Hacker http://www.pulseaudio.org/
  Trac Hacker http://trac.edgewall.org/


Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)

2015-01-15 Thread Harald Hoyer
On 15.01.2015 10:28, Colin Guthrie wrote:
 Although if the script is in bash I'd use  if [ -d ... rather than if
 test -d ... as (and bash experts (Harald?) can correct me here if I'm
 wrong) I believe [ is a bash built in (even if it is a binary in
 /usr/bin/), whereas it would have to fork out to run test.


If you don't write /usr/bin/test, then bash will use the builtin test.

$ help test
test: test [expr]
Evaluate conditional expression.

so [ -d dir ] and test -d dir are basically equal.


Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)

2015-01-15 Thread Simon McVittie
On 15/01/15 09:28, Colin Guthrie wrote:
 Although if the script is in bash I'd use  if [ -d ... rather than if
 test -d ... as (and bash experts (Harald?) can correct me here if I'm
 wrong) I believe [ is a bash built in (even if it is a binary in
 /usr/bin/), whereas it would have to fork out to run test.

bash, dash, zsh and busybox sh all have both test and [ as builtins,
at least as they're configured on Debian (and presumably Ubuntu too). No
idea about more obscure shells like ksh, but there's really no good
reason to implement one and not the other.

GNU coreutils provides /usr/bin/test and /usr/bin/[ as a fallback, but
they'd rarely be used.

Here's how you tell:

smcv@archetype:~$ type [
[ is a shell builtin
smcv@archetype:~$ type test
test is a shell builtin
smcv@archetype:~$ type dd   # a random non-builtin for comparison
dd is /bin/dd

S



Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)

2015-01-14 Thread Steve Dickson
Hello,

On 01/12/2015 04:43 PM, Colin Guthrie wrote:
 
 But FWIW, your check for whether systemctl is installed via calling
 systemctl --help is IMO not very neat.
 
 If you're using bash here anyway, you might as well just do a:
 
 if [ -d /sys/fs/cgroup/systemd ]; then
 
 type check or if you want to be super sure you could do:
 
 if mountpoint -q /sys/fs/cgroup/systemd; then
 
 This is a simple trick to detect if systemd is running. If systemctl is
 then not found, then I'd be pretty surprised (but your code would
 continue if the call fails anyway, so this should be safe).
 
 This avoids one more fork.
 
 Technically you could avoid calling systemctl start by calling
 systemctl is-active first, but to be honest this isn't really needed.
I took Michael's advice and used 'test -d /run/systemd/system'


 That seems to work OK (from a practical perspective things worked OK and
 I got my mount) but is obviously suboptimal, especially when the mount
 point is unmounted.

 In my case, I called umount but the rpc.statd process was still running:
 What is the expectation? That the umount should bring down rpc.statd?
 
 If it started its own instance, it should IMO kill it again on umount,
 but I was more talking about the systemd context here. If the
 start-statd thing had done its job correctly we wouldn't have gotten
 into this situation in the first place (as rpc-statd.service would have
 been started and contained its own rpc.statd process happily!)
 
 I don't really know how it should work on non-systemd systems as in that
 case I presume start-statd is called for each mount there (forgive me if
 I'm wrong) and thus you'd likely end up with lots of rpc.statd processes
 running, especially if you do lots of mount/umount cycles on a given
 share. Perhaps all you need to do is some very, very minimal fallback
 support here? e.g. checking the pid file and that the process of that
 pid is an rpc.statd process and only actually start it if it's not
 already running?
Well, there is code in rpc.statd, sm-notify and mount.nfs that checks
to see if a rpc.statd is already running... But the code appears
to be a bit racy since in a very few environments, multiple rpc.statds
are being started up...
  
 
 For systemd systems generally all that would happen is you'd have a lot
 of redundant calls to systemctl start, but they should generally be
 harmless.
Well, in the environment I just described, multiple statds are getting
started even while systemd is being used to do the start-ups.
 
 
 
 FWIW, I think there are a number of issues with the systemd units
 upstream. If you're interested in some feedback here is a small snippet.
 I'm happy to do some patches for you if you'd be willing to apply them.
Always... But I would like to have this conversation with the
NFS community at linux-...@vger.kernel.org. Maybe you could post
your ideas there? In a new thread?

 
 Main issues are:
 
 1. nfs-utils.service really is an abuse. It should be an nfs-utils.target
 (the comment inside alludes to the fact this is known, and it's just that
 you want systemctl restart nfs-utils to just work as a command). I
 can see the desire, but it's still an abuse. Perhaps someone would be
 willing to write a patch that does expansion to .service or .target
 should the unit type not be specified? Dunno how hard it would be tho'
Well we did make the nfs-client a target but the nfs-server was made
a service... I remember bringing this point up at the time... but I forget
what was said.

 
 2. The way nfs-blkmap.service/target interact seems really non-standard.
 The fact that nfs-blkmap.service has no [Install] section will make it
 report oddly in systemctl status (it will not say enabled or
 disabled but static). The use of Requisite= to tie it to its target
 is, well, creative. Personally, I'd have it as a
I see your point... 

 
 3. rpc-svcgssd.service and nfs-mountd.service have two PartOf=
 directives. This could lead to a very strange state where e.g.
 nfs-server.service is stopped, and thus the stop job is propagated to
 both these units, but they are actually still needed by nfs clients (I
 believe) as you also list it as part of nfs-utils.service (which as I
 mentioned already is really an abuse of a service).
At this point rpc-svcgssd.service is not even being used, at least
in the distros I work with... The point being to use BindsTo= instead
of PartOf=?

 
 4. Numerous units make use of /etc/sysconfig/* tree. This is very much
 discouraged for upstream units and the official mechanism for tweaking
 units is to put a dropin file in
 /etc/systemd/system/theservice.service.d/my-overrides.conf
Like it or not people expect things to be in /etc/sysconfig/. From a distro
standpoint that would be a very hard thing to change. But... if there
was a seamless way to make that change... That would be interesting...
 
 
 In these files you can tweak a single line, typically the Exec= line, or
 add an Environment= line, without altering anything else.
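
As a concrete example of such a drop-in (the unit name is real, but the STATD_OPTS variable here is made up purely to illustrate the mechanism):

mkdir -p /etc/systemd/system/rpc-statd.service.d
cat > /etc/systemd/system/rpc-statd.service.d/my-overrides.conf <<'EOF'
[Service]
# hypothetical variable, shown only to demonstrate the override mechanism
Environment=STATD_OPTS=--no-notify
EOF
systemctl daemon-reload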
 
 

[systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)

2015-01-12 Thread Colin Guthrie
Hi,

Looking into a thoroughly broken nfs-utils package here I noticed a
quirk in systemctl status and in umount behaviour.


In latest nfs-utils there is a helper binary shipped upstream called
/usr/sbin/start-statd (I'll send a separate mail talking about this
infrastructure with subject: Running system services required for
certain filesystems)

It sets the PATH to /sbin:/usr/sbin then tries to run systemctl
(something that is already broken here as systemctl is in bin, not sbin)
to start statd.service (again this seems to be broken as the unit
appears to be called nfs-statd.service upstream... go figure).

Either way we call the service nfs-lock.service here (for legacy reasons).

If this command fails (which it does for us for two reasons) it runs
rpc.statd --no-notify directly. This binary then runs in the context of
the .mount unit and thus in the .mount cgroup.
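
Judging from that description, the helper amounts to roughly the following (a sketch reconstructed from the behaviour described above, not the verbatim upstream script):

#!/bin/bash
# sketch of /usr/sbin/start-statd as described above
PATH=/sbin:/usr/sbin     # note: systemctl actually lives in /usr/bin, so this lookup fails
if ! systemctl start statd.service; then
    # fallback: start statd directly; it then runs in the caller's
    # (i.e. the .mount unit's) cgroup
    rpc.statd --no-notify
fi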

That seems to work OK (from a practical perspective things worked OK and
I got my mount) but is obviously suboptimal, especially when the mount
point is unmounted.

In my case, I called umount but the rpc.statd process was still running:

[root@jimmy nfs-utils]$ pscg | grep 3256
 3256 rpcuser
4:devices:/system.slice/mnt-media-scratch.mount,1:name=systemd:/system.slice/mnt-media-scratch.mount
rpc.statd --no-notify

[root@jimmy nfs-utils]$ systemctl status mnt-media-scratch.mount
● mnt-media-scratch.mount - /mnt/media/scratch
   Loaded: loaded (/etc/fstab)
   Active: inactive (dead) since Mon 2015-01-12 09:58:52 GMT; 1min 12s ago
Where: /mnt/media/scratch
 What: marley.rasta.guthr.ie:/mnt/media/scratch
 Docs: man:fstab(5)
   man:systemd-fstab-generator(8)

Jan 07 14:55:13 jimmy mount[3216]: /usr/sbin/start-statd: line 8:
systemctl: command not found
Jan 07 14:55:14 jimmy rpc.statd[3256]: Version 1.3.0 starting
Jan 07 14:55:14 jimmy rpc.statd[3256]: Flags: TI-RPC
[root@jimmy nfs-utils]$


As you can see the mount is dead but the process is still running and
the systemctl status output does not correctly show the status of
binaries running in the cgroup. When the mount is active the process
does actually exist in this unit's context (provided systemd is used to
do the mount - if you call mount /path command separately, the
rpc.statd process can end up in weird cgroups - such as your user session!)
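
For reference, pscg above appears to be a local wrapper; roughly the same information is available with stock tools, e.g.:

ps -o pid,user,cgroup -p 3256                        # procps view of the PID's cgroups
cat /proc/3256/cgroup                                # raw kernel view
systemd-cgls system.slice/mnt-media-scratch.mount    # what systemd sees in the unit's cgroup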

Anyway, assuming the process is in the .mount unit cgroup, should
systemd detect the umount and kill the processes accordingly, and if
not, should calling systemctl status on .mount units show processes
even if it's in an inactive state?

This is with 217 with a few cherry picks on top so might have been
addressed by now.


Cheers

Col

-- 

Colin Guthrie
colin(at)mageia.org
http://colin.guthr.ie/

Day Job:
  Tribalogic Limited http://www.tribalogic.net/
Open Source:
  Mageia Contributor http://www.mageia.org/
  PulseAudio Hacker http://www.pulseaudio.org/
  Trac Hacker http://trac.edgewall.org/


Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)

2015-01-12 Thread Steve Dickson
Hello

On 01/12/2015 05:34 AM, Colin Guthrie wrote:
 Hi,
 
 Looking into a thoroughly broken nfs-utils package here I noticed a
 quirk in systemctl status and in umount behaviour.
 
 
 In latest nfs-utils there is a helper binary shipped upstream called
 /usr/sbin/start-statd (I'll send a separate mail talking about this
 infrastructure with subject: Running system services required for
 certain filesystems)
 
 It sets the PATH to /sbin:/usr/sbin then tries to run systemctl
 (something that is already broken here as systemctl is in bin, not sbin)
 to start statd.service (again this seems to be broken as the unit
 appears to be called nfs-statd.service upstream... go figure).
The PATH problem has been fixed in the latest nfs-utils.  

 
 Either way we call the service nfs-lock.service here (for legacy reasons).
With the latest nfs-utils, rpc-statd.service is now called from start-statd.
But yes, I did symbolically link nfs-lock.service to rpc-statd.service when
I moved to the upstream systemd scripts.

 
 If this command fails (which it does for us for two reasons) it runs
 rpc.statd --no-notify directly. This binary then runs in the context of
 the .mount unit and thus in the .mount cgroup.
What are the two reasons rpc.statd --no-notify fails?

 
 That seems to work OK (from a practical perspective things worked OK and
 I got my mount) but is obviously suboptimal, especially when the mount
 point is unmounted.
 
 In my case, I called umount but the rpc.statd process was still running:
What is the expectation? That the umount should bring down rpc.statd?

 
 [root@jimmy nfs-utils]$ pscg | grep 3256
  3256 rpcuser
 4:devices:/system.slice/mnt-media-scratch.mount,1:name=systemd:/system.slice/mnt-media-scratch.mount
 rpc.statd --no-notify
 
 [root@jimmy nfs-utils]$ systemctl status mnt-media-scratch.mount
 ● mnt-media-scratch.mount - /mnt/media/scratch
Loaded: loaded (/etc/fstab)
Active: inactive (dead) since Mon 2015-01-12 09:58:52 GMT; 1min 12s ago
 Where: /mnt/media/scratch
  What: marley.rasta.guthr.ie:/mnt/media/scratch
  Docs: man:fstab(5)
man:systemd-fstab-generator(8)
 
 Jan 07 14:55:13 jimmy mount[3216]: /usr/sbin/start-statd: line 8:
 systemctl: command not found
 Jan 07 14:55:14 jimmy rpc.statd[3256]: Version 1.3.0 starting
 Jan 07 14:55:14 jimmy rpc.statd[3256]: Flags: TI-RPC
 [root@jimmy nfs-utils]$
Again this is fixed with the latest nfs-utils...

Question: why are you using v3 mounts? With v4 all this goes away.

steved.
 
 
 As you can see the mount is dead but the process is still running and
 the systemctl status output does not correctly show the status of
 binaries running in the cgroup. When the mount is active the process
 does actually exist in this unit's context (provided systemd is used to
 do the mount - if you call mount /path command separately, the
 rpc.statd process can end up in weird cgroups - such as your user session!)
 
 Anyway, assuming the process is in the .mount unit cgroup, should
 systemd detect the umount and kill the processes accordingly, and if
 not, should calling systemctl status on .mount units show processes
 even if it's in an inactive state?
 
 This is with 217 with a few cherry picks on top so might have been
 addressed by now.
 
 
 Cheers
 
 Col
 


Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)

2015-01-12 Thread Michael Biebl
2015-01-12 22:43 GMT+01:00 Colin Guthrie gm...@colin.guthr.ie:
 But FWIW, your check for whether systemctl is installed via calling
 systemctl --help is IMO not very neat.

 If you're using bash here anyway, you might as well just do a:

 if [ -d /sys/fs/cgroup/systemd ]; then

 type check or if you want to be super sure you could do:

 if mountpoint -q /sys/fs/cgroup/systemd; then


The canonical way to check if systemd is the active PID 1 is [1]

test -d /run/systemd/system

[1] http://www.freedesktop.org/software/systemd/man/sd_booted.html


Re: [systemd-devel] systemctl status not showing still running processes in inactive .mount unit cgroups (NFS specifically)

2015-01-12 Thread Andrei Borzenkov
On Mon, 12 Jan 2015 10:34:07 +
Colin Guthrie co...@mageia.org writes:

 
 Anyway, assuming the process is in the .mount unit cgroup, should
 systemd detect the umount and kill the processes accordingly, and if

It does not do it currently. It only starts killing if (u)mount times
out. Otherwise if umount is successful it goes to stopped state
immediately. Although it probably should, even for the sake of user
space helpers.

 not, should calling systemctl status on .mount units show processes
 even if it's in an inactive state?
 

I believe something very similar (not only for mount units) was
reported recently, but I do not have reference handy. I mean, processes
belonging to stopped unit (e.g. with KillMode=none) are not displayed.

 This is with 217 with a few cherry picks on top so might have been
 addressed by now.
 
 
 Cheers
 
 Col
 
