Re: [systemd-devel] DBus api of systemd user instance

2015-03-09 Thread Alexander Larsson
On Sun, 2015-03-08 at 23:14 +0100, Lennart Poettering wrote:
 On Sat, 07.03.15 08:45, Mantas Mikulėnas (graw...@gmail.com) wrote:

 The latter is private, as the name suggests. Do not access it from
 external programs; it is systemd's internal hack around ordering
 issues with dbus, and nobody but systemd's own tools should access
 it. It is going away when kdbus arrives; if you make use of it,
 your application will break.

I'd just like to mention that I'm using this private hack (optionally)
for now in xdg-app. I'd love for this to use kdbus when that is
available, but for now this is the only sane way to get a cgroup for an
unprivileged app.

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc 
   al...@redhat.com            alexander.lars...@gmail.com 
He's an otherworldly one-eyed rock star searching for his wife's true 
killer. She's a pregnant motormouth Hell's Angel with an MBA from 
Harvard. They fight crime! 

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Cleaning up transient scopes

2015-03-05 Thread Alexander Larsson
On Thu, 2015-03-05 at 00:00 +0100, Lennart Poettering wrote:
 On Wed, 04.03.15 18:51, Alexander Larsson (al...@redhat.com) wrote:
 
  If I run a transient scope on the user systemd instance like:
  
  $ systemd-run --user --scope true
  
  Then the scope seems to live past the end of the process. Is there any
  way to make it automatically go away with the last process in the
  cgroup?
 
 Well, yes, the idea is that that just works. However, this is kinda
 broken if the systemd instance managing your scope is not PID 1, as we
 don't get SIGCHLD then. 
 
 Do you create any subcgroups? Presumably not?
 
 Normally it should just work then, but I must admit that --user scopes
 got much less testing than system scopes...

Oh, I'm not doing anything special at all:

$ systemd-run --user --scope sleep 10 &
[1] 10613
$ Running as unit run-10613.scope.
systemctl status --user run-10613.scope
● run-10613.scope - /usr/bin/sleep 10
   Loaded: loaded (/run/user/1000/systemd/user/run-10613.scope; static)
  Drop-In: /run/user/1000/systemd/user/run-10613.scope.d
           └─50-Description.conf
   Active: active (running) since tor 2015-03-05 10:06:24 CET; 8s ago
   CGroup: /user.slice/user-1000.slice/user@1000.service/run-10613.scope
           └─10613 /usr/bin/sleep 10

mar 05 10:06:24 localhost.localdomain systemd[1405]: Starting /usr/bin/sleep 10.
mar 05 10:06:24 localhost.localdomain systemd[1405]: Started /usr/bin/sleep 10.
$ sleep 10
[1]+  Done                    systemd-run --user --scope sleep 10
$ systemctl status --user run-10613.scope 
● run-10613.scope - /usr/bin/sleep 10
   Loaded: loaded (/run/user/1000/systemd/user/run-10613.scope; static)
  Drop-In: /run/user/1000/systemd/user/run-10613.scope.d
           └─50-Description.conf
   Active: active (running) since tor 2015-03-05 10:06:24 CET; 25s ago

mar 05 10:06:24 localhost.localdomain systemd[1405]: Starting /usr/bin/sleep 10.
mar 05 10:06:24 localhost.localdomain systemd[1405]: Started /usr/bin/sleep 10.

As you can see, even after the sleep command has exited, the scope still
exists and is even reported as active.

Also, while we're on the topic of scopes: is there any way to hang some
random metadata off a unit during creation that can be read back? For
xdg-app I'd like to put information like the app id, the exact version,
the security level, etc. into the scope. Then anyone talking to the app
could go: 
  getpeercred => cgroup => scope => unfakeable (by the app) data about
  the application.
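
The first two hops of that lookup (peer credentials, then cgroup, then
scope name) could look roughly like the Python sketch below. It is only an
illustration under several assumptions: Linux, a connected AF_UNIX socket,
and the cgroup v1 "name=systemd" hierarchy line shown elsewhere in this
thread. The function names are made up for this example, and a real
implementation would have to deal with PID reuse between the two steps.

```python
import socket
import struct


def peer_pid(sock):
    """Return the PID of the peer of a connected AF_UNIX socket (Linux only)."""
    # struct ucred consists of three native ints: pid, uid, gid.
    creds = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                            struct.calcsize("3i"))
    pid, _uid, _gid = struct.unpack("3i", creds)
    return pid


def systemd_unit_from_cgroup(cgroup_text):
    """Return the last path component of the name=systemd line, or None.

    cgroup_text is the content of /proc/<pid>/cgroup (cgroup v1 layout).
    """
    for line in cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) == 3 and parts[1] == "name=systemd":
            return parts[2].rsplit("/", 1)[-1] or None
    return None


def unit_of_peer(sock):
    """Combine both steps: socket peer -> PID -> systemd unit name."""
    with open("/proc/%d/cgroup" % peer_pid(sock)) as f:
        return systemd_unit_from_cgroup(f.read())
```

The last hop, reading metadata attached to the scope, is exactly the part
that is being asked about here and is not covered by the sketch.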




[systemd-devel] Device cgroups for user systemd scopes

2015-03-04 Thread Alexander Larsson
The user instance of systemd does not seem to apply the DevicePolicy for
scopes. I.e. I can run:

$ systemd-run --user --scope --property=DevicePolicy=strict glxgears
Running as unit run-994.scope.
... runs fine, should fail to use DRI ...
$ cat /run/user/1000/systemd/user/run-994.scope.d/50-DevicePolicy.conf 
[Scope]
DevicePolicy=strict
$ cat /proc/994/cgroup 
10:hugetlb:/
9:perf_event:/
8:blkio:/
7:net_cls,net_prio:/
6:freezer:/
5:devices:/user.slice
4:memory:/user.slice
3:cpu,cpuacct:/
2:cpuset:/
1:name=systemd:/user.slice/user-1000.slice/user@1000.service/run-994.scope

This is with systemd-216-20.fc21.x86_64 from Fedora 21 under GNOME.
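
To make the observation easier to check mechanically, here is a small
Python sketch that parses /proc/<pid>/cgroup output (cgroup v1 layout, as
in the listing above) and reports which controllers are not placed in the
scope's own cgroup. The helper names are invented for this example; the
sample text is copied from the output above.

```python
SAMPLE = """\
10:hugetlb:/
9:perf_event:/
8:blkio:/
7:net_cls,net_prio:/
6:freezer:/
5:devices:/user.slice
4:memory:/user.slice
3:cpu,cpuacct:/
2:cpuset:/
1:name=systemd:/user.slice/user-1000.slice/user@1000.service/run-994.scope
"""


def parse_cgroup(text):
    """Map each controller name to the cgroup path it is attached at."""
    paths = {}
    for line in text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        _hierarchy_id, controllers, path = parts
        # Co-mounted controllers are listed comma-separated on one line.
        for controller in controllers.split(","):
            paths[controller] = path
    return paths


def controllers_outside_unit(text, unit):
    """Return the controllers whose path does not end in the given unit."""
    paths = parse_cgroup(text)
    return sorted(c for c, p in paths.items() if not p.endswith("/" + unit))
```

With the sample above, parse_cgroup(SAMPLE)["devices"] is "/user.slice",
i.e. the devices controller is not at the scope level, which matches the
DevicePolicy setting having no visible effect.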




[systemd-devel] Cleaning up transient scopes

2015-03-04 Thread Alexander Larsson
If I run a transient scope on the user systemd instance like:

$ systemd-run --user --scope true

Then the scope seems to live past the end of the process. Is there any
way to make it automatically go away with the last process in the
cgroup?




Re: [systemd-devel] Docker vs PrivateTmp

2015-02-02 Thread Alexander Larsson
On Mon, 2015-02-02 at 12:12 +0100, Lennart Poettering wrote:
 On Fri, 30.01.15 11:02, Alexander Larsson (al...@redhat.com) wrote:
 
  I think the problem is that docker daemon makes 
  /var/lib/docker/devicemapper private in the host namespace to handle
  some scalability issues we found in the kernel. This causes problems not
  with docker containers (because they unmount all other mounts as per the
  above), but with other namespace-using apps. For instance, if a service
  with PrivateTmp is launched, it will inherit the existing mounts
  in /var/lib/docker/devicemapper at the point of startup, but when these
  are eventually unmounted in the host namespace this is not propagated
  into the service (due to it being a private mount, not a slave mount).
  
  We could try making this slave instead, but I don't know if that then
  fixes the scalability issues we had, because they were related to
  stupidities in the kernel wrt propagating mounts. If it doesn't work,
  then we have to put docker-daemon in its own namespace.
 
 The daemon should first create its own namespace, and then detach
 propagation, not the other way round. This really isn't stupidity in
 the kernel, but in docker's userspace...

The stupidity was the O(n^4) algorithm in the kernel when it was
duplicating all vfsmounts that could possibly be propagated, and then
immediately freeing them when they did not propagate, which interacted
poorly with some lame kernel O(n^2) allocator behaviour.




Re: [systemd-devel] Docker vs PrivateTmp

2015-01-30 Thread Alexander Larsson
On Fri, 2015-01-23 at 11:31 -0500, Daniel J Walsh wrote:
 On 01/22/2015 10:02 PM, Lennart Poettering wrote:
  On Sat, 17.01.15 23:02, Lars Kellogg-Stedman (l...@redhat.com) wrote:
 
  See the `devicemapper` mountpoint created by Docker for the container:
 
  # grep devicemapper/mnt /proc/mounts
  /dev/mapper/docker-253:6-98310-e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62 /var/lib/docker/devicemapper/mnt/e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62 ext4 rw,context=system_u:object_r:svirt_sandbox_file_t:s0:c261,c1018,relatime,discard,stripe=16,data=ordered 0 0
  I am not sure why docker makes these mounts visible in the host
  namespace at all. This smells like a bug.

They need to at least be visible to the docker daemon, because it needs
to look into them to do diffs between images when e.g. committing. They
don't necessarily have to be in the host namespace though; they could be
in a different namespace owned only by the docker daemon. I wanted to do
that, but for reasons that escape me at the moment that was problematic
and I never got to it.

  Watch Docker fail to destroy the container because it is unable to remove 
  the mountpoint directory:
 
  Jan 17 22:43:03 pk115wp-lkellogg docker-1.4.1-dev[18239]:
  time=2015-01-17T22:43:03-05:00 level=error msg=Handler for DELETE
  /containers/{name:.*} returned error: Cannot destroy container e68df3f45d61:
  Driver devicemapper failed to remove root filesystem
  e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62: Device is Busy
  This smells as if Docker incorrectly sets the mount propagation bits
  on its own mounts.
 
  It would be good to check /proc/self/mountinfo inside and outside of
  docker's own namespace, and to check how the propagation bits are set
  for the individual mounts. It's a bit hard to read, but the
  interesting bits are in the 7th column of that file.
 
  In general: docker should do the equivalent of mount --make-rslave /
  as the first thing after opening its mount namespace, so that from that
  point on mounts and especially *un*mounts propagate from the host into
  the container, but not vice versa.
 
  If they do not invoke that, then the propagation will stay at
  shared, which means the mounts will appear in the host and vice
  versa, which is certainly undesired.
 
  Also, they should not use mount --make-rprivate /, as that means
  anything the host mounted will stay mounted in the container forever,
  which is a problem.
 
  Also, they really need to make this recursive, so that all mount
  points they have access to are detached from the host!

It has been a while since I looked at this, but I believe that the docker
containers run as MS_PRIVATE, and they explicitly unmount all the host
filesystems except the ones specifically mounted in as volumes.

I think the problem is that docker daemon makes 
/var/lib/docker/devicemapper private in the host namespace to handle
some scalability issues we found in the kernel. This causes problems not
with docker containers (because they unmount all other mounts as per the
above), but with other namespace-using apps. For instance, if a service
with PrivateTmp is launched, it will inherit the existing mounts
in /var/lib/docker/devicemapper at the point of startup, but when these
are eventually unmounted in the host namespace this is not propagated
into the service (due to it being a private mount, not a slave mount).

We could try making this slave instead, but I don't know if that then
fixes the scalability issues we had, because they were related to
stupidities in the kernel wrt propagating mounts. If it doesn't work,
then we have to put docker-daemon in its own namespace.
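
For anyone who wants to check the propagation state discussed in this
thread, the Python sketch below classifies each line of
/proc/<pid>/mountinfo by its optional fields (the 7th column onwards, up
to the "-" separator): "shared:N" means shared, "master:N" means slave,
and no optional fields means private. The function names are made up for
this illustration, and the "propagate_from:" field is ignored for brevity.

```python
def mount_propagation(line):
    """Return (mount_point, tags) for one /proc/<pid>/mountinfo line."""
    fields = line.split()
    # Optional fields start at index 6 and end at the "-" separator.
    separator = fields.index("-", 6)
    tags = []
    for optional in fields[6:separator]:
        if optional.startswith("shared:"):
            tags.append("shared")
        elif optional.startswith("master:"):
            tags.append("slave")
        elif optional == "unbindable":
            tags.append("unbindable")
    # A mount without any of these optional fields is private.
    return fields[4], tags or ["private"]


def report(path="/proc/self/mountinfo"):
    """Print the propagation tags of every mount visible in a namespace."""
    with open(path) as f:
        for line in f:
            if line.strip():
                mount_point, tags = mount_propagation(line)
                print(mount_point, ",".join(tags))
```

Running report() once on the host and once with the mountinfo file of a
process inside the service's namespace would show whether the mounts under
/var/lib/docker/devicemapper are private or slave in each namespace.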




Re: [systemd-devel] udev database backwards compatibility guarantees

2014-10-27 Thread Alexander Larsson
On Sat, 2014-10-25 at 13:45 +0200, Kay Sievers wrote:
  Kay, any ideas on the udev database stability?
 
 No stability. And so far no guarantees that things will not change.
 
 The versions of the udev daemon, libudev and the runtime data must
 match. Any expectations about version mix and match would require a
 promise we do not give at this moment.
 
 It might change with an imaginary sd-device library, but it is very
 unlikely to happen with the current udev.

So, libudev will not be supportable as bundled in a sandboxed app then?





[systemd-devel] udev database backwards compatibility guarantees

2014-09-11 Thread Alexander Larsson
Hi, I'm looking at creating a runtime/app thing for GNOME in the style
of:
http://0pointer.net/blog/revisiting-how-we-put-together-linux-systems.html

However, I noticed that some core dependencies like mesa use libudev
and, in fact, need additional user-set info that is not in sysfs. In
particular, mesa reads ID_PATH_TAG on render device nodes to pick which
GPU to use in multi-GPU situations (PRIME):
http://lists.freedesktop.org/archives/mesa-dev/2014-June/061798.html

It seems to me that this means I need the host /run/udev inside the
application. I know that the udev database format changed in the past,
but can I rely on it being stable in the future, even if the host udev
is upgraded to a later version than what is in the application runtime?

Of course, there is also the question of /dev and /sys management in
sandboxed apps in general. Clearly any modern app will require some
real devices for things like direct rendering. But it would be ideal to
not expose everything. How do we see this working?
