Re: [systemd-devel] DBus api of systemd user instance
On Sun, 2015-03-08 at 23:14 +0100, Lennart Poettering wrote:

On Sat, 07.03.15 08:45, Mantas Mikulėnas (graw...@gmail.com) wrote:

The latter is private, as the name suggests. Do not access it from external programs, it is systemd's internal hack around ordering issues with dbus, and nobody but systemd's own tools should access it. It is going away when kdbus arrives, if you make use of it, then your application will break.

I'd just like to mention that I'm using this private hack (optionally) for now in xdg-app. I'd love for this to use kdbus when that is available, but for now this is the only sane way to get a cgroup for an unprivileged app.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson    Red Hat, Inc
al...@redhat.com    alexander.lars...@gmail.com
He's an otherworldly one-eyed rock star searching for his wife's true killer. She's a pregnant motormouth Hell's Angel with an MBA from Harvard. They fight crime!
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] Cleaning up transient scopes
On Thu, 2015-03-05 at 00:00 +0100, Lennart Poettering wrote:

On Wed, 04.03.15 18:51, Alexander Larsson (al...@redhat.com) wrote:

If I run a transient scope on the user systemd instance like:

  $ systemd-run --user --scope true

Then the scope seems to live past the end of the process. Is there any way to make it automatically go away with the last process in the cgroup?

Well, yes, the idea is that that just works. However, this is kinda broken if the systemd instance managing your scope is not PID 1, as we don't get SIGCHLD then. Do you create any subcgroups? Presumably not? Normally it should just work then, but I must admit that --user scopes got much less testing than system scopes...

Oh, I'm not doing anything special at all:

  $ systemd-run --user --scope sleep 10
  [1] 10613
  $ Running as unit run-10613.scope.
  systemctl status --user run-10613.scope
  ● run-10613.scope - /usr/bin/sleep 10
     Loaded: loaded (/run/user/1000/systemd/user/run-10613.scope; static)
    Drop-In: /run/user/1000/systemd/user/run-10613.scope.d
             └─50-Description.conf
     Active: active (running) since tor 2015-03-05 10:06:24 CET; 8s ago
     CGroup: /user.slice/user-1000.slice/user@1000.service/run-10613.scope
             └─10613 /usr/bin/sleep 10

  mar 05 10:06:24 localhost.localdomain systemd[1405]: Starting /usr/bin/sleep 10.
  mar 05 10:06:24 localhost.localdomain systemd[1405]: Started /usr/bin/sleep 10.
  $ sleep 10
  [1]+ Done    systemd-run --user --scope sleep 10
  $ systemctl status --user run-10613.scope
  ● run-10613.scope - /usr/bin/sleep 10
     Loaded: loaded (/run/user/1000/systemd/user/run-10613.scope; static)
    Drop-In: /run/user/1000/systemd/user/run-10613.scope.d
             └─50-Description.conf
     Active: active (running) since tor 2015-03-05 10:06:24 CET; 25s ago

  mar 05 10:06:24 localhost.localdomain systemd[1405]: Starting /usr/bin/sleep 10.
  mar 05 10:06:24 localhost.localdomain systemd[1405]: Started /usr/bin/sleep 10.

See, even after the sleep command has died the scope still exists, and is even ACTIVE.
Also, while we're on the topic of scopes: is there any way to hang some random metadata off a unit during creation, that can be read back? For xdg-app I'd like to put information like the app id, the exact version, the security level, etc. into the scope. Then anyone talking to the app could go:

  getpeercred => cgroup => scope => unfakable (by the app) data about the application.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson    Red Hat, Inc
al...@redhat.com    alexander.lars...@gmail.com
He's an uncontrollable coffee-fuelled filmmaker for the 21st century. She's an enchanted tomboy doctor with a song in her heart and a spring in her step. They fight crime!
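The first two hops of the "getpeercred => cgroup => scope" chain can be sketched in Python roughly as follows. This is only an illustration, not how xdg-app does it: the function names (peer_pid, systemd_cgroup, unit_of, peer_scope) are made up for this example, it assumes Linux with the cgroup v1 "name=systemd" hierarchy shown elsewhere in this thread, and it ignores the race where the peer exits and its PID is reused before /proc is read.

```python
import socket
import struct
from typing import Optional


def peer_pid(sock: socket.socket) -> int:
    """Return the PID of the peer of a connected AF_UNIX socket (Linux only)."""
    # struct ucred holds three 32-bit values: pid, uid, gid.
    size = struct.calcsize("3i")
    creds = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    pid, _uid, _gid = struct.unpack("3i", creds)
    return pid


def systemd_cgroup(cgroup_text: str) -> Optional[str]:
    """Find the name=systemd path in the content of /proc/<pid>/cgroup."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-id:controller-list:path".
        parts = line.split(":", 2)
        if len(parts) == 3 and parts[1] == "name=systemd":
            return parts[2]
    return None


def unit_of(cgroup_path: str) -> str:
    """The last path component is the unit name, e.g. run-994.scope."""
    return cgroup_path.rstrip("/").rsplit("/", 1)[-1]


def peer_scope(sock: socket.socket) -> Optional[str]:
    """Map a connected socket to the unit its peer runs in, if any."""
    pid = peer_pid(sock)
    with open(f"/proc/{pid}/cgroup") as f:
        path = systemd_cgroup(f.read())
    return unit_of(path) if path else None
```

The last hop, reading metadata back from the scope, is the open question of this mail, so it is left out.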
[systemd-devel] Device cgroups for user systemd scopes
The user instance of systemd does not seem to apply the DevicePolicy for scopes. I.e. I can run:

  $ systemd-run --user --scope --property=DevicePolicy=strict glxgears
  Running as unit run-994.scope.
  ... runs fine, should fail to use DRI ...
  $ cat /run/user/1000/systemd/user/run-994.scope.d/50-DevicePolicy.conf
  [Scope]
  DevicePolicy=strict
  $ cat /proc/994/cgroup
  10:hugetlb:/
  9:perf_event:/
  8:blkio:/
  7:net_cls,net_prio:/
  6:freezer:/
  5:devices:/user.slice
  4:memory:/user.slice
  3:cpu,cpuacct:/
  2:cpuset:/
  1:name=systemd:/user.slice/user-1000.slice/user@1000.service/run-994.scope

This is with systemd-216-20.fc21.x86_64 from Fedora 21 under gnome.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson    Red Hat, Inc
al...@redhat.com    alexander.lars...@gmail.com
He's a world-famous Republican sorcerer with a mysterious suitcase handcuffed to his arm. She's a cynical hip-hop politician from the wrong side of the tracks. They fight crime!
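The /proc/994/cgroup listing above can be checked mechanically. The sketch below (the helper name controllers_outside_unit is made up for this example, and the cgroup v1 line format from the listing is assumed) reports every controller whose cgroup differs from the name=systemd path of the process. For the listing above it shows the devices controller sitting at /user.slice instead of in run-994.scope.

```python
SAMPLE = """\
10:hugetlb:/
9:perf_event:/
8:blkio:/
7:net_cls,net_prio:/
6:freezer:/
5:devices:/user.slice
4:memory:/user.slice
3:cpu,cpuacct:/
2:cpuset:/
1:name=systemd:/user.slice/user-1000.slice/user@1000.service/run-994.scope
"""


def controllers_outside_unit(cgroup_text):
    """Return {controller: path} for controllers not placed in the unit's cgroup."""
    paths = {}
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-id:controller-list:path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        _hierarchy, controllers, path = parts
        # Co-mounted controllers such as "cpu,cpuacct" share one path.
        for controller in controllers.split(","):
            paths[controller] = path
    unit_path = paths.pop("name=systemd", None)
    return {c: p for c, p in paths.items() if p != unit_path}


def report(cgroup_text=SAMPLE):
    """Print one line per controller that is outside the unit's cgroup."""
    for controller, path in sorted(controllers_outside_unit(cgroup_text).items()):
        print(f"{controller}: {path}")
```

To inspect a live process, pass the content of /proc/<pid>/cgroup to report() instead of the sample.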
[systemd-devel] Cleaning up transient scopes
If I run a transient scope on the user systemd instance like:

  $ systemd-run --user --scope true

Then the scope seems to live past the end of the process. Is there any way to make it automatically go away with the last process in the cgroup?

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson    Red Hat, Inc
al...@redhat.com    alexander.lars...@gmail.com
He's a gun-slinging gay filmmaker possessed of the uncanny powers of an insect. She's an elegant tomboy widow with someone else's memories. They fight crime!
Re: [systemd-devel] Docker vs PrivateTmp
On Mon, 2015-02-02 at 12:12 +0100, Lennart Poettering wrote:

On Fri, 30.01.15 11:02, Alexander Larsson (al...@redhat.com) wrote:

I think the problem is that docker daemon makes /var/lib/docker/devicemapper private in the host namespace to handle some scalability issues we found in the kernel. This causes problems not with docker containers (because they unmount all other mounts as per the above), but with other namespace-using apps. For instance, if a service with PrivateTmp is launched, it will inherit the existing mounts in /var/lib/docker/devicemapper at the point of startup, but when these are eventually unmounted in the host namespace this is not propagated into the service (due to it being a private mount, not a slave mount). We could try making this slave instead, but I don't know if that then fixes the scalability issues we had, because they were related to stupidities in the kernel wrt propagating mounts. If it doesn't work, then we have to put docker-daemon in its own namespace.

The daemon should first create its own namespace, and then detach propagation, not the other way round. This really isn't stupidity in the kernel, but in docker's userspace...

The stupidity was the O(n^4) algorithm in the kernel when it was duplicating all vfsmounts that could possibly be propagated, and then immediately freeing them when they did not propagate, which interacted poorly with some lame kernel O(n^2) allocator behaviour.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson    Red Hat, Inc
al...@redhat.com    alexander.lars...@gmail.com
He's an oversexed shark-wrestling rock star from the 'hood. She's a high-kicking cigar-chomping former first lady with the power to see death. They fight crime!
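The ordering described above (new mount namespace first, propagation change second) could look roughly like this when done through libc from Python. It is a sketch under assumptions, not Docker's code: the function names are made up for this example, it picks the recursive slave mode that corresponds to the "mount --make-rslave /" suggested elsewhere in this thread, it needs CAP_SYS_ADMIN, and it only works on Linux. The flag values are the ones from the Linux headers.

```python
import ctypes
import os

CLONE_NEWNS = 0x00020000  # <sched.h>: unshare the mount namespace
MS_REC = 0x4000           # <sys/mount.h>: apply to the whole subtree
MS_SLAVE = 1 << 19        # <sys/mount.h>: receive propagation, send none


def propagation_flags(recursive=True):
    """Flags for the equivalent of mount --make-rslave (or --make-slave)."""
    return MS_SLAVE | (MS_REC if recursive else 0)


def _check(result):
    """Raise OSError with the libc errno if a call returned non-zero."""
    if result != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def detach_from_host_mounts():
    """Create a private mount namespace, then make every mount in it a slave."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_ulong, ctypes.c_void_p]
    # Step 1: own namespace. Changes made after this do not affect the host.
    _check(libc.unshare(CLONE_NEWNS))
    # Step 2: only now change propagation, so the host's mounts stay untouched.
    _check(libc.mount(b"none", b"/", None, propagation_flags(), None))
```

Doing step 2 before step 1 would change propagation in the host namespace itself, which is the situation this thread is complaining about.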
Re: [systemd-devel] Docker vs PrivateTmp
On Fri, 2015-01-23 at 11:31 -0500, Daniel J Walsh wrote:

On 01/22/2015 10:02 PM, Lennart Poettering wrote:

On Sat, 17.01.15 23:02, Lars Kellogg-Stedman (l...@redhat.com) wrote:

See the `devicemapper` mountpoint created by Docker for the container:

  # grep devicemapper/mnt /proc/mounts
  /dev/mapper/docker-253:6-98310-e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62 /var/lib/docker/devicemapper/mnt/e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62 ext4 rw,context=system_u:object_r:svirt_sandbox_file_t:s0:c261,c1018,relatime,discard,stripe=16,data=ordered 0 0

I am not sure why docker makes these mounts visible in the host namespace at all. This smells like a bug.

They need to at least be visible to the docker daemon, because it needs to look into it to do diffs between images when e.g. committing. It doesn't necessarily have to be in the host namespace though, it could be in a different namespace owned only by the docker daemon. I wanted to do that, but for reasons that escape me at the moment that was problematic and I never got to it.

Watch Docker fail to destroy the container because it is unable to remove the mountpoint directory:

  Jan 17 22:43:03 pk115wp-lkellogg docker-1.4.1-dev[18239]: time=2015-01-17T22:43:03-05:00 level=error msg=Handler for DELETE /containers/{name:.*} returned error: Cannot destroy container e68df3f45d61: Driver devicemapper failed to remove root filesystem e68df3f45d6151259ce84a0e467a3117840084e99ef3bbc654b33f08d2d6dd62: Device is Busy

This smells as if Docker incorrectly sets the mount propagation bits on its own mounts. It would be good to check /proc/self/mountinfo inside and outside of docker's own namespace, and to check how the propagation bits are set for the individual mounts. It's a bit hard to read, but the interesting bits are in the 7th column of that file.
In general: docker should do the equivalent of mount --make-rslave / as first thing after opening its mount namespace, so that from that point on mounts and especially *un*mounts propagate from the host into the container, but not vice versa. If they do not invoke that, then the propagation will stay at shared, which means the mounts will appear in the host and vice versa, which is certainly undesired. Also, they should not use mount --make-rprivate /, as that means anything the host mounted will stay mounted in the container forever, which is a problem. Also, they really need to make this recursive, so that all mount points they have access to are detached from the host!

It was a while since I looked at this, but I believe that the docker containers run as MS_PRIVATE, and they explicitly unmount all the host filesystems except the ones specifically mounted in as volumes.

I think the problem is that docker daemon makes /var/lib/docker/devicemapper private in the host namespace to handle some scalability issues we found in the kernel. This causes problems not with docker containers (because they unmount all other mounts as per the above), but with other namespace-using apps. For instance, if a service with PrivateTmp is launched, it will inherit the existing mounts in /var/lib/docker/devicemapper at the point of startup, but when these are eventually unmounted in the host namespace this is not propagated into the service (due to it being a private mount, not a slave mount). We could try making this slave instead, but I don't know if that then fixes the scalability issues we had, because they were related to stupidities in the kernel wrt propagating mounts. If it doesn't work, then we have to put docker-daemon in its own namespace.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Alexander Larsson    Red Hat, Inc
al...@redhat.com    alexander.lars...@gmail.com
He's an impetuous amnesiac hairdresser who dotes on his loving old ma. She's an elegant Bolivian single mother from out of town. They fight crime!
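For the /proc/self/mountinfo check suggested in this message, a small reader for the optional-fields column might help. The sketch below assumes the line format documented for mountinfo (optional "tag[:value]" fields starting at the 7th column and terminated by a lone "-"); the function names and the sample lines in the usage note are made up for this example.

```python
def propagation(mountinfo_line):
    """Return (mount_point, propagation_state) for one line of mountinfo."""
    fields = mountinfo_line.split()
    # Optional fields sit between the mount options (6th column) and "-".
    separator = fields.index("-")
    tags = {field.split(":", 1)[0] for field in fields[6:separator]}
    if "shared" in tags and "master" in tags:
        state = "shared+slave"
    elif "shared" in tags:
        state = "shared"
    elif "master" in tags:
        state = "slave"
    elif "unbindable" in tags:
        state = "unbindable"
    else:
        # No propagation tag at all means the mount is private.
        state = "private"
    return fields[4], state


def report(path="/proc/self/mountinfo"):
    """Print the propagation state of every mount visible to this process."""
    with open(path) as f:
        for line in f:
            mount_point, state = propagation(line)
            print(f"{state:13} {mount_point}")
```

Running report() once on the host and once inside the namespace of interest gives two lists to compare. A devicemapper mount that shows up as "private" where "slave" was expected would match the behaviour described in this message.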
Re: [systemd-devel] udev database backwards compatibility guarantees
On Sat, 2014-10-25 at 13:45 +0200, Kay Sievers wrote:

Kay, any ideas on the udev database stability?

No stability. And so far no guarantees that things will not change. The versions of the udev daemon, libudev and the runtime data must match. Any expectations about version mix and match would require a promise we do not give at this moment. It might change with an imaginary sd-device library, but it is very unlikely to happen with the current udev.

So, libudev will not be supportable as bundled in a sandboxed app then?
[systemd-devel] udev database backwards compatibility guarantees
Hi,

I'm looking at creating a runtime/app thing for Gnome in the style of:

  http://0pointer.net/blog/revisiting-how-we-put-together-linux-systems.html

However, I noticed that some core dependencies like mesa use libudev, and in fact need user-set additional info that is not in sysfs. In particular, mesa reads ID_PATH_TAG on render device nodes to pick what GPU to use in multi-GPU situations (PRIME):

  http://lists.freedesktop.org/archives/mesa-dev/2014-June/061798.html

It seems to me that this means I need the host /run/udev inside the application. I know that the udev database format changed in the past, but can I rely on it being stable in the future, even if the host udev is rev'ed to a later version than what is in the application runtime?

Of course, there is also the question of /dev and /sys management in sandboxed apps in general. Clearly any modern app will require some real devices for things like direct rendering. But it would be ideal to not expose everything. How do we see this working?