Re: [systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-27 Thread Lennart Poettering
On Tue, 27.01.15 10:53, Christian Seiler (christ...@iwakd.de) wrote:

 LXC predates systemd by about 2 years. (And at the very beginning,
 systemd didn't support containers out of the box, so it predates
 systemd's container support by even more.) And at that time, doing that
 was a way to run sysvinit containers with no or minimal modification to
 /etc/inittab. So instead of saying that LXC breaks systemd's
 assumptions, you could also say systemd breaks LXC's assumptions. As I
 said: bubbles. ;-)

Well, LXC breaks everybody's assumptions, not just
systemd's. /dev/tty[1-6] refers to the VT, and TERM=linux is the right
$TERM for it. However, if you actually have a pty and an xterm behind
it then these settings will be incorrect for a ton of programs.

 Now I'm not going to argue with you that the method of doing
 $container_ttys= isn't vastly superior to what was there previously,
 because it is. So I don't disagree with the long-term solution at
 all.

Note that $container_ttys= is actually just a frontend for dynamically
instantiating container-getty@.service instances for the specified
ptys. You can just enable them statically too.

(And 'machinectl login' actually even instantiates them at
runtime to allow dynamic logins to a local container that registers
with it...)

 But LXC 1.0 just doesn't support this yet, so the question is what to do
 in the meantime. If I do what I described:
 
  - logind can't open /dev/tty0, so all VT management in there is
    disabled anyway
  - within systemd: vt_disallocate can't open /dev/tty0, so it just
    returns an error, but that error code is never checked in
    core/execute.c, so it just behaves as if the directive never had
    been there
  - getty@.service statically enabled just runs agetty, so really only
    $TERM is wrong

Well, it's also conditionalized to /dev/tty0. Instead of patching the
unit file you could as well just instantiate container-getty@.service
in /etc, get the right $TERM and be done with it.

 Speaking of, isn't there a bug in container-getty@.service?[*] It uses
 ConditionPathExists=/dev/pts/%I, starts agetty on pts/%I but sets
 TTYPath=/dev/%I instead of /dev/pts/%I... And having the utmp specifier
 be just a number (%I) instead of pts/%I is also probably weird.

True, and true!

Thanks for the pointer. Fixed in git!

 Fair enough[#], but did you receive my patches for the part about
 skipping on missing perms?

Yes, I have a huge backlog of unprocessed mail, and am currently
wading through it backwards in time. Sorry for the delay!

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-27 Thread Christian Seiler
On a general note: the stuff I mentioned that I did to modify the
container was just taken from the lxc-debian template that comes with
LXC 1.0, and I didn't have time to look at it thoroughly to see what's
actually needed there. The stuff I mentioned was more along the lines of
'what I did to get where I was if somebody wanted to reproduce it'
instead of 'I recommend doing that'.

On 27.01.2015 at 03:08, Lennart Poettering wrote:
  - explicitly enable getty@tty{1,2,3,4}.service
 
 Why?

Ah, it's nice to see we all live in our own bubbles. :-)

LXC predates systemd by about 2 years. (And at the very beginning,
systemd didn't support containers out of the box, so it predates
systemd's container support by even more.) And at that time, doing that
was a way to run sysvinit containers with no or minimal modification to
/etc/inittab. So instead of saying that LXC breaks systemd's
assumptions, you could also say systemd breaks LXC's assumptions. As I
said: bubbles. ;-)

Now I'm not going to argue with you that the method of doing
$container_ttys= isn't vastly superior to what was there previously,
because it is. So I don't disagree with the long-term solution at all.

But LXC 1.0 just doesn't support this yet, so the question is what to do
in the meantime. If I do what I described:

 - logind can't open /dev/tty0, so all VT management in there is
   disabled anyway
 - within systemd: vt_disallocate can't open /dev/tty0, so it just
   returns an error, but that error code is never checked in
   core/execute.c, so it just behaves as if the directive never had
   been there
 - getty@.service statically enabled just runs agetty, so really only
   $TERM is wrong
 - but that was wrong already with sysvinit containers, and I never had
   any major issues because of that

So yeah, it's not pretty, it shouldn't stay in the long run, I
completely agree with your reasoning why you don't like it, but
currently it does seem to 'work'.

That being said, thinking about it, I actually don't use this
feature myself (with kernels >= 3.12 or so, lxc-attach works quite well,
so I never actually had the need to use a console to log in to a
container, for normal purposes I use SSH anyway), so maybe I'm just
going to deactivate the whole thing in my local config anyway.

Speaking of, isn't there a bug in container-getty@.service?[*] It uses
ConditionPathExists=/dev/pts/%I, starts agetty on pts/%I but sets
TTYPath=/dev/%I instead of /dev/pts/%I... And having the utmp specifier
be just a number (%I) instead of pts/%I is also probably weird.

[*]
http://cgit.freedesktop.org/systemd/systemd/tree/units/container-ge...@.service.m4.in

  - mask systemd-udevd.service (haven't tested if that's actually needed,
the lxc-debian template also does this however)
 
 There's no point in doing that. udev uses
 ConditionPathIsReadWrite=/sys anyway, and is automatically skipped
 hence when /sys is read-only.

Ah, good point, so it is in fact not needed. No idea why that's in there
then. Perhaps from a historic attempt when systemd didn't have that
Condition in there?

  - touch /etc/fstab if you debootstrap it directly
 
 You can just remove it. You don't need it in containers (and not even
 on most hosts, unless you actually need to refer to external
 partitions that cannot be auto-configured).

Ah, indeed, just tried it. Interesting, why did I write that? No idea...

 I am tempted to just
 change nspawn to mount a private tmpfs into /run/user, too, as it
 already mounts /run anyway.

 That would solve /run-quota issues for CAP_SYS_ADMIN-less containers,
 but is unnecessary (although harmless) for those that do have it.

 I decided against doing this after all. [...] Hence, we either do
 something (possibly skipping it on missing perms) or, we don't do
 it at all, [...]

Fair enough[#], but did you receive my patches for the part about
skipping on missing perms?

http://lists.freedesktop.org/archives/systemd-devel/2015-January/027343.html
http://lists.freedesktop.org/archives/systemd-devel/2015-January/027344.html

[#] One could probably always do --tmpfs=/run/lock:options with nspawn
anyway, if one wants to explicitly do this.

Christian



Re: [systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-27 Thread Christian Seiler
On 27.01.2015 at 14:46, Lennart Poettering wrote:
 Note that $container_ttys= is actually just a frontend for dynamically
 instantiating container-getty@.service instances for the specified
 ptys. You can just enable them statically too.

No, I can't, because you only support PTY numbers in that interface and
I can't predict which ones will get assigned. Oh, I see now: I can use
../ in the instance name of a statically enabled unit, and that actually
works. If I now combine that with LXC's feature to add a subdir to the
ttys, I can do the following:

 - tell LXC to create /dev/lxc/ttyN instead
 - statically enable container-getty@..-lxc-ttyN.service
 - just tried it: works
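
For readers wondering why the "..-lxc-ttyN" instance name works, here is
a minimal Python sketch. It assumes the usual systemd unit-name escaping
rules ("-" stands for "/", "\xNN" for an escaped byte) and that the unit
refers to /dev/pts/%I, as discussed in this thread. The helper names are
made up for illustration and are not part of systemd.

```python
import os
import re


def unescape_instance(instance):
    """Simplified systemd-style unescaping of a unit instance name.

    "-" becomes "/", and "\\xNN" becomes the character with hex code NN.
    """
    with_slashes = instance.replace("-", "/")
    return re.sub(
        r"\\x([0-9a-fA-F]{2})",
        lambda m: chr(int(m.group(1), 16)),
        with_slashes,
    )


def getty_device(instance):
    """Device path a unit using /dev/pts/%I would end up referring to."""
    # normpath resolves ".." lexically, which matches the kernel's view
    # here because /dev/pts/.. is /dev.
    return os.path.normpath(os.path.join("/dev/pts", unescape_instance(instance)))


print(getty_device("..-lxc-tty1"))
print(getty_device("0"))
```

So a statically enabled container-getty@..-lxc-tty1.service ends up on
/dev/lxc/tty1, while a plain number still selects a pty below /dev/pts.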

Thanks!

Christian



Re: [systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-26 Thread Lennart Poettering
On Fri, 23.01.15 19:35, Christian Seiler (christ...@iwakd.de) wrote:

  - explicitly enable getty@tty{1,2,3,4}.service

Why? This cannot work. The getty services assume a Linux console tty,
they will issue ioctls and ANSI sequences that only the Linux console
supports, and do VT management on them.

/dev/tty1, /dev/tty2 and so on must refer to proper VT devices, that
come with the matching files in /sys/class/tty/, with matching
/dev/vcsa* and so on. Mounting a pseudo-tty to /dev/tty1, /dev/tty2 and
so on is a *really* bad idea.

If LXC suggests such a configuration, then please talk to the LXC
guys, this is *very* broken and should not be done. You'll confuse
systemd, logind and the gettys with that. You'll get an incorrect
TERM, and everything else will be fucked up, too.

Also, there's really no reason to do this. Just create as many ptys as
you wish, and then pass $container_ttys= to PID 1 with their names,
and systemd will do the right thing. See
http://www.freedesktop.org/wiki/Software/systemd/ContainerInterface/
for details.
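
As a rough illustration of the container-manager side, the following
Python sketch builds a value for $container_ttys= from a list of pty
device paths. The value format (pty names relative to /dev, separated by
spaces) is an assumption based on the Container Interface page linked
above, not something spelled out in this thread, and both helper names
are hypothetical. A real container manager would also have to make sure
the ptys are visible inside the container's own /dev/pts, which this
sketch does not attempt.

```python
import os


def container_ttys_value(pty_paths):
    """Turn ["/dev/pts/5", "/dev/pts/8"] into "pts/5 pts/8"."""
    names = []
    for path in pty_paths:
        if not path.startswith("/dev/pts/"):
            # Only ptys are meant to be listed, not VTs or other ttys.
            raise ValueError("not a pty path: " + path)
        names.append(path[len("/dev/"):])
    return " ".join(names)


def allocate_ptys(count):
    """Open `count` ptys; return (master_fds, slave_paths).

    Hypothetical helper: the caller keeps the master ends open and
    passes the slave names on to PID 1 of the container.
    """
    masters, paths = [], []
    for _ in range(count):
        master, slave = os.openpty()
        paths.append(os.ttyname(slave))
        os.close(slave)
        masters.append(master)
    return masters, paths
```

The resulting string would then be placed in the environment of the
container's PID 1, for example env["container_ttys"] = "pts/5 pts/8".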

  - no ConditionPathExists=/dev/tty0 for getty@.service

Yeah, well, this stuff is there for a reason. Don't remove that
stuff. This automatically disables the VT logic if no VT is
available. You shouldn't hack around that. Just use proper container
gettys instead (container-getty@.service, which are automatically
instantiated via $container_ttys= among others).

  - mask systemd-udevd.service (haven't tested if that's actually needed,
the lxc-debian template also does this however)

There's no point in doing that. udev uses
ConditionPathIsReadWrite=/sys anyway, and is automatically skipped
hence when /sys is read-only. Container managers really should set up
/sys read-only in containers, so that the various containers don't
confuse each other by all trying to manage and change /sys, and more
importantly cannot fuck with security-sensitive settings.
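
To see from inside a container whether /sys is mounted read-only, a
check along these lines can be used. This is only an approximation of
what ConditionPathIsReadWrite= tests (it looks at the mount flags via
statvfs and nothing else); how systemd implements the condition exactly
is not covered in this thread, and the function name is made up.

```python
import os


def path_is_read_write(path):
    """Return True if the file system backing `path` is not mounted read-only."""
    flags = os.statvfs(path).f_flag
    return not (flags & os.ST_RDONLY)


if os.path.exists("/sys"):
    state = "read-write" if path_is_read_write("/sys") else "read-only"
    print("/sys is " + state)
```

If this reports read-only, masking systemd-udevd.service by hand should
not be needed, since the unit is skipped on its own.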

  - touch /etc/fstab if you debootstrap it directly

You can just remove it. You don't need it in containers (and not even
on most hosts, unless you actually need to refer to external
partitions that cannot be auto-configured).

  - I hope I didn't forget anything

I spent quite some time to ensure that systemd systems work
out-of-the-box in container managers. Any container manager that
implements this stuff
http://www.freedesktop.org/wiki/Software/systemd/ContainerInterface/
should just work out-of-the-box, without *any* modification of the
system to boot.

  I am tempted to just
  change nspawn to mount a private tmpfs into /run/user, too, as it
  already mounts /run anyway.
 
 That would solve /run-quota issues for CAP_SYS_ADMIN-less containers,
 but is unnecessary (although harmless) for those that do have it.

I decided against doing this after all. I think that systemd in a
container and on bare metal should work as similarly as possible, and
thus not have orthogonal setups in /run. Hence, we either do something
(possibly skipping it on missing perms) or we don't do it at all,
but we don't do completely different things in different cases.

 (Note that in Debian you can also configure it to be on the same tmpfs
 as /run, but since on Debian it has mode 1777, there's a good reason NOT
 to do that.)

Yuck. Maybe Debian should lock that down. World-writable directories
are dangerous, nobody should use that.

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-26 Thread Cameron Norman
On Mon, Jan 26, 2015 at 6:08 PM, Lennart Poettering
lenn...@poettering.net wrote:
 On Fri, 23.01.15 19:35, Christian Seiler (christ...@iwakd.de) wrote:

  - I hope I didn't forget anything

 I spent quite some time to ensure that systemd systems work
 out-of-the-box in container managers. Any container manager that
 implements this stuff
 http://www.freedesktop.org/wiki/Software/systemd/ContainerInterface/
 should just work out-of-the-box, without *any* modification of the
 system to boot.

Indeed, it seems the only thing that LXC needs to do to meet systemd
is add an lxc.ptys option and phase out lxc.ttys in favor of that.

I have filed a bug on LXC here: https://github.com/lxc/lxc/issues/419

Cheers,
--
Cameron


Re: [systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-23 Thread Christian Seiler
On 23.01.2015 at 18:57, Lennart Poettering wrote:
 On 2015-01-23 at 08:29, Mantas Mikulėnas wrote:
 IIRC, the reason for tmpfs on /run/user/* was lack of tmpfs quotas...
 if that's still a problem, maybe there could be one tmpfs at /run/user,
 still preventing users from touching root-only /run?

 Yes, that's a good idea. Initially when posting this thread I thought
 that there just had to be a trade-off between dropping CAP_SYS_ADMIN
 (and making it more difficult to escape the container), and a user
 inside the container DOSing the container by filling up /run.

 But with your idea, I can at least separate /run/user from /run
 itself 
 
 Hmm, which container manager are you using?

LXC 1.0.6 (which comes with Debian Jessie). I use the following
configuration for containers w/o CAP_SYS_ADMIN (note: I'm not claiming
this is secure (non-userns-containers may never be), and also this is
still work in progress and I'm only posting this as a proof of concept
and so that other people can reproduce it):

/etc/lxc/lxc.conf:

lxc.cgroup.use = @all

/etc/lxc/jessie-container.conf:

# This is still work in progress, I can probably get rid of some of
# those FSs, I'm not really comfortable with e.g. debugfs there.
# But if I remove them, I'll probably have to mask the units unless I
# want error messages on every container startup, and I would really
# like to keep the delta low... Still thinking about that.
lxc.mount.auto = proc sys cgroup:mixed
lxc.mount.entry = tmpfs dev/shm tmpfs rw,nosuid,nodev,create=dir 0 0
lxc.mount.entry = tmpfs run tmpfs rw,nosuid,nodev,mode=755,create=dir 0 0
lxc.mount.entry = tmpfs run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k,create=dir 0 0
lxc.mount.entry = debugfs sys/kernel/debug debugfs rw,relatime 0 0
lxc.mount.entry = mqueue dev/mqueue mqueue rw,relatime,create=dir 0 0
lxc.mount.entry = hugetlbfs dev/hugepages hugetlbfs rw,relatime,create=dir 0 0
# here I'll probably add the /run/user entry
lxc.tty = 4
lxc.pts = 1024

lxc.cap.drop = sys_admin sys_module mac_admin mac_override net_admin sys_time syslog

lxc.cgroup.devices.deny = a
lxc.cgroup.devices.allow = c *:* m
lxc.cgroup.devices.allow = b *:* m
lxc.cgroup.devices.allow = c 1:3 rwm   #/dev/null
lxc.cgroup.devices.allow = c 1:5 rwm   #/dev/zero
lxc.cgroup.devices.allow = c 1:7 rwm   #/dev/full
lxc.cgroup.devices.allow = c 5:0 rwm   #/dev/tty
lxc.cgroup.devices.allow = c 1:8 rwm   #/dev/random
lxc.cgroup.devices.allow = c 1:9 rwm   #/dev/urandom
lxc.cgroup.devices.allow = c 5:2 rwm   #/dev/pts/ptmx
lxc.cgroup.devices.allow = c 136:* rwm #/dev/pts/*
lxc.cgroup.devices.allow = c 254:0 rm  #/dev/rtc{,0}
lxc.cgroup.devices.allow = c 10:228 rm #/dev/hpet

# this is just the Debian default, I didn't change anything
# there (so far):
lxc.seccomp = /usr/share/lxc/config/common.seccomp

lxc.autodev = 1
lxc.kmsg = 0

lxc.haltsignal = SIGRTMIN+14

/var/lib/lxc/$container/config:

lxc.include = /etc/lxc/jessie-container.conf
lxc.utsname = something
lxc.rootfs  = /path/to/something
lxc.arch    = amd64

# network:
lxc.network.type = veth
# (and other directives that specify IP etc.)

Also inside the container the following changes w.r.t. vanilla Jessie:

 - explicitly enable getty@tty{1,2,3,4}.service
 - no ConditionPathExists=/dev/tty0 for getty@.service
 - mask systemd-udevd.service (haven't tested if that's actually needed,
   the lxc-debian template also does this however)
 - touch /etc/fstab if you debootstrap it directly
 - I hope I didn't forget anything

Didn't try other Distros inside the containers yet (or LXC w/ systemd on
other distros for that matter).

Also, on the host, I DON'T have cgmanager or similar installed.

 I am tempted to just
 change nspawn to mount a private tmpfs into /run/user, too, as it
 already mounts /run anyway.

That would solve /run-quota issues for CAP_SYS_ADMIN-less containers,
but is unnecessary (although harmless) for those that do have it.

 (the same way mode=1777 /run/lock is a separate tmpfs already)
 by just a simple static mount entry for the container.
 
 Hmm, /run/lock is a separate tmpfs? /run/lock is a pretty useless,
 legacy thing. Which distro is this?

Debian Jessie. But a box with Fedora 19 here also has it (although not
mode=1777 but mode=0755) and in both Debian Jessie and Fedora 19 there
is some stuff in there. Although on Fedora it's not a separate tmpfs.

(Note that in Debian you can also configure it to be on the same tmpfs
as /run, but since on Debian it has mode 1777, there's a good reason NOT
to do that.)

Christian



Re: [systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-23 Thread Lennart Poettering
On Fri, 23.01.15 15:45, Christian Seiler (christ...@iwakd.de) wrote:

 On 2015-01-23 at 08:29, Mantas Mikulėnas wrote:
 IIRC, the reason for tmpfs on /run/user/* was lack of tmpfs quotas...
 if that's still a problem, maybe there could be one tmpfs at /run/user,
 still preventing users from touching root-only /run?
 
 Yes, that's a good idea. Initially when posting this thread I thought
 that there just had to be a trade-off between dropping CAP_SYS_ADMIN
 (and making it more difficult to escape the container), and a user
 inside the container DOSing the container by filling up /run.
 
 But with your idea, I can at least separate /run/user from /run
 itself 

Hmm, which container manager are you using? I am tempted to just
change nspawn to mount a private tmpfs into /run/user, too, as it
already mounts /run anyway.

 (the same way mode=1777 /run/lock is a separate tmpfs already)
 by just a simple static mount entry for the container.

Hmm, /run/lock is a separate tmpfs? /run/lock is a pretty useless,
legacy thing. Which distro is this?

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-23 Thread David Herrmann
Hi

On Thu, Jan 22, 2015 at 3:53 PM, Christian Seiler christ...@iwakd.de wrote:
 [1] Note that the only other issue I stumbled upon has now been fixed,
 so in general I would say that systemd already works really well
 in containers without CAP_SYS_ADMIN if you know how to set them
 up properly.

Just as a heads-up: The device-delegation API
(src/logind/logind-session-device.c) will also fail if you run without
CAP_SYS_ADMIN. Admittedly, DRM and input devices usually don't matter
in containers, so it's fine. But on main systems, we really need
CAP_SYS_ADMIN.

Thanks
David


Re: [systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-23 Thread Christian Seiler

On 2015-01-23 at 08:29, Mantas Mikulėnas wrote:
 IIRC, the reason for tmpfs on /run/user/* was lack of tmpfs quotas...
 if that's still a problem, maybe there could be one tmpfs at /run/user,
 still preventing users from touching root-only /run?


Yes, that's a good idea. Initially when posting this thread I thought
that there just had to be a trade-off between dropping CAP_SYS_ADMIN
(and making it more difficult to escape the container), and a user
inside the container DOSing the container by filling up /run.

But with your idea, I can at least separate /run/user from /run
itself (the same way mode=1777 /run/lock is a separate tmpfs already)
by just a simple static mount entry for the container.

Thanks for bringing this up!

Christian



Re: [systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-23 Thread Michael Biebl
2015-01-23 8:29 GMT+01:00 Mantas Mikulėnas graw...@gmail.com:
 IIRC, the reason for tmpfs on /run/user/* was lack of tmpfs quotas... if
 that's still a problem, maybe there could be one tmpfs at /run/user, still
 preventing users from touching root-only /run?

FWIW, as long as logind didn't set up per-user tmpfs, we used such a
/run/user tmpfs in Debian to avoid users accidentally DoSing the
system by filling up /run.

Michael

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?


Re: [systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-23 Thread Lennart Poettering
On Fri, 23.01.15 09:29, Mantas Mikulėnas (graw...@gmail.com) wrote:

 On Fri, Jan 23, 2015 at 4:04 AM, Lennart Poettering lenn...@poettering.net
 wrote:
 
  On Thu, 22.01.15 15:53, Christian Seiler (christ...@iwakd.de) wrote:
 
   Nevertheless, I think it would be great if this could also be fixed,
   because you never know what other applications people might come up
   with.
  
   The solution would probably be to just add a code path to chown
   the directory instead of mounting a tmpfs on top of it. That doesn't
   separate users from root inside the container quite as much, but in
   containers without CAP_SYS_ADMIN, I think that's a trade-off that's
   worth making.
  
   What do you think?
 
  Yeah, I agree. If we cannot mount the tmpfs due to EPERM we should add
  a fallback to use a simple directory instead. Would be happy to take a
  patch for that.
 
 
 IIRC, the reason for tmpfs on /run/user/* was lack of tmpfs quotas... if
 that's still a problem, maybe there could be one tmpfs at /run/user, still
 preventing users from touching root-only /run?

Well, logind cannot mount that either. If so, the container manager
would have to mount that, which it can; logind should be happy with
it.

In general though I think our code paths should be "do it fully" or
"skip it if we lack the perms". I am not a fan of adding a multitude
of additional code paths along the lines of "try something different
if we lack the perms"...

Hence, let's keep this simple: either we mount per-user tmpfs, or we
don't, but let's not invent complex fallback strategies...

I mean, I am not sure I am convinced that CAP_SYS_ADMIN-less
containers really make that much sense anyway, and I think people
should be OK with them not providing the same guarantees as the ones
that have it...

Lennart

-- 
Lennart Poettering, Red Hat


[systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-22 Thread Christian Seiler

I've been playing around with systemd on Debian Jessie in
CAP_SYS_ADMIN-less containers and I came upon the following issue[1]:

Without CAP_SYS_ADMIN, logind is unable to mount a per-user tmpfs to
/run/user/$UID. Relevant journal messages:

systemd-logind[48]: Failed to mount per-user tmpfs directory /run/user/600: Operation not permitted
sshd[1357]: pam_systemd(sshd:session): Failed to create session: Access denied


The user is still allowed to log in, but there are some unwanted side
effects:

 - ls -l /run/user
   total 0
   drwx------ 2 root root 40 Jan 22 15:00 0
   drwx------ 2 root root 40 Jan 22 14:46 600

   Therefore, /run/user/$UID is effectively useless because the
   permissions are wrong (logind aborts after mkdir but before
   mount). Also: lack of cleanup on this error could be considered
   a second (more minor) problem.

 - XDG_RUNTIME_DIR not set (pam_systemd aborts beforehand)

 - user not registered in logind (loginctl doesn't show user)

 - user not put in cgroup (logind aborts before that logic happens),
   they stay in the getty@tty*.service / ssh.service cgroup.

 - probably some more stuff related to this

Obviously, without CAP_SYS_ADMIN you can never mount anything, not even
a tmpfs, to /run/user/$UID, so it's clear why it doesn't work.

For now, this is mostly a cosmetic issue for me, because I don't
really need logind functionality in such containers, so it's not a
huge problem for me.

Nevertheless, I think it would be great if this could also be fixed,
because you never know what other applications people might come up
with.

The solution would probably be to just add a code path to chown
the directory instead of mounting a tmpfs on top of it. That doesn't
separate users from root inside the container quite as much, but in
containers without CAP_SYS_ADMIN, I think that's a trade-off that's
worth making.
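
The idea, written down as a small Python sketch rather than as a patch
against logind: try the tmpfs mount first, and only if that fails with
EPERM fall back to a plain directory owned by the user. The function
name and the mount_tmpfs callback are hypothetical stand-ins for
whatever logind does internally; nothing here is taken from the systemd
sources.

```python
import errno
import os


def setup_runtime_dir(path, uid, gid, mount_tmpfs, chown=os.chown):
    """Prepare /run/user/$UID; return "tmpfs" or "directory".

    mount_tmpfs(path, uid, gid) is a caller-supplied function that is
    expected to raise OSError with errno EPERM when mounting is not
    permitted (e.g. without CAP_SYS_ADMIN).
    """
    os.makedirs(path, mode=0o700, exist_ok=True)
    try:
        mount_tmpfs(path, uid, gid)
        return "tmpfs"
    except OSError as e:
        if e.errno != errno.EPERM:
            raise  # any other failure is still treated as an error
    # Fallback: no separate file system, but at least correct ownership,
    # so the directory is usable as XDG_RUNTIME_DIR.
    chown(path, uid, gid)
    os.chmod(path, 0o700)
    return "directory"
```

With this, the directory would no longer be left behind owned by root
when the mount is refused.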

What do you think?

Regards,
Christian

[1] Note that the only other issue I stumbled upon has now been fixed,
so in general I would say that systemd already works really well
in containers without CAP_SYS_ADMIN if you know how to set them
up properly.



Re: [systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-22 Thread Lennart Poettering
On Thu, 22.01.15 15:53, Christian Seiler (christ...@iwakd.de) wrote:

 Nevertheless, I think it would be great if this could also be fixed,
 because you never know what other applications people might come up
 with.
 
 The solution would probably be to just add a code path to chown
 the directory instead of mounting a tmpfs on top of it. That doesn't
 separate users from root inside the container quite as much, but in
 containers without CAP_SYS_ADMIN, I think that's a trade-off that's
 worth making.
 
 What do you think?

Yeah, I agree. If we cannot mount the tmpfs due to EPERM we should add
a fallback to use a simple directory instead. Would be happy to take a
patch for that.

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] logind vs CAP_SYS_ADMIN-lessness

2015-01-22 Thread Mantas Mikulėnas
On Fri, Jan 23, 2015 at 4:04 AM, Lennart Poettering lenn...@poettering.net
wrote:

 On Thu, 22.01.15 15:53, Christian Seiler (christ...@iwakd.de) wrote:

  Nevertheless, I think it would be great if this could also be fixed,
  because you never know what other applications people might come up
  with.
 
  The solution would probably be to just add a code path to chown
  the directory instead of mounting a tmpfs on top of it. That doesn't
  separate users from root inside the container quite as much, but in
  containers without CAP_SYS_ADMIN, I think that's a trade-off that's
  worth making.
 
  What do you think?

 Yeah, I agree. If we cannot mount the tmpfs due to EPERM we should add
 a fallback to use a simple directory instead. Would be happy to take a
 patch for that.


IIRC, the reason for tmpfs on /run/user/* was lack of tmpfs quotas... if
that's still a problem, maybe there could be one tmpfs at /run/user, still
preventing users from touching root-only /run?

-- 
Mantas Mikulėnas graw...@gmail.com