Re: [systemd-devel] systemd-nspawn create container under unprivileged user

2015-02-11 Thread Lennart Poettering
On Wed, 11.02.15 17:53, Djalal Harouni (tix...@opendz.org) wrote:

> On Wed, Feb 11, 2015 at 05:06:56PM +0100, Lennart Poettering wrote:
> > On Wed, 11.02.15 13:53, Djalal Harouni (tix...@opendz.org) wrote:
> > 
> > > On Tue, Feb 10, 2015 at 12:52:34PM +0100, Lennart Poettering wrote:
> > > > On Thu, 05.02.15 02:03, Vasiliy Tolstov (v.tols...@selfip.ru) wrote:
> > > > 
> > > > > Hello!
> > > > > Is it possible to create a container as a regular user? Or what capabilities
> > > > > do I need to add to create a container without using root?
> > > > 
> > > > Invoking containers without privileges is not supported by nspawn, and
> > > > this is unlikely to change, as I fail to see any strong use case for
> > > > this...
> > > >
> > > > If somebody can enlighten me about the use case for allowing
> > > > containers to be run by unprivileged users, I'd be willing to change
> > > > my mind though...
> > > A quick argument against it, IOW just wait and see!
> > > 
> > > As unprivileged we don't have CAP_SYS_MODULE set, but inside
> > > unprivileged containers we are root, and a call to cap_get_flag() on
> > > CAP_SYS_MODULE will return CAP_SET! but hey in reality this is not true,
> > > we don't have CAP_SYS_MODULE... this will confuse programs running
> > > inside containers, we'll have to add more code paths for this special
> > > case... and not only CAP_SYS_MODULE, perhaps there are other cases...
> > 
> > Well, but we could drop CAP_SYS_MODULE both before and after setting
> > > up the userns, so that the cap is missing for the PID both inside and
> > outside of it...
> Indeed, yes but still there are other obscure cases, like CAP_SYS_ADMIN,
> even if you have it, you won't be able to mount file systems like btrfs
> and others, only a subset of virtual filesystems support unprivileged
> user mounting... yeah, we could drop it too, and it seems that systemd was
> adapted recently to work in this situation, but what about other code?
> Or what if you want to do some sort of system replication inside the
> container...

Well, some mounting is allowed if you have CAP_SYS_ADMIN, so we can
pass this out, I figure...

Note that the inability to mount btrfs shouldn't be too limiting,
since we don't expose physical devices in nspawn anyway, and what you
don't have you cannot mount anyway...

Lennart

-- 
Lennart Poettering, Red Hat
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] systemd-nspawn create container under unprivileged user

2015-02-11 Thread Djalal Harouni
On Wed, Feb 11, 2015 at 05:06:56PM +0100, Lennart Poettering wrote:
> On Wed, 11.02.15 13:53, Djalal Harouni (tix...@opendz.org) wrote:
> 
> > On Tue, Feb 10, 2015 at 12:52:34PM +0100, Lennart Poettering wrote:
> > > On Thu, 05.02.15 02:03, Vasiliy Tolstov (v.tols...@selfip.ru) wrote:
> > > 
> > > > Hello!
> > > > Is it possible to create a container as a regular user? Or what capabilities
> > > > do I need to add to create a container without using root?
> > > 
> > > Invoking containers without privileges is not supported by nspawn, and
> > > this is unlikely to change, as I fail to see any strong use case for
> > > this...
> > >
> > > If somebody can enlighten me about the use case for allowing
> > > containers to be run by unprivileged users, I'd be willing to change
> > > my mind though...
> > A quick argument against it, IOW just wait and see!
> > 
> > As unprivileged we don't have CAP_SYS_MODULE set, but inside
> > unprivileged containers we are root, and a call to cap_get_flag() on
> > CAP_SYS_MODULE will return CAP_SET! but hey in reality this is not true,
> > we don't have CAP_SYS_MODULE... this will confuse programs running
> > inside containers, we'll have to add more code paths for this special
> > case... and not only CAP_SYS_MODULE, perhaps there are other cases...
> 
> Well, but we could drop CAP_SYS_MODULE both before and after setting
> up the userns, so that the cap is missing for the PID both inside and
> outside of it...
Indeed, yes but still there are other obscure cases, like CAP_SYS_ADMIN,
even if you have it, you won't be able to mount file systems like btrfs
and others, only a subset of virtual filesystems support unprivileged
user mounting... yeah, we could drop it too, and it seems that systemd was
adapted recently to work in this situation, but what about other code?
Or what if you want to do some sort of system replication inside the
container...

I guess we'll end up trying to figure out whether this is the real
capability or the diminished version... or whether we are inside a userns...


> Lennart
> 
> -- 
> Lennart Poettering, Red Hat

-- 
Djalal Harouni
http://opendz.org


Re: [systemd-devel] systemd-nspawn create container under unprivileged user

2015-02-11 Thread Lennart Poettering
On Wed, 11.02.15 13:53, Djalal Harouni (tix...@opendz.org) wrote:

> On Tue, Feb 10, 2015 at 12:52:34PM +0100, Lennart Poettering wrote:
> > On Thu, 05.02.15 02:03, Vasiliy Tolstov (v.tols...@selfip.ru) wrote:
> > 
> > > Hello!
> > > Is it possible to create a container as a regular user? Or what capabilities
> > > do I need to add to create a container without using root?
> > 
> > Invoking containers without privileges is not supported by nspawn, and
> > this is unlikely to change, as I fail to see any strong use case for
> > this...
> >
> > If somebody can enlighten me about the use case for allowing
> > containers to be run by unprivileged users, I'd be willing to change
> > my mind though...
> A quick argument against it, IOW just wait and see!
> 
> As unprivileged we don't have CAP_SYS_MODULE set, but inside
> unprivileged containers we are root, and a call to cap_get_flag() on
> CAP_SYS_MODULE will return CAP_SET! but hey in reality this is not true,
> we don't have CAP_SYS_MODULE... this will confuse programs running
> inside containers, we'll have to add more code paths for this special
> case... and not only CAP_SYS_MODULE, perhaps there are other cases...

Well, but we could drop CAP_SYS_MODULE both before and after setting
up the userns, so that the cap is missing for the PID both inside and
outside of it...

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] systemd-nspawn create container under unprivileged user

2015-02-11 Thread Djalal Harouni
On Tue, Feb 10, 2015 at 12:52:34PM +0100, Lennart Poettering wrote:
> On Thu, 05.02.15 02:03, Vasiliy Tolstov (v.tols...@selfip.ru) wrote:
> 
> > Hello!
> > Is it possible to create a container as a regular user? Or what capabilities
> > do I need to add to create a container without using root?
> 
> Invoking containers without privileges is not supported by nspawn, and
> this is unlikely to change, as I fail to see any strong use case for
> this...
>
> If somebody can enlighten me about the use case for allowing
> containers to be run by unprivileged users, I'd be willing to change
> my mind though...
A quick argument against it, IOW just wait and see!

As unprivileged we don't have CAP_SYS_MODULE set, but inside
unprivileged containers we are root, and a call to cap_get_flag() on
CAP_SYS_MODULE will return CAP_SET! but hey in reality this is not true,
we don't have CAP_SYS_MODULE... this will confuse programs running
inside containers, we'll have to add more code paths for this special
case... and not only CAP_SYS_MODULE, perhaps there are other cases...


-- 
Djalal Harouni
http://opendz.org


Re: [systemd-devel] systemd-nspawn create container under unprivileged user

2015-02-10 Thread Lennart Poettering
On Thu, 05.02.15 15:48, Vasiliy Tolstov (v.tols...@selfip.ru) wrote:

> 2015-02-05 12:44 GMT+03:00 Alban Crequy :
> 
> > Manual page namespaces(7):
> >
> >    Creation of new namespaces using clone(2) and unshare(2) in most cases
> >    requires the CAP_SYS_ADMIN capability.  User namespaces are the
> >    exception: since Linux 3.8, no privilege is required to create a user
> >    namespace.
> >
> 
> So as i understand i can't create full featured container with network
> under non root user (and not have cap_sys_admin)

Unprivileged containers are unlikely to ever support that. Creating a
network interface on the host will necessarily require privileges. If
you hence want "full network" support (by which I assume you mean veth
links and stuff), then you are generally out of luck...

You can run nspawn containers without CAP_SYS_ADMIN via nspawn's
--drop-capability=CAP_SYS_ADMIN switch. However, YMMV, as the code you
run inside of the container must be OK with not having those
permissions, and systemd, at least until very recently, didn't like that
at all...

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] systemd-nspawn create container under unprivileged user

2015-02-10 Thread Lennart Poettering
On Thu, 05.02.15 02:03, Vasiliy Tolstov (v.tols...@selfip.ru) wrote:

> Hello!
> Is it possible to create a container as a regular user? Or what capabilities
> do I need to add to create a container without using root?

Invoking containers without privileges is not supported by nspawn, and
this is unlikely to change, as I fail to see any strong use case for
this...

If somebody can enlighten me about the use case for allowing
containers to be run by unprivileged users, I'd be willing to change
my mind though...

Note that to my knowledge any support for unprivileged containers has
been disabled in the kernel on many distros though including Fedora's,
since it's basically one giant security hole.

Note that many of machinectl's commands involve polkit checks, which
means it's easy to open them up for unprivileged clients. However,
in that case the containers would be forked off and maintained
privileged, only the clients will be unprivileged...

LXC supports unprivileged containers though, this might be an option
for you.

Lennart

-- 
Lennart Poettering, Red Hat


Re: [systemd-devel] systemd-nspawn create container under unprivileged user

2015-02-05 Thread Alban Crequy
On 5 February 2015 at 12:48, Vasiliy Tolstov  wrote:
>
> 2015-02-05 12:44 GMT+03:00 Alban Crequy :
>>
>> Manual page namespaces(7):
>>
>>    Creation of new namespaces using clone(2) and unshare(2) in most cases
>>    requires the CAP_SYS_ADMIN capability.  User namespaces are the
>>    exception: since Linux 3.8, no privilege is required to create a user
>>    namespace.
>
>
> So as i understand i can't create full featured container with network under
> non root user (and not have cap_sys_admin)

Caps like CAP_SYS_ADMIN don't have a global meaning anymore but
refer to operations a process can do *in its current namespace*. An
unprivileged process (uid!=0, without cap_sys_admin) can join a user
namespace and get uid=0 & cap_sys_admin for operations inside the user
namespace, but it will still have uid!=0 & !cap_sys_admin for
operations in the parent user namespace.

user_namespaces(7) contains userns_child_exec.c and it creates a fully
featured container with network without being root. (I attached a
patched version I was testing)

# # Because I'm using the kernel patched by my distribution
# echo 1 > /proc/sys/kernel/unprivileged_userns_clone

$ gcc -o userns_child_exec userns_child_exec.c -lcap

Here it seems to work:

alban@alban:~$ ls -l /tmp/userns_child_exec
-rwxr-xr-x 1 alban alban 14488 Feb  5 23:24 /tmp/userns_child_exec
alban@alban:~$ id -u
1000
alban@alban:~$ ip link # ---> will show lo, eth0, wlan0...
alban@alban:~$ /tmp/userns_child_exec -p -m -U -M '0 1000 1' -G '0 1000 1' -n bash
About to exec bash
root@alban:~# id
uid=0(root) gid=0(root) groups=0(root),65534(nogroup)
root@alban:~# ip link # ---> only lo visible in this namespace

Cheers,
Alban
--- userns_child_exec.orig.c	2015-02-05 23:20:19.208741366 +0100
+++ userns_child_exec.c	2015-01-30 17:01:56.948493001 +0100
@@ -108,6 +108,30 @@
     close(fd);
 }
 
+static void
+write_file(char *content, char *path)
+{
+    int fd;
+    size_t content_len;
+
+    content_len = strlen(content);
+
+    fd = open(path, O_RDWR);
+    if (fd == -1) {
+        fprintf(stderr, "ERROR: open %s: %s\n", path,
+                strerror(errno));
+        exit(EXIT_FAILURE);
+    }
+
+    if (write(fd, content, content_len) != content_len) {
+        fprintf(stderr, "ERROR: write %s: %s\n", content,
+                strerror(errno));
+        exit(EXIT_FAILURE);
+    }
+
+    close(fd);
+}
+
 static int              /* Start function for cloned child */
 childFunc(void *arg)
 {
@@ -149,6 +173,7 @@
     const int MAP_BUF_SIZE = 100;
     char map_buf[MAP_BUF_SIZE];
     char map_path[PATH_MAX];
+    char groups_path[PATH_MAX];
 
     /* Parse command-line options. The initial '+' character in
        the final getopt() argument prevents GNU-style permutation
@@ -225,6 +250,11 @@
         update_map(uid_map, map_path);
     }
     if (gid_map != NULL || map_zero) {
+        snprintf(groups_path, PATH_MAX, "/proc/%ld/setgroups",
+                (long) child_pid);
+        write_file("deny\n", groups_path);
+    }
+    if (gid_map != NULL || map_zero) {
         snprintf(map_path, PATH_MAX, "/proc/%ld/gid_map",
                 (long) child_pid);
         if (map_zero) {
/* userns_child_exec.c

   Licensed under GNU General Public License v2 or later

   Create a child process that executes a shell command in new
   namespace(s); allow UID and GID mappings to be specified when
   creating a user namespace.
*/
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <signal.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <limits.h>
#include <errno.h>

/* A simple error-handling function: print an error message based
   on the value in 'errno' and terminate the calling process */

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

struct child_args {
    char **argv;        /* Command to be executed by child, with args */
    int    pipe_fd[2];  /* Pipe used to synchronize parent and child */
};

static int verbose;

static void
usage(char *pname)
{
    fprintf(stderr, "Usage: %s [options] cmd [arg...]\n\n", pname);
    fprintf(stderr, "Create a child process that executes a shell "
            "command in a new user namespace,\n"
            "and possibly also other new namespace(s).\n\n");
    fprintf(stderr, "Options can be:\n\n");
#define fpe(str) fprintf(stderr, "%s", str);
    fpe("-i  New IPC namespace\n");
    fpe("-m  New mount namespace\n");
    fpe("-n  New network namespace\n");
    fpe("-p  New PID namespace\n");
    fpe("-u  New UTS namespace\n");
    fpe("-U  New user namespace\n");
    fpe("-M uid_map  Specify UID map for user namespace\n");
    fpe("-G gid_map  Specify GID map for user namespace\n");
    fpe("-z  Map user's UID and GID to 0 in user namespace\n");
    fpe("(equivalent to: -M '0 <uid> 1' -G '0 <gid> 1')\n");
    fpe("-v  Display verbose messages\n");
    fpe("\n");
    fpe("If -z, -M, or -G is specified, -U is req

Re: [systemd-devel] systemd-nspawn create container under unprivileged user

2015-02-05 Thread Vasiliy Tolstov
2015-02-05 12:44 GMT+03:00 Alban Crequy :

> Manual page namespaces(7):
>
>    Creation of new namespaces using clone(2) and unshare(2) in most cases
>    requires the CAP_SYS_ADMIN capability.  User namespaces are the
>    exception: since Linux 3.8, no privilege is required to create a user
>    namespace.
>

So, as I understand it, I can't create a full-featured container with
networking as a non-root user (without CAP_SYS_ADMIN)?


-- 
Vasiliy Tolstov,
e-mail: v.tols...@selfip.ru
jabber: v...@selfip.ru


Re: [systemd-devel] systemd-nspawn create container under unprivileged user

2015-02-05 Thread Alban Crequy
[reposting - sorry I forgot to Cc the mailing list]

On 4 February 2015 at 23:03, Vasiliy Tolstov  wrote:
> Hello!
> Is it possible to create a container as a regular user? Or what capabilities
> do I need to add to create a container without using root?

Hello,

Manual page namespaces(7):

   Creation of new namespaces using clone(2) and unshare(2) in most cases
   requires the CAP_SYS_ADMIN capability.  User namespaces are the
   exception: since  Linux 3.8, no privilege is required to create a user
   namespace.

systemd-nspawn uses: src/nspawn/nspawn.c:

pid = raw_clone(SIGCHLD|CLONE_NEWNS|
  (arg_share_system ? 0 : CLONE_NEWIPC|CLONE_NEWPID|CLONE_NEWUTS)|
  (arg_private_network ? CLONE_NEWNET : 0), NULL);

So you need to have CAP_SYS_ADMIN to use systemd-nspawn.


If you want to try user namespaces, it is something that is still
moving... Manual page user_namespaces(7):

   Starting  in  Linux  3.8,  unprivileged  processes  can create
   user namespaces, and mount, PID, IPC, network, and UTS
   namespaces can be created with just the CAP_SYS_ADMIN
   capability in the caller's user namespace.

But it is not true in most Linux distributions as they disable
unprivileged user namespaces and require CAP_SYS_ADMIN anyway. See for
example:
http://anonscm.debian.org/viewvc/kernel/dists/trunk/linux/debian/patches/debian/add-sysctl-to-disallow-unprivileged-CLONE_NEWUSER-by-default.patch?revision=20773&view=markup
and: echo 1 > /proc/sys/kernel/unprivileged_userns_clone

Additionally, the program userns_child_exec.c included in manual page
user_namespaces(7) does not work as is yet, because since the changes
introduced to fix CVE-2014-8989 it needs to adjust /proc/pid/setgroups.
See:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=66d2f338ee4c449396b6f99f5e75cd18eb6df272

Cheers,
Alban


[systemd-devel] systemd-nspawn create container under unprivileged user

2015-02-04 Thread Vasiliy Tolstov
Hello!
Is it possible to create a container as a regular user? Or what capabilities
do I need to add to create a container without using root?

-- 
Vasiliy Tolstov,
e-mail: v.tols...@selfip.ru
jabber: v...@selfip.ru