Re: plan9 semantics on Linux - mount namespaces

2018-02-16 Thread Eric W. Biederman
Enrico Weigelt  writes:

> On 13.02.2018 22:12, Enrico Weigelt wrote:
>
> CC @contain...@lists.linux-foundation.org
>
>> Hi folks,
>>
>>
>> I'm currently trying to implement plan9 semantics on Linux and
>> yet sorting out how to do the mount namespace handling.
>>
>> On plan9, any unprivileged process can create its own namespace
>> and mount/bind at will, while on Linux this requires CAP_SYS_ADMIN.
>>
>> What is the reason for not allowing arbitrary users to create their
>> own private mount namespace ? What could go wrong here ?

suid root executables could be fooled.  An easy case is fooling
/bin/su into reading a different copy of /etc/shadow, and allowing
arbitrary changes between users.

>> IMHO, we could allow mount/bind under the following conditions:
>>
>> * the process is in a private mount namespace
>> * no suid-flag is honored (either force all mounts to nosuid or
>>    completely mask it out)
>> * only certain whitelisted filesystems allowed (eg. 9P and FUSE)
>>
>> Maybe that all could be enabled by a new capability.
>>
>>
>> any suggestions ?

User namespaces limit the contained processes to not having any
permissions outside of the user namespace, while still allowing the
full unix permission model inside user namespaces.

I am in the final stages of getting the changes in the vfs and in fuse
to allow unprivileged users to mount that filesystem.  plan9fs would
also be a candidate for that kind of treatment if it had a maintainer.

Eric


Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Aleksa Sarai
On 2018-02-14, Enrico Weigelt  wrote:
> But still I wonder whether user_ns really solves my problem, as I don't
> want to create sandboxed users, but only private namespaces just like
> on Plan9.

On Linux you need to have CAP_SYS_ADMIN (in the user_ns that owns your
current mnt_ns) in order to mount anything, and to create any namespaces
(in your current user_ns). So, in order to use the functionality of
mnt_ns (the ability to create mounts only a subset of processes can
see) as an unprivileged user, you need to use user_ns.

(Note there is an additional restriction, namely that a mnt_ns that was
set up in the non-root user_ns cannot mount any filesystems that do not
have the FS_USERNS_MOUNT option set. This is also for security, as
exposing the kernel filesystem parser to arbitrary data by unprivileged
users wasn't deemed to be a safe thing to do. The unprivileged FUSE work
that Richard linked to will likely be useful for pushing FS_USERNS_MOUNT
into more filesystems -- like 9p.)

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
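
The rule described above (CAP_SYS_ADMIN in the user namespace that owns the mount namespace) can be sketched in C. This is only a minimal illustration, not code from the thread: the helper names `write_file` and `private_ns_mount_tmpfs` are made up for this example, and it assumes a kernel built with user namespace support that permits unprivileged user namespaces. tmpfs is used because it is one of the filesystems that carries FS_USERNS_MOUNT.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write a short string to a file, 0 on success. */
static int write_file(const char *path, const char *data)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t len = (ssize_t)strlen(data);
    ssize_t n = write(fd, data, (size_t)len);
    close(fd);
    return n == len ? 0 : -1;
}

/*
 * Hypothetical helper: enter a new user namespace plus a new mount
 * namespace as an unprivileged user, then mount a tmpfs on `target`.
 * Returns 0 on success, -1 on any failure.
 */
static int private_ns_mount_tmpfs(const char *target)
{
    char map[64];
    uid_t uid = geteuid();   /* read before unshare(): IDs are unmapped afterwards */
    gid_t gid = getegid();

    /* The kernel creates the user namespace first, so the new mount
     * namespace is owned by it and we hold CAP_SYS_ADMIN over it. */
    if (unshare(CLONE_NEWUSER | CLONE_NEWNS) < 0)
        return -1;

    /* setgroups must be denied before an unprivileged gid_map write. */
    if (write_file("/proc/self/setgroups", "deny") < 0)
        return -1;
    snprintf(map, sizeof map, "0 %u 1", (unsigned)uid);
    if (write_file("/proc/self/uid_map", map) < 0)
        return -1;
    snprintf(map, sizeof map, "0 %u 1", (unsigned)gid);
    if (write_file("/proc/self/gid_map", map) < 0)
        return -1;

    /* Keep later mounts from propagating anywhere else. */
    if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) < 0)
        return -1;

    /* Only filesystems flagged FS_USERNS_MOUNT are accepted here. */
    return mount("tmpfs", target, "tmpfs", MS_NOSUID | MS_NODEV, NULL);
}
```

Called from a single-threaded process, for example as `private_ns_mount_tmpfs("/mnt")`, the resulting tmpfs should be visible only to that process and its children. The call returns -1 where unprivileged user namespaces are disabled.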





Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Enrico Weigelt

On 14.02.2018 19:12, Richard Weinberger wrote:


> BTW: Your issue is fixed/known. Just checked.


aha, on 1.2.28 ... I'll have to upgrade.


--mtx


--
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287


Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Richard Weinberger
On Wednesday, 14 February 2018, 19:01:52 CET, Enrico Weigelt wrote:
> On 14.02.2018 18:50, Richard Weinberger wrote:
> >> hmm, now it works, but only when strace'ing it.
> >> that's really strange.
> > 
> > On my box, with my patch applied, also busybox works now.
> 
> hmm, w/o strace, too ?

Sure.

> Which version are you using ? I've got 1.27.2

Both master and 1.12.x

BTW: Your issue is fixed/known. Just checked.

commit 1b510900e24459353922a1bc83c0b58bc8bafe1c
Author: Denys Vlasenko 
Date:   Thu Nov 9 16:06:33 2017 +0100

unshare: -r should map root to user, not the other way around

Signed-off-by: Denys Vlasenko 

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y


Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Enrico Weigelt

On 14.02.2018 18:50, Richard Weinberger wrote:

>> hmm, now it works, but only when strace'ing it.
>> that's really strange.
>
> On my box, with my patch applied, also busybox works now.

hmm, w/o strace, too ?
Which version are you using ? I've got 1.27.2

>> But still I wonder whether user_ns really solves my problem, as I don't
>> want to create sandboxed users, but only private namespaces just like
>> on Plan9.
>
> Well, I'd be surprised if that works out of the box.
> Since you're posting on LKML I assumed you're hacking the kernel to support
> plan9-alike namespaces...

Yes, that's the plan :)


--mtx

--
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287


Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Richard Weinberger
On Wednesday, 14 February 2018, 18:21:12 CET, Enrico Weigelt wrote:
> On 14.02.2018 16:17, Richard Weinberger wrote:
> > From taking a *very* quick look into busybox source, I suspect this
> > should fix it:
> > 
> > diff --git a/util-linux/unshare.c b/util-linux/unshare.c
> > index 875e3f86e304..3f59cf4d27c2 100644
> > --- a/util-linux/unshare.c
> > +++ b/util-linux/unshare.c
> > @@ -350,9 +350,9 @@ int unshare_main(int argc UNUSED_PARAM, char **argv)
> > 
> >  * in that user namespace.
> >  */
> > 
> > xopen_xwrite_close(PATH_PROC_SETGROUPS, "deny");
> > 
> > -   sprintf(uidmap_buf, "%u 0 1", (unsigned)reuid);
> > +   sprintf(uidmap_buf, "0 %u 1", (unsigned)reuid);
> > 
> > xopen_xwrite_close(PATH_PROC_UIDMAP, uidmap_buf);
> > 
> > -   sprintf(uidmap_buf, "%u 0 1", (unsigned)regid);
> > +   sprintf(uidmap_buf, "0 %u 1", (unsigned)regid);
> > 
> > xopen_xwrite_close(PATH_PROC_GIDMAP, uidmap_buf);
> > 
> > } else
> > if (setgrp_str) {
> 
> hmm, now it works, but only when strace'ing it.
> that's really strange.

On my box, with my patch applied, also busybox works now.
 
> But still I wonder whether user_ns really solves my problem, as I don't
> want to create sandboxed users, but only private namespaces just like
> on Plan9.

Well, I'd be surprised if that works out of the box.
Since you're posting on LKML I assumed you're hacking the kernel to support 
plan9-alike namespaces...

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y


Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Enrico Weigelt

On 14.02.2018 16:17, Richard Weinberger wrote:


> From taking a *very* quick look into busybox source, I suspect this should fix
> it:
>
> diff --git a/util-linux/unshare.c b/util-linux/unshare.c
> index 875e3f86e304..3f59cf4d27c2 100644
> --- a/util-linux/unshare.c
> +++ b/util-linux/unshare.c
> @@ -350,9 +350,9 @@ int unshare_main(int argc UNUSED_PARAM, char **argv)
>  * in that user namespace.
>  */
> xopen_xwrite_close(PATH_PROC_SETGROUPS, "deny");
> -   sprintf(uidmap_buf, "%u 0 1", (unsigned)reuid);
> +   sprintf(uidmap_buf, "0 %u 1", (unsigned)reuid);
> xopen_xwrite_close(PATH_PROC_UIDMAP, uidmap_buf);
> -   sprintf(uidmap_buf, "%u 0 1", (unsigned)regid);
> +   sprintf(uidmap_buf, "0 %u 1", (unsigned)regid);
> xopen_xwrite_close(PATH_PROC_GIDMAP, uidmap_buf);
> } else
> if (setgrp_str) {



hmm, now it works, but only when strace'ing it.
that's really strange.

But still I wonder whether user_ns really solves my problem, as I don't
want to create sandboxed users, but only private namespaces just like
on Plan9.


--mtx

--
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287


Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Richard Weinberger
Enrico,

On Wednesday, 14 February 2018, 16:02:18 CET, Enrico Weigelt wrote:
> stat64("/etc/busybox.conf", {st_mode=S_IFREG|0644, st_size=198, ...}) = 0

busybox...

> brk(NULL)   = 0x58000
> brk(0x79000)= 0x79000
> open("/etc/busybox.conf", O_RDONLY|O_LARGEFILE) = 3
> fstat64(3, {st_mode=S_IFREG|0644, st_size=198, ...}) = 0
> read(3, "[SUID]\n#lines starting with # ar"..., 1024) = 198
> read(3, "", 1024)   = 0
> close(3)= 0
> getgid32()  = 1
> setgid32(1) = 0
> setuid32(1) = 0
> geteuid32() = 1
> getegid32() = 1
> unshare(CLONE_NEWUTS|CLONE_NEWUSER) = 0
> open("/proc/self/setgroups", O_WRONLY|O_LARGEFILE) = 3
> write(3, "deny", 4) = 4
> close(3)= 0
> open("/proc/self/uid_map", O_WRONLY|O_LARGEFILE) = 3
> write(3, "1 0 1", 5)= -1 EPERM (Operation not permitted)

This mapping looks broken.
Please report to busybox folks.

From taking a *very* quick look into busybox source, I suspect this should fix
it:

diff --git a/util-linux/unshare.c b/util-linux/unshare.c
index 875e3f86e304..3f59cf4d27c2 100644
--- a/util-linux/unshare.c
+++ b/util-linux/unshare.c
@@ -350,9 +350,9 @@ int unshare_main(int argc UNUSED_PARAM, char **argv)
 * in that user namespace.
 */
xopen_xwrite_close(PATH_PROC_SETGROUPS, "deny");
-   sprintf(uidmap_buf, "%u 0 1", (unsigned)reuid);
+   sprintf(uidmap_buf, "0 %u 1", (unsigned)reuid);
xopen_xwrite_close(PATH_PROC_UIDMAP, uidmap_buf);
-   sprintf(uidmap_buf, "%u 0 1", (unsigned)regid);
+   sprintf(uidmap_buf, "0 %u 1", (unsigned)regid);
xopen_xwrite_close(PATH_PROC_GIDMAP, uidmap_buf);
} else
if (setgrp_str) {

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y
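
Each line written to /proc/self/uid_map has the form "ID-inside-namespace ID-outside-namespace count". The trace above shows busybox writing "1 0 1", that is, asking to map outside UID 0, which an unprivileged process may not do; the patch swaps the first two fields. The following C sketch models only one rule from user_namespaces(7): a process without CAP_SETUID in the parent namespace may map only its own effective ID, with a count of 1. It is a simplification for illustration, not the kernel's actual check, and both function names are invented for this example.

```c
#include <stdio.h>

/* Parse one "inside outside count" map line; returns 1 on success. */
static int parse_id_map(const char *line, unsigned *inside,
                        unsigned *outside, unsigned *count)
{
    return sscanf(line, "%u %u %u", inside, outside, count) == 3;
}

/*
 * Simplified model: would an unprivileged process whose effective ID
 * in the parent namespace is `caller_id` be allowed to write `line`?
 */
static int unprivileged_map_ok(const char *line, unsigned caller_id)
{
    unsigned inside, outside, count;

    if (!parse_id_map(line, &inside, &outside, &count))
        return 0;
    (void)inside;                 /* any ID inside the namespace is fine */
    return count == 1 && outside == caller_id;
}
```

For the caller with UID 1 from the trace, `unprivileged_map_ok("1 0 1", 1)` is 0 and `unprivileged_map_ok("0 1 1", 1)` is 1, which matches the EPERM before the patch and the working mapping after it.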


Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Enrico Weigelt

On 14.02.2018 15:19, Richard Weinberger wrote:


> Works here(tm).
> Can you debug it? Maybe we miss something obvious.


daemon@alphabox:~ strace unshare -U -r --setgroups=deny
execve("/bin/unshare", ["unshare", "-U", "-r", "--setgroups=deny"], 0x7ee51e0c /* 11 vars */) = 0
brk(NULL)   = 0x58000
fcntl64(0, F_GETFD) = 0
fcntl64(1, F_GETFD) = 0
fcntl64(2, F_GETFD) = 0
access("/etc/suid-debug", F_OK) = -1 ENOENT (No such file or directory)
uname({sysname="Linux", nodename="alphabox", ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x76f9
access("/etc/ld.so.preload", R_OK)  = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib/tls/v7l/neon/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/v7l/neon/vfp", 0x7eae8710) = -1 ENOENT (No such file or directory)
open("/lib/tls/v7l/neon/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/v7l/neon", 0x7eae8710) = -1 ENOENT (No such file or directory)
open("/lib/tls/v7l/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/v7l/vfp", 0x7eae8710)  = -1 ENOENT (No such file or directory)
open("/lib/tls/v7l/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/v7l", 0x7eae8710)  = -1 ENOENT (No such file or directory)
open("/lib/tls/neon/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/neon/vfp", 0x7eae8710) = -1 ENOENT (No such file or directory)
open("/lib/tls/neon/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/neon", 0x7eae8710) = -1 ENOENT (No such file or directory)
open("/lib/tls/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls/vfp", 0x7eae8710)  = -1 ENOENT (No such file or directory)
open("/lib/tls/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/tls", 0x7eae8710)  = -1 ENOENT (No such file or directory)
open("/lib/v7l/neon/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/v7l/neon/vfp", 0x7eae8710) = -1 ENOENT (No such file or directory)
open("/lib/v7l/neon/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/v7l/neon", 0x7eae8710) = -1 ENOENT (No such file or directory)
open("/lib/v7l/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/v7l/vfp", 0x7eae8710)  = -1 ENOENT (No such file or directory)
open("/lib/v7l/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/v7l", 0x7eae8710)  = -1 ENOENT (No such file or directory)
open("/lib/neon/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/neon/vfp", 0x7eae8710) = -1 ENOENT (No such file or directory)
open("/lib/neon/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/neon", 0x7eae8710) = -1 ENOENT (No such file or directory)
open("/lib/vfp/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/lib/vfp", 0x7eae8710)  = -1 ENOENT (No such file or directory)
open("/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0Yi\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=878136, ...}) = 0
mmap2(NULL, 947496, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x76e82000
mprotect(0x76f55000, 61440, PROT_NONE)  = 0
mmap2(0x76f64000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xd2000) = 0x76f64000
mmap2(0x76f67000, 9512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x76f67000
close(3)= 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x76f8f000
set_tls(0x76f8f4c0, 0x76f8fb98, 0x76f92050, 0x76f8f4c0, 0x76f92050) = 0
mprotect(0x76f64000, 8192, PROT_READ)   = 0
mprotect(0x76f91000, 4096, PROT_READ)   = 0
getuid32()  = 1
stat64("/etc/busybox.conf", {st_mode=S_IFREG|0644, st_size=198, ...}) = 0
brk(NULL)   = 0x58000
brk(0x79000)= 0x79000
open("/etc/busybox.conf", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=198, ...}) = 0
read(3, "[SUID]\n#lines starting with # ar"..., 1024) = 198
read(3, "", 1024)   = 0
close(3)= 0
getgid32()  = 1
setgid32(1) = 0
setuid32(1) = 0
geteuid32() = 1
getegid32() = 1

Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Richard Weinberger
On Wednesday, 14 February 2018, 15:03:55 CET, Enrico Weigelt wrote:
> On 14.02.2018 13:53, Richard Weinberger wrote:
> > It does what you ask it for.
> > Also see the --setgroups switch.
> > AFAICT --setgroups=deny is the new default, then your command line should just
> > work. Maybe your unshare tool is too old.
>
> Also doesn't help:
> 
> daemon@alphabox:~ unshare -U -r --setgroups=deny
> unshare: can't open '/proc/self/setgroups': Permission denied

Works here(tm).
Can you debug it? Maybe we miss something obvious.
 
> >> What I'd like to achieve is that processes can manipulate their private
> >> namespace at will and mount other filesystems (primarily 9p and fuse).
> >> For that, I need to get rid of setuid (and per-file caps) for these
> >> private namespaces.
> 
> > This is exactly why we have the user namespace.
> > In the user namespace you can create your own mount namespace and do
> > (almost) whatever you want.
> 
> What's the exact relation between user and mnt namespace ?
> Why do I need an own user ns for private mnt ns ? (except for the suid
> bit, which I wanna get rid of anyways).

mount related system calls are root-only. Therefore you need the user 
namespace to become a root in your own little world. :)

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y
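
Whether a process really is "root in its own little world" can be checked by looking at its effective capability set. The following is a minimal sketch in C, assuming the `CapEff:` line of /proc/self/status (a hexadecimal bit mask) and CAP_SYS_ADMIN being capability number 21; the function names are invented for this example.

```c
#include <stdio.h>

#define CAP_SYS_ADMIN_BIT 21  /* value of CAP_SYS_ADMIN in <linux/capability.h> */

/* Returns 1 if the capability mask contains CAP_SYS_ADMIN. */
static int mask_has_sys_admin(unsigned long long mask)
{
    return (int)((mask >> CAP_SYS_ADMIN_BIT) & 1ULL);
}

/* Returns 1 or 0 for the current process, -1 if CapEff cannot be read. */
static int have_sys_admin(void)
{
    char line[256];
    unsigned long long mask;
    int result = -1;
    FILE *f = fopen("/proc/self/status", "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        /* Only the "CapEff:" line matches this pattern. */
        if (sscanf(line, "CapEff: %llx", &mask) == 1) {
            result = mask_has_sys_admin(mask);
            break;
        }
    }
    fclose(f);
    return result;
}
```

In a shell started with `unshare -U -r`, `have_sys_admin()` would be expected to return 1 even though the same user holds no capabilities outside the namespace.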


Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Enrico Weigelt

On 14.02.2018 13:53, Richard Weinberger wrote:

> It does what you ask it for.
> Also see the --setgroups switch.
> AFAICT --setgroups=deny is the new default, then your command line should just
> work. Maybe your unshare tool is too old.

Also doesn't help:

daemon@alphabox:~ unshare -U -r --setgroups=deny
unshare: can't open '/proc/self/setgroups': Permission denied

>> What I'd like to achieve is that processes can manipulate their private
>> namespace at will and mount other filesystems (primarily 9p and fuse).
>> For that, I need to get rid of setuid (and per-file caps) for these
>> private namespaces.
>
> This is exactly why we have the user namespace.
> In the user namespace you can create your own mount namespace and do (almost)
> whatever you want.


What's the exact relation between user and mnt namespace ?
Why do I need an own user ns for private mnt ns ? (except for the suid
bit, which I wanna get rid of anyways).


--mtx

--
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287


Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Richard Weinberger
Enrico,

Am Mittwoch, 14. Februar 2018, 13:38:48 CET schrieb Enrico Weigelt:
> On 14.02.2018 12:30, Richard Weinberger wrote:
> > On Wed, Feb 14, 2018 at 12:27 PM, Enrico Weigelt  wrote:
> >> On 14.02.2018 11:24, Aleksa Sarai wrote:
> >>> What distribution are you using and which release?
> >> 
> >> On a self-compiled system.
> >> 
> >> Forgot to enable namespaces in the kernel. Now it seems to work
> >> as root, but not as an unprivileged user:
> >> 
> >> 
> >> daemon@alphabox:~ unshare -r -U
> >> unshare: can't open '/proc/self/setgroups': Permission denied
> >> daemon@alphabox:~ unshare -f -r -U
> >> unshare: can't open '/proc/self/setgroups': Permission denied
> > 
> > Please read http://man7.org/linux/man-pages/man7/user_namespaces.7.html
> > setgroups is a corner case and needs special care.
> 
> I'm still confused. Does the unshare program do something wrong here ?

It does what you ask it for.
Also see the --setgroups switch.
AFAICT --setgroups=deny is the new default, then your command line should just 
work. Maybe your unshare tool is too old.

> Anyways, I doubt that user namespaces help solving my problem.
> 
> What I'd like to achieve is that processes can manipulate their private
> namespace at will and mount other filesystems (primarily 9p and fuse).
> 
> For that, I need to get rid of setuid (and per-file caps) for these
> private namespaces.

This is exactly why we have the user namespace.
In the user namespace you can create your own mount namespace and do (almost) 
whatever you want.
Please note that you cannot mount just any kind of filesystem.
For FUSE, see https://lwn.net/Articles/684774/

Thanks,
//richard

-- 
sigma star gmbh - Eduard-Bodem-Gasse 6 - 6020 Innsbruck - Austria
ATU66964118 - FN 374287y


Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Enrico Weigelt

On 14.02.2018 12:30, Richard Weinberger wrote:

> On Wed, Feb 14, 2018 at 12:27 PM, Enrico Weigelt  wrote:
>> On 14.02.2018 11:24, Aleksa Sarai wrote:
>>> What distribution are you using and which release?
>>
>> On a self-compiled system.
>>
>> Forgot to enable namespaces in the kernel. Now it seems to work
>> as root, but not as an unprivileged user:
>>
>> daemon@alphabox:~ unshare -r -U
>> unshare: can't open '/proc/self/setgroups': Permission denied
>> daemon@alphabox:~ unshare -f -r -U
>> unshare: can't open '/proc/self/setgroups': Permission denied
>
> Please read http://man7.org/linux/man-pages/man7/user_namespaces.7.html
> setgroups is a corner case and needs special care.


I'm still confused. Does the unshare program do something wrong here ?

Anyways, I doubt that user namespaces help solving my problem.

What I'd like to achieve is that processes can manipulate their private 
namespace at will and mount other filesystems (primarily 9p and fuse).


For that, I need to get rid of setuid (and per-file caps) for these
private namespaces.


--mtx

--
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287


Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Richard Weinberger
On Wed, Feb 14, 2018 at 12:27 PM, Enrico Weigelt  wrote:
> On 14.02.2018 11:24, Aleksa Sarai wrote:
>
>> What distribution are you using and which release?
>
>
> On a self-compiled system.
>
> Forgot to enable namespaces in the kernel. Now it seems to work
> as root, but not as an unprivileged user:
>
>
> daemon@alphabox:~ unshare -r -U
> unshare: can't open '/proc/self/setgroups': Permission denied
> daemon@alphabox:~ unshare -f -r -U
> unshare: can't open '/proc/self/setgroups': Permission denied
>

Please read http://man7.org/linux/man-pages/man7/user_namespaces.7.html
setgroups is a corner case and needs special care.
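
As a rough sketch of the "special care" described in user_namespaces(7): since Linux 3.19, an unprivileged process has to write "deny" to /proc/<pid>/setgroups before it may write gid_map. The function name write_id_maps and its parameters are hypothetical, and the demo writes into a temporary directory standing in for /proc/self so that it runs anywhere without touching real namespaces.

```python
import os
import tempfile


def write_id_maps(proc_dir, outer_uid, outer_gid):
    """Map uid/gid 0 inside a new user namespace to one outer uid/gid.

    The order matters: "deny" must be written to setgroups before an
    unprivileged process is allowed to write gid_map.
    Returns the file names in the order they were written.
    """
    steps = [
        ("setgroups", "deny"),
        ("uid_map", f"0 {outer_uid} 1"),
        ("gid_map", f"0 {outer_gid} 1"),
    ]
    for name, content in steps:
        with open(os.path.join(proc_dir, name), "w") as f:
            f.write(content)
    return [name for name, _ in steps]


# Demo against a scratch directory instead of the real /proc/self.
with tempfile.TemporaryDirectory() as fake_proc:
    print(write_id_maps(fake_proc, os.getuid(), os.getgid()))
```

On a real system proc_dir would be /proc/self, and the call would come right after the process has entered its new user namespace.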

-- 
Thanks,
//richard


Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Enrico Weigelt

On 14.02.2018 11:24, Aleksa Sarai wrote:

> What distribution are you using and which release?


On a self-compiled system.

Forgot to enable namespaces in the kernel. Now it seems to work
as root, but not as an unprivileged user:


daemon@alphabox:~ unshare -r -U
unshare: can't open '/proc/self/setgroups': Permission denied
daemon@alphabox:~ unshare -f -r -U
unshare: can't open '/proc/self/setgroups': Permission denied


--mtx

--
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287


Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Aleksa Sarai
On 2018-02-14, Enrico Weigelt  wrote:
> On 14.02.2018 04:54, Aleksa Sarai wrote:
> 
> > It depends how old your kernel is and what distro you use. Arch Linux
> > disables user namespaces entirely, Debian requires that you set a sysctl
> > to enable unprivileged user namespaces, and RHEL requires you to set
> > both a sysctl and a kernel boot-flag. Also check how old your kernel is
> > (unprivileged user namespace support was added in 3.8).
> 
> Just tried on a mainline kernel (4.15). Same problem:
> 
> root@alphabox:~ unshare -U -r
> unshare: unshare(0x1400): Invalid argument
> root@alphabox:/proc/sys/user cat max_user_namespaces
> 5922

What distribution are you using and which release? Also, are you trying
to do this inside a Docker container or something similar (Docker has
seccomp filters that block CLONE_NEWUSER by default, for instance).

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH





Re: plan9 semantics on Linux - mount namespaces

2018-02-14 Thread Enrico Weigelt

On 14.02.2018 04:54, Aleksa Sarai wrote:

> It depends how old your kernel is and what distro you use. Arch Linux
> disables user namespaces entirely, Debian requires that you set a sysctl
> to enable unprivileged user namespaces, and RHEL requires you to set
> both a sysctl and a kernel boot-flag. Also check how old your kernel is
> (unprivileged user namespace support was added in 3.8).

Just tried on a mainline kernel (4.15). Same problem:

root@alphabox:~ unshare -U -r
unshare: unshare(0x1400): Invalid argument


root@alphabox:/proc/sys/user cat max_user_namespaces
5922


Am I missing something ?


--mtx

--
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287


Re: plan9 semantics on Linux - mount namespaces

2018-02-13 Thread Aleksa Sarai
On 2018-02-14, Enrico Weigelt  wrote:
> On 13.02.2018 22:27, Aleksa Sarai wrote:
> 
> > You can do this by creating a new user namespace (CLONE_NEWUSER), which
> > then gives you the required permissions to create other namespaces
> > (CLONE_NEWNS). This is how "rootless containers" or unprivileged
> > containers operate.
> 
> hmm, unshare -U doesn't work for me (even as root). But docker works,
> so user namespaces should be working. Any idea what could be wrong ?

It depends how old your kernel is and what distro you use. Arch Linux
disables user namespaces entirely, Debian requires that you set a sysctl
to enable unprivileged user namespaces, and RHEL requires you to set
both a sysctl and a kernel boot-flag. Also check how old your kernel is
(unprivileged user namespace support was added in 3.8).
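
A small sketch of how one might check the knobs mentioned above. This is an illustration, not something posted in the thread: user/max_user_namespaces is the file shown elsewhere in this thread, kernel/unprivileged_userns_clone is, as far as I know, the sysctl added by the Debian kernel patch, and userns_knobs is a hypothetical helper name. The RHEL boot flag is not covered.

```python
from pathlib import Path

KNOBS = (
    "user/max_user_namespaces",          # mainline limit on user namespaces
    "kernel/unprivileged_userns_clone",  # assumed Debian-specific switch
)


def userns_knobs(proc_sys="/proc/sys"):
    """Return {knob: int value, or None if the file is missing/unreadable}."""
    result = {}
    for rel in KNOBS:
        try:
            result[rel] = int((Path(proc_sys) / rel).read_text().strip())
        except (OSError, ValueError):
            result[rel] = None
    return result


for knob, value in userns_knobs().items():
    print(f"{knob} = {value}")
```

A value of 0 in either file would mean that creating user namespaces (for everyone, or for unprivileged users respectively) is disabled; None only says that this kernel does not expose the knob.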

Also Docker doesn't use user namespaces by default (you need to manually
enable it with --userns-remap, check the docs for more details). You
probably also want to be using "unshare -r" in your testing (as "unshare
-U" will leave you without mapped users).
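
To illustrate what "without mapped users" means, here is a minimal sketch that parses the text of /proc/<pid>/uid_map (gid_map has the same format). The function name parse_id_map and the sample strings are assumptions made for this example; the three-column format (ID inside the namespace, ID outside, range length) is the one documented in user_namespaces(7).

```python
def parse_id_map(text):
    """Parse uid_map/gid_map text into (inside, outside, count) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        inside, outside, count = (int(f) for f in fields)
        entries.append((inside, outside, count))
    return entries


# After a bare "unshare -U" the map is empty: no user is mapped.
print(parse_id_map(""))  # []

# After "unshare -r" run by uid 1000 (sample value), root inside maps to 1000.
print(parse_id_map("         0       1000          1\n"))  # [(0, 1000, 1)]
```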

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH





Re: plan9 semantics on Linux - mount namespaces

2018-02-13 Thread Enrico Weigelt

On 13.02.2018 22:27, Aleksa Sarai wrote:


> You can do this by creating a new user namespace (CLONE_NEWUSER), which
> then gives you the required permissions to create other namespaces
> (CLONE_NEWNS). This is how "rootless containers" or unprivileged
> containers operate.


hmm, unshare -U doesn't work for me (even as root). But docker works,
so user namespaces should be working. Any idea what could be wrong ?


--mtx


--
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287


Re: plan9 semantics on Linux - mount namespaces

2018-02-13 Thread Aleksa Sarai
On 2018-02-13, Enrico Weigelt  wrote:
> On 13.02.2018 22:12, Enrico Weigelt wrote:
> > I'm currently trying to implement plan9 semantics on Linux and
> > yet sorting out how to do the mount namespace handling.
> > 
> > On plan9, any unprivileged process can create its own namespace
> > and mount/bind at will, while on Linux this requires CAP_SYS_ADMIN.
> > 
> > What is the reason for not allowing arbitrary users to create their
> > own private mount namespace ? What could go wrong here ?

You can do this by creating a new user namespace (CLONE_NEWUSER), which
then gives you the required permissions to create other namespaces
(CLONE_NEWNS). This is how "rootless containers" or unprivileged
containers operate.
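
A minimal sketch of the approach described above, calling unshare(2) through ctypes. This is an illustration under assumptions, not code from the thread: the wrapper names unshare and enter_private_namespaces are made up, and the flag values are the ones from <sched.h>. The namespace-changing function is only defined here, not called, since it would fail on kernels or sandboxes that block CLONE_NEWUSER.

```python
import ctypes
import os

CLONE_NEWNS = 0x00020000    # new mount namespace
CLONE_NEWUSER = 0x10000000  # new user namespace


def unshare(flags):
    """Thin wrapper around the libc unshare(2) call; raises OSError on failure."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(ctypes.c_int(flags)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def enter_private_namespaces():
    """Create a user namespace and, with the privileges it grants,
    a mount namespace, in a single call."""
    unshare(CLONE_NEWUSER | CLONE_NEWNS)


print(hex(CLONE_NEWUSER | CLONE_NEWNS))  # 0x10020000
```

When both flags are passed together, the kernel creates the user namespace first, so the mount namespace is owned by it. After the call the process still needs its uid_map and gid_map written before it looks like root inside.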

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH





Re: plan9 semantics on Linux - mount namespaces

2018-02-13 Thread Enrico Weigelt

On 13.02.2018 22:12, Enrico Weigelt wrote:

CC @contain...@lists.linux-foundation.org


Hi folks,


I'm currently trying to implement plan9 semantics on Linux and
yet sorting out how to do the mount namespace handling.

On plan9, any unprivileged process can create its own namespace
and mount/bind at will, while on Linux this requires CAP_SYS_ADMIN.

What is the reason for not allowing arbitrary users to create their
own private mount namespace ? What could go wrong here ?

IMHO, we could allow mount/bind under the following conditions:

* the process is in a private mount namespace
* no suid-flag is honored (either force all mounts to nosuid or
   completely mask it out)
* only certain whitelisted filesystems allowed (eg. 9P and FUSE)

Maybe that all could be enabled by a new capability.
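
The three conditions above can be written down as a small policy check. This is only a user-space illustration of the proposal, not an existing kernel interface: check_mount, ALLOWED_FSTYPES and the in_private_mnt_ns argument are hypothetical names, and the sketch takes the "force all mounts to nosuid" variant. MS_NOSUID is the value from <sys/mount.h>.

```python
MS_NOSUID = 0x2  # from <sys/mount.h>
ALLOWED_FSTYPES = frozenset({"9p", "fuse"})  # whitelist from the proposal


def check_mount(in_private_mnt_ns, fstype, flags):
    """Return the mount flags to use, or raise PermissionError.

    Mirrors the proposed rules: private mount namespace only,
    whitelisted filesystem types only, and nosuid always forced on.
    """
    if not in_private_mnt_ns:
        raise PermissionError("not in a private mount namespace")
    if fstype not in ALLOWED_FSTYPES:
        raise PermissionError(f"filesystem type {fstype!r} is not whitelisted")
    return flags | MS_NOSUID


print(check_mount(True, "9p", 0))  # 2
```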


any suggestions ?


--mtx




--
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
i...@metux.net -- +49-151-27565287

