Questions from xemul@ and pros/cons about nexessity of these patchset:

1) Do we able to remount external mounts superblock readonly(which should not be permitted)?

VZ6: we check mnt->owner in ve_remount_allowed - so we can not remount.
VZ7: we CAN remount, if have dev in ve->devmnt_list, as we do not have mnt->owner as in VZ6(mainstream way now is to prohibit remount from unpriviledged mounts at all).

In VZ7 "Port: user namespace owned mounts" should help with that problem.

2) Why it is bed to also gain capabilities on host, in contrary to just getting out from CT's userns to host's one?(VZ7)

a) With setns we can get out of all other namespaces with:
nsenter  --mount=/proc/1/ns/mnt
if we have CAP_SYS_ADMIN and we used some imaginary exploit to get out of both pid and userns. I don't know the way to do it without capability.

b) With root userns and CAP_SYS_PTRACE and other needed caps, one can attach to all ptraceable tasks on node. - Not sure if it is very bad.

c) all ns_capable checks will succeed in root userns e.g. with CAP_KILL exploit will be able to kill everybody's processes.

3) How to escape VZCT so that we do not have caps in init_user_ns?(VZ7)

a)In commit_creds/override_creds/revert_creds we change task->cred. If attacker can somehow pass there credentials of non-root user from host. Thus having task->cred->user_ns set to init_user_ns, but becoming unprivileged.

b)Function set_cred_user_ns is not the case as anyway changes caps to CAP_FULL_SET, and where seem no other place where user_ns is changed on cred.

It seem that is not an easy task to escape without getting caps, but do we sure that at some time that would not happen?

Additionally:

CONs:
a) These is not yet mainstream ebiderman@ can easily decide to rewrite it from scratch.

PROs:
a)(VZ7) These patch set protects non-root users on host from being able to gain all capabilities. - Not so important as we have only admin on host.

b) These patch set will allow users with CAP_FSETID not drop suid on write. Now behavior differs in VZ6 and VZ7 CT as we moved to user namespaces in VZ7:
(Have CAP_FSETID in both cases.)

VZ6CT# touch test; chmod u+s test; echo "change" > test; ls -l test; cat test
-rwSr--r-- 1 root root 7 Mar 15 10:00 test
change

VZ7CT# touch test; chmod u+s test; echo "change" > test; ls -l test; cat test
-rwxrwxrwx 1 root root 7 Mar 15 10:01 test
change

(VZ7) - only in VZ7 as VZ6 don't have mainstream user-namespaces for containers.

On 03/10/2016 07:31 PM, Pavel Tikhomirov wrote:
We need it as secure way to provide privileged access to mounts
in containers. For instance setting suid bit, security.capability
xattr, allowing remount in CT.

Those patches are not all in MS now, but actually "[PATCH v4 0/7]
Initial support for user namespace owned mounts" patch series is
partially in Eric W. Biederman's testing tree, here:
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git 
for-testing

Other patches are Acked and have stayed without any feedback since
December 2015, see "[PATCH v2 00/19] Support fuse mounts in user
namespaces".

https://jira.sw.ru/browse/PSBM-43294
https://jira.sw.ru/browse/PSBM-43267

Major changes when porting:

1) dropped patches for selinux and other LSM as we do not
support them yet
2) dropped patch "new/fs: Update posix_acl support to handle
user namespace mounts" as that will need porting >=8 additional
patches from mainstream and now I do not know about any real
need of per s_user_ns posix ACLs in Virtuozzo, please correct
me if I'm wrong.
3) dropped fuse specific patches
4) port several needed patches from MS and add some fixes - migth
need to send them upstream.

Now we mount ploop disk to container root before userns for CT has
been created so root mount will have wrong s_user_ns on sb, for
testing purpose I include last patch which allows to change s_user_ns
on remount, but that is not safe, need to think how we can add it to
right userns from the begining.

Pavel Tikhomirov (22):
   ms/fs/super.c: fix WARN on alloc_super() fail path
   ebiederm/fs: Add user namesapace member to struct super_block
   fs: fix a posible leak of allocated superblock
   ms/mnt: Only change user settable mount flags in remount
   ms/mnt: Move the test for MNT_LOCK_READONLY from change_mount_flags
     into do_remount
   ms/mnt: Correct permission checks in do_remount
   ebiederm/userns: Simpilify MNT_NODEV handling.
   ebiederm/fs: Limit file caps to the user namespace of the super block
   port/block_dev: Support checking inode permissions in lookup_bdev()
   port/block_dev: Check permissions towards block device inode when
     mounting
   port/fs: Treat foreign mounts as nosuid
   fs: remove excess check for in_userns
   port/userns: Replace in_userns with current_in_userns
   port/fs: Check for invalid i_uid in may_follow_link()
   port/cred: Reject inodes with invalid ids in set_create_file_as()
   port/fs: Refuse uid/gid changes which don't map into s_user_ns
   port/fs: Ensure the mounter of a filesystem is privileged towards its
     inodes
   port/fs: Don't remove suid for CAP_FSETID in s_user_ns
   ms/fs: Add a missing permission check to do_umount
   port/fs: Allow superblock owner to access do_remount_sb()
   port/capabilities: Allow privileged user in s_user_ns to set
     security.* xattrs
   draft/ext4: add option to set userns of superblock

  drivers/md/dm-table.c          |  2 +-
  drivers/mtd/mtdsuper.c         |  2 +-
  fs/attr.c                      | 11 ++++++
  fs/block_dev.c                 | 20 ++++++++---
  fs/exec.c                      |  2 +-
  fs/ext4/super.c                | 68 +++++++++++++++++++++++++++++++++----
  fs/inode.c                     |  6 +++-
  fs/namei.c                     | 11 ++++--
  fs/namespace.c                 | 76 ++++++++++++++++++++++++++++++++++--------
  fs/proc/namespaces.c           |  2 ++
  fs/proc/root.c                 |  3 +-
  fs/quota/quota.c               |  2 +-
  fs/super.c                     | 46 +++++++++++++++++++++----
  include/linux/fs.h             | 15 ++++++++-
  include/linux/mount.h          | 10 +++++-
  include/linux/uidgid.h         | 10 ++++++
  include/linux/user_namespace.h |  6 ++++
  kernel/capability.c            | 13 +++++---
  kernel/cred.c                  |  2 ++
  kernel/user_namespace.c        | 14 ++++++++
  security/commoncap.c           | 14 +++++---
  security/selinux/hooks.c       |  2 +-
  22 files changed, 286 insertions(+), 51 deletions(-)


--
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.
_______________________________________________
Devel mailing list
[email protected]
https://lists.openvz.org/mailman/listinfo/devel

Reply via email to