Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread Nikolay Borisov


On 11/18/2015 04:58 PM, Al Viro wrote:
> On Wed, Nov 18, 2015 at 08:22:38AM -0600, Seth Forshee wrote:
> 
>> But it still requires the admin set it up that way, no? And aren't
>> privileges required to set up those devices in the first place?
>>
>> I'm not saying that it wouldn't be a good idea to lock down the backing
>> stores for those types of devices too, just that it isn't something that
>> a regular user could exploit without an admin doing something to
>> facilitate it.
> 
> Sigh...  If it boils down to "all admins within all containers must be
> trusted not to try and break out" (along with "roothole in any container
> escalates to kernel-mode code execution on host"), then what the fuck
> is the *point* of bothering with containers, userns, etc. in the first
> place?  If your model is basically "you want isolation, just use kvm",
> fine, but where's the place for userns in all that?
> 
> And if you are talking about the _host_ admin, then WTF not have him just
> mount what's needed as part of setup and to hell with mounting those
> inside the container?
> 
> Look at that from the hosting company POV - they are offering a bunch of
> virtual machines on one physical system.  And you want the admins on those
> virtual machines independent from the host admin.  Fine, but then you
> really need to keep them unable to screw each other or gain kernel-mode
> execution on the host.

Actually from the POV of a hosting company there's also the use case of
wanting to use container as substitutes for virtual machines (of course
we are a long way from that). But being able to do what those patches
enable (i.e. what Seth has pointed to with mount -o loop) is beneficial
and desirable.

> 
> Again, what's the point of all that?  I assumed the model where containers
> do, you know, contain what's in them, regardless of trust.  You guys seem
> to assume something different and I really wonder what it _is_...
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread Seth Forshee
On Wed, Nov 18, 2015 at 02:58:18PM +, Al Viro wrote:
> On Wed, Nov 18, 2015 at 08:22:38AM -0600, Seth Forshee wrote:
> 
> > But it still requires the admin set it up that way, no? And aren't
> > privileges required to set up those devices in the first place?
> > 
> > I'm not saying that it wouldn't be a good idea to lock down the backing
> > stores for those types of devices too, just that it isn't something that
> > a regular user could exploit without an admin doing something to
> > facilitate it.
> 
> Sigh...  If it boils down to "all admins within all containers must be
> trusted not to try and break out" (along with "roothole in any container
> escalates to kernel-mode code execution on host"), then what the fuck
> is the *point* of bothering with containers, userns, etc. in the first
> place?  If your model is basically "you want isolation, just use kvm",
> fine, but where's the place for userns in all that?
> 
> And if you are talking about the _host_ admin, then WTF not have him just
> mount what's needed as part of setup and to hell with mounting those
> inside the container?

Yes, the host admin. I'm not talking about trusting the admin inside the
container at all.

>From my perspective the idea is essentially to allow mounting with fuse
or with ext4 using "mount -o loop ..." within a container.

--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread Austin S Hemmelgarn

On 2015-11-18 09:58, Al Viro wrote:

On Wed, Nov 18, 2015 at 08:22:38AM -0600, Seth Forshee wrote:


But it still requires the admin set it up that way, no? And aren't
privileges required to set up those devices in the first place?

I'm not saying that it wouldn't be a good idea to lock down the backing
stores for those types of devices too, just that it isn't something that
a regular user could exploit without an admin doing something to
facilitate it.


Sigh...  If it boils down to "all admins within all containers must be
trusted not to try and break out" (along with "roothole in any container
escalates to kernel-mode code execution on host"), then what the fuck
is the *point* of bothering with containers, userns, etc. in the first
place?  If your model is basically "you want isolation, just use kvm",
fine, but where's the place for userns in all that?

In this case, Seth is referring to the host admin, not the container admin.


And if you are talking about the _host_ admin, then WTF not have him just
mount what's needed as part of setup and to hell with mounting those
inside the container?
This is decidedly non-trivial to handle in some cases.  IIRC, one of the 
particular things that sparked this in the first place was the Chrome 
Native Client having to have CAP_SYS_ADMIN or SUID set on it to handle 
setting up it's own sandbox, which is not something that should ever be 
set on an executable that runs untrusted code (which is the whole point 
of NaCl).


Look at that from the hosting company POV - they are offering a bunch of
virtual machines on one physical system.  And you want the admins on those
virtual machines independent from the host admin.  Fine, but then you
really need to keep them unable to screw each other or gain kernel-mode
execution on the host.

Again, what's the point of all that?  I assumed the model where containers
do, you know, contain what's in them, regardless of trust.  You guys seem
to assume something different and I really wonder what it _is_...
Yes, hosting and isolation of untrusted code are valid uses for 
containers, which is why I suggested the ability to disallow mounts 
other than FUSE, and make that the default state.  There are other 
perfectly valid uses for them as well, and for me the two I'm 
particularly interested in are safe deployment of a new system from an 
existing system (ala Gentoo or Arch installation, or manual installation 
of *BSD), and running non-native distros without virtualization (On a 
single user system, virtualization is overkill when all you want is a 
Debian or Fedora or Arch testing environment and don't care about their 
specific kernel features).




smime.p7s
Description: S/MIME Cryptographic Signature


Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread Al Viro
On Wed, Nov 18, 2015 at 09:05:12AM -0600, Seth Forshee wrote:

> Yes, the host admin. I'm not talking about trusting the admin inside the
> container at all.

Then why not have the same host admin just plain mount it when setting the
container up and be done with that?  From the host namespace, before spawning
the docker instance or whatever framework you are using.  IDGI...
--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] security/apparmor: do not define list_entry_next

2015-11-18 Thread John Johansen
On 11/18/2015 04:14 AM, Sergey Senozhatsky wrote:
> Cosmetic.
> 
> Do not define list_entry_next() and use list_next_entry()
> from list.h.
> 

two days to late,

Geliang Tang already submitted the same patch in
[PATCH 3/3] apparmor: use list_next_entry instead of list_entry_next

and I've pulled it into my tree

still thank you for your work, it is appreciated

> Signed-off-by: Sergey Senozhatsky 
> ---
>  security/apparmor/apparmorfs.c | 10 --
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
> index ad4fa49..946e8fd 100644
> --- a/security/apparmor/apparmorfs.c
> +++ b/security/apparmor/apparmorfs.c
> @@ -21,6 +21,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "include/apparmor.h"
>  #include "include/apparmorfs.h"
> @@ -549,9 +550,6 @@ fail2:
>   return error;
>  }
>  
> -
> -#define list_entry_next(pos, member) \
> - list_entry(pos->member.next, typeof(*pos), member)
>  #define list_entry_is_head(pos, head, member) (>member == (head))
>  
>  /**
> @@ -582,7 +580,7 @@ static struct aa_namespace *__next_namespace(struct 
> aa_namespace *root,
>   parent = ns->parent;
>   while (ns != root) {
>   mutex_unlock(>lock);
> - next = list_entry_next(ns, base.list);
> + next = list_next_entry(ns, base.list);
>   if (!list_entry_is_head(next, >sub_ns, base.list)) {
>   mutex_lock(>lock);
>   return next;
> @@ -636,7 +634,7 @@ static struct aa_profile *__next_profile(struct 
> aa_profile *p)
>   parent = rcu_dereference_protected(p->parent,
>  mutex_is_locked(>ns->lock));
>   while (parent) {
> - p = list_entry_next(p, base.list);
> + p = list_next_entry(p, base.list);
>   if (!list_entry_is_head(p, >base.profiles, base.list))
>   return p;
>   p = parent;
> @@ -645,7 +643,7 @@ static struct aa_profile *__next_profile(struct 
> aa_profile *p)
>   }
>  
>   /* is next another profile in the namespace */
> - p = list_entry_next(p, base.list);
> + p = list_next_entry(p, base.list);
>   if (!list_entry_is_head(p, >base.profiles, base.list))
>   return p;
>  
> 

--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread Seth Forshee
On Wed, Nov 18, 2015 at 02:10:45PM -0500, Theodore Ts'o wrote:
> On Tue, Nov 17, 2015 at 12:34:44PM -0600, Seth Forshee wrote:
> > On Tue, Nov 17, 2015 at 05:55:06PM +, Al Viro wrote:
> > > On Tue, Nov 17, 2015 at 11:25:51AM -0600, Seth Forshee wrote:
> > > 
> > > > Shortly after that I plan to follow with support for ext4. I've been
> > > > fuzzing ext4 for a while now and it has held up well, and I'm currently
> > > > working on hand-crafted attacks. Ted has commented privately (to others,
> > > > not to me personally) that he will fix bugs for such attacks, though I
> > > > haven't seen any public comments to that effect.
> > > 
> > > _Static_ attacks, or change-image-under-mounted-fs attacks?
> > 
> > Right now only static attacks, change-image-under-mounted-fs attacks
> > will be next.
> 
> I will fix bugs about static attacks.  That is, it's interesting to me
> that a buggy file system (no matter how it is created), not cause the
> kernel to crash --- and privilege escalation attacks tend to be
> strongly related to those bugs where we're not doing strong enough
> checking.
> 
> Protecting against a malicious user which changes the image under the
> file system is a whole other kettle of fish.  I am not at all user you
> can do this without completely sacrificing performance or making the
> code impossible to maintain.  So my comments do *not* extend to
> protecting against a malicious user who is changing the block device
> underneath the kernel.
> 
> If you want to submit patches to make the kernel more robust against
> these attacks, I'm certainly willing to look at the patches.  But I'm
> certainly not guaranteeing that they will go in, and I'm certainly not
> promising to fix all vulnerabilities that you might find that are
> caused by a malicious block device.  Sorry, that's too much buying a
> pig in a poke

Thanks Ted. My plan right now is to explore the possibility of blocking
writes to the backing store from userspace while it's mounted.

--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread Theodore Ts'o
On Tue, Nov 17, 2015 at 12:34:44PM -0600, Seth Forshee wrote:
> On Tue, Nov 17, 2015 at 05:55:06PM +, Al Viro wrote:
> > On Tue, Nov 17, 2015 at 11:25:51AM -0600, Seth Forshee wrote:
> > 
> > > Shortly after that I plan to follow with support for ext4. I've been
> > > fuzzing ext4 for a while now and it has held up well, and I'm currently
> > > working on hand-crafted attacks. Ted has commented privately (to others,
> > > not to me personally) that he will fix bugs for such attacks, though I
> > > haven't seen any public comments to that effect.
> > 
> > _Static_ attacks, or change-image-under-mounted-fs attacks?
> 
> Right now only static attacks, change-image-under-mounted-fs attacks
> will be next.

I will fix bugs about static attacks.  That is, it's interesting to me
that a buggy file system (no matter how it is created), not cause the
kernel to crash --- and privilege escalation attacks tend to be
strongly related to those bugs where we're not doing strong enough
checking.

Protecting against a malicious user which changes the image under the
file system is a whole other kettle of fish.  I am not at all user you
can do this without completely sacrificing performance or making the
code impossible to maintain.  So my comments do *not* extend to
protecting against a malicious user who is changing the block device
underneath the kernel.

If you want to submit patches to make the kernel more robust against
these attacks, I'm certainly willing to look at the patches.  But I'm
certainly not guaranteeing that they will go in, and I'm certainly not
promising to fix all vulnerabilities that you might find that are
caused by a malicious block device.  Sorry, that's too much buying a
pig in a poke

- Ted

--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread Austin S Hemmelgarn

On 2015-11-17 16:32, Seth Forshee wrote:

On Tue, Nov 17, 2015 at 03:54:50PM -0500, Austin S Hemmelgarn wrote:

On 2015-11-17 14:16, Seth Forshee wrote:

On Tue, Nov 17, 2015 at 02:02:09PM -0500, Austin S Hemmelgarn wrote:

On 2015-11-17 12:55, Al Viro wrote:

On Tue, Nov 17, 2015 at 11:25:51AM -0600, Seth Forshee wrote:


Shortly after that I plan to follow with support for ext4. I've been
fuzzing ext4 for a while now and it has held up well, and I'm currently
working on hand-crafted attacks. Ted has commented privately (to others,
not to me personally) that he will fix bugs for such attacks, though I
haven't seen any public comments to that effect.


_Static_ attacks, or change-image-under-mounted-fs attacks?

To properly protect against attacks on mounted filesystems, we'd
need some new concept of a userspace immutable file (that is, one
where nobody can write to it except the kernel, and only the kernel
can change it between regular access and this new state), and then
have the kernel set an image (or block device) to this state when a
filesystem is mounted from it (this introduces all kinds of other
issues too however, for example stuff that allows an online fsck on
the device will stop working, as will many un-deletion tools).


Yeah, Serge and I were just tossing that idea around on irc. If we can
make that work then it's probably the best solution.

 From a naive perspective it seems like all we really have to do is make
the block device inode immutable to userspace when it is mounted. And
the parent block device if it's a partition, which might be a bit
troublesome. We'd have to ensure that writes couldn't happen via any fds
already open when the device was mounted too.

We'd need some cooperation from the loop driver too I suppose, to make
sure the file backing the loop device is also immutable.


 From a completeness perspective, you'd also need to hook into DM,
MD, and bcache to handle their backing devices.  There's not much we
could do about iSCSI/ATAoE/NBD devices, and I think being able to


But really no one would be able to mount any of those without
intervention from a privileged user anyway. The same is true today of
loop devices, but I have some patches to change that.
Um, no, depending on how your system is configured, it's fully possible 
to mount those as a regular user with no administrative interaction 
required.  All that needs done is some udev rules (or something 
equivalent) to add ACL's to the device nodes allowing regular users to 
access them (and there are systems I've seen that are naive enough to do 
this).  And I can almost assure you that there will be someone out there 
who decides to directly expose such block devices to a user namespace.




smime.p7s
Description: S/MIME Cryptographic Signature


Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread Austin S Hemmelgarn

On 2015-11-17 17:01, Seth Forshee wrote:

On Tue, Nov 17, 2015 at 09:05:42PM +, Al Viro wrote:

On Tue, Nov 17, 2015 at 03:39:16PM -0500, Austin S Hemmelgarn wrote:


This is absolutely insane, no matter how much LSM snake oil you slatter on
the whole thing.  All of a sudden you are exposing a huge attack surface
in the place where it would hurt most and as the consolation we are offered
basically "Ted is willing to fix holes when they are found".


None of the LSM changes are intended to protect against attacks from
these sorts of attacks at all, so that's irrelevant.

As I said before, I'm also working to find holes up front. That plus a
commitment from the maintainer seems like a good start at least. What
bar would you set for a given filesystem to be considered "safe enough"?


For the context of static image attacks, anything that's foun
_needs_ to be fixed regardless, and unless you can find some way to
actually prevent attacks on mounted filesystems that doesn't involve
a complete re-write of the filesystem drivers, then there's not much
we can do about it.  Yes, unprivileged mounts expose an attack
surface, but so does userspace access to the network stack, and so
do a lot of other features that are considered essential in a modern
general purpose operating system.


"X is exposes an attack surface.  Y exposes a diferent attack surface.
Y is considered important.  Therefore X is important enough to implement it"

Right...


That isn't the argument he made. I would summarize the argument as,
"Saying that X exposes an attack surface isn't by itself enough to
reject X, otherwise we wouldn't expose anything (such as example Y)."

It's good to see someone understood my meaning...


You believe that the attack surface is too large, and that's
understandable. Is it your opinion that this is a fundamental problem
for an in-kernel filesystem driver, i.e. that we can never be confident
enough in an in-kernel filesystem parser to allow untrusted data? If
not, what would it take to establish a level of confidence that you
would be comfortable with?
While I can't speak for Al's opinion on this, I would like to point out 
my earlier comment:
> It's unfeasible from a practical standpoint to expect filesystems to 
> assume that stuff they write might change under them due to malicious 
> intent of a third party.
We can't protect against everything, not without making the system 
completely unusable for general purpose computing.  There is always some 
degree of trust involved in usage of a computer, the OS has to trust 
that the hardware works correctly, the administrator has to trust the OS 
to behave correctly, and the users have to trust the administrator.  The 
administrator also needs to have at least some trust in the users, 
otherwise he shouldn't be allowing them to use the system.


Perhaps we should have an option that can only be enabled on creation of 
the userns that would allow it to use regular kernel mounts, and without 
that option we default to only allowing FUSE and a couple of virtual 
filesystems (like /proc and devtmpfs).




smime.p7s
Description: S/MIME Cryptographic Signature


[PATCH 3/3] security/apparmor: do not define list_entry_next

2015-11-18 Thread Sergey Senozhatsky
Cosmetic.

Do not define list_entry_next() and use list_next_entry()
from list.h.

Signed-off-by: Sergey Senozhatsky 
---
 security/apparmor/apparmorfs.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
index ad4fa49..946e8fd 100644
--- a/security/apparmor/apparmorfs.c
+++ b/security/apparmor/apparmorfs.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "include/apparmor.h"
 #include "include/apparmorfs.h"
@@ -549,9 +550,6 @@ fail2:
return error;
 }
 
-
-#define list_entry_next(pos, member) \
-   list_entry(pos->member.next, typeof(*pos), member)
 #define list_entry_is_head(pos, head, member) (>member == (head))
 
 /**
@@ -582,7 +580,7 @@ static struct aa_namespace *__next_namespace(struct 
aa_namespace *root,
parent = ns->parent;
while (ns != root) {
mutex_unlock(>lock);
-   next = list_entry_next(ns, base.list);
+   next = list_next_entry(ns, base.list);
if (!list_entry_is_head(next, >sub_ns, base.list)) {
mutex_lock(>lock);
return next;
@@ -636,7 +634,7 @@ static struct aa_profile *__next_profile(struct aa_profile 
*p)
parent = rcu_dereference_protected(p->parent,
   mutex_is_locked(>ns->lock));
while (parent) {
-   p = list_entry_next(p, base.list);
+   p = list_next_entry(p, base.list);
if (!list_entry_is_head(p, >base.profiles, base.list))
return p;
p = parent;
@@ -645,7 +643,7 @@ static struct aa_profile *__next_profile(struct aa_profile 
*p)
}
 
/* is next another profile in the namespace */
-   p = list_entry_next(p, base.list);
+   p = list_next_entry(p, base.list);
if (!list_entry_is_head(p, >base.profiles, base.list))
return p;
 
-- 
2.6.2.280.g74301d6

--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread Seth Forshee
On Wed, Nov 18, 2015 at 07:23:48AM -0500, Austin S Hemmelgarn wrote:
> On 2015-11-17 16:32, Seth Forshee wrote:
> >On Tue, Nov 17, 2015 at 03:54:50PM -0500, Austin S Hemmelgarn wrote:
> >>On 2015-11-17 14:16, Seth Forshee wrote:
> >>>On Tue, Nov 17, 2015 at 02:02:09PM -0500, Austin S Hemmelgarn wrote:
> On 2015-11-17 12:55, Al Viro wrote:
> >On Tue, Nov 17, 2015 at 11:25:51AM -0600, Seth Forshee wrote:
> >
> >>Shortly after that I plan to follow with support for ext4. I've been
> >>fuzzing ext4 for a while now and it has held up well, and I'm currently
> >>working on hand-crafted attacks. Ted has commented privately (to others,
> >>not to me personally) that he will fix bugs for such attacks, though I
> >>haven't seen any public comments to that effect.
> >
> >_Static_ attacks, or change-image-under-mounted-fs attacks?
> To properly protect against attacks on mounted filesystems, we'd
> need some new concept of a userspace immutable file (that is, one
> where nobody can write to it except the kernel, and only the kernel
> can change it between regular access and this new state), and then
> have the kernel set an image (or block device) to this state when a
> filesystem is mounted from it (this introduces all kinds of other
> issues too however, for example stuff that allows an online fsck on
> the device will stop working, as will many un-deletion tools).
> >>>
> >>>Yeah, Serge and I were just tossing that idea around on irc. If we can
> >>>make that work then it's probably the best solution.
> >>>
> >>> From a naive perspective it seems like all we really have to do is make
> >>>the block device inode immutable to userspace when it is mounted. And
> >>>the parent block device if it's a partition, which might be a bit
> >>>troublesome. We'd have to ensure that writes couldn't happen via any fds
> >>>already open when the device was mounted too.
> >>>
> >>>We'd need some cooperation from the loop driver too I suppose, to make
> >>>sure the file backing the loop device is also immutable.
> >>>
> >> From a completeness perspective, you'd also need to hook into DM,
> >>MD, and bcache to handle their backing devices.  There's not much we
> >>could do about iSCSI/ATAoE/NBD devices, and I think being able to
> >
> >But really no one would be able to mount any of those without
> >intervention from a privileged user anyway. The same is true today of
> >loop devices, but I have some patches to change that.
> Um, no, depending on how your system is configured, it's fully
> possible to mount those as a regular user with no administrative
> interaction required.  All that needs done is some udev rules (or
> something equivalent) to add ACL's to the device nodes allowing
> regular users to access them (and there are systems I've seen that
> are naive enough to do this).  And I can almost assure you that
> there will be someone out there who decides to directly expose such
> block devices to a user namespace.

But it still requires the admin set it up that way, no? And aren't
privileges required to set up those devices in the first place?

I'm not saying that it wouldn't be a good idea to lock down the backing
stores for those types of devices too, just that it isn't something that
a regular user could exploit without an admin doing something to
facilitate it.
--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread Seth Forshee
On Wed, Nov 18, 2015 at 07:46:53AM -0500, Austin S Hemmelgarn wrote:
> On 2015-11-17 17:01, Seth Forshee wrote:
> >On Tue, Nov 17, 2015 at 09:05:42PM +, Al Viro wrote:
> >>On Tue, Nov 17, 2015 at 03:39:16PM -0500, Austin S Hemmelgarn wrote:
> >>
> This is absolutely insane, no matter how much LSM snake oil you slatter on
> the whole thing.  All of a sudden you are exposing a huge attack surface
> in the place where it would hurt most and as the consolation we are 
> offered
> basically "Ted is willing to fix holes when they are found".
> >
> >None of the LSM changes are intended to protect against attacks from
> >these sorts of attacks at all, so that's irrelevant.
> >
> >As I said before, I'm also working to find holes up front. That plus a
> >commitment from the maintainer seems like a good start at least. What
> >bar would you set for a given filesystem to be considered "safe enough"?
> >
> >>>For the context of static image attacks, anything that's foun
> >>>_needs_ to be fixed regardless, and unless you can find some way to
> >>>actually prevent attacks on mounted filesystems that doesn't involve
> >>>a complete re-write of the filesystem drivers, then there's not much
> >>>we can do about it.  Yes, unprivileged mounts expose an attack
> >>>surface, but so does userspace access to the network stack, and so
> >>>do a lot of other features that are considered essential in a modern
> >>>general purpose operating system.
> >>
> >>"X is exposes an attack surface.  Y exposes a diferent attack surface.
> >>Y is considered important.  Therefore X is important enough to implement it"
> >>
> >>Right...
> >
> >That isn't the argument he made. I would summarize the argument as,
> >"Saying that X exposes an attack surface isn't by itself enough to
> >reject X, otherwise we wouldn't expose anything (such as example Y)."
> It's good to see someone understood my meaning...
> >
> >You believe that the attack surface is too large, and that's
> >understandable. Is it your opinion that this is a fundamental problem
> >for an in-kernel filesystem driver, i.e. that we can never be confident
> >enough in an in-kernel filesystem parser to allow untrusted data? If
> >not, what would it take to establish a level of confidence that you
> >would be comfortable with?
> While I can't speak for Al's opinion on this, I would like to point
> out my earlier comment:
> > It's unfeasible from a practical standpoint to expect filesystems
> to > assume that stuff they write might change under them due to
> malicious > intent of a third party.

So maybe the first requirement is that the user cannot modify the
backing store directly while the device is mounted.

> We can't protect against everything, not without making the system
> completely unusable for general purpose computing.  There is always
> some degree of trust involved in usage of a computer, the OS has to
> trust that the hardware works correctly, the administrator has to
> trust the OS to behave correctly, and the users have to trust the
> administrator.  The administrator also needs to have at least some
> trust in the users, otherwise he shouldn't be allowing them to use
> the system.
> 
> Perhaps we should have an option that can only be enabled on
> creation of the userns that would allow it to use regular kernel
> mounts, and without that option we default to only allowing FUSE and
> a couple of virtual filesystems (like /proc and devtmpfs).

I've considered the idea of something more global like a sysctl, or a
per-filesystem knob in sysfs. I guess a per-container knob is another
option, I'm not sure what interface we use to expose it though.
--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread Al Viro
On Wed, Nov 18, 2015 at 08:22:38AM -0600, Seth Forshee wrote:

> But it still requires the admin set it up that way, no? And aren't
> privileges required to set up those devices in the first place?
> 
> I'm not saying that it wouldn't be a good idea to lock down the backing
> stores for those types of devices too, just that it isn't something that
> a regular user could exploit without an admin doing something to
> facilitate it.

Sigh...  If it boils down to "all admins within all containers must be
trusted not to try and break out" (along with "roothole in any container
escalates to kernel-mode code execution on host"), then what the fuck
is the *point* of bothering with containers, userns, etc. in the first
place?  If your model is basically "you want isolation, just use kvm",
fine, but where's the place for userns in all that?

And if you are talking about the _host_ admin, then WTF not have him just
mount what's needed as part of setup and to hell with mounting those
inside the container?

Look at that from the hosting company POV - they are offering a bunch of
virtual machines on one physical system.  And you want the admins on those
virtual machines independent from the host admin.  Fine, but then you
really need to keep them unable to screw each other or gain kernel-mode
execution on the host.

Again, what's the point of all that?  I assumed the model where containers
do, you know, contain what's in them, regardless of trust.  You guys seem
to assume something different and I really wonder what it _is_...
--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] security/apparmor: do not define list_entry_next

2015-11-18 Thread Sergey Senozhatsky
On (11/18/15 10:19), John Johansen wrote:
> On 11/18/2015 04:14 AM, Sergey Senozhatsky wrote:
> > Cosmetic.
> > 
> > Do not define list_entry_next() and use list_next_entry()
> > from list.h.
> > 
> 
> two days to late,
> 
> Geliang Tang already submitted the same patch in
> [PATCH 3/3] apparmor: use list_next_entry instead of list_entry_next
> 

Oh, wow. sorry for the noise. thanks.

-ss

> and I've pulled it into my tree
> 
> still thank you for your work, it is appreciated
> 
> > Signed-off-by: Sergey Senozhatsky 
> > ---
> >  security/apparmor/apparmorfs.c | 10 --
> >  1 file changed, 4 insertions(+), 6 deletions(-)
> > 
> > diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
> > index ad4fa49..946e8fd 100644
> > --- a/security/apparmor/apparmorfs.c
> > +++ b/security/apparmor/apparmorfs.c
> > @@ -21,6 +21,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include "include/apparmor.h"
> >  #include "include/apparmorfs.h"
> > @@ -549,9 +550,6 @@ fail2:
> > return error;
> >  }
> >  
> > -
> > -#define list_entry_next(pos, member) \
> > -   list_entry(pos->member.next, typeof(*pos), member)
> >  #define list_entry_is_head(pos, head, member) (>member == (head))
> >  
> >  /**
> > @@ -582,7 +580,7 @@ static struct aa_namespace *__next_namespace(struct 
> > aa_namespace *root,
> > parent = ns->parent;
> > while (ns != root) {
> > mutex_unlock(>lock);
> > -   next = list_entry_next(ns, base.list);
> > +   next = list_next_entry(ns, base.list);
> > if (!list_entry_is_head(next, >sub_ns, base.list)) {
> > mutex_lock(>lock);
> > return next;
> > @@ -636,7 +634,7 @@ static struct aa_profile *__next_profile(struct 
> > aa_profile *p)
> > parent = rcu_dereference_protected(p->parent,
> >mutex_is_locked(>ns->lock));
> > while (parent) {
> > -   p = list_entry_next(p, base.list);
> > +   p = list_next_entry(p, base.list);
> > if (!list_entry_is_head(p, >base.profiles, base.list))
> > return p;
> > p = parent;
> > @@ -645,7 +643,7 @@ static struct aa_profile *__next_profile(struct 
> > aa_profile *p)
> > }
> >  
> > /* is next another profile in the namespace */
> > -   p = list_entry_next(p, base.list);
> > +   p = list_next_entry(p, base.list);
> > if (!list_entry_is_head(p, >base.profiles, base.list))
> > return p;
> >  
> > 
> 
--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread James Morris
On Wed, 18 Nov 2015, Richard Weinberger wrote:

> On Wed, Nov 18, 2015 at 4:13 PM, Al Viro  wrote:
> > On Wed, Nov 18, 2015 at 09:05:12AM -0600, Seth Forshee wrote:
> >
> >> Yes, the host admin. I'm not talking about trusting the admin inside the
> >> container at all.
> >
> > Then why not have the same host admin just plain mount it when setting the
> > container up and be done with that?  From the host namespace, before 
> > spawning
> > the docker instance or whatever framework you are using.  IDGI...
> 
> Because hosting companies sell containers as "full virtual machines"
> and customers expect to be able mount stuff like disk images they upload.

I don't think this is a valid reason for merging functionality into the 
kernel.


-- 
James Morris


--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/7] User namespace mount updates

2015-11-18 Thread Richard Weinberger
Am 19.11.2015 um 08:47 schrieb James Morris:
> On Wed, 18 Nov 2015, Richard Weinberger wrote:
> 
>> On Wed, Nov 18, 2015 at 4:13 PM, Al Viro  wrote:
>>> On Wed, Nov 18, 2015 at 09:05:12AM -0600, Seth Forshee wrote:
>>>
 Yes, the host admin. I'm not talking about trusting the admin inside the
 container at all.
>>>
>>> Then why not have the same host admin just plain mount it when setting the
>>> container up and be done with that?  From the host namespace, before 
>>> spawning
>>> the docker instance or whatever framework you are using.  IDGI...
>>
>> Because hosting companies sell containers as "full virtual machines"
>> and customers expect to be able mount stuff like disk images they upload.
> 
> I don't think this is a valid reason for merging functionality into the 
> kernel.

Erm, I don't want this in the kernel. That's why I've proposed the lklfuse 
approach.

Thanks,
//richard
--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html