On Wed, Apr 11, 2018 at 01:37:18PM -0500, Eric W. Biederman wrote:
> Christian Brauner <christian.brau...@canonical.com> writes:
> > On Wed, Apr 11, 2018 at 11:40:14AM -0500, Eric W. Biederman wrote:
> >> Christian Brauner <christian.brau...@canonical.com> writes:
> >> > Yeah, agreed.
> >> > But I think the patch is not complete. To guarantee that no non-initial
> >> > user namespace actually receives uevents we need to:
> >> > 1. only sent uevents to uevent sockets that are located in network
> >> > namespaces that are owned by init_user_ns
> >> > 2. filter uevents that are sent to sockets in mc_list that have opened a
> >> > uevent socket that is owned by init_user_ns *from* a
> >> > non-init_user_ns
> >> >
> >> > We account for 1. by only recording uevent sockets in the global uevent
> >> > socket list who are owned by init_user_ns.
> >> > But to account for 2. we need to filter by the user namespace who owns
> >> > the socket in mc_list. So in addition to that we also need to slightly
> >> > change the filter logic in kobj_bcast_filter() I think:
> >> >
> >> > diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
> >> > index 22a2c1a98b8f..064d7d29ace5 100644
> >> > --- a/lib/kobject_uevent.c
> >> > +++ b/lib/kobject_uevent.c
> >> > @@ -251,7 +251,8 @@ static int kobj_bcast_filter(struct sock *dsk,
> >> > struct sk_buff *skb, void *data)
> >> > return sock_ns != ns;
> >> > }
> >> >
> >> > - return 0;
> >> > + /* Check if socket was opened from non-initial user namespace.
> >> > */
> >> > + return sk_user_ns(dsk) != &init_user_ns;
> >> > }
> >> > #endif
> >> >
> >> >
> >> > But correct me if I'm wrong.
> >> You are worrying about NETLINK_LISTEN_ALL_NSID sockets. That has
> >> permissions and an explicit opt-in to receiving packets from multiple
> >> network namespaces.
> > I don't think that's what I'm talking about unless that is somehow the
> > default for NETLINK_KOBJECT_UEVENT sockets. What I'm worried about is
> > doing
> > unshare -U --map-root
> > then opening a NETLINK_KOBJECT_UEVENT socket and starting to listen to
> > uevents. Imho, this should not be possible because I'm in a
> > non-init_user_ns. But currently I'm able to - even with the patch to
> > come - since the uevent socket in the kernel was created when init_net
> > was created and hence is *owned* by the init_user_ns which means it is
> > in the list of uevent sockets. Here's a demo of what I mean:
> > https://asciinema.org/a/175632
> Why do you care about this case?
It's not so much that I care about this case since any workload that
wants to run a separate udevd will have to unshare the network namespace
and the user namespace for it to make complete sense.
What I do care about is that the two of us are absolutely in the clear
about what semantics we are going to expose to userspace and it seems
that we were talking past each other wrt to this "corner case".
For userspace, it needs to be very clear that the intention is to filter
by *owning user namespace of the network namespace a given task resides
in* and not by user namespace of the task per se. This is what this
corner case basically shows, I think.
> Everyone is allowed to listen to the uevent netlink sockets with or
> without user namespaces. So there are no permission issues, and
> this is not even a data information leak.
> If you don't want programs in your user namespace to have access you
> will be able to unshare the network namespace.