Re: Fw: [Bug 193911] New: net_prio.ifpriomap is not aware of the network namespace, and discloses all network interface

2017-02-21 Thread Tejun Heo
Hello,

On Mon, Feb 13, 2017 at 08:04:57AM +1300, Eric W. Biederman wrote:
> > Yeah, the whole thing never considered netns or delegation.  Maybe the
> > read function itself should probably filter on the namespace of the
> > reader?  I'm not completely sure whether trying to fix it won't cause
> > some of existing use cases to break.  Eric, what do you think?
> 
> Apologies for the delay I just made it back from vacation.

Hope you enjoyed the vacation.

> There are cases where we do look at the reader/opener of the  file, and
> it is a pain, almost always the best policy is to have the context fixed
> at mount time.
> 
> I don't see an obvious answer of what better semantics for this file
> should be.  Perhaps Docker can mount over this file on older kernels?
> 
> The namespace primitives that people build containers out of were never
> guaranteed not to leak the fact that you are in a container.  So a small
> essentially harmless information leak is not something I panic about.
> It is the setting up of the container itself that must know what the
> primitives do to ensure that leaks don't happen, if you want to avoid leaks.
> 
> That said if this controller/file does not consider netns and delegation
> I suspect the right thing to do is put it under CONFIG_BROKEN or
> possibly
> CONFIG_I_REALLY_NEED_THIS_SILLY_CODE_FOR_BACKWARDS_COMPATIBILITY
> aka CONFIG_STAGING and let the code age out of the kernel there.

I see.  Yeah, it is broken in terms of ns support.  Marking it BROKEN
from the config seems a bit drastic tho.

> If someone actually cares about this code and wants to fix it to do the
> something reasonable and is willing to dig through all of the subtleties
> I can help with that.  I may be wrong but the code feels like something
> that just isn't interesting enough to make it worth fixing.

The network part of cgroup v2 can do the same thing, doesn't have
these issues and can be used in conjunction with cgroup v1, so we
already have a way out of this.

I don't think we need to take an active action on it at this point.
People who need namespace support can adopt cgroup v2 without breaking
other things and I'm not sure there's enough benefit in marking the v1
features BROKEN at this point.

Thanks.

-- 
tejun


Re: Fw: [Bug 193911] New: net_prio.ifpriomap is not aware of the network namespace, and discloses all network interface

2017-02-12 Thread Eric W. Biederman
Tejun Heo  writes:

> Hello,
>
> On Sun, Feb 05, 2017 at 11:05:36PM -0800, Cong Wang wrote:
>> > To be more specific, the read operation of net_prio.ifpriomap is handled 
>> > by the
>> > function read_priomap. Tracing from this function, we can find it invokes
>> > for_each_netdev_rcu and set the first parameter as the address of 
>> > init_net. It
>> > iterates all network devices of the host regardless of the network 
>> > namespace.
>> > Thus, from the view of a container, it can read the names of all network
>> > devices of the host.
>> 
>> I think that is probably because cgroup files don't provide a net pointer
>> for the context, if so we probably need some API similar to
>> class_create_file_ns().
>
> Yeah, the whole thing never considered netns or delegation.  Maybe the
> read function itself should probably filter on the namespace of the
> reader?  I'm not completely sure whether trying to fix it won't cause
> some of existing use cases to break.  Eric, what do you think?

Apologies for the delay I just made it back from vacation.

There are cases where we do look at the reader/opener of the  file, and
it is a pain, almost always the best policy is to have the context fixed
at mount time.

I don't see an obvious answer of what better semantics for this file
should be.  Perhaps Docker can mount over this file on older kernels?

The namespace primitives that people build containers out of were never
guaranteed not to leak the fact that you are in a container.  So a small
essentially harmless information leak is not something I panic about.
It is the setting up of the container itself that must know what the
primitives do to ensure that leaks don't happen, if you want to avoid leaks.

That said if this controller/file does not consider netns and delegation
I suspect the right thing to do is put it under CONFIG_BROKEN or
possibly
CONFIG_I_REALLY_NEED_THIS_SILLY_CODE_FOR_BACKWARDS_COMPATIBILITY
aka CONFIG_STAGING and let the code age out of the kernel there.

If someone actually cares about this code and wants to fix it to do the
something reasonable and is willing to dig through all of the subtleties
I can help with that.  I may be wrong but the code feels like something
that just isn't interesting enough to make it worth fixing.

Eric


Re: Fw: [Bug 193911] New: net_prio.ifpriomap is not aware of the network namespace, and discloses all network interface

2017-02-06 Thread Tejun Heo
Hello,

On Sun, Feb 05, 2017 at 11:05:36PM -0800, Cong Wang wrote:
> > To be more specific, the read operation of net_prio.ifpriomap is handled by 
> > the
> > function read_priomap. Tracing from this function, we can find it invokes
> > for_each_netdev_rcu and set the first parameter as the address of init_net. 
> > It
> > iterates all network devices of the host regardless of the network 
> > namespace.
> > Thus, from the view of a container, it can read the names of all network
> > devices of the host.
> 
> I think that is probably because cgroup files don't provide a net pointer
> for the context, if so we probably need some API similar to
> class_create_file_ns().

Yeah, the whole thing never considered netns or delegation.  Maybe the
read function itself should probably filter on the namespace of the
reader?  I'm not completely sure whether trying to fix it won't cause
some of existing use cases to break.  Eric, what do you think?

Thanks.

-- 
tejun


Re: Fw: [Bug 193911] New: net_prio.ifpriomap is not aware of the network namespace, and discloses all network interface

2017-02-05 Thread Cong Wang
On Fri, Feb 3, 2017 at 3:53 PM, Stephen Hemminger
 wrote:
>
>
> Begin forwarded message:
>
> Date: Fri, 03 Feb 2017 21:14:28 +
> From: bugzilla-dae...@bugzilla.kernel.org
> To: step...@networkplumber.org
> Subject: [Bug 193911] New: net_prio.ifpriomap is not aware of the network 
> namespace, and discloses all network interface
>
>
> https://bugzilla.kernel.org/show_bug.cgi?id=193911
>
> Bug ID: 193911
>Summary: net_prio.ifpriomap is not aware of the network
> namespace, and discloses all network interface
>Product: Networking
>Version: 2.5
> Kernel Version: 4.9
>   Hardware: All
> OS: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: normal
>   Priority: P1
>  Component: Other
>   Assignee: step...@networkplumber.org
>   Reporter: xga...@email.wm.edu
> Regression: No
>
> The pseudo file net_prio.ifpriomap (under /sys/fs/cgroup/net_prio) contains a
> map of the priorities assigned to traffic starting from processes in a cgroup
> and leaving the system on various interfaces. The data format is in the form 
> of
> [ifname priority].
>
> We find that the kernel handler function hooked at net_prio.ifpriomap is not
> aware of the network namespace, and thus it discloses all network interfaces 
> on
> the physical machine to the containerized applications.
>
> To be more specific, the read operation of net_prio.ifpriomap is handled by 
> the
> function read_priomap. Tracing from this function, we can find it invokes
> for_each_netdev_rcu and set the first parameter as the address of init_net. It
> iterates all network devices of the host regardless of the network namespace.
> Thus, from the view of a container, it can read the names of all network
> devices of the host.

I think that is probably because cgroup files don't provide a net pointer
for the context, if so we probably need some API similar to
class_create_file_ns().


Fw: [Bug 193911] New: net_prio.ifpriomap is not aware of the network namespace, and discloses all network interface

2017-02-03 Thread Stephen Hemminger


Begin forwarded message:

Date: Fri, 03 Feb 2017 21:14:28 +
From: bugzilla-dae...@bugzilla.kernel.org
To: step...@networkplumber.org
Subject: [Bug 193911] New: net_prio.ifpriomap is not aware of the network 
namespace, and discloses all network interface


https://bugzilla.kernel.org/show_bug.cgi?id=193911

Bug ID: 193911
   Summary: net_prio.ifpriomap is not aware of the network
namespace, and discloses all network interface
   Product: Networking
   Version: 2.5
Kernel Version: 4.9
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
  Assignee: step...@networkplumber.org
  Reporter: xga...@email.wm.edu
Regression: No

The pseudo file net_prio.ifpriomap (under /sys/fs/cgroup/net_prio) contains a
map of the priorities assigned to traffic starting from processes in a cgroup
and leaving the system on various interfaces. The data format is in the form of
[ifname priority]. 

We find that the kernel handler function hooked at net_prio.ifpriomap is not
aware of the network namespace, and thus it discloses all network interfaces on
the physical machine to the containerized applications. 

To be more specific, the read operation of net_prio.ifpriomap is handled by the
function read_priomap. Tracing from this function, we can find it invokes
for_each_netdev_rcu and set the first parameter as the address of init_net. It
iterates all network devices of the host regardless of the network namespace.
Thus, from the view of a container, it can read the names of all network
devices of the host.

Here is an example. I checked it on Linux kernel 4.4 with Docker version
1.12.1. I do not have the latest kernel at hand. But there is no code change
between 4.4 and 4.9 for this function. It should be reproducible in the latest
kernel. 

I initiated a Docker container and checked the net_prio.ifpriomap inside the
container. It displayed all network interfaces information on the host.

Container: 
root@25e25d553c3b:/# cat /sys/fs/cgroup/net_prio/net_prio.ifpriomap 
lo 0
eth0 0
eth1 0
xenbr0 0
lxdbr0 0
virbr0 0
virbr0-nic 0
docker0 0
vnet0 0
vnet1 0
veth132de4a 0

Host:
@:~$ cat /sys/fs/cgroup/net_prio/net_prio.ifpriomap 
lo 0
eth0 0
eth1 0
xenbr0 0
lxdbr0 0
virbr0 0
virbr0-nic 0
docker0 0
vnet0 0
vnet1 0
veth132de4a 0

From the information displayed above, this file exposes the same network
interface information in a container and on a host, which we considered to be a
leakage for the network namespace.

-- 
You are receiving this mail because:
You are the assignee for the bug.