Re: [PATCH v2 net-next 0/5] Add bpf support to set sk_bound_dev_if

2016-10-31 Thread Thomas Graf
On 10/31/16 at 11:16am, David Ahern wrote:
> On 10/31/16 11:01 AM, David Miller wrote:
> > Also, any reason why you don't allow the cgroup bpf sk filter to return
> > an error code so that the sock creation could be cancelled if the eBPF
> > program desires that?  It could be useful, I suppose.
> 
> My first draft at this feature had that but I removed it for simplicity now. 
> Can certainly add it back.

We're trying to standardize on common return codes for all program
types. The lwt bpf series defines BPF_ codes which are compatible
with TC_ACT_* values to make lwt_bpf and cls_bpf compatible. Would
be great to use the same return codes and implement the ones that
make sense.


Re: [PATCH v2 net-next 0/5] Add bpf support to set sk_bound_dev_if

2016-10-31 Thread David Ahern
On 10/31/16 11:01 AM, David Miller wrote:
> From: David Ahern 
> Date: Wed, 26 Oct 2016 17:58:37 -0700
> 
>> The recently added VRF support in Linux leverages the bind-to-device
>> API for programs to specify an L3 domain for a socket. While
>> SO_BINDTODEVICE has been around for ages, not every ipv4/ipv6 capable
>> program has support for it. Even for those programs that do support it,
>> the API requires processes to be started as root (CAP_NET_RAW) which
>> is not desirable from a general security perspective.
>>
>> This patch set leverages Daniel Mack's work to attach bpf programs to
>> a cgroup:
>>
>> https://www.mail-archive.com/netdev@vger.kernel.org/msg134028.html
>>
>> to provide a capability to set sk_bound_dev_if for all AF_INET{6}
>> sockets opened by a process in a cgroup when the sockets are allocated.
>>
>> This capability enables running any program in a VRF context and is key
>> to deploying Management VRF, a fundamental configuration for networking
>> gear, with any Linux OS installation.
> 
> Ok, after some review I think I understand what's going on here.
> 
> It would initially seem simpler to just support forced sk_bound_dev_if
> in cgroups.  But I think I understand why you may have gone this way:

That's what the l3mdev cgroup patch does -- force the sk_bound_dev_if for 
sockets. Tejun pushed back on adding new controllers. The cgroup+bpf is another 
way to accomplish the end goal. The key is using the cgroup infra for 
parent-child inheritance of the policy, holder of the policy "data" to be 
applied, tracking what processes are in a group, what the group is for a 
specific process, and on. No need to reinvent that part.

> 
> 1) The cgroup-bpf code always has the cgroup hierarchy propagation
>logic.
> 
> 2) The may be use cases for doing things with other sock members.
> 
> With respect to #2, do you know of any such planned use cases already?

One suggestion is the local port binding limitations that Mahesh and Anoop were 
looking into.

> 
> Also, any reason why you don't allow the cgroup bpf sk filter to return
> an error code so that the sock creation could be cancelled if the eBPF
> program desires that?  It could be useful, I suppose.

My first draft at this feature had that but I removed it for simplicity now. 
Can certainly add it back.



Re: [PATCH v2 net-next 0/5] Add bpf support to set sk_bound_dev_if

2016-10-31 Thread David Miller
From: David Ahern 
Date: Wed, 26 Oct 2016 17:58:37 -0700

> The recently added VRF support in Linux leverages the bind-to-device
> API for programs to specify an L3 domain for a socket. While
> SO_BINDTODEVICE has been around for ages, not every ipv4/ipv6 capable
> program has support for it. Even for those programs that do support it,
> the API requires processes to be started as root (CAP_NET_RAW) which
> is not desirable from a general security perspective.
> 
> This patch set leverages Daniel Mack's work to attach bpf programs to
> a cgroup:
> 
> https://www.mail-archive.com/netdev@vger.kernel.org/msg134028.html
> 
> to provide a capability to set sk_bound_dev_if for all AF_INET{6}
> sockets opened by a process in a cgroup when the sockets are allocated.
> 
> This capability enables running any program in a VRF context and is key
> to deploying Management VRF, a fundamental configuration for networking
> gear, with any Linux OS installation.

Ok, after some review I think I understand what's going on here.

It would initially seem simpler to just support forced sk_bound_dev_if
in cgroups.  But I think I understand why you may have gone this way:

1) The cgroup-bpf code always has the cgroup hierarchy propagation
   logic.

2) The may be use cases for doing things with other sock members.

With respect to #2, do you know of any such planned use cases already?

Also, any reason why you don't allow the cgroup bpf sk filter to return
an error code so that the sock creation could be cancelled if the eBPF
program desires that?  It could be useful, I suppose.

Thanks.