listen ranges to net cgroup

Alexei Starovoitov Tue, 16 Aug 2016 20:03:07 -0700

On Mon, Aug 15, 2016 at 12:05:40PM -0700, Mahesh Bandewar wrote:
> On Fri, Aug 12, 2016 at 9:29 PM, Alexei Starovoitov
> <alexei.starovoi...@gmail.com> wrote:
> [...]
> >> +static bool range_in_ranges(struct net_range *r, struct net_ranges *rs)
> >> +{
> >> +     int ri;
> >> +
> >> +     for (ri = 0; ri < rs->num_entries; ri++)
> >> +             if (r->min_value >= rs->range[ri].min_value &&
> >> +                 r->max_value <= rs->range[ri].max_value)
> >> +                     return true;
> >> +
> >> +     return false;
> >> +}
> >> +
> >> +/* Returns true if all the ranges in rs1 are subsets of at least one of 
> >> the
> >> + * ranges in rs2, ans returns false otherwise.
> >> + */
> >> +static bool ranges_in_ranges(struct net_ranges *rs1, struct net_ranges 
> >> *rs2)
> >> +{
> >> +     int ri;
> >> +
> >> +     for (ri = 0; ri < rs1->num_entries; ri++)
> >> +             if (!range_in_ranges(&rs1->range[ri], rs2))
> >> +                     return false;
> >> +
> >> +     return true;
> >> +}
> >
> > This is not a scalable implementation. The next step would be to
> > do a binary search, hash table or something else here?
> > I think the root of the problem is in hard coded checks
> > that quickly become inefficient and inadequate as applicability
> > of the feature grows.
> > We already have BPF that can suite this purpose much better
> > without bloating the kernel code with parsing and matching logic.
> >
> This is not a per packet cost and mostly paid once when you are
> establishing the channel. Having said that I do agree with you that
> the check can be optimized by doing something like a hash-table etc.
> 
> This is an ABI extension to the cgroup so having the code where it is
> now makes sense to maintain that ABI compatibility. Implementing using
> something else (e.g. eBPF etc.) would just mean that this code needs
> to be just moved at a different place than the current place so the
> net gain would be nothing.


In cgroup+bpf approach we don't need to extend abi every time new 
port range style or new statistics needs to be added. These hundreds
lines of hard coded checks can be avoided.
Daniel Mack is working on cgroup+bpf patches that will allow
attaching bpf to a cgroup. It will be new networking controller.
The same approach should be used here. I'm proposing to do
the same hooks as net_cgroup_bind_allowed() in inet_bind() and
net_cgroup_listen_allowed() in inet_csk_listen_start(), but the
code that does the check should be a bpf program that decides whether
port permitted or not.
New bpf program type can be trivially introduced that takes single
'portno' argument. The user space will attach it to a cgroup and then
free to do whatever port filtering logic and statistics by changing
the program without ever touching abi. Including hash tables that
are already part of bpf.

Daniel's slightly different approach to cgroup+bpf can do the same
in more generic way by hooking next to sk_filter() and parsing
the packet (it will work for any protocol), but it has
per-packet cost which I understand you want to avoid here by
adding hooks to inet_bind() and inet_csk_listen_start()
which makes sense.

Re: [PATCH 2/5] net: add bind/listen ranges to net cgroup

Reply via email to