Re: [PATCH v3 3/6] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands

2016-09-05 Thread Daniel Mack
On 09/05/2016 08:32 PM, Alexei Starovoitov wrote:
> On 9/5/16 10:09 AM, Daniel Borkmann wrote:
>> On 09/05/2016 04:09 PM, Daniel Mack wrote:

>>> I really don't think it's worth sparing 8 bytes here and then do the
>>> binary compat dance after flags are added, for no real gain.
>>
>> Sure, but there's not much of a dance needed, see for example how map_flags
>> were added some time ago. So, iff there's really no foreseeable use-case in
>> sight and since we have this flexibility in place already, then I don't
>> quite
>> follow why it's needed, if there's zero pain to add it later on. I would
>> understand it of course, if it cannot be handled later on anymore.
> 
> I agree with Daniel B. Since flags are completely unused right now,
> there is no plan to use it for anything in the coming months and
> even worse they make annoying hole in the struct, let's not
> add them. We can safely do that later. CHECK_ATTR() allows us to
> do it easily. It's not like syscall where flags are must have,
> since we cannot add it later. Here it's done trivially.

Okay then. If you both agree, I won't interfere :)


Daniel



Re: [PATCH v3 3/6] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands

2016-09-05 Thread Alexei Starovoitov

On 9/5/16 10:09 AM, Daniel Borkmann wrote:

On 09/05/2016 04:09 PM, Daniel Mack wrote:

On 09/05/2016 03:56 PM, Daniel Borkmann wrote:

On 09/05/2016 02:54 PM, Daniel Mack wrote:

On 08/30/2016 01:00 AM, Daniel Borkmann wrote:

On 08/26/2016 09:58 PM, Daniel Mack wrote:



enum bpf_map_type {
@@ -147,6 +149,13 @@ union bpf_attr {
__aligned_u64pathname;
__u32bpf_fd;
};
+
+struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH
commands */
+__u32target_fd;/* container object to attach
to */
+__u32attach_bpf_fd;/* eBPF program to attach */
+__u32attach_type;/* BPF_ATTACH_TYPE_* */
+__u64attach_flags;


Could we just do ...

__u32 dst_fd;
__u32 src_fd;
__u32 attach_type;

... and leave flags out, since unused anyway? Also see below.


I'd really like to keep the flags, even if they're unused right now.
This only adds 8 bytes during the syscall operation, so it doesn't
harm.
However, we cannot change the userspace API after the fact, and who
knows what this (rather generic) interface will be used for later on.


With the below suggestion added, then flags doesn't need to be
added currently as it can be done safely at a later point in time
with respecting old binaries. See also the syscall handling code
in kernel/bpf/syscall.c +825 and the CHECK_ATTR() macro. The
underlying idea of this was taken from perf_event_open() syscall
back then, see [1] for a summary.

[1] https://lkml.org/lkml/2014/8/26/116


Yes, I know that's possible, and I like the idea, but I don't think any
new interface should come without flags really, as flags are something
that will most certainly be needed at some point anyway. I didn't have
them in my first shot, but Alexei pointed out that they should be added,
and I agree.

Also, this optimization wouldn't make the transported struct payload any
smaller anyway, because the member of that union used by BPF_PROG_LOAD
is still by far the biggest.

I really don't think it's worth sparing 8 bytes here and then do the
binary compat dance after flags are added, for no real gain.


Sure, but there's not much of a dance needed, see for example how map_flags
were added some time ago. So, iff there's really no foreseeable use-case in
sight and since we have this flexibility in place already, then I don't
quite
follow why it's needed, if there's zero pain to add it later on. I would
understand it of course, if it cannot be handled later on anymore.


I agree with Daniel B. Since flags are completely unused right now,
there is no plan to use it for anything in the coming months and
even worse they make annoying hole in the struct, let's not
add them. We can safely do that later. CHECK_ATTR() allows us to
do it easily. It's not like syscall where flags are must have,
since we cannot add it later. Here it's done trivially.



Re: [PATCH v3 3/6] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands

2016-09-05 Thread Joe Perches
On Mon, 2016-09-05 at 14:56 +0200, Daniel Mack wrote:
> On 08/27/2016 02:08 AM, Alexei Starovoitov wrote:
[]
> > +   switch (attr->attach_type) {
> > +   case BPF_ATTACH_TYPE_CGROUP_INET_INGRESS:
> > +   case BPF_ATTACH_TYPE_CGROUP_INET_EGRESS: {
> > +   struct cgroup *cgrp;
> > +
> > +   prog = bpf_prog_get_type(attr->attach_bpf_fd,
> > +    BPF_PROG_TYPE_CGROUP_SOCKET_FILTER);
> > +   if (IS_ERR(prog))
> > +   return PTR_ERR(prog);
> > +
> > +   cgrp = cgroup_get_from_fd(attr->target_fd);
> > +   if (IS_ERR(cgrp)) {
> > +   bpf_prog_put(prog);
> > +   return PTR_ERR(cgrp);
> > +   }
> > +
> > +   cgroup_bpf_update(cgrp, prog, attr->attach_type);
> > +   cgroup_put(cgrp);
> > +
> > +   break;
> > +   }
> this } formatting style is confusing. The above } looks
> like it matches 'switch () {'.
> May be move 'struct cgroup *cgrp' to the top to avoid that?

This style of case statements that declare local variables
with an open brace after the case statement

switch (bar) {
[cases...]
case foo: {
local declarations;

code...
}
[cases...]
}

is used quite frequently in the kernel.
I think it's fine as is.



Re: [PATCH v3 3/6] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands

2016-09-05 Thread Daniel Borkmann

On 09/05/2016 04:09 PM, Daniel Mack wrote:

On 09/05/2016 03:56 PM, Daniel Borkmann wrote:

On 09/05/2016 02:54 PM, Daniel Mack wrote:

On 08/30/2016 01:00 AM, Daniel Borkmann wrote:

On 08/26/2016 09:58 PM, Daniel Mack wrote:



enum bpf_map_type {
@@ -147,6 +149,13 @@ union bpf_attr {
__aligned_u64   pathname;
__u32   bpf_fd;
};
+
+   struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
+   __u32   target_fd;  /* container object to attach 
to */
+   __u32   attach_bpf_fd;  /* eBPF program to attach */
+   __u32   attach_type;/* BPF_ATTACH_TYPE_* */
+   __u64   attach_flags;


Could we just do ...

__u32 dst_fd;
__u32 src_fd;
__u32 attach_type;

... and leave flags out, since unused anyway? Also see below.


I'd really like to keep the flags, even if they're unused right now.
This only adds 8 bytes during the syscall operation, so it doesn't harm.
However, we cannot change the userspace API after the fact, and who
knows what this (rather generic) interface will be used for later on.


With the below suggestion added, then flags doesn't need to be
added currently as it can be done safely at a later point in time
with respecting old binaries. See also the syscall handling code
in kernel/bpf/syscall.c +825 and the CHECK_ATTR() macro. The
underlying idea of this was taken from perf_event_open() syscall
back then, see [1] for a summary.

[1] https://lkml.org/lkml/2014/8/26/116


Yes, I know that's possible, and I like the idea, but I don't think any
new interface should come without flags really, as flags are something
that will most certainly be needed at some point anyway. I didn't have
them in my first shot, but Alexei pointed out that they should be added,
and I agree.

Also, this optimization wouldn't make the transported struct payload any
smaller anyway, because the member of that union used by BPF_PROG_LOAD
is still by far the biggest.

I really don't think it's worth sparing 8 bytes here and then do the
binary compat dance after flags are added, for no real gain.


Sure, but there's not much of a dance needed, see for example how map_flags
were added some time ago. So, iff there's really no foreseeable use-case in
sight and since we have this flexibility in place already, then I don't quite
follow why it's needed, if there's zero pain to add it later on. I would
understand it of course, if it cannot be handled later on anymore.


Re: [PATCH v3 3/6] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands

2016-09-05 Thread Daniel Mack
On 09/05/2016 05:30 PM, David Laight wrote:
> From: Daniel Mack
 +
 +  struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
 +  __u32   target_fd;  /* container object to attach 
 to */
 +  __u32   attach_bpf_fd;  /* eBPF program to attach */
 +  __u32   attach_type;/* BPF_ATTACH_TYPE_* */
 +  __u64   attach_flags;
 +  };
>>>
>>> there is a 4 byte hole in this struct. Can we pack it differently?
>>
>> Okay - I swapped "type" and "flags" to repair this.
> 
> That just moves the pad to the end of the structure.
> Still likely to cause a problem for 32bit apps on a 64bit kernel.

What kind of problem do you have in mind? Again, this is embedded in a
union of much bigger total size, and the API is not used in any kind of
hot-path.

> If you can't think of any flags, why 64 of them?

I can't think of them right now, but this is about defining an API that
can be used in other context as well. Also, it doesn't matter at all,
they don't harm. IMO, it's just better to have them right away than to
do a binary compat dance once someone needs them.


Thanks,
Daniel



RE: [PATCH v3 3/6] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands

2016-09-05 Thread David Laight
From: Daniel Mack
> >> +
> >> +  struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
> >> +  __u32   target_fd;  /* container object to attach 
> >> to */
> >> +  __u32   attach_bpf_fd;  /* eBPF program to attach */
> >> +  __u32   attach_type;/* BPF_ATTACH_TYPE_* */
> >> +  __u64   attach_flags;
> >> +  };
> >
> > there is a 4 byte hole in this struct. Can we pack it differently?
> 
> Okay - I swapped "type" and "flags" to repair this.

That just moves the pad to the end of the structure.
Still likely to cause a problem for 32bit apps on a 64bit kernel.
If you can't think of any flags, why 64 of them?

David



Re: [PATCH v3 3/6] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands

2016-09-05 Thread Daniel Mack
On 09/05/2016 03:56 PM, Daniel Borkmann wrote:
> On 09/05/2016 02:54 PM, Daniel Mack wrote:
>> On 08/30/2016 01:00 AM, Daniel Borkmann wrote:
>>> On 08/26/2016 09:58 PM, Daniel Mack wrote:
>>
enum bpf_map_type {
 @@ -147,6 +149,13 @@ union bpf_attr {
__aligned_u64   pathname;
__u32   bpf_fd;
};
 +
 +  struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
 +  __u32   target_fd;  /* container object to attach 
 to */
 +  __u32   attach_bpf_fd;  /* eBPF program to attach */
 +  __u32   attach_type;/* BPF_ATTACH_TYPE_* */
 +  __u64   attach_flags;
>>>
>>> Could we just do ...
>>>
>>> __u32 dst_fd;
>>> __u32 src_fd;
>>> __u32 attach_type;
>>>
>>> ... and leave flags out, since unused anyway? Also see below.
>>
>> I'd really like to keep the flags, even if they're unused right now.
>> This only adds 8 bytes during the syscall operation, so it doesn't harm.
>> However, we cannot change the userspace API after the fact, and who
>> knows what this (rather generic) interface will be used for later on.
> 
> With the below suggestion added, then flags doesn't need to be
> added currently as it can be done safely at a later point in time
> with respecting old binaries. See also the syscall handling code
> in kernel/bpf/syscall.c +825 and the CHECK_ATTR() macro. The
> underlying idea of this was taken from perf_event_open() syscall
> back then, see [1] for a summary.
> 
>[1] https://lkml.org/lkml/2014/8/26/116

Yes, I know that's possible, and I like the idea, but I don't think any
new interface should come without flags really, as flags are something
that will most certainly be needed at some point anyway. I didn't have
them in my first shot, but Alexei pointed out that they should be added,
and I agree.

Also, this optimization wouldn't make the transported struct payload any
smaller anyway, because the member of that union used by BPF_PROG_LOAD
is still by far the biggest.

I really don't think it's worth sparing 8 bytes here and then do the
binary compat dance after flags are added, for no real gain.



Thanks,
Daniel



Re: [PATCH v3 3/6] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands

2016-09-05 Thread Daniel Borkmann

On 09/05/2016 02:54 PM, Daniel Mack wrote:

On 08/30/2016 01:00 AM, Daniel Borkmann wrote:

On 08/26/2016 09:58 PM, Daniel Mack wrote:



   enum bpf_map_type {
@@ -147,6 +149,13 @@ union bpf_attr {
__aligned_u64   pathname;
__u32   bpf_fd;
};
+
+   struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
+   __u32   target_fd;  /* container object to attach 
to */
+   __u32   attach_bpf_fd;  /* eBPF program to attach */
+   __u32   attach_type;/* BPF_ATTACH_TYPE_* */
+   __u64   attach_flags;


Could we just do ...

__u32 dst_fd;
__u32 src_fd;
__u32 attach_type;

... and leave flags out, since unused anyway? Also see below.


I'd really like to keep the flags, even if they're unused right now.
This only adds 8 bytes during the syscall operation, so it doesn't harm.
However, we cannot change the userspace API after the fact, and who
knows what this (rather generic) interface will be used for later on.


With the below suggestion added, then flags doesn't need to be
added currently as it can be done safely at a later point in time
with respecting old binaries. See also the syscall handling code
in kernel/bpf/syscall.c +825 and the CHECK_ATTR() macro. The
underlying idea of this was taken from perf_event_open() syscall
back then, see [1] for a summary.

  [1] https://lkml.org/lkml/2014/8/26/116


#define BPF_PROG_ATTACH_LAST_FIELD attach_type
#define BPF_PROG_DETACH_LAST_FIELD BPF_PROG_ATTACH_LAST_FIELD


...



Correct way would be to:

if (CHECK_ATTR(BPF_PROG_ATTACH))
return -EINVAL;


Will add - thanks!


Daniel





Re: [PATCH v3 3/6] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands

2016-09-05 Thread Daniel Mack
On 08/27/2016 02:08 AM, Alexei Starovoitov wrote:
> On Fri, Aug 26, 2016 at 09:58:49PM +0200, Daniel Mack wrote:

>> +
>> +struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
>> +__u32   target_fd;  /* container object to attach 
>> to */
>> +__u32   attach_bpf_fd;  /* eBPF program to attach */
>> +__u32   attach_type;/* BPF_ATTACH_TYPE_* */
>> +__u64   attach_flags;
>> +};
> 
> there is a 4 byte hole in this struct. Can we pack it differently?

Okay - I swapped "type" and "flags" to repair this.

>> +switch (attr->attach_type) {
>> +case BPF_ATTACH_TYPE_CGROUP_INET_INGRESS:
>> +case BPF_ATTACH_TYPE_CGROUP_INET_EGRESS: {
>> +struct cgroup *cgrp;
>> +
>> +prog = bpf_prog_get_type(attr->attach_bpf_fd,
>> + BPF_PROG_TYPE_CGROUP_SOCKET_FILTER);
>> +if (IS_ERR(prog))
>> +return PTR_ERR(prog);
>> +
>> +cgrp = cgroup_get_from_fd(attr->target_fd);
>> +if (IS_ERR(cgrp)) {
>> +bpf_prog_put(prog);
>> +return PTR_ERR(cgrp);
>> +}
>> +
>> +cgroup_bpf_update(cgrp, prog, attr->attach_type);
>> +cgroup_put(cgrp);
>> +
>> +break;
>> +}
> 
> this } formatting style is confusing. The above } looks
> like it matches 'switch () {'.
> May be move 'struct cgroup *cgrp' to the top to avoid that?

I kept it local to its users, but you're right, it's not worth it. Will
change.


Thanks,
Daniel




Re: [PATCH v3 3/6] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands

2016-09-05 Thread Daniel Mack
On 08/30/2016 01:00 AM, Daniel Borkmann wrote:
> On 08/26/2016 09:58 PM, Daniel Mack wrote:

>>   enum bpf_map_type {
>> @@ -147,6 +149,13 @@ union bpf_attr {
>>  __aligned_u64   pathname;
>>  __u32   bpf_fd;
>>  };
>> +
>> +struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
>> +__u32   target_fd;  /* container object to attach 
>> to */
>> +__u32   attach_bpf_fd;  /* eBPF program to attach */
>> +__u32   attach_type;/* BPF_ATTACH_TYPE_* */
>> +__u64   attach_flags;
> 
> Could we just do ...
> 
> __u32 dst_fd;
> __u32 src_fd;
> __u32 attach_type;
> 
> ... and leave flags out, since unused anyway? Also see below.

I'd really like to keep the flags, even if they're unused right now.
This only adds 8 bytes during the syscall operation, so it doesn't harm.
However, we cannot change the userspace API after the fact, and who
knows what this (rather generic) interface will be used for later on.

> #define BPF_PROG_ATTACH_LAST_FIELD attach_type
> #define BPF_PROG_DETACH_LAST_FIELD BPF_PROG_ATTACH_LAST_FIELD

...

> 
> Correct way would be to:
> 
>   if (CHECK_ATTR(BPF_PROG_ATTACH))
>   return -EINVAL;

Will add - thanks!


Daniel



Re: [PATCH v3 3/6] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands

2016-08-29 Thread Daniel Borkmann

On 08/26/2016 09:58 PM, Daniel Mack wrote:

Extend the bpf(2) syscall by two new commands, BPF_PROG_ATTACH and
BPF_PROG_DETACH which allow attaching and detaching eBPF programs
to a target.

On the API level, the target could be anything that has an fd in
userspace, hence the name of the field in union bpf_attr is called
'target_fd'.

When called with BPF_ATTACH_TYPE_CGROUP_INET_{E,IN}GRESS, the target is
expected to be a valid file descriptor of a cgroup v2 directory which
has the bpf controller enabled. These are the only use-cases
implemented by this patch at this point, but more can be added.

If a program of the given type already exists in the given cgroup,
the program is swapped automically, so userspace does not have to drop
an existing program first before installing a new one, which would
otherwise leave a gap in which no program is attached.

For more information on the propagation logic to subcgroups, please
refer to the bpf cgroup controller implementation.

The API is guarded by CAP_NET_ADMIN.

Signed-off-by: Daniel Mack 
syscall


^^^ slipped in?


---
  include/uapi/linux/bpf.h |  9 ++
  kernel/bpf/syscall.c | 83 
  2 files changed, 92 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 1d5db42..4cc2dcf 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -73,6 +73,8 @@ enum bpf_cmd {
BPF_PROG_LOAD,
BPF_OBJ_PIN,
BPF_OBJ_GET,
+   BPF_PROG_ATTACH,
+   BPF_PROG_DETACH,
  };

  enum bpf_map_type {
@@ -147,6 +149,13 @@ union bpf_attr {
__aligned_u64   pathname;
__u32   bpf_fd;
};
+
+   struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
+   __u32   target_fd;  /* container object to attach 
to */
+   __u32   attach_bpf_fd;  /* eBPF program to attach */
+   __u32   attach_type;/* BPF_ATTACH_TYPE_* */
+   __u64   attach_flags;


Could we just do ...

__u32 dst_fd;
__u32 src_fd;
__u32 attach_type;

... and leave flags out, since unused anyway? Also see below.


+   };
  } __attribute__((aligned(8)));

  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 228f962..cc4d603 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -822,6 +822,79 @@ static int bpf_obj_get(const union bpf_attr *attr)
return bpf_obj_get_user(u64_to_ptr(attr->pathname));
  }

+#ifdef CONFIG_CGROUP_BPF


#define BPF_PROG_ATTACH_LAST_FIELD attach_type
#define BPF_PROG_DETACH_LAST_FIELD BPF_PROG_ATTACH_LAST_FIELD


+static int bpf_prog_attach(const union bpf_attr *attr)
+{
+   struct bpf_prog *prog;
+
+   if (!capable(CAP_NET_ADMIN))
+   return -EPERM;
+
+   /* Flags are unused for now */
+   if (attr->attach_flags != 0)
+   return -EINVAL;


Correct way would be to:

if (CHECK_ATTR(BPF_PROG_ATTACH))
return -EINVAL;


+
+   switch (attr->attach_type) {
+   case BPF_ATTACH_TYPE_CGROUP_INET_INGRESS:
+   case BPF_ATTACH_TYPE_CGROUP_INET_EGRESS: {
+   struct cgroup *cgrp;
+
+   prog = bpf_prog_get_type(attr->attach_bpf_fd,
+BPF_PROG_TYPE_CGROUP_SOCKET_FILTER);
+   if (IS_ERR(prog))
+   return PTR_ERR(prog);
+
+   cgrp = cgroup_get_from_fd(attr->target_fd);
+   if (IS_ERR(cgrp)) {
+   bpf_prog_put(prog);
+   return PTR_ERR(cgrp);
+   }
+
+   cgroup_bpf_update(cgrp, prog, attr->attach_type);
+   cgroup_put(cgrp);
+
+   break;
+   }
+
+   default:
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+static int bpf_prog_detach(const union bpf_attr *attr)
+{
+   if (!capable(CAP_NET_ADMIN))
+   return -EPERM;
+
+   /* Flags are unused for now */
+   if (attr->attach_flags != 0)
+   return -EINVAL;


if (CHECK_ATTR(BPF_PROG_DETACH))
return -EINVAL;


+   switch (attr->attach_type) {
+   case BPF_ATTACH_TYPE_CGROUP_INET_INGRESS:
+   case BPF_ATTACH_TYPE_CGROUP_INET_EGRESS: {
+   struct cgroup *cgrp;
+
+   cgrp = cgroup_get_from_fd(attr->target_fd);
+   if (IS_ERR(cgrp))
+   return PTR_ERR(cgrp);
+
+   cgroup_bpf_update(cgrp, NULL, attr->attach_type);
+   cgroup_put(cgrp);
+
+   break;
+   }
+
+   default:
+   return -EINVAL;
+   }
+
+   return 0;
+}
+#endif /* CONFIG_CGROUP_BPF */
+
  SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, 
size)
  {
union bpf_attr attr = {};
@@ -888,6 +961,16 @@ 

Re: [PATCH v3 3/6] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands

2016-08-26 Thread Alexei Starovoitov
On Fri, Aug 26, 2016 at 09:58:49PM +0200, Daniel Mack wrote:
> Extend the bpf(2) syscall by two new commands, BPF_PROG_ATTACH and
> BPF_PROG_DETACH which allow attaching and detaching eBPF programs
> to a target.
> 
> On the API level, the target could be anything that has an fd in
> userspace, hence the name of the field in union bpf_attr is called
> 'target_fd'.
> 
> When called with BPF_ATTACH_TYPE_CGROUP_INET_{E,IN}GRESS, the target is
> expected to be a valid file descriptor of a cgroup v2 directory which
> has the bpf controller enabled. These are the only use-cases
> implemented by this patch at this point, but more can be added.
> 
> If a program of the given type already exists in the given cgroup,
> the program is swapped automically, so userspace does not have to drop
> an existing program first before installing a new one, which would
> otherwise leave a gap in which no program is attached.
> 
> For more information on the propagation logic to subcgroups, please
> refer to the bpf cgroup controller implementation.
> 
> The API is guarded by CAP_NET_ADMIN.
> 
> Signed-off-by: Daniel Mack 
> syscall
> ---
>  include/uapi/linux/bpf.h |  9 ++
>  kernel/bpf/syscall.c | 83 
> 
>  2 files changed, 92 insertions(+)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 1d5db42..4cc2dcf 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -73,6 +73,8 @@ enum bpf_cmd {
>   BPF_PROG_LOAD,
>   BPF_OBJ_PIN,
>   BPF_OBJ_GET,
> + BPF_PROG_ATTACH,
> + BPF_PROG_DETACH,
>  };
>  
>  enum bpf_map_type {
> @@ -147,6 +149,13 @@ union bpf_attr {
>   __aligned_u64   pathname;
>   __u32   bpf_fd;
>   };
> +
> + struct { /* anonymous struct used by BPF_PROG_ATTACH/DETACH commands */
> + __u32   target_fd;  /* container object to attach 
> to */
> + __u32   attach_bpf_fd;  /* eBPF program to attach */
> + __u32   attach_type;/* BPF_ATTACH_TYPE_* */
> + __u64   attach_flags;
> + };

there is a 4 byte hole in this struct. Can we pack it differently?

>  } __attribute__((aligned(8)));
>  
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 228f962..cc4d603 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -822,6 +822,79 @@ static int bpf_obj_get(const union bpf_attr *attr)
>   return bpf_obj_get_user(u64_to_ptr(attr->pathname));
>  }
>  
> +#ifdef CONFIG_CGROUP_BPF
> +static int bpf_prog_attach(const union bpf_attr *attr)
> +{
> + struct bpf_prog *prog;
> +
> + if (!capable(CAP_NET_ADMIN))
> + return -EPERM;
> +
> + /* Flags are unused for now */
> + if (attr->attach_flags != 0)
> + return -EINVAL;
> +
> + switch (attr->attach_type) {
> + case BPF_ATTACH_TYPE_CGROUP_INET_INGRESS:
> + case BPF_ATTACH_TYPE_CGROUP_INET_EGRESS: {
> + struct cgroup *cgrp;
> +
> + prog = bpf_prog_get_type(attr->attach_bpf_fd,
> +  BPF_PROG_TYPE_CGROUP_SOCKET_FILTER);
> + if (IS_ERR(prog))
> + return PTR_ERR(prog);
> +
> + cgrp = cgroup_get_from_fd(attr->target_fd);
> + if (IS_ERR(cgrp)) {
> + bpf_prog_put(prog);
> + return PTR_ERR(cgrp);
> + }
> +
> + cgroup_bpf_update(cgrp, prog, attr->attach_type);
> + cgroup_put(cgrp);
> +
> + break;
> + }

this } formatting style is confusing. The above } looks
like it matches 'switch () {'.
May be move 'struct cgroup *cgrp' to the top to avoid that?