Re: [net-next v3 1/2] bpf, seccomp: Add eBPF filter capabilities

2018-03-05 Thread Sargun Dhillon
On Mon, Mar 5, 2018 at 8:10 AM, Tycho Andersen  wrote:
> Hi Andy,
>
> On Thu, Mar 01, 2018 at 10:05:47PM +, Andy Lutomirski wrote:
>> But Tycho: would hooking user notifiers in right here work for you?
>> As I see it, this would be the best justification for seccomp eBPF.
>
> Sorry for the delay; Sargun had declared on irc that he was going to
> implement it, so I didn't look into it. I think the basics will work,
> but I haven't invested the time to look into it given the above.
>
> Sargun, are you still planning to look at this? What's your timeline?
>
> Cheers,
>
> Tycho
Still working on this. I don't really have a timeline. I think I'll
get to share a prototype by the end of the week. I'm trying to come up
with a common mechanism to do this for multiple types of filters.


Re: [net-next v3 1/2] bpf, seccomp: Add eBPF filter capabilities

2018-03-05 Thread Tycho Andersen
Hi Andy,

On Thu, Mar 01, 2018 at 10:05:47PM +, Andy Lutomirski wrote:
> But Tycho: would hooking user notifiers in right here work for you?
> As I see it, this would be the best justification for seccomp eBPF.

Sorry for the delay; Sargun had declared on irc that he was going to
implement it, so I didn't look into it. I think the basics will work,
but I haven't invested the time to look into it given the above.

Sargun, are you still planning to look at this? What's your timeline?

Cheers,

Tycho


Re: [net-next v3 1/2] bpf, seccomp: Add eBPF filter capabilities

2018-03-01 Thread Andy Lutomirski
On Mon, Feb 26, 2018 at 7:27 AM, Sargun Dhillon  wrote:
> This introduces the BPF_PROG_TYPE_SECCOMP bpf program type. It is meant
> to be used for seccomp filters as an alternative to cBPF filters. The
> program type has relatively limited capabilities in terms of helpers,
> but that can be extended later on.
>
> The eBPF code loading is separated from attachment of the filter, so
> a privileged user can load the filter, and pass it back to an
> unprivileged user who can attach it and use it at a later time.
>
> In order to attach the filter itself, you need to supply a flag to the
> seccomp syscall indicating that a eBPF filter is being attached, as
> opposed to a cBPF one. Verification occurs at program load time,
> so the user should only receive errors related to attachment.
> +static const struct bpf_func_proto *
> +seccomp_func_proto(enum bpf_func_id func_id)
> +{

return NULL unconditionally.  If you want different behavior, make it
a separate patch to be reviewed separately.

> +   switch (func_id) {
> +   case BPF_FUNC_get_current_uid_gid:
> +   return _get_current_uid_gid_proto;

NAK.  Doesn't work right in containers.  *Maybe* it would be okay to
have a helper that returns the uid/gid in the namespace of the filter
creator, but that's not what this does.

> +   case BPF_FUNC_ktime_get_ns:
> +   return _ktime_get_ns_proto;

NAK.  This would need a very clear use case that makes it worthwhile.
As it stands, it interferes with CRIU for no obvious good reason.

> +   case BPF_FUNC_get_prandom_u32:
> +   return _get_prandom_u32_proto;

NAK.  This helper is cryptographically insecure in that it surely
allows one user to figure out the entire random number stream of
another and, worse, it could be abused to allow one user to *control*
the stream of another user.  Read the code in bpf_user_rnd_u32.

If there is a legitimate reason to allow random number generation in
seccomp eBPF programs, it can be its own patch, and it needs, at the
very least, the properties that a filter's stream of random bytes is
unpredictable and uncontrollable by anyone else.  Realistically, this
means you need the arch_random stuff or the actual get_random_bytes()
interface.

> +   case BPF_FUNC_get_current_pid_tgid:
> +   return _get_current_pid_tgid_proto;

NAK for the same reason as get_current_uid_gid.

But Tycho: would hooking user notifiers in right here work for you?
As I see it, this would be the best justification for seccomp eBPF.

--Andy


Re: [net-next v3 1/2] bpf, seccomp: Add eBPF filter capabilities

2018-02-26 Thread Kees Cook
On Mon, Feb 26, 2018 at 8:08 PM, Sargun Dhillon  wrote:
> On Mon, Feb 26, 2018 at 7:57 PM, Tycho Andersen  wrote:
>> On Mon, Feb 26, 2018 at 07:49:48PM -0800, Sargun Dhillon wrote:
>>> On Mon, Feb 26, 2018 at 4:54 PM, Tycho Andersen  wrote:
>>> > On Mon, Feb 26, 2018 at 07:27:05AM +, Sargun Dhillon wrote:
>>> >> +config SECCOMP_FILTER_EXTENDED
>>> >> + bool "Extended BPF seccomp filters"
>>> >> + depends on SECCOMP_FILTER && BPF_SYSCALL
>>> >> + depends on !CHECKPOINT_RESTORE
>>> >
>>> > Why not just give -EINVAL or something in case one of these is
>>> > requested, instead of making them incompatible at compile time?
>>> >
>>> > Tycho
>>> There's already code to return -EMEDIUMTYPE if it's a non-classic, or
>>> non-saved filter. Under the normal case, with CHECKPOINT_RESTORE
>>> enabled, you should never be able to get that. I think it makes sense
>>> to preserve this behaviour.
>>
>> Oh, right. So can't we just drop this, and the existing code will
>> DTRT, i.e. give you -EMEDIUMTYPE because the new filters aren't
>> supported, until they are?
>>
>> Tycho
> My suggestion is we merge this as is, so we don't break checkpoint /
> restore, and I will try to get the filter dumping patching in the same
> development cycle as it comes at minimal risk. Otherwise, we risk
> introducing a feature which could break checkpoint/restore, even in
> unprivileged containers since anyone can load a BPF Seccomp filter.

There is no rush to merge such a drastic expansion of the seccomp
attack surface. :) For me, the driving feature is if we can get
Tycho's notifier implemented in eBPF. The speed improvements, as far
as I'm concerned, aren't sufficient to add eBPF to seccomp. They are
certainly a nice benefit, but seccomp must be very conservative about
adding attack surface.

-Kees

-- 
Kees Cook
Pixel Security


Re: [net-next v3 1/2] bpf, seccomp: Add eBPF filter capabilities

2018-02-26 Thread Sargun Dhillon
On Mon, Feb 26, 2018 at 7:57 PM, Tycho Andersen  wrote:
> On Mon, Feb 26, 2018 at 07:49:48PM -0800, Sargun Dhillon wrote:
>> On Mon, Feb 26, 2018 at 4:54 PM, Tycho Andersen  wrote:
>> > On Mon, Feb 26, 2018 at 07:27:05AM +, Sargun Dhillon wrote:
>> >> +config SECCOMP_FILTER_EXTENDED
>> >> + bool "Extended BPF seccomp filters"
>> >> + depends on SECCOMP_FILTER && BPF_SYSCALL
>> >> + depends on !CHECKPOINT_RESTORE
>> >
>> > Why not just give -EINVAL or something in case one of these is
>> > requested, instead of making them incompatible at compile time?
>> >
>> > Tycho
>> There's already code to return -EMEDIUMTYPE if it's a non-classic, or
>> non-saved filter. Under the normal case, with CHECKPOINT_RESTORE
>> enabled, you should never be able to get that. I think it makes sense
>> to preserve this behaviour.
>
> Oh, right. So can't we just drop this, and the existing code will
> DTRT, i.e. give you -EMEDIUMTYPE because the new filters aren't
> supported, until they are?
>
> Tycho
My suggestion is we merge this as is, so we don't break checkpoint /
restore, and I will try to get the filter dumping patching in the same
development cycle as it comes at minimal risk. Otherwise, we risk
introducing a feature which could break checkpoint/restore, even in
unprivileged containers since anyone can load a BPF Seccomp filter.


Re: [net-next v3 1/2] bpf, seccomp: Add eBPF filter capabilities

2018-02-26 Thread Tycho Andersen
On Mon, Feb 26, 2018 at 07:49:48PM -0800, Sargun Dhillon wrote:
> On Mon, Feb 26, 2018 at 4:54 PM, Tycho Andersen  wrote:
> > On Mon, Feb 26, 2018 at 07:27:05AM +, Sargun Dhillon wrote:
> >> +config SECCOMP_FILTER_EXTENDED
> >> + bool "Extended BPF seccomp filters"
> >> + depends on SECCOMP_FILTER && BPF_SYSCALL
> >> + depends on !CHECKPOINT_RESTORE
> >
> > Why not just give -EINVAL or something in case one of these is
> > requested, instead of making them incompatible at compile time?
> >
> > Tycho
> There's already code to return -EMEDIUMTYPE if it's a non-classic, or
> non-saved filter. Under the normal case, with CHECKPOINT_RESTORE
> enabled, you should never be able to get that. I think it makes sense
> to preserve this behaviour.

Oh, right. So can't we just drop this, and the existing code will
DTRT, i.e. give you -EMEDIUMTYPE because the new filters aren't
supported, until they are?

Tycho


Re: [net-next v3 1/2] bpf, seccomp: Add eBPF filter capabilities

2018-02-26 Thread Sargun Dhillon
On Mon, Feb 26, 2018 at 4:54 PM, Tycho Andersen  wrote:
> On Mon, Feb 26, 2018 at 07:27:05AM +, Sargun Dhillon wrote:
>> +config SECCOMP_FILTER_EXTENDED
>> + bool "Extended BPF seccomp filters"
>> + depends on SECCOMP_FILTER && BPF_SYSCALL
>> + depends on !CHECKPOINT_RESTORE
>
> Why not just give -EINVAL or something in case one of these is
> requested, instead of making them incompatible at compile time?
>
> Tycho
There's already code to return -EMEDIUMTYPE if it's a non-classic, or
non-saved filter. Under the normal case, with CHECKPOINT_RESTORE
enabled, you should never be able to get that. I think it makes sense
to preserve this behaviour.

My rough plan is to introduce a mechanism to dump filters like you can
cBPF filters. If you look at my v1, there was a patch that did this.
Once this gets in, I can prepare that patch, and we can lift this
restriction.


Re: [net-next v3 1/2] bpf, seccomp: Add eBPF filter capabilities

2018-02-26 Thread Tycho Andersen
On Mon, Feb 26, 2018 at 07:27:05AM +, Sargun Dhillon wrote:
> +config SECCOMP_FILTER_EXTENDED
> + bool "Extended BPF seccomp filters"
> + depends on SECCOMP_FILTER && BPF_SYSCALL
> + depends on !CHECKPOINT_RESTORE

Why not just give -EINVAL or something in case one of these is
requested, instead of making them incompatible at compile time?

Tycho


[net-next v3 1/2] bpf, seccomp: Add eBPF filter capabilities

2018-02-25 Thread Sargun Dhillon
This introduces the BPF_PROG_TYPE_SECCOMP bpf program type. It is meant
to be used for seccomp filters as an alternative to cBPF filters. The
program type has relatively limited capabilities in terms of helpers,
but that can be extended later on.

The eBPF code loading is separated from attachment of the filter, so
a privileged user can load the filter, and pass it back to an
unprivileged user who can attach it and use it at a later time.

In order to attach the filter itself, you need to supply a flag to the
seccomp syscall indicating that a eBPF filter is being attached, as
opposed to a cBPF one. Verification occurs at program load time,
so the user should only receive errors related to attachment.

Signed-off-by: Sargun Dhillon 
---
 arch/Kconfig |   8 +++
 include/linux/bpf_types.h|   3 +
 include/linux/seccomp.h  |   3 +-
 include/uapi/linux/bpf.h |   2 +
 include/uapi/linux/seccomp.h |   7 +-
 kernel/bpf/syscall.c |   1 +
 kernel/seccomp.c | 159 ---
 7 files changed, 156 insertions(+), 27 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 76c0b54443b1..8490d35e59d6 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -401,6 +401,14 @@ config SECCOMP_FILTER
 
  See Documentation/prctl/seccomp_filter.txt for details.
 
+config SECCOMP_FILTER_EXTENDED
+   bool "Extended BPF seccomp filters"
+   depends on SECCOMP_FILTER && BPF_SYSCALL
+   depends on !CHECKPOINT_RESTORE
+   help
+ Enables seccomp filters to be written in eBPF, as opposed
+ to just cBPF filters.
+
 config HAVE_GCC_PLUGINS
bool
help
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index 19b8349a3809..945c65c4e461 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -22,6 +22,9 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_PERF_EVENT, perf_event)
 #ifdef CONFIG_CGROUP_BPF
 BPF_PROG_TYPE(BPF_PROG_TYPE_CGROUP_DEVICE, cg_dev)
 #endif
+#ifdef CONFIG_SECCOMP_FILTER_EXTENDED
+BPF_PROG_TYPE(BPF_PROG_TYPE_SECCOMP, seccomp)
+#endif
 
 BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_PERCPU_ARRAY, percpu_array_map_ops)
diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index c723a5c4e3ff..a7df3ba6cf25 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -5,7 +5,8 @@
 #include 
 
 #define SECCOMP_FILTER_FLAG_MASK   (SECCOMP_FILTER_FLAG_TSYNC | \
-SECCOMP_FILTER_FLAG_LOG)
+SECCOMP_FILTER_FLAG_LOG | \
+SECCOMP_FILTER_FLAG_EXTENDED)
 
 #ifdef CONFIG_SECCOMP
 
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index db6bdc375126..5f96cb7ed954 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1,3 +1,4 @@
+
 /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
 /* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
  *
@@ -133,6 +134,7 @@ enum bpf_prog_type {
BPF_PROG_TYPE_SOCK_OPS,
BPF_PROG_TYPE_SK_SKB,
BPF_PROG_TYPE_CGROUP_DEVICE,
+   BPF_PROG_TYPE_SECCOMP,
 };
 
 enum bpf_attach_type {
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 2a0bd9dd104d..730af6c7ec2e 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -16,10 +16,11 @@
 #define SECCOMP_SET_MODE_FILTER1
 #define SECCOMP_GET_ACTION_AVAIL   2
 
-/* Valid flags for SECCOMP_SET_MODE_FILTER */
-#define SECCOMP_FILTER_FLAG_TSYNC  1
-#define SECCOMP_FILTER_FLAG_LOG2
 
+/* Valid flags for SECCOMP_SET_MODE_FILTER */
+#define SECCOMP_FILTER_FLAG_TSYNC  (1 << 0)
+#define SECCOMP_FILTER_FLAG_LOG(1 << 1)
+#define SECCOMP_FILTER_FLAG_EXTENDED   (1 << 2)
 /*
  * All BPF programs must return a 32-bit value.
  * The bottom 16-bits are for optional return data.
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e24aa3241387..86d6ec8b916d 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1202,6 +1202,7 @@ static int bpf_prog_load(union bpf_attr *attr)
 
if (type != BPF_PROG_TYPE_SOCKET_FILTER &&
type != BPF_PROG_TYPE_CGROUP_SKB &&
+   type != BPF_PROG_TYPE_SECCOMP &&
!capable(CAP_SYS_ADMIN))
return -EPERM;
 
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index dc77548167ef..d95c24181a6c 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /**
  * struct seccomp_filter - container for seccomp BPF programs
@@ -367,17 +368,6 @@ static struct seccomp_filter 
*seccomp_prepare_filter(struct sock_fprog *fprog)
 
BUG_ON(INT_MAX / fprog->len < sizeof(struct sock_filter));
 
-   /*
-* Installing a seccomp filter requires that the task has
-* CAP_SYS_ADMIN in its namespace or be running with