Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
On Mon, Sep 14, 2015 at 06:48:43PM +0200, Daniel Borkmann wrote:
> On 09/14/2015 06:00 PM, Tycho Andersen wrote:
> >On Fri, Sep 11, 2015 at 08:28:19PM +0200, Daniel Borkmann wrote:
> >>I think due to the given insns restrictions on classic seccomp, this
> >>could work for "most cases" (see below) for the time being until pointer
> >>sanitation is resolved and that seccomp-only restriction from the dump
> >>could be removed,
> >
> >Ok, thanks.
> >
> >>BUT there's one more stone in the road which you still
> >>need to take care of with this whole 'giving classic seccomp-BPF -> eBPF
> >>transforms an fd, dumping and restoring that via bpf(2)' approach:
> >>
> >>If you have JIT enabled on ARM32, and add a classic seccomp-BPF filter,
> >>and dump that via your bpf(2) interface based on the current patches, what
> >>you'll get is not eBPF opcodes but classic (!) BPF opcodes as ARM32 classic
> >>JIT supports compilation of seccomp, since commit 24e737c1ebac ("ARM: net:
> >>add JIT support for loads from struct seccomp_data.").
> >>
> >>So in that case, bpf_prepare_filter() will not call into bpf_migrate_filter()
> >>as there's simply no need for it, because the classic code could already
> >>be JITed there. I guess other archs where JIT support for eBPF in not yet
> >>within near sight might sooner or later support this insn for their classic
> >>JITs, too ...
> >
> >Thanks for pointing this out.
> >
> >What if we legislate that the output of bpf(BPF_PROG_DUMP, ...) is
> >always eBPF? As near as I can tell there is no way to determine if a
> >struct bpf_prog is classic or eBPF, so we'd need to add a bit to
> >indicate whether or not the prog has been converted so that
> >BPF_PROG_DUMP knows when to convert it.
>
> As I said, you have bpf_prog_was_classic() function to determine exactly
> this (so without your type re-assignment you have a way to distinguish it).

I don't think this is the same thing, though. IIUC, when the classic
jit succeeds, bpf_prog_was_classic() will still return true even
though prog->insnsi points to classic instructions instead of eBPF
ones, and (I think) this situation is impossible to distinguish.

Anyway, it sounds like this doesn't matter, as we have...

> Wouldn't it be much easier to rip this set apart into multiple ones, solving
> one individual thing at a time, f.e. starting out simple and 1) only add
> native eBPF support to seccomp, after that 2) add a method to dump native-only
> eBPF programs for criu, then 3) think about a right interface for classic
> BPF seccomp dumping, etc, etc? Currently, it tries to solve everything at
> once, and with some early assumptions that have non-trivial side-effects.

The primary motivation for this set is your bullet 3, c/r of programs
with classic bpf programs (i.e. what seccomp supports now). Initially,
I thought it was best to try and dump the eBPFs directly, but it seems
there are a lot of complications I wasn't aware of. Perhaps I'll look
at a bpf_prog_store_orig_filter() style approach.

Thanks,

Tycho
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
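The ambiguity raised in this message can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct model_prog`, the `insns_are_ebpf` bit, and every `model_*` function are hypothetical names invented for illustration. The only thing taken from the thread is the idea that a classic program keeps a type of BPF_PROG_TYPE_UNSPEC whether or not its instructions were migrated to eBPF.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; none of these types are the kernel's. */
enum model_prog_type {
	MODEL_PROG_TYPE_UNSPEC = 0,	/* stands in for BPF_PROG_TYPE_UNSPEC */
	MODEL_PROG_TYPE_SECCOMP		/* stands in for BPF_PROG_TYPE_SECCOMP */
};

struct model_prog {
	enum model_prog_type type;
	bool insns_are_ebpf;	/* hypothetical extra bit floated in the thread */
};

/* Models the check the thread attributes to bpf_prog_was_classic(). */
static inline bool model_was_classic(const struct model_prog *p)
{
	return p->type == MODEL_PROG_TYPE_UNSPEC;
}

/*
 * Models the decision described for bpf_prepare_filter(): when a classic
 * JIT accepts the filter, the instructions stay in classic form; otherwise
 * they are migrated to eBPF. The type stays UNSPEC in both cases.
 */
static inline struct model_prog model_prepare(bool classic_jit_succeeded)
{
	struct model_prog p;

	p.type = MODEL_PROG_TYPE_UNSPEC;
	p.insns_are_ebpf = !classic_jit_succeeded;
	return p;
}

/* A dump that must always emit eBPF converts only when the bit is clear. */
static inline bool model_dump_needs_conversion(const struct model_prog *p)
{
	return !p->insns_are_ebpf;
}

/* Both programs look "classic" by type, yet only one holds eBPF insns. */
static inline void model_demo(void)
{
	struct model_prog jited = model_prepare(true);
	struct model_prog migrated = model_prepare(false);

	assert(model_was_classic(&jited) && model_was_classic(&migrated));
	assert(model_dump_needs_conversion(&jited));
	assert(!model_dump_needs_conversion(&migrated));
}
```

In this model the type check alone gives the same answer for both programs, which is why a separate bit (or keeping the original filter around, as with a bpf_prog_store_orig_filter() style approach) would be needed to tell them apart.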
Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
On 09/14/2015 06:00 PM, Tycho Andersen wrote:
>On Fri, Sep 11, 2015 at 08:28:19PM +0200, Daniel Borkmann wrote:
>>I think due to the given insns restrictions on classic seccomp, this
>>could work for "most cases" (see below) for the time being until pointer
>>sanitation is resolved and that seccomp-only restriction from the dump
>>could be removed,
>
>Ok, thanks.
>
>>BUT there's one more stone in the road which you still
>>need to take care of with this whole 'giving classic seccomp-BPF -> eBPF
>>transforms an fd, dumping and restoring that via bpf(2)' approach:
>>
>>If you have JIT enabled on ARM32, and add a classic seccomp-BPF filter,
>>and dump that via your bpf(2) interface based on the current patches, what
>>you'll get is not eBPF opcodes but classic (!) BPF opcodes as ARM32 classic
>>JIT supports compilation of seccomp, since commit 24e737c1ebac ("ARM: net:
>>add JIT support for loads from struct seccomp_data.").
>>
>>So in that case, bpf_prepare_filter() will not call into bpf_migrate_filter()
>>as there's simply no need for it, because the classic code could already
>>be JITed there. I guess other archs where JIT support for eBPF in not yet
>>within near sight might sooner or later support this insn for their classic
>>JITs, too ...
>
>Thanks for pointing this out.
>
>What if we legislate that the output of bpf(BPF_PROG_DUMP, ...) is
>always eBPF? As near as I can tell there is no way to determine if a
>struct bpf_prog is classic or eBPF, so we'd need to add a bit to
>indicate whether or not the prog has been converted so that
>BPF_PROG_DUMP knows when to convert it.

As I said, you have bpf_prog_was_classic() function to determine exactly
this (so without your type re-assignment you have a way to distinguish it).

Wouldn't it be much easier to rip this set apart into multiple ones, solving
one individual thing at a time, f.e. starting out simple and 1) only add
native eBPF support to seccomp, after that 2) add a method to dump native-only
eBPF programs for criu, then 3) think about a right interface for classic
BPF seccomp dumping, etc, etc? Currently, it tries to solve everything at
once, and with some early assumptions that have non-trivial side-effects.

Thanks,
Daniel
Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
Hi Daniel,

On Fri, Sep 11, 2015 at 08:28:19PM +0200, Daniel Borkmann wrote:
> I think due to the given insns restrictions on classic seccomp, this
> could work for "most cases" (see below) for the time being until pointer
> sanitation is resolved and that seccomp-only restriction from the dump
> could be removed,

Ok, thanks.

> BUT there's one more stone in the road which you still
> need to take care of with this whole 'giving classic seccomp-BPF -> eBPF
> transforms an fd, dumping and restoring that via bpf(2)' approach:
>
> If you have JIT enabled on ARM32, and add a classic seccomp-BPF filter,
> and dump that via your bpf(2) interface based on the current patches, what
> you'll get is not eBPF opcodes but classic (!) BPF opcodes as ARM32 classic
> JIT supports compilation of seccomp, since commit 24e737c1ebac ("ARM: net:
> add JIT support for loads from struct seccomp_data.").
>
> So in that case, bpf_prepare_filter() will not call into bpf_migrate_filter()
> as there's simply no need for it, because the classic code could already
> be JITed there. I guess other archs where JIT support for eBPF in not yet
> within near sight might sooner or later support this insn for their classic
> JITs, too ...

Thanks for pointing this out.

What if we legislate that the output of bpf(BPF_PROG_DUMP, ...) is
always eBPF? As near as I can tell there is no way to determine if a
struct bpf_prog is classic or eBPF, so we'd need to add a bit to
indicate whether or not the prog has been converted so that
BPF_PROG_DUMP knows when to convert it.

Tycho
Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
On 09/11/2015 07:33 PM, Tycho Andersen wrote:
>On Fri, Sep 11, 2015 at 06:03:59PM +0200, Daniel Borkmann wrote:
>>On 09/11/2015 04:44 PM, Tycho Andersen wrote:
>>>On Fri, Sep 11, 2015 at 03:02:36PM +0200, Daniel Borkmann wrote:
>>>>On 09/11/2015 02:20 AM, Tycho Andersen wrote:
>>>>>In the next patch, we're going to add a way to access the underlying
>>>>>filters via bpf fds. This means that we need to ref-count both the
>>>>>struct seccomp_filter objects and the struct bpf_prog objects separately,
>>>>>in case a process dies but a filter is still referred to by another
>>>>>process.
>>>>>
>>>>>Additionally, we mark classic converted seccomp filters as seccomp eBPF
>>>>>programs, since they are a subset of what is supported in seccomp eBPF.
>>>>>
>>>>>Signed-off-by: Tycho Andersen
>>>>>CC: Kees Cook
>>>>>CC: Will Drewry
>>>>>CC: Oleg Nesterov
>>>>>CC: Andy Lutomirski
>>>>>CC: Pavel Emelyanov
>>>>>CC: Serge E. Hallyn
>>>>>CC: Alexei Starovoitov
>>>>>CC: Daniel Borkmann
>>>>>---
>>>>> kernel/seccomp.c | 4 +++-
>>>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>
>>>>>diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>>>>>index 245df6b..afaeddf 100644
>>>>>--- a/kernel/seccomp.c
>>>>>+++ b/kernel/seccomp.c
>>>>>@@ -378,6 +378,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>>>>> 	}
>>>>>
>>>>> 	atomic_set(&sfilter->usage, 1);
>>>>>+	atomic_set(&sfilter->prog->aux->refcnt, 1);
>>>>>+	sfilter->prog->type = BPF_PROG_TYPE_SECCOMP;
>>>>
>>>>So, if you do this, then this breaks the assumption of eBPF JITs
>>>>that, currently, all classic converted BPF programs always have a
>>>>prog->type of BPF_PROG_TYPE_UNSPEC (see: bpf_prog_was_classic()).
>>>>
>>>>Currently, JITs make use of this information to determine whether
>>>>A and X mappings for such programs should or should not be cleared
>>>>in the prologue (s390 currently).
>>>>
>>>>In the seccomp_prepare_filter() stage, we're already past that, so
>>>>it will not cause an issue, but we certainly would need to be very
>>>>careful in future, if bpf_prog_was_classic() is then used at a later
>>>>stage when we already have a generated bpf_prog somewhere, as then
>>>>this assumption will break.
>>>
>>>The only reason we need to do this is to allow BPF_DUMP_PROG to work,
>>>since we were restricting it to only allow dumping of seccomp
>>>programs, since those don't have maps. Instead, perhaps we could allow
>>>dumping of BPF_PROG_TYPE_SECCOMP and BPF_PROG_TYPE_UNSPEC?
>>
>>There are possibilities that BPF_PROG_TYPE_UNSPEC is calling helpers
>>already today, at least in networking case, not seccomp. So, since
>>you want to export [classic -> eBPF] only for seccomp, put fds on them
>>and dump these via bpf(2), you could allow that (with a big comment
>>stating why it's safe), but mid-term we really need to sanitize all
>>this stuff properly as this is needed for other types, too.
>
>Sorry, just to be clear, you're suggesting that the patch is ok modulo
>a comment describing the jit issues?

I think due to the given insns restrictions on classic seccomp, this
could work for "most cases" (see below) for the time being until pointer
sanitation is resolved and that seccomp-only restriction from the dump
could be removed, BUT there's one more stone in the road which you still
need to take care of with this whole 'giving classic seccomp-BPF -> eBPF
transforms an fd, dumping and restoring that via bpf(2)' approach:

If you have JIT enabled on ARM32, and add a classic seccomp-BPF filter,
and dump that via your bpf(2) interface based on the current patches, what
you'll get is not eBPF opcodes but classic (!) BPF opcodes as ARM32 classic
JIT supports compilation of seccomp, since commit 24e737c1ebac ("ARM: net:
add JIT support for loads from struct seccomp_data.").

So in that case, bpf_prepare_filter() will not call into bpf_migrate_filter()
as there's simply no need for it, because the classic code could already
be JITed there. I guess other archs where JIT support for eBPF in not yet
within near sight might sooner or later support this insn for their classic
JITs, too ...
Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
On Fri, Sep 11, 2015 at 06:03:59PM +0200, Daniel Borkmann wrote:
> On 09/11/2015 04:44 PM, Tycho Andersen wrote:
> >On Fri, Sep 11, 2015 at 03:02:36PM +0200, Daniel Borkmann wrote:
> >>On 09/11/2015 02:20 AM, Tycho Andersen wrote:
> >>>In the next patch, we're going to add a way to access the underlying
> >>>filters via bpf fds. This means that we need to ref-count both the
> >>>struct seccomp_filter objects and the struct bpf_prog objects separately,
> >>>in case a process dies but a filter is still referred to by another
> >>>process.
> >>>
> >>>Additionally, we mark classic converted seccomp filters as seccomp eBPF
> >>>programs, since they are a subset of what is supported in seccomp eBPF.
> >>>
> >>>Signed-off-by: Tycho Andersen
> >>>CC: Kees Cook
> >>>CC: Will Drewry
> >>>CC: Oleg Nesterov
> >>>CC: Andy Lutomirski
> >>>CC: Pavel Emelyanov
> >>>CC: Serge E. Hallyn
> >>>CC: Alexei Starovoitov
> >>>CC: Daniel Borkmann
> >>>---
> >>> kernel/seccomp.c | 4 +++-
> >>> 1 file changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>>diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> >>>index 245df6b..afaeddf 100644
> >>>--- a/kernel/seccomp.c
> >>>+++ b/kernel/seccomp.c
> >>>@@ -378,6 +378,8 @@ static struct seccomp_filter
> >>>*seccomp_prepare_filter(struct sock_fprog *fprog)
> >>> 	}
> >>>
> >>> 	atomic_set(&sfilter->usage, 1);
> >>>+	atomic_set(&sfilter->prog->aux->refcnt, 1);
> >>>+	sfilter->prog->type = BPF_PROG_TYPE_SECCOMP;
> >>
> >>So, if you do this, then this breaks the assumption of eBPF JITs
> >>that, currently, all classic converted BPF programs always have a
> >>prog->type of BPF_PROG_TYPE_UNSPEC (see: bpf_prog_was_classic()).
> >>
> >>Currently, JITs make use of this information to determine whether
> >>A and X mappings for such programs should or should not be cleared
> >>in the prologue (s390 currently).
> >>
> >>In the seccomp_prepare_filter() stage, we're already past that, so
> >>it will not cause an issue, but we certainly would need to be very
> >>careful in future, if bpf_prog_was_classic() is then used at a later
> >>stage when we already have a generated bpf_prog somewhere, as then
> >>this assumption will break.
> >
> >The only reason we need to do this is to allow BPF_DUMP_PROG to work,
> >since we were restricting it to only allow dumping of seccomp
> >programs, since those don't have maps. Instead, perhaps we could allow
> >dumping of BPF_PROG_TYPE_SECCOMP and BPF_PROG_TYPE_UNSPEC?
>
> There are possibilities that BPF_PROG_TYPE_UNSPEC is calling helpers
> already today, at least in networking case, not seccomp. So, since
> you want to export [classic -> eBPF] only for seccomp, put fds on them
> and dump these via bpf(2), you could allow that (with a big comment
> stating why it's safe), but mid-term we really need to sanitize all
> this stuff properly as this is needed for other types, too.

Sorry, just to be clear, you're suggesting that the patch is ok modulo
a comment describing the jit issues?

Tycho
Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
On 09/11/2015 04:44 PM, Tycho Andersen wrote:
>On Fri, Sep 11, 2015 at 03:02:36PM +0200, Daniel Borkmann wrote:
>>On 09/11/2015 02:20 AM, Tycho Andersen wrote:
>>>In the next patch, we're going to add a way to access the underlying
>>>filters via bpf fds. This means that we need to ref-count both the
>>>struct seccomp_filter objects and the struct bpf_prog objects separately,
>>>in case a process dies but a filter is still referred to by another
>>>process.
>>>
>>>Additionally, we mark classic converted seccomp filters as seccomp eBPF
>>>programs, since they are a subset of what is supported in seccomp eBPF.
>>>
>>>Signed-off-by: Tycho Andersen
>>>CC: Kees Cook
>>>CC: Will Drewry
>>>CC: Oleg Nesterov
>>>CC: Andy Lutomirski
>>>CC: Pavel Emelyanov
>>>CC: Serge E. Hallyn
>>>CC: Alexei Starovoitov
>>>CC: Daniel Borkmann
>>>---
>>> kernel/seccomp.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>>diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>>>index 245df6b..afaeddf 100644
>>>--- a/kernel/seccomp.c
>>>+++ b/kernel/seccomp.c
>>>@@ -378,6 +378,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>>> 	}
>>>
>>> 	atomic_set(&sfilter->usage, 1);
>>>+	atomic_set(&sfilter->prog->aux->refcnt, 1);
>>>+	sfilter->prog->type = BPF_PROG_TYPE_SECCOMP;
>>
>>So, if you do this, then this breaks the assumption of eBPF JITs
>>that, currently, all classic converted BPF programs always have a
>>prog->type of BPF_PROG_TYPE_UNSPEC (see: bpf_prog_was_classic()).
>>
>>Currently, JITs make use of this information to determine whether
>>A and X mappings for such programs should or should not be cleared
>>in the prologue (s390 currently).
>>
>>In the seccomp_prepare_filter() stage, we're already past that, so
>>it will not cause an issue, but we certainly would need to be very
>>careful in future, if bpf_prog_was_classic() is then used at a later
>>stage when we already have a generated bpf_prog somewhere, as then
>>this assumption will break.
>
>The only reason we need to do this is to allow BPF_DUMP_PROG to work,
>since we were restricting it to only allow dumping of seccomp
>programs, since those don't have maps. Instead, perhaps we could allow
>dumping of BPF_PROG_TYPE_SECCOMP and BPF_PROG_TYPE_UNSPEC?

There are possibilities that BPF_PROG_TYPE_UNSPEC is calling helpers
already today, at least in networking case, not seccomp. So, since
you want to export [classic -> eBPF] only for seccomp, put fds on them
and dump these via bpf(2), you could allow that (with a big comment
stating why it's safe), but mid-term we really need to sanitize all
this stuff properly as this is needed for other types, too.
Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
On Fri, Sep 11, 2015 at 03:02:36PM +0200, Daniel Borkmann wrote:
> On 09/11/2015 02:20 AM, Tycho Andersen wrote:
> >In the next patch, we're going to add a way to access the underlying
> >filters via bpf fds. This means that we need to ref-count both the
> >struct seccomp_filter objects and the struct bpf_prog objects separately,
> >in case a process dies but a filter is still referred to by another
> >process.
> >
> >Additionally, we mark classic converted seccomp filters as seccomp eBPF
> >programs, since they are a subset of what is supported in seccomp eBPF.
> >
> >Signed-off-by: Tycho Andersen
> >CC: Kees Cook
> >CC: Will Drewry
> >CC: Oleg Nesterov
> >CC: Andy Lutomirski
> >CC: Pavel Emelyanov
> >CC: Serge E. Hallyn
> >CC: Alexei Starovoitov
> >CC: Daniel Borkmann
> >---
> > kernel/seccomp.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> >diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> >index 245df6b..afaeddf 100644
> >--- a/kernel/seccomp.c
> >+++ b/kernel/seccomp.c
> >@@ -378,6 +378,8 @@ static struct seccomp_filter
> >*seccomp_prepare_filter(struct sock_fprog *fprog)
> > 	}
> >
> > 	atomic_set(&sfilter->usage, 1);
> >+	atomic_set(&sfilter->prog->aux->refcnt, 1);
> >+	sfilter->prog->type = BPF_PROG_TYPE_SECCOMP;
>
> So, if you do this, then this breaks the assumption of eBPF JITs
> that, currently, all classic converted BPF programs always have a
> prog->type of BPF_PROG_TYPE_UNSPEC (see: bpf_prog_was_classic()).
>
> Currently, JITs make use of this information to determine whether
> A and X mappings for such programs should or should not be cleared
> in the prologue (s390 currently).
>
> In the seccomp_prepare_filter() stage, we're already past that, so
> it will not cause an issue, but we certainly would need to be very
> careful in future, if bpf_prog_was_classic() is then used at a later
> stage when we already have a generated bpf_prog somewhere, as then
> this assumption will break.

The only reason we need to do this is to allow BPF_DUMP_PROG to work,
since we were restricting it to only allow dumping of seccomp
programs, since those don't have maps. Instead, perhaps we could allow
dumping of BPF_PROG_TYPE_SECCOMP and BPF_PROG_TYPE_UNSPEC?

Tycho
Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
On 09/11/2015 02:20 AM, Tycho Andersen wrote:
>In the next patch, we're going to add a way to access the underlying
>filters via bpf fds. This means that we need to ref-count both the
>struct seccomp_filter objects and the struct bpf_prog objects separately,
>in case a process dies but a filter is still referred to by another
>process.
>
>Additionally, we mark classic converted seccomp filters as seccomp eBPF
>programs, since they are a subset of what is supported in seccomp eBPF.
>
>Signed-off-by: Tycho Andersen
>CC: Kees Cook
>CC: Will Drewry
>CC: Oleg Nesterov
>CC: Andy Lutomirski
>CC: Pavel Emelyanov
>CC: Serge E. Hallyn
>CC: Alexei Starovoitov
>CC: Daniel Borkmann
>---
> kernel/seccomp.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
>diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>index 245df6b..afaeddf 100644
>--- a/kernel/seccomp.c
>+++ b/kernel/seccomp.c
>@@ -378,6 +378,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
> 	}
>
> 	atomic_set(&sfilter->usage, 1);
>+	atomic_set(&sfilter->prog->aux->refcnt, 1);
>+	sfilter->prog->type = BPF_PROG_TYPE_SECCOMP;

So, if you do this, then this breaks the assumption of eBPF JITs
that, currently, all classic converted BPF programs always have a
prog->type of BPF_PROG_TYPE_UNSPEC (see: bpf_prog_was_classic()).

Currently, JITs make use of this information to determine whether
A and X mappings for such programs should or should not be cleared
in the prologue (s390 currently).

In the seccomp_prepare_filter() stage, we're already past that, so
it will not cause an issue, but we certainly would need to be very
careful in future, if bpf_prog_was_classic() is then used at a later
stage when we already have a generated bpf_prog somewhere, as then
this assumption will break.

> 	return sfilter;
> }
>@@ -470,7 +472,7 @@ void get_seccomp_filter(struct task_struct *tsk)
> static inline void seccomp_filter_free(struct seccomp_filter *filter)
> {
> 	if (filter) {
>-		bpf_prog_free(filter->prog);
>+		bpf_prog_put(filter->prog);
> 		kfree(filter);
> 	}
> }
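The switch from bpf_prog_free() to bpf_prog_put() in the patch quoted above follows from the commit message: once a filter's program can also be reached through an fd, the filter no longer owns it exclusively. A minimal userspace sketch of that ownership rule, assuming a plain (non-atomic) counter and hypothetical `model_*` names that are not kernel APIs:

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model only; the kernel uses atomic counters, this does not. */
struct model_prog {
	int refcnt;		/* stands in for prog->aux->refcnt */
};

struct model_filter {
	struct model_prog *prog;
};

/* A new program starts with one reference, held by its creator. */
static inline struct model_prog *model_prog_alloc(void)
{
	struct model_prog *p = malloc(sizeof(*p));

	if (p)
		p->refcnt = 1;
	return p;
}

/* Taken by any additional holder, for example an fd. */
static inline void model_prog_get(struct model_prog *p)
{
	assert(p->refcnt > 0);
	p->refcnt++;
}

/* Drops one reference; returns 1 if this call freed the program. */
static inline int model_prog_put(struct model_prog *p)
{
	assert(p->refcnt > 0);
	if (--p->refcnt == 0) {
		free(p);
		return 1;
	}
	return 0;
}

/*
 * Freeing the filter only drops the filter's reference. The program
 * survives if someone else still holds it. Returns 1 if the program
 * was freed as well.
 */
static inline int model_filter_free(struct model_filter *f)
{
	int freed = 0;

	if (f) {
		freed = model_prog_put(f->prog);
		free(f);
	}
	return freed;
}
```

With an unconditional free in place of the put, the second holder in this model would be left with a dangling pointer as soon as the filter went away.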
Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
On Fri, Sep 11, 2015 at 06:03:59PM +0200, Daniel Borkmann wrote:
> On 09/11/2015 04:44 PM, Tycho Andersen wrote:
> >On Fri, Sep 11, 2015 at 03:02:36PM +0200, Daniel Borkmann wrote:
> >>On 09/11/2015 02:20 AM, Tycho Andersen wrote:
> >>>In the next patch, we're going to add a way to access the underlying
> >>>filters via bpf fds. This means that we need to ref-count both the
> >>>struct seccomp_filter objects and the struct bpf_prog objects separately,
> >>>in case a process dies but a filter is still referred to by another
> >>>process.
> >>>
> >>>Additionally, we mark classic converted seccomp filters as seccomp eBPF
> >>>programs, since they are a subset of what is supported in seccomp eBPF.
> >>>
> >>>Signed-off-by: Tycho Andersen
> >>>CC: Kees Cook
> >>>CC: Will Drewry
> >>>CC: Oleg Nesterov
> >>>CC: Andy Lutomirski
> >>>CC: Pavel Emelyanov
> >>>CC: Serge E. Hallyn
> >>>CC: Alexei Starovoitov
> >>>CC: Daniel Borkmann
> >>>---
> >>> kernel/seccomp.c | 4 +++-
> >>> 1 file changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>>diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> >>>index 245df6b..afaeddf 100644
> >>>--- a/kernel/seccomp.c
> >>>+++ b/kernel/seccomp.c
> >>>@@ -378,6 +378,8 @@ static struct seccomp_filter
> >>>*seccomp_prepare_filter(struct sock_fprog *fprog)
> >>> 	}
> >>>
> >>> 	atomic_set(&sfilter->usage, 1);
> >>>+	atomic_set(&sfilter->prog->aux->refcnt, 1);
> >>>+	sfilter->prog->type = BPF_PROG_TYPE_SECCOMP;
> >>
> >>So, if you do this, then this breaks the assumption of eBPF JITs
> >>that, currently, all classic converted BPF programs always have a
> >>prog->type of BPF_PROG_TYPE_UNSPEC (see: bpf_prog_was_classic()).
> >>
> >>Currently, JITs make use of this information to determine whether
> >>A and X mappings for such programs should or should not be cleared
> >>in the prologue (s390 currently).
> >>
> >>In the seccomp_prepare_filter() stage, we're already past that, so
> >>it will not cause an issue, but we certainly would need to be very
> >>careful in future, if bpf_prog_was_classic() is then used at a later
> >>stage when we already have a generated bpf_prog somewhere, as then
> >>this assumption will break.
> >
> >The only reason we need to do this is to allow BPF_DUMP_PROG to work,
> >since we were restricting it to only allow dumping of seccomp
> >programs, since those don't have maps. Instead, perhaps we could allow
> >dumping of BPF_PROG_TYPE_SECCOMP and BPF_PROG_TYPE_UNSPEC?
>
> There are possibilities that BPF_PROG_TYPE_UNSPEC is calling helpers
> already today, at least in networking case, not seccomp. So, since
> you want to export [classic -> eBPF] only for seccomp, put fds on them
> and dump these via bpf(2), you could allow that (with a big comment
> stating why it's safe), but mid-term we really need to sanitize all
> this stuff properly as this is needed for other types, too.

Sorry, just to be clear, you're suggesting that the patch is ok modulo
a comment describing the jit issues?

Tycho
Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
On 09/11/2015 07:33 PM, Tycho Andersen wrote:
>On Fri, Sep 11, 2015 at 06:03:59PM +0200, Daniel Borkmann wrote:
>>On 09/11/2015 04:44 PM, Tycho Andersen wrote:
>>>On Fri, Sep 11, 2015 at 03:02:36PM +0200, Daniel Borkmann wrote:
>>>>On 09/11/2015 02:20 AM, Tycho Andersen wrote:
>>>>>In the next patch, we're going to add a way to access the underlying
>>>>>filters via bpf fds. This means that we need to ref-count both the
>>>>>struct seccomp_filter objects and the struct bpf_prog objects separately,
>>>>>in case a process dies but a filter is still referred to by another
>>>>>process.
>>>>>
>>>>>Additionally, we mark classic converted seccomp filters as seccomp eBPF
>>>>>programs, since they are a subset of what is supported in seccomp eBPF.
>>>>>
>>>>>Signed-off-by: Tycho Andersen
>>>>>CC: Kees Cook
>>>>>CC: Will Drewry
>>>>>CC: Oleg Nesterov
>>>>>CC: Andy Lutomirski
>>>>>CC: Pavel Emelyanov
>>>>>CC: Serge E. Hallyn
>>>>>CC: Alexei Starovoitov
>>>>>CC: Daniel Borkmann
>>>>>---
>>>>> kernel/seccomp.c | 4 +++-
>>>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>
>>>>>diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>>>>>index 245df6b..afaeddf 100644
>>>>>--- a/kernel/seccomp.c
>>>>>+++ b/kernel/seccomp.c
>>>>>@@ -378,6 +378,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>>>>> 	}
>>>>>
>>>>> 	atomic_set(&sfilter->usage, 1);
>>>>>+	atomic_set(&sfilter->prog->aux->refcnt, 1);
>>>>>+	sfilter->prog->type = BPF_PROG_TYPE_SECCOMP;
>>>>
>>>>So, if you do this, then this breaks the assumption of eBPF JITs
>>>>that, currently, all classic converted BPF programs always have a
>>>>prog->type of BPF_PROG_TYPE_UNSPEC (see: bpf_prog_was_classic()).
>>>>
>>>>Currently, JITs make use of this information to determine whether
>>>>A and X mappings for such programs should or should not be cleared
>>>>in the prologue (s390 currently).
>>>>
>>>>In the seccomp_prepare_filter() stage, we're already past that, so
>>>>it will not cause an issue, but we certainly would need to be very
>>>>careful in future, if bpf_prog_was_classic() is then used at a later
>>>>stage when we already have a generated bpf_prog somewhere, as then
>>>>this assumption will break.
>>>
>>>The only reason we need to do this is to allow BPF_DUMP_PROG to work,
>>>since we were restricting it to only allow dumping of seccomp
>>>programs, since those don't have maps. Instead, perhaps we could allow
>>>dumping of BPF_PROG_TYPE_SECCOMP and BPF_PROG_TYPE_UNSPEC?
>>
>>There are possibilities that BPF_PROG_TYPE_UNSPEC is calling helpers
>>already today, at least in networking case, not seccomp. So, since
>>you want to export [classic -> eBPF] only for seccomp, put fds on them
>>and dump these via bpf(2), you could allow that (with a big comment
>>stating why it's safe), but mid-term we really need to sanitize all
>>this stuff properly as this is needed for other types, too.
>
>Sorry, just to be clear, you're suggesting that the patch is ok modulo
>a comment describing the jit issues?

I think due to the given insns restrictions on classic seccomp, this
could work for "most cases" (see below) for the time being until pointer
sanitation is resolved and that seccomp-only restriction from the dump
could be removed, BUT there's one more stone in the road which you still
need to take care of with this whole 'giving classic seccomp-BPF -> eBPF
transforms an fd, dumping and restoring that via bpf(2)' approach:

If you have JIT enabled on ARM32, and add a classic seccomp-BPF filter,
and dump that via your bpf(2) interface based on the current patches, what
you'll get is not eBPF opcodes but classic (!) BPF opcodes as ARM32 classic
JIT supports compilation of seccomp, since commit 24e737c1ebac ("ARM: net:
add JIT support for loads from struct seccomp_data.").

So in that case, bpf_prepare_filter() will not call into
bpf_migrate_filter() as there's simply no need for it, because the classic
code could already be JITed there. I guess other archs where JIT support
for eBPF is not yet within near sight might sooner or later support this
insn for their classic JITs, too ...
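The ARM32 case described above can be illustrated with a small userspace model in C. This is only a sketch under stated assumptions, not kernel code: the sketch_* names and the insn_format enum are hypothetical, and the single arch_has_classic_jit flag stands in for "JIT is enabled and the arch's classic JIT can compile seccomp filters". It shows the order of operations described for bpf_prepare_filter(): try the classic JIT first, and only migrate to eBPF when that did not produce JITed code.

```c
#include <assert.h>
#include <stdbool.h>

/* Which opcode set a program's instructions are stored in. */
enum insn_format { INSN_CLASSIC, INSN_EBPF };

/* Hypothetical stand-in for struct bpf_prog, reduced to two fields. */
struct sketch_prog {
	enum insn_format format;
	bool jited;
};

/* Models a classic JIT attempt: it only succeeds where the arch has one. */
static void sketch_jit_compile(struct sketch_prog *p, bool arch_has_classic_jit)
{
	if (arch_has_classic_jit)
		p->jited = true;
}

/* Models the classic -> eBPF conversion step. */
static void sketch_migrate(struct sketch_prog *p)
{
	p->format = INSN_EBPF;
}

/* Models the flow described above: migrate only if the classic JIT failed. */
static void sketch_prepare_filter(struct sketch_prog *p, bool arch_has_classic_jit)
{
	sketch_jit_compile(p, arch_has_classic_jit);
	if (!p->jited)
		sketch_migrate(p);
}

/* Format a dump would observe for a freshly attached classic filter. */
static enum insn_format sketch_dump_format(bool arch_has_classic_jit)
{
	struct sketch_prog p = { INSN_CLASSIC, false };

	sketch_prepare_filter(&p, arch_has_classic_jit);
	return p.format;
}
```

In this model, sketch_dump_format(true) leaves the instructions in classic form, which is the situation a dump interface would have to detect and convert if its output is meant to always be eBPF.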
Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
On 09/11/2015 04:44 PM, Tycho Andersen wrote:
>On Fri, Sep 11, 2015 at 03:02:36PM +0200, Daniel Borkmann wrote:
>>On 09/11/2015 02:20 AM, Tycho Andersen wrote:
>>>In the next patch, we're going to add a way to access the underlying
>>>filters via bpf fds. This means that we need to ref-count both the
>>>struct seccomp_filter objects and the struct bpf_prog objects separately,
>>>in case a process dies but a filter is still referred to by another
>>>process.
>>>
>>>Additionally, we mark classic converted seccomp filters as seccomp eBPF
>>>programs, since they are a subset of what is supported in seccomp eBPF.
>>>
>>>Signed-off-by: Tycho Andersen
>>>CC: Kees Cook
>>>CC: Will Drewry
>>>CC: Oleg Nesterov
>>>CC: Andy Lutomirski
>>>CC: Pavel Emelyanov
>>>CC: Serge E. Hallyn
>>>CC: Alexei Starovoitov
>>>CC: Daniel Borkmann
>>>---
>>> kernel/seccomp.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>>diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>>>index 245df6b..afaeddf 100644
>>>--- a/kernel/seccomp.c
>>>+++ b/kernel/seccomp.c
>>>@@ -378,6 +378,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
>>> 	}
>>>
>>> 	atomic_set(&sfilter->usage, 1);
>>>+	atomic_set(&sfilter->prog->aux->refcnt, 1);
>>>+	sfilter->prog->type = BPF_PROG_TYPE_SECCOMP;
>>
>>So, if you do this, then this breaks the assumption of eBPF JITs
>>that, currently, all classic converted BPF programs always have a
>>prog->type of BPF_PROG_TYPE_UNSPEC (see: bpf_prog_was_classic()).
>>
>>Currently, JITs make use of this information to determine whether
>>A and X mappings for such programs should or should not be cleared
>>in the prologue (s390 currently).
>>
>>In the seccomp_prepare_filter() stage, we're already past that, so
>>it will not cause an issue, but we certainly would need to be very
>>careful in future, if bpf_prog_was_classic() is then used at a later
>>stage when we already have a generated bpf_prog somewhere, as then
>>this assumption will break.
>
>The only reason we need to do this is to allow BPF_DUMP_PROG to work,
>since we were restricting it to only allow dumping of seccomp
>programs, since those don't have maps. Instead, perhaps we could allow
>dumping of BPF_PROG_TYPE_SECCOMP and BPF_PROG_TYPE_UNSPEC?

There are possibilities that BPF_PROG_TYPE_UNSPEC is calling helpers
already today, at least in networking case, not seccomp. So, since
you want to export [classic -> eBPF] only for seccomp, put fds on them
and dump these via bpf(2), you could allow that (with a big comment
stating why it's safe), but mid-term we really need to sanitize all
this stuff properly as this is needed for other types, too.
Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
On 09/11/2015 02:20 AM, Tycho Andersen wrote:
>In the next patch, we're going to add a way to access the underlying
>filters via bpf fds. This means that we need to ref-count both the
>struct seccomp_filter objects and the struct bpf_prog objects separately,
>in case a process dies but a filter is still referred to by another
>process.
>
>Additionally, we mark classic converted seccomp filters as seccomp eBPF
>programs, since they are a subset of what is supported in seccomp eBPF.
>
>Signed-off-by: Tycho Andersen
>CC: Kees Cook
>CC: Will Drewry
>CC: Oleg Nesterov
>CC: Andy Lutomirski
>CC: Pavel Emelyanov
>CC: Serge E. Hallyn
>CC: Alexei Starovoitov
>CC: Daniel Borkmann
>---
> kernel/seccomp.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
>diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>index 245df6b..afaeddf 100644
>--- a/kernel/seccomp.c
>+++ b/kernel/seccomp.c
>@@ -378,6 +378,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
> 	}
>
> 	atomic_set(&sfilter->usage, 1);
>+	atomic_set(&sfilter->prog->aux->refcnt, 1);
>+	sfilter->prog->type = BPF_PROG_TYPE_SECCOMP;

So, if you do this, then this breaks the assumption of eBPF JITs
that, currently, all classic converted BPF programs always have a
prog->type of BPF_PROG_TYPE_UNSPEC (see: bpf_prog_was_classic()).

Currently, JITs make use of this information to determine whether
A and X mappings for such programs should or should not be cleared
in the prologue (s390 currently).

In the seccomp_prepare_filter() stage, we're already past that, so
it will not cause an issue, but we certainly would need to be very
careful in future, if bpf_prog_was_classic() is then used at a later
stage when we already have a generated bpf_prog somewhere, as then
this assumption will break.

>
> 	return sfilter;
> }
>@@ -470,7 +472,7 @@ void get_seccomp_filter(struct task_struct *tsk)
> static inline void seccomp_filter_free(struct seccomp_filter *filter)
> {
> 	if (filter) {
>-		bpf_prog_free(filter->prog);
>+		bpf_prog_put(filter->prog);
> 		kfree(filter);
> 	}
> }
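The concern about bpf_prog_was_classic() can be made concrete with a reduced userspace model in C. It assumes, as the review comment states, that the only hint for "this program was converted from classic BPF" is a prog->type of BPF_PROG_TYPE_UNSPEC. The TYPED_* enum values and typed_* functions below are hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced stand-in for the kernel's enum bpf_prog_type. */
enum typed_prog_type {
	TYPED_PROG_UNSPEC,   /* what classic-converted programs carry */
	TYPED_PROG_SECCOMP,  /* the type this patch assigns after conversion */
};

/* Hypothetical stand-in for struct bpf_prog, reduced to its type field. */
struct typed_prog {
	enum typed_prog_type type;
};

/* Models the predicate as described: the type is the only available hint. */
static bool typed_prog_was_classic(const struct typed_prog *prog)
{
	return prog->type == TYPED_PROG_UNSPEC;
}

/* Models the retagging done at the end of seccomp_prepare_filter(). */
static void typed_prog_mark_seccomp(struct typed_prog *prog)
{
	prog->type = TYPED_PROG_SECCOMP;
}
```

In this model, any caller that evaluates typed_prog_was_classic() after typed_prog_mark_seccomp() gets false for a program that did start out as classic BPF, which is the later-stage breakage the review warns about.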
Re: [PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
On Fri, Sep 11, 2015 at 03:02:36PM +0200, Daniel Borkmann wrote:
> On 09/11/2015 02:20 AM, Tycho Andersen wrote:
> >In the next patch, we're going to add a way to access the underlying
> >filters via bpf fds. This means that we need to ref-count both the
> >struct seccomp_filter objects and the struct bpf_prog objects separately,
> >in case a process dies but a filter is still referred to by another
> >process.
> >
> >Additionally, we mark classic converted seccomp filters as seccomp eBPF
> >programs, since they are a subset of what is supported in seccomp eBPF.
> >
> >Signed-off-by: Tycho Andersen
> >CC: Kees Cook
> >CC: Will Drewry
> >CC: Oleg Nesterov
> >CC: Andy Lutomirski
> >CC: Pavel Emelyanov
> >CC: Serge E. Hallyn
> >CC: Alexei Starovoitov
> >CC: Daniel Borkmann
> >---
> > kernel/seccomp.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> >diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> >index 245df6b..afaeddf 100644
> >--- a/kernel/seccomp.c
> >+++ b/kernel/seccomp.c
> >@@ -378,6 +378,8 @@ static struct seccomp_filter
> >*seccomp_prepare_filter(struct sock_fprog *fprog)
> > 	}
> >
> > 	atomic_set(&sfilter->usage, 1);
> >+	atomic_set(&sfilter->prog->aux->refcnt, 1);
> >+	sfilter->prog->type = BPF_PROG_TYPE_SECCOMP;
>
> So, if you do this, then this breaks the assumption of eBPF JITs
> that, currently, all classic converted BPF programs always have a
> prog->type of BPF_PROG_TYPE_UNSPEC (see: bpf_prog_was_classic()).
>
> Currently, JITs make use of this information to determine whether
> A and X mappings for such programs should or should not be cleared
> in the prologue (s390 currently).
>
> In the seccomp_prepare_filter() stage, we're already past that, so
> it will not cause an issue, but we certainly would need to be very
> careful in future, if bpf_prog_was_classic() is then used at a later
> stage when we already have a generated bpf_prog somewhere, as then
> this assumption will break.

The only reason we need to do this is to allow BPF_DUMP_PROG to work,
since we were restricting it to only allow dumping of seccomp
programs, since those don't have maps. Instead, perhaps we could allow
dumping of BPF_PROG_TYPE_SECCOMP and BPF_PROG_TYPE_UNSPEC?

Tycho
[PATCH v2 2/5] seccomp: make underlying bpf ref counted as well
In the next patch, we're going to add a way to access the underlying
filters via bpf fds. This means that we need to ref-count both the
struct seccomp_filter objects and the struct bpf_prog objects separately,
in case a process dies but a filter is still referred to by another
process.

Additionally, we mark classic converted seccomp filters as seccomp eBPF
programs, since they are a subset of what is supported in seccomp eBPF.

Signed-off-by: Tycho Andersen
CC: Kees Cook
CC: Will Drewry
CC: Oleg Nesterov
CC: Andy Lutomirski
CC: Pavel Emelyanov
CC: Serge E. Hallyn
CC: Alexei Starovoitov
CC: Daniel Borkmann
---
 kernel/seccomp.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 245df6b..afaeddf 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -378,6 +378,8 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
 	}

 	atomic_set(&sfilter->usage, 1);
+	atomic_set(&sfilter->prog->aux->refcnt, 1);
+	sfilter->prog->type = BPF_PROG_TYPE_SECCOMP;

 	return sfilter;
 }
@@ -470,7 +472,7 @@ void get_seccomp_filter(struct task_struct *tsk)
 static inline void seccomp_filter_free(struct seccomp_filter *filter)
 {
 	if (filter) {
-		bpf_prog_free(filter->prog);
+		bpf_prog_put(filter->prog);
 		kfree(filter);
 	}
 }
--
2.1.4
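As an illustration of why the patch swaps a direct free for a put, here is a minimal userspace sketch in C of the two separate reference counts. It is not the kernel implementation: the model_* types and functions are hypothetical, C11 atomics stand in for the kernel's atomic_t, and the "fd held by another process" is modeled as one extra reference on the program.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bpf_prog. */
struct model_prog {
	atomic_int refcnt;        /* plays the role of prog->aux->refcnt */
};

/* Hypothetical stand-in for struct seccomp_filter. */
struct model_filter {
	atomic_int usage;         /* plays the role of sfilter->usage */
	struct model_prog *prog;
};

static int progs_freed;       /* lets the demo observe when a program dies */

/* Both objects start with one reference, as in the patch. */
static struct model_filter *model_prepare_filter(void)
{
	struct model_filter *f = malloc(sizeof(*f));
	struct model_prog *p = malloc(sizeof(*p));

	if (!f || !p) {
		free(f);
		free(p);
		return NULL;
	}
	atomic_init(&p->refcnt, 1);
	atomic_init(&f->usage, 1);
	f->prog = p;
	return f;
}

/* Extra program reference, e.g. for an fd handed to another process. */
static struct model_prog *model_prog_get(struct model_prog *p)
{
	atomic_fetch_add(&p->refcnt, 1);
	return p;
}

/* Drop one program reference; free only when it was the last one. */
static void model_prog_put(struct model_prog *p)
{
	if (atomic_fetch_sub(&p->refcnt, 1) == 1) {
		free(p);
		progs_freed++;
	}
}

/* Drop one filter reference; the dying filter puts its program. */
static void model_filter_put(struct model_filter *f)
{
	if (atomic_fetch_sub(&f->usage, 1) == 1) {
		model_prog_put(f->prog);
		free(f);
	}
}

/* Returns how many programs had been freed while the extra ref was held. */
static int model_demo(void)
{
	struct model_filter *f = model_prepare_filter();
	struct model_prog *held;
	int freed_while_held;

	assert(f != NULL);
	held = model_prog_get(f->prog);  /* another process keeps a reference */
	model_filter_put(f);             /* the filtered process goes away */
	freed_while_held = progs_freed;
	model_prog_put(held);            /* last reference dropped */
	return freed_while_held;
}
```

With a direct free in model_filter_put(), the reference in held would dangle as soon as the filter died; with the put, the program outlives the filter until the last holder drops it.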