Re: [net-next RFC v2 4/9] bpf, security: Add Checmate security LSM and BPF program type
On Mon, Aug 29, 2016 at 02:49:17PM -0700, Alexei Starovoitov wrote: > On 8/29/16 12:24 PM, Tejun Heo wrote: > >Hello, Sargun. > > > >On Mon, Aug 29, 2016 at 11:49:07AM -0700, Sargun Dhillon wrote: > >>It would be a separate hook per LSM hook. Why wouldn't we want a separate > >>bpf > >>hook per lsm hook? I think if one program has to handle them all, the first > >>program would be looking up the hook program in a bpf prog array. If you > >>think > >>it's better to have this logic in the BPF program, that makes sense. > >> > >>I had a version of this patch that allowed you to attach a prog array > >>instead, > >>but I think that it's cleaner attaching a program per lsm hook. In addition, > >>there's a performance impact that comes from these hooks, so I wouldn't > >>want to > >>execute unneccessary code if it's avoidable. > > > >Hmm... it doesn't really matter how the backend part looks like and if > >we need to implement per-call hooks to lower runtime overhead, sure. > >I was mostly worried about the approach propagating through the > >userland visible interface. > > > >>The prog array approach also makes stacking filters difficult. If people > >>want > >>multiple filters per hook, the orchestrator would have to rewrite the > >>existing > >>filters to be cooperative. > > > >I'm not really sure "stacking" in the kernel side is a good idea. > >Please see below. > > > >>>I'm not convinced about the approach. It's an approach which pretty > >>>much requires future extensions while being rigid. Not a good > >>>combination. > >> > >>Do you have an alternative recommendation? Maybe just a set of 5 u64s > >>as the context object along with the hook ID? > > > >cgroup fs doesn't seem like the right interface for this but if it > >were I'd go for named hook IDs instead of opaque numbers. > > > >>>Unless this is properly delegatable, IOW, it's safe to fully delegate > >>>to a lesser security domain for all operations including program > >>>loading and assignment (I can't see how that'd be the case), making it > >>>an explicit controller doens't work in terms of userland interface. > >>>It's fine for bpf / lsm / whatever to attach to cgroups by extending > >>>struct cgroup itself or implementing an implicit controller but to be > >>>visible as an explicit controller it must be able to follow cgroup > >>>interface rules including delegation. If not, it's best to come > >>>through the interface which enforces the required permission checks > >>>and then talk to cgroup from there. This was also an issue with > >>>network cgroup bpf programs that Daniel Mack is working on. Please > >>>chat with him. > >> > >>Program assignment is possible by lesser security domains. Program loading > >>is > >>limited to CAP_SYS_ADMIN in init_user_ns. We could make it accessible to > >>CAP_SYS_ADMIN in any userns, but it the reasoning behind this is that > >>Checmate > >>BPF programs can leak kernel pointers. > > > >That doesn't make much sense to me. Delegation doesn't mean much if a > >delegatee can't load its own program (and I don't see how one can > >delegate kernel pointer access to !root). Also, unless there's > >per-program fine control on who can load it, it seems pretty dangerous > >to let anyone load any program. > > > >>Could we potentially restrict it to only CAP_MAC_OVERRIDE, while still > >>meeting > >>cgroup delegation requirements? > > > >Wouldn't it make far more sense to pass cgroup fd to bpf syscall so > >that "load this program" and "attach this program to the cgroup > >identified by this fd" through the same interface and permission > >checks? cgroup participating in bpf operations is all fine but > >splitting the userland interface across two domains seems like a bad > >idea. > > > >>Filters which are higher up in the heirarchy will still be enforced during > >>delegation. This was an explicit design, as the "Orchestrator in > >>Orchestrator" > >>use case needs to be supported. > > > >Given that program loading is restricted to root, wouldn't it be an a > >lot more efficient approach to let userland multiplex multiple > >programs? Walking the tree executing bpf programs each time one of > >these operations runs can be pretty expensive. Imagine a tree like > >the following. > > > > A - B - C > > \ D > > > >Let's say program is currently loaded on D. If someone wants to add a > >program on B, userland can load the program on B, combine B's and D's > >program and compile them into a single program and load it on D. The > >only thing kernel would need to do in terms of hierarchy is finding > >what's the closest program to execute. In the above example, C would > >need to use B's program and that can be determined on program > >assignment time rather than on each operation. > > I think that's exactly what Daniel's patches are doing and imo > it makes sense to keep this style for lsm as well > and also apply the concept of hook_id. > Daniel adds two commands to bpf syscall to
Re: [net-next RFC v2 4/9] bpf, security: Add Checmate security LSM and BPF program type
On Aug 29, 2016 3:19 PM, "Mickaël Salaün"wrote: > > > On 29/08/2016 23:49, Alexei Starovoitov wrote: > > On 8/29/16 12:24 PM, Tejun Heo wrote: > >> Hello, Sargun. > >> > >> On Mon, Aug 29, 2016 at 11:49:07AM -0700, Sargun Dhillon wrote: > >>> It would be a separate hook per LSM hook. Why wouldn't we want a > >>> separate bpf > >>> hook per lsm hook? I think if one program has to handle them all, the > >>> first > >>> program would be looking up the hook program in a bpf prog array. If > >>> you think > >>> it's better to have this logic in the BPF program, that makes sense. > >>> > >>> I had a version of this patch that allowed you to attach a prog array > >>> instead, > >>> but I think that it's cleaner attaching a program per lsm hook. In > >>> addition, > >>> there's a performance impact that comes from these hooks, so I > >>> wouldn't want to > >>> execute unneccessary code if it's avoidable. > >> > >> Hmm... it doesn't really matter how the backend part looks like and if > >> we need to implement per-call hooks to lower runtime overhead, sure. > >> I was mostly worried about the approach propagating through the > >> userland visible interface. > >> > >>> The prog array approach also makes stacking filters difficult. If > >>> people want > >>> multiple filters per hook, the orchestrator would have to rewrite the > >>> existing > >>> filters to be cooperative. > >> > >> I'm not really sure "stacking" in the kernel side is a good idea. > >> Please see below. > >> > I'm not convinced about the approach. It's an approach which pretty > much requires future extensions while being rigid. Not a good > combination. > >>> > >>> Do you have an alternative recommendation? Maybe just a set of 5 u64s > >>> as the context object along with the hook ID? > >> > >> cgroup fs doesn't seem like the right interface for this but if it > >> were I'd go for named hook IDs instead of opaque numbers. > >> > Unless this is properly delegatable, IOW, it's safe to fully delegate > to a lesser security domain for all operations including program > loading and assignment (I can't see how that'd be the case), making it > an explicit controller doens't work in terms of userland interface. > It's fine for bpf / lsm / whatever to attach to cgroups by extending > struct cgroup itself or implementing an implicit controller but to be > visible as an explicit controller it must be able to follow cgroup > interface rules including delegation. If not, it's best to come > through the interface which enforces the required permission checks > and then talk to cgroup from there. This was also an issue with > network cgroup bpf programs that Daniel Mack is working on. Please > chat with him. > >>> > >>> Program assignment is possible by lesser security domains. Program > >>> loading is > >>> limited to CAP_SYS_ADMIN in init_user_ns. We could make it accessible to > >>> CAP_SYS_ADMIN in any userns, but it the reasoning behind this is that > >>> Checmate > >>> BPF programs can leak kernel pointers. > >> > >> That doesn't make much sense to me. Delegation doesn't mean much if a > >> delegatee can't load its own program (and I don't see how one can > >> delegate kernel pointer access to !root). Also, unless there's > >> per-program fine control on who can load it, it seems pretty dangerous > >> to let anyone load any program. > >> > >>> Could we potentially restrict it to only CAP_MAC_OVERRIDE, while > >>> still meeting > >>> cgroup delegation requirements? > >> > >> Wouldn't it make far more sense to pass cgroup fd to bpf syscall so > >> that "load this program" and "attach this program to the cgroup > >> identified by this fd" through the same interface and permission > >> checks? cgroup participating in bpf operations is all fine but > >> splitting the userland interface across two domains seems like a bad > >> idea. > >> > >>> Filters which are higher up in the heirarchy will still be enforced > >>> during > >>> delegation. This was an explicit design, as the "Orchestrator in > >>> Orchestrator" > >>> use case needs to be supported. > >> > >> Given that program loading is restricted to root, wouldn't it be an a > >> lot more efficient approach to let userland multiplex multiple > >> programs? Walking the tree executing bpf programs each time one of > >> these operations runs can be pretty expensive. Imagine a tree like > >> the following. > >> > >> A - B - C > >> \ D > >> > >> Let's say program is currently loaded on D. If someone wants to add a > >> program on B, userland can load the program on B, combine B's and D's > >> program and compile them into a single program and load it on D. The > >> only thing kernel would need to do in terms of hierarchy is finding > >> what's the closest program to execute. In the above example, C would > >> need to use B's program and that can be determined on program > >> assignment time
Re: [net-next RFC v2 4/9] bpf, security: Add Checmate security LSM and BPF program type
On 29/08/2016 23:49, Alexei Starovoitov wrote: > On 8/29/16 12:24 PM, Tejun Heo wrote: >> Hello, Sargun. >> >> On Mon, Aug 29, 2016 at 11:49:07AM -0700, Sargun Dhillon wrote: >>> It would be a separate hook per LSM hook. Why wouldn't we want a >>> separate bpf >>> hook per lsm hook? I think if one program has to handle them all, the >>> first >>> program would be looking up the hook program in a bpf prog array. If >>> you think >>> it's better to have this logic in the BPF program, that makes sense. >>> >>> I had a version of this patch that allowed you to attach a prog array >>> instead, >>> but I think that it's cleaner attaching a program per lsm hook. In >>> addition, >>> there's a performance impact that comes from these hooks, so I >>> wouldn't want to >>> execute unneccessary code if it's avoidable. >> >> Hmm... it doesn't really matter how the backend part looks like and if >> we need to implement per-call hooks to lower runtime overhead, sure. >> I was mostly worried about the approach propagating through the >> userland visible interface. >> >>> The prog array approach also makes stacking filters difficult. If >>> people want >>> multiple filters per hook, the orchestrator would have to rewrite the >>> existing >>> filters to be cooperative. >> >> I'm not really sure "stacking" in the kernel side is a good idea. >> Please see below. >> I'm not convinced about the approach. It's an approach which pretty much requires future extensions while being rigid. Not a good combination. >>> >>> Do you have an alternative recommendation? Maybe just a set of 5 u64s >>> as the context object along with the hook ID? >> >> cgroup fs doesn't seem like the right interface for this but if it >> were I'd go for named hook IDs instead of opaque numbers. >> Unless this is properly delegatable, IOW, it's safe to fully delegate to a lesser security domain for all operations including program loading and assignment (I can't see how that'd be the case), making it an explicit controller doens't work in terms of userland interface. It's fine for bpf / lsm / whatever to attach to cgroups by extending struct cgroup itself or implementing an implicit controller but to be visible as an explicit controller it must be able to follow cgroup interface rules including delegation. If not, it's best to come through the interface which enforces the required permission checks and then talk to cgroup from there. This was also an issue with network cgroup bpf programs that Daniel Mack is working on. Please chat with him. >>> >>> Program assignment is possible by lesser security domains. Program >>> loading is >>> limited to CAP_SYS_ADMIN in init_user_ns. We could make it accessible to >>> CAP_SYS_ADMIN in any userns, but it the reasoning behind this is that >>> Checmate >>> BPF programs can leak kernel pointers. >> >> That doesn't make much sense to me. Delegation doesn't mean much if a >> delegatee can't load its own program (and I don't see how one can >> delegate kernel pointer access to !root). Also, unless there's >> per-program fine control on who can load it, it seems pretty dangerous >> to let anyone load any program. >> >>> Could we potentially restrict it to only CAP_MAC_OVERRIDE, while >>> still meeting >>> cgroup delegation requirements? >> >> Wouldn't it make far more sense to pass cgroup fd to bpf syscall so >> that "load this program" and "attach this program to the cgroup >> identified by this fd" through the same interface and permission >> checks? cgroup participating in bpf operations is all fine but >> splitting the userland interface across two domains seems like a bad >> idea. >> >>> Filters which are higher up in the heirarchy will still be enforced >>> during >>> delegation. This was an explicit design, as the "Orchestrator in >>> Orchestrator" >>> use case needs to be supported. >> >> Given that program loading is restricted to root, wouldn't it be an a >> lot more efficient approach to let userland multiplex multiple >> programs? Walking the tree executing bpf programs each time one of >> these operations runs can be pretty expensive. Imagine a tree like >> the following. >> >> A - B - C >> \ D >> >> Let's say program is currently loaded on D. If someone wants to add a >> program on B, userland can load the program on B, combine B's and D's >> program and compile them into a single program and load it on D. The >> only thing kernel would need to do in terms of hierarchy is finding >> what's the closest program to execute. In the above example, C would >> need to use B's program and that can be determined on program >> assignment time rather than on each operation. > > I think that's exactly what Daniel's patches are doing and imo > it makes sense to keep this style for lsm as well > and also apply the concept of hook_id. > Daniel adds two commands to bpf syscall to attach/detach from cgroup > with hook_id. >
Re: [net-next RFC v2 4/9] bpf, security: Add Checmate security LSM and BPF program type
On 8/29/16 12:24 PM, Tejun Heo wrote: Hello, Sargun. On Mon, Aug 29, 2016 at 11:49:07AM -0700, Sargun Dhillon wrote: It would be a separate hook per LSM hook. Why wouldn't we want a separate bpf hook per lsm hook? I think if one program has to handle them all, the first program would be looking up the hook program in a bpf prog array. If you think it's better to have this logic in the BPF program, that makes sense. I had a version of this patch that allowed you to attach a prog array instead, but I think that it's cleaner attaching a program per lsm hook. In addition, there's a performance impact that comes from these hooks, so I wouldn't want to execute unneccessary code if it's avoidable. Hmm... it doesn't really matter how the backend part looks like and if we need to implement per-call hooks to lower runtime overhead, sure. I was mostly worried about the approach propagating through the userland visible interface. The prog array approach also makes stacking filters difficult. If people want multiple filters per hook, the orchestrator would have to rewrite the existing filters to be cooperative. I'm not really sure "stacking" in the kernel side is a good idea. Please see below. I'm not convinced about the approach. It's an approach which pretty much requires future extensions while being rigid. Not a good combination. Do you have an alternative recommendation? Maybe just a set of 5 u64s as the context object along with the hook ID? cgroup fs doesn't seem like the right interface for this but if it were I'd go for named hook IDs instead of opaque numbers. Unless this is properly delegatable, IOW, it's safe to fully delegate to a lesser security domain for all operations including program loading and assignment (I can't see how that'd be the case), making it an explicit controller doens't work in terms of userland interface. It's fine for bpf / lsm / whatever to attach to cgroups by extending struct cgroup itself or implementing an implicit controller but to be visible as an explicit controller it must be able to follow cgroup interface rules including delegation. If not, it's best to come through the interface which enforces the required permission checks and then talk to cgroup from there. This was also an issue with network cgroup bpf programs that Daniel Mack is working on. Please chat with him. Program assignment is possible by lesser security domains. Program loading is limited to CAP_SYS_ADMIN in init_user_ns. We could make it accessible to CAP_SYS_ADMIN in any userns, but it the reasoning behind this is that Checmate BPF programs can leak kernel pointers. That doesn't make much sense to me. Delegation doesn't mean much if a delegatee can't load its own program (and I don't see how one can delegate kernel pointer access to !root). Also, unless there's per-program fine control on who can load it, it seems pretty dangerous to let anyone load any program. Could we potentially restrict it to only CAP_MAC_OVERRIDE, while still meeting cgroup delegation requirements? Wouldn't it make far more sense to pass cgroup fd to bpf syscall so that "load this program" and "attach this program to the cgroup identified by this fd" through the same interface and permission checks? cgroup participating in bpf operations is all fine but splitting the userland interface across two domains seems like a bad idea. Filters which are higher up in the heirarchy will still be enforced during delegation. This was an explicit design, as the "Orchestrator in Orchestrator" use case needs to be supported. Given that program loading is restricted to root, wouldn't it be an a lot more efficient approach to let userland multiplex multiple programs? Walking the tree executing bpf programs each time one of these operations runs can be pretty expensive. Imagine a tree like the following. A - B - C \ D Let's say program is currently loaded on D. If someone wants to add a program on B, userland can load the program on B, combine B's and D's program and compile them into a single program and load it on D. The only thing kernel would need to do in terms of hierarchy is finding what's the closest program to execute. In the above example, C would need to use B's program and that can be determined on program assignment time rather than on each operation. I think that's exactly what Daniel's patches are doing and imo it makes sense to keep this style for lsm as well and also apply the concept of hook_id. Daniel adds two commands to bpf syscall to attach/detach from cgroup with hook_id. Initially two hooks will be for socket rx and tx. Then all interesting lsm hooks can be added one by one. Daniel's prog type will be bpf_prog_type_cgroup_socket_filter. LSM's prog type will be bpf_prog_type_lsm. And verifier can check type safety since the lsm hook_id will be passed at the program load time. See another thread we had with Mickael. landlock and checmate are very similar and
Re: [net-next RFC v2 4/9] bpf, security: Add Checmate security LSM and BPF program type
Hello, Sargun. On Mon, Aug 29, 2016 at 11:49:07AM -0700, Sargun Dhillon wrote: > It would be a separate hook per LSM hook. Why wouldn't we want a separate bpf > hook per lsm hook? I think if one program has to handle them all, the first > program would be looking up the hook program in a bpf prog array. If you > think > it's better to have this logic in the BPF program, that makes sense. > > I had a version of this patch that allowed you to attach a prog array > instead, > but I think that it's cleaner attaching a program per lsm hook. In addition, > there's a performance impact that comes from these hooks, so I wouldn't want > to > execute unneccessary code if it's avoidable. Hmm... it doesn't really matter how the backend part looks like and if we need to implement per-call hooks to lower runtime overhead, sure. I was mostly worried about the approach propagating through the userland visible interface. > The prog array approach also makes stacking filters difficult. If people want > multiple filters per hook, the orchestrator would have to rewrite the > existing > filters to be cooperative. I'm not really sure "stacking" in the kernel side is a good idea. Please see below. > > I'm not convinced about the approach. It's an approach which pretty > > much requires future extensions while being rigid. Not a good > > combination. > > Do you have an alternative recommendation? Maybe just a set of 5 u64s > as the context object along with the hook ID? cgroup fs doesn't seem like the right interface for this but if it were I'd go for named hook IDs instead of opaque numbers. > > Unless this is properly delegatable, IOW, it's safe to fully delegate > > to a lesser security domain for all operations including program > > loading and assignment (I can't see how that'd be the case), making it > > an explicit controller doens't work in terms of userland interface. > > It's fine for bpf / lsm / whatever to attach to cgroups by extending > > struct cgroup itself or implementing an implicit controller but to be > > visible as an explicit controller it must be able to follow cgroup > > interface rules including delegation. If not, it's best to come > > through the interface which enforces the required permission checks > > and then talk to cgroup from there. This was also an issue with > > network cgroup bpf programs that Daniel Mack is working on. Please > > chat with him. > > Program assignment is possible by lesser security domains. Program loading is > limited to CAP_SYS_ADMIN in init_user_ns. We could make it accessible to > CAP_SYS_ADMIN in any userns, but it the reasoning behind this is that Checmate > BPF programs can leak kernel pointers. That doesn't make much sense to me. Delegation doesn't mean much if a delegatee can't load its own program (and I don't see how one can delegate kernel pointer access to !root). Also, unless there's per-program fine control on who can load it, it seems pretty dangerous to let anyone load any program. > Could we potentially restrict it to only CAP_MAC_OVERRIDE, while still > meeting > cgroup delegation requirements? Wouldn't it make far more sense to pass cgroup fd to bpf syscall so that "load this program" and "attach this program to the cgroup identified by this fd" through the same interface and permission checks? cgroup participating in bpf operations is all fine but splitting the userland interface across two domains seems like a bad idea. > Filters which are higher up in the heirarchy will still be enforced during > delegation. This was an explicit design, as the "Orchestrator in > Orchestrator" > use case needs to be supported. Given that program loading is restricted to root, wouldn't it be an a lot more efficient approach to let userland multiplex multiple programs? Walking the tree executing bpf programs each time one of these operations runs can be pretty expensive. Imagine a tree like the following. A - B - C \ D Let's say program is currently loaded on D. If someone wants to add a program on B, userland can load the program on B, combine B's and D's program and compile them into a single program and load it on D. The only thing kernel would need to do in terms of hierarchy is finding what's the closest program to execute. In the above example, C would need to use B's program and that can be determined on program assignment time rather than on each operation. Thanks. -- tejun
Re: [net-next RFC v2 4/9] bpf, security: Add Checmate security LSM and BPF program type
On Mon, Aug 29, 2016 at 01:01:18PM -0400, Tejun Heo wrote: > Hello, > > On Mon, Aug 29, 2016 at 04:47:07AM -0700, Sargun Dhillon wrote: > > This patch adds a minor LSM, Checmate. Checmate is a flexible programmable, > > extensible minor LSM that's coupled with cgroups and BPF. It is designed to > > enforce container-specific policies. It is also a cgroupv2 controller. By > > itself, it doesn't do very much, but in conjunction with a orchestrator > > complex policies can be installed on the cgroup hierarchy. > > > > These cgroup programs are tied to the kernel ABI version. If one tries > > to load a BPF program compiled against a different kernel version, > > an error will be thrown. > > First of all, please talk with people working on network cgroup bpf > and landlock. I don't think it's a good idea to have N different ways > to implement cgroup-aware bpf mechanism. There can be multiple > consumers but there gotta be a common mechanism instead of several > independent controllers. > I've talked to Daniel Mack, and Alexei. I agree with you that it makes sense not to have an infinite number of these cgroup + bpf + lsm subsystems in the kernel. I think that making sure we don't sacrifice capability is important. > > diff --git a/include/linux/checmate.h b/include/linux/checmate.h > > new file mode 100644 > > index 000..4c4db4a > > --- /dev/null > > +++ b/include/linux/checmate.h > > @@ -0,0 +1,108 @@ > > +#ifndef _LINUX_CHECMATE_H_ > > +#define _LINUX_CHECMATE_H_ 1 > > +#include > > + > > +enum checmate_hook_num { > > + /* CONFIG_SECURITY_NET hooks */ > > + CHECMATE_HOOK_UNIX_STREAM_CONNECT, > > + CHECMATE_HOOK_UNIX_MAY_SEND, > > + CHECMATE_HOOK_SOCKET_CREATE, > > + CHECMATE_HOOK_SOCKET_POST_CREATE, > > + CHECMATE_HOOK_SOCKET_BIND, > > + CHECMATE_HOOK_SOCKET_CONNECT, > > + CHECMATE_HOOK_SOCKET_LISTEN, > > + CHECMATE_HOOK_SOCKET_ACCEPT, > > + CHECMATE_HOOK_SOCKET_SENDMSG, > > + CHECMATE_HOOK_SOCKET_RECVMSG, > > + CHECMATE_HOOK_SOCKET_GETSOCKNAME, > > + CHECMATE_HOOK_SOCKET_GETPEERNAME, > > + CHECMATE_HOOK_SOCKET_GETSOCKOPT, > > + CHECMATE_HOOK_SOCKET_SETSOCKOPT, > > + CHECMATE_HOOK_SOCKET_SHUTDOWN, > > + CHECMATE_HOOK_SOCKET_SOCK_RCV_SKB, > > + CHECMATE_HOOK_SK_FREE_SECURITY, > > + __CHECMATE_HOOK_MAX, > > +}; > > Do we really want a separate hook for each call? A logical extension > of this would be having a separate hook per syscall which feels kinda > horrible. > It would be a separate hook per LSM hook. Why wouldn't we want a separate bpf hook per lsm hook? I think if one program has to handle them all, the first program would be looking up the hook program in a bpf prog array. If you think it's better to have this logic in the BPF program, that makes sense. I had a version of this patch that allowed you to attach a prog array instead, but I think that it's cleaner attaching a program per lsm hook. In addition, there's a performance impact that comes from these hooks, so I wouldn't want to execute unneccessary code if it's avoidable. The prog array approach also makes stacking filters difficult. If people want multiple filters per hook, the orchestrator would have to rewrite the existing filters to be cooperative. > > +/* CONFIG_SECURITY_NET contexts */ > > +struct checmate_unix_stream_connect_ctx { > > + struct sock *sock; > > + struct sock *other; > > + struct sock *newsk; > > +}; > ... > > +struct checmate_sk_free_security_ctx { > > + struct sock *sk; > > +}; > ... > > +struct checmate_ctx { > > + int hook; > > + union { > > +/* CONFIG_SECURITY_NET contexts */ > > + struct checmate_unix_stream_connect_ctx unix_stream_connect; > > + struct checmate_unix_may_send_ctx unix_may_send; > > + struct checmate_socket_create_ctx socket_create; > > + struct checmate_socket_bind_ctx socket_bind; > > + struct checmate_socket_connect_ctx socket_connect; > > + struct checmate_socket_listen_ctx socket_listen; > > + struct checmate_socket_accept_ctx socket_accept; > > + struct checmate_socket_sendmsg_ctx socket_sendmsg; > > + struct checmate_socket_recvmsg_ctx socket_recvmsg; > > + struct checmate_socket_sock_rcv_skb_ctx socket_sock_rcv_skb; > > + struct checmate_sk_free_security_ctxsk_free_security; > > + }; > > +}; > > I'm not convinced about the approach. It's an approach which pretty > much requires future extensions while being rigid. Not a good > combination. > Do you have an alternative recommendation? Maybe just a set of 5 u64s as the context object along with the hook ID? > > +/* > > Please use /** for function comments. > > > + * checmate_instance_cleanup_rcu - Cleans up a Checmate program instance > > + * @rp: rcu_head pointer to a Checmate instance > > + */ > > +static void checmate_instance_cleanup_rcu(struct rcu_head *rp) > > +{ > > + struct
Re: [net-next RFC v2 4/9] bpf, security: Add Checmate security LSM and BPF program type
Hello, On Mon, Aug 29, 2016 at 04:47:07AM -0700, Sargun Dhillon wrote: > This patch adds a minor LSM, Checmate. Checmate is a flexible programmable, > extensible minor LSM that's coupled with cgroups and BPF. It is designed to > enforce container-specific policies. It is also a cgroupv2 controller. By > itself, it doesn't do very much, but in conjunction with a orchestrator > complex policies can be installed on the cgroup hierarchy. > > These cgroup programs are tied to the kernel ABI version. If one tries > to load a BPF program compiled against a different kernel version, > an error will be thrown. First of all, please talk with people working on network cgroup bpf and landlock. I don't think it's a good idea to have N different ways to implement cgroup-aware bpf mechanism. There can be multiple consumers but there gotta be a common mechanism instead of several independent controllers. > diff --git a/include/linux/checmate.h b/include/linux/checmate.h > new file mode 100644 > index 000..4c4db4a > --- /dev/null > +++ b/include/linux/checmate.h > @@ -0,0 +1,108 @@ > +#ifndef _LINUX_CHECMATE_H_ > +#define _LINUX_CHECMATE_H_ 1 > +#include > + > +enum checmate_hook_num { > + /* CONFIG_SECURITY_NET hooks */ > + CHECMATE_HOOK_UNIX_STREAM_CONNECT, > + CHECMATE_HOOK_UNIX_MAY_SEND, > + CHECMATE_HOOK_SOCKET_CREATE, > + CHECMATE_HOOK_SOCKET_POST_CREATE, > + CHECMATE_HOOK_SOCKET_BIND, > + CHECMATE_HOOK_SOCKET_CONNECT, > + CHECMATE_HOOK_SOCKET_LISTEN, > + CHECMATE_HOOK_SOCKET_ACCEPT, > + CHECMATE_HOOK_SOCKET_SENDMSG, > + CHECMATE_HOOK_SOCKET_RECVMSG, > + CHECMATE_HOOK_SOCKET_GETSOCKNAME, > + CHECMATE_HOOK_SOCKET_GETPEERNAME, > + CHECMATE_HOOK_SOCKET_GETSOCKOPT, > + CHECMATE_HOOK_SOCKET_SETSOCKOPT, > + CHECMATE_HOOK_SOCKET_SHUTDOWN, > + CHECMATE_HOOK_SOCKET_SOCK_RCV_SKB, > + CHECMATE_HOOK_SK_FREE_SECURITY, > + __CHECMATE_HOOK_MAX, > +}; Do we really want a separate hook for each call? A logical extension of this would be having a separate hook per syscall which feels kinda horrible. > +/* CONFIG_SECURITY_NET contexts */ > +struct checmate_unix_stream_connect_ctx { > + struct sock *sock; > + struct sock *other; > + struct sock *newsk; > +}; ... > +struct checmate_sk_free_security_ctx { > + struct sock *sk; > +}; ... > +struct checmate_ctx { > + int hook; > + union { > +/* CONFIG_SECURITY_NET contexts */ > + struct checmate_unix_stream_connect_ctx unix_stream_connect; > + struct checmate_unix_may_send_ctx unix_may_send; > + struct checmate_socket_create_ctx socket_create; > + struct checmate_socket_bind_ctx socket_bind; > + struct checmate_socket_connect_ctx socket_connect; > + struct checmate_socket_listen_ctx socket_listen; > + struct checmate_socket_accept_ctx socket_accept; > + struct checmate_socket_sendmsg_ctx socket_sendmsg; > + struct checmate_socket_recvmsg_ctx socket_recvmsg; > + struct checmate_socket_sock_rcv_skb_ctx socket_sock_rcv_skb; > + struct checmate_sk_free_security_ctxsk_free_security; > + }; > +}; I'm not convinced about the approach. It's an approach which pretty much requires future extensions while being rigid. Not a good combination. > +/* Please use /** for function comments. > + * checmate_instance_cleanup_rcu - Cleans up a Checmate program instance > + * @rp: rcu_head pointer to a Checmate instance > + */ > +static void checmate_instance_cleanup_rcu(struct rcu_head *rp) > +{ > + struct checmate_instance *instance; > + > + instance = container_of(rp, struct checmate_instance, rcu); > + bpf_prog_put(instance->prog); > + kfree(instance); > +} > +static struct cftype checmate_files[] = { > +#ifdef CONFIG_SECURITY_NETWORK > + CHECMATE_CFTYPE(CHECMATE_HOOK_UNIX_STREAM_CONNECT, > + "unix_stream_connect"), > + CHECMATE_CFTYPE(CHECMATE_HOOK_UNIX_MAY_SEND, > + "unix_may_send"), > + CHECMATE_CFTYPE(CHECMATE_HOOK_SOCKET_CREATE, "socket_create"), > + CHECMATE_CFTYPE(CHECMATE_HOOK_SOCKET_BIND, "socket_bind"), > + CHECMATE_CFTYPE(CHECMATE_HOOK_SOCKET_CONNECT, "socket_connect"), > + CHECMATE_CFTYPE(CHECMATE_HOOK_SOCKET_LISTEN, "socket_listen"), > + CHECMATE_CFTYPE(CHECMATE_HOOK_SOCKET_ACCEPT, "socket_accept"), > + CHECMATE_CFTYPE(CHECMATE_HOOK_SOCKET_SENDMSG, "socket_sendmsg"), > + CHECMATE_CFTYPE(CHECMATE_HOOK_SOCKET_RECVMSG, "socket_recvmsg"), > + CHECMATE_CFTYPE(CHECMATE_HOOK_SOCKET_SHUTDOWN, "socket_shutdown"), > + CHECMATE_CFTYPE(CHECMATE_HOOK_SOCKET_SOCK_RCV_SKB, > + "socket_sock_rcv_skb"), > + CHECMATE_CFTYPE(CHECMATE_HOOK_SK_FREE_SECURITY, "sk_free_security"), > +#endif /* CONFIG_SECURITY_NETWORK */ > + {} > +}; I don't think this is a good interface approach. > +struct
[net-next RFC v2 4/9] bpf, security: Add Checmate security LSM and BPF program type
This patch adds a minor LSM, Checmate. Checmate is a flexible programmable, extensible minor LSM that's coupled with cgroups and BPF. It is designed to enforce container-specific policies. It is also a cgroupv2 controller. By itself, it doesn't do very much, but in conjunction with a orchestrator complex policies can be installed on the cgroup hierarchy. These cgroup programs are tied to the kernel ABI version. If one tries to load a BPF program compiled against a different kernel version, an error will be thrown. Signed-off-by: Sargun Dhillon--- include/linux/cgroup_subsys.h| 4 + include/linux/checmate.h | 108 +++ include/uapi/linux/bpf.h | 1 + kernel/bpf/syscall.c | 2 +- security/Kconfig | 1 + security/Makefile| 2 + security/checmate/Kconfig| 11 + security/checmate/Makefile | 3 + security/checmate/checmate_bpf.c | 68 + security/checmate/checmate_lsm.c | 610 +++ 10 files changed, 809 insertions(+), 1 deletion(-) create mode 100644 include/linux/checmate.h create mode 100644 security/checmate/Kconfig create mode 100644 security/checmate/Makefile create mode 100644 security/checmate/checmate_bpf.c create mode 100644 security/checmate/checmate_lsm.c diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h index 0df0336a..fbb7aa7 100644 --- a/include/linux/cgroup_subsys.h +++ b/include/linux/cgroup_subsys.h @@ -56,6 +56,10 @@ SUBSYS(hugetlb) SUBSYS(pids) #endif +#if IS_ENABLED(CONFIG_SECURITY_CHECMATE) +SUBSYS(checmate) +#endif + /* * The following subsystems are not supported on the default hierarchy. */ diff --git a/include/linux/checmate.h b/include/linux/checmate.h new file mode 100644 index 000..4c4db4a --- /dev/null +++ b/include/linux/checmate.h @@ -0,0 +1,108 @@ +#ifndef _LINUX_CHECMATE_H_ +#define _LINUX_CHECMATE_H_ 1 +#include + +enum checmate_hook_num { + /* CONFIG_SECURITY_NET hooks */ + CHECMATE_HOOK_UNIX_STREAM_CONNECT, + CHECMATE_HOOK_UNIX_MAY_SEND, + CHECMATE_HOOK_SOCKET_CREATE, + CHECMATE_HOOK_SOCKET_POST_CREATE, + CHECMATE_HOOK_SOCKET_BIND, + CHECMATE_HOOK_SOCKET_CONNECT, + CHECMATE_HOOK_SOCKET_LISTEN, + CHECMATE_HOOK_SOCKET_ACCEPT, + CHECMATE_HOOK_SOCKET_SENDMSG, + CHECMATE_HOOK_SOCKET_RECVMSG, + CHECMATE_HOOK_SOCKET_GETSOCKNAME, + CHECMATE_HOOK_SOCKET_GETPEERNAME, + CHECMATE_HOOK_SOCKET_GETSOCKOPT, + CHECMATE_HOOK_SOCKET_SETSOCKOPT, + CHECMATE_HOOK_SOCKET_SHUTDOWN, + CHECMATE_HOOK_SOCKET_SOCK_RCV_SKB, + CHECMATE_HOOK_SK_FREE_SECURITY, + __CHECMATE_HOOK_MAX, +}; + +/* CONFIG_SECURITY_NET contexts */ +struct checmate_unix_stream_connect_ctx { + struct sock *sock; + struct sock *other; + struct sock *newsk; +}; + +struct checmate_unix_may_send_ctx { + struct socket *sock; + struct socket *other; +}; + +struct checmate_socket_create_ctx { + int family; + int type; + int protocol; + int kern; +}; + +struct checmate_socket_bind_ctx { + struct socket *sock; + struct sockaddr *address; + int addrlen; +}; + +struct checmate_socket_connect_ctx { + struct socket *sock; + struct sockaddr *address; + int addrlen; +}; + +struct checmate_socket_listen_ctx { + struct socket *sock; + int backlog; +}; + +struct checmate_socket_accept_ctx { + struct socket *sock; + struct socket *newsock; +}; + +struct checmate_socket_sendmsg_ctx { + struct socket *sock; + struct msghdr *msg; + int size; +}; + +struct checmate_socket_recvmsg_ctx { + struct socket *sock; + struct msghdr *msg; + int size; + int flags; +}; + +struct checmate_socket_sock_rcv_skb_ctx { + struct sock *sk; + struct sk_buff *skb; +}; + +struct checmate_sk_free_security_ctx { + struct sock *sk; +}; + +struct checmate_ctx { + int hook; + union { +/* CONFIG_SECURITY_NET contexts */ + struct checmate_unix_stream_connect_ctx unix_stream_connect; + struct checmate_unix_may_send_ctx unix_may_send; + struct checmate_socket_create_ctx socket_create; + struct checmate_socket_bind_ctx socket_bind; + struct checmate_socket_connect_ctx socket_connect; + struct checmate_socket_listen_ctx socket_listen; + struct checmate_socket_accept_ctx socket_accept; + struct checmate_socket_sendmsg_ctx socket_sendmsg; + struct checmate_socket_recvmsg_ctx socket_recvmsg; + struct checmate_socket_sock_rcv_skb_ctx socket_sock_rcv_skb; + struct checmate_sk_free_security_ctxsk_free_security; + }; +}; + +#endif /* _LINUX_CHECMATE_H_ */ diff --git a/include/uapi/linux/bpf.h