Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
From: Eric Dumazet Date: Thu, 09 Aug 2012 23:29:03 +0200 > diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c > index 76dde25..ec410e0 100644 > --- a/net/ipv4/ip_output.c > +++ b/net/ipv4/ip_output.c > @@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct > sk_buff *skb, __be32 daddr, > arg->csumoffset) = csum_fold(csum_add(nskb->csum, > arg->csum)); > nskb->ip_summed = CHECKSUM_NONE; > + skb_orphan(nskb); > skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb)); > ip_push_pending_frames(sk, ); > } > This is definitely the best fix, please submit this formally. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On 8/9/2012 3:05 PM, Eric Dumazet wrote: > On Thu, 2012-08-09 at 14:53 -0700, Casey Schaufler wrote: >> On 8/9/2012 2:29 PM, Eric Dumazet wrote: >>> smack_sk_alloc_security() uses smk_of_current() : What can be the >>> meaning of smk_of_current() in the context of 'kernel' sockets... >> Yes, and all of it's callers - to date - have had an appropriate >> value of current. It is using the API in the way it is supposed to. >> It is assuming a properly formed socket. You want to give it a >> cobbled together partial socket structure without task context. >> Your predecessor did not have this problem. > My predecessor ? You mean before the patch ? > > tcp socket was preallocated by at kernel boot time. > > What is the 'user' owning this socket ? > > You guys focus on an implementation detail of TCP stack. > You should never use this fake socket. > > I repeat : There are no true socket for these control packets. > > If you want them, then you'll have to add fields in timewait socket. > > I can decide to rewrite the whole thing just building a TCP packet on > its own, and send it without any fake socket. > > Some guy 15 years ago tried to reuse some high level functions, able to > build super packets and use sophisticated tricks, while we only want so > send a 40 or 60 bytes packet. OK, fine. You have an optimization. I'm good with that. Just don't expect that the entire software stack you are taking advantage of is going to change to accommodate your special case. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, 2012-08-09 at 14:53 -0700, Casey Schaufler wrote: > On 8/9/2012 2:29 PM, Eric Dumazet wrote: > > smack_sk_alloc_security() uses smk_of_current() : What can be the > > meaning of smk_of_current() in the context of 'kernel' sockets... > > Yes, and all of it's callers - to date - have had an appropriate > value of current. It is using the API in the way it is supposed to. > It is assuming a properly formed socket. You want to give it a > cobbled together partial socket structure without task context. > Your predecessor did not have this problem. My predecessor ? You mean before the patch ? tcp socket was preallocated by at kernel boot time. What is the 'user' owning this socket ? You guys focus on an implementation detail of TCP stack. You should never use this fake socket. I repeat : There are no true socket for these control packets. If you want them, then you'll have to add fields in timewait socket. I can decide to rewrite the whole thing just building a TCP packet on its own, and send it without any fake socket. Some guy 15 years ago tried to reuse some high level functions, able to build super packets and use sophisticated tricks, while we only want so send a 40 or 60 bytes packet. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On 8/9/2012 2:29 PM, Eric Dumazet wrote: > On Thu, 2012-08-09 at 16:06 -0400, Eric Paris wrote: >> NAK. >> >> I personally think commit be9f4a44e7d41cee should be reverted until it >> is fixed. Let me explain what all I believe it broke and how. >> > Suggesting to revert this commit while we have known working fixes is a > bit of strange reaction. A couple of potential short term workarounds have been identified, but no one is happy with them for the long term. That does not qualify as a "working fix" in engineering terms. > I understand you are upset, but I believe we tried to fix it. > >> Old callchain of the creation of the 'equivalent' socket previous to >> the patch in question just for reference: >> >> inet_ctl_sock_create >> sock_create_kern >> __sock_create >> pf->create (inet_create) >> sk_alloc >> sk_prot_alloc >> security_sk_alloc() >> >> >> This WAS working properly. All of it. > Nobody denies it. But acknowledge my patch uncovered a fundamental > issue. > > What kind of 'security module' can decide to let RST packets being sent > or not, on a global scale ? (one socket for the whole machine) The short answer is "any security module that wants to". And before we go any further, I'm a little surprised that SELinux doesn't do this already. > > smack_sk_alloc_security() uses smk_of_current() : What can be the > meaning of smk_of_current() in the context of 'kernel' sockets... Yes, and all of it's callers - to date - have had an appropriate value of current. It is using the API in the way it is supposed to. It is assuming a properly formed socket. You want to give it a cobbled together partial socket structure without task context. Your predecessor did not have this problem. > > Your patch tries to maintain this status quo. > > In fact I suggest the following one liner patch, unless you can really > demonstrate what can be the meaning of providing a fake socket for these > packets. > > This mess only happened because ip_append_data()/ip_push_pending_frames() > are so complex and use an underlying socket. > > But this socket should not be ever used outside of its scope. > > diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c > index 76dde25..ec410e0 100644 > --- a/net/ipv4/ip_output.c > +++ b/net/ipv4/ip_output.c > @@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct > sk_buff *skb, __be32 daddr, > arg->csumoffset) = csum_fold(csum_add(nskb->csum, > arg->csum)); > nskb->ip_summed = CHECKSUM_NONE; > + skb_orphan(nskb); > skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb)); > ip_push_pending_frames(sk, ); > } > > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, 2012-08-09 at 16:06 -0400, Eric Paris wrote: > NAK. > > I personally think commit be9f4a44e7d41cee should be reverted until it > is fixed. Let me explain what all I believe it broke and how. > Suggesting to revert this commit while we have known working fixes is a bit of strange reaction. I understand you are upset, but I believe we tried to fix it. > Old callchain of the creation of the 'equivalent' socket previous to > the patch in question just for reference: > > inet_ctl_sock_create > sock_create_kern > __sock_create > pf->create (inet_create) > sk_alloc > sk_prot_alloc > security_sk_alloc() > > > This WAS working properly. All of it. Nobody denies it. But acknowledge my patch uncovered a fundamental issue. What kind of 'security module' can decide to let RST packets being sent or not, on a global scale ? (one socket for the whole machine) smack_sk_alloc_security() uses smk_of_current() : What can be the meaning of smk_of_current() in the context of 'kernel' sockets... Your patch tries to maintain this status quo. In fact I suggest the following one liner patch, unless you can really demonstrate what can be the meaning of providing a fake socket for these packets. This mess only happened because ip_append_data()/ip_push_pending_frames() are so complex and use an underlying socket. But this socket should not be ever used outside of its scope. diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 76dde25..ec410e0 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr, arg->csumoffset) = csum_fold(csum_add(nskb->csum, arg->csum)); nskb->ip_summed = CHECKSUM_NONE; + skb_orphan(nskb); skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb)); ip_push_pending_frames(sk, ); } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, Aug 9, 2012 at 4:06 PM, Eric Paris wrote: > I'm going to work right now on exposing the equivalent struct sock LSM > interface so we can call that as well. But it's going to take me a > bit. Before you go too far down this path, can you elaborate on what exactly you mean by the above? I'm asking because I'm not convinced the labeling, either the old way or the new way, was 100% correct and I think we're going to need to change things regardless. I'm just not sure what the right solution is just yet. -- paul moore www.paul-moore.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
NAK. I personally think commit be9f4a44e7d41cee should be reverted until it is fixed. Let me explain what all I believe it broke and how. Old callchain of the creation of the 'equivalent' socket previous to the patch in question just for reference: inet_ctl_sock_create sock_create_kern __sock_create pf->create (inet_create) sk_alloc sk_prot_alloc security_sk_alloc() This WAS working properly. All of it. The equivalent struct sock was being created and allocated in inet_create(), which called to sk_alloc->sk_prot_alloc->security_sk_alloc(). We all agree that failing to call security_sk_alloc() is the first regression introduced. The second regression was the labeling issue. There was a call to security_socket_post_create (from __sock_create) which was properly setting the labels on both the socket and sock. This new patch broke that as well. We don't expose an equivalent security_sock_post_create() interface in the LSM currently, and until we do, this can't be fixed. It's why I say it should be reverted. I have a patch I'm testing right now which takes care of the first part the way I like (and yes, I'm doing the allocation on the correct number node). It basically looks like so: + for_each_possible_cpu(cpu) { + sock = _cpu(unicast_sock, cpu); + rc = security_sk_alloc(>sk, PF_INET, GFP_KERNEL, cpu_to_node(cpu)); + if (rc) + return rc; + } I'm going to work right now on exposing the equivalent struct sock LSM interface so we can call that as well. But it's going to take me a bit. Attached is the patch just to (hopefully untested) shut up the panic. -Eric On Thu, Aug 9, 2012 at 10:50 AM, Eric Dumazet wrote: > From: Eric Dumazet > > commit be9f4a44e7d41cee (ipv4: tcp: remove per net tcp_sock) added a > selinux regression, reported and bisected by John Stultz > > selinux_ip_postroute_compat() expect to find a valid sk->sk_security > pointer, but this field is NULL for unicast_sock > > Fix this by adding a new 'kernel' parameter to security_sk_alloc(), > set to true if socket might already have a valid sk->sk_security > pointer. ip_send_unicast_reply() uses a percpu fake socket, so the first > call to security_sk_alloc() will populate sk->sk_security pointer, > subsequent ones will reuse existing context. > > Reported-by: John Stultz > Bisected-by: John Stultz > Signed-off-by: Eric Dumazet > Cc: Paul Moore > Cc: Eric Paris > Cc: "Serge E. Hallyn" > --- > include/linux/security.h |6 +++--- > net/core/sock.c|2 +- > net/ipv4/ip_output.c |4 +++- > security/security.c|4 ++-- > security/selinux/hooks.c |5 - > security/smack/smack_lsm.c | 10 -- > 6 files changed, 21 insertions(+), 10 deletions(-) > > diff --git a/include/linux/security.h b/include/linux/security.h > index 4e5a73c..4d8e454 100644 > --- a/include/linux/security.h > +++ b/include/linux/security.h > @@ -1601,7 +1601,7 @@ struct security_operations { > int (*socket_sock_rcv_skb) (struct sock *sk, struct sk_buff *skb); > int (*socket_getpeersec_stream) (struct socket *sock, char __user > *optval, int __user *optlen, unsigned len); > int (*socket_getpeersec_dgram) (struct socket *sock, struct sk_buff > *skb, u32 *secid); > - int (*sk_alloc_security) (struct sock *sk, int family, gfp_t > priority); > + int (*sk_alloc_security) (struct sock *sk, int family, gfp_t > priority, bool kernel); > void (*sk_free_security) (struct sock *sk); > void (*sk_clone_security) (const struct sock *sk, struct sock *newsk); > void (*sk_getsecid) (struct sock *sk, u32 *secid); > @@ -2539,7 +2539,7 @@ int security_sock_rcv_skb(struct sock *sk, struct > sk_buff *skb); > int security_socket_getpeersec_stream(struct socket *sock, char __user > *optval, > int __user *optlen, unsigned len); > int security_socket_getpeersec_dgram(struct socket *sock, struct sk_buff > *skb, u32 *secid); > -int security_sk_alloc(struct sock *sk, int family, gfp_t priority); > +int security_sk_alloc(struct sock *sk, int family, gfp_t priority, bool > kernel); > void security_sk_free(struct sock *sk); > void security_sk_clone(const struct sock *sk, struct sock *newsk); > void security_sk_classify_flow(struct sock *sk, struct flowi *fl); > @@ -2667,7 +2667,7 @@ static inline int > security_socket_getpeersec_dgram(struct socket *sock, struct s > return -ENOPROTOOPT; > } > > -static inline int security_sk_alloc(struct sock *sk, int family, gfp_t > priority) > +static inline int security_sk_alloc(struct sock *sk, int family, gfp_t > priority, bool kernel) > { > return 0; > } > diff --git a/net/core/sock.c b/net/core/sock.c > index 8f67ced..e00cadf 100644 > --- a/net/core/sock.c > +++ b/net/core/sock.c > @@ -1186,7 +1186,7 @@ static struct
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, 2012-08-09 at 12:05 -0400, Eric Paris wrote: > On Thu, Aug 9, 2012 at 11:36 AM, Eric Dumazet wrote: > > On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote: > > > >> Is is possible to do the call to security_sk_alloc() in the ip_init() > >> function > >> or does the per-cpu nature of the socket make this a pain? > >> > > > > Its a pain, if we want NUMA affinity. > > > > Here, each cpu should get memory from its closest node. > > I really really don't like it. I won't say NAK, but it is the first > and only place in the kernel where I believe we allocate an object and > don't allocate the security blob until some random later point in > time. ... > If it is such a performance issue to have the security blob in > the same numa node, isn't adding a number of branches and putting this > function call on every output at least as bad? Aren't we discouraged > from GFP_ATOMIC? In __init we can use GFP_KERNEL. What a big deal. Its done _once_ time per cpu, and this is so small blob of memory you'll have to show us one single failure out of one million boots. If the security_sk_alloc() fails, we dont care. We are about sending a RESET or ACK packet. They can be lost by the network, or even skb allocation can fail. Nobody ever noticed and complained. Every time we accept() a new socket (and call security_sk_alloc()), its done under soft irq, thus GFP_ATOMIC, and you didn't complain yet, while a socket needs about 2 Kbytes of memory... > > This still doesn't fix these sockets entirely. We now have the > security blob allocated, but it was never set to something useful. > Paul, are you looking into this? This is a bandaide, not a fix > Please do so, on a followup patch, dont pretend I must fix all this stuff. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, Aug 9, 2012 at 12:05 PM, Eric Paris wrote: > Paul, are you looking into this? This is a bandaide, not a fix Yep, I mentioned this a few times in the other thread. The problem is there is not going to be an easy fix for the labeling so I'd rather we see this patch, or something like it, go in now to resolve the kernel panic, and fix the labeling later. -- paul moore www.paul-moore.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, Aug 9, 2012 at 11:36 AM, Eric Dumazet wrote: > On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote: > >> Is is possible to do the call to security_sk_alloc() in the ip_init() >> function >> or does the per-cpu nature of the socket make this a pain? >> > > Its a pain, if we want NUMA affinity. > > Here, each cpu should get memory from its closest node. I really really don't like it. I won't say NAK, but it is the first and only place in the kernel where I believe we allocate an object and don't allocate the security blob until some random later point in time. If it is such a performance issue to have the security blob in the same numa node, isn't adding a number of branches and putting this function call on every output at least as bad? Aren't we discouraged from GFP_ATOMIC? In __init we can use GFP_KERNEL. This still doesn't fix these sockets entirely. We now have the security blob allocated, but it was never set to something useful. Paul, are you looking into this? This is a bandaide, not a fix -Eric -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, Aug 9, 2012 at 11:36 AM, Eric Dumazet wrote: > On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote: > >> Is is possible to do the call to security_sk_alloc() in the ip_init() >> function >> or does the per-cpu nature of the socket make this a pain? >> > > Its a pain, if we want NUMA affinity. > > Here, each cpu should get memory from its closest node. Okay, makes sense. Acked-by: Paul Moore -- paul moore www.paul-moore.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote: > Is is possible to do the call to security_sk_alloc() in the ip_init() > function > or does the per-cpu nature of the socket make this a pain? > Its a pain, if we want NUMA affinity. Here, each cpu should get memory from its closest node. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thursday, August 09, 2012 04:50:33 PM Eric Dumazet wrote: > From: Eric Dumazet > > commit be9f4a44e7d41cee (ipv4: tcp: remove per net tcp_sock) added a > selinux regression, reported and bisected by John Stultz > > selinux_ip_postroute_compat() expect to find a valid sk->sk_security > pointer, but this field is NULL for unicast_sock > > Fix this by adding a new 'kernel' parameter to security_sk_alloc(), > set to true if socket might already have a valid sk->sk_security > pointer. ip_send_unicast_reply() uses a percpu fake socket, so the first > call to security_sk_alloc() will populate sk->sk_security pointer, > subsequent ones will reuse existing context. > > Reported-by: John Stultz > Bisected-by: John Stultz > Signed-off-by: Eric Dumazet > Cc: Paul Moore > Cc: Eric Paris > Cc: "Serge E. Hallyn" ... > diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c > index 76dde25..b233d6e 100644 > --- a/net/ipv4/ip_output.c > +++ b/net/ipv4/ip_output.c > @@ -1524,6 +1524,8 @@ void ip_send_unicast_reply(struct net *net, struct > sk_buff *skb, __be32 daddr, sk->sk_priority = skb->priority; > sk->sk_protocol = ip_hdr(skb)->protocol; > sk->sk_bound_dev_if = arg->bound_dev_if; > + if (security_sk_alloc(sk, PF_INET, GFP_ATOMIC, true)) > + goto out; > sock_net_set(sk, net); > __skb_queue_head_init(>sk_write_queue); > sk->sk_sndbuf = sysctl_wmem_default; Is is possible to do the call to security_sk_alloc() in the ip_init() function or does the per-cpu nature of the socket make this a pain? -- paul moore www.paul-moore.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thursday, August 09, 2012 04:50:33 PM Eric Dumazet wrote: From: Eric Dumazet eduma...@google.com commit be9f4a44e7d41cee (ipv4: tcp: remove per net tcp_sock) added a selinux regression, reported and bisected by John Stultz selinux_ip_postroute_compat() expect to find a valid sk-sk_security pointer, but this field is NULL for unicast_sock Fix this by adding a new 'kernel' parameter to security_sk_alloc(), set to true if socket might already have a valid sk-sk_security pointer. ip_send_unicast_reply() uses a percpu fake socket, so the first call to security_sk_alloc() will populate sk-sk_security pointer, subsequent ones will reuse existing context. Reported-by: John Stultz johns...@us.ibm.com Bisected-by: John Stultz johns...@us.ibm.com Signed-off-by: Eric Dumazet eduma...@google.com Cc: Paul Moore p...@paul-moore.com Cc: Eric Paris epa...@parisplace.org Cc: Serge E. Hallyn se...@hallyn.com ... diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 76dde25..b233d6e 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1524,6 +1524,8 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr, sk-sk_priority = skb-priority; sk-sk_protocol = ip_hdr(skb)-protocol; sk-sk_bound_dev_if = arg-bound_dev_if; + if (security_sk_alloc(sk, PF_INET, GFP_ATOMIC, true)) + goto out; sock_net_set(sk, net); __skb_queue_head_init(sk-sk_write_queue); sk-sk_sndbuf = sysctl_wmem_default; Is is possible to do the call to security_sk_alloc() in the ip_init() function or does the per-cpu nature of the socket make this a pain? -- paul moore www.paul-moore.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote: Is is possible to do the call to security_sk_alloc() in the ip_init() function or does the per-cpu nature of the socket make this a pain? Its a pain, if we want NUMA affinity. Here, each cpu should get memory from its closest node. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, Aug 9, 2012 at 11:36 AM, Eric Dumazet eric.duma...@gmail.com wrote: On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote: Is is possible to do the call to security_sk_alloc() in the ip_init() function or does the per-cpu nature of the socket make this a pain? Its a pain, if we want NUMA affinity. Here, each cpu should get memory from its closest node. Okay, makes sense. Acked-by: Paul Moore p...@paul-moore.com -- paul moore www.paul-moore.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, Aug 9, 2012 at 11:36 AM, Eric Dumazet eric.duma...@gmail.com wrote: On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote: Is is possible to do the call to security_sk_alloc() in the ip_init() function or does the per-cpu nature of the socket make this a pain? Its a pain, if we want NUMA affinity. Here, each cpu should get memory from its closest node. I really really don't like it. I won't say NAK, but it is the first and only place in the kernel where I believe we allocate an object and don't allocate the security blob until some random later point in time. If it is such a performance issue to have the security blob in the same numa node, isn't adding a number of branches and putting this function call on every output at least as bad? Aren't we discouraged from GFP_ATOMIC? In __init we can use GFP_KERNEL. This still doesn't fix these sockets entirely. We now have the security blob allocated, but it was never set to something useful. Paul, are you looking into this? This is a bandaide, not a fix -Eric -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, Aug 9, 2012 at 12:05 PM, Eric Paris epa...@parisplace.org wrote: Paul, are you looking into this? This is a bandaide, not a fix Yep, I mentioned this a few times in the other thread. The problem is there is not going to be an easy fix for the labeling so I'd rather we see this patch, or something like it, go in now to resolve the kernel panic, and fix the labeling later. -- paul moore www.paul-moore.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, 2012-08-09 at 12:05 -0400, Eric Paris wrote: On Thu, Aug 9, 2012 at 11:36 AM, Eric Dumazet eric.duma...@gmail.com wrote: On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote: Is is possible to do the call to security_sk_alloc() in the ip_init() function or does the per-cpu nature of the socket make this a pain? Its a pain, if we want NUMA affinity. Here, each cpu should get memory from its closest node. I really really don't like it. I won't say NAK, but it is the first and only place in the kernel where I believe we allocate an object and don't allocate the security blob until some random later point in time. ... If it is such a performance issue to have the security blob in the same numa node, isn't adding a number of branches and putting this function call on every output at least as bad? Aren't we discouraged from GFP_ATOMIC? In __init we can use GFP_KERNEL. What a big deal. Its done _once_ time per cpu, and this is so small blob of memory you'll have to show us one single failure out of one million boots. If the security_sk_alloc() fails, we dont care. We are about sending a RESET or ACK packet. They can be lost by the network, or even skb allocation can fail. Nobody ever noticed and complained. Every time we accept() a new socket (and call security_sk_alloc()), its done under soft irq, thus GFP_ATOMIC, and you didn't complain yet, while a socket needs about 2 Kbytes of memory... This still doesn't fix these sockets entirely. We now have the security blob allocated, but it was never set to something useful. Paul, are you looking into this? This is a bandaide, not a fix Please do so, on a followup patch, dont pretend I must fix all this stuff. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
NAK. I personally think commit be9f4a44e7d41cee should be reverted until it is fixed. Let me explain what all I believe it broke and how. Old callchain of the creation of the 'equivalent' socket previous to the patch in question just for reference: inet_ctl_sock_create sock_create_kern __sock_create pf-create (inet_create) sk_alloc sk_prot_alloc security_sk_alloc() This WAS working properly. All of it. The equivalent struct sock was being created and allocated in inet_create(), which called to sk_alloc-sk_prot_alloc-security_sk_alloc(). We all agree that failing to call security_sk_alloc() is the first regression introduced. The second regression was the labeling issue. There was a call to security_socket_post_create (from __sock_create) which was properly setting the labels on both the socket and sock. This new patch broke that as well. We don't expose an equivalent security_sock_post_create() interface in the LSM currently, and until we do, this can't be fixed. It's why I say it should be reverted. I have a patch I'm testing right now which takes care of the first part the way I like (and yes, I'm doing the allocation on the correct number node). It basically looks like so: + for_each_possible_cpu(cpu) { + sock = per_cpu(unicast_sock, cpu); + rc = security_sk_alloc(sock-sk, PF_INET, GFP_KERNEL, cpu_to_node(cpu)); + if (rc) + return rc; + } I'm going to work right now on exposing the equivalent struct sock LSM interface so we can call that as well. But it's going to take me a bit. Attached is the patch just to (hopefully untested) shut up the panic. -Eric On Thu, Aug 9, 2012 at 10:50 AM, Eric Dumazet eric.duma...@gmail.com wrote: From: Eric Dumazet eduma...@google.com commit be9f4a44e7d41cee (ipv4: tcp: remove per net tcp_sock) added a selinux regression, reported and bisected by John Stultz selinux_ip_postroute_compat() expect to find a valid sk-sk_security pointer, but this field is NULL for unicast_sock Fix this by adding a new 'kernel' parameter to security_sk_alloc(), set to true if socket might already have a valid sk-sk_security pointer. ip_send_unicast_reply() uses a percpu fake socket, so the first call to security_sk_alloc() will populate sk-sk_security pointer, subsequent ones will reuse existing context. Reported-by: John Stultz johns...@us.ibm.com Bisected-by: John Stultz johns...@us.ibm.com Signed-off-by: Eric Dumazet eduma...@google.com Cc: Paul Moore p...@paul-moore.com Cc: Eric Paris epa...@parisplace.org Cc: Serge E. Hallyn se...@hallyn.com --- include/linux/security.h |6 +++--- net/core/sock.c|2 +- net/ipv4/ip_output.c |4 +++- security/security.c|4 ++-- security/selinux/hooks.c |5 - security/smack/smack_lsm.c | 10 -- 6 files changed, 21 insertions(+), 10 deletions(-) diff --git a/include/linux/security.h b/include/linux/security.h index 4e5a73c..4d8e454 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -1601,7 +1601,7 @@ struct security_operations { int (*socket_sock_rcv_skb) (struct sock *sk, struct sk_buff *skb); int (*socket_getpeersec_stream) (struct socket *sock, char __user *optval, int __user *optlen, unsigned len); int (*socket_getpeersec_dgram) (struct socket *sock, struct sk_buff *skb, u32 *secid); - int (*sk_alloc_security) (struct sock *sk, int family, gfp_t priority); + int (*sk_alloc_security) (struct sock *sk, int family, gfp_t priority, bool kernel); void (*sk_free_security) (struct sock *sk); void (*sk_clone_security) (const struct sock *sk, struct sock *newsk); void (*sk_getsecid) (struct sock *sk, u32 *secid); @@ -2539,7 +2539,7 @@ int security_sock_rcv_skb(struct sock *sk, struct sk_buff *skb); int security_socket_getpeersec_stream(struct socket *sock, char __user *optval, int __user *optlen, unsigned len); int security_socket_getpeersec_dgram(struct socket *sock, struct sk_buff *skb, u32 *secid); -int security_sk_alloc(struct sock *sk, int family, gfp_t priority); +int security_sk_alloc(struct sock *sk, int family, gfp_t priority, bool kernel); void security_sk_free(struct sock *sk); void security_sk_clone(const struct sock *sk, struct sock *newsk); void security_sk_classify_flow(struct sock *sk, struct flowi *fl); @@ -2667,7 +2667,7 @@ static inline int security_socket_getpeersec_dgram(struct socket *sock, struct s return -ENOPROTOOPT; } -static inline int security_sk_alloc(struct sock *sk, int family, gfp_t priority) +static inline int security_sk_alloc(struct sock *sk, int family, gfp_t priority, bool kernel) { return 0; } diff --git a/net/core/sock.c b/net/core/sock.c index 8f67ced..e00cadf 100644 ---
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, Aug 9, 2012 at 4:06 PM, Eric Paris epa...@parisplace.org wrote: I'm going to work right now on exposing the equivalent struct sock LSM interface so we can call that as well. But it's going to take me a bit. Before you go too far down this path, can you elaborate on what exactly you mean by the above? I'm asking because I'm not convinced the labeling, either the old way or the new way, was 100% correct and I think we're going to need to change things regardless. I'm just not sure what the right solution is just yet. -- paul moore www.paul-moore.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, 2012-08-09 at 16:06 -0400, Eric Paris wrote: NAK. I personally think commit be9f4a44e7d41cee should be reverted until it is fixed. Let me explain what all I believe it broke and how. Suggesting to revert this commit while we have known working fixes is a bit of strange reaction. I understand you are upset, but I believe we tried to fix it. Old callchain of the creation of the 'equivalent' socket previous to the patch in question just for reference: inet_ctl_sock_create sock_create_kern __sock_create pf-create (inet_create) sk_alloc sk_prot_alloc security_sk_alloc() This WAS working properly. All of it. Nobody denies it. But acknowledge my patch uncovered a fundamental issue. What kind of 'security module' can decide to let RST packets being sent or not, on a global scale ? (one socket for the whole machine) smack_sk_alloc_security() uses smk_of_current() : What can be the meaning of smk_of_current() in the context of 'kernel' sockets... Your patch tries to maintain this status quo. In fact I suggest the following one liner patch, unless you can really demonstrate what can be the meaning of providing a fake socket for these packets. This mess only happened because ip_append_data()/ip_push_pending_frames() are so complex and use an underlying socket. But this socket should not be ever used outside of its scope. diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 76dde25..ec410e0 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr, arg-csumoffset) = csum_fold(csum_add(nskb-csum, arg-csum)); nskb-ip_summed = CHECKSUM_NONE; + skb_orphan(nskb); skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb)); ip_push_pending_frames(sk, fl4); } -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On 8/9/2012 2:29 PM, Eric Dumazet wrote: On Thu, 2012-08-09 at 16:06 -0400, Eric Paris wrote: NAK. I personally think commit be9f4a44e7d41cee should be reverted until it is fixed. Let me explain what all I believe it broke and how. Suggesting to revert this commit while we have known working fixes is a bit of strange reaction. A couple of potential short term workarounds have been identified, but no one is happy with them for the long term. That does not qualify as a working fix in engineering terms. I understand you are upset, but I believe we tried to fix it. Old callchain of the creation of the 'equivalent' socket previous to the patch in question just for reference: inet_ctl_sock_create sock_create_kern __sock_create pf-create (inet_create) sk_alloc sk_prot_alloc security_sk_alloc() This WAS working properly. All of it. Nobody denies it. But acknowledge my patch uncovered a fundamental issue. What kind of 'security module' can decide to let RST packets being sent or not, on a global scale ? (one socket for the whole machine) The short answer is any security module that wants to. And before we go any further, I'm a little surprised that SELinux doesn't do this already. smack_sk_alloc_security() uses smk_of_current() : What can be the meaning of smk_of_current() in the context of 'kernel' sockets... Yes, and all of it's callers - to date - have had an appropriate value of current. It is using the API in the way it is supposed to. It is assuming a properly formed socket. You want to give it a cobbled together partial socket structure without task context. Your predecessor did not have this problem. Your patch tries to maintain this status quo. In fact I suggest the following one liner patch, unless you can really demonstrate what can be the meaning of providing a fake socket for these packets. This mess only happened because ip_append_data()/ip_push_pending_frames() are so complex and use an underlying socket. But this socket should not be ever used outside of its scope. diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 76dde25..ec410e0 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr, arg-csumoffset) = csum_fold(csum_add(nskb-csum, arg-csum)); nskb-ip_summed = CHECKSUM_NONE; + skb_orphan(nskb); skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb)); ip_push_pending_frames(sk, fl4); } -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On Thu, 2012-08-09 at 14:53 -0700, Casey Schaufler wrote: On 8/9/2012 2:29 PM, Eric Dumazet wrote: smack_sk_alloc_security() uses smk_of_current() : What can be the meaning of smk_of_current() in the context of 'kernel' sockets... Yes, and all of it's callers - to date - have had an appropriate value of current. It is using the API in the way it is supposed to. It is assuming a properly formed socket. You want to give it a cobbled together partial socket structure without task context. Your predecessor did not have this problem. My predecessor ? You mean before the patch ? tcp socket was preallocated by at kernel boot time. What is the 'user' owning this socket ? You guys focus on an implementation detail of TCP stack. You should never use this fake socket. I repeat : There are no true socket for these control packets. If you want them, then you'll have to add fields in timewait socket. I can decide to rewrite the whole thing just building a TCP packet on its own, and send it without any fake socket. Some guy 15 years ago tried to reuse some high level functions, able to build super packets and use sophisticated tricks, while we only want so send a 40 or 60 bytes packet. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
On 8/9/2012 3:05 PM, Eric Dumazet wrote: On Thu, 2012-08-09 at 14:53 -0700, Casey Schaufler wrote: On 8/9/2012 2:29 PM, Eric Dumazet wrote: smack_sk_alloc_security() uses smk_of_current() : What can be the meaning of smk_of_current() in the context of 'kernel' sockets... Yes, and all of it's callers - to date - have had an appropriate value of current. It is using the API in the way it is supposed to. It is assuming a properly formed socket. You want to give it a cobbled together partial socket structure without task context. Your predecessor did not have this problem. My predecessor ? You mean before the patch ? tcp socket was preallocated by at kernel boot time. What is the 'user' owning this socket ? You guys focus on an implementation detail of TCP stack. You should never use this fake socket. I repeat : There are no true socket for these control packets. If you want them, then you'll have to add fields in timewait socket. I can decide to rewrite the whole thing just building a TCP packet on its own, and send it without any fake socket. Some guy 15 years ago tried to reuse some high level functions, able to build super packets and use sophisticated tricks, while we only want so send a 40 or 60 bytes packet. OK, fine. You have an optimization. I'm good with that. Just don't expect that the entire software stack you are taking advantage of is going to change to accommodate your special case. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock
From: Eric Dumazet eric.duma...@gmail.com Date: Thu, 09 Aug 2012 23:29:03 +0200 diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 76dde25..ec410e0 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, __be32 daddr, arg-csumoffset) = csum_fold(csum_add(nskb-csum, arg-csum)); nskb-ip_summed = CHECKSUM_NONE; + skb_orphan(nskb); skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb)); ip_push_pending_frames(sk, fl4); } This is definitely the best fix, please submit this formally. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/