Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread David Miller
From: Eric Dumazet 
Date: Thu, 09 Aug 2012 23:29:03 +0200

> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 76dde25..ec410e0 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct 
> sk_buff *skb, __be32 daddr,
> arg->csumoffset) = csum_fold(csum_add(nskb->csum,
>   arg->csum));
>   nskb->ip_summed = CHECKSUM_NONE;
> + skb_orphan(nskb);
>   skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb));
>   ip_push_pending_frames(sk, );
>   }
> 

This is definitely the best fix, please submit this formally.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Casey Schaufler
On 8/9/2012 3:05 PM, Eric Dumazet wrote:
> On Thu, 2012-08-09 at 14:53 -0700, Casey Schaufler wrote:
>> On 8/9/2012 2:29 PM, Eric Dumazet wrote:
>>> smack_sk_alloc_security() uses smk_of_current() : What can be the
>>> meaning of smk_of_current() in the context of 'kernel' sockets...
>> Yes, and all of it's callers - to date - have had an appropriate
>> value of current. It is using the API in the way it is supposed to.
>> It is assuming a properly formed socket. You want to give it a
>> cobbled together partial socket structure without task context.
>> Your predecessor did not have this problem.
> My predecessor ? You mean before the patch ?
>
> tcp socket was preallocated by at kernel boot time.
>
> What is the 'user' owning this socket ?
>
> You guys focus on an implementation detail of TCP stack.
> You should never use this fake socket.
>
> I repeat : There are no true socket for these control packets.
>
> If you want them, then you'll have to add fields in timewait socket.
>
> I can decide to rewrite the whole thing just building a TCP packet on
> its own, and send it without any fake socket.
>
> Some guy 15 years ago tried to reuse some high level functions, able to
> build super packets and use sophisticated tricks, while we only want so
> send a 40 or 60 bytes packet.

OK, fine. You have an optimization. I'm good with that. Just don't
expect that the entire software stack you are taking advantage of
is going to change to accommodate your special case.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Eric Dumazet
On Thu, 2012-08-09 at 14:53 -0700, Casey Schaufler wrote:
> On 8/9/2012 2:29 PM, Eric Dumazet wrote:

> > smack_sk_alloc_security() uses smk_of_current() : What can be the
> > meaning of smk_of_current() in the context of 'kernel' sockets...
> 
> Yes, and all of it's callers - to date - have had an appropriate
> value of current. It is using the API in the way it is supposed to.
> It is assuming a properly formed socket. You want to give it a
> cobbled together partial socket structure without task context.
> Your predecessor did not have this problem.

My predecessor ? You mean before the patch ?

tcp socket was preallocated by at kernel boot time.

What is the 'user' owning this socket ?

You guys focus on an implementation detail of TCP stack.
You should never use this fake socket.

I repeat : There are no true socket for these control packets.

If you want them, then you'll have to add fields in timewait socket.

I can decide to rewrite the whole thing just building a TCP packet on
its own, and send it without any fake socket.

Some guy 15 years ago tried to reuse some high level functions, able to
build super packets and use sophisticated tricks, while we only want so
send a 40 or 60 bytes packet.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Casey Schaufler
On 8/9/2012 2:29 PM, Eric Dumazet wrote:
> On Thu, 2012-08-09 at 16:06 -0400, Eric Paris wrote:
>> NAK.
>>
>> I personally think commit be9f4a44e7d41cee should be reverted until it
>> is fixed.  Let me explain what all I believe it broke and how.
>>
> Suggesting to revert this commit while we have known working fixes is a
> bit of strange reaction.

A couple of potential short term workarounds have been identified,
but no one is happy with them for the long term. That does not
qualify as a "working fix" in engineering terms.

> I understand you are upset, but I believe we tried to fix it.
>
>> Old callchain of the creation of the 'equivalent' socket previous to
>> the patch in question just for reference:
>>
>> inet_ctl_sock_create
>>   sock_create_kern
>> __sock_create
>>   pf->create (inet_create)
>> sk_alloc
>>   sk_prot_alloc
>> security_sk_alloc()
>>
>>
>> This WAS working properly.  All of it. 
> Nobody denies it. But acknowledge my patch uncovered a fundamental
> issue.
>
> What kind of 'security module' can decide to let RST packets being sent
> or not, on a global scale ? (one socket for the whole machine)

The short answer is "any security module that wants to".

And before we go any further, I'm a little surprised that
SELinux doesn't do this already.

>
> smack_sk_alloc_security() uses smk_of_current() : What can be the
> meaning of smk_of_current() in the context of 'kernel' sockets...

Yes, and all of it's callers - to date - have had an appropriate
value of current. It is using the API in the way it is supposed to.
It is assuming a properly formed socket. You want to give it a
cobbled together partial socket structure without task context.
Your predecessor did not have this problem.

>
> Your patch tries to maintain this status quo.
>
> In fact I suggest the following one liner patch, unless you can really
> demonstrate what can be the meaning of providing a fake socket for these
> packets.
>
> This mess only happened because ip_append_data()/ip_push_pending_frames()
> are so complex and use an underlying socket.
>
> But this socket should not be ever used outside of its scope.
>
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 76dde25..ec410e0 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct 
> sk_buff *skb, __be32 daddr,
> arg->csumoffset) = csum_fold(csum_add(nskb->csum,
>   arg->csum));
>   nskb->ip_summed = CHECKSUM_NONE;
> + skb_orphan(nskb);
>   skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb));
>   ip_push_pending_frames(sk, );
>   }
>
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Eric Dumazet
On Thu, 2012-08-09 at 16:06 -0400, Eric Paris wrote:
> NAK.
> 
> I personally think commit be9f4a44e7d41cee should be reverted until it
> is fixed.  Let me explain what all I believe it broke and how.
> 

Suggesting to revert this commit while we have known working fixes is a
bit of strange reaction.

I understand you are upset, but I believe we tried to fix it.

> Old callchain of the creation of the 'equivalent' socket previous to
> the patch in question just for reference:
> 
> inet_ctl_sock_create
>   sock_create_kern
> __sock_create
>   pf->create (inet_create)
> sk_alloc
>   sk_prot_alloc
> security_sk_alloc()
> 
> 
> This WAS working properly.  All of it. 

Nobody denies it. But acknowledge my patch uncovered a fundamental
issue.

What kind of 'security module' can decide to let RST packets being sent
or not, on a global scale ? (one socket for the whole machine)

smack_sk_alloc_security() uses smk_of_current() : What can be the
meaning of smk_of_current() in the context of 'kernel' sockets...

Your patch tries to maintain this status quo.

In fact I suggest the following one liner patch, unless you can really
demonstrate what can be the meaning of providing a fake socket for these
packets.

This mess only happened because ip_append_data()/ip_push_pending_frames()
are so complex and use an underlying socket.

But this socket should not be ever used outside of its scope.

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 76dde25..ec410e0 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct 
sk_buff *skb, __be32 daddr,
  arg->csumoffset) = csum_fold(csum_add(nskb->csum,
arg->csum));
nskb->ip_summed = CHECKSUM_NONE;
+   skb_orphan(nskb);
skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb));
ip_push_pending_frames(sk, );
}


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Paul Moore
On Thu, Aug 9, 2012 at 4:06 PM, Eric Paris  wrote:
> I'm going to work right now on exposing the equivalent struct sock LSM
> interface so we can call that as well.  But it's going to take me a
> bit.

Before you go too far down this path, can you elaborate on what
exactly you mean by the above?

I'm asking because I'm not convinced the labeling, either the old way
or the new way, was 100% correct and I think we're going to need to
change things regardless.  I'm just not sure what the right solution
is just yet.

-- 
paul moore
www.paul-moore.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Eric Paris
NAK.

I personally think commit be9f4a44e7d41cee should be reverted until it
is fixed.  Let me explain what all I believe it broke and how.

Old callchain of the creation of the 'equivalent' socket previous to
the patch in question just for reference:

inet_ctl_sock_create
  sock_create_kern
__sock_create
  pf->create (inet_create)
sk_alloc
  sk_prot_alloc
security_sk_alloc()


This WAS working properly.  All of it.  The equivalent struct sock was
being created and allocated in inet_create(), which called to
sk_alloc->sk_prot_alloc->security_sk_alloc().  We all agree that
failing to call security_sk_alloc() is the first regression
introduced.

The second regression was the labeling issue.  There was a call to
security_socket_post_create (from __sock_create) which was properly
setting the labels on both the socket and sock.  This new patch broke
that as well.  We don't expose an equivalent
security_sock_post_create() interface in the LSM currently, and until
we do, this can't be fixed.  It's why I say it should be reverted.

I have a patch I'm testing right now which takes care of the first
part the way I like (and yes, I'm doing the allocation on the correct
number node).  It basically looks like so:

+   for_each_possible_cpu(cpu) {
+   sock = _cpu(unicast_sock, cpu);
+   rc = security_sk_alloc(>sk, PF_INET, GFP_KERNEL,
cpu_to_node(cpu));
+   if (rc)
+   return rc;
+   }

I'm going to work right now on exposing the equivalent struct sock LSM
interface so we can call that as well.  But it's going to take me a
bit.  Attached is the patch just to (hopefully untested) shut up the
panic.

-Eric

On Thu, Aug 9, 2012 at 10:50 AM, Eric Dumazet  wrote:
> From: Eric Dumazet 
>
> commit be9f4a44e7d41cee (ipv4: tcp: remove per net tcp_sock) added a
> selinux regression, reported and bisected by John Stultz
>
> selinux_ip_postroute_compat() expect to find a valid sk->sk_security
> pointer, but this field is NULL for unicast_sock
>
> Fix this by adding a new 'kernel' parameter to security_sk_alloc(),
> set to true if socket might already have a valid sk->sk_security
> pointer. ip_send_unicast_reply() uses a percpu fake socket, so the first
> call to security_sk_alloc() will populate sk->sk_security pointer,
> subsequent ones will reuse existing context.
>
> Reported-by: John Stultz 
> Bisected-by: John Stultz 
> Signed-off-by: Eric Dumazet 
> Cc: Paul Moore 
> Cc: Eric Paris 
> Cc: "Serge E. Hallyn" 
> ---
>  include/linux/security.h   |6 +++---
>  net/core/sock.c|2 +-
>  net/ipv4/ip_output.c   |4 +++-
>  security/security.c|4 ++--
>  security/selinux/hooks.c   |5 -
>  security/smack/smack_lsm.c |   10 --
>  6 files changed, 21 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 4e5a73c..4d8e454 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -1601,7 +1601,7 @@ struct security_operations {
> int (*socket_sock_rcv_skb) (struct sock *sk, struct sk_buff *skb);
> int (*socket_getpeersec_stream) (struct socket *sock, char __user 
> *optval, int __user *optlen, unsigned len);
> int (*socket_getpeersec_dgram) (struct socket *sock, struct sk_buff 
> *skb, u32 *secid);
> -   int (*sk_alloc_security) (struct sock *sk, int family, gfp_t 
> priority);
> +   int (*sk_alloc_security) (struct sock *sk, int family, gfp_t 
> priority, bool kernel);
> void (*sk_free_security) (struct sock *sk);
> void (*sk_clone_security) (const struct sock *sk, struct sock *newsk);
> void (*sk_getsecid) (struct sock *sk, u32 *secid);
> @@ -2539,7 +2539,7 @@ int security_sock_rcv_skb(struct sock *sk, struct 
> sk_buff *skb);
>  int security_socket_getpeersec_stream(struct socket *sock, char __user 
> *optval,
>   int __user *optlen, unsigned len);
>  int security_socket_getpeersec_dgram(struct socket *sock, struct sk_buff 
> *skb, u32 *secid);
> -int security_sk_alloc(struct sock *sk, int family, gfp_t priority);
> +int security_sk_alloc(struct sock *sk, int family, gfp_t priority, bool 
> kernel);
>  void security_sk_free(struct sock *sk);
>  void security_sk_clone(const struct sock *sk, struct sock *newsk);
>  void security_sk_classify_flow(struct sock *sk, struct flowi *fl);
> @@ -2667,7 +2667,7 @@ static inline int 
> security_socket_getpeersec_dgram(struct socket *sock, struct s
> return -ENOPROTOOPT;
>  }
>
> -static inline int security_sk_alloc(struct sock *sk, int family, gfp_t 
> priority)
> +static inline int security_sk_alloc(struct sock *sk, int family, gfp_t 
> priority, bool kernel)
>  {
> return 0;
>  }
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 8f67ced..e00cadf 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -1186,7 +1186,7 @@ static struct 

Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Eric Dumazet
On Thu, 2012-08-09 at 12:05 -0400, Eric Paris wrote:
> On Thu, Aug 9, 2012 at 11:36 AM, Eric Dumazet  wrote:
> > On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote:
> >
> >> Is is possible to do the call to security_sk_alloc() in the ip_init() 
> >> function
> >> or does the per-cpu nature of the socket make this a pain?
> >>
> >
> > Its a pain, if we want NUMA affinity.
> >
> > Here, each cpu should get memory from its closest node.
> 
> I really really don't like it.  I won't say NAK, but it is the first
> and only place in the kernel where I believe we allocate an object and
> don't allocate the security blob until some random later point in
> time.

...

>   If it is such a performance issue to have the security blob in
> the same numa node, isn't adding a number of branches and putting this
> function call on every output at least as bad?  Aren't we discouraged
> from GFP_ATOMIC?  In __init we can use GFP_KERNEL.

What a big deal. Its done _once_ time per cpu, and this is so small blob
of memory you'll have to show us one single failure out of one million
boots.

If the security_sk_alloc() fails, we dont care. We are about sending a
RESET or ACK packet. They can be lost by the network, or even skb
allocation can fail. Nobody ever noticed and complained.

Every time we accept() a new socket (and call security_sk_alloc()), its
done under soft irq, thus GFP_ATOMIC, and you didn't complain yet, while
a socket needs about 2 Kbytes of memory...

> 
> This still doesn't fix these sockets entirely.  We now have the
> security blob allocated, but it was never set to something useful.
> Paul, are you looking into this?  This is a bandaide, not a fix
> 

Please do so, on a followup patch, dont pretend I must fix all this
stuff.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Paul Moore
On Thu, Aug 9, 2012 at 12:05 PM, Eric Paris  wrote:
> Paul, are you looking into this?  This is a bandaide, not a fix

Yep, I mentioned this a few times in the other thread.  The problem is
there is not going to be an easy fix for the labeling so I'd rather we
see this patch, or something like it, go in now to resolve the kernel
panic, and fix the labeling later.

-- 
paul moore
www.paul-moore.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Eric Paris
On Thu, Aug 9, 2012 at 11:36 AM, Eric Dumazet  wrote:
> On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote:
>
>> Is is possible to do the call to security_sk_alloc() in the ip_init() 
>> function
>> or does the per-cpu nature of the socket make this a pain?
>>
>
> Its a pain, if we want NUMA affinity.
>
> Here, each cpu should get memory from its closest node.

I really really don't like it.  I won't say NAK, but it is the first
and only place in the kernel where I believe we allocate an object and
don't allocate the security blob until some random later point in
time.  If it is such a performance issue to have the security blob in
the same numa node, isn't adding a number of branches and putting this
function call on every output at least as bad?  Aren't we discouraged
from GFP_ATOMIC?  In __init we can use GFP_KERNEL.

This still doesn't fix these sockets entirely.  We now have the
security blob allocated, but it was never set to something useful.
Paul, are you looking into this?  This is a bandaide, not a fix

-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Paul Moore
On Thu, Aug 9, 2012 at 11:36 AM, Eric Dumazet  wrote:
> On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote:
>
>> Is is possible to do the call to security_sk_alloc() in the ip_init() 
>> function
>> or does the per-cpu nature of the socket make this a pain?
>>
>
> Its a pain, if we want NUMA affinity.
>
> Here, each cpu should get memory from its closest node.

Okay, makes sense.

Acked-by: Paul Moore 

-- 
paul moore
www.paul-moore.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Eric Dumazet
On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote:

> Is is possible to do the call to security_sk_alloc() in the ip_init() 
> function 
> or does the per-cpu nature of the socket make this a pain?
> 

Its a pain, if we want NUMA affinity.

Here, each cpu should get memory from its closest node.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Paul Moore
On Thursday, August 09, 2012 04:50:33 PM Eric Dumazet wrote:
> From: Eric Dumazet 
> 
> commit be9f4a44e7d41cee (ipv4: tcp: remove per net tcp_sock) added a
> selinux regression, reported and bisected by John Stultz
> 
> selinux_ip_postroute_compat() expect to find a valid sk->sk_security
> pointer, but this field is NULL for unicast_sock
> 
> Fix this by adding a new 'kernel' parameter to security_sk_alloc(),
> set to true if socket might already have a valid sk->sk_security
> pointer. ip_send_unicast_reply() uses a percpu fake socket, so the first
> call to security_sk_alloc() will populate sk->sk_security pointer,
> subsequent ones will reuse existing context.
> 
> Reported-by: John Stultz 
> Bisected-by: John Stultz 
> Signed-off-by: Eric Dumazet 
> Cc: Paul Moore 
> Cc: Eric Paris 
> Cc: "Serge E. Hallyn" 

...

> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 76dde25..b233d6e 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -1524,6 +1524,8 @@ void ip_send_unicast_reply(struct net *net, struct
> sk_buff *skb, __be32 daddr, sk->sk_priority = skb->priority;
>   sk->sk_protocol = ip_hdr(skb)->protocol;
>   sk->sk_bound_dev_if = arg->bound_dev_if;
> + if (security_sk_alloc(sk, PF_INET, GFP_ATOMIC, true))
> + goto out;
>   sock_net_set(sk, net);
>   __skb_queue_head_init(>sk_write_queue);
>   sk->sk_sndbuf = sysctl_wmem_default;

Is is possible to do the call to security_sk_alloc() in the ip_init() function 
or does the per-cpu nature of the socket make this a pain?

-- 
paul moore
www.paul-moore.com

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Paul Moore
On Thursday, August 09, 2012 04:50:33 PM Eric Dumazet wrote:
 From: Eric Dumazet eduma...@google.com
 
 commit be9f4a44e7d41cee (ipv4: tcp: remove per net tcp_sock) added a
 selinux regression, reported and bisected by John Stultz
 
 selinux_ip_postroute_compat() expect to find a valid sk-sk_security
 pointer, but this field is NULL for unicast_sock
 
 Fix this by adding a new 'kernel' parameter to security_sk_alloc(),
 set to true if socket might already have a valid sk-sk_security
 pointer. ip_send_unicast_reply() uses a percpu fake socket, so the first
 call to security_sk_alloc() will populate sk-sk_security pointer,
 subsequent ones will reuse existing context.
 
 Reported-by: John Stultz johns...@us.ibm.com
 Bisected-by: John Stultz johns...@us.ibm.com
 Signed-off-by: Eric Dumazet eduma...@google.com
 Cc: Paul Moore p...@paul-moore.com
 Cc: Eric Paris epa...@parisplace.org
 Cc: Serge E. Hallyn se...@hallyn.com

...

 diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
 index 76dde25..b233d6e 100644
 --- a/net/ipv4/ip_output.c
 +++ b/net/ipv4/ip_output.c
 @@ -1524,6 +1524,8 @@ void ip_send_unicast_reply(struct net *net, struct
 sk_buff *skb, __be32 daddr, sk-sk_priority = skb-priority;
   sk-sk_protocol = ip_hdr(skb)-protocol;
   sk-sk_bound_dev_if = arg-bound_dev_if;
 + if (security_sk_alloc(sk, PF_INET, GFP_ATOMIC, true))
 + goto out;
   sock_net_set(sk, net);
   __skb_queue_head_init(sk-sk_write_queue);
   sk-sk_sndbuf = sysctl_wmem_default;

Is is possible to do the call to security_sk_alloc() in the ip_init() function 
or does the per-cpu nature of the socket make this a pain?

-- 
paul moore
www.paul-moore.com

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Eric Dumazet
On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote:

 Is is possible to do the call to security_sk_alloc() in the ip_init() 
 function 
 or does the per-cpu nature of the socket make this a pain?
 

Its a pain, if we want NUMA affinity.

Here, each cpu should get memory from its closest node.



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Paul Moore
On Thu, Aug 9, 2012 at 11:36 AM, Eric Dumazet eric.duma...@gmail.com wrote:
 On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote:

 Is is possible to do the call to security_sk_alloc() in the ip_init() 
 function
 or does the per-cpu nature of the socket make this a pain?


 Its a pain, if we want NUMA affinity.

 Here, each cpu should get memory from its closest node.

Okay, makes sense.

Acked-by: Paul Moore p...@paul-moore.com

-- 
paul moore
www.paul-moore.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Eric Paris
On Thu, Aug 9, 2012 at 11:36 AM, Eric Dumazet eric.duma...@gmail.com wrote:
 On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote:

 Is is possible to do the call to security_sk_alloc() in the ip_init() 
 function
 or does the per-cpu nature of the socket make this a pain?


 Its a pain, if we want NUMA affinity.

 Here, each cpu should get memory from its closest node.

I really really don't like it.  I won't say NAK, but it is the first
and only place in the kernel where I believe we allocate an object and
don't allocate the security blob until some random later point in
time.  If it is such a performance issue to have the security blob in
the same numa node, isn't adding a number of branches and putting this
function call on every output at least as bad?  Aren't we discouraged
from GFP_ATOMIC?  In __init we can use GFP_KERNEL.

This still doesn't fix these sockets entirely.  We now have the
security blob allocated, but it was never set to something useful.
Paul, are you looking into this?  This is a bandaide, not a fix

-Eric
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Paul Moore
On Thu, Aug 9, 2012 at 12:05 PM, Eric Paris epa...@parisplace.org wrote:
 Paul, are you looking into this?  This is a bandaide, not a fix

Yep, I mentioned this a few times in the other thread.  The problem is
there is not going to be an easy fix for the labeling so I'd rather we
see this patch, or something like it, go in now to resolve the kernel
panic, and fix the labeling later.

-- 
paul moore
www.paul-moore.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Eric Dumazet
On Thu, 2012-08-09 at 12:05 -0400, Eric Paris wrote:
 On Thu, Aug 9, 2012 at 11:36 AM, Eric Dumazet eric.duma...@gmail.com wrote:
  On Thu, 2012-08-09 at 11:07 -0400, Paul Moore wrote:
 
  Is is possible to do the call to security_sk_alloc() in the ip_init() 
  function
  or does the per-cpu nature of the socket make this a pain?
 
 
  Its a pain, if we want NUMA affinity.
 
  Here, each cpu should get memory from its closest node.
 
 I really really don't like it.  I won't say NAK, but it is the first
 and only place in the kernel where I believe we allocate an object and
 don't allocate the security blob until some random later point in
 time.

...

   If it is such a performance issue to have the security blob in
 the same numa node, isn't adding a number of branches and putting this
 function call on every output at least as bad?  Aren't we discouraged
 from GFP_ATOMIC?  In __init we can use GFP_KERNEL.

What a big deal. Its done _once_ time per cpu, and this is so small blob
of memory you'll have to show us one single failure out of one million
boots.

If the security_sk_alloc() fails, we dont care. We are about sending a
RESET or ACK packet. They can be lost by the network, or even skb
allocation can fail. Nobody ever noticed and complained.

Every time we accept() a new socket (and call security_sk_alloc()), its
done under soft irq, thus GFP_ATOMIC, and you didn't complain yet, while
a socket needs about 2 Kbytes of memory...

 
 This still doesn't fix these sockets entirely.  We now have the
 security blob allocated, but it was never set to something useful.
 Paul, are you looking into this?  This is a bandaide, not a fix
 

Please do so, on a followup patch, dont pretend I must fix all this
stuff.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Eric Paris
NAK.

I personally think commit be9f4a44e7d41cee should be reverted until it
is fixed.  Let me explain what all I believe it broke and how.

Old callchain of the creation of the 'equivalent' socket previous to
the patch in question just for reference:

inet_ctl_sock_create
  sock_create_kern
__sock_create
  pf-create (inet_create)
sk_alloc
  sk_prot_alloc
security_sk_alloc()


This WAS working properly.  All of it.  The equivalent struct sock was
being created and allocated in inet_create(), which called to
sk_alloc-sk_prot_alloc-security_sk_alloc().  We all agree that
failing to call security_sk_alloc() is the first regression
introduced.

The second regression was the labeling issue.  There was a call to
security_socket_post_create (from __sock_create) which was properly
setting the labels on both the socket and sock.  This new patch broke
that as well.  We don't expose an equivalent
security_sock_post_create() interface in the LSM currently, and until
we do, this can't be fixed.  It's why I say it should be reverted.

I have a patch I'm testing right now which takes care of the first
part the way I like (and yes, I'm doing the allocation on the correct
number node).  It basically looks like so:

+   for_each_possible_cpu(cpu) {
+   sock = per_cpu(unicast_sock, cpu);
+   rc = security_sk_alloc(sock-sk, PF_INET, GFP_KERNEL,
cpu_to_node(cpu));
+   if (rc)
+   return rc;
+   }

I'm going to work right now on exposing the equivalent struct sock LSM
interface so we can call that as well.  But it's going to take me a
bit.  Attached is the patch just to (hopefully untested) shut up the
panic.

-Eric

On Thu, Aug 9, 2012 at 10:50 AM, Eric Dumazet eric.duma...@gmail.com wrote:
 From: Eric Dumazet eduma...@google.com

 commit be9f4a44e7d41cee (ipv4: tcp: remove per net tcp_sock) added a
 selinux regression, reported and bisected by John Stultz

 selinux_ip_postroute_compat() expect to find a valid sk-sk_security
 pointer, but this field is NULL for unicast_sock

 Fix this by adding a new 'kernel' parameter to security_sk_alloc(),
 set to true if socket might already have a valid sk-sk_security
 pointer. ip_send_unicast_reply() uses a percpu fake socket, so the first
 call to security_sk_alloc() will populate sk-sk_security pointer,
 subsequent ones will reuse existing context.

 Reported-by: John Stultz johns...@us.ibm.com
 Bisected-by: John Stultz johns...@us.ibm.com
 Signed-off-by: Eric Dumazet eduma...@google.com
 Cc: Paul Moore p...@paul-moore.com
 Cc: Eric Paris epa...@parisplace.org
 Cc: Serge E. Hallyn se...@hallyn.com
 ---
  include/linux/security.h   |6 +++---
  net/core/sock.c|2 +-
  net/ipv4/ip_output.c   |4 +++-
  security/security.c|4 ++--
  security/selinux/hooks.c   |5 -
  security/smack/smack_lsm.c |   10 --
  6 files changed, 21 insertions(+), 10 deletions(-)

 diff --git a/include/linux/security.h b/include/linux/security.h
 index 4e5a73c..4d8e454 100644
 --- a/include/linux/security.h
 +++ b/include/linux/security.h
 @@ -1601,7 +1601,7 @@ struct security_operations {
 int (*socket_sock_rcv_skb) (struct sock *sk, struct sk_buff *skb);
 int (*socket_getpeersec_stream) (struct socket *sock, char __user 
 *optval, int __user *optlen, unsigned len);
 int (*socket_getpeersec_dgram) (struct socket *sock, struct sk_buff 
 *skb, u32 *secid);
 -   int (*sk_alloc_security) (struct sock *sk, int family, gfp_t 
 priority);
 +   int (*sk_alloc_security) (struct sock *sk, int family, gfp_t 
 priority, bool kernel);
 void (*sk_free_security) (struct sock *sk);
 void (*sk_clone_security) (const struct sock *sk, struct sock *newsk);
 void (*sk_getsecid) (struct sock *sk, u32 *secid);
 @@ -2539,7 +2539,7 @@ int security_sock_rcv_skb(struct sock *sk, struct 
 sk_buff *skb);
  int security_socket_getpeersec_stream(struct socket *sock, char __user 
 *optval,
   int __user *optlen, unsigned len);
  int security_socket_getpeersec_dgram(struct socket *sock, struct sk_buff 
 *skb, u32 *secid);
 -int security_sk_alloc(struct sock *sk, int family, gfp_t priority);
 +int security_sk_alloc(struct sock *sk, int family, gfp_t priority, bool 
 kernel);
  void security_sk_free(struct sock *sk);
  void security_sk_clone(const struct sock *sk, struct sock *newsk);
  void security_sk_classify_flow(struct sock *sk, struct flowi *fl);
 @@ -2667,7 +2667,7 @@ static inline int 
 security_socket_getpeersec_dgram(struct socket *sock, struct s
 return -ENOPROTOOPT;
  }

 -static inline int security_sk_alloc(struct sock *sk, int family, gfp_t 
 priority)
 +static inline int security_sk_alloc(struct sock *sk, int family, gfp_t 
 priority, bool kernel)
  {
 return 0;
  }
 diff --git a/net/core/sock.c b/net/core/sock.c
 index 8f67ced..e00cadf 100644
 --- 

Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Paul Moore
On Thu, Aug 9, 2012 at 4:06 PM, Eric Paris epa...@parisplace.org wrote:
 I'm going to work right now on exposing the equivalent struct sock LSM
 interface so we can call that as well.  But it's going to take me a
 bit.

Before you go too far down this path, can you elaborate on what
exactly you mean by the above?

I'm asking because I'm not convinced the labeling, either the old way
or the new way, was 100% correct and I think we're going to need to
change things regardless.  I'm just not sure what the right solution
is just yet.

-- 
paul moore
www.paul-moore.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Eric Dumazet
On Thu, 2012-08-09 at 16:06 -0400, Eric Paris wrote:
 NAK.
 
 I personally think commit be9f4a44e7d41cee should be reverted until it
 is fixed.  Let me explain what all I believe it broke and how.
 

Suggesting to revert this commit while we have known working fixes is a
bit of strange reaction.

I understand you are upset, but I believe we tried to fix it.

 Old callchain of the creation of the 'equivalent' socket previous to
 the patch in question just for reference:
 
 inet_ctl_sock_create
   sock_create_kern
 __sock_create
   pf-create (inet_create)
 sk_alloc
   sk_prot_alloc
 security_sk_alloc()
 
 
 This WAS working properly.  All of it. 

Nobody denies it. But acknowledge my patch uncovered a fundamental
issue.

What kind of 'security module' can decide to let RST packets being sent
or not, on a global scale ? (one socket for the whole machine)

smack_sk_alloc_security() uses smk_of_current() : What can be the
meaning of smk_of_current() in the context of 'kernel' sockets...

Your patch tries to maintain this status quo.

In fact I suggest the following one liner patch, unless you can really
demonstrate what can be the meaning of providing a fake socket for these
packets.

This mess only happened because ip_append_data()/ip_push_pending_frames()
are so complex and use an underlying socket.

But this socket should not be ever used outside of its scope.

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 76dde25..ec410e0 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct 
sk_buff *skb, __be32 daddr,
  arg-csumoffset) = csum_fold(csum_add(nskb-csum,
arg-csum));
nskb-ip_summed = CHECKSUM_NONE;
+   skb_orphan(nskb);
skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb));
ip_push_pending_frames(sk, fl4);
}


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Casey Schaufler
On 8/9/2012 2:29 PM, Eric Dumazet wrote:
 On Thu, 2012-08-09 at 16:06 -0400, Eric Paris wrote:
 NAK.

 I personally think commit be9f4a44e7d41cee should be reverted until it
 is fixed.  Let me explain what all I believe it broke and how.

 Suggesting to revert this commit while we have known working fixes is a
 bit of strange reaction.

A couple of potential short term workarounds have been identified,
but no one is happy with them for the long term. That does not
qualify as a working fix in engineering terms.

 I understand you are upset, but I believe we tried to fix it.

 Old callchain of the creation of the 'equivalent' socket previous to
 the patch in question just for reference:

 inet_ctl_sock_create
   sock_create_kern
 __sock_create
   pf-create (inet_create)
 sk_alloc
   sk_prot_alloc
 security_sk_alloc()


 This WAS working properly.  All of it. 
 Nobody denies it. But acknowledge my patch uncovered a fundamental
 issue.

 What kind of 'security module' can decide to let RST packets being sent
 or not, on a global scale ? (one socket for the whole machine)

The short answer is any security module that wants to.

And before we go any further, I'm a little surprised that
SELinux doesn't do this already.


 smack_sk_alloc_security() uses smk_of_current() : What can be the
 meaning of smk_of_current() in the context of 'kernel' sockets...

Yes, and all of it's callers - to date - have had an appropriate
value of current. It is using the API in the way it is supposed to.
It is assuming a properly formed socket. You want to give it a
cobbled together partial socket structure without task context.
Your predecessor did not have this problem.


 Your patch tries to maintain this status quo.

 In fact I suggest the following one liner patch, unless you can really
 demonstrate what can be the meaning of providing a fake socket for these
 packets.

 This mess only happened because ip_append_data()/ip_push_pending_frames()
 are so complex and use an underlying socket.

 But this socket should not be ever used outside of its scope.

 diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
 index 76dde25..ec410e0 100644
 --- a/net/ipv4/ip_output.c
 +++ b/net/ipv4/ip_output.c
 @@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct 
 sk_buff *skb, __be32 daddr,
 arg-csumoffset) = csum_fold(csum_add(nskb-csum,
   arg-csum));
   nskb-ip_summed = CHECKSUM_NONE;
 + skb_orphan(nskb);
   skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb));
   ip_push_pending_frames(sk, fl4);
   }




--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Eric Dumazet
On Thu, 2012-08-09 at 14:53 -0700, Casey Schaufler wrote:
 On 8/9/2012 2:29 PM, Eric Dumazet wrote:

  smack_sk_alloc_security() uses smk_of_current() : What can be the
  meaning of smk_of_current() in the context of 'kernel' sockets...
 
 Yes, and all of it's callers - to date - have had an appropriate
 value of current. It is using the API in the way it is supposed to.
 It is assuming a properly formed socket. You want to give it a
 cobbled together partial socket structure without task context.
 Your predecessor did not have this problem.

My predecessor ? You mean before the patch ?

tcp socket was preallocated by at kernel boot time.

What is the 'user' owning this socket ?

You guys focus on an implementation detail of TCP stack.
You should never use this fake socket.

I repeat : There are no true socket for these control packets.

If you want them, then you'll have to add fields in timewait socket.

I can decide to rewrite the whole thing just building a TCP packet on
its own, and send it without any fake socket.

Some guy 15 years ago tried to reuse some high level functions, able to
build super packets and use sophisticated tricks, while we only want so
send a 40 or 60 bytes packet.



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread Casey Schaufler
On 8/9/2012 3:05 PM, Eric Dumazet wrote:
 On Thu, 2012-08-09 at 14:53 -0700, Casey Schaufler wrote:
 On 8/9/2012 2:29 PM, Eric Dumazet wrote:
 smack_sk_alloc_security() uses smk_of_current() : What can be the
 meaning of smk_of_current() in the context of 'kernel' sockets...
 Yes, and all of it's callers - to date - have had an appropriate
 value of current. It is using the API in the way it is supposed to.
 It is assuming a properly formed socket. You want to give it a
 cobbled together partial socket structure without task context.
 Your predecessor did not have this problem.
 My predecessor ? You mean before the patch ?

 tcp socket was preallocated by at kernel boot time.

 What is the 'user' owning this socket ?

 You guys focus on an implementation detail of TCP stack.
 You should never use this fake socket.

 I repeat : There are no true socket for these control packets.

 If you want them, then you'll have to add fields in timewait socket.

 I can decide to rewrite the whole thing just building a TCP packet on
 its own, and send it without any fake socket.

 Some guy 15 years ago tried to reuse some high level functions, able to
 build super packets and use sophisticated tricks, while we only want so
 send a 40 or 60 bytes packet.

OK, fine. You have an optimization. I'm good with that. Just don't
expect that the entire software stack you are taking advantage of
is going to change to accommodate your special case.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipv4: tcp: security_sk_alloc() needed for unicast_sock

2012-08-09 Thread David Miller
From: Eric Dumazet eric.duma...@gmail.com
Date: Thu, 09 Aug 2012 23:29:03 +0200

 diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
 index 76dde25..ec410e0 100644
 --- a/net/ipv4/ip_output.c
 +++ b/net/ipv4/ip_output.c
 @@ -1536,6 +1536,7 @@ void ip_send_unicast_reply(struct net *net, struct 
 sk_buff *skb, __be32 daddr,
 arg-csumoffset) = csum_fold(csum_add(nskb-csum,
   arg-csum));
   nskb-ip_summed = CHECKSUM_NONE;
 + skb_orphan(nskb);
   skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb));
   ip_push_pending_frames(sk, fl4);
   }
 

This is definitely the best fix, please submit this formally.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/