Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-30 Thread James Morris
On Thu, 30 Nov 2017, Eric Dumazet wrote:

> On Wed, 2017-11-29 at 19:16 -0800, Casey Schaufler wrote:
> > On 11/29/2017 4:31 PM, James Morris wrote:
> > > On Wed, 29 Nov 2017, Casey Schaufler wrote:
> > > 
> > > > I see that there is a proposed fix later in the thread, but I
> > > > don't see
> > > > the patch. Could you send it to me, so I can try it on my
> > > > problem?
> > > 
> > > Forwarded off-list.
> > 
> > The patch does fix the problem I was seeing in Smack.
> 
> Can you guys test the following more complete patch ?
> 
> It should cover IPv4 and IPv6, and also the corner cases.


Tested-by: James Morris 



-- 
James Morris




Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-30 Thread Casey Schaufler
On 11/30/2017 9:57 AM, Eric Dumazet wrote:
> On Thu, 2017-11-30 at 10:30 -0700, David Ahern wrote:
>> On 11/30/17 8:44 AM, David Ahern wrote:
>>> On 11/30/17 3:50 AM, Eric Dumazet wrote:
>>>> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>>>>  
>>>>    th = (const struct tcphdr *)skb->data;
>>>>    iph = ip_hdr(skb);
>>>> -  /* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
>>>> -   * barrier() makes sure compiler wont play fool^Waliasing games.
>>>> -   */
>>>> -  memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
>>>> -          sizeof(struct inet_skb_parm));
>>>> -  barrier();
>>>> -
>>>> -  TCP_SKB_CB(skb)->seq = ntohl(th->seq);
>>>> -  TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
>>>> -                              skb->len - th->doff * 4);
>>>> -  TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
>>>> -  TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
>>>> -  TCP_SKB_CB(skb)->tcp_tw_isn = 0;
>>>> -  TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
>>>> -  TCP_SKB_CB(skb)->sacked  = 0;
>>>> -  TCP_SKB_CB(skb)->has_rxtstamp =
>>>> -                  skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
>>>> -
>>>>  lookup:
>>>>    sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>>>>                           th->dest, sdif, &refcounted);
>>> I believe moving the above is going to affect lookups with VRF. Let
>>> me
>>> take a look before this gets committed.
>>>
>> Eric:
>>
>> Can you add this to the patch? Fixes socket lookups with VRF which
>> stashes a flag in the cb.

I've done my testing and it works both ways for Smack.


>>
>> Thanks,
>>
>> diff --git a/include/net/tcp.h b/include/net/tcp.h
>> index 4e09398009c1..6c020015d556 100644
>> --- a/include/net/tcp.h
>> +++ b/include/net/tcp.h
>> @@ -849,7 +849,7 @@ static inline bool inet_exact_dif_match(struct
>> net
>> *net, struct sk_buff *skb)
>>  {
>>  #if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
>> if (!net->ipv4.sysctl_tcp_l3mdev_accept &&
>> -   skb && ipv4_l3mdev_skb(TCP_SKB_CB(skb)->header.h4.flags))
>> +   skb && ipv4_l3mdev_skb(IPCB(skb)->flags))
>> return true;
>>  #endif
>> return false;
>
> I wonder if this should not be in a separate patch ?
>
> Bug was added in 971f10eca186cab238c49daa91f703c5a001b0b1 ("tcp: better
> TCP_SKB_CB layout to reduce cache line misses")  in linux 3.18
>
> While VRF was added later.
>
> If you agree, I will prepare a patch series, with different Fixes tag
> so that David can decide which patch needs to be backported into each
> stable version.
>
> Thanks.
>
>



Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-30 Thread David Ahern
On 11/30/17 10:57 AM, Eric Dumazet wrote:
> I wonder if this should not be in a separate patch ?
> 
> Bug was added in 971f10eca186cab238c49daa91f703c5a001b0b1 ("tcp: better
> TCP_SKB_CB layout to reduce cache line misses")  in linux 3.18
> 
> While VRF was added later.
> 
> If you agree, I will prepare a patch series, with different Fixes tag
> so that David can decide which patch needs to be backported into each
> stable version.
> 

That sounds fine to me.


Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-30 Thread Eric Dumazet
On Thu, 2017-11-30 at 10:30 -0700, David Ahern wrote:
> On 11/30/17 8:44 AM, David Ahern wrote:
> > On 11/30/17 3:50 AM, Eric Dumazet wrote:
> > > @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > >  
> > >   th = (const struct tcphdr *)skb->data;
> > >   iph = ip_hdr(skb);
> > > > - /* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
> > > > -  * barrier() makes sure compiler wont play fool^Waliasing games.
> > > > -  */
> > > > - memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> > > > -         sizeof(struct inet_skb_parm));
> > > > - barrier();
> > > > -
> > > > - TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> > > > - TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> > > > -                             skb->len - th->doff * 4);
> > > > - TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> > > > - TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> > > > - TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> > > > - TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> > > > - TCP_SKB_CB(skb)->sacked  = 0;
> > > > - TCP_SKB_CB(skb)->has_rxtstamp =
> > > > -                 skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> > > > -
> > > >  lookup:
> > > >   sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
> > > >                          th->dest, sdif, &refcounted);
> > 
> > I believe moving the above is going to affect lookups with VRF. Let
> > me
> > take a look before this gets committed.
> > 
> 
> Eric:
> 
> Can you add this to the patch? Fixes socket lookups with VRF which
> stashes a flag in the cb.
> 
> Thanks,
> 
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 4e09398009c1..6c020015d556 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -849,7 +849,7 @@ static inline bool inet_exact_dif_match(struct
> net
> *net, struct sk_buff *skb)
>  {
>  #if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
> if (!net->ipv4.sysctl_tcp_l3mdev_accept &&
> -   skb && ipv4_l3mdev_skb(TCP_SKB_CB(skb)->header.h4.flags))
> +   skb && ipv4_l3mdev_skb(IPCB(skb)->flags))
> return true;
>  #endif
> return false;


I wonder if this should not be in a separate patch ?

Bug was added in 971f10eca186cab238c49daa91f703c5a001b0b1 ("tcp: better
TCP_SKB_CB layout to reduce cache line misses")  in linux 3.18

While VRF was added later.

If you agree, I will prepare a patch series, with different Fixes tag
so that David can decide which patch needs to be backported into each
stable version.

Thanks.



Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-30 Thread David Ahern
On 11/30/17 8:44 AM, David Ahern wrote:
> On 11/30/17 3:50 AM, Eric Dumazet wrote:
>> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>>  
>>  th = (const struct tcphdr *)skb->data;
>>  iph = ip_hdr(skb);
>> -/* This is tricky : We move IPCB at its correct location into 
>> TCP_SKB_CB()
>> - * barrier() makes sure compiler wont play fool^Waliasing games.
>> - */
>> -memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
>> -sizeof(struct inet_skb_parm));
>> -barrier();
>> -
>> -TCP_SKB_CB(skb)->seq = ntohl(th->seq);
>> -TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
>> -skb->len - th->doff * 4);
>> -TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
>> -TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
>> -TCP_SKB_CB(skb)->tcp_tw_isn = 0;
>> -TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
>> -TCP_SKB_CB(skb)->sacked  = 0;
>> -TCP_SKB_CB(skb)->has_rxtstamp =
>> -skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
>> -
>>  lookup:
>>  sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>> th->dest, sdif, &refcounted);
> 
> I believe moving the above is going to affect lookups with VRF. Let me
> take a look before this gets committed.
> 

Eric:

Can you add this to the patch? Fixes socket lookups with VRF which
stashes a flag in the cb.

Thanks,

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 4e09398009c1..6c020015d556 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -849,7 +849,7 @@ static inline bool inet_exact_dif_match(struct net *net, struct sk_buff *skb)
 {
 #if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
if (!net->ipv4.sysctl_tcp_l3mdev_accept &&
-   skb && ipv4_l3mdev_skb(TCP_SKB_CB(skb)->header.h4.flags))
+   skb && ipv4_l3mdev_skb(IPCB(skb)->flags))
return true;
 #endif
return false;
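
Why the read has to move from TCP_SKB_CB(skb)->header.h4.flags to IPCB(skb)->flags can be shown with a small user-space model. This is only a sketch under assumptions: the struct layouts, the 48-byte buffer, the MODEL_L3SLAVE bit value and every function name below are hypothetical stand-ins, not the kernel definitions. It models the state at socket lookup once the cb fill has been moved later, when the IP control block still sits at the front of cb[] and the TCP slot has not been populated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_L3SLAVE 0x80u /* illustrative bit standing in for the VRF flag */

/* Hypothetical stand-in for struct inet_skb_parm. */
struct vrf_ip_parm {
    int32_t  iif;
    uint16_t flags;
    uint16_t pad;
};

/* Hypothetical stand-in for struct tcp_skb_cb: TCP state first, IP copy last. */
struct vrf_tcp_cb {
    uint32_t seq;
    uint32_t end_seq;
    uint32_t ack_seq;
    struct vrf_ip_parm h4;
};

struct vrf_skb {
    unsigned char cb[48];
};

/* Models IPCB(skb)->flags: the IP control block lives at the front of cb[]. */
static inline uint16_t flags_via_ipcb(const struct vrf_skb *skb)
{
    struct vrf_ip_parm p;
    memcpy(&p, skb->cb, sizeof(p));
    return p.flags;
}

/* Models TCP_SKB_CB(skb)->header.h4.flags: the slot the fill step copies into. */
static inline uint16_t flags_via_tcp_cb(const struct vrf_skb *skb)
{
    struct vrf_ip_parm p;
    memcpy(&p, skb->cb + offsetof(struct vrf_tcp_cb, h4), sizeof(p));
    return p.flags;
}

static inline bool model_l3mdev_skb(uint16_t flags)
{
    return (flags & MODEL_L3SLAVE) != 0;
}

/* Models the IP layer marking a packet as received on a VRF slave device. */
static inline void vrf_mark_slave(struct vrf_skb *skb)
{
    struct vrf_ip_parm p = { .iif = 7, .flags = MODEL_L3SLAVE };
    memset(skb->cb, 0, sizeof(skb->cb));
    memcpy(skb->cb, &p, sizeof(p));
}
```

In this model, after vrf_mark_slave() the flag is visible through flags_via_ipcb() but not through flags_via_tcp_cb(), because nothing has copied the IP control block into the TCP slot yet.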


Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-30 Thread Paul Moore
On Thu, Nov 30, 2017 at 7:47 AM, Paul Moore  wrote:
> On Thu, Nov 30, 2017 at 5:50 AM, Eric Dumazet  wrote:
>> On Wed, 2017-11-29 at 19:16 -0800, Casey Schaufler wrote:
>>> On 11/29/2017 4:31 PM, James Morris wrote:
>>> > On Wed, 29 Nov 2017, Casey Schaufler wrote:
>>> >
>>> > > I see that there is a proposed fix later in the thread, but I
>>> > > don't see
>>> > > the patch. Could you send it to me, so I can try it on my
>>> > > problem?
>>> >
>>> > Forwarded off-list.
>>>
>>> The patch does fix the problem I was seeing in Smack.
>>
>> Can you guys test the following more complete patch ?
>>
>> It should cover IPv4 and IPv6, and also the corner cases.
>>
>> ( Note that I squashed ipv6 fix in
>> https://patchwork.ozlabs.org/patch/842844/ that I spotted while cooking
>> this patch )
>
> Building a test kernel now, although it may take me a few hours to
> test it due to some commitments this morning.

I just realized I forgot to enable KASAN in the build, but I can
verify that the patch doesn't break anything in the selinux-testsuite.

Tested-by: Paul Moore 

>> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
>> index 
>> c6bc0c4d19c624888b0d0b5a4246c7183edf63f5..77ea45da0fe9c746907a312989658af3ad3b198d
>>  100644
>> --- a/net/ipv4/tcp_ipv4.c
>> +++ b/net/ipv4/tcp_ipv4.c
>> @@ -1591,6 +1591,34 @@ int tcp_filter(struct sock *sk, struct sk_buff *skb)
>>  }
>>  EXPORT_SYMBOL(tcp_filter);
>>
>> +static void tcp_v4_restore_cb(struct sk_buff *skb)
>> +{
>> +   memmove(IPCB(skb), &TCP_SKB_CB(skb)->header.h4,
>> +   sizeof(struct inet_skb_parm));
>> +}
>> +
>> +static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
>> +  const struct tcphdr *th)
>> +{
>> +   /* This is tricky : We move IPCB at its correct location into 
>> TCP_SKB_CB()
>> +* barrier() makes sure compiler wont play fool^Waliasing games.
>> +*/
>> +   memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
>> +   sizeof(struct inet_skb_parm));
>> +   barrier();
>> +
>> +   TCP_SKB_CB(skb)->seq = ntohl(th->seq);
>> +   TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin 
>> +
>> +   skb->len - th->doff * 4);
>> +   TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
>> +   TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
>> +   TCP_SKB_CB(skb)->tcp_tw_isn = 0;
>> +   TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
>> +   TCP_SKB_CB(skb)->sacked  = 0;
>> +   TCP_SKB_CB(skb)->has_rxtstamp =
>> +   skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
>> +}
>> +
>>  /*
>>   * From tcp_input.c
>>   */
>> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>>
>> th = (const struct tcphdr *)skb->data;
>> iph = ip_hdr(skb);
>> -   /* This is tricky : We move IPCB at its correct location into 
>> TCP_SKB_CB()
>> -* barrier() makes sure compiler wont play fool^Waliasing games.
>> -*/
>> -   memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
>> -   sizeof(struct inet_skb_parm));
>> -   barrier();
>> -
>> -   TCP_SKB_CB(skb)->seq = ntohl(th->seq);
>> -   TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin 
>> +
>> -   skb->len - th->doff * 4);
>> -   TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
>> -   TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
>> -   TCP_SKB_CB(skb)->tcp_tw_isn = 0;
>> -   TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
>> -   TCP_SKB_CB(skb)->sacked  = 0;
>> -   TCP_SKB_CB(skb)->has_rxtstamp =
>> -   skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
>> -
>>  lookup:
>> sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>>                        th->dest, sdif, &refcounted);
>> @@ -1679,14 +1689,19 @@ int tcp_v4_rcv(struct sk_buff *skb)
>> sock_hold(sk);
>> refcounted = true;
>> nsk = NULL;
>> -   if (!tcp_filter(sk, skb))
>> +   if (!tcp_filter(sk, skb)) {
>> +   th = (const struct tcphdr *)skb->data;
>> +   iph = ip_hdr(skb);
>> +   tcp_v4_fill_cb(skb, iph, th);
>> nsk = tcp_check_req(sk, skb, req, false);
>> +   }
>> if (!nsk) {
>> reqsk_put(req);
>> goto discard_and_relse;
>> }
>> if (nsk == sk) {
>> reqsk_put(req);
>> +   tcp_v4_restore_cb(skb);
>> } else if (tcp_child_process(sk, nsk, skb)) {
>> tcp_v4_send_reset(nsk, skb);
>> goto discard_and_relse;
>> @@ -1712,6 +1727,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
>> goto 

Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-30 Thread David Ahern
On 11/30/17 3:50 AM, Eric Dumazet wrote:
> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  
>   th = (const struct tcphdr *)skb->data;
>   iph = ip_hdr(skb);
> - /* This is tricky : We move IPCB at its correct location into 
> TCP_SKB_CB()
> -  * barrier() makes sure compiler wont play fool^Waliasing games.
> -  */
> - memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> - sizeof(struct inet_skb_parm));
> - barrier();
> -
> - TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> - TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> - skb->len - th->doff * 4);
> - TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> - TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> - TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> - TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> - TCP_SKB_CB(skb)->sacked  = 0;
> - TCP_SKB_CB(skb)->has_rxtstamp =
> - skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> -
>  lookup:
>   sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>  th->dest, sdif, &refcounted);

I believe moving the above is going to affect lookups with VRF. Let me
take a look before this gets committed.



Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-30 Thread Casey Schaufler
On 11/30/2017 2:50 AM, Eric Dumazet wrote:
> On Wed, 2017-11-29 at 19:16 -0800, Casey Schaufler wrote:
>> On 11/29/2017 4:31 PM, James Morris wrote:
>>> On Wed, 29 Nov 2017, Casey Schaufler wrote:
>>>
>>>> I see that there is a proposed fix later in the thread, but I don't see
>>>> the patch. Could you send it to me, so I can try it on my problem?
>>> Forwarded off-list.
>> The patch does fix the problem I was seeing in Smack.
> Can you guys test the following more complete patch ?

My tests are passing. Thank you.

Tested-by: Casey Schaufler 

>
> It should cover IPv4 and IPv6, and also the corner cases.
>
> ( Note that I squashed ipv6 fix in
> https://patchwork.ozlabs.org/patch/842844/ that I spotted while cooking
> this patch )
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 
> c6bc0c4d19c624888b0d0b5a4246c7183edf63f5..77ea45da0fe9c746907a312989658af3ad3b198d
>  100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1591,6 +1591,34 @@ int tcp_filter(struct sock *sk, struct sk_buff *skb)
>  }
>  EXPORT_SYMBOL(tcp_filter);
>  
> +static void tcp_v4_restore_cb(struct sk_buff *skb)
> +{
> + memmove(IPCB(skb), &TCP_SKB_CB(skb)->header.h4,
> + sizeof(struct inet_skb_parm));
> +}
> +
> +static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
> +const struct tcphdr *th)
> +{
> + /* This is tricky : We move IPCB at its correct location into 
> TCP_SKB_CB()
> +  * barrier() makes sure compiler wont play fool^Waliasing games.
> +  */
> + memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> + sizeof(struct inet_skb_parm));
> + barrier();
> +
> + TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> + TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> + skb->len - th->doff * 4);
> + TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> + TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> + TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> + TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> + TCP_SKB_CB(skb)->sacked  = 0;
> + TCP_SKB_CB(skb)->has_rxtstamp =
> + skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> +}
> +
>  /*
>   *   From tcp_input.c
>   */
> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  
>   th = (const struct tcphdr *)skb->data;
>   iph = ip_hdr(skb);
> - /* This is tricky : We move IPCB at its correct location into 
> TCP_SKB_CB()
> -  * barrier() makes sure compiler wont play fool^Waliasing games.
> -  */
> - memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> - sizeof(struct inet_skb_parm));
> - barrier();
> -
> - TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> - TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> - skb->len - th->doff * 4);
> - TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> - TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> - TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> - TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> - TCP_SKB_CB(skb)->sacked  = 0;
> - TCP_SKB_CB(skb)->has_rxtstamp =
> - skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> -
>  lookup:
>   sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>  th->dest, sdif, &refcounted);
> @@ -1679,14 +1689,19 @@ int tcp_v4_rcv(struct sk_buff *skb)
>   sock_hold(sk);
>   refcounted = true;
>   nsk = NULL;
> - if (!tcp_filter(sk, skb))
> + if (!tcp_filter(sk, skb)) {
> + th = (const struct tcphdr *)skb->data;
> + iph = ip_hdr(skb);
> + tcp_v4_fill_cb(skb, iph, th);
>   nsk = tcp_check_req(sk, skb, req, false);
> + }
>   if (!nsk) {
>   reqsk_put(req);
>   goto discard_and_relse;
>   }
>   if (nsk == sk) {
>   reqsk_put(req);
> + tcp_v4_restore_cb(skb);
>   } else if (tcp_child_process(sk, nsk, skb)) {
>   tcp_v4_send_reset(nsk, skb);
>   goto discard_and_relse;
> @@ -1712,6 +1727,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
>   goto discard_and_relse;
>   th = (const struct tcphdr *)skb->data;
>   iph = ip_hdr(skb);
> + tcp_v4_fill_cb(skb, iph, th);
>  
>   skb->dev = NULL;
>  
> @@ -1742,6 +1758,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
>   if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
>   goto discard_it;
>  
> + tcp_v4_fill_cb(skb, iph, th);
> +
>   if (tcp_checksum_complete(skb)) {
>  csum_error:
>   __TCP_INC_STATS(net, TCP_MIB_CSUMERRORS);
> @@ -1768,6 +1786,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
>   goto discard_it;
>   }
>  
> + 

Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-30 Thread Casey Schaufler
On 11/30/2017 2:50 AM, Eric Dumazet wrote:
> On Wed, 2017-11-29 at 19:16 -0800, Casey Schaufler wrote:
>> On 11/29/2017 4:31 PM, James Morris wrote:
>>> On Wed, 29 Nov 2017, Casey Schaufler wrote:
>>>
>>>> I see that there is a proposed fix later in the thread, but I don't see
>>>> the patch. Could you send it to me, so I can try it on my problem?
>>> Forwarded off-list.
>> The patch does fix the problem I was seeing in Smack.
> Can you guys test the following more complete patch ?

Building now. I should have results soon.

>
> It should cover IPv4 and IPv6, and also the corner cases.
>
> ( Note that I squashed ipv6 fix in
> https://patchwork.ozlabs.org/patch/842844/ that I spotted while cooking
> this patch )
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 
> c6bc0c4d19c624888b0d0b5a4246c7183edf63f5..77ea45da0fe9c746907a312989658af3ad3b198d
>  100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1591,6 +1591,34 @@ int tcp_filter(struct sock *sk, struct sk_buff *skb)
>  }
>  EXPORT_SYMBOL(tcp_filter);
>  
> +static void tcp_v4_restore_cb(struct sk_buff *skb)
> +{
> + memmove(IPCB(skb), &TCP_SKB_CB(skb)->header.h4,
> + sizeof(struct inet_skb_parm));
> +}
> +
> +static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
> +const struct tcphdr *th)
> +{
> + /* This is tricky : We move IPCB at its correct location into 
> TCP_SKB_CB()
> +  * barrier() makes sure compiler wont play fool^Waliasing games.
> +  */
> + memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> + sizeof(struct inet_skb_parm));
> + barrier();
> +
> + TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> + TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> + skb->len - th->doff * 4);
> + TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> + TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> + TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> + TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> + TCP_SKB_CB(skb)->sacked  = 0;
> + TCP_SKB_CB(skb)->has_rxtstamp =
> + skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> +}
> +
>  /*
>   *   From tcp_input.c
>   */
> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  
>   th = (const struct tcphdr *)skb->data;
>   iph = ip_hdr(skb);
> - /* This is tricky : We move IPCB at its correct location into 
> TCP_SKB_CB()
> -  * barrier() makes sure compiler wont play fool^Waliasing games.
> -  */
> - memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> - sizeof(struct inet_skb_parm));
> - barrier();
> -
> - TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> - TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> - skb->len - th->doff * 4);
> - TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> - TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> - TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> - TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> - TCP_SKB_CB(skb)->sacked  = 0;
> - TCP_SKB_CB(skb)->has_rxtstamp =
> - skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> -
>  lookup:
>   sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>  th->dest, sdif, &refcounted);
> @@ -1679,14 +1689,19 @@ int tcp_v4_rcv(struct sk_buff *skb)
>   sock_hold(sk);
>   refcounted = true;
>   nsk = NULL;
> - if (!tcp_filter(sk, skb))
> + if (!tcp_filter(sk, skb)) {
> + th = (const struct tcphdr *)skb->data;
> + iph = ip_hdr(skb);
> + tcp_v4_fill_cb(skb, iph, th);
>   nsk = tcp_check_req(sk, skb, req, false);
> + }
>   if (!nsk) {
>   reqsk_put(req);
>   goto discard_and_relse;
>   }
>   if (nsk == sk) {
>   reqsk_put(req);
> + tcp_v4_restore_cb(skb);
>   } else if (tcp_child_process(sk, nsk, skb)) {
>   tcp_v4_send_reset(nsk, skb);
>   goto discard_and_relse;
> @@ -1712,6 +1727,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
>   goto discard_and_relse;
>   th = (const struct tcphdr *)skb->data;
>   iph = ip_hdr(skb);
> + tcp_v4_fill_cb(skb, iph, th);
>  
>   skb->dev = NULL;
>  
> @@ -1742,6 +1758,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
>   if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
>   goto discard_it;
>  
> + tcp_v4_fill_cb(skb, iph, th);
> +
>   if (tcp_checksum_complete(skb)) {
>  csum_error:
>   __TCP_INC_STATS(net, TCP_MIB_CSUMERRORS);
> @@ -1768,6 +1786,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
>   goto discard_it;
>   }
>  
> + tcp_v4_fill_cb(skb, iph, th);
> +
>   if 

Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-30 Thread Paul Moore
On Thu, Nov 30, 2017 at 5:50 AM, Eric Dumazet  wrote:
> On Wed, 2017-11-29 at 19:16 -0800, Casey Schaufler wrote:
>> On 11/29/2017 4:31 PM, James Morris wrote:
>> > On Wed, 29 Nov 2017, Casey Schaufler wrote:
>> >
>> > > I see that there is a proposed fix later in the thread, but I
>> > > don't see
>> > > the patch. Could you send it to me, so I can try it on my
>> > > problem?
>> >
>> > Forwarded off-list.
>>
>> The patch does fix the problem I was seeing in Smack.
>
> Can you guys test the following more complete patch ?
>
> It should cover IPv4 and IPv6, and also the corner cases.
>
> ( Note that I squashed ipv6 fix in
> https://patchwork.ozlabs.org/patch/842844/ that I spotted while cooking
> this patch )

Building a test kernel now, although it may take me a few hours to
test it due to some commitments this morning.

> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 
> c6bc0c4d19c624888b0d0b5a4246c7183edf63f5..77ea45da0fe9c746907a312989658af3ad3b198d
>  100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1591,6 +1591,34 @@ int tcp_filter(struct sock *sk, struct sk_buff *skb)
>  }
>  EXPORT_SYMBOL(tcp_filter);
>
> +static void tcp_v4_restore_cb(struct sk_buff *skb)
> +{
> +   memmove(IPCB(skb), &TCP_SKB_CB(skb)->header.h4,
> +   sizeof(struct inet_skb_parm));
> +}
> +
> +static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
> +  const struct tcphdr *th)
> +{
> +   /* This is tricky : We move IPCB at its correct location into 
> TCP_SKB_CB()
> +* barrier() makes sure compiler wont play fool^Waliasing games.
> +*/
> +   memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> +   sizeof(struct inet_skb_parm));
> +   barrier();
> +
> +   TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> +   TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> +   skb->len - th->doff * 4);
> +   TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> +   TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> +   TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> +   TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> +   TCP_SKB_CB(skb)->sacked  = 0;
> +   TCP_SKB_CB(skb)->has_rxtstamp =
> +   skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> +}
> +
>  /*
>   * From tcp_input.c
>   */
> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>
> th = (const struct tcphdr *)skb->data;
> iph = ip_hdr(skb);
> -   /* This is tricky : We move IPCB at its correct location into 
> TCP_SKB_CB()
> -* barrier() makes sure compiler wont play fool^Waliasing games.
> -*/
> -   memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> -   sizeof(struct inet_skb_parm));
> -   barrier();
> -
> -   TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> -   TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> -   skb->len - th->doff * 4);
> -   TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> -   TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> -   TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> -   TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> -   TCP_SKB_CB(skb)->sacked  = 0;
> -   TCP_SKB_CB(skb)->has_rxtstamp =
> -   skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> -
>  lookup:
> sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>                        th->dest, sdif, &refcounted);
> @@ -1679,14 +1689,19 @@ int tcp_v4_rcv(struct sk_buff *skb)
> sock_hold(sk);
> refcounted = true;
> nsk = NULL;
> -   if (!tcp_filter(sk, skb))
> +   if (!tcp_filter(sk, skb)) {
> +   th = (const struct tcphdr *)skb->data;
> +   iph = ip_hdr(skb);
> +   tcp_v4_fill_cb(skb, iph, th);
> nsk = tcp_check_req(sk, skb, req, false);
> +   }
> if (!nsk) {
> reqsk_put(req);
> goto discard_and_relse;
> }
> if (nsk == sk) {
> reqsk_put(req);
> +   tcp_v4_restore_cb(skb);
> } else if (tcp_child_process(sk, nsk, skb)) {
> tcp_v4_send_reset(nsk, skb);
> goto discard_and_relse;
> @@ -1712,6 +1727,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
> goto discard_and_relse;
> th = (const struct tcphdr *)skb->data;
> iph = ip_hdr(skb);
> +   tcp_v4_fill_cb(skb, iph, th);
>
> skb->dev = NULL;
>
> @@ -1742,6 +1758,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
> if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
> goto discard_it;
>
> +   tcp_v4_fill_cb(skb, iph, th);
> +
> if 

Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-30 Thread Eric Dumazet
On Wed, 2017-11-29 at 19:16 -0800, Casey Schaufler wrote:
> On 11/29/2017 4:31 PM, James Morris wrote:
> > On Wed, 29 Nov 2017, Casey Schaufler wrote:
> > 
> > > I see that there is a proposed fix later in the thread, but I
> > > don't see
> > > the patch. Could you send it to me, so I can try it on my
> > > problem?
> > 
> > Forwarded off-list.
> 
> The patch does fix the problem I was seeing in Smack.

Can you guys test the following more complete patch ?

It should cover IPv4 and IPv6, and also the corner cases.

( Note that I squashed ipv6 fix in
https://patchwork.ozlabs.org/patch/842844/ that I spotted while cooking
this patch )

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 
c6bc0c4d19c624888b0d0b5a4246c7183edf63f5..77ea45da0fe9c746907a312989658af3ad3b198d
 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1591,6 +1591,34 @@ int tcp_filter(struct sock *sk, struct sk_buff *skb)
 }
 EXPORT_SYMBOL(tcp_filter);
 
+static void tcp_v4_restore_cb(struct sk_buff *skb)
+{
+   memmove(IPCB(skb), &TCP_SKB_CB(skb)->header.h4,
+   sizeof(struct inet_skb_parm));
+}
+
+static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
+  const struct tcphdr *th)
+{
+   /* This is tricky : We move IPCB at its correct location into 
TCP_SKB_CB()
+* barrier() makes sure compiler wont play fool^Waliasing games.
+*/
+   memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
+   sizeof(struct inet_skb_parm));
+   barrier();
+
+   TCP_SKB_CB(skb)->seq = ntohl(th->seq);
+   TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
+   skb->len - th->doff * 4);
+   TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
+   TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
+   TCP_SKB_CB(skb)->tcp_tw_isn = 0;
+   TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
+   TCP_SKB_CB(skb)->sacked  = 0;
+   TCP_SKB_CB(skb)->has_rxtstamp =
+   skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
+}
+
 /*
  * From tcp_input.c
  */
@@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
 
th = (const struct tcphdr *)skb->data;
iph = ip_hdr(skb);
-   /* This is tricky : We move IPCB at its correct location into 
TCP_SKB_CB()
-* barrier() makes sure compiler wont play fool^Waliasing games.
-*/
-   memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
-   sizeof(struct inet_skb_parm));
-   barrier();
-
-   TCP_SKB_CB(skb)->seq = ntohl(th->seq);
-   TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
-   skb->len - th->doff * 4);
-   TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
-   TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
-   TCP_SKB_CB(skb)->tcp_tw_isn = 0;
-   TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
-   TCP_SKB_CB(skb)->sacked  = 0;
-   TCP_SKB_CB(skb)->has_rxtstamp =
-   skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
-
 lookup:
sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
   th->dest, sdif, &refcounted);
@@ -1679,14 +1689,19 @@ int tcp_v4_rcv(struct sk_buff *skb)
sock_hold(sk);
refcounted = true;
nsk = NULL;
-   if (!tcp_filter(sk, skb))
+   if (!tcp_filter(sk, skb)) {
+   th = (const struct tcphdr *)skb->data;
+   iph = ip_hdr(skb);
+   tcp_v4_fill_cb(skb, iph, th);
nsk = tcp_check_req(sk, skb, req, false);
+   }
if (!nsk) {
reqsk_put(req);
goto discard_and_relse;
}
if (nsk == sk) {
reqsk_put(req);
+   tcp_v4_restore_cb(skb);
} else if (tcp_child_process(sk, nsk, skb)) {
tcp_v4_send_reset(nsk, skb);
goto discard_and_relse;
@@ -1712,6 +1727,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
goto discard_and_relse;
th = (const struct tcphdr *)skb->data;
iph = ip_hdr(skb);
+   tcp_v4_fill_cb(skb, iph, th);
 
skb->dev = NULL;
 
@@ -1742,6 +1758,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
goto discard_it;
 
+   tcp_v4_fill_cb(skb, iph, th);
+
if (tcp_checksum_complete(skb)) {
 csum_error:
__TCP_INC_STATS(net, TCP_MIB_CSUMERRORS);
@@ -1768,6 +1786,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
goto discard_it;
}
 
+   tcp_v4_fill_cb(skb, iph, th);
+
if (tcp_checksum_complete(skb)) {
inet_twsk_put(inet_twsk(sk));
goto csum_error;
@@ -1784,6 +1804,7 @@ int tcp_v4_rcv(struct sk_buff 

Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-29 Thread Casey Schaufler

On 11/29/2017 4:31 PM, James Morris wrote:
> On Wed, 29 Nov 2017, Casey Schaufler wrote:
>
>> I see that there is a proposed fix later in the thread, but I don't see
>> the patch. Could you send it to me, so I can try it on my problem?
> Forwarded off-list.

The patch does fix the problem I was seeing in Smack.

>
> Interestingly, I didn't see the KASAN output email from Stephen here.
>
>



Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-29 Thread James Morris
On Wed, 29 Nov 2017, Casey Schaufler wrote:

> I see that there is a proposed fix later in the thread, but I don't see
> the patch. Could you send it to me, so I can try it on my problem?

Forwarded off-list.

Interestingly, I didn't see the KASAN output email from Stephen here.


-- 
James Morris




Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-29 Thread Casey Schaufler
On 11/29/2017 2:26 AM, James Morris wrote:
> I'm seeing a kernel stack corruption bug (detected via gcc) when running 
> the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd inet_socket test:
>
> https://github.com/SELinuxProject/selinux-testsuite/blob/master/tests/inet_socket/test
>
>   # Verify that unauthorized client cannot communicate with the server.
>   $result = system
>   "runcon -t test_inet_bad_client_t -- $basedir/client stream 127.0.0.1 65535 
> 2>&1";
>
> This correctly causes an access control error in the Netlabel code, and 
> the bug seems to be triggered during the ICMP send:
>
> ..
>
> This is mostly reliable, and I'm only seeing it on bare metal (not in a 
> virtualbox vm).
>
> The SELinux skb parse error at the start only sometimes appears, and 
> looking at the code, I suspect some kind of memory corruption being the 
> cause at that point (basic packet header checks).
>
> I bisected the bug down to the following change:
>
> commit bffa72cf7f9df842f0016ba03586039296b4caaf
> Author: Eric Dumazet 
> Date:   Tue Sep 19 05:14:24 2017 -0700
>
> net: sk_buff rbnode reorg
> ...
>
>
> Anyone else able to reproduce this, or have any ideas on what's happening?

I have also bisected a problem to this change. I do not have a trace
because the problem manifests as a hard system hang without a trace
being presented. The issue arises when Smack attempts to relabel a TCP
socket using netlbl_sock_setattr().

I see that there is a proposed fix later in the thread, but I don't see
the patch. Could you send it to me, so I can try it on my problem?

Thank you.

>
>
>
> - James



Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-29 Thread James Morris
On Wed, 29 Nov 2017, Eric Dumazet wrote:

> On Wed, 2017-11-29 at 12:23 -0800, Eric Dumazet wrote:
> > 
> > I suspect this exposes an ancient bug, caused by fact that TCP moves
> > IP[6]CB in skb->cb[]
> > 
> > Basically the 2nd tcp_filter() added in commit
> > 8fac365f63c866a00015fa13932d8ffc584518b8
> > ("tcp: Add a tcp_filter hook before handle ack packet") was not
> > expecting selinux code being called a 2nd time,
> > while skb->cb[] has been mangled [1]
> > 
> > [1]
> > > memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> > sizeof(struct inet_skb_parm));
> 
> Please try this fix for IPv4 (a similar patch will be needed for IPv6)
> 
>  net/ipv4/tcp_ipv4.c |   51 ++
>  1 file changed, 32 insertions(+), 19 deletions(-)

Works for me, no crashes with the testsuite running in a loop.


Tested-by: James Morris 


-- 
James Morris



Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-29 Thread Eric Dumazet
On Wed, 2017-11-29 at 12:23 -0800, Eric Dumazet wrote:
> 
> I suspect this exposes an ancient bug, caused by fact that TCP moves
> IP[6]CB in skb->cb[]
> 
> Basically the 2nd tcp_filter() added in commit
> 8fac365f63c866a00015fa13932d8ffc584518b8
> ("tcp: Add a tcp_filter hook before handle ack packet") was not
> expecting selinux code being called a 2nd time,
> while skb->cb[] has been mangled [1]
> 
> [1]
> memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> sizeof(struct inet_skb_parm));
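
The cb[] relocation quoted above can be modelled in user space. The sketch below is only an illustration under stated assumptions: struct ip_cb, struct tcp_cb, struct fake_skb and all helper names are hypothetical stand-ins, not the kernel's struct inet_skb_parm or struct tcp_skb_cb, and the field layout is invented for the demonstration. It shows that once the TCP view has been written, code that still reads offset 0 of cb[] as the IP view sees TCP sequence data, while the relocated copy stays intact. Which field actually led to the overflow in icmp_send() is not stated in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct inet_skb_parm (the IPCB() view). */
struct ip_cb {
	int32_t iif;
	uint8_t optlen;		/* models an IP options length field */
	uint8_t pad[3];
};

/* Hypothetical stand-in for struct tcp_skb_cb (the TCP_SKB_CB() view). */
struct tcp_cb {
	uint32_t seq;
	uint32_t end_seq;
	uint32_t ack_seq;
	struct ip_cb h4;	/* models header.h4, the relocated IP view */
};

/* Models only the 48-byte control buffer of an sk_buff. */
struct fake_skb {
	unsigned char cb[48];
};

/* Write the IP view at offset 0, as the IP layer would. */
static void store_ip_cb(struct fake_skb *skb, const struct ip_cb *ip)
{
	memcpy(skb->cb, ip, sizeof(*ip));
}

/* Read offset 0 as the IP view, as a hook using IPCB() would. */
static struct ip_cb load_ip_cb(const struct fake_skb *skb)
{
	struct ip_cb ip;

	memcpy(&ip, skb->cb, sizeof(ip));
	return ip;
}

/* Read offset 0 as the TCP view. */
static struct tcp_cb load_tcp_cb(const struct fake_skb *skb)
{
	struct tcp_cb tcp;

	memcpy(&tcp, skb->cb, sizeof(tcp));
	return tcp;
}

/* Models the fill step: relocate the IP view, then write TCP fields. */
static void fill_tcp_cb(struct fake_skb *skb, uint32_t seq, uint32_t end_seq)
{
	struct tcp_cb tcp;

	assert(sizeof(tcp) <= sizeof(skb->cb));
	memset(&tcp, 0, sizeof(tcp));
	tcp.h4 = load_ip_cb(skb);	/* corresponds to the memmove() */
	tcp.seq = seq;
	tcp.end_seq = end_seq;
	memcpy(skb->cb, &tcp, sizeof(tcp));
}
```

In this model, calling fill_tcp_cb() and then load_ip_cb() returns an optlen taken from the bytes of end_seq, which is the kind of stale read a second security hook would perform on a mangled cb[].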

Please try this fix for IPv4 (a similar patch will be needed for IPv6)

 net/ipv4/tcp_ipv4.c |   51 ++
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index c6bc0c4d19c624888b0d0b5a4246c7183edf63f5..912928105942b9714dda9132e45961ab1baf0852 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1591,6 +1591,28 @@ int tcp_filter(struct sock *sk, struct sk_buff *skb)
 }
 EXPORT_SYMBOL(tcp_filter);
 
+static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
+			   const struct tcphdr *th)
+{
+	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
+	 * barrier() makes sure compiler wont play fool^Waliasing games.
+	 */
+	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
+		sizeof(struct inet_skb_parm));
+	barrier();
+
+	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
+	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
+				    skb->len - th->doff * 4);
+	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
+	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
+	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
+	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
+	TCP_SKB_CB(skb)->sacked	 = 0;
+	TCP_SKB_CB(skb)->has_rxtstamp =
+			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
+}
+
 /*
  * From tcp_input.c
  */
@@ -1631,24 +1653,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
 
th = (const struct tcphdr *)skb->data;
iph = ip_hdr(skb);
-	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
-	 * barrier() makes sure compiler wont play fool^Waliasing games.
-	 */
-	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
-		sizeof(struct inet_skb_parm));
-	barrier();
-
-	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
-	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
-				    skb->len - th->doff * 4);
-	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
-	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
-	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
-	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
-	TCP_SKB_CB(skb)->sacked	 = 0;
-	TCP_SKB_CB(skb)->has_rxtstamp =
-			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
-
 lookup:
 	sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
 			       th->dest, sdif, &refcounted);
@@ -1679,8 +1683,12 @@ int tcp_v4_rcv(struct sk_buff *skb)
sock_hold(sk);
refcounted = true;
nsk = NULL;
-   if (!tcp_filter(sk, skb))
+   if (!tcp_filter(sk, skb)) {
+   th = (const struct tcphdr *)skb->data;
+   iph = ip_hdr(skb);
+   tcp_v4_fill_cb(skb, iph, th);
nsk = tcp_check_req(sk, skb, req, false);
+   }
if (!nsk) {
reqsk_put(req);
goto discard_and_relse;
@@ -1712,6 +1720,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
goto discard_and_relse;
th = (const struct tcphdr *)skb->data;
iph = ip_hdr(skb);
+   tcp_v4_fill_cb(skb, iph, th);
 
skb->dev = NULL;
 
@@ -1742,6 +1751,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
goto discard_it;
 
+   tcp_v4_fill_cb(skb, iph, th);
+
if (tcp_checksum_complete(skb)) {
 csum_error:
__TCP_INC_STATS(net, TCP_MIB_CSUMERRORS);
@@ -1768,6 +1779,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
goto discard_it;
}
 
+   tcp_v4_fill_cb(skb, iph, th);
+
if (tcp_checksum_complete(skb)) {
inet_twsk_put(inet_twsk(sk));
goto csum_error;
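
As a worked example of the end_seq arithmetic in tcp_v4_fill_cb() above: skb->data points at the TCP header there, so skb->len covers the TCP header plus payload, and subtracting doff * 4 leaves the payload length. SYN and FIN each consume one sequence number. The helper below is a hypothetical user-space sketch (tcp_end_seq() is not a kernel function), assuming the sequence number is already in host byte order and that doff is a valid 4-bit data offset.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the end_seq expression from the patch:
 *   end_seq = seq + syn + fin + skb_len - doff * 4
 * skb_len is the TCP header plus payload, doff is the header length
 * in 32-bit words. The result wraps modulo 2^32 like sequence numbers do.
 */
static uint32_t tcp_end_seq(uint32_t seq, unsigned int syn, unsigned int fin,
			    uint32_t skb_len, unsigned int doff)
{
	/* The TCP data offset is a 4-bit field with a minimum of 5 words. */
	assert(doff >= 5 && doff <= 15);
	return seq + syn + fin + skb_len - doff * 4;
}
```

For a segment with seq 1000, a 20-byte header (doff 5) and 100 bytes of payload, skb_len is 120 and the result is 1100. A bare SYN with a 40-byte header and seq 5000 yields 5001.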


Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-29 Thread Eric Dumazet
On Wed, Nov 29, 2017 at 11:59 AM, Stephen Smalley  wrote:
> On Wed, 2017-11-29 at 09:34 -0800, Eric Dumazet wrote:
>> On Wed, Nov 29, 2017 at 9:31 AM, Stephen Smalley 
>> wrote:
>> > On Wed, 2017-11-29 at 21:26 +1100, James Morris wrote:
>> > > I'm seeing a kernel stack corruption bug (detected via gcc) when
>> > > running
>> > > the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd
>> > > inet_socket
>> > > test:
>> > >
>> > > https://github.com/SELinuxProject/selinux-testsuite/blob/master/t
>> > > ests
>> > > /inet_socket/test
>> > >
>> > >   # Verify that unauthorized client cannot communicate with the
>> > > server.
>> > >   $result = system
>> > >   "runcon -t test_inet_bad_client_t -- $basedir/client stream
>> > > 127.0.0.1 65535 2>&1";
>> > >
>> > > This correctly causes an access control error in the Netlabel
>> > > code,
>> > > and
>> > > the bug seems to be triggered during the ICMP send:
>> > >
>> > > [  339.806024] SELinux: failure in selinux_parse_skb(), unable to
>> > > parse packet
>> > > [  339.822505] Kernel panic - not syncing: stack-protector:
>> > > Kernel
>> > > stack is corrupted in: 81745af5
>> > > [  339.822505]
>> > > [  339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0-
>> > > rc1-
>> > > test #15
>> > > [  339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS
>> > > FWKT68A   01/19/2017
>> > > [  339.885060] Call Trace:
>> > > [  339.896875]  
>> > > [  339.908103]  dump_stack+0x63/0x87
>> > > [  339.920645]  panic+0xe8/0x248
>> > > [  339.932668]  ? ip_push_pending_frames+0x33/0x40
>> > > [  339.946328]  ? icmp_send+0x525/0x530
>> > > [  339.958861]  ? kfree_skbmem+0x60/0x70
>> > > [  339.971431]  __stack_chk_fail+0x1b/0x20
>> > > [  339.984049]  icmp_send+0x525/0x530
>> > > [  339.996205]  ? netlbl_skbuff_err+0x36/0x40
>> > > [  340.008997]  ? selinux_netlbl_err+0x11/0x20
>> > > [  340.021816]  ? selinux_socket_sock_rcv_skb+0x211/0x230
>> > > [  340.035529]  ? security_sock_rcv_skb+0x3b/0x50
>> > > [  340.048471]  ? sk_filter_trim_cap+0x44/0x1c0
>> > > [  340.061246]  ? tcp_v4_inbound_md5_hash+0x69/0x1b0
>> > > [  340.074562]  ? tcp_filter+0x2c/0x40
>> > > [  340.086400]  ? tcp_v4_rcv+0x820/0xa20
>> > > [  340.098329]  ? ip_local_deliver_finish+0x71/0x1a0
>> > > [  340.111279]  ? ip_local_deliver+0x6f/0xe0
>> > > [  340.123535]  ? ip_rcv_finish+0x3a0/0x3a0
>> > > [  340.135523]  ? ip_rcv_finish+0xdb/0x3a0
>> > > [  340.147442]  ? ip_rcv+0x27c/0x3c0
>> > > [  340.158668]  ? inet_del_offload+0x40/0x40
>> > > [  340.170580]  ? __netif_receive_skb_core+0x4ac/0x900
>> > > [  340.183285]  ? rcu_accelerate_cbs+0x5b/0x80
>> > > [  340.195282]  ? __netif_receive_skb+0x18/0x60
>> > > [  340.207288]  ? process_backlog+0x95/0x140
>> > > [  340.218948]  ? net_rx_action+0x26c/0x3b0
>> > > [  340.230416]  ? __do_softirq+0xc9/0x26a
>> > > [  340.241625]  ? do_softirq_own_stack+0x2a/0x40
>> > > [  340.253368]  
>> > > [  340.262673]  ? do_softirq+0x50/0x60
>> > > [  340.273450]  ? __local_bh_enable_ip+0x57/0x60
>> > > [  340.285045]  ? ip_finish_output2+0x175/0x350
>> > > [  340.296403]  ? ip_finish_output+0x127/0x1d0
>> > > [  340.307665]  ? nf_hook_slow+0x3c/0xb0
>> > > [  340.318230]  ? ip_output+0x72/0xe0
>> > > [  340.328524]  ? ip_fragment.constprop.54+0x80/0x80
>> > > [  340.340070]  ? ip_local_out+0x35/0x40
>> > > [  340.350497]  ? ip_queue_xmit+0x15c/0x3f0
>> > > [  340.361060]  ? __kmalloc_reserve.isra.40+0x31/0x90
>> > > [  340.372484]  ? __skb_clone+0x2e/0x130
>> > > [  340.382633]  ? tcp_transmit_skb+0x558/0xa10
>> > > [  340.393262]  ? tcp_connect+0x938/0xad0
>> > > [  340.403370]  ? ktime_get_with_offset+0x4c/0xb0
>> > > [  340.414206]  ? tcp_v4_connect+0x457/0x4e0
>> > > [  340.424471]  ? __inet_stream_connect+0xb3/0x300
>> > > [  340.435195]  ? inet_stream_connect+0x3b/0x60
>> > > [  340.445607]  ? SYSC_connect+0xd9/0x110
>> > > [  340.455455]  ? __audit_syscall_entry+0xaf/0x100
>> > > [  340.466112]  ? syscall_trace_enter+0x1d0/0x2b0
>> > > [  340.476636]  ? __audit_syscall_exit+0x209/0x290
>> > > [  340.487151]  ? SyS_connect+0xe/0x10
>> > > [  340.496453]  ? do_syscall_64+0x67/0x1b0
>> > > [  340.506078]  ? entry_SYSCALL64_slow_path+0x25/0x25
>> > > [  340.516693] Kernel Offset: disabled
>> > > [  340.526393] Rebooting in 11 seconds..
>> > >
>> > > This is mostly reliable, and I'm only seeing it on bare metal
>> > > (not in
>> > > a
>> > > virtualbox vm).
>> > >
>> > > The SELinux skb parse error at the start only sometimes appears,
>> > > and
>> > > looking at the code, I suspect some kind of memory corruption
>> > > being
>> > > the
>> > > cause at that point (basic packet header checks).
>> > >
>> > > I bisected the bug down to the following change:
>> > >
>> > > commit bffa72cf7f9df842f0016ba03586039296b4caaf
>> > > Author: Eric Dumazet 
>> > > Date:   Tue Sep 19 05:14:24 2017 -0700
>> > >
>> > > net: sk_buff rbnode reorg
>> > > ...
>> > >
>> > >
>> > > Anyone else able to 

Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-29 Thread Stephen Smalley
On Wed, 2017-11-29 at 09:34 -0800, Eric Dumazet wrote:
> On Wed, Nov 29, 2017 at 9:31 AM, Stephen Smalley 
> wrote:
> > On Wed, 2017-11-29 at 21:26 +1100, James Morris wrote:
> > > I'm seeing a kernel stack corruption bug (detected via gcc) when
> > > running
> > > the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd
> > > inet_socket
> > > test:
> > > 
> > > https://github.com/SELinuxProject/selinux-testsuite/blob/master/t
> > > ests
> > > /inet_socket/test
> > > 
> > >   # Verify that unauthorized client cannot communicate with the
> > > server.
> > >   $result = system
> > >   "runcon -t test_inet_bad_client_t -- $basedir/client stream
> > > 127.0.0.1 65535 2>&1";
> > > 
> > > > This correctly causes an access control error in the Netlabel
> > > code,
> > > and
> > > the bug seems to be triggered during the ICMP send:
> > > 
> > > [  339.806024] SELinux: failure in selinux_parse_skb(), unable to
> > > parse packet
> > > [  339.822505] Kernel panic - not syncing: stack-protector:
> > > Kernel
> > > stack is corrupted in: 81745af5
> > > [  339.822505]
> > > [  339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0-
> > > rc1-
> > > test #15
> > > [  339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS
> > > FWKT68A   01/19/2017
> > > [  339.885060] Call Trace:
> > > [  339.896875]  
> > > [  339.908103]  dump_stack+0x63/0x87
> > > [  339.920645]  panic+0xe8/0x248
> > > [  339.932668]  ? ip_push_pending_frames+0x33/0x40
> > > [  339.946328]  ? icmp_send+0x525/0x530
> > > [  339.958861]  ? kfree_skbmem+0x60/0x70
> > > [  339.971431]  __stack_chk_fail+0x1b/0x20
> > > [  339.984049]  icmp_send+0x525/0x530
> > > [  339.996205]  ? netlbl_skbuff_err+0x36/0x40
> > > [  340.008997]  ? selinux_netlbl_err+0x11/0x20
> > > [  340.021816]  ? selinux_socket_sock_rcv_skb+0x211/0x230
> > > [  340.035529]  ? security_sock_rcv_skb+0x3b/0x50
> > > [  340.048471]  ? sk_filter_trim_cap+0x44/0x1c0
> > > [  340.061246]  ? tcp_v4_inbound_md5_hash+0x69/0x1b0
> > > [  340.074562]  ? tcp_filter+0x2c/0x40
> > > [  340.086400]  ? tcp_v4_rcv+0x820/0xa20
> > > [  340.098329]  ? ip_local_deliver_finish+0x71/0x1a0
> > > [  340.111279]  ? ip_local_deliver+0x6f/0xe0
> > > [  340.123535]  ? ip_rcv_finish+0x3a0/0x3a0
> > > [  340.135523]  ? ip_rcv_finish+0xdb/0x3a0
> > > [  340.147442]  ? ip_rcv+0x27c/0x3c0
> > > [  340.158668]  ? inet_del_offload+0x40/0x40
> > > [  340.170580]  ? __netif_receive_skb_core+0x4ac/0x900
> > > [  340.183285]  ? rcu_accelerate_cbs+0x5b/0x80
> > > [  340.195282]  ? __netif_receive_skb+0x18/0x60
> > > [  340.207288]  ? process_backlog+0x95/0x140
> > > [  340.218948]  ? net_rx_action+0x26c/0x3b0
> > > [  340.230416]  ? __do_softirq+0xc9/0x26a
> > > [  340.241625]  ? do_softirq_own_stack+0x2a/0x40
> > > [  340.253368]  
> > > [  340.262673]  ? do_softirq+0x50/0x60
> > > [  340.273450]  ? __local_bh_enable_ip+0x57/0x60
> > > [  340.285045]  ? ip_finish_output2+0x175/0x350
> > > [  340.296403]  ? ip_finish_output+0x127/0x1d0
> > > [  340.307665]  ? nf_hook_slow+0x3c/0xb0
> > > [  340.318230]  ? ip_output+0x72/0xe0
> > > [  340.328524]  ? ip_fragment.constprop.54+0x80/0x80
> > > [  340.340070]  ? ip_local_out+0x35/0x40
> > > [  340.350497]  ? ip_queue_xmit+0x15c/0x3f0
> > > [  340.361060]  ? __kmalloc_reserve.isra.40+0x31/0x90
> > > [  340.372484]  ? __skb_clone+0x2e/0x130
> > > [  340.382633]  ? tcp_transmit_skb+0x558/0xa10
> > > [  340.393262]  ? tcp_connect+0x938/0xad0
> > > [  340.403370]  ? ktime_get_with_offset+0x4c/0xb0
> > > [  340.414206]  ? tcp_v4_connect+0x457/0x4e0
> > > [  340.424471]  ? __inet_stream_connect+0xb3/0x300
> > > [  340.435195]  ? inet_stream_connect+0x3b/0x60
> > > [  340.445607]  ? SYSC_connect+0xd9/0x110
> > > [  340.455455]  ? __audit_syscall_entry+0xaf/0x100
> > > [  340.466112]  ? syscall_trace_enter+0x1d0/0x2b0
> > > [  340.476636]  ? __audit_syscall_exit+0x209/0x290
> > > [  340.487151]  ? SyS_connect+0xe/0x10
> > > [  340.496453]  ? do_syscall_64+0x67/0x1b0
> > > [  340.506078]  ? entry_SYSCALL64_slow_path+0x25/0x25
> > > [  340.516693] Kernel Offset: disabled
> > > [  340.526393] Rebooting in 11 seconds..
> > > 
> > > This is mostly reliable, and I'm only seeing it on bare metal
> > > (not in
> > > a
> > > virtualbox vm).
> > > 
> > > The SELinux skb parse error at the start only sometimes appears,
> > > and
> > > looking at the code, I suspect some kind of memory corruption
> > > being
> > > the
> > > cause at that point (basic packet header checks).
> > > 
> > > I bisected the bug down to the following change:
> > > 
> > > commit bffa72cf7f9df842f0016ba03586039296b4caaf
> > > Author: Eric Dumazet 
> > > Date:   Tue Sep 19 05:14:24 2017 -0700
> > > 
> > > net: sk_buff rbnode reorg
> > > ...
> > > 
> > > 
> > > Anyone else able to reproduce this, or have any ideas on what's
> > > happening?
> > 
> > So far I haven't been able to reproduce with 4.15-rc1 or -linus.
> > 
> 
> You might try adding KASAN in the picture ? ( 

Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-29 Thread Paul Moore
On Wed, Nov 29, 2017 at 12:34 PM, Eric Dumazet  wrote:
> On Wed, Nov 29, 2017 at 9:31 AM, Stephen Smalley  wrote:
>> On Wed, 2017-11-29 at 21:26 +1100, James Morris wrote:
>>> I'm seeing a kernel stack corruption bug (detected via gcc) when
>>> running
>>> the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd inet_socket
>>> test:
>>>
>>> https://github.com/SELinuxProject/selinux-testsuite/blob/master/tests
>>> /inet_socket/test
>>>
>>>   # Verify that unauthorized client cannot communicate with the
>>> server.
>>>   $result = system
>>>   "runcon -t test_inet_bad_client_t -- $basedir/client stream
>>> 127.0.0.1 65535 2>&1";
>>>
>>> This correctly causes an access control error in the Netlabel code,
>>> and
>>> the bug seems to be triggered during the ICMP send:
>>>
>>> [  339.806024] SELinux: failure in selinux_parse_skb(), unable to
>>> parse packet
>>> [  339.822505] Kernel panic - not syncing: stack-protector: Kernel
>>> stack is corrupted in: 81745af5
>>> [  339.822505]
>>> [  339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0-rc1-
>>> test #15
>>> [  339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS
>>> FWKT68A   01/19/2017
>>> [  339.885060] Call Trace:
>>> [  339.896875]  
>>> [  339.908103]  dump_stack+0x63/0x87
>>> [  339.920645]  panic+0xe8/0x248
>>> [  339.932668]  ? ip_push_pending_frames+0x33/0x40
>>> [  339.946328]  ? icmp_send+0x525/0x530
>>> [  339.958861]  ? kfree_skbmem+0x60/0x70
>>> [  339.971431]  __stack_chk_fail+0x1b/0x20
>>> [  339.984049]  icmp_send+0x525/0x530

...

>>> This is mostly reliable, and I'm only seeing it on bare metal (not in
>>> a
>>> virtualbox vm).
>>>
>>> The SELinux skb parse error at the start only sometimes appears, and
>>> looking at the code, I suspect some kind of memory corruption being
>>> the
>>> cause at that point (basic packet header checks).
>>>
>>> I bisected the bug down to the following change:
>>>
>>> commit bffa72cf7f9df842f0016ba03586039296b4caaf
>>> Author: Eric Dumazet 
>>> Date:   Tue Sep 19 05:14:24 2017 -0700
>>>
>>> net: sk_buff rbnode reorg
>>> ...
>>>
>>>
>>> Anyone else able to reproduce this, or have any ideas on what's
>>> happening?
>>
>> So far I haven't been able to reproduce with 4.15-rc1 or -linus.
>
> You might try adding KASAN to the picture (CONFIG_KASAN=y)?

As another data point, I have not hit this problem either, but I'm not
currently building my test kernels with KASAN enabled.

-- 
paul moore
www.paul-moore.com


Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-29 Thread Eric Dumazet
On Wed, Nov 29, 2017 at 9:31 AM, Stephen Smalley  wrote:
> On Wed, 2017-11-29 at 21:26 +1100, James Morris wrote:
>> I'm seeing a kernel stack corruption bug (detected via gcc) when
>> running
>> the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd inet_socket
>> test:
>>
>> https://github.com/SELinuxProject/selinux-testsuite/blob/master/tests
>> /inet_socket/test
>>
>>   # Verify that unauthorized client cannot communicate with the
>> server.
>>   $result = system
>>   "runcon -t test_inet_bad_client_t -- $basedir/client stream
>> 127.0.0.1 65535 2>&1";
>>
>> This correctly causes an access control error in the Netlabel code,
>> and
>> the bug seems to be triggered during the ICMP send:
>>
>> [  339.806024] SELinux: failure in selinux_parse_skb(), unable to
>> parse packet
>> [  339.822505] Kernel panic - not syncing: stack-protector: Kernel
>> stack is corrupted in: 81745af5
>> [  339.822505]
>> [  339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0-rc1-
>> test #15
>> [  339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS
>> FWKT68A   01/19/2017
>> [  339.885060] Call Trace:
>> [  339.896875]  
>> [  339.908103]  dump_stack+0x63/0x87
>> [  339.920645]  panic+0xe8/0x248
>> [  339.932668]  ? ip_push_pending_frames+0x33/0x40
>> [  339.946328]  ? icmp_send+0x525/0x530
>> [  339.958861]  ? kfree_skbmem+0x60/0x70
>> [  339.971431]  __stack_chk_fail+0x1b/0x20
>> [  339.984049]  icmp_send+0x525/0x530
>> [  339.996205]  ? netlbl_skbuff_err+0x36/0x40
>> [  340.008997]  ? selinux_netlbl_err+0x11/0x20
>> [  340.021816]  ? selinux_socket_sock_rcv_skb+0x211/0x230
>> [  340.035529]  ? security_sock_rcv_skb+0x3b/0x50
>> [  340.048471]  ? sk_filter_trim_cap+0x44/0x1c0
>> [  340.061246]  ? tcp_v4_inbound_md5_hash+0x69/0x1b0
>> [  340.074562]  ? tcp_filter+0x2c/0x40
>> [  340.086400]  ? tcp_v4_rcv+0x820/0xa20
>> [  340.098329]  ? ip_local_deliver_finish+0x71/0x1a0
>> [  340.111279]  ? ip_local_deliver+0x6f/0xe0
>> [  340.123535]  ? ip_rcv_finish+0x3a0/0x3a0
>> [  340.135523]  ? ip_rcv_finish+0xdb/0x3a0
>> [  340.147442]  ? ip_rcv+0x27c/0x3c0
>> [  340.158668]  ? inet_del_offload+0x40/0x40
>> [  340.170580]  ? __netif_receive_skb_core+0x4ac/0x900
>> [  340.183285]  ? rcu_accelerate_cbs+0x5b/0x80
>> [  340.195282]  ? __netif_receive_skb+0x18/0x60
>> [  340.207288]  ? process_backlog+0x95/0x140
>> [  340.218948]  ? net_rx_action+0x26c/0x3b0
>> [  340.230416]  ? __do_softirq+0xc9/0x26a
>> [  340.241625]  ? do_softirq_own_stack+0x2a/0x40
>> [  340.253368]  
>> [  340.262673]  ? do_softirq+0x50/0x60
>> [  340.273450]  ? __local_bh_enable_ip+0x57/0x60
>> [  340.285045]  ? ip_finish_output2+0x175/0x350
>> [  340.296403]  ? ip_finish_output+0x127/0x1d0
>> [  340.307665]  ? nf_hook_slow+0x3c/0xb0
>> [  340.318230]  ? ip_output+0x72/0xe0
>> [  340.328524]  ? ip_fragment.constprop.54+0x80/0x80
>> [  340.340070]  ? ip_local_out+0x35/0x40
>> [  340.350497]  ? ip_queue_xmit+0x15c/0x3f0
>> [  340.361060]  ? __kmalloc_reserve.isra.40+0x31/0x90
>> [  340.372484]  ? __skb_clone+0x2e/0x130
>> [  340.382633]  ? tcp_transmit_skb+0x558/0xa10
>> [  340.393262]  ? tcp_connect+0x938/0xad0
>> [  340.403370]  ? ktime_get_with_offset+0x4c/0xb0
>> [  340.414206]  ? tcp_v4_connect+0x457/0x4e0
>> [  340.424471]  ? __inet_stream_connect+0xb3/0x300
>> [  340.435195]  ? inet_stream_connect+0x3b/0x60
>> [  340.445607]  ? SYSC_connect+0xd9/0x110
>> [  340.455455]  ? __audit_syscall_entry+0xaf/0x100
>> [  340.466112]  ? syscall_trace_enter+0x1d0/0x2b0
>> [  340.476636]  ? __audit_syscall_exit+0x209/0x290
>> [  340.487151]  ? SyS_connect+0xe/0x10
>> [  340.496453]  ? do_syscall_64+0x67/0x1b0
>> [  340.506078]  ? entry_SYSCALL64_slow_path+0x25/0x25
>> [  340.516693] Kernel Offset: disabled
>> [  340.526393] Rebooting in 11 seconds..
>>
>> This is mostly reliable, and I'm only seeing it on bare metal (not in
>> a
>> virtualbox vm).
>>
>> The SELinux skb parse error at the start only sometimes appears, and
>> looking at the code, I suspect some kind of memory corruption being
>> the
>> cause at that point (basic packet header checks).
>>
>> I bisected the bug down to the following change:
>>
>> commit bffa72cf7f9df842f0016ba03586039296b4caaf
>> Author: Eric Dumazet 
>> Date:   Tue Sep 19 05:14:24 2017 -0700
>>
>> net: sk_buff rbnode reorg
>> ...
>>
>>
>> Anyone else able to reproduce this, or have any ideas on what's
>> happening?
>
> So far I haven't been able to reproduce with 4.15-rc1 or -linus.
>

You might try adding KASAN to the picture (CONFIG_KASAN=y)?

Thanks.


Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-29 Thread Stephen Smalley
On Wed, 2017-11-29 at 21:26 +1100, James Morris wrote:
> I'm seeing a kernel stack corruption bug (detected via gcc) when
> running 
> the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd inet_socket
> test:
> 
> https://github.com/SELinuxProject/selinux-testsuite/blob/master/tests
> /inet_socket/test
> 
>   # Verify that unauthorized client cannot communicate with the
> server.
>   $result = system
>   "runcon -t test_inet_bad_client_t -- $basedir/client stream
> 127.0.0.1 65535 2>&1";
> 
> This correctly causes an access control error in the Netlabel code,
> and 
> the bug seems to be triggered during the ICMP send:
> 
> [  339.806024] SELinux: failure in selinux_parse_skb(), unable to
> parse packet
> [  339.822505] Kernel panic - not syncing: stack-protector: Kernel
> stack is corrupted in: 81745af5
> [  339.822505] 
> [  339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0-rc1-
> test #15
> [  339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS
> FWKT68A   01/19/2017
> [  339.885060] Call Trace:
> [  339.896875]  
> [  339.908103]  dump_stack+0x63/0x87
> [  339.920645]  panic+0xe8/0x248
> [  339.932668]  ? ip_push_pending_frames+0x33/0x40
> [  339.946328]  ? icmp_send+0x525/0x530
> [  339.958861]  ? kfree_skbmem+0x60/0x70
> [  339.971431]  __stack_chk_fail+0x1b/0x20
> [  339.984049]  icmp_send+0x525/0x530
> [  339.996205]  ? netlbl_skbuff_err+0x36/0x40
> [  340.008997]  ? selinux_netlbl_err+0x11/0x20
> [  340.021816]  ? selinux_socket_sock_rcv_skb+0x211/0x230
> [  340.035529]  ? security_sock_rcv_skb+0x3b/0x50
> [  340.048471]  ? sk_filter_trim_cap+0x44/0x1c0
> [  340.061246]  ? tcp_v4_inbound_md5_hash+0x69/0x1b0
> [  340.074562]  ? tcp_filter+0x2c/0x40
> [  340.086400]  ? tcp_v4_rcv+0x820/0xa20
> [  340.098329]  ? ip_local_deliver_finish+0x71/0x1a0
> [  340.111279]  ? ip_local_deliver+0x6f/0xe0
> [  340.123535]  ? ip_rcv_finish+0x3a0/0x3a0
> [  340.135523]  ? ip_rcv_finish+0xdb/0x3a0
> [  340.147442]  ? ip_rcv+0x27c/0x3c0
> [  340.158668]  ? inet_del_offload+0x40/0x40
> [  340.170580]  ? __netif_receive_skb_core+0x4ac/0x900
> [  340.183285]  ? rcu_accelerate_cbs+0x5b/0x80
> [  340.195282]  ? __netif_receive_skb+0x18/0x60
> [  340.207288]  ? process_backlog+0x95/0x140
> [  340.218948]  ? net_rx_action+0x26c/0x3b0
> [  340.230416]  ? __do_softirq+0xc9/0x26a
> [  340.241625]  ? do_softirq_own_stack+0x2a/0x40
> [  340.253368]  
> [  340.262673]  ? do_softirq+0x50/0x60
> [  340.273450]  ? __local_bh_enable_ip+0x57/0x60
> [  340.285045]  ? ip_finish_output2+0x175/0x350
> [  340.296403]  ? ip_finish_output+0x127/0x1d0
> [  340.307665]  ? nf_hook_slow+0x3c/0xb0
> [  340.318230]  ? ip_output+0x72/0xe0
> [  340.328524]  ? ip_fragment.constprop.54+0x80/0x80
> [  340.340070]  ? ip_local_out+0x35/0x40
> [  340.350497]  ? ip_queue_xmit+0x15c/0x3f0
> [  340.361060]  ? __kmalloc_reserve.isra.40+0x31/0x90
> [  340.372484]  ? __skb_clone+0x2e/0x130
> [  340.382633]  ? tcp_transmit_skb+0x558/0xa10
> [  340.393262]  ? tcp_connect+0x938/0xad0
> [  340.403370]  ? ktime_get_with_offset+0x4c/0xb0
> [  340.414206]  ? tcp_v4_connect+0x457/0x4e0
> [  340.424471]  ? __inet_stream_connect+0xb3/0x300
> [  340.435195]  ? inet_stream_connect+0x3b/0x60
> [  340.445607]  ? SYSC_connect+0xd9/0x110
> [  340.455455]  ? __audit_syscall_entry+0xaf/0x100
> [  340.466112]  ? syscall_trace_enter+0x1d0/0x2b0
> [  340.476636]  ? __audit_syscall_exit+0x209/0x290
> [  340.487151]  ? SyS_connect+0xe/0x10
> [  340.496453]  ? do_syscall_64+0x67/0x1b0
> [  340.506078]  ? entry_SYSCALL64_slow_path+0x25/0x25
> [  340.516693] Kernel Offset: disabled
> [  340.526393] Rebooting in 11 seconds..
> 
> This is mostly reliable, and I'm only seeing it on bare metal (not in
> a 
> virtualbox vm).
> 
> The SELinux skb parse error at the start only sometimes appears, and 
> looking at the code, I suspect some kind of memory corruption being
> the 
> cause at that point (basic packet header checks).
> 
> I bisected the bug down to the following change:
> 
> commit bffa72cf7f9df842f0016ba03586039296b4caaf
> Author: Eric Dumazet 
> Date:   Tue Sep 19 05:14:24 2017 -0700
> 
> net: sk_buff rbnode reorg
> ...
> 
> 
> Anyone else able to reproduce this, or have any ideas on what's
> happening?

So far I haven't been able to reproduce with 4.15-rc1 or -linus.



Re: [BUG] kernel stack corruption during/after Netlabel error

2017-11-29 Thread Eric Dumazet
On Wed, Nov 29, 2017 at 2:26 AM, James Morris  wrote:
> I'm seeing a kernel stack corruption bug (detected via gcc) when running
> the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd inet_socket test:
>
> https://github.com/SELinuxProject/selinux-testsuite/blob/master/tests/inet_socket/test
>
>   # Verify that unauthorized client cannot communicate with the server.
>   $result = system
>   "runcon -t test_inet_bad_client_t -- $basedir/client stream 127.0.0.1 65535 
> 2>&1";
>
> This correctly causes an access control error in the Netlabel code, and
> the bug seems to be triggered during the ICMP send:
>
> [  339.806024] SELinux: failure in selinux_parse_skb(), unable to parse packet
> [  339.822505] Kernel panic - not syncing: stack-protector: Kernel stack is 
> corrupted in: 81745af5
> [  339.822505]
> [  339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0-rc1-test #15
> [  339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS FWKT68A   
> 01/19/2017
> [  339.885060] Call Trace:
> [  339.896875]  
> [  339.908103]  dump_stack+0x63/0x87
> [  339.920645]  panic+0xe8/0x248
> [  339.932668]  ? ip_push_pending_frames+0x33/0x40
> [  339.946328]  ? icmp_send+0x525/0x530
> [  339.958861]  ? kfree_skbmem+0x60/0x70
> [  339.971431]  __stack_chk_fail+0x1b/0x20
> [  339.984049]  icmp_send+0x525/0x530
> [  339.996205]  ? netlbl_skbuff_err+0x36/0x40
> [  340.008997]  ? selinux_netlbl_err+0x11/0x20
> [  340.021816]  ? selinux_socket_sock_rcv_skb+0x211/0x230
> [  340.035529]  ? security_sock_rcv_skb+0x3b/0x50
> [  340.048471]  ? sk_filter_trim_cap+0x44/0x1c0
> [  340.061246]  ? tcp_v4_inbound_md5_hash+0x69/0x1b0
> [  340.074562]  ? tcp_filter+0x2c/0x40
> [  340.086400]  ? tcp_v4_rcv+0x820/0xa20
> [  340.098329]  ? ip_local_deliver_finish+0x71/0x1a0
> [  340.111279]  ? ip_local_deliver+0x6f/0xe0
> [  340.123535]  ? ip_rcv_finish+0x3a0/0x3a0
> [  340.135523]  ? ip_rcv_finish+0xdb/0x3a0
> [  340.147442]  ? ip_rcv+0x27c/0x3c0
> [  340.158668]  ? inet_del_offload+0x40/0x40
> [  340.170580]  ? __netif_receive_skb_core+0x4ac/0x900
> [  340.183285]  ? rcu_accelerate_cbs+0x5b/0x80
> [  340.195282]  ? __netif_receive_skb+0x18/0x60
> [  340.207288]  ? process_backlog+0x95/0x140
> [  340.218948]  ? net_rx_action+0x26c/0x3b0
> [  340.230416]  ? __do_softirq+0xc9/0x26a
> [  340.241625]  ? do_softirq_own_stack+0x2a/0x40
> [  340.253368]  
> [  340.262673]  ? do_softirq+0x50/0x60
> [  340.273450]  ? __local_bh_enable_ip+0x57/0x60
> [  340.285045]  ? ip_finish_output2+0x175/0x350
> [  340.296403]  ? ip_finish_output+0x127/0x1d0
> [  340.307665]  ? nf_hook_slow+0x3c/0xb0
> [  340.318230]  ? ip_output+0x72/0xe0
> [  340.328524]  ? ip_fragment.constprop.54+0x80/0x80
> [  340.340070]  ? ip_local_out+0x35/0x40
> [  340.350497]  ? ip_queue_xmit+0x15c/0x3f0
> [  340.361060]  ? __kmalloc_reserve.isra.40+0x31/0x90
> [  340.372484]  ? __skb_clone+0x2e/0x130
> [  340.382633]  ? tcp_transmit_skb+0x558/0xa10
> [  340.393262]  ? tcp_connect+0x938/0xad0
> [  340.403370]  ? ktime_get_with_offset+0x4c/0xb0
> [  340.414206]  ? tcp_v4_connect+0x457/0x4e0
> [  340.424471]  ? __inet_stream_connect+0xb3/0x300
> [  340.435195]  ? inet_stream_connect+0x3b/0x60
> [  340.445607]  ? SYSC_connect+0xd9/0x110
> [  340.455455]  ? __audit_syscall_entry+0xaf/0x100
> [  340.466112]  ? syscall_trace_enter+0x1d0/0x2b0
> [  340.476636]  ? __audit_syscall_exit+0x209/0x290
> [  340.487151]  ? SyS_connect+0xe/0x10
> [  340.496453]  ? do_syscall_64+0x67/0x1b0
> [  340.506078]  ? entry_SYSCALL64_slow_path+0x25/0x25
> [  340.516693] Kernel Offset: disabled
> [  340.526393] Rebooting in 11 seconds..
>
> This is mostly reliable, and I'm only seeing it on bare metal (not in a
> virtualbox vm).
>
> The SELinux skb parse error at the start only sometimes appears, and
> looking at the code, I suspect some kind of memory corruption being the
> cause at that point (basic packet header checks).
>
> I bisected the bug down to the following change:


>
> commit bffa72cf7f9df842f0016ba03586039296b4caaf
> Author: Eric Dumazet 
> Date:   Tue Sep 19 05:14:24 2017 -0700
>
> net: sk_buff rbnode reorg
> ...
>
>
> Anyone else able to reproduce this, or have any ideas on what's happening?
>
>

Hi James, thanks for the report.

The issue here seems to be that icmp_send() used to be called with
skb_in->dev either NULL or a valid device pointer.

After my patch, skb_in->dev is aliased with part of skb_in->rbnode
(the rb_left pointer).

So this code in icmp_send() might be fooled :

	if (!(skb_in->dev && (skb_in->dev->flags&IFF_LOOPBACK)) &&
	      !icmpv4_global_allow(net, type, code))
		goto out_bh_enable;

That said, the TCP stack should not manipulate skb->rbnode before the calls
to tcp_filter() (and thus security_sock_rcv_skb()).

So at the point security_sock_rcv_skb is called, skb->dev should still be valid.
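
The aliasing described above can be sketched with a simplified user-space model. Everything here is an assumption for illustration: the *_model types and the function name are hypothetical, the structs keep only the members needed to show the overlap, and the real kernel definitions carry more fields and attributes. The point is that once an skb is linked into an rbtree, reading the same storage as a device pointer yields a non-NULL value that is not a device, so a check like the one in icmp_send() would dereference it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct rb_node. */
struct rb_node_model {
	unsigned long parent_color;
	struct rb_node_model *rb_right;
	struct rb_node_model *rb_left;
};

/* Simplified stand-in for struct net_device; only flags is modelled. */
struct net_device_model {
	unsigned int flags;
};

/* Simplified stand-in for the start of struct sk_buff after the reorg. */
struct skb_head_model {
	union {
		struct {
			struct skb_head_model *next;
			struct skb_head_model *prev;
			struct net_device_model *dev;
		};
		struct rb_node_model rbnode;
	};
};

/* In this model, dev occupies the same bytes as rbnode.rb_left. */
static_assert(offsetof(struct skb_head_model, dev) ==
	      offsetof(struct skb_head_model, rbnode) +
	      offsetof(struct rb_node_model, rb_left),
	      "dev is expected to share storage with rbnode.rb_left");

/* Returns 1 if dev reads back as non-NULL after only rb_left was set. */
static int dev_looks_set_after_rb_link(void)
{
	struct skb_head_model skb, child;

	memset(&skb, 0, sizeof(skb));
	memset(&child, 0, sizeof(child));
	skb.dev = NULL;
	skb.rbnode.rb_left = &child.rbnode;	/* link a left child */
	return skb.dev != NULL;
}
```

The model only demonstrates the storage overlap. Whether the TCP receive path really touches rbnode before the security hook runs is the open question raised in the message above.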