Re: [BUG] kernel stack corruption during/after Netlabel error
On Thu, 30 Nov 2017, Eric Dumazet wrote:

> On Wed, 2017-11-29 at 19:16 -0800, Casey Schaufler wrote:
> > On 11/29/2017 4:31 PM, James Morris wrote:
> > > On Wed, 29 Nov 2017, Casey Schaufler wrote:
> > >
> > > > I see that there is a proposed fix later in the thread, but I don't see
> > > > the patch. Could you send it to me, so I can try it on my problem?
> > >
> > > Forwarded off-list.
> >
> > The patch does fix the problem I was seeing in Smack.
>
> Can you guys test the following more complete patch ?
>
> It should cover IPv4 and IPv6, and also the corner cases.

Tested-by: James Morris

-- 
James Morris
Re: [BUG] kernel stack corruption during/after Netlabel error
On 11/30/2017 9:57 AM, Eric Dumazet wrote:
> On Thu, 2017-11-30 at 10:30 -0700, David Ahern wrote:
>> On 11/30/17 8:44 AM, David Ahern wrote:
>>> On 11/30/17 3:50 AM, Eric Dumazet wrote:
>>>> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>>>>
>>>>  	th = (const struct tcphdr *)skb->data;
>>>>  	iph = ip_hdr(skb);
>>>> -	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
>>>> -	 * barrier() makes sure compiler wont play fool^Waliasing games.
>>>> -	 */
>>>> -	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
>>>> -		sizeof(struct inet_skb_parm));
>>>> -	barrier();
>>>> -
>>>> -	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
>>>> -	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
>>>> -				    skb->len - th->doff * 4);
>>>> -	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
>>>> -	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
>>>> -	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
>>>> -	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
>>>> -	TCP_SKB_CB(skb)->sacked = 0;
>>>> -	TCP_SKB_CB(skb)->has_rxtstamp =
>>>> -			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
>>>> -
>>>>  lookup:
>>>>  	sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>>>>  			       th->dest, sdif, &refcounted);
>>> I believe moving the above is going to affect lookups with VRF. Let me
>>> take a look before this gets committed.
>>>
>> Eric:
>>
>> Can you add this to the patch? Fixes socket lookups with VRF which
>> stashes a flag in the cb.

I've done my testing and it works both ways for Smack.

>>
>> Thanks,
>>
>> diff --git a/include/net/tcp.h b/include/net/tcp.h
>> index 4e09398009c1..6c020015d556 100644
>> --- a/include/net/tcp.h
>> +++ b/include/net/tcp.h
>> @@ -849,7 +849,7 @@ static inline bool inet_exact_dif_match(struct net *net, struct sk_buff *skb)
>>  {
>>  #if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
>>  	if (!net->ipv4.sysctl_tcp_l3mdev_accept &&
>> -	    skb && ipv4_l3mdev_skb(TCP_SKB_CB(skb)->header.h4.flags))
>> +	    skb && ipv4_l3mdev_skb(IPCB(skb)->flags))
>>  		return true;
>>  #endif
>>  	return false;
>
> I wonder if this should not be in a separate patch ?
>
> Bug was added in 971f10eca186cab238c49daa91f703c5a001b0b1 ("tcp: better
> TCP_SKB_CB layout to reduce cache line misses") in linux 3.18
>
> While VRF was added later.
>
> If you agree, I will prepare a patch series, with different Fixes tag
> so that David can decide which path needs to be backported into each
> stable version.
>
> Thanks.
Re: [BUG] kernel stack corruption during/after Netlabel error
On 11/30/17 10:57 AM, Eric Dumazet wrote:
> I wonder if this should not be in a separate patch ?
>
> Bug was added in 971f10eca186cab238c49daa91f703c5a001b0b1 ("tcp: better
> TCP_SKB_CB layout to reduce cache line misses") in linux 3.18
>
> While VRF was added later.
>
> If you agree, I will prepare a patch series, with different Fixes tag
> so that David can decide which path needs to be backported into each
> stable version.
>

That sounds fine to me.
Re: [BUG] kernel stack corruption during/after Netlabel error
On Thu, 2017-11-30 at 10:30 -0700, David Ahern wrote:
> On 11/30/17 8:44 AM, David Ahern wrote:
> > On 11/30/17 3:50 AM, Eric Dumazet wrote:
> > > @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > >
> > >  	th = (const struct tcphdr *)skb->data;
> > >  	iph = ip_hdr(skb);
> > > -	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
> > > -	 * barrier() makes sure compiler wont play fool^Waliasing games.
> > > -	 */
> > > -	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> > > -		sizeof(struct inet_skb_parm));
> > > -	barrier();
> > > -
> > > -	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> > > -	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> > > -				    skb->len - th->doff * 4);
> > > -	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> > > -	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> > > -	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> > > -	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> > > -	TCP_SKB_CB(skb)->sacked = 0;
> > > -	TCP_SKB_CB(skb)->has_rxtstamp =
> > > -			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> > > -
> > >  lookup:
> > >  	sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
> > >  			       th->dest, sdif, &refcounted);
> >
> > I believe moving the above is going to affect lookups with VRF. Let me
> > take a look before this gets committed.
> >
>
> Eric:
>
> Can you add this to the patch? Fixes socket lookups with VRF which
> stashes a flag in the cb.
>
> Thanks,
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 4e09398009c1..6c020015d556 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -849,7 +849,7 @@ static inline bool inet_exact_dif_match(struct net *net, struct sk_buff *skb)
>  {
>  #if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
>  	if (!net->ipv4.sysctl_tcp_l3mdev_accept &&
> -	    skb && ipv4_l3mdev_skb(TCP_SKB_CB(skb)->header.h4.flags))
> +	    skb && ipv4_l3mdev_skb(IPCB(skb)->flags))
>  		return true;
>  #endif
>  	return false;

I wonder if this should not be in a separate patch ?

Bug was added in 971f10eca186cab238c49daa91f703c5a001b0b1 ("tcp: better
TCP_SKB_CB layout to reduce cache line misses") in linux 3.18

While VRF was added later.

If you agree, I will prepare a patch series, with different Fixes tag
so that David can decide which path needs to be backported into each
stable version.

Thanks.
Re: [BUG] kernel stack corruption during/after Netlabel error
On 11/30/17 8:44 AM, David Ahern wrote:
> On 11/30/17 3:50 AM, Eric Dumazet wrote:
>> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>>
>>  	th = (const struct tcphdr *)skb->data;
>>  	iph = ip_hdr(skb);
>> -	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
>> -	 * barrier() makes sure compiler wont play fool^Waliasing games.
>> -	 */
>> -	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
>> -		sizeof(struct inet_skb_parm));
>> -	barrier();
>> -
>> -	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
>> -	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
>> -				    skb->len - th->doff * 4);
>> -	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
>> -	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
>> -	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
>> -	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
>> -	TCP_SKB_CB(skb)->sacked = 0;
>> -	TCP_SKB_CB(skb)->has_rxtstamp =
>> -			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
>> -
>>  lookup:
>>  	sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>>  			       th->dest, sdif, &refcounted);
>
> I believe moving the above is going to affect lookups with VRF. Let me
> take a look before this gets committed.
>

Eric:

Can you add this to the patch? Fixes socket lookups with VRF which
stashes a flag in the cb.

Thanks,

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 4e09398009c1..6c020015d556 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -849,7 +849,7 @@ static inline bool inet_exact_dif_match(struct net *net, struct sk_buff *skb)
 {
 #if IS_ENABLED(CONFIG_NET_L3_MASTER_DEV)
 	if (!net->ipv4.sysctl_tcp_l3mdev_accept &&
-	    skb && ipv4_l3mdev_skb(TCP_SKB_CB(skb)->header.h4.flags))
+	    skb && ipv4_l3mdev_skb(IPCB(skb)->flags))
 		return true;
 #endif
 	return false;
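Why the flag has to be read through IPCB() once the cb relocation is
deferred until after the socket lookup can be illustrated in user space.
This is only a sketch under stated assumptions, not kernel code: struct
ip_cb, struct tcp_cb, struct pkt, L3SLAVE_FLAG, match_via_tcp_view(),
match_via_ip_view() and make_pkt_at_lookup() are hypothetical stand-ins
with a simplified field layout, and the sysctl part of the real check is
left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CB_SIZE      48		/* models the size of skb->cb[] */
#define L3SLAVE_FLAG 0x0080	/* illustrative value, stands in for the kernel's L3 slave flag */

/* Hypothetical, simplified stand-in for struct inet_skb_parm. */
struct ip_cb {
	int32_t  iif;
	uint8_t  opt[16];
	uint16_t flags;
	uint16_t frag_max_size;
};

/* Hypothetical, simplified stand-in for struct tcp_skb_cb. */
struct tcp_cb {
	uint32_t seq, end_seq, tcp_tw_isn;
	uint8_t  tcp_flags, sacked, ip_dsfield, misc;
	uint32_t ack_seq;
	struct ip_cb h4;	/* relocated copy of the IP block */
};

struct pkt {
	unsigned char cb[CB_SIZE];
};

/* Old check: read flags from the relocated copy inside the TCP layout. */
static bool match_via_tcp_view(const struct pkt *p)
{
	struct tcp_cb tcb;

	memcpy(&tcb, p->cb, sizeof(tcb));
	return (tcb.h4.flags & L3SLAVE_FLAG) != 0;
}

/* New check: read flags from the IP block at the start of cb[]. */
static bool match_via_ip_view(const struct pkt *p)
{
	struct ip_cb ip;

	memcpy(&ip, p->cb, sizeof(ip));
	return (ip.flags & L3SLAVE_FLAG) != 0;
}

/* Packet state at lookup time when the relocation has not happened yet:
 * the IP block still sits at offset 0 and the rest of cb[] is zero.
 */
static struct pkt make_pkt_at_lookup(void)
{
	struct pkt p;
	struct ip_cb ip = { .iif = 7, .flags = L3SLAVE_FLAG };

	memset(&p, 0, sizeof(p));
	memcpy(p.cb, &ip, sizeof(ip));
	return p;
}
```

In this model, a packet built by make_pkt_at_lookup() has the flag
visible only through match_via_ip_view(); match_via_tcp_view() looks at
the later offset, where nothing has been copied yet.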
Re: [BUG] kernel stack corruption during/after Netlabel error
On Thu, Nov 30, 2017 at 7:47 AM, Paul Moore wrote:
> On Thu, Nov 30, 2017 at 5:50 AM, Eric Dumazet wrote:
>> On Wed, 2017-11-29 at 19:16 -0800, Casey Schaufler wrote:
>>> On 11/29/2017 4:31 PM, James Morris wrote:
>>> > On Wed, 29 Nov 2017, Casey Schaufler wrote:
>>> >
>>> > > I see that there is a proposed fix later in the thread, but I don't see
>>> > > the patch. Could you send it to me, so I can try it on my problem?
>>> >
>>> > Forwarded off-list.
>>>
>>> The patch does fix the problem I was seeing in Smack.
>>
>> Can you guys test the following more complete patch ?
>>
>> It should cover IPv4 and IPv6, and also the corner cases.
>>
>> ( Note that I squashed ipv6 fix in https://patchwork.ozlabs.org/patch/842844/
>> that I spotted while cooking this patch )
>
> Building a test kernel now, although it may take me a few hours to
> test it due to some commitments this morning.

I just realized I forgot to enable KASAN in the build, but I can verify
that the patch doesn't break anything in the selinux-testsuite.

Tested-by: Paul Moore

>> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
>> index c6bc0c4d19c624888b0d0b5a4246c7183edf63f5..77ea45da0fe9c746907a312989658af3ad3b198d 100644
>> --- a/net/ipv4/tcp_ipv4.c
>> +++ b/net/ipv4/tcp_ipv4.c
>> @@ -1591,6 +1591,34 @@ int tcp_filter(struct sock *sk, struct sk_buff *skb)
>>  }
>>  EXPORT_SYMBOL(tcp_filter);
>>
>> +static void tcp_v4_restore_cb(struct sk_buff *skb)
>> +{
>> +	memmove(IPCB(skb), &TCP_SKB_CB(skb)->header.h4,
>> +		sizeof(struct inet_skb_parm));
>> +}
>> +
>> +static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
>> +			   const struct tcphdr *th)
>> +{
>> +	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
>> +	 * barrier() makes sure compiler wont play fool^Waliasing games.
>> +	 */
>> +	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
>> +		sizeof(struct inet_skb_parm));
>> +	barrier();
>> +
>> +	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
>> +	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
>> +				    skb->len - th->doff * 4);
>> +	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
>> +	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
>> +	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
>> +	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
>> +	TCP_SKB_CB(skb)->sacked = 0;
>> +	TCP_SKB_CB(skb)->has_rxtstamp =
>> +			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
>> +}
>> +
>>  /*
>>   *	From tcp_input.c
>>   */
>> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>>
>>  	th = (const struct tcphdr *)skb->data;
>>  	iph = ip_hdr(skb);
>> -	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
>> -	 * barrier() makes sure compiler wont play fool^Waliasing games.
>> -	 */
>> -	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
>> -		sizeof(struct inet_skb_parm));
>> -	barrier();
>> -
>> -	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
>> -	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
>> -				    skb->len - th->doff * 4);
>> -	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
>> -	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
>> -	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
>> -	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
>> -	TCP_SKB_CB(skb)->sacked = 0;
>> -	TCP_SKB_CB(skb)->has_rxtstamp =
>> -			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
>> -
>>  lookup:
>>  	sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>>  			       th->dest, sdif, &refcounted);
>> @@ -1679,14 +1689,19 @@ int tcp_v4_rcv(struct sk_buff *skb)
>>  		sock_hold(sk);
>>  		refcounted = true;
>>  		nsk = NULL;
>> -		if (!tcp_filter(sk, skb))
>> +		if (!tcp_filter(sk, skb)) {
>> +			th = (const struct tcphdr *)skb->data;
>> +			iph = ip_hdr(skb);
>> +			tcp_v4_fill_cb(skb, iph, th);
>>  			nsk = tcp_check_req(sk, skb, req, false);
>> +		}
>>  		if (!nsk) {
>>  			reqsk_put(req);
>>  			goto discard_and_relse;
>>  		}
>>  		if (nsk == sk) {
>>  			reqsk_put(req);
>> +			tcp_v4_restore_cb(skb);
>>  		} else if (tcp_child_process(sk, nsk, skb)) {
>>  			tcp_v4_send_reset(nsk, skb);
>>  			goto discard_and_relse;
>> @@ -1712,6 +1727,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
>>  		goto
Re: [BUG] kernel stack corruption during/after Netlabel error
On 11/30/17 3:50 AM, Eric Dumazet wrote:
> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>
>  	th = (const struct tcphdr *)skb->data;
>  	iph = ip_hdr(skb);
> -	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
> -	 * barrier() makes sure compiler wont play fool^Waliasing games.
> -	 */
> -	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> -		sizeof(struct inet_skb_parm));
> -	barrier();
> -
> -	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> -	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> -				    skb->len - th->doff * 4);
> -	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> -	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> -	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> -	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> -	TCP_SKB_CB(skb)->sacked = 0;
> -	TCP_SKB_CB(skb)->has_rxtstamp =
> -			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> -
>  lookup:
>  	sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>  			       th->dest, sdif, &refcounted);

I believe moving the above is going to affect lookups with VRF. Let me
take a look before this gets committed.
Re: [BUG] kernel stack corruption during/after Netlabel error
On 11/30/2017 2:50 AM, Eric Dumazet wrote:
> On Wed, 2017-11-29 at 19:16 -0800, Casey Schaufler wrote:
>> On 11/29/2017 4:31 PM, James Morris wrote:
>>> On Wed, 29 Nov 2017, Casey Schaufler wrote:
>>>
>>>> I see that there is a proposed fix later in the thread, but I don't see
>>>> the patch. Could you send it to me, so I can try it on my problem?
>>> Forwarded off-list.
>> The patch does fix the problem I was seeing in Smack.
> Can you guys test the following more complete patch ?

My tests are passing. Thank you.

Tested-by: Casey Schaufler

>
> It should cover IPv4 and IPv6, and also the corner cases.
>
> ( Note that I squashed ipv6 fix in https://patchwork.ozlabs.org/patch/842844/
> that I spotted while cooking this patch )
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index c6bc0c4d19c624888b0d0b5a4246c7183edf63f5..77ea45da0fe9c746907a312989658af3ad3b198d 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1591,6 +1591,34 @@ int tcp_filter(struct sock *sk, struct sk_buff *skb)
>  }
>  EXPORT_SYMBOL(tcp_filter);
>
> +static void tcp_v4_restore_cb(struct sk_buff *skb)
> +{
> +	memmove(IPCB(skb), &TCP_SKB_CB(skb)->header.h4,
> +		sizeof(struct inet_skb_parm));
> +}
> +
> +static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
> +			   const struct tcphdr *th)
> +{
> +	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
> +	 * barrier() makes sure compiler wont play fool^Waliasing games.
> +	 */
> +	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> +		sizeof(struct inet_skb_parm));
> +	barrier();
> +
> +	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> +	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> +				    skb->len - th->doff * 4);
> +	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> +	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> +	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> +	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> +	TCP_SKB_CB(skb)->sacked = 0;
> +	TCP_SKB_CB(skb)->has_rxtstamp =
> +			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> +}
> +
>  /*
>   *	From tcp_input.c
>   */
> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>
>  	th = (const struct tcphdr *)skb->data;
>  	iph = ip_hdr(skb);
> -	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
> -	 * barrier() makes sure compiler wont play fool^Waliasing games.
> -	 */
> -	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> -		sizeof(struct inet_skb_parm));
> -	barrier();
> -
> -	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> -	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> -				    skb->len - th->doff * 4);
> -	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> -	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> -	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> -	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> -	TCP_SKB_CB(skb)->sacked = 0;
> -	TCP_SKB_CB(skb)->has_rxtstamp =
> -			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> -
>  lookup:
>  	sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>  			       th->dest, sdif, &refcounted);
> @@ -1679,14 +1689,19 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  		sock_hold(sk);
>  		refcounted = true;
>  		nsk = NULL;
> -		if (!tcp_filter(sk, skb))
> +		if (!tcp_filter(sk, skb)) {
> +			th = (const struct tcphdr *)skb->data;
> +			iph = ip_hdr(skb);
> +			tcp_v4_fill_cb(skb, iph, th);
>  			nsk = tcp_check_req(sk, skb, req, false);
> +		}
>  		if (!nsk) {
>  			reqsk_put(req);
>  			goto discard_and_relse;
>  		}
>  		if (nsk == sk) {
>  			reqsk_put(req);
> +			tcp_v4_restore_cb(skb);
>  		} else if (tcp_child_process(sk, nsk, skb)) {
>  			tcp_v4_send_reset(nsk, skb);
>  			goto discard_and_relse;
> @@ -1712,6 +1727,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  		goto discard_and_relse;
>  	th = (const struct tcphdr *)skb->data;
>  	iph = ip_hdr(skb);
> +	tcp_v4_fill_cb(skb, iph, th);
>
>  	skb->dev = NULL;
>
> @@ -1742,6 +1758,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  	if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
>  		goto discard_it;
>
> +	tcp_v4_fill_cb(skb, iph, th);
> +
>  	if (tcp_checksum_complete(skb)) {
>  csum_error:
>  		__TCP_INC_STATS(net, TCP_MIB_CSUMERRORS);
> @@ -1768,6 +1786,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  		goto discard_it;
>  	}
>
> +
Re: [BUG] kernel stack corruption during/after Netlabel error
On 11/30/2017 2:50 AM, Eric Dumazet wrote:
> On Wed, 2017-11-29 at 19:16 -0800, Casey Schaufler wrote:
>> On 11/29/2017 4:31 PM, James Morris wrote:
>>> On Wed, 29 Nov 2017, Casey Schaufler wrote:
>>>
>>>> I see that there is a proposed fix later in the thread, but I don't see
>>>> the patch. Could you send it to me, so I can try it on my problem?
>>> Forwarded off-list.
>> The patch does fix the problem I was seeing in Smack.
> Can you guys test the following more complete patch ?

Building now. I should have results soon.

>
> It should cover IPv4 and IPv6, and also the corner cases.
>
> ( Note that I squashed ipv6 fix in https://patchwork.ozlabs.org/patch/842844/
> that I spotted while cooking this patch )
>
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index c6bc0c4d19c624888b0d0b5a4246c7183edf63f5..77ea45da0fe9c746907a312989658af3ad3b198d 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1591,6 +1591,34 @@ int tcp_filter(struct sock *sk, struct sk_buff *skb)
>  }
>  EXPORT_SYMBOL(tcp_filter);
>
> +static void tcp_v4_restore_cb(struct sk_buff *skb)
> +{
> +	memmove(IPCB(skb), &TCP_SKB_CB(skb)->header.h4,
> +		sizeof(struct inet_skb_parm));
> +}
> +
> +static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
> +			   const struct tcphdr *th)
> +{
> +	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
> +	 * barrier() makes sure compiler wont play fool^Waliasing games.
> +	 */
> +	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> +		sizeof(struct inet_skb_parm));
> +	barrier();
> +
> +	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> +	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> +				    skb->len - th->doff * 4);
> +	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> +	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> +	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> +	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> +	TCP_SKB_CB(skb)->sacked = 0;
> +	TCP_SKB_CB(skb)->has_rxtstamp =
> +			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> +}
> +
>  /*
>   *	From tcp_input.c
>   */
> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>
>  	th = (const struct tcphdr *)skb->data;
>  	iph = ip_hdr(skb);
> -	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
> -	 * barrier() makes sure compiler wont play fool^Waliasing games.
> -	 */
> -	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> -		sizeof(struct inet_skb_parm));
> -	barrier();
> -
> -	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> -	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> -				    skb->len - th->doff * 4);
> -	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> -	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> -	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> -	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> -	TCP_SKB_CB(skb)->sacked = 0;
> -	TCP_SKB_CB(skb)->has_rxtstamp =
> -			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> -
>  lookup:
>  	sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>  			       th->dest, sdif, &refcounted);
> @@ -1679,14 +1689,19 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  		sock_hold(sk);
>  		refcounted = true;
>  		nsk = NULL;
> -		if (!tcp_filter(sk, skb))
> +		if (!tcp_filter(sk, skb)) {
> +			th = (const struct tcphdr *)skb->data;
> +			iph = ip_hdr(skb);
> +			tcp_v4_fill_cb(skb, iph, th);
>  			nsk = tcp_check_req(sk, skb, req, false);
> +		}
>  		if (!nsk) {
>  			reqsk_put(req);
>  			goto discard_and_relse;
>  		}
>  		if (nsk == sk) {
>  			reqsk_put(req);
> +			tcp_v4_restore_cb(skb);
>  		} else if (tcp_child_process(sk, nsk, skb)) {
>  			tcp_v4_send_reset(nsk, skb);
>  			goto discard_and_relse;
> @@ -1712,6 +1727,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  		goto discard_and_relse;
>  	th = (const struct tcphdr *)skb->data;
>  	iph = ip_hdr(skb);
> +	tcp_v4_fill_cb(skb, iph, th);
>
>  	skb->dev = NULL;
>
> @@ -1742,6 +1758,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  	if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
>  		goto discard_it;
>
> +	tcp_v4_fill_cb(skb, iph, th);
> +
>  	if (tcp_checksum_complete(skb)) {
>  csum_error:
>  		__TCP_INC_STATS(net, TCP_MIB_CSUMERRORS);
> @@ -1768,6 +1786,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  		goto discard_it;
>  	}
>
> +	tcp_v4_fill_cb(skb, iph, th);
> +
>  	if
Re: [BUG] kernel stack corruption during/after Netlabel error
On Thu, Nov 30, 2017 at 5:50 AM, Eric Dumazet wrote:
> On Wed, 2017-11-29 at 19:16 -0800, Casey Schaufler wrote:
>> On 11/29/2017 4:31 PM, James Morris wrote:
>> > On Wed, 29 Nov 2017, Casey Schaufler wrote:
>> >
>> > > I see that there is a proposed fix later in the thread, but I don't see
>> > > the patch. Could you send it to me, so I can try it on my problem?
>> >
>> > Forwarded off-list.
>>
>> The patch does fix the problem I was seeing in Smack.
>
> Can you guys test the following more complete patch ?
>
> It should cover IPv4 and IPv6, and also the corner cases.
>
> ( Note that I squashed ipv6 fix in https://patchwork.ozlabs.org/patch/842844/
> that I spotted while cooking this patch )

Building a test kernel now, although it may take me a few hours to
test it due to some commitments this morning.

> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index c6bc0c4d19c624888b0d0b5a4246c7183edf63f5..77ea45da0fe9c746907a312989658af3ad3b198d 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -1591,6 +1591,34 @@ int tcp_filter(struct sock *sk, struct sk_buff *skb)
>  }
>  EXPORT_SYMBOL(tcp_filter);
>
> +static void tcp_v4_restore_cb(struct sk_buff *skb)
> +{
> +	memmove(IPCB(skb), &TCP_SKB_CB(skb)->header.h4,
> +		sizeof(struct inet_skb_parm));
> +}
> +
> +static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
> +			   const struct tcphdr *th)
> +{
> +	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
> +	 * barrier() makes sure compiler wont play fool^Waliasing games.
> +	 */
> +	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> +		sizeof(struct inet_skb_parm));
> +	barrier();
> +
> +	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> +	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> +				    skb->len - th->doff * 4);
> +	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> +	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> +	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> +	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> +	TCP_SKB_CB(skb)->sacked = 0;
> +	TCP_SKB_CB(skb)->has_rxtstamp =
> +			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> +}
> +
>  /*
>   *	From tcp_input.c
>   */
> @@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
>
>  	th = (const struct tcphdr *)skb->data;
>  	iph = ip_hdr(skb);
> -	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
> -	 * barrier() makes sure compiler wont play fool^Waliasing games.
> -	 */
> -	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> -		sizeof(struct inet_skb_parm));
> -	barrier();
> -
> -	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
> -	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
> -				    skb->len - th->doff * 4);
> -	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
> -	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
> -	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
> -	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
> -	TCP_SKB_CB(skb)->sacked = 0;
> -	TCP_SKB_CB(skb)->has_rxtstamp =
> -			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
> -
>  lookup:
>  	sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
>  			       th->dest, sdif, &refcounted);
> @@ -1679,14 +1689,19 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  		sock_hold(sk);
>  		refcounted = true;
>  		nsk = NULL;
> -		if (!tcp_filter(sk, skb))
> +		if (!tcp_filter(sk, skb)) {
> +			th = (const struct tcphdr *)skb->data;
> +			iph = ip_hdr(skb);
> +			tcp_v4_fill_cb(skb, iph, th);
>  			nsk = tcp_check_req(sk, skb, req, false);
> +		}
>  		if (!nsk) {
>  			reqsk_put(req);
>  			goto discard_and_relse;
>  		}
>  		if (nsk == sk) {
>  			reqsk_put(req);
> +			tcp_v4_restore_cb(skb);
>  		} else if (tcp_child_process(sk, nsk, skb)) {
>  			tcp_v4_send_reset(nsk, skb);
>  			goto discard_and_relse;
> @@ -1712,6 +1727,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  		goto discard_and_relse;
>  	th = (const struct tcphdr *)skb->data;
>  	iph = ip_hdr(skb);
> +	tcp_v4_fill_cb(skb, iph, th);
>
>  	skb->dev = NULL;
>
> @@ -1742,6 +1758,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
>  	if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
>  		goto discard_it;
>
> +	tcp_v4_fill_cb(skb, iph, th);
> +
>  	if
Re: [BUG] kernel stack corruption during/after Netlabel error
On Wed, 2017-11-29 at 19:16 -0800, Casey Schaufler wrote:
> On 11/29/2017 4:31 PM, James Morris wrote:
> > On Wed, 29 Nov 2017, Casey Schaufler wrote:
> >
> > > I see that there is a proposed fix later in the thread, but I don't see
> > > the patch. Could you send it to me, so I can try it on my problem?
> >
> > Forwarded off-list.
>
> The patch does fix the problem I was seeing in Smack.

Can you guys test the following more complete patch ?

It should cover IPv4 and IPv6, and also the corner cases.

( Note that I squashed ipv6 fix in https://patchwork.ozlabs.org/patch/842844/
that I spotted while cooking this patch )

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index c6bc0c4d19c624888b0d0b5a4246c7183edf63f5..77ea45da0fe9c746907a312989658af3ad3b198d 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1591,6 +1591,34 @@ int tcp_filter(struct sock *sk, struct sk_buff *skb)
 }
 EXPORT_SYMBOL(tcp_filter);
 
+static void tcp_v4_restore_cb(struct sk_buff *skb)
+{
+	memmove(IPCB(skb), &TCP_SKB_CB(skb)->header.h4,
+		sizeof(struct inet_skb_parm));
+}
+
+static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
+			   const struct tcphdr *th)
+{
+	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
+	 * barrier() makes sure compiler wont play fool^Waliasing games.
+	 */
+	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
+		sizeof(struct inet_skb_parm));
+	barrier();
+
+	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
+	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
+				    skb->len - th->doff * 4);
+	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
+	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
+	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
+	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
+	TCP_SKB_CB(skb)->sacked = 0;
+	TCP_SKB_CB(skb)->has_rxtstamp =
+			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
+}
+
 /*
  *	From tcp_input.c
  */
@@ -1631,24 +1659,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
 
 	th = (const struct tcphdr *)skb->data;
 	iph = ip_hdr(skb);
-	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
-	 * barrier() makes sure compiler wont play fool^Waliasing games.
-	 */
-	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
-		sizeof(struct inet_skb_parm));
-	barrier();
-
-	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
-	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
-				    skb->len - th->doff * 4);
-	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
-	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
-	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
-	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
-	TCP_SKB_CB(skb)->sacked = 0;
-	TCP_SKB_CB(skb)->has_rxtstamp =
-			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
-
 lookup:
 	sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
 			       th->dest, sdif, &refcounted);
@@ -1679,14 +1689,19 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		sock_hold(sk);
 		refcounted = true;
 		nsk = NULL;
-		if (!tcp_filter(sk, skb))
+		if (!tcp_filter(sk, skb)) {
+			th = (const struct tcphdr *)skb->data;
+			iph = ip_hdr(skb);
+			tcp_v4_fill_cb(skb, iph, th);
 			nsk = tcp_check_req(sk, skb, req, false);
+		}
 		if (!nsk) {
 			reqsk_put(req);
 			goto discard_and_relse;
 		}
 		if (nsk == sk) {
 			reqsk_put(req);
+			tcp_v4_restore_cb(skb);
 		} else if (tcp_child_process(sk, nsk, skb)) {
 			tcp_v4_send_reset(nsk, skb);
 			goto discard_and_relse;
@@ -1712,6 +1727,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		goto discard_and_relse;
 	th = (const struct tcphdr *)skb->data;
 	iph = ip_hdr(skb);
+	tcp_v4_fill_cb(skb, iph, th);
 
 	skb->dev = NULL;
 
@@ -1742,6 +1758,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
 		goto discard_it;
 
+	tcp_v4_fill_cb(skb, iph, th);
+
 	if (tcp_checksum_complete(skb)) {
 csum_error:
 		__TCP_INC_STATS(net, TCP_MIB_CSUMERRORS);
@@ -1768,6 +1786,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		goto discard_it;
 	}
 
+	tcp_v4_fill_cb(skb, iph, th);
+
 	if (tcp_checksum_complete(skb)) {
 		inet_twsk_put(inet_twsk(sk));
 		goto csum_error;
@@ -1784,6 +1804,7 @@ int tcp_v4_rcv(struct sk_buff
Re: [BUG] kernel stack corruption during/after Netlabel error
On 11/29/2017 4:31 PM, James Morris wrote:
> On Wed, 29 Nov 2017, Casey Schaufler wrote:
>
>> I see that there is a proposed fix later in the thread, but I don't see
>> the patch. Could you send it to me, so I can try it on my problem?
> Forwarded off-list.

The patch does fix the problem I was seeing in Smack.

>
> Interestingly, I didn't see the KASAN output email from Stephen here.
Re: [BUG] kernel stack corruption during/after Netlabel error
On Wed, 29 Nov 2017, Casey Schaufler wrote:

> I see that there is a proposed fix later in the thread, but I don't see
> the patch. Could you send it to me, so I can try it on my problem?

Forwarded off-list.

Interestingly, I didn't see the KASAN output email from Stephen here.

-- 
James Morris
Re: [BUG] kernel stack corruption during/after Netlabel error
On 11/29/2017 2:26 AM, James Morris wrote:
> I'm seeing a kernel stack corruption bug (detected via gcc) when running
> the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd inet_socket test:
>
> https://github.com/SELinuxProject/selinux-testsuite/blob/master/tests/inet_socket/test
>
>   # Verify that unauthorized client cannot communicate with the server.
>   $result = system
>   "runcon -t test_inet_bad_client_t -- $basedir/client stream 127.0.0.1 65535 2>&1";
>
> This correctly causes an access control error in the Netlabel code, and
> the bug seems to be triggered during the ICMP send:
>
> ..
>
> This is mostly reliable, and I'm only seeing it on bare metal (not in a
> virtualbox vm).
>
> The SELinux skb parse error at the start only sometimes appears, and
> looking at the code, I suspect some kind of memory corruption being the
> cause at that point (basic packet header checks).
>
> I bisected the bug down to the following change:
>
> commit bffa72cf7f9df842f0016ba03586039296b4caaf
> Author: Eric Dumazet
> Date:   Tue Sep 19 05:14:24 2017 -0700
>
>     net: sk_buff rbnode reorg
> ...
>
>
> Anyone else able to reproduce this, or have any ideas on what's happening?

I have also bisected a problem to this change. I do not have a trace
because the problem manifests as a hard system hang without a trace being
presented. The issue arises when Smack attempts to relabel a TCP socket
using netlbl_sock_setattr().

I see that there is a proposed fix later in the thread, but I don't see
the patch. Could you send it to me, so I can try it on my problem?

Thank you.

>
> - James
Re: [BUG] kernel stack corruption during/after Netlabel error
On Wed, 29 Nov 2017, Eric Dumazet wrote:

> On Wed, 2017-11-29 at 12:23 -0800, Eric Dumazet wrote:
> >
> > I suspect this exposes an ancient bug, caused by fact that TCP moves
> > IP[6]CB in skb->cb[]
> >
> > Basically the 2nd tcp_filter() added in commit
> > 8fac365f63c866a00015fa13932d8ffc584518b8
> > ("tcp: Add a tcp_filter hook before handle ack packet") was not
> > expecting selinux code being called a 2nd time,
> > while skb->cb[] has been mangled [1]
> >
> > [1]
> >         memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
> >                 sizeof(struct inet_skb_parm));
>
> Please try this fix for IPv4 (a similar patch will be needed for IPv6)
>
>  net/ipv4/tcp_ipv4.c |   51 ++
>  1 file changed, 32 insertions(+), 19 deletions(-)

Works for me, no crashes with the testsuite running in a loop.

Tested-by: James Morris

-- 
James Morris
Re: [BUG] kernel stack corruption during/after Netlabel error
On Wed, 2017-11-29 at 12:23 -0800, Eric Dumazet wrote:
>
> I suspect this exposes an ancient bug, caused by fact that TCP moves
> IP[6]CB in skb->cb[]
>
> Basically the 2nd tcp_filter() added in commit
> 8fac365f63c866a00015fa13932d8ffc584518b8
> ("tcp: Add a tcp_filter hook before handle ack packet") was not
> expecting selinux code being called a 2nd time,
> while skb->cb[] has been mangled [1]
>
> [1]
>         memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
>                 sizeof(struct inet_skb_parm));

Please try this fix for IPv4 (a similar patch will be needed for IPv6)

 net/ipv4/tcp_ipv4.c |   51 ++
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index c6bc0c4d19c624888b0d0b5a4246c7183edf63f5..912928105942b9714dda9132e45961ab1baf0852 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1591,6 +1591,28 @@ int tcp_filter(struct sock *sk, struct sk_buff *skb)
 }
 EXPORT_SYMBOL(tcp_filter);
 
+static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
+			   const struct tcphdr *th)
+{
+	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
+	 * barrier() makes sure compiler wont play fool^Waliasing games.
+	 */
+	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
+		sizeof(struct inet_skb_parm));
+	barrier();
+
+	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
+	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
+				    skb->len - th->doff * 4);
+	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
+	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
+	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
+	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
+	TCP_SKB_CB(skb)->sacked = 0;
+	TCP_SKB_CB(skb)->has_rxtstamp =
+			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
+}
+
 /*
  *	From tcp_input.c
  */
@@ -1631,24 +1653,6 @@ int tcp_v4_rcv(struct sk_buff *skb)
 
 	th = (const struct tcphdr *)skb->data;
 	iph = ip_hdr(skb);
-	/* This is tricky : We move IPCB at its correct location into TCP_SKB_CB()
-	 * barrier() makes sure compiler wont play fool^Waliasing games.
-	 */
-	memmove(&TCP_SKB_CB(skb)->header.h4, IPCB(skb),
-		sizeof(struct inet_skb_parm));
-	barrier();
-
-	TCP_SKB_CB(skb)->seq = ntohl(th->seq);
-	TCP_SKB_CB(skb)->end_seq = (TCP_SKB_CB(skb)->seq + th->syn + th->fin +
-				    skb->len - th->doff * 4);
-	TCP_SKB_CB(skb)->ack_seq = ntohl(th->ack_seq);
-	TCP_SKB_CB(skb)->tcp_flags = tcp_flag_byte(th);
-	TCP_SKB_CB(skb)->tcp_tw_isn = 0;
-	TCP_SKB_CB(skb)->ip_dsfield = ipv4_get_dsfield(iph);
-	TCP_SKB_CB(skb)->sacked = 0;
-	TCP_SKB_CB(skb)->has_rxtstamp =
-			skb->tstamp || skb_hwtstamps(skb)->hwtstamp;
-
 lookup:
 	sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
 			       th->dest, sdif, &refcounted);
@@ -1679,8 +1683,12 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		sock_hold(sk);
 		refcounted = true;
 		nsk = NULL;
-		if (!tcp_filter(sk, skb))
+		if (!tcp_filter(sk, skb)) {
+			th = (const struct tcphdr *)skb->data;
+			iph = ip_hdr(skb);
+			tcp_v4_fill_cb(skb, iph, th);
 			nsk = tcp_check_req(sk, skb, req, false);
+		}
 		if (!nsk) {
 			reqsk_put(req);
 			goto discard_and_relse;
@@ -1712,6 +1720,7 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		goto discard_and_relse;
 	th = (const struct tcphdr *)skb->data;
 	iph = ip_hdr(skb);
+	tcp_v4_fill_cb(skb, iph, th);
 
 	skb->dev = NULL;
 
@@ -1742,6 +1751,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
 		goto discard_it;
 
+	tcp_v4_fill_cb(skb, iph, th);
+
 	if (tcp_checksum_complete(skb)) {
 csum_error:
 		__TCP_INC_STATS(net, TCP_MIB_CSUMERRORS);
@@ -1768,6 +1779,8 @@ int tcp_v4_rcv(struct sk_buff *skb)
 		goto discard_it;
 	}
 
+	tcp_v4_fill_cb(skb, iph, th);
+
 	if (tcp_checksum_complete(skb)) {
 		inet_twsk_put(inet_twsk(sk));
 		goto csum_error;
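To illustrate why a second pass through the security hook can misparse the packet once skb->cb[] has been rearranged, here is a minimal userspace sketch. It is not kernel code: toy_skb, toy_ip_cb, toy_tcp_cb and every toy_* helper are hypothetical, shrunken stand-ins for struct sk_buff, struct inet_skb_parm and struct tcp_skb_cb, and the field layout is an assumption chosen only so that the TCP fields overlap the spot where the IP block starts out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CB_SIZE 48	/* same size as skb->cb[] */

/* Hypothetical, shrunken stand-in for struct inet_skb_parm (what IPCB() reads). */
struct toy_ip_cb {
	int iif;
	unsigned short flags;
	unsigned short frag_max_size;
};

/* Hypothetical stand-in for struct tcp_skb_cb: TCP fields first, IP block after. */
struct toy_tcp_cb {
	unsigned int seq;
	unsigned int end_seq;
	unsigned int ack_seq;
	struct toy_ip_cb header;
};

struct toy_skb {
	unsigned char cb[TOY_CB_SIZE];
};

/* IP layer view: the IP block lives at offset 0 of cb[]. */
static void toy_write_ipcb(struct toy_skb *skb, const struct toy_ip_cb *ip)
{
	memcpy(skb->cb, ip, sizeof(*ip));
}

static struct toy_ip_cb toy_read_ipcb(const struct toy_skb *skb)
{
	struct toy_ip_cb ip;

	memcpy(&ip, skb->cb, sizeof(ip));
	return ip;
}

/* TCP layer view: the IP block lives in the header slot of the TCP layout. */
static struct toy_ip_cb toy_read_tcp_header(const struct toy_skb *skb)
{
	struct toy_ip_cb ip;

	memcpy(&ip, skb->cb + offsetof(struct toy_tcp_cb, header), sizeof(ip));
	return ip;
}

/* Models the fill step: relocate the IP block, then write the TCP fields
 * over the bytes at offset 0 where the IP block used to be.
 */
static void toy_fill_tcp_cb(struct toy_skb *skb, unsigned int seq)
{
	struct toy_tcp_cb tcb;

	memmove(skb->cb + offsetof(struct toy_tcp_cb, header), skb->cb,
		sizeof(struct toy_ip_cb));
	memcpy(&tcb, skb->cb, sizeof(tcb));
	tcb.seq = seq;
	tcb.end_seq = seq + 1;
	tcb.ack_seq = 0;
	memcpy(skb->cb, &tcb, sizeof(tcb));
}
```

In this model, any caller that still uses toy_read_ipcb() after toy_fill_tcp_cb() gets TCP sequence numbers where it expects IP metadata, while toy_read_tcp_header() still sees the original values. That is the situation the patch above avoids by calling tcp_v4_fill_cb() only after tcp_filter() has run.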
Re: [BUG] kernel stack corruption during/after Netlabel error
On Wed, Nov 29, 2017 at 11:59 AM, Stephen Smalleywrote: > On Wed, 2017-11-29 at 09:34 -0800, Eric Dumazet wrote: >> On Wed, Nov 29, 2017 at 9:31 AM, Stephen Smalley >> wrote: >> > On Wed, 2017-11-29 at 21:26 +1100, James Morris wrote: >> > > I'm seeing a kernel stack corruption bug (detected via gcc) when >> > > running >> > > the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd >> > > inet_socket >> > > test: >> > > >> > > https://github.com/SELinuxProject/selinux-testsuite/blob/master/t >> > > ests >> > > /inet_socket/test >> > > >> > > # Verify that unauthorized client cannot communicate with the >> > > server. >> > > $result = system >> > > "runcon -t test_inet_bad_client_t -- $basedir/client stream >> > > 127.0.0.1 65535 2>&1"; >> > > >> > > This correctlly causes an access control error in the Netlabel >> > > code, >> > > and >> > > the bug seems to be triggered during the ICMP send: >> > > >> > > [ 339.806024] SELinux: failure in selinux_parse_skb(), unable to >> > > parse packet >> > > [ 339.822505] Kernel panic - not syncing: stack-protector: >> > > Kernel >> > > stack is corrupted in: 81745af5 >> > > [ 339.822505] >> > > [ 339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0- >> > > rc1- >> > > test #15 >> > > [ 339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS >> > > FWKT68A 01/19/2017 >> > > [ 339.885060] Call Trace: >> > > [ 339.896875] >> > > [ 339.908103] dump_stack+0x63/0x87 >> > > [ 339.920645] panic+0xe8/0x248 >> > > [ 339.932668] ? ip_push_pending_frames+0x33/0x40 >> > > [ 339.946328] ? icmp_send+0x525/0x530 >> > > [ 339.958861] ? kfree_skbmem+0x60/0x70 >> > > [ 339.971431] __stack_chk_fail+0x1b/0x20 >> > > [ 339.984049] icmp_send+0x525/0x530 >> > > [ 339.996205] ? netlbl_skbuff_err+0x36/0x40 >> > > [ 340.008997] ? selinux_netlbl_err+0x11/0x20 >> > > [ 340.021816] ? selinux_socket_sock_rcv_skb+0x211/0x230 >> > > [ 340.035529] ? security_sock_rcv_skb+0x3b/0x50 >> > > [ 340.048471] ? 
sk_filter_trim_cap+0x44/0x1c0 >> > > [ 340.061246] ? tcp_v4_inbound_md5_hash+0x69/0x1b0 >> > > [ 340.074562] ? tcp_filter+0x2c/0x40 >> > > [ 340.086400] ? tcp_v4_rcv+0x820/0xa20 >> > > [ 340.098329] ? ip_local_deliver_finish+0x71/0x1a0 >> > > [ 340.111279] ? ip_local_deliver+0x6f/0xe0 >> > > [ 340.123535] ? ip_rcv_finish+0x3a0/0x3a0 >> > > [ 340.135523] ? ip_rcv_finish+0xdb/0x3a0 >> > > [ 340.147442] ? ip_rcv+0x27c/0x3c0 >> > > [ 340.158668] ? inet_del_offload+0x40/0x40 >> > > [ 340.170580] ? __netif_receive_skb_core+0x4ac/0x900 >> > > [ 340.183285] ? rcu_accelerate_cbs+0x5b/0x80 >> > > [ 340.195282] ? __netif_receive_skb+0x18/0x60 >> > > [ 340.207288] ? process_backlog+0x95/0x140 >> > > [ 340.218948] ? net_rx_action+0x26c/0x3b0 >> > > [ 340.230416] ? __do_softirq+0xc9/0x26a >> > > [ 340.241625] ? do_softirq_own_stack+0x2a/0x40 >> > > [ 340.253368] >> > > [ 340.262673] ? do_softirq+0x50/0x60 >> > > [ 340.273450] ? __local_bh_enable_ip+0x57/0x60 >> > > [ 340.285045] ? ip_finish_output2+0x175/0x350 >> > > [ 340.296403] ? ip_finish_output+0x127/0x1d0 >> > > [ 340.307665] ? nf_hook_slow+0x3c/0xb0 >> > > [ 340.318230] ? ip_output+0x72/0xe0 >> > > [ 340.328524] ? ip_fragment.constprop.54+0x80/0x80 >> > > [ 340.340070] ? ip_local_out+0x35/0x40 >> > > [ 340.350497] ? ip_queue_xmit+0x15c/0x3f0 >> > > [ 340.361060] ? __kmalloc_reserve.isra.40+0x31/0x90 >> > > [ 340.372484] ? __skb_clone+0x2e/0x130 >> > > [ 340.382633] ? tcp_transmit_skb+0x558/0xa10 >> > > [ 340.393262] ? tcp_connect+0x938/0xad0 >> > > [ 340.403370] ? ktime_get_with_offset+0x4c/0xb0 >> > > [ 340.414206] ? tcp_v4_connect+0x457/0x4e0 >> > > [ 340.424471] ? __inet_stream_connect+0xb3/0x300 >> > > [ 340.435195] ? inet_stream_connect+0x3b/0x60 >> > > [ 340.445607] ? SYSC_connect+0xd9/0x110 >> > > [ 340.455455] ? __audit_syscall_entry+0xaf/0x100 >> > > [ 340.466112] ? syscall_trace_enter+0x1d0/0x2b0 >> > > [ 340.476636] ? __audit_syscall_exit+0x209/0x290 >> > > [ 340.487151] ? 
SyS_connect+0xe/0x10 >> > > [ 340.496453] ? do_syscall_64+0x67/0x1b0 >> > > [ 340.506078] ? entry_SYSCALL64_slow_path+0x25/0x25 >> > > [ 340.516693] Kernel Offset: disabled >> > > [ 340.526393] Rebooting in 11 seconds.. >> > > >> > > This is mostly reliable, and I'm only seeing it on bare metal >> > > (not in >> > > a >> > > virtualbox vm). >> > > >> > > The SELinux skb parse error at the start only sometimes appears, >> > > and >> > > looking at the code, I suspect some kind of memory corruption >> > > being >> > > the >> > > cause at that point (basic packet header checks). >> > > >> > > I bisected the bug down to the following change: >> > > >> > > commit bffa72cf7f9df842f0016ba03586039296b4caaf >> > > Author: Eric Dumazet >> > > Date: Tue Sep 19 05:14:24 2017 -0700 >> > > >> > > net: sk_buff rbnode reorg >> > > ... >> > > >> > > >> > > Anyone else able to
Re: [BUG] kernel stack corruption during/after Netlabel error
On Wed, 2017-11-29 at 09:34 -0800, Eric Dumazet wrote: > On Wed, Nov 29, 2017 at 9:31 AM, Stephen Smalley> wrote: > > On Wed, 2017-11-29 at 21:26 +1100, James Morris wrote: > > > I'm seeing a kernel stack corruption bug (detected via gcc) when > > > running > > > the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd > > > inet_socket > > > test: > > > > > > https://github.com/SELinuxProject/selinux-testsuite/blob/master/t > > > ests > > > /inet_socket/test > > > > > > # Verify that unauthorized client cannot communicate with the > > > server. > > > $result = system > > > "runcon -t test_inet_bad_client_t -- $basedir/client stream > > > 127.0.0.1 65535 2>&1"; > > > > > > This correctlly causes an access control error in the Netlabel > > > code, > > > and > > > the bug seems to be triggered during the ICMP send: > > > > > > [ 339.806024] SELinux: failure in selinux_parse_skb(), unable to > > > parse packet > > > [ 339.822505] Kernel panic - not syncing: stack-protector: > > > Kernel > > > stack is corrupted in: 81745af5 > > > [ 339.822505] > > > [ 339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0- > > > rc1- > > > test #15 > > > [ 339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS > > > FWKT68A 01/19/2017 > > > [ 339.885060] Call Trace: > > > [ 339.896875] > > > [ 339.908103] dump_stack+0x63/0x87 > > > [ 339.920645] panic+0xe8/0x248 > > > [ 339.932668] ? ip_push_pending_frames+0x33/0x40 > > > [ 339.946328] ? icmp_send+0x525/0x530 > > > [ 339.958861] ? kfree_skbmem+0x60/0x70 > > > [ 339.971431] __stack_chk_fail+0x1b/0x20 > > > [ 339.984049] icmp_send+0x525/0x530 > > > [ 339.996205] ? netlbl_skbuff_err+0x36/0x40 > > > [ 340.008997] ? selinux_netlbl_err+0x11/0x20 > > > [ 340.021816] ? selinux_socket_sock_rcv_skb+0x211/0x230 > > > [ 340.035529] ? security_sock_rcv_skb+0x3b/0x50 > > > [ 340.048471] ? sk_filter_trim_cap+0x44/0x1c0 > > > [ 340.061246] ? tcp_v4_inbound_md5_hash+0x69/0x1b0 > > > [ 340.074562] ? tcp_filter+0x2c/0x40 > > > [ 340.086400] ? 
tcp_v4_rcv+0x820/0xa20 > > > [ 340.098329] ? ip_local_deliver_finish+0x71/0x1a0 > > > [ 340.111279] ? ip_local_deliver+0x6f/0xe0 > > > [ 340.123535] ? ip_rcv_finish+0x3a0/0x3a0 > > > [ 340.135523] ? ip_rcv_finish+0xdb/0x3a0 > > > [ 340.147442] ? ip_rcv+0x27c/0x3c0 > > > [ 340.158668] ? inet_del_offload+0x40/0x40 > > > [ 340.170580] ? __netif_receive_skb_core+0x4ac/0x900 > > > [ 340.183285] ? rcu_accelerate_cbs+0x5b/0x80 > > > [ 340.195282] ? __netif_receive_skb+0x18/0x60 > > > [ 340.207288] ? process_backlog+0x95/0x140 > > > [ 340.218948] ? net_rx_action+0x26c/0x3b0 > > > [ 340.230416] ? __do_softirq+0xc9/0x26a > > > [ 340.241625] ? do_softirq_own_stack+0x2a/0x40 > > > [ 340.253368] > > > [ 340.262673] ? do_softirq+0x50/0x60 > > > [ 340.273450] ? __local_bh_enable_ip+0x57/0x60 > > > [ 340.285045] ? ip_finish_output2+0x175/0x350 > > > [ 340.296403] ? ip_finish_output+0x127/0x1d0 > > > [ 340.307665] ? nf_hook_slow+0x3c/0xb0 > > > [ 340.318230] ? ip_output+0x72/0xe0 > > > [ 340.328524] ? ip_fragment.constprop.54+0x80/0x80 > > > [ 340.340070] ? ip_local_out+0x35/0x40 > > > [ 340.350497] ? ip_queue_xmit+0x15c/0x3f0 > > > [ 340.361060] ? __kmalloc_reserve.isra.40+0x31/0x90 > > > [ 340.372484] ? __skb_clone+0x2e/0x130 > > > [ 340.382633] ? tcp_transmit_skb+0x558/0xa10 > > > [ 340.393262] ? tcp_connect+0x938/0xad0 > > > [ 340.403370] ? ktime_get_with_offset+0x4c/0xb0 > > > [ 340.414206] ? tcp_v4_connect+0x457/0x4e0 > > > [ 340.424471] ? __inet_stream_connect+0xb3/0x300 > > > [ 340.435195] ? inet_stream_connect+0x3b/0x60 > > > [ 340.445607] ? SYSC_connect+0xd9/0x110 > > > [ 340.455455] ? __audit_syscall_entry+0xaf/0x100 > > > [ 340.466112] ? syscall_trace_enter+0x1d0/0x2b0 > > > [ 340.476636] ? __audit_syscall_exit+0x209/0x290 > > > [ 340.487151] ? SyS_connect+0xe/0x10 > > > [ 340.496453] ? do_syscall_64+0x67/0x1b0 > > > [ 340.506078] ? entry_SYSCALL64_slow_path+0x25/0x25 > > > [ 340.516693] Kernel Offset: disabled > > > [ 340.526393] Rebooting in 11 seconds.. 
> > > > > > This is mostly reliable, and I'm only seeing it on bare metal > > > (not in > > > a > > > virtualbox vm). > > > > > > The SELinux skb parse error at the start only sometimes appears, > > > and > > > looking at the code, I suspect some kind of memory corruption > > > being > > > the > > > cause at that point (basic packet header checks). > > > > > > I bisected the bug down to the following change: > > > > > > commit bffa72cf7f9df842f0016ba03586039296b4caaf > > > Author: Eric Dumazet > > > Date: Tue Sep 19 05:14:24 2017 -0700 > > > > > > net: sk_buff rbnode reorg > > > ... > > > > > > > > > Anyone else able to reproduce this, or have any ideas on what's > > > happening? > > > > So far I haven't been able to reproduce with 4.15-rc1 or -linus. > > > > You might try adding KASAN in the picture ? (
Re: [BUG] kernel stack corruption during/after Netlabel error
On Wed, Nov 29, 2017 at 12:34 PM, Eric Dumazet wrote:
> On Wed, Nov 29, 2017 at 9:31 AM, Stephen Smalley wrote:
>> On Wed, 2017-11-29 at 21:26 +1100, James Morris wrote:
>>> I'm seeing a kernel stack corruption bug (detected via gcc) when running
>>> the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd inet_socket test:
>>>
>>> https://github.com/SELinuxProject/selinux-testsuite/blob/master/tests/inet_socket/test
>>>
>>>   # Verify that unauthorized client cannot communicate with the server.
>>>   $result = system
>>> "runcon -t test_inet_bad_client_t -- $basedir/client stream 127.0.0.1 65535 2>&1";
>>>
>>> This correctly causes an access control error in the Netlabel code, and
>>> the bug seems to be triggered during the ICMP send:
>>>
>>> [ 339.806024] SELinux: failure in selinux_parse_skb(), unable to parse packet
>>> [ 339.822505] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: 81745af5
>>> [ 339.822505]
>>> [ 339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0-rc1-test #15
>>> [ 339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS FWKT68A 01/19/2017
>>> [ 339.885060] Call Trace:
>>> [ 339.896875]
>>> [ 339.908103] dump_stack+0x63/0x87
>>> [ 339.920645] panic+0xe8/0x248
>>> [ 339.932668] ? ip_push_pending_frames+0x33/0x40
>>> [ 339.946328] ? icmp_send+0x525/0x530
>>> [ 339.958861] ? kfree_skbmem+0x60/0x70
>>> [ 339.971431] __stack_chk_fail+0x1b/0x20
>>> [ 339.984049] icmp_send+0x525/0x530

...

>>> This is mostly reliable, and I'm only seeing it on bare metal (not in a
>>> virtualbox vm).
>>>
>>> The SELinux skb parse error at the start only sometimes appears, and
>>> looking at the code, I suspect some kind of memory corruption being the
>>> cause at that point (basic packet header checks).
>>>
>>> I bisected the bug down to the following change:
>>>
>>> commit bffa72cf7f9df842f0016ba03586039296b4caaf
>>> Author: Eric Dumazet
>>> Date:   Tue Sep 19 05:14:24 2017 -0700
>>>
>>>     net: sk_buff rbnode reorg
>>> ...
>>>
>>>
>>> Anyone else able to reproduce this, or have any ideas on what's
>>> happening?
>>
>> So far I haven't been able to reproduce with 4.15-rc1 or -linus.
>
> You might try adding KASAN in the picture ? ( CONFIG_KASAN=y )

As another data point, I have not hit this problem either, but I'm not
currently building my test kernels with KASAN enabled.

-- 
paul moore
www.paul-moore.com
Re: [BUG] kernel stack corruption during/after Netlabel error
On Wed, Nov 29, 2017 at 9:31 AM, Stephen Smalleywrote: > On Wed, 2017-11-29 at 21:26 +1100, James Morris wrote: >> I'm seeing a kernel stack corruption bug (detected via gcc) when >> running >> the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd inet_socket >> test: >> >> https://github.com/SELinuxProject/selinux-testsuite/blob/master/tests >> /inet_socket/test >> >> # Verify that unauthorized client cannot communicate with the >> server. >> $result = system >> "runcon -t test_inet_bad_client_t -- $basedir/client stream >> 127.0.0.1 65535 2>&1"; >> >> This correctlly causes an access control error in the Netlabel code, >> and >> the bug seems to be triggered during the ICMP send: >> >> [ 339.806024] SELinux: failure in selinux_parse_skb(), unable to >> parse packet >> [ 339.822505] Kernel panic - not syncing: stack-protector: Kernel >> stack is corrupted in: 81745af5 >> [ 339.822505] >> [ 339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0-rc1- >> test #15 >> [ 339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS >> FWKT68A 01/19/2017 >> [ 339.885060] Call Trace: >> [ 339.896875] >> [ 339.908103] dump_stack+0x63/0x87 >> [ 339.920645] panic+0xe8/0x248 >> [ 339.932668] ? ip_push_pending_frames+0x33/0x40 >> [ 339.946328] ? icmp_send+0x525/0x530 >> [ 339.958861] ? kfree_skbmem+0x60/0x70 >> [ 339.971431] __stack_chk_fail+0x1b/0x20 >> [ 339.984049] icmp_send+0x525/0x530 >> [ 339.996205] ? netlbl_skbuff_err+0x36/0x40 >> [ 340.008997] ? selinux_netlbl_err+0x11/0x20 >> [ 340.021816] ? selinux_socket_sock_rcv_skb+0x211/0x230 >> [ 340.035529] ? security_sock_rcv_skb+0x3b/0x50 >> [ 340.048471] ? sk_filter_trim_cap+0x44/0x1c0 >> [ 340.061246] ? tcp_v4_inbound_md5_hash+0x69/0x1b0 >> [ 340.074562] ? tcp_filter+0x2c/0x40 >> [ 340.086400] ? tcp_v4_rcv+0x820/0xa20 >> [ 340.098329] ? ip_local_deliver_finish+0x71/0x1a0 >> [ 340.111279] ? ip_local_deliver+0x6f/0xe0 >> [ 340.123535] ? ip_rcv_finish+0x3a0/0x3a0 >> [ 340.135523] ? ip_rcv_finish+0xdb/0x3a0 >> [ 340.147442] ? 
ip_rcv+0x27c/0x3c0 >> [ 340.158668] ? inet_del_offload+0x40/0x40 >> [ 340.170580] ? __netif_receive_skb_core+0x4ac/0x900 >> [ 340.183285] ? rcu_accelerate_cbs+0x5b/0x80 >> [ 340.195282] ? __netif_receive_skb+0x18/0x60 >> [ 340.207288] ? process_backlog+0x95/0x140 >> [ 340.218948] ? net_rx_action+0x26c/0x3b0 >> [ 340.230416] ? __do_softirq+0xc9/0x26a >> [ 340.241625] ? do_softirq_own_stack+0x2a/0x40 >> [ 340.253368] >> [ 340.262673] ? do_softirq+0x50/0x60 >> [ 340.273450] ? __local_bh_enable_ip+0x57/0x60 >> [ 340.285045] ? ip_finish_output2+0x175/0x350 >> [ 340.296403] ? ip_finish_output+0x127/0x1d0 >> [ 340.307665] ? nf_hook_slow+0x3c/0xb0 >> [ 340.318230] ? ip_output+0x72/0xe0 >> [ 340.328524] ? ip_fragment.constprop.54+0x80/0x80 >> [ 340.340070] ? ip_local_out+0x35/0x40 >> [ 340.350497] ? ip_queue_xmit+0x15c/0x3f0 >> [ 340.361060] ? __kmalloc_reserve.isra.40+0x31/0x90 >> [ 340.372484] ? __skb_clone+0x2e/0x130 >> [ 340.382633] ? tcp_transmit_skb+0x558/0xa10 >> [ 340.393262] ? tcp_connect+0x938/0xad0 >> [ 340.403370] ? ktime_get_with_offset+0x4c/0xb0 >> [ 340.414206] ? tcp_v4_connect+0x457/0x4e0 >> [ 340.424471] ? __inet_stream_connect+0xb3/0x300 >> [ 340.435195] ? inet_stream_connect+0x3b/0x60 >> [ 340.445607] ? SYSC_connect+0xd9/0x110 >> [ 340.455455] ? __audit_syscall_entry+0xaf/0x100 >> [ 340.466112] ? syscall_trace_enter+0x1d0/0x2b0 >> [ 340.476636] ? __audit_syscall_exit+0x209/0x290 >> [ 340.487151] ? SyS_connect+0xe/0x10 >> [ 340.496453] ? do_syscall_64+0x67/0x1b0 >> [ 340.506078] ? entry_SYSCALL64_slow_path+0x25/0x25 >> [ 340.516693] Kernel Offset: disabled >> [ 340.526393] Rebooting in 11 seconds.. >> >> This is mostly reliable, and I'm only seeing it on bare metal (not in >> a >> virtualbox vm). >> >> The SELinux skb parse error at the start only sometimes appears, and >> looking at the code, I suspect some kind of memory corruption being >> the >> cause at that point (basic packet header checks). 
>> >> I bisected the bug down to the following change: >> >> commit bffa72cf7f9df842f0016ba03586039296b4caaf >> Author: Eric Dumazet >> Date: Tue Sep 19 05:14:24 2017 -0700 >> >> net: sk_buff rbnode reorg >> ... >> >> >> Anyone else able to reproduce this, or have any ideas on what's >> happening? > > So far I haven't been able to reproduce with 4.15-rc1 or -linus. > You might try adding KASAN in the picture ? ( CONFIG_KASAN=y ) Thanks.
Re: [BUG] kernel stack corruption during/after Netlabel error
On Wed, 2017-11-29 at 21:26 +1100, James Morris wrote: > I'm seeing a kernel stack corruption bug (detected via gcc) when > running > the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd inet_socket > test: > > https://github.com/SELinuxProject/selinux-testsuite/blob/master/tests > /inet_socket/test > > # Verify that unauthorized client cannot communicate with the > server. > $result = system > "runcon -t test_inet_bad_client_t -- $basedir/client stream > 127.0.0.1 65535 2>&1"; > > This correctlly causes an access control error in the Netlabel code, > and > the bug seems to be triggered during the ICMP send: > > [ 339.806024] SELinux: failure in selinux_parse_skb(), unable to > parse packet > [ 339.822505] Kernel panic - not syncing: stack-protector: Kernel > stack is corrupted in: 81745af5 > [ 339.822505] > [ 339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0-rc1- > test #15 > [ 339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS > FWKT68A 01/19/2017 > [ 339.885060] Call Trace: > [ 339.896875] > [ 339.908103] dump_stack+0x63/0x87 > [ 339.920645] panic+0xe8/0x248 > [ 339.932668] ? ip_push_pending_frames+0x33/0x40 > [ 339.946328] ? icmp_send+0x525/0x530 > [ 339.958861] ? kfree_skbmem+0x60/0x70 > [ 339.971431] __stack_chk_fail+0x1b/0x20 > [ 339.984049] icmp_send+0x525/0x530 > [ 339.996205] ? netlbl_skbuff_err+0x36/0x40 > [ 340.008997] ? selinux_netlbl_err+0x11/0x20 > [ 340.021816] ? selinux_socket_sock_rcv_skb+0x211/0x230 > [ 340.035529] ? security_sock_rcv_skb+0x3b/0x50 > [ 340.048471] ? sk_filter_trim_cap+0x44/0x1c0 > [ 340.061246] ? tcp_v4_inbound_md5_hash+0x69/0x1b0 > [ 340.074562] ? tcp_filter+0x2c/0x40 > [ 340.086400] ? tcp_v4_rcv+0x820/0xa20 > [ 340.098329] ? ip_local_deliver_finish+0x71/0x1a0 > [ 340.111279] ? ip_local_deliver+0x6f/0xe0 > [ 340.123535] ? ip_rcv_finish+0x3a0/0x3a0 > [ 340.135523] ? ip_rcv_finish+0xdb/0x3a0 > [ 340.147442] ? ip_rcv+0x27c/0x3c0 > [ 340.158668] ? inet_del_offload+0x40/0x40 > [ 340.170580] ? 
__netif_receive_skb_core+0x4ac/0x900 > [ 340.183285] ? rcu_accelerate_cbs+0x5b/0x80 > [ 340.195282] ? __netif_receive_skb+0x18/0x60 > [ 340.207288] ? process_backlog+0x95/0x140 > [ 340.218948] ? net_rx_action+0x26c/0x3b0 > [ 340.230416] ? __do_softirq+0xc9/0x26a > [ 340.241625] ? do_softirq_own_stack+0x2a/0x40 > [ 340.253368] > [ 340.262673] ? do_softirq+0x50/0x60 > [ 340.273450] ? __local_bh_enable_ip+0x57/0x60 > [ 340.285045] ? ip_finish_output2+0x175/0x350 > [ 340.296403] ? ip_finish_output+0x127/0x1d0 > [ 340.307665] ? nf_hook_slow+0x3c/0xb0 > [ 340.318230] ? ip_output+0x72/0xe0 > [ 340.328524] ? ip_fragment.constprop.54+0x80/0x80 > [ 340.340070] ? ip_local_out+0x35/0x40 > [ 340.350497] ? ip_queue_xmit+0x15c/0x3f0 > [ 340.361060] ? __kmalloc_reserve.isra.40+0x31/0x90 > [ 340.372484] ? __skb_clone+0x2e/0x130 > [ 340.382633] ? tcp_transmit_skb+0x558/0xa10 > [ 340.393262] ? tcp_connect+0x938/0xad0 > [ 340.403370] ? ktime_get_with_offset+0x4c/0xb0 > [ 340.414206] ? tcp_v4_connect+0x457/0x4e0 > [ 340.424471] ? __inet_stream_connect+0xb3/0x300 > [ 340.435195] ? inet_stream_connect+0x3b/0x60 > [ 340.445607] ? SYSC_connect+0xd9/0x110 > [ 340.455455] ? __audit_syscall_entry+0xaf/0x100 > [ 340.466112] ? syscall_trace_enter+0x1d0/0x2b0 > [ 340.476636] ? __audit_syscall_exit+0x209/0x290 > [ 340.487151] ? SyS_connect+0xe/0x10 > [ 340.496453] ? do_syscall_64+0x67/0x1b0 > [ 340.506078] ? entry_SYSCALL64_slow_path+0x25/0x25 > [ 340.516693] Kernel Offset: disabled > [ 340.526393] Rebooting in 11 seconds.. > > This is mostly reliable, and I'm only seeing it on bare metal (not in > a > virtualbox vm). > > The SELinux skb parse error at the start only sometimes appears, and > looking at the code, I suspect some kind of memory corruption being > the > cause at that point (basic packet header checks). 
> > I bisected the bug down to the following change: > > commit bffa72cf7f9df842f0016ba03586039296b4caaf > Author: Eric Dumazet> Date: Tue Sep 19 05:14:24 2017 -0700 > > net: sk_buff rbnode reorg > ... > > > Anyone else able to reproduce this, or have any ideas on what's > happening? So far I haven't been able to reproduce with 4.15-rc1 or -linus.
Re: [BUG] kernel stack corruption during/after Netlabel error
On Wed, Nov 29, 2017 at 2:26 AM, James Morris wrote:
> I'm seeing a kernel stack corruption bug (detected via gcc) when running
> the SELinux testsuite on a 4.15-rc1 kernel, in the 2nd inet_socket test:
>
> https://github.com/SELinuxProject/selinux-testsuite/blob/master/tests/inet_socket/test
>
>   # Verify that unauthorized client cannot communicate with the server.
>   $result = system
> "runcon -t test_inet_bad_client_t -- $basedir/client stream 127.0.0.1 65535 2>&1";
>
> This correctly causes an access control error in the Netlabel code, and
> the bug seems to be triggered during the ICMP send:
>
> [ 339.806024] SELinux: failure in selinux_parse_skb(), unable to parse packet
> [ 339.822505] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: 81745af5
> [ 339.822505]
> [ 339.852250] CPU: 4 PID: 3642 Comm: client Not tainted 4.15.0-rc1-test #15
> [ 339.868498] Hardware name: LENOVO 10FGS0VA1L/30BC, BIOS FWKT68A 01/19/2017
> [ 339.885060] Call Trace:
> [ 339.896875]
> [ 339.908103] dump_stack+0x63/0x87
> [ 339.920645] panic+0xe8/0x248
> [ 339.932668] ? ip_push_pending_frames+0x33/0x40
> [ 339.946328] ? icmp_send+0x525/0x530
> [ 339.958861] ? kfree_skbmem+0x60/0x70
> [ 339.971431] __stack_chk_fail+0x1b/0x20
> [ 339.984049] icmp_send+0x525/0x530
> [ 339.996205] ? netlbl_skbuff_err+0x36/0x40
> [ 340.008997] ? selinux_netlbl_err+0x11/0x20
> [ 340.021816] ? selinux_socket_sock_rcv_skb+0x211/0x230
> [ 340.035529] ? security_sock_rcv_skb+0x3b/0x50
> [ 340.048471] ? sk_filter_trim_cap+0x44/0x1c0
> [ 340.061246] ? tcp_v4_inbound_md5_hash+0x69/0x1b0
> [ 340.074562] ? tcp_filter+0x2c/0x40
> [ 340.086400] ? tcp_v4_rcv+0x820/0xa20
> [ 340.098329] ? ip_local_deliver_finish+0x71/0x1a0
> [ 340.111279] ? ip_local_deliver+0x6f/0xe0
> [ 340.123535] ? ip_rcv_finish+0x3a0/0x3a0
> [ 340.135523] ? ip_rcv_finish+0xdb/0x3a0
> [ 340.147442] ? ip_rcv+0x27c/0x3c0
> [ 340.158668] ? inet_del_offload+0x40/0x40
> [ 340.170580] ? __netif_receive_skb_core+0x4ac/0x900
> [ 340.183285] ? rcu_accelerate_cbs+0x5b/0x80
> [ 340.195282] ? __netif_receive_skb+0x18/0x60
> [ 340.207288] ? process_backlog+0x95/0x140
> [ 340.218948] ? net_rx_action+0x26c/0x3b0
> [ 340.230416] ? __do_softirq+0xc9/0x26a
> [ 340.241625] ? do_softirq_own_stack+0x2a/0x40
> [ 340.253368]
> [ 340.262673] ? do_softirq+0x50/0x60
> [ 340.273450] ? __local_bh_enable_ip+0x57/0x60
> [ 340.285045] ? ip_finish_output2+0x175/0x350
> [ 340.296403] ? ip_finish_output+0x127/0x1d0
> [ 340.307665] ? nf_hook_slow+0x3c/0xb0
> [ 340.318230] ? ip_output+0x72/0xe0
> [ 340.328524] ? ip_fragment.constprop.54+0x80/0x80
> [ 340.340070] ? ip_local_out+0x35/0x40
> [ 340.350497] ? ip_queue_xmit+0x15c/0x3f0
> [ 340.361060] ? __kmalloc_reserve.isra.40+0x31/0x90
> [ 340.372484] ? __skb_clone+0x2e/0x130
> [ 340.382633] ? tcp_transmit_skb+0x558/0xa10
> [ 340.393262] ? tcp_connect+0x938/0xad0
> [ 340.403370] ? ktime_get_with_offset+0x4c/0xb0
> [ 340.414206] ? tcp_v4_connect+0x457/0x4e0
> [ 340.424471] ? __inet_stream_connect+0xb3/0x300
> [ 340.435195] ? inet_stream_connect+0x3b/0x60
> [ 340.445607] ? SYSC_connect+0xd9/0x110
> [ 340.455455] ? __audit_syscall_entry+0xaf/0x100
> [ 340.466112] ? syscall_trace_enter+0x1d0/0x2b0
> [ 340.476636] ? __audit_syscall_exit+0x209/0x290
> [ 340.487151] ? SyS_connect+0xe/0x10
> [ 340.496453] ? do_syscall_64+0x67/0x1b0
> [ 340.506078] ? entry_SYSCALL64_slow_path+0x25/0x25
> [ 340.516693] Kernel Offset: disabled
> [ 340.526393] Rebooting in 11 seconds..
>
> This is mostly reliable, and I'm only seeing it on bare metal (not in a
> virtualbox vm).
>
> The SELinux skb parse error at the start only sometimes appears, and
> looking at the code, I suspect some kind of memory corruption being the
> cause at that point (basic packet header checks).
>
> I bisected the bug down to the following change:
>
> commit bffa72cf7f9df842f0016ba03586039296b4caaf
> Author: Eric Dumazet
> Date:   Tue Sep 19 05:14:24 2017 -0700
>
>     net: sk_buff rbnode reorg
> ...
>
>
> Anyone else able to reproduce this, or have any ideas on what's happening?
>
>

Hi James, thanks for the report.

Issue here is that icmp_send() used to be called with skb_in->dev == NULL
or a valid device pointer ?

After my patch, skb_in->dev is aliased with part of skb_in->rbnode
(rb_left pointer)

So this code in icmp_send() might be fooled :

        if (!(skb_in->dev && (skb_in->dev->flags&IFF_LOOPBACK)) &&
              !icmpv4_global_allow(net, type, code))
                goto out_bh_enable;

Although TCP stack should not manipulate skb->rbnode before the calls
to tcp_filter() (and thus security_sock_rcv_skb())

So at the point security_sock_rcv_skb is called, skb->dev should still
be valid.
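
The aliasing described above can be modelled in userspace with a small sketch. Everything here is hypothetical: toy_skb, toy_rb_node and toy_net_device are simplified stand-ins that only assume the shape stated in the message (a next/prev/dev triple sharing storage with an rb node whose third word is rb_left); the real struct sk_buff has many more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel types. */
struct toy_net_device {
	unsigned int flags;
};

struct toy_rb_node {
	unsigned long parent_color;
	struct toy_rb_node *rb_right;
	struct toy_rb_node *rb_left;
};

struct toy_skb {
	union {
		struct {
			struct toy_skb *next;
			struct toy_skb *prev;
			struct toy_net_device *dev;
		};
		struct toy_rb_node rbnode;	/* shares storage with next/prev/dev */
	};
};

/* Returns 1 when dev and rbnode.rb_left occupy the same bytes in this model. */
static int toy_dev_aliases_rb_left(void)
{
	return offsetof(struct toy_skb, dev) ==
	       offsetof(struct toy_skb, rbnode.rb_left);
}

/* Same shape as the first half of the icmp_send() test: a non-NULL dev is
 * assumed to be a real device and would be dereferenced next.
 */
static int toy_dev_looks_set(const struct toy_skb *skb)
{
	return skb->dev != NULL;
}
```

In this model, linking a toy_skb into a tree by storing a pointer in rbnode.rb_left makes toy_dev_looks_set() return 1 even though no device was ever assigned, which is the kind of confusion the message speculates about for icmp_send(). Whether the real crash follows this path is not established by the sketch.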