Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-20 Thread Jacob Siverskog
On Tue, Jan 5, 2016 at 3:39 PM, Eric Dumazet  wrote:
> On Tue, 2016-01-05 at 15:34 +0100, Jacob Siverskog wrote:
>> On Tue, Jan 5, 2016 at 3:14 PM, Eric Dumazet  wrote:
>
>> >
>> > You might build a kernel with KASAN support to get maybe more chances to
>> > trigger the bug.
>> >
>> > ( https://www.kernel.org/doc/Documentation/kasan.txt )
>> >
>>
>> Ah. Doesn't seem to be supported on arm(32) unfortunately.
>
> Then you could at least use standard debugging features :
>
> CONFIG_SLAB=y
> CONFIG_SLABINFO=y
> CONFIG_DEBUG_SLAB=y
> CONFIG_DEBUG_SLAB_LEAK=y
>
> (Or equivalent SLUB options)
>
> and
>
> CONFIG_DEBUG_PAGEALLOC=y
>
> (If arm(32) has CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y)

I tried with those enabled and while toggling power on the Bluetooth
interface I usually get this after a few iterations:
kernel: Bluetooth: Unable to push skb to HCI core(-6)
kernel: (stc):  proto stack 4's ->recv failed
kernel: Slab corruption (Not tainted): skbuff_head_cache start=c08a8a00, len=176
kernel: 0a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a 6b 6b a5  kkkkkkkkkkkkjkk.
kernel: Prev obj: start=c08a8940, len=176
kernel: 000: 00 00 00 00 00 00 00 00 31 73 52 00 43 17 2b 14  ........1sR.C.+.
kernel: 010: 00 00 00 00 00 00 00 00 04 00 00 00 01 00 00 00  ................
kernel: Next obj: start=c08a8ac0, len=176
kernel: 000: 00 00 00 00 00 00 00 00 01 42 f6 50 36 17 2b 14  .........B.P6.+.
kernel: 010: 00 00 00 00 00 00 00 00 04 00 00 00 01 00 00 00  ................

The "Unable to push skb" and "recv failed" lines always appear before
the corruption.

Unfortunately, the corruption also occurs with your patch.


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-20 Thread Eric Dumazet
On Wed, 2016-01-20 at 17:17 +0100, Jacob Siverskog wrote:
> On Wed, Jan 20, 2016 at 4:48 PM, Eric Dumazet  wrote:
> > On Wed, 2016-01-20 at 16:06 +0100, Jacob Siverskog wrote:
> >> On Tue, Jan 5, 2016 at 3:39 PM, Eric Dumazet  
> >> wrote:
> >> > On Tue, 2016-01-05 at 15:34 +0100, Jacob Siverskog wrote:
> >> >> On Tue, Jan 5, 2016 at 3:14 PM, Eric Dumazet  
> >> >> wrote:
> >> >
> >> >> >
> >> >> > You might build a kernel with KASAN support to get maybe more chances 
> >> >> > to
> >> >> > trigger the bug.
> >> >> >
> >> >> > ( https://www.kernel.org/doc/Documentation/kasan.txt )
> >> >> >
> >> >>
> >> >> Ah. Doesn't seem to be supported on arm(32) unfortunately.
> >> >
> >> > Then you could at least use standard debugging features :
> >> >
> >> > CONFIG_SLAB=y
> >> > CONFIG_SLABINFO=y
> >> > CONFIG_DEBUG_SLAB=y
> >> > CONFIG_DEBUG_SLAB_LEAK=y
> >> >
> >> > (Or equivalent SLUB options)
> >> >
> >> > and
> >> >
> >> > CONFIG_DEBUG_PAGEALLOC=y
> >> >
> >> > (If arm(32) has CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y)
> >>
> >> I tried with those enabled and while toggling power on the Bluetooth
> >> interface I usually get this after a few iterations:
> >> kernel: Bluetooth: Unable to push skb to HCI core(-6)
> >
> > Well, this code seems to be quite buggy.
> >
> > I do not have time to audit it, but 5 minutes are enough to spot 2
> > issues.
> >
> > skb, once given to another queue/layer should not be accessed anymore.
> >
> 
> Ok. Unfortunately I still see the slab corruption even with your changes.

The patch only addressed potential _reads_ after free, which do not
generally corrupt memory.

As I said, a full audit is needed, and I don't have time for this.





Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-20 Thread Peter Hurley
Hi Jacob,

On 01/05/2016 06:34 AM, Jacob Siverskog wrote:
> On Tue, Jan 5, 2016 at 3:14 PM, Eric Dumazet  wrote:
>> On Tue, 2016-01-05 at 12:07 +0100, Jacob Siverskog wrote:
>>> On Mon, Jan 4, 2016 at 4:25 PM, Eric Dumazet  wrote:
>>>> On Mon, 2016-01-04 at 10:10 +0100, Jacob Siverskog wrote:
>>>>> On Wed, Dec 30, 2015 at 11:30 PM, Cong Wang wrote:
>>>>>> On Wed, Dec 30, 2015 at 6:30 AM, Jacob Siverskog wrote:
>>>>>>> On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet wrote:
>>>>>>>> How often can you trigger this bug ?
>>>>>>>
>>>>>>> Ok. I don't have a good repro to trigger it unfortunately, I've seen it
>>>>>>> just a few times when bringing up/down network interfaces. Does the trace
>>>>>>> give any clue?
>>>>>>>
>>>>>>
>>>>>> A little bit. You need to help people to narrow down the problem
>>>>>> because there are too many places using skb->next and skb->prev.
>>>>>>
>>>>>> Since you mentioned it seems related to network interface flip,
>>>>>> what network interfaces are you using? What is your TC setup?
>>>>>>
>>>>>> Thanks.
>>>>>
>>>>> The system contains only one physical network interface (TI WL1837,
>>>>> wl18xx module).
>>>>> The state prior to the crash was as follows:
>>>>> - One virtual network interface active (as STA, associated with access point)
>>>>> - Bluetooth (BLE only) active (same physical chip, co-existence,
>>>>> btwilink/st_drv modules)
>>>>>
>>>>> Actions made around the time of the crash:
>>>>> - Bluetooth disabled
>>>>> - One additional virtual network interface brought up (also as STA)
>>>>>
>>>>> I believe the crash occurred between these two actions. I just saw
>>>>> that there are some interesting events in the log prior to the crash:
>>>>> kernel: Bluetooth: Unable to push skb to HCI core(-6)
>>>>> kernel: (stc):  proto stack 4's ->recv failed
>>>>> kernel: (stc): remove_channel_from_table: id 3
>>>>> kernel: (stc): remove_channel_from_table: id 2
>>>>> kernel: (stc): remove_channel_from_table: id 4
>>>>> kernel: (stc):  all chnl_ids unregistered
>>>>> kernel: (stk) :ldisc_install = 0(stc): st_tty_close
>>>>>
>>>>> The first print is from btwilink.c. However, I can't see the
>>>>> connection between Bluetooth (BLE) and UDP/IPv6 (we're not using
>>>>> 6LoWPAN or anything similar).
>>>>>
>>>>> Thanks, Jacob
>>>>
>>>> Definitely these details are useful ;)
>>>>
>>>> Could you try :
>>>>
>>>> diff --git a/drivers/misc/ti-st/st_core.c b/drivers/misc/ti-st/st_core.c
>>>> index 6e3af8b42cdd..0c99a74fb895 100644
>>>> --- a/drivers/misc/ti-st/st_core.c
>>>> +++ b/drivers/misc/ti-st/st_core.c
>>>> @@ -912,7 +912,9 @@ void st_core_exit(struct st_data_s *st_gdata)
>>>> skb_queue_purge(&st_gdata->txq);
>>>> skb_queue_purge(&st_gdata->tx_waitq);
>>>> kfree_skb(st_gdata->rx_skb);
>>>> +   st_gdata->rx_skb = NULL;
>>>> kfree_skb(st_gdata->tx_skb);
>>>> +   st_gdata->tx_skb = NULL;
>>>> /* TTY ldisc cleanup */
>>>> err = tty_unregister_ldisc(N_TI_WL);
>>>> if (err)

FWIW,

You don't need that ti-st junk to get the WL1837 working; the WL1837 only
has BT channels. Unfortunately, that's really all I can say about it; sorry.

Regards,
Peter Hurley




Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-20 Thread Eric Dumazet
On Wed, 2016-01-20 at 16:06 +0100, Jacob Siverskog wrote:
> On Tue, Jan 5, 2016 at 3:39 PM, Eric Dumazet  wrote:
> > On Tue, 2016-01-05 at 15:34 +0100, Jacob Siverskog wrote:
> >> On Tue, Jan 5, 2016 at 3:14 PM, Eric Dumazet  
> >> wrote:
> >
> >> >
> >> > You might build a kernel with KASAN support to get maybe more chances to
> >> > trigger the bug.
> >> >
> >> > ( https://www.kernel.org/doc/Documentation/kasan.txt )
> >> >
> >>
> >> Ah. Doesn't seem to be supported on arm(32) unfortunately.
> >
> > Then you could at least use standard debugging features :
> >
> > CONFIG_SLAB=y
> > CONFIG_SLABINFO=y
> > CONFIG_DEBUG_SLAB=y
> > CONFIG_DEBUG_SLAB_LEAK=y
> >
> > (Or equivalent SLUB options)
> >
> > and
> >
> > CONFIG_DEBUG_PAGEALLOC=y
> >
> > (If arm(32) has CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y)
> 
> I tried with those enabled and while toggling power on the Bluetooth
> interface I usually get this after a few iterations:
> kernel: Bluetooth: Unable to push skb to HCI core(-6)

Well, this code seems to be quite buggy.

I do not have time to audit it, but 5 minutes are enough to spot 2
issues.

An skb, once given to another queue/layer, should not be accessed anymore.

diff --git a/drivers/bluetooth/btwilink.c b/drivers/bluetooth/btwilink.c
index 24a652f9252b..2d3092aa6cfe 100644
--- a/drivers/bluetooth/btwilink.c
+++ b/drivers/bluetooth/btwilink.c
@@ -98,6 +98,7 @@ static void st_reg_completion_cb(void *priv_data, char data)
 static long st_receive(void *priv_data, struct sk_buff *skb)
 {
 	struct ti_st *lhst = priv_data;
+	unsigned int len;
 	int err;
 
 	if (!skb)
@@ -109,13 +110,14 @@ static long st_receive(void *priv_data, struct sk_buff *skb)
 	}
 
 	/* Forward skb to HCI core layer */
+	len = skb->len;
 	err = hci_recv_frame(lhst->hdev, skb);
 	if (err < 0) {
 		BT_ERR("Unable to push skb to HCI core(%d)", err);
 		return err;
 	}
 
-	lhst->hdev->stat.byte_rx += skb->len;
+	lhst->hdev->stat.byte_rx += len;
 
 	return 0;
 }
@@ -245,6 +247,7 @@ static int ti_st_send_frame(struct hci_dev *hdev, struct sk_buff *skb)
 {
 	struct ti_st *hst;
 	long len;
+	u8 pkt_type;
 
 	hst = hci_get_drvdata(hdev);
 
@@ -258,6 +261,7 @@ static int ti_st_send_frame(struct hci_dev *hdev, struct sk_buff *skb)
 	 * Freeing skb memory is taken care in shared transport layer,
 	 * so don't free skb memory here.
 	 */
+	pkt_type = hci_skb_pkt_type(skb);
 	len = hst->st_write(skb);
 	if (len < 0) {
 		kfree_skb(skb);
@@ -268,7 +272,7 @@ static int ti_st_send_frame(struct hci_dev *hdev, struct sk_buff *skb)
 
 	/* ST accepted our skb. So, Go ahead and do rest */
 	hdev->stat.byte_tx += len;
-	ti_st_tx_complete(hst, hci_skb_pkt_type(skb));
+	ti_st_tx_complete(hst, pkt_type);
 
 	return 0;
 }





Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-20 Thread Jacob Siverskog
On Wed, Jan 20, 2016 at 4:48 PM, Eric Dumazet  wrote:
> On Wed, 2016-01-20 at 16:06 +0100, Jacob Siverskog wrote:
>> On Tue, Jan 5, 2016 at 3:39 PM, Eric Dumazet  wrote:
>> > On Tue, 2016-01-05 at 15:34 +0100, Jacob Siverskog wrote:
>> >> On Tue, Jan 5, 2016 at 3:14 PM, Eric Dumazet  
>> >> wrote:
>> >
>> >> >
>> >> > You might build a kernel with KASAN support to get maybe more chances to
>> >> > trigger the bug.
>> >> >
>> >> > ( https://www.kernel.org/doc/Documentation/kasan.txt )
>> >> >
>> >>
>> >> Ah. Doesn't seem to be supported on arm(32) unfortunately.
>> >
>> > Then you could at least use standard debugging features :
>> >
>> > CONFIG_SLAB=y
>> > CONFIG_SLABINFO=y
>> > CONFIG_DEBUG_SLAB=y
>> > CONFIG_DEBUG_SLAB_LEAK=y
>> >
>> > (Or equivalent SLUB options)
>> >
>> > and
>> >
>> > CONFIG_DEBUG_PAGEALLOC=y
>> >
>> > (If arm(32) has CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y)
>>
>> I tried with those enabled and while toggling power on the Bluetooth
>> interface I usually get this after a few iterations:
>> kernel: Bluetooth: Unable to push skb to HCI core(-6)
>
> Well, this code seems to be quite buggy.
>
> I do not have time to audit it, but 5 minutes are enough to spot 2
> issues.
>
> skb, once given to another queue/layer should not be accessed anymore.
>

Ok. Unfortunately I still see the slab corruption even with your changes.


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-05 Thread Jacob Siverskog
On Tue, Jan 5, 2016 at 3:14 PM, Eric Dumazet  wrote:
> On Tue, 2016-01-05 at 12:07 +0100, Jacob Siverskog wrote:
>> On Mon, Jan 4, 2016 at 4:25 PM, Eric Dumazet  wrote:
>> > On Mon, 2016-01-04 at 10:10 +0100, Jacob Siverskog wrote:
>> >> On Wed, Dec 30, 2015 at 11:30 PM, Cong Wang  
>> >> wrote:
>> >> > On Wed, Dec 30, 2015 at 6:30 AM, Jacob Siverskog
>> >> >  wrote:
>> >> >> On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet  
>> >> >> wrote:
>> >> >>> How often can you trigger this bug ?
>> >> >>
>> >> >> Ok. I don't have a good repro to trigger it unfortunately, I've seen 
>> >> >> it just a
>> >> >> few times when bringing up/down network interfaces. Does the trace
>> >> >> give any clue?
>> >> >>
>> >> >
>> >> > A little bit. You need to help people to narrow down the problem
>> >> > because there are too many places using skb->next and skb->prev.
>> >> >
>> >> > Since you mentioned it seems related to network interface flip,
>> >> > what network interfaces are you using? What is your TC setup?
>> >> >
>> >> > Thanks.
>> >>
>> >> The system contains only one physical network interface (TI WL1837,
>> >> wl18xx module).
>> >> The state prior to the crash was as follows:
>> >> - One virtual network interface active (as STA, associated with access 
>> >> point)
>> >> - Bluetooth (BLE only) active (same physical chip, co-existence,
>> >> btwilink/st_drv modules)
>> >>
>> >> Actions made around the time of the crash:
>> >> - Bluetooth disabled
>> >> - One additional virtual network interface brought up (also as STA)
>> >>
>> >> I believe the crash occurred between these two actions. I just saw
>> >> that there are some interesting events in the log prior to the crash:
>> >> kernel: Bluetooth: Unable to push skb to HCI core(-6)
>> >> kernel: (stc):  proto stack 4's ->recv failed
>> >> kernel: (stc): remove_channel_from_table: id 3
>> >> kernel: (stc): remove_channel_from_table: id 2
>> >> kernel: (stc): remove_channel_from_table: id 4
>> >> kernel: (stc):  all chnl_ids unregistered
>> >> kernel: (stk) :ldisc_install = 0(stc): st_tty_close
>> >>
>> >> The first print is from btwilink.c. However, I can't see the
>> >> connection between Bluetooth (BLE) and UDP/IPv6 (we're not using
>> >> 6LoWPAN or anything similar).
>> >>
>> >> Thanks, Jacob
>> >
>> > Definitely these details are useful ;)
>> >
>> > Could you try :
>> >
>> > diff --git a/drivers/misc/ti-st/st_core.c b/drivers/misc/ti-st/st_core.c
>> > index 6e3af8b42cdd..0c99a74fb895 100644
>> > --- a/drivers/misc/ti-st/st_core.c
>> > +++ b/drivers/misc/ti-st/st_core.c
>> > @@ -912,7 +912,9 @@ void st_core_exit(struct st_data_s *st_gdata)
>> > skb_queue_purge(&st_gdata->txq);
>> > skb_queue_purge(&st_gdata->tx_waitq);
>> > kfree_skb(st_gdata->rx_skb);
>> > +   st_gdata->rx_skb = NULL;
>> > kfree_skb(st_gdata->tx_skb);
>> > +   st_gdata->tx_skb = NULL;
>> > /* TTY ldisc cleanup */
>> > err = tty_unregister_ldisc(N_TI_WL);
>> > if (err)
>> >
>> >
>>
>> Sure. Since I don't have a good way to trigger the initial issue, I
>> can't really know if there is a difference with your patch. However,
>> normal usage seems to work as expected with your patch. I've tried to
>> reproduce the initial issue with and without your patch repeatedly for
>> hours and have not seen any crash in any of the runs so far.
>> --
>
> You might build a kernel with KASAN support to get maybe more chances to
> trigger the bug.
>
> ( https://www.kernel.org/doc/Documentation/kasan.txt )
>

Ah. Doesn't seem to be supported on arm(32) unfortunately.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-05 Thread Eric Dumazet
On Tue, 2016-01-05 at 15:34 +0100, Jacob Siverskog wrote:
> On Tue, Jan 5, 2016 at 3:14 PM, Eric Dumazet  wrote:

> >
> > You might build a kernel with KASAN support to get maybe more chances to
> > trigger the bug.
> >
> > ( https://www.kernel.org/doc/Documentation/kasan.txt )
> >
> 
> Ah. Doesn't seem to be supported on arm(32) unfortunately.

Then you could at least use standard debugging features :

CONFIG_SLAB=y
CONFIG_SLABINFO=y
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SLAB_LEAK=y

(Or equivalent SLUB options)

and

CONFIG_DEBUG_PAGEALLOC=y

(If arm(32) has CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y)





Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-05 Thread Eric Dumazet
On Tue, 2016-01-05 at 12:07 +0100, Jacob Siverskog wrote:
> On Mon, Jan 4, 2016 at 4:25 PM, Eric Dumazet  wrote:
> > On Mon, 2016-01-04 at 10:10 +0100, Jacob Siverskog wrote:
> >> On Wed, Dec 30, 2015 at 11:30 PM, Cong Wang  
> >> wrote:
> >> > On Wed, Dec 30, 2015 at 6:30 AM, Jacob Siverskog
> >> >  wrote:
> >> >> On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet  
> >> >> wrote:
> >> >>> How often can you trigger this bug ?
> >> >>
> >> >> Ok. I don't have a good repro to trigger it unfortunately, I've seen it 
> >> >> just a
> >> >> few times when bringing up/down network interfaces. Does the trace
> >> >> give any clue?
> >> >>
> >> >
> >> > A little bit. You need to help people to narrow down the problem
> >> > because there are too many places using skb->next and skb->prev.
> >> >
> >> > Since you mentioned it seems related to network interface flip,
> >> > what network interfaces are you using? What is your TC setup?
> >> >
> >> > Thanks.
> >>
> >> The system contains only one physical network interface (TI WL1837,
> >> wl18xx module).
> >> The state prior to the crash was as follows:
> >> - One virtual network interface active (as STA, associated with access 
> >> point)
> >> - Bluetooth (BLE only) active (same physical chip, co-existence,
> >> btwilink/st_drv modules)
> >>
> >> Actions made around the time of the crash:
> >> - Bluetooth disabled
> >> - One additional virtual network interface brought up (also as STA)
> >>
> >> I believe the crash occurred between these two actions. I just saw
> >> that there are some interesting events in the log prior to the crash:
> >> kernel: Bluetooth: Unable to push skb to HCI core(-6)
> >> kernel: (stc):  proto stack 4's ->recv failed
> >> kernel: (stc): remove_channel_from_table: id 3
> >> kernel: (stc): remove_channel_from_table: id 2
> >> kernel: (stc): remove_channel_from_table: id 4
> >> kernel: (stc):  all chnl_ids unregistered
> >> kernel: (stk) :ldisc_install = 0(stc): st_tty_close
> >>
> >> The first print is from btwilink.c. However, I can't see the
> >> connection between Bluetooth (BLE) and UDP/IPv6 (we're not using
> >> 6LoWPAN or anything similar).
> >>
> >> Thanks, Jacob
> >
> > Definitely these details are useful ;)
> >
> > Could you try :
> >
> > diff --git a/drivers/misc/ti-st/st_core.c b/drivers/misc/ti-st/st_core.c
> > index 6e3af8b42cdd..0c99a74fb895 100644
> > --- a/drivers/misc/ti-st/st_core.c
> > +++ b/drivers/misc/ti-st/st_core.c
> > @@ -912,7 +912,9 @@ void st_core_exit(struct st_data_s *st_gdata)
> > skb_queue_purge(&st_gdata->txq);
> > skb_queue_purge(&st_gdata->tx_waitq);
> > kfree_skb(st_gdata->rx_skb);
> > +   st_gdata->rx_skb = NULL;
> > kfree_skb(st_gdata->tx_skb);
> > +   st_gdata->tx_skb = NULL;
> > /* TTY ldisc cleanup */
> > err = tty_unregister_ldisc(N_TI_WL);
> > if (err)
> >
> >
> 
> Sure. Since I don't have a good way to trigger the initial issue, I
> can't really know if there is a difference with your patch. However,
> normal usage seems to work as expected with your patch. I've tried to
> reproduce the initial issue with and without your patch repeatedly for
> hours and have not seen any crash in any of the runs so far.
> --

You might build a kernel with KASAN support to improve your chances of
triggering the bug.

( https://www.kernel.org/doc/Documentation/kasan.txt )





Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-05 Thread Jacob Siverskog
On Mon, Jan 4, 2016 at 4:25 PM, Eric Dumazet  wrote:
> On Mon, 2016-01-04 at 10:10 +0100, Jacob Siverskog wrote:
>> On Wed, Dec 30, 2015 at 11:30 PM, Cong Wang  wrote:
>> > On Wed, Dec 30, 2015 at 6:30 AM, Jacob Siverskog
>> >  wrote:
>> >> On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet  wrote:
>> >>> How often can you trigger this bug ?
>> >>
>> >> Ok. I don't have a good repro to trigger it unfortunately, I've seen it 
>> >> just a
>> >> few times when bringing up/down network interfaces. Does the trace
>> >> give any clue?
>> >>
>> >
>> > A little bit. You need to help people to narrow down the problem
>> > because there are too many places using skb->next and skb->prev.
>> >
>> > Since you mentioned it seems related to network interface flip,
>> > what network interfaces are you using? What is your TC setup?
>> >
>> > Thanks.
>>
>> The system contains only one physical network interface (TI WL1837,
>> wl18xx module).
>> The state prior to the crash was as follows:
>> - One virtual network interface active (as STA, associated with access point)
>> - Bluetooth (BLE only) active (same physical chip, co-existence,
>> btwilink/st_drv modules)
>>
>> Actions made around the time of the crash:
>> - Bluetooth disabled
>> - One additional virtual network interface brought up (also as STA)
>>
>> I believe the crash occurred between these two actions. I just saw
>> that there are some interesting events in the log prior to the crash:
>> kernel: Bluetooth: Unable to push skb to HCI core(-6)
>> kernel: (stc):  proto stack 4's ->recv failed
>> kernel: (stc): remove_channel_from_table: id 3
>> kernel: (stc): remove_channel_from_table: id 2
>> kernel: (stc): remove_channel_from_table: id 4
>> kernel: (stc):  all chnl_ids unregistered
>> kernel: (stk) :ldisc_install = 0(stc): st_tty_close
>>
>> The first print is from btwilink.c. However, I can't see the
>> connection between Bluetooth (BLE) and UDP/IPv6 (we're not using
>> 6LoWPAN or anything similar).
>>
>> Thanks, Jacob
>
> Definitely these details are useful ;)
>
> Could you try :
>
> diff --git a/drivers/misc/ti-st/st_core.c b/drivers/misc/ti-st/st_core.c
> index 6e3af8b42cdd..0c99a74fb895 100644
> --- a/drivers/misc/ti-st/st_core.c
> +++ b/drivers/misc/ti-st/st_core.c
> @@ -912,7 +912,9 @@ void st_core_exit(struct st_data_s *st_gdata)
> skb_queue_purge(&st_gdata->txq);
> skb_queue_purge(&st_gdata->tx_waitq);
> kfree_skb(st_gdata->rx_skb);
> +   st_gdata->rx_skb = NULL;
> kfree_skb(st_gdata->tx_skb);
> +   st_gdata->tx_skb = NULL;
> /* TTY ldisc cleanup */
> err = tty_unregister_ldisc(N_TI_WL);
> if (err)
>
>

Sure. Since I don't have a good way to trigger the initial issue, I
can't really know if there is a difference with your patch. However,
normal usage seems to work as expected with your patch. I've tried to
reproduce the initial issue with and without your patch repeatedly for
hours and have not seen any crash in any of the runs so far.


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-04 Thread Jacob Siverskog
On Wed, Dec 30, 2015 at 11:30 PM, Cong Wang  wrote:
> On Wed, Dec 30, 2015 at 6:30 AM, Jacob Siverskog
>  wrote:
>> On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet  wrote:
>>> How often can you trigger this bug ?
>>
>> Ok. I don't have a good repro to trigger it unfortunately, I've seen it just 
>> a
>> few times when bringing up/down network interfaces. Does the trace
>> give any clue?
>>
>
> A little bit. You need to help people to narrow down the problem
> because there are too many places using skb->next and skb->prev.
>
> Since you mentioned it seems related to network interface flip,
> what network interfaces are you using? What is your TC setup?
>
> Thanks.

The system contains only one physical network interface (TI WL1837,
wl18xx module).
The state prior to the crash was as follows:
- One virtual network interface active (as STA, associated with access point)
- Bluetooth (BLE only) active (same physical chip, co-existence,
btwilink/st_drv modules)

Actions made around the time of the crash:
- Bluetooth disabled
- One additional virtual network interface brought up (also as STA)

I believe the crash occurred between these two actions. I just saw
that there are some interesting events in the log prior to the crash:
kernel: Bluetooth: Unable to push skb to HCI core(-6)
kernel: (stc):  proto stack 4's ->recv failed
kernel: (stc): remove_channel_from_table: id 3
kernel: (stc): remove_channel_from_table: id 2
kernel: (stc): remove_channel_from_table: id 4
kernel: (stc):  all chnl_ids unregistered
kernel: (stk) :ldisc_install = 0(stc): st_tty_close

The first print is from btwilink.c. However, I can't see the
connection between Bluetooth (BLE) and UDP/IPv6 (we're not using
6LoWPAN or anything similar).

Thanks, Jacob


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-04 Thread Eric Dumazet
On Mon, 2016-01-04 at 10:10 +0100, Jacob Siverskog wrote:
> On Wed, Dec 30, 2015 at 11:30 PM, Cong Wang  wrote:
> > On Wed, Dec 30, 2015 at 6:30 AM, Jacob Siverskog
> >  wrote:
> >> On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet  wrote:
> >>> How often can you trigger this bug ?
> >>
> >> Ok. I don't have a good repro to trigger it unfortunately, I've seen it 
> >> just a
> >> few times when bringing up/down network interfaces. Does the trace
> >> give any clue?
> >>
> >
> > A little bit. You need to help people to narrow down the problem
> > because there are too many places using skb->next and skb->prev.
> >
> > Since you mentioned it seems related to network interface flip,
> > what network interfaces are you using? What is your TC setup?
> >
> > Thanks.
> 
> The system contains only one physical network interface (TI WL1837,
> wl18xx module).
> The state prior to the crash was as follows:
> - One virtual network interface active (as STA, associated with access point)
> - Bluetooth (BLE only) active (same physical chip, co-existence,
> btwilink/st_drv modules)
> 
> Actions made around the time of the crash:
> - Bluetooth disabled
> - One additional virtual network interface brought up (also as STA)
> 
> I believe the crash occurred between these two actions. I just saw
> that there are some interesting events in the log prior to the crash:
> kernel: Bluetooth: Unable to push skb to HCI core(-6)
> kernel: (stc):  proto stack 4's ->recv failed
> kernel: (stc): remove_channel_from_table: id 3
> kernel: (stc): remove_channel_from_table: id 2
> kernel: (stc): remove_channel_from_table: id 4
> kernel: (stc):  all chnl_ids unregistered
> kernel: (stk) :ldisc_install = 0(stc): st_tty_close
> 
> The first print is from btwilink.c. However, I can't see the
> connection between Bluetooth (BLE) and UDP/IPv6 (we're not using
> 6LoWPAN or anything similar).
> 
> Thanks, Jacob

Definitely these details are useful ;)

Could you try :

diff --git a/drivers/misc/ti-st/st_core.c b/drivers/misc/ti-st/st_core.c
index 6e3af8b42cdd..0c99a74fb895 100644
--- a/drivers/misc/ti-st/st_core.c
+++ b/drivers/misc/ti-st/st_core.c
@@ -912,7 +912,9 @@ void st_core_exit(struct st_data_s *st_gdata)
 		skb_queue_purge(&st_gdata->txq);
 		skb_queue_purge(&st_gdata->tx_waitq);
 		kfree_skb(st_gdata->rx_skb);
+		st_gdata->rx_skb = NULL;
 		kfree_skb(st_gdata->tx_skb);
+		st_gdata->tx_skb = NULL;
 		/* TTY ldisc cleanup */
 		err = tty_unregister_ldisc(N_TI_WL);
 		if (err)




Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-04 Thread Eric Dumazet
On Mon, 2016-01-04 at 16:14 +0000, Rainer Weikusat wrote:
> Eric Dumazet  writes:
> > On Mon, 2016-01-04 at 10:10 +0100, Jacob Siverskog wrote:
> 
> [...]
> 
> >> I believe the crash occurred between these two actions. I just saw
> >> that there are some interesting events in the log prior to the crash:
> >> kernel: Bluetooth: Unable to push skb to HCI core(-6)
> >> kernel: (stc):  proto stack 4's ->recv failed
> >> kernel: (stc): remove_channel_from_table: id 3
> >> kernel: (stc): remove_channel_from_table: id 2
> >> kernel: (stc): remove_channel_from_table: id 4
> >> kernel: (stc):  all chnl_ids unregistered
> >> kernel: (stk) :ldisc_install = 0(stc): st_tty_close
> >> 
> >> The first print is from btwilink.c. However, I can't see the
> >> connection between Bluetooth (BLE) and UDP/IPv6 (we're not using
> >> 6LoWPAN or anything similar).
> >> 
> >> Thanks, Jacob
> >
> > Definitely these details are useful ;)
> >
> > Could you try :
> >
> > diff --git a/drivers/misc/ti-st/st_core.c b/drivers/misc/ti-st/st_core.c
> > index 6e3af8b42cdd..0c99a74fb895 100644
> > --- a/drivers/misc/ti-st/st_core.c
> > +++ b/drivers/misc/ti-st/st_core.c
> > @@ -912,7 +912,9 @@ void st_core_exit(struct st_data_s *st_gdata)
> > > skb_queue_purge(&st_gdata->txq);
> > > skb_queue_purge(&st_gdata->tx_waitq);
> > kfree_skb(st_gdata->rx_skb);
> > +   st_gdata->rx_skb = NULL;
> > kfree_skb(st_gdata->tx_skb);
> > +   st_gdata->tx_skb = NULL;
> > /* TTY ldisc cleanup */
> > err = tty_unregister_ldisc(N_TI_WL);
> > if (err)
> 
> Hmm ... the code continues with
> 
>   err = tty_unregister_ldisc(N_TI_WL);
>   if (err)
>   pr_err("unable to un-register ldisc %ld", err);
>   /* free the global data pointer */
>   kfree(st_gdata);
> 
> So who would ever see that the rx_skb and tx_skb pointers were cleared
> prior to freeing the data structure containing them?

This is the theory, but I suspect a use after free.

kfree(st_gdata) does not overwrite the freed memory, unless you use
special SLUB/SLAB debugging features.





Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2016-01-04 Thread Rainer Weikusat
Eric Dumazet  writes:
> On Mon, 2016-01-04 at 10:10 +0100, Jacob Siverskog wrote:

[...]

>> I believe the crash occurred between these two actions. I just saw
>> that there are some interesting events in the log prior to the crash:
>> kernel: Bluetooth: Unable to push skb to HCI core(-6)
>> kernel: (stc):  proto stack 4's ->recv failed
>> kernel: (stc): remove_channel_from_table: id 3
>> kernel: (stc): remove_channel_from_table: id 2
>> kernel: (stc): remove_channel_from_table: id 4
>> kernel: (stc):  all chnl_ids unregistered
>> kernel: (stk) :ldisc_install = 0(stc): st_tty_close
>> 
>> The first print is from btwilink.c. However, I can't see the
>> connection between Bluetooth (BLE) and UDP/IPv6 (we're not using
>> 6LoWPAN or anything similar).
>> 
>> Thanks, Jacob
>
> Definitely these details are useful ;)
>
> Could you try :
>
> diff --git a/drivers/misc/ti-st/st_core.c b/drivers/misc/ti-st/st_core.c
> index 6e3af8b42cdd..0c99a74fb895 100644
> --- a/drivers/misc/ti-st/st_core.c
> +++ b/drivers/misc/ti-st/st_core.c
> @@ -912,7 +912,9 @@ void st_core_exit(struct st_data_s *st_gdata)
>   skb_queue_purge(&st_gdata->txq);
>   skb_queue_purge(&st_gdata->tx_waitq);
>   kfree_skb(st_gdata->rx_skb);
> + st_gdata->rx_skb = NULL;
>   kfree_skb(st_gdata->tx_skb);
> + st_gdata->tx_skb = NULL;
>   /* TTY ldisc cleanup */
>   err = tty_unregister_ldisc(N_TI_WL);
>   if (err)

Hmm ... the code continues with

		err = tty_unregister_ldisc(N_TI_WL);
		if (err)
			pr_err("unable to un-register ldisc %ld", err);
		/* free the global data pointer */
		kfree(st_gdata);

So who would ever see that the rx_skb and tx_skb pointers were cleared
prior to freeing the data structure containing them?


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2015-12-30 Thread Cong Wang
On Wed, Dec 30, 2015 at 6:30 AM, Jacob Siverskog
 wrote:
> On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet  wrote:
>> How often can you trigger this bug ?
>
> Ok. I don't have a good repro to trigger it unfortunately, I've seen it just a
> few times when bringing up/down network interfaces. Does the trace
> give any clue?
>

A little bit. You need to help people to narrow down the problem
because there are too many places using skb->next and skb->prev.

Since you mentioned it seems related to network interface flip,
what network interfaces are you using? What is your TC setup?

Thanks.


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2015-12-30 Thread Jacob Siverskog
On Tue, Dec 29, 2015 at 9:08 PM, David Miller  wrote:
> From: Rainer Weikusat 
> Date: Tue, 29 Dec 2015 19:42:36 +
>
>> Jacob Siverskog  writes:
>>> This should fix a NULL pointer dereference I encountered (dump
>>> below). Since __skb_unlink is called while walking,
>>> skb_queue_walk_safe should be used.
>>
>> The code in question is:
>  ...
>> __skb_unlink is only called prior to returning from the function.
>> Consequently, it won't affect the skb_queue_walk code.
>
> Agreed, this patch doesn't fix anything.

Ok. Thanks for your feedback. How do you believe the issue could be
solved? Investigating it gives:

static inline void __skb_unlink(struct sk_buff *skb, struct sk_buff_head *list)
{
	struct sk_buff *next, *prev;

	list->qlen--;
 51c: e2433001 sub r3, r3, #1
 520: e58b3074 str r3, [fp, #116] ; 0x74
	next   = skb->next;
	prev   = skb->prev;
 524: e894000c ldm r4, {r2, r3}
	skb->next  = skb->prev = NULL;
 528: e5841000 str r1, [r4]
 52c: e5841004 str r1, [r4, #4]
	next->prev = prev;
 530: e5823004 str r3, [r2, #4]  <-- trapping instruction (r2 NULL)

Register contents:
r7 : c58cfe1c  r6 : c06351d0  r5 : c77810ac  r4 : c583eac0
r3 :   r2 :   r1 :   r0 : 2013

If I understand this correctly, then r4 = skb, r2 = next, r3 = prev.

Should there be a check for this in __skb_try_recv_datagram?


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2015-12-30 Thread Jacob Siverskog
On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet  wrote:
> On Wed, Dec 30, 2015 at 6:14 AM, Jacob Siverskog
>  wrote:
>
>> Ok. Thanks for your feedback. How do you believe the issue could be
>> solved? Investigating it gives:
>>
>> static inline void __skb_unlink(struct sk_buff *skb, struct sk_buff_head *list)
>> {
>> struct sk_buff *next, *prev;
>>
>> list->qlen--;
>>  51c: e2433001 sub r3, r3, #1
>>  520: e58b3074 str r3, [fp, #116] ; 0x74
>> next   = skb->next;
>> prev   = skb->prev;
>>  524: e894000c ldm r4, {r2, r3}
>> skb->next  = skb->prev = NULL;
>>  528: e5841000 str r1, [r4]
>>  52c: e5841004 str r1, [r4, #4]
>> next->prev = prev;
>>  530: e5823004 str r3, [r2, #4]  <-- trapping instruction (r2 NULL)
>>
>> Register contents:
>> r7 : c58cfe1c  r6 : c06351d0  r5 : c77810ac  r4 : c583eac0
>> r3 :   r2 :   r1 :   r0 : 2013
>>
>> If I understand this correctly, then r4 = skb, r2 = next, r3 = prev.
>>
>> Should there be a check for this in __skb_try_recv_datagram?
>
> At this point corruption already happened.
> We can not possibly detect every possible corruption caused by bugs
> elsewhere in the kernel and just 'recover' at this point.
> We must indeed find the root cause and fix it, instead of trying to hide it.
>
> How often can you trigger this bug ?

Ok. I don't have a good repro to trigger it unfortunately, I've seen it just a
few times when bringing up/down network interfaces. Does the trace
give any clue?

[] (__skb_recv_datagram) from [] (udpv6_recvmsg+0x1d0/0x6d0)
[] (udpv6_recvmsg) from [] (inet_recvmsg+0x38/0x4c)
[] (inet_recvmsg) from [] (___sys_recvmsg+0x94/0x170)
[] (___sys_recvmsg) from [] (__sys_recvmsg+0x3c/0x6c)
[] (__sys_recvmsg) from [] (ret_fast_syscall+0x0/0x3c)


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2015-12-30 Thread Eric Dumazet
On Wed, Dec 30, 2015 at 6:14 AM, Jacob Siverskog
 wrote:

> Ok. Thanks for your feedback. How do you believe the issue could be
> solved? Investigating it gives:
>
> static inline void __skb_unlink(struct sk_buff *skb, struct sk_buff_head *list)
> {
> struct sk_buff *next, *prev;
>
> list->qlen--;
>  51c: e2433001 sub r3, r3, #1
>  520: e58b3074 str r3, [fp, #116] ; 0x74
> next   = skb->next;
> prev   = skb->prev;
>  524: e894000c ldm r4, {r2, r3}
> skb->next  = skb->prev = NULL;
>  528: e5841000 str r1, [r4]
>  52c: e5841004 str r1, [r4, #4]
> next->prev = prev;
>  530: e5823004 str r3, [r2, #4]  <-- trapping instruction (r2 NULL)
>
> Register contents:
> r7 : c58cfe1c  r6 : c06351d0  r5 : c77810ac  r4 : c583eac0
> r3 :   r2 :   r1 :   r0 : 2013
>
> If I understand this correctly, then r4 = skb, r2 = next, r3 = prev.
>
> Should there be a check for this in __skb_try_recv_datagram?

At this point corruption already happened.
We can not possibly detect every possible corruption caused by bugs
elsewhere in the kernel and just 'recover' at this point.
We must indeed find the root cause and fix it, instead of trying to hide it.

How often can you trigger this bug ?


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2015-12-30 Thread Eric Dumazet
On Wed, Dec 30, 2015 at 9:30 AM, Jacob Siverskog
 wrote:
> On Wed, Dec 30, 2015 at 2:26 PM, Eric Dumazet  wrote:

>> At this point corruption already happened.
>> We can not possibly detect every possible corruption caused by bugs
>> elsewhere in the kernel and just 'recover' at this point.
>> We must indeed find the root cause and fix it, instead of trying to hide it.
>>
>> How often can you trigger this bug ?
>
> Ok. I don't have a good repro to trigger it unfortunately, I've seen it just a
> few times when bringing up/down network interfaces. Does the trace
> give any clue?
>
> [] (__skb_recv_datagram) from [] (udpv6_recvmsg+0x1d0/0x6d0)
> [] (udpv6_recvmsg) from [] (inet_recvmsg+0x38/0x4c)
> [] (inet_recvmsg) from [] (___sys_recvmsg+0x94/0x170)
> [] (___sys_recvmsg) from [] (__sys_recvmsg+0x3c/0x6c)
> [] (__sys_recvmsg) from [] (ret_fast_syscall+0x0/0x3c)

Not really : it only shows the point where the corruption is detected,
not the point where the corruption happened.

This might be caused by a netfilter module, a buggy driver... it is
hard to know.

You might add some traces on the skb itself, like its length and/or content.


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2015-12-30 Thread Rainer Weikusat
Jacob Siverskog  writes:
> On Tue, Dec 29, 2015 at 9:08 PM, David Miller  wrote:
>> From: Rainer Weikusat 
>> Date: Tue, 29 Dec 2015 19:42:36 +
>>
>>> Jacob Siverskog  writes:
>>>> This should fix a NULL pointer dereference I encountered (dump
>>>> below). Since __skb_unlink is called while walking,
>>>> skb_queue_walk_safe should be used.
>>>
>>> The code in question is:
>>  ...
>>> __skb_unlink is only called prior to returning from the function.
>>> Consequently, it won't affect the skb_queue_walk code.
>>
>> Agreed, this patch doesn't fix anything.
>
> Ok. Thanks for your feedback. How do you believe the issue could be
> solved? Investigating it gives:
>
> static inline void __skb_unlink(struct sk_buff *skb, struct sk_buff_head *list)
> {

[...]

> next->prev = prev;
>  530: e5823004 str r3, [r2, #4]  <-- trapping instruction (r2 NULL)
>
> Register contents:
> r7 : c58cfe1c  r6 : c06351d0  r5 : c77810ac  r4 : c583eac0
> r3 :   r2 :   r1 :   r0 : 2013
>
> If I understand this correctly, then r4 = skb, r2 = next, r3 = prev.

Some additional information which may be helpful: The next->prev = prev
was pretty obvious from the original error message alone: The invalid
access happened at 4 but no register contained 4. Considering that this
is for ARM, this must have been caused by an instruction using an
address of the form

[Rx, #4]

ie, value of register x + 4. And the next->prev = prev is the only
access to something located 4 bytes beyond something else.
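
The offset argument above can be checked with a small userspace sketch.
The struct below is a hypothetical stand-in for the first two members of
struct sk_buff (it is not the kernel definition), and the two helper
names are made up for illustration. It only does address arithmetic, so
nothing is dereferenced:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the first two members of struct sk_buff. */
struct node {
	struct node *next;	/* offset 0 */
	struct node *prev;	/* offset sizeof(void *), i.e. 4 on 32-bit ARM */
};

/*
 * Address that "n->prev = ..." would store to for a node located at
 * 'base', computed arithmetically so that nothing is dereferenced.
 */
uintptr_t prev_store_address(uintptr_t base)
{
	return base + offsetof(struct node, prev);
}

/* Address that "n->next = ..." would store to; always the base itself. */
uintptr_t next_store_address(uintptr_t base)
{
	return base + offsetof(struct node, next);
}
```

With base == 0 (a NULL next pointer), prev_store_address() yields the
size of one pointer, which is 4 on arm(32) and matches the faulting
address, while a store through ->next would have faulted at 0 instead.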

> Should there be a check for this in __skb_try_recv_datagram?

These lists are supposed to be circular, ie, the next pointer of the
last element should point to the first and the prev pointer of the first
to the last. If there's an element with ->next == NULL on the list,
something either didn't do inserts correctly or corrupted an originally
intact list.
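
As a rough illustration of that invariant, here is a userspace sketch
of a circular queue with a sentinel head, plus a checker for it. All
names (struct node, struct queue, queue_is_consistent) are hypothetical
and only mimic the shape of sk_buff / sk_buff_head; this is not kernel
code and not a proposed fix:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of sk_buff / sk_buff_head. */
struct node {
	struct node *next;
	struct node *prev;
};

struct queue {
	struct node head;	/* sentinel; an empty queue points to itself */
	unsigned int qlen;
};

void queue_init(struct queue *q)
{
	q->head.next = q->head.prev = &q->head;
	q->qlen = 0;
}

void queue_add_tail(struct queue *q, struct node *n)
{
	n->next = &q->head;
	n->prev = q->head.prev;
	q->head.prev->next = n;
	q->head.prev = n;
	q->qlen++;
}

/*
 * Returns true if no link is NULL, every next/prev pair is symmetric,
 * and the ring closes after exactly qlen nodes.
 */
bool queue_is_consistent(const struct queue *q)
{
	const struct node *n = &q->head;
	unsigned int seen = 0;

	do {
		if (!n->next || !n->prev)
			return false;
		if (n->next->prev != n)
			return false;
		n = n->next;
		if (n != &q->head && ++seen > q->qlen)
			return false;	/* more nodes than qlen: ring never closes */
	} while (n != &q->head);

	return seen == q->qlen;
}
```

A check like this can only report that the list is already broken; it
says nothing about which code path broke it.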

General advice: The original error occurred with 4.3.0. Had this
happened to me, I'd have tried either to locate the error in the same
kernel version or to reproduce the bug with the one I was planning to
modify. Trying to fix a 'strange memory access' error which was observed
with version x.y by modifying version x.z is IMHO needlessly moving on
shaky ground.



Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2015-12-29 Thread David Miller
From: Rainer Weikusat 
Date: Tue, 29 Dec 2015 19:42:36 +

> Jacob Siverskog  writes:
>> This should fix a NULL pointer dereference I encountered (dump
>> below). Since __skb_unlink is called while walking,
>> skb_queue_walk_safe should be used.
> 
> The code in question is:
 ...
> __skb_unlink is only called prior to returning from the function.
> Consequently, it won't affect the skb_queue_walk code.

Agreed, this patch doesn't fix anything.


Re: [PATCH] net: Fix potential NULL pointer dereference in __skb_try_recv_datagram

2015-12-29 Thread Rainer Weikusat
Jacob Siverskog  writes:
> This should fix a NULL pointer dereference I encountered (dump
> below). Since __skb_unlink is called while walking,
> skb_queue_walk_safe should be used.

The code in question is:

	skb_queue_walk(queue, skb) {
		*last = skb;
		*peeked = skb->peeked;
		if (flags & MSG_PEEK) {
			if (_off >= skb->len && (skb->len || _off ||
						 skb->peeked)) {
				_off -= skb->len;
				continue;
			}

			skb = skb_set_peeked(skb);
			error = PTR_ERR(skb);
			if (IS_ERR(skb)) {
				spin_unlock_irqrestore(&queue->lock,
						       cpu_flags);
				goto no_packet;
			}

			atomic_inc(&skb->users);
		} else
			__skb_unlink(skb, queue);

		spin_unlock_irqrestore(&queue->lock, cpu_flags);
		*off = _off;
		return skb;
	}

__skb_unlink is only called prior to returning from the function.
Consequently, it won't affect the skb_queue_walk code.
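
The same pattern can be sketched in userspace to show why a plain walk
is enough here. Everything below (struct node, struct queue, the
queue_walk macro, queue_take_first) is a hypothetical miniature written
for illustration, not the kernel's implementation; queue_unlink follows
the usual doubly-linked unlink steps:

```c
#include <assert.h>
#include <stddef.h>

struct node {
	struct node *next;
	struct node *prev;
	int len;
};

struct queue {
	struct node head;	/* sentinel; an empty queue points to itself */
	unsigned int qlen;
};

/* Plain walk: the step expression reads n->next of the current node. */
#define queue_walk(q, n) \
	for ((n) = (q)->head.next; (n) != &(q)->head; (n) = (n)->next)

void queue_init(struct queue *q)
{
	q->head.next = q->head.prev = &q->head;
	q->qlen = 0;
}

void queue_add_tail(struct queue *q, struct node *n)
{
	n->next = &q->head;
	n->prev = q->head.prev;
	q->head.prev->next = n;
	q->head.prev = n;
	q->qlen++;
}

/* Unlink n and clear its own links, as a doubly-linked unlink does. */
void queue_unlink(struct queue *q, struct node *n)
{
	struct node *next = n->next, *prev = n->prev;

	q->qlen--;
	n->next = n->prev = NULL;
	next->prev = prev;
	prev->next = next;
}

/* Skip nodes shorter than min_len; dequeue and return the first that fits. */
struct node *queue_take_first(struct queue *q, int min_len)
{
	struct node *n;

	queue_walk(q, n) {
		if (n->len < min_len)
			continue;	/* n is still linked, so n->next is valid */
		queue_unlink(q, n);
		return n;		/* leaves the loop before n->next is read again */
	}
	return NULL;
}
```

The only path that reaches the loop's step expression is the continue,
and on that path the node has not been unlinked. A _safe variant would
matter only if the walk kept going after the unlink.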