Re: [PATCH v4 1/2] rtw88: pci: Rearrange the memory usage for skb in RX ISR

2019-07-24 Thread Kalle Valo
Jian-Hong Pan  wrote:

> Testing with RTL8822BE hardware, when available memory is low, we
> frequently see a kernel panic and system freeze.
> 
> First, rtw_pci_rx_isr encounters a memory allocation failure (trimmed):
> 
> rx routine starvation
> WARNING: CPU: 7 PID: 9871 at drivers/net/wireless/realtek/rtw88/pci.c:822 
> rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
> [ 2356.580313] RIP: 0010:rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
> 
> Then we see a variety of different error conditions and kernel panics,
> such as this one (trimmed):
> 
> rtw_pci :02:00.0: pci bus timeout, check dma status
> skbuff: skb_over_panic: text:091b6e66 len:415 put:415 
> head:d2880c6f data:7a02b1ea tail:0x1df end:0xc0 dev:
> [ cut here ]
> kernel BUG at net/core/skbuff.c:105!
> invalid opcode:  [#1] SMP NOPTI
> RIP: 0010:skb_panic+0x43/0x45
> 
> When skb allocation fails and the "rx routine starvation" is hit, the
> function returns immediately without updating the RX ring. At this
> point, the RX ring may continue referencing an old skb which was already
> handed off to ieee80211_rx_irqsafe(). When it comes to be used again,
> bad things happen.
> 
> This patch allocates a new, data-sized skb first in RX ISR. After
> copying the data in, we pass it to the upper layers. However, if skb
> allocation fails, we effectively drop the frame. In both cases, the
> original, full size ring skb is reused.
> 
> In addition to fixing the kernel crash, the RX routine should now
> generally behave better under low memory conditions.
> 
> Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053
> Signed-off-by: Jian-Hong Pan 
> Cc: 

2 patches applied to wireless-drivers-next.git, thanks.

ee6db78f5db9 rtw88: pci: Rearrange the memory usage for skb in RX ISR
29b68a920f6a rtw88: pci: Use DMA sync instead of remapping in RX ISR

-- 
https://patchwork.kernel.org/patch/11039275/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches



Re: [PATCH v4 1/2] rtw88: pci: Rearrange the memory usage for skb in RX ISR

2019-07-24 Thread Jian-Hong Pan
On Thu, Jul 11, 2019 at 1:28 PM, Jian-Hong Pan wrote:
>
> On Thu, Jul 11, 2019 at 1:25 PM, Jian-Hong Pan wrote:
> >
> > Testing with RTL8822BE hardware, when available memory is low, we
> > frequently see a kernel panic and system freeze.
> >
> > First, rtw_pci_rx_isr encounters a memory allocation failure (trimmed):
> >
> > rx routine starvation
> > WARNING: CPU: 7 PID: 9871 at drivers/net/wireless/realtek/rtw88/pci.c:822 
> > rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
> > [ 2356.580313] RIP: 0010:rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
> >
> > Then we see a variety of different error conditions and kernel panics,
> > such as this one (trimmed):
> >
> > rtw_pci :02:00.0: pci bus timeout, check dma status
> > skbuff: skb_over_panic: text:091b6e66 len:415 put:415 
> > head:d2880c6f data:7a02b1ea tail:0x1df end:0xc0 dev:
> > [ cut here ]
> > kernel BUG at net/core/skbuff.c:105!
> > invalid opcode:  [#1] SMP NOPTI
> > RIP: 0010:skb_panic+0x43/0x45
> >
> > When skb allocation fails and the "rx routine starvation" is hit, the
> > function returns immediately without updating the RX ring. At this
> > point, the RX ring may continue referencing an old skb which was already
> > handed off to ieee80211_rx_irqsafe(). When it comes to be used again,
> > bad things happen.
> >
> > This patch allocates a new, data-sized skb first in RX ISR. After
> > copying the data in, we pass it to the upper layers. However, if skb
> > allocation fails, we effectively drop the frame. In both cases, the
> > original, full size ring skb is reused.
> >
> > In addition to fixing the kernel crash, the RX routine should now
> > generally behave better under low memory conditions.
> >
> > Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053
> > Signed-off-by: Jian-Hong Pan 
> > Cc: 
> > ---
>
> Sorry, I forgot to include the changes between versions here.
>
> v2:
>  - Allocate a new data-sized skb and put the data into it, then pass it to
>    mac80211. Reuse the original skb in the RX ring by DMA sync.
>  - Modify the commit message.
>  - Introduce the following [PATCH v3 2/2] rtw88: pci: Use DMA sync instead
>    of remapping in RX ISR.
>
> v3:
>  - Same as v2.
>
> v4:
>  - Fix comment: allocate a new skb for this frame, discard the frame
>    if none available
>
> >  drivers/net/wireless/realtek/rtw88/pci.c | 49 +++-
> >  1 file changed, 22 insertions(+), 27 deletions(-)
> >
> > diff --git a/drivers/net/wireless/realtek/rtw88/pci.c 
> > b/drivers/net/wireless/realtek/rtw88/pci.c
> > index cfe05ba7280d..c415f5e94fed 100644
> > --- a/drivers/net/wireless/realtek/rtw88/pci.c
> > +++ b/drivers/net/wireless/realtek/rtw88/pci.c
> > @@ -763,6 +763,7 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, 
> > struct rtw_pci *rtwpci,
> > u32 pkt_offset;
> > u32 pkt_desc_sz = chip->rx_pkt_desc_sz;
> > u32 buf_desc_sz = chip->rx_buf_desc_sz;
> > +   u32 new_len;
> > u8 *rx_desc;
> > dma_addr_t dma;
> >
> > @@ -790,40 +791,34 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, 
> > struct rtw_pci *rtwpci,
> > pkt_offset = pkt_desc_sz + pkt_stat.drv_info_sz +
> >  pkt_stat.shift;
> >
> > -   if (pkt_stat.is_c2h) {
> > -   /* keep rx_desc, halmac needs it */
> > -   skb_put(skb, pkt_stat.pkt_len + pkt_offset);
> > +   /* allocate a new skb for this frame,
> > +* discard the frame if none available
> > +*/
> > +   new_len = pkt_stat.pkt_len + pkt_offset;
> > +   new = dev_alloc_skb(new_len);
> > +   if (WARN_ONCE(!new, "rx routine starvation\n"))
> > +   goto next_rp;
> > +
> > +   /* put the DMA data including rx_desc from phy to new skb */
> > +   skb_put_data(new, skb->data, new_len);
> >
> > -   /* pass offset for further operation */
> > -   *((u32 *)skb->cb) = pkt_offset;
> > -   skb_queue_tail(&rtwdev->c2h_queue, skb);
> > +   if (pkt_stat.is_c2h) {
> > +/* pass rx_desc & offset for further operation */
> > +   *((u32 *)new->cb) = pkt_offset;
> > +   skb_queue_tail(&rtwdev->c2h_queue, new);
> > ieee80211_queue_work(rtwdev->hw, &rtwdev->c2h_work);
> > } else {
> > -   /* remove rx_desc, maybe use skb_pull? */
> > -   skb_put(skb, pkt_stat.pkt_len);
> > -   skb_reserve(skb, pkt_offset);
> > -
> > -   /* alloc a smaller skb to mac80211 */
> > -   new = dev_alloc_skb(pkt_stat.pkt_len);
> > -   if (!new) {
> > -   new = skb;
> > -   } else {
> > -   skb_put_data(new, skb->data, skb->len);
> > -   

Re: [PATCH v4 1/2] rtw88: pci: Rearrange the memory usage for skb in RX ISR

2019-07-10 Thread Jian-Hong Pan
On Thu, Jul 11, 2019 at 1:25 PM, Jian-Hong Pan wrote:
>
> Testing with RTL8822BE hardware, when available memory is low, we
> frequently see a kernel panic and system freeze.
>
> First, rtw_pci_rx_isr encounters a memory allocation failure (trimmed):
>
> rx routine starvation
> WARNING: CPU: 7 PID: 9871 at drivers/net/wireless/realtek/rtw88/pci.c:822 
> rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
> [ 2356.580313] RIP: 0010:rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
>
> Then we see a variety of different error conditions and kernel panics,
> such as this one (trimmed):
>
> rtw_pci :02:00.0: pci bus timeout, check dma status
> skbuff: skb_over_panic: text:091b6e66 len:415 put:415 
> head:d2880c6f data:7a02b1ea tail:0x1df end:0xc0 dev:
> [ cut here ]
> kernel BUG at net/core/skbuff.c:105!
> invalid opcode:  [#1] SMP NOPTI
> RIP: 0010:skb_panic+0x43/0x45
>
> When skb allocation fails and the "rx routine starvation" is hit, the
> function returns immediately without updating the RX ring. At this
> point, the RX ring may continue referencing an old skb which was already
> handed off to ieee80211_rx_irqsafe(). When it comes to be used again,
> bad things happen.
>
> This patch allocates a new, data-sized skb first in RX ISR. After
> copying the data in, we pass it to the upper layers. However, if skb
> allocation fails, we effectively drop the frame. In both cases, the
> original, full size ring skb is reused.
>
> In addition to fixing the kernel crash, the RX routine should now
> generally behave better under low memory conditions.
>
> Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053
> Signed-off-by: Jian-Hong Pan 
> Cc: 
> ---

Sorry, I forgot to include the changes between versions here.

v2:
 - Allocate new data-sized skb and put data into it, then pass it to
   mac80211. Reuse the original skb in RX ring by DMA sync.
 - Modify the commit message.
 - Introduce following [PATCH v3 2/2] rtw88: pci: Use DMA sync instead
   of remapping in RX ISR.

v3:
 - Same as v2.

v4:
 - Fix comment: allocate a new skb for this frame, discard the frame
   if none available
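
For readers who want the v2 idea in isolation, here is a minimal userspace
sketch of "copy into a data-sized buffer, keep the ring buffer". It is not
the driver code: struct frame, rx_one(), failing_alloc() and RING_BUF_SIZE
are hypothetical names for illustration, and plain malloc() stands in for
dev_alloc_skb().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the full-size buffer that stays in the ring. */
#define RING_BUF_SIZE 2048

/* Hypothetical, exactly sized copy of one received frame. */
struct frame {
	size_t len;
	unsigned char *data;
};

/* Allocator hook so that a caller can simulate memory pressure. */
typedef void *(*alloc_fn)(size_t);

/*
 * Copy len bytes out of the long-lived ring buffer into a freshly
 * allocated frame. Returns NULL when the allocation fails, which means
 * the frame is dropped. The ring buffer itself is never handed off, so
 * the ring slot stays valid in both cases.
 */
struct frame *rx_one(const unsigned char *ring_buf, size_t len, alloc_fn alloc)
{
	struct frame *f;

	if (len > RING_BUF_SIZE)
		return NULL;

	f = alloc(sizeof(*f) + len);
	if (!f)
		return NULL;	/* drop the frame, keep the ring buffer */

	f->len = len;
	f->data = (unsigned char *)(f + 1);
	memcpy(f->data, ring_buf, len);
	return f;
}

/* Allocator that always fails, to exercise the drop path. */
void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}
```

Calling rx_one(ring, 415, malloc) yields a 415-byte copy owned by the
caller, while rx_one(ring, 415, failing_alloc) returns NULL and leaves the
ring buffer untouched, which is the property the patch relies on.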

>  drivers/net/wireless/realtek/rtw88/pci.c | 49 +++-
>  1 file changed, 22 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/net/wireless/realtek/rtw88/pci.c 
> b/drivers/net/wireless/realtek/rtw88/pci.c
> index cfe05ba7280d..c415f5e94fed 100644
> --- a/drivers/net/wireless/realtek/rtw88/pci.c
> +++ b/drivers/net/wireless/realtek/rtw88/pci.c
> @@ -763,6 +763,7 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct 
> rtw_pci *rtwpci,
> u32 pkt_offset;
> u32 pkt_desc_sz = chip->rx_pkt_desc_sz;
> u32 buf_desc_sz = chip->rx_buf_desc_sz;
> +   u32 new_len;
> u8 *rx_desc;
> dma_addr_t dma;
>
> @@ -790,40 +791,34 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, 
> struct rtw_pci *rtwpci,
> pkt_offset = pkt_desc_sz + pkt_stat.drv_info_sz +
>  pkt_stat.shift;
>
> -   if (pkt_stat.is_c2h) {
> -   /* keep rx_desc, halmac needs it */
> -   skb_put(skb, pkt_stat.pkt_len + pkt_offset);
> +   /* allocate a new skb for this frame,
> +* discard the frame if none available
> +*/
> +   new_len = pkt_stat.pkt_len + pkt_offset;
> +   new = dev_alloc_skb(new_len);
> +   if (WARN_ONCE(!new, "rx routine starvation\n"))
> +   goto next_rp;
> +
> +   /* put the DMA data including rx_desc from phy to new skb */
> +   skb_put_data(new, skb->data, new_len);
>
> -   /* pass offset for further operation */
> -   *((u32 *)skb->cb) = pkt_offset;
> -   skb_queue_tail(&rtwdev->c2h_queue, skb);
> +   if (pkt_stat.is_c2h) {
> +/* pass rx_desc & offset for further operation */
> +   *((u32 *)new->cb) = pkt_offset;
> +   skb_queue_tail(&rtwdev->c2h_queue, new);
> ieee80211_queue_work(rtwdev->hw, &rtwdev->c2h_work);
> } else {
> -   /* remove rx_desc, maybe use skb_pull? */
> -   skb_put(skb, pkt_stat.pkt_len);
> -   skb_reserve(skb, pkt_offset);
> -
> -   /* alloc a smaller skb to mac80211 */
> -   new = dev_alloc_skb(pkt_stat.pkt_len);
> -   if (!new) {
> -   new = skb;
> -   } else {
> -   skb_put_data(new, skb->data, skb->len);
> -   dev_kfree_skb_any(skb);
> -   }
> -   /* TODO: merge into rx.c */
> -   rtw_rx_stats(rtwdev, pkt_stat.vif, skb);
> +   /* remove rx_desc */
> +   

[PATCH v4 1/2] rtw88: pci: Rearrange the memory usage for skb in RX ISR

2019-07-10 Thread Jian-Hong Pan
Testing with RTL8822BE hardware, when available memory is low, we
frequently see a kernel panic and system freeze.

First, rtw_pci_rx_isr encounters a memory allocation failure (trimmed):

rx routine starvation
WARNING: CPU: 7 PID: 9871 at drivers/net/wireless/realtek/rtw88/pci.c:822 
rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
[ 2356.580313] RIP: 0010:rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]

Then we see a variety of different error conditions and kernel panics,
such as this one (trimmed):

rtw_pci :02:00.0: pci bus timeout, check dma status
skbuff: skb_over_panic: text:091b6e66 len:415 put:415 
head:d2880c6f data:7a02b1ea tail:0x1df end:0xc0 dev:
[ cut here ]
kernel BUG at net/core/skbuff.c:105!
invalid opcode:  [#1] SMP NOPTI
RIP: 0010:skb_panic+0x43/0x45

When skb allocation fails and the "rx routine starvation" is hit, the
function returns immediately without updating the RX ring. At this
point, the RX ring may continue referencing an old skb which was already
handed off to ieee80211_rx_irqsafe(). When it comes to be used again,
bad things happen.
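
The numbers in the skb_over_panic line above can be checked with a small
userspace model of the bounds check that skb_put() performs. This is only a
sketch: struct toy_skb and toy_skb_put() are hypothetical names, not the
kernel's struct sk_buff. It assumes the logged tail:0x1df is the value after
the 415-byte put, so the tail before the put was 0x1df - 415 = 64.

```c
#include <assert.h>

/* Toy model of the offsets that skb_put() checks; not the kernel struct. */
struct toy_skb {
	unsigned int len;	/* bytes of data currently in the buffer */
	unsigned int tail;	/* offset where the data ends */
	unsigned int end;	/* offset where the allocated area ends */
};

/*
 * Append put bytes. Returns 0 on success, or -1 in the case where the
 * kernel would call skb_over_panic() because the data would run past
 * the end of the buffer.
 */
int toy_skb_put(struct toy_skb *skb, unsigned int put)
{
	if (skb->tail > skb->end || put > skb->end - skb->tail)
		return -1;

	skb->tail += put;
	skb->len += put;
	return 0;
}
```

With tail = 64 and end = 0xc0 (192), a put of 415 bytes would move the tail
to 479 (0x1df), well past the end, so toy_skb_put() returns -1. The buffer
behind the stale ring entry no longer has the room that the driver assumes.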

This patch allocates a new, data-sized skb first in RX ISR. After
copying the data in, we pass it to the upper layers. However, if skb
allocation fails, we effectively drop the frame. In both cases, the
original, full size ring skb is reused.

In addition to fixing the kernel crash, the RX routine should now
generally behave better under low memory conditions.

Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053
Signed-off-by: Jian-Hong Pan 
Cc: 
---
 drivers/net/wireless/realtek/rtw88/pci.c | 49 +++-
 1 file changed, 22 insertions(+), 27 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtw88/pci.c 
b/drivers/net/wireless/realtek/rtw88/pci.c
index cfe05ba7280d..c415f5e94fed 100644
--- a/drivers/net/wireless/realtek/rtw88/pci.c
+++ b/drivers/net/wireless/realtek/rtw88/pci.c
@@ -763,6 +763,7 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct 
rtw_pci *rtwpci,
u32 pkt_offset;
u32 pkt_desc_sz = chip->rx_pkt_desc_sz;
u32 buf_desc_sz = chip->rx_buf_desc_sz;
+   u32 new_len;
u8 *rx_desc;
dma_addr_t dma;
 
@@ -790,40 +791,34 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct 
rtw_pci *rtwpci,
pkt_offset = pkt_desc_sz + pkt_stat.drv_info_sz +
 pkt_stat.shift;
 
-   if (pkt_stat.is_c2h) {
-   /* keep rx_desc, halmac needs it */
-   skb_put(skb, pkt_stat.pkt_len + pkt_offset);
+   /* allocate a new skb for this frame,
+* discard the frame if none available
+*/
+   new_len = pkt_stat.pkt_len + pkt_offset;
+   new = dev_alloc_skb(new_len);
+   if (WARN_ONCE(!new, "rx routine starvation\n"))
+   goto next_rp;
+
+   /* put the DMA data including rx_desc from phy to new skb */
+   skb_put_data(new, skb->data, new_len);
 
-   /* pass offset for further operation */
-   *((u32 *)skb->cb) = pkt_offset;
-   skb_queue_tail(&rtwdev->c2h_queue, skb);
+   if (pkt_stat.is_c2h) {
+/* pass rx_desc & offset for further operation */
+   *((u32 *)new->cb) = pkt_offset;
+   skb_queue_tail(&rtwdev->c2h_queue, new);
ieee80211_queue_work(rtwdev->hw, &rtwdev->c2h_work);
} else {
-   /* remove rx_desc, maybe use skb_pull? */
-   skb_put(skb, pkt_stat.pkt_len);
-   skb_reserve(skb, pkt_offset);
-
-   /* alloc a smaller skb to mac80211 */
-   new = dev_alloc_skb(pkt_stat.pkt_len);
-   if (!new) {
-   new = skb;
-   } else {
-   skb_put_data(new, skb->data, skb->len);
-   dev_kfree_skb_any(skb);
-   }
-   /* TODO: merge into rx.c */
-   rtw_rx_stats(rtwdev, pkt_stat.vif, skb);
+   /* remove rx_desc */
+   skb_pull(new, pkt_offset);
+
+   rtw_rx_stats(rtwdev, pkt_stat.vif, new);
memcpy(new->cb, &rx_status, sizeof(rx_status));
ieee80211_rx_irqsafe(rtwdev->hw, new);
}
 
-   /* skb delivered to mac80211, alloc a new one in rx ring */
-   new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
-   if (WARN(!new, "rx routine starvation\n"))
-   return;
-
-   ring->buf[cur_rp] = new;
-   rtw_pci_reset_rx_desc(rtwdev, new, ring, cur_rp, buf_desc_sz);
+next_rp:
+   /* new skb delivered to mac80211,