Re: [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure

2019-02-28 Thread Wei Liu
On Thu, Feb 28, 2019 at 12:07:07PM +, Paul Durrant wrote:
> Yes, I meant kfree_skb(nskb).
> 

In that case I think your patch looks fine.

Wei.


RE: [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure

2019-02-28 Thread Paul Durrant


> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xenproject.org] On Behalf Of 
> Paul Durrant
> Sent: 28 February 2019 11:22
> To: Wei Liu 
> Cc: Igor Druzhinin ; Wei Liu 
> ; net...@vger.kernel.org;
> linux-kernel@vger.kernel.org; xen-de...@lists.xenproject.org; 
> da...@davemloft.net
> Subject: Re: [Xen-devel] [PATCH] xen-netback: fix occasional leak of grant 
> ref mappings under memory
> pressure
> 
> > -Original Message-
> > From: Wei Liu [mailto:wei.l...@citrix.com]
> > Sent: 28 February 2019 11:02
> > To: Paul Durrant 
> > Cc: Igor Druzhinin ; 
> > xen-de...@lists.xenproject.org;
> > net...@vger.kernel.org; linux-kernel@vger.kernel.org; Wei Liu 
> > ;
> > da...@davemloft.net
> > Subject: Re: [PATCH] xen-netback: fix occasional leak of grant ref mappings 
> > under memory pressure
> >
> > On Thu, Feb 28, 2019 at 09:46:57AM +, Paul Durrant wrote:
> > > > -Original Message-
> > > > From: Igor Druzhinin [mailto:igor.druzhi...@citrix.com]
> > > > Sent: 28 February 2019 02:03
> > > > To: xen-de...@lists.xenproject.org; net...@vger.kernel.org; 
> > > > linux-kernel@vger.kernel.org
> > > > Cc: Wei Liu ; Paul Durrant 
> > > > ; da...@davemloft.net;
> > Igor
> > > > Druzhinin 
> > > > Subject: [PATCH] xen-netback: fix occasional leak of grant ref mappings 
> > > > under memory pressure
> > > >
> > > > Zero-copy callback flag is not yet set on frag list skb at the moment
> > > > xenvif_handle_frag_list() returns -ENOMEM. This eventually results in
> > > > leaking grant ref mappings since xenvif_zerocopy_callback() is never
> > > > called for these fragments. Those eventually build up and cause Xen
> > > > to kill Dom0 as the slots get reused for new mappings.
> > > >
> > > > That behavior is observed under certain workloads where sudden spikes
> > > > of page cache usage for writes coexist with active atomic skb 
> > > > allocations.
> > > >
> > > > Signed-off-by: Igor Druzhinin 
> > > > ---
> > > >  drivers/net/xen-netback/netback.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/drivers/net/xen-netback/netback.c 
> > > > b/drivers/net/xen-netback/netback.c
> > > > index 80aae3a..2023317 100644
> > > > --- a/drivers/net/xen-netback/netback.c
> > > > +++ b/drivers/net/xen-netback/netback.c
> > > > @@ -1146,9 +1146,12 @@ static int xenvif_tx_submit(struct xenvif_queue 
> > > > *queue)
> > > >
> > > > if (unlikely(skb_has_frag_list(skb))) {
> > > > if (xenvif_handle_frag_list(queue, skb)) {
> > > > +   struct sk_buff *nskb =
> > > > +   skb_shinfo(skb)->frag_list;
> > > > if (net_ratelimit())
> > > > netdev_err(queue->vif->dev,
> > > >"Not enough memory to consolidate frag_list!\n");
> > > > +   xenvif_skb_zerocopy_prepare(queue, nskb);
> > > > xenvif_skb_zerocopy_prepare(queue, skb);
> > > > kfree_skb(skb);
> > > > continue;
> > >
> > > Whilst this fix will do the job, I think it would be better to get rid of
> > > the kfree_skb() from inside xenvif_handle_frag_list() and always deal with
> > > it here rather than having it happen in two different places. Something
> > > like the following...
> >
> > +1 for having only one place.
> >
> > >
> > > ---8<---
> > > diff --git a/drivers/net/xen-netback/netback.c 
> > > b/drivers/net/xen-netback/netback.c
> > > index 80aae3a32c2a..093c7b860772 100644
> > > --- a/drivers/net/xen-netback/netback.c
> > > +++ b/drivers/net/xen-netback/netback.c
> > > @@ -1027,13 +1027,13 @@ static void xenvif_tx_build_gops(struct 
> > > xenvif_queue *queue,
> > >  /* Consolidate skb with a frag_list into a brand new one with local 
> > > pages on
> > >   * frags. Returns 0 or -ENOMEM if can't allocate new pages.
> > >  */

RE: [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure

2019-02-28 Thread Paul Durrant
> -Original Message-
> From: Igor Druzhinin [mailto:igor.druzhi...@citrix.com]
> Sent: 28 February 2019 11:44
> To: Paul Durrant ; Wei Liu 
> Cc: xen-de...@lists.xenproject.org; net...@vger.kernel.org; 
> linux-kernel@vger.kernel.org;
> da...@davemloft.net
> Subject: Re: [PATCH] xen-netback: fix occasional leak of grant ref mappings 
> under memory pressure
> 
> On 28/02/2019 11:21, Paul Durrant wrote:
> >>> @@ -1153,6 +1152,10 @@ static int xenvif_tx_submit(struct xenvif_queue 
> >>> *queue)
> >>> kfree_skb(skb);
> >>> continue;
> >>> }
> >>> +
> >>> +   /* Copied all the bits from the frag list. */
> >>> +   skb_frag_list_init(skb);
> >>> +   kfree(nskb);
> >>
> >> I think you want kfree_skb here?
> >
> > No. nskb is the frag list... it is unlinked from skb by the call to 
> > skb_frag_list_init() and then it
> can be freed on its own. The skb is what we need to retain, because that now 
> contains all the data.
> >
> 
> Are you saying previous code in xenvif_handle_frag_list() incorrectly
> called kfree_skb()?

No, it correctly called kfree_skb() on nskb in the success case. What Wei and 
myself would prefer is that we have a single place that the frag list is freed 
in both the success and error cases.

  Paul

> 
> Igor


Re: [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure

2019-02-28 Thread Igor Druzhinin
On 28/02/2019 11:21, Paul Durrant wrote:
>>> @@ -1153,6 +1152,10 @@ static int xenvif_tx_submit(struct xenvif_queue 
>>> *queue)
>>> kfree_skb(skb);
>>> continue;
>>> }
>>> +
>>> +   /* Copied all the bits from the frag list. */
>>> +   skb_frag_list_init(skb);
>>> +   kfree(nskb);
>>
>> I think you want kfree_skb here?
> 
> No. nskb is the frag list... it is unlinked from skb by the call to 
> skb_frag_list_init() and then it can be freed on its own. The skb is what we 
> need to retain, because that now contains all the data.
> 

Are you saying previous code in xenvif_handle_frag_list() incorrectly
called kfree_skb()?

Igor


RE: [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure

2019-02-28 Thread Paul Durrant
> -Original Message-
> From: Wei Liu [mailto:wei.l...@citrix.com]
> Sent: 28 February 2019 11:02
> To: Paul Durrant 
> Cc: Igor Druzhinin ; 
> xen-de...@lists.xenproject.org;
> net...@vger.kernel.org; linux-kernel@vger.kernel.org; Wei Liu 
> ;
> da...@davemloft.net
> Subject: Re: [PATCH] xen-netback: fix occasional leak of grant ref mappings 
> under memory pressure
> 
> On Thu, Feb 28, 2019 at 09:46:57AM +, Paul Durrant wrote:
> > > -Original Message-
> > > From: Igor Druzhinin [mailto:igor.druzhi...@citrix.com]
> > > Sent: 28 February 2019 02:03
> > > To: xen-de...@lists.xenproject.org; net...@vger.kernel.org; 
> > > linux-kernel@vger.kernel.org
> > > Cc: Wei Liu ; Paul Durrant 
> > > ; da...@davemloft.net;
> Igor
> > > Druzhinin 
> > > Subject: [PATCH] xen-netback: fix occasional leak of grant ref mappings 
> > > under memory pressure
> > >
> > > Zero-copy callback flag is not yet set on frag list skb at the moment
> > > xenvif_handle_frag_list() returns -ENOMEM. This eventually results in
> > > leaking grant ref mappings since xenvif_zerocopy_callback() is never
> > > called for these fragments. Those eventually build up and cause Xen
> > > to kill Dom0 as the slots get reused for new mappings.
> > >
> > > That behavior is observed under certain workloads where sudden spikes
> > > of page cache usage for writes coexist with active atomic skb allocations.
> > >
> > > Signed-off-by: Igor Druzhinin 
> > > ---
> > >  drivers/net/xen-netback/netback.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/net/xen-netback/netback.c 
> > > b/drivers/net/xen-netback/netback.c
> > > index 80aae3a..2023317 100644
> > > --- a/drivers/net/xen-netback/netback.c
> > > +++ b/drivers/net/xen-netback/netback.c
> > > @@ -1146,9 +1146,12 @@ static int xenvif_tx_submit(struct xenvif_queue 
> > > *queue)
> > >
> > >   if (unlikely(skb_has_frag_list(skb))) {
> > >   if (xenvif_handle_frag_list(queue, skb)) {
> > > + struct sk_buff *nskb =
> > > + skb_shinfo(skb)->frag_list;
> > >   if (net_ratelimit())
> > >   netdev_err(queue->vif->dev,
> > >  "Not enough memory to consolidate frag_list!\n");
> > > + xenvif_skb_zerocopy_prepare(queue, nskb);
> > >   xenvif_skb_zerocopy_prepare(queue, skb);
> > >   kfree_skb(skb);
> > >   continue;
> >
> > Whilst this fix will do the job, I think it would be better to get rid of
> > the kfree_skb() from inside xenvif_handle_frag_list() and always deal with
> > it here rather than having it happen in two different places. Something
> > like the following...
> 
> +1 for having only one place.
> 
> >
> > ---8<---
> > diff --git a/drivers/net/xen-netback/netback.c 
> > b/drivers/net/xen-netback/netback.c
> > index 80aae3a32c2a..093c7b860772 100644
> > --- a/drivers/net/xen-netback/netback.c
> > +++ b/drivers/net/xen-netback/netback.c
> > @@ -1027,13 +1027,13 @@ static void xenvif_tx_build_gops(struct 
> > xenvif_queue *queue,
> >  /* Consolidate skb with a frag_list into a brand new one with local pages 
> > on
> >   * frags. Returns 0 or -ENOMEM if can't allocate new pages.
> >   */
> > -static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb)
> > +static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb,
> > +  struct sk_buff *nskb)

Re: [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure

2019-02-28 Thread Wei Liu
On Thu, Feb 28, 2019 at 09:46:57AM +, Paul Durrant wrote:
> > -Original Message-
> > From: Igor Druzhinin [mailto:igor.druzhi...@citrix.com]
> > Sent: 28 February 2019 02:03
> > To: xen-de...@lists.xenproject.org; net...@vger.kernel.org; 
> > linux-kernel@vger.kernel.org
> > Cc: Wei Liu ; Paul Durrant ; 
> > da...@davemloft.net; Igor
> > Druzhinin 
> > Subject: [PATCH] xen-netback: fix occasional leak of grant ref mappings 
> > under memory pressure
> > 
> > Zero-copy callback flag is not yet set on frag list skb at the moment
> > xenvif_handle_frag_list() returns -ENOMEM. This eventually results in
> > leaking grant ref mappings since xenvif_zerocopy_callback() is never
> > called for these fragments. Those eventually build up and cause Xen
> > to kill Dom0 as the slots get reused for new mappings.
> > 
> > That behavior is observed under certain workloads where sudden spikes
> > of page cache usage for writes coexist with active atomic skb allocations.
> > 
> > Signed-off-by: Igor Druzhinin 
> > ---
> >  drivers/net/xen-netback/netback.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/net/xen-netback/netback.c 
> > b/drivers/net/xen-netback/netback.c
> > index 80aae3a..2023317 100644
> > --- a/drivers/net/xen-netback/netback.c
> > +++ b/drivers/net/xen-netback/netback.c
> > @@ -1146,9 +1146,12 @@ static int xenvif_tx_submit(struct xenvif_queue 
> > *queue)
> > 
> > if (unlikely(skb_has_frag_list(skb))) {
> > if (xenvif_handle_frag_list(queue, skb)) {
> > +   struct sk_buff *nskb =
> > +   skb_shinfo(skb)->frag_list;
> > if (net_ratelimit())
> > netdev_err(queue->vif->dev,
> >    "Not enough memory to consolidate frag_list!\n");
> > +   xenvif_skb_zerocopy_prepare(queue, nskb);
> > xenvif_skb_zerocopy_prepare(queue, skb);
> > kfree_skb(skb);
> > continue;
> 
> Whilst this fix will do the job, I think it would be better to get rid of the 
> kfree_skb() from inside xenvif_handle_frag_list() and always deal with it 
> here rather than having it happen in two different places. Something like the 
> following...

+1 for having only one place. 

> 
> ---8<---
> diff --git a/drivers/net/xen-netback/netback.c 
> b/drivers/net/xen-netback/netback.c
> index 80aae3a32c2a..093c7b860772 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -1027,13 +1027,13 @@ static void xenvif_tx_build_gops(struct xenvif_queue 
> *queue,
>  /* Consolidate skb with a frag_list into a brand new one with local pages on
>   * frags. Returns 0 or -ENOMEM if can't allocate new pages.
>   */
> -static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb)
> +static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb,
> +  struct sk_buff *nskb)
>  {
> unsigned int offset = skb_headlen(skb);
> skb_frag_t frags[MAX_SKB_FRAGS];
> int i, f;
> struct ubuf_info *uarg;
> -   struct sk_buff *nskb = skb_shinfo(skb)->frag_list;
> 
> queue->stats.tx_zerocopy_sent += 2;
> queue->stats.tx_frag_overflow++;
> @@ -1072,11 +1072,6 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb)
> skb_frag_size_set(&frags[i], len);
> }
> 
> -   /* Copied all the bits from the frag list -- free it. */
> -   skb_frag_list_init(skb);
> -   xenvif_skb_zerocopy_prepare(queue, nskb);
> -   kfree_skb(nskb);
> -
> /* Release all the original (foreign) frags. */
> for (f = 0; f < skb_shinfo(skb)->nr_frags; f++)
> skb_frag_unref(skb, f);
> @@ -1145,7 +1140,11 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
> xenvif_fill_frags(queue, skb);
> 
> if (unlikely(skb_has_frag_list(skb))) {
> -   if (xenvif_handle_frag_list(queue, skb)) {
> +   struct sk_buff *nskb = skb_shinfo(skb)->frag_list;
> +
> +   

RE: [PATCH] xen-netback: fix occasional leak of grant ref mappings under memory pressure

2019-02-28 Thread Paul Durrant
> -Original Message-
> From: Igor Druzhinin [mailto:igor.druzhi...@citrix.com]
> Sent: 28 February 2019 02:03
> To: xen-de...@lists.xenproject.org; net...@vger.kernel.org; 
> linux-kernel@vger.kernel.org
> Cc: Wei Liu ; Paul Durrant ; 
> da...@davemloft.net; Igor
> Druzhinin 
> Subject: [PATCH] xen-netback: fix occasional leak of grant ref mappings under 
> memory pressure
> 
> Zero-copy callback flag is not yet set on frag list skb at the moment
> xenvif_handle_frag_list() returns -ENOMEM. This eventually results in
> leaking grant ref mappings since xenvif_zerocopy_callback() is never
> called for these fragments. Those eventually build up and cause Xen
> to kill Dom0 as the slots get reused for new mappings.
> 
> That behavior is observed under certain workloads where sudden spikes
> of page cache usage for writes coexist with active atomic skb allocations.
> 
> Signed-off-by: Igor Druzhinin 
> ---
>  drivers/net/xen-netback/netback.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/xen-netback/netback.c 
> b/drivers/net/xen-netback/netback.c
> index 80aae3a..2023317 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -1146,9 +1146,12 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
> 
>   if (unlikely(skb_has_frag_list(skb))) {
>   if (xenvif_handle_frag_list(queue, skb)) {
> + struct sk_buff *nskb =
> + skb_shinfo(skb)->frag_list;
>   if (net_ratelimit())
>   netdev_err(queue->vif->dev,
>  "Not enough memory to consolidate frag_list!\n");
> + xenvif_skb_zerocopy_prepare(queue, nskb);
>   xenvif_skb_zerocopy_prepare(queue, skb);
>   kfree_skb(skb);
>   continue;

Whilst this fix will do the job, I think it would be better to get rid of the 
kfree_skb() from inside xenvif_handle_frag_list() and always deal with it here 
rather than having it happen in two different places. Something like the 
following...

---8<---
diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index 80aae3a32c2a..093c7b860772 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1027,13 +1027,13 @@ static void xenvif_tx_build_gops(struct xenvif_queue 
*queue,
 /* Consolidate skb with a frag_list into a brand new one with local pages on
  * frags. Returns 0 or -ENOMEM if can't allocate new pages.
  */
-static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb)
+static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb,
+  struct sk_buff *nskb)
 {
unsigned int offset = skb_headlen(skb);
skb_frag_t frags[MAX_SKB_FRAGS];
int i, f;
struct ubuf_info *uarg;
-   struct sk_buff *nskb = skb_shinfo(skb)->frag_list;

queue->stats.tx_zerocopy_sent += 2;
queue->stats.tx_frag_overflow++;
@@ -1072,11 +1072,6 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *skb)
skb_frag_size_set(&frags[i], len);
}

-   /* Copied all the bits from the frag list -- free it. */
-   skb_frag_list_init(skb);
-   xenvif_skb_zerocopy_prepare(queue, nskb);
-   kfree_skb(nskb);
-
/* Release all the original (foreign) frags. */
for (f = 0; f < skb_shinfo(skb)->nr_frags; f++)
skb_frag_unref(skb, f);
@@ -1145,7 +1140,11 @@ static int xenvif_tx_submit(struct xenvif_queue *queue)
xenvif_fill_frags(queue, skb);

if (unlikely(skb_has_frag_list(skb))) {
-   if (xenvif_handle_frag_list(queue, skb)) {
+   struct sk_buff *nskb = skb_shinfo(skb)->frag_list;
+
+   xenvif_skb_zerocopy_prepare(queue, nskb);
+
+   if (xenvif_handle_frag_list(queue, skb, nskb)) {
if (net_ratelimit())
netdev_err(queue->vif->dev,
   "Not enough memory to consolidate frag_list!\n");