Re: [Xen-devel] [v3,11/41] mips: reuse asm-generic/barrier.h

2016-01-18 Thread Herbert Xu
Paul E. McKenney  wrote:
>
> You could use SYNC_ACQUIRE() to implement read_barrier_depends() and
> smp_read_barrier_depends(), but SYNC_RMB probably does not suffice.
> The reason for this is that smp_read_barrier_depends() must order the
> pointer load against any subsequent read or write through a dereference
> of that pointer.  For example:
> 
>p = READ_ONCE(gp);
>smp_rmb();
>r1 = p->a; /* ordered by smp_rmb(). */
>p->b = 42; /* NOT ordered by smp_rmb(), BUG!!! */
>r2 = x; /* ordered by smp_rmb(), but doesn't need to be. */
> 
> In contrast:
> 
>p = READ_ONCE(gp);
>smp_read_barrier_depends();
>r1 = p->a; /* ordered by smp_read_barrier_depends(). */
>p->b = 42; /* ordered by smp_read_barrier_depends(). */
>r2 = x; /* not ordered by smp_read_barrier_depends(), which is OK. */
> 
> Again, if your hardware maintains local ordering for address
> and data dependencies, you can have read_barrier_depends() and
> smp_read_barrier_depends() be no-ops like they are for most
> architectures.
> 
> Does that help?

This is crazy! smp_rmb started out being strictly stronger than
smp_read_barrier_depends; when did this stop being the case?
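
For reference, the two options described above would look roughly like
this in an arch barrier header.  This is only a sketch: CONFIG_HW_DEP_ORDERING
and sync_acquire() are hypothetical placeholder names standing in for
whatever config symbol and SYNC_ACQUIRE wrapper the port actually uses,
not the real MIPS macros.

/*
 * Sketch only: CONFIG_HW_DEP_ORDERING and sync_acquire() are hypothetical
 * placeholder names, not the real MIPS symbols.
 */
#ifdef CONFIG_HW_DEP_ORDERING
/* The hardware maintains local ordering for address and data
 * dependencies, so the dependency barriers can be no-ops, as on most
 * architectures.
 */
#define read_barrier_depends()		do { } while (0)
#define smp_read_barrier_depends()	do { } while (0)
#else
/* Otherwise an acquire-style barrier (SYNC_ACQUIRE) is needed: unlike
 * SYNC_RMB it also orders the dependent store (p->b = 42 above), not
 * just the dependent loads.
 */
#define read_barrier_depends()		sync_acquire()
#define smp_read_barrier_depends()	sync_acquire()
#endif
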
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



Re: [Xen-devel] "tcp: refine TSO autosizing" causes performance regression on Xen

2015-04-15 Thread Herbert Xu
Eric Dumazet  wrote:
>
> We already have netdev->gso_max_size and netdev->gso_max_segs
> which are cached into sk->sk_gso_max_size & sk->sk_gso_max_segs

It is quite dangerous to attempt tricks like this because a
tc redirection or netfilter NAT could change the destination
device, rendering such hints incorrect.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



[Xen-devel] caif: Fix napi poll list corruption

2014-12-22 Thread Herbert Xu
On Mon, Dec 22, 2014 at 04:18:33PM +0800, Jason Wang wrote:
>
> btw, looks like at least caif_virtio has the same issue.

Good catch.

-- >8 --
The commit d75b1ade567ffab085e8adbbdacf0092d10cd09c (net: less
interrupt masking in NAPI) breaks caif.

It is now required that if the entire budget is consumed when poll
returns, the napi poll_list must remain empty.  However, like some
other drivers, caif tries to do a last-ditch check, and if there is
more work it will call napi_schedule and then immediately process
some of this new work.  Should the entire budget be consumed while
processing such new work, we will violate the new caller contract.

This patch fixes this by not touching any work when we reschedule
in caif.

Signed-off-by: Herbert Xu 

diff --git a/drivers/net/caif/caif_virtio.c b/drivers/net/caif/caif_virtio.c
index a5fefb9..b306210 100644
--- a/drivers/net/caif/caif_virtio.c
+++ b/drivers/net/caif/caif_virtio.c
@@ -257,7 +257,6 @@ static int cfv_rx_poll(struct napi_struct *napi, int quota)
struct vringh_kiov *riov = &cfv->ctx.riov;
unsigned int skb_len;
 
-again:
do {
skb = NULL;
 
@@ -322,7 +321,6 @@ exit:
napi_schedule_prep(napi)) {
vringh_notify_disable_kern(cfv->vr_rx);
__napi_schedule(napi);
-   goto again;
}
break;

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



[Xen-devel] [PATCH 4/4] net: Rearrange loop in net_rx_action

2014-12-20 Thread Herbert Xu
This patch rearranges the loop in net_rx_action to reduce the
amount of jumping back and forth when reading the code.

Signed-off-by: Herbert Xu 
---

 net/core/dev.c |   26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 22b0fb2..dd565a5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4631,9 +4631,15 @@ static void net_rx_action(struct softirq_action *h)
list_splice_init(&sd->poll_list, &list);
local_irq_enable();
 
-   while (!list_empty(&list)) {
+   for (;;) {
struct napi_struct *n;
 
+   if (list_empty(&list)) {
+   if (!sd_has_rps_ipi_waiting(sd) && list_empty(&repoll))
+   return;
+   break;
+   }
+
n = list_first_entry(&list, struct napi_struct, poll_list);
budget -= napi_poll(n, &repoll);
 
@@ -4641,15 +4647,13 @@ static void net_rx_action(struct softirq_action *h)
 * Allow this to run for 2 jiffies since which will allow
 * an average latency of 1.5/HZ.
 */
-   if (unlikely(budget <= 0 || time_after_eq(jiffies, time_limit)))
-   goto softnet_break;
+   if (unlikely(budget <= 0 ||
+time_after_eq(jiffies, time_limit))) {
+   sd->time_squeeze++;
+   break;
+   }
}
 
-   if (!sd_has_rps_ipi_waiting(sd) &&
-   list_empty(&list) &&
-   list_empty(&repoll))
-   return;
-out:
local_irq_disable();
 
list_splice_tail_init(&sd->poll_list, &list);
@@ -4659,12 +4663,6 @@ out:
__raise_softirq_irqoff(NET_RX_SOFTIRQ);
 
net_rps_action_and_irq_enable(sd);
-
-   return;
-
-softnet_break:
-   sd->time_squeeze++;
-   goto out;
 }
 
 struct netdev_adjacent {



[Xen-devel] [PATCH 2/4] net: Detect drivers that reschedule NAPI and exhaust budget

2014-12-20 Thread Herbert Xu
The commit d75b1ade567ffab085e8adbbdacf0092d10cd09c (net: less
interrupt masking in NAPI) required drivers to leave poll_list
empty if the entire budget is consumed.

We have already had two broken drivers, so let's add a check for
this.

Signed-off-by: Herbert Xu 
---

 net/core/dev.c |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index f7c4f4e..4a9c424 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4602,6 +4602,15 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll)
napi_gro_flush(n, HZ >= 1000);
}
 
+   /* Some drivers may have called napi_schedule
+* prior to exhausting their budget.
+*/
+   if (unlikely(!list_empty(&n->poll_list))) {
+   pr_warn_once("%s: Budget exhausted after napi rescheduled\n",
+n->dev ? n->dev->name : "backlog");
+   goto out_unlock;
+   }
+
list_add_tail(&n->poll_list, repoll);
 
 out_unlock:



[Xen-devel] [PATCH 3/4] net: Always poll at least one device in net_rx_action

2014-12-20 Thread Herbert Xu
We should only perform the softnet_break check after we have polled
at least one device in net_rx_action.  Otherwise a zero or negative
setting of netdev_budget can lock up the whole system: the softirq
would exit without polling anything, see a still non-empty poll_list,
re-raise itself and spin forever without making any progress.

Signed-off-by: Herbert Xu 
---

 net/core/dev.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 4a9c424..22b0fb2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4634,16 +4634,15 @@ static void net_rx_action(struct softirq_action *h)
while (!list_empty(&list)) {
struct napi_struct *n;
 
+   n = list_first_entry(&list, struct napi_struct, poll_list);
+   budget -= napi_poll(n, &repoll);
+
/* If softirq window is exhausted then punt.
 * Allow this to run for 2 jiffies since which will allow
 * an average latency of 1.5/HZ.
 */
if (unlikely(budget <= 0 || time_after_eq(jiffies, time_limit)))
goto softnet_break;
-
-
-   n = list_first_entry(&list, struct napi_struct, poll_list);
-   budget -= napi_poll(n, &repoll);
}
 
if (!sd_has_rps_ipi_waiting(sd) &&



[Xen-devel] [PATCH 1/4] net: Move napi polling code out of net_rx_action

2014-12-20 Thread Herbert Xu
This patch creates a new function napi_poll and moves the napi
polling code from net_rx_action into it.

Signed-off-by: Herbert Xu 
---

 net/core/dev.c |   98 +++--
 1 file changed, 54 insertions(+), 44 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index f411c28..f7c4f4e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4557,6 +4557,59 @@ void netif_napi_del(struct napi_struct *napi)
 }
 EXPORT_SYMBOL(netif_napi_del);
 
+static int napi_poll(struct napi_struct *n, struct list_head *repoll)
+{
+   void *have;
+   int work, weight;
+
+   list_del_init(&n->poll_list);
+
+   have = netpoll_poll_lock(n);
+
+   weight = n->weight;
+
+   /* This NAPI_STATE_SCHED test is for avoiding a race
+* with netpoll's poll_napi().  Only the entity which
+* obtains the lock and sees NAPI_STATE_SCHED set will
+* actually make the ->poll() call.  Therefore we avoid
+* accidentally calling ->poll() when NAPI is not scheduled.
+*/
+   work = 0;
+   if (test_bit(NAPI_STATE_SCHED, &n->state)) {
+   work = n->poll(n, weight);
+   trace_napi_poll(n);
+   }
+
+   WARN_ON_ONCE(work > weight);
+
+   if (likely(work < weight))
+   goto out_unlock;
+
+   /* Drivers must not modify the NAPI state if they
+* consume the entire weight.  In such cases this code
+* still "owns" the NAPI instance and therefore can
+* move the instance around on the list at-will.
+*/
+   if (unlikely(napi_disable_pending(n))) {
+   napi_complete(n);
+   goto out_unlock;
+   }
+
+   if (n->gro_list) {
+   /* flush too old packets
+* If HZ < 1000, flush all packets.
+*/
+   napi_gro_flush(n, HZ >= 1000);
+   }
+
+   list_add_tail(&n->poll_list, repoll);
+
+out_unlock:
+   netpoll_poll_unlock(have);
+
+   return work;
+}
+
 static void net_rx_action(struct softirq_action *h)
 {
struct softnet_data *sd = this_cpu_ptr(&softnet_data);
@@ -4564,7 +4617,6 @@ static void net_rx_action(struct softirq_action *h)
int budget = netdev_budget;
LIST_HEAD(list);
LIST_HEAD(repoll);
-   void *have;
 
local_irq_disable();
list_splice_init(&sd->poll_list, &list);
@@ -4572,7 +4624,6 @@ static void net_rx_action(struct softirq_action *h)
 
while (!list_empty(&list)) {
struct napi_struct *n;
-   int work, weight;
 
/* If softirq window is exhausted then punt.
 * Allow this to run for 2 jiffies since which will allow
@@ -4583,48 +4634,7 @@ static void net_rx_action(struct softirq_action *h)
 
 
n = list_first_entry(&list, struct napi_struct, poll_list);
-   list_del_init(&n->poll_list);
-
-   have = netpoll_poll_lock(n);
-
-   weight = n->weight;
-
-   /* This NAPI_STATE_SCHED test is for avoiding a race
-* with netpoll's poll_napi().  Only the entity which
-* obtains the lock and sees NAPI_STATE_SCHED set will
-* actually make the ->poll() call.  Therefore we avoid
-* accidentally calling ->poll() when NAPI is not scheduled.
-*/
-   work = 0;
-   if (test_bit(NAPI_STATE_SCHED, &n->state)) {
-   work = n->poll(n, weight);
-   trace_napi_poll(n);
-   }
-
-   WARN_ON_ONCE(work > weight);
-
-   budget -= work;
-
-   /* Drivers must not modify the NAPI state if they
-* consume the entire weight.  In such cases this code
-* still "owns" the NAPI instance and therefore can
-* move the instance around on the list at-will.
-*/
-   if (unlikely(work == weight)) {
-   if (unlikely(napi_disable_pending(n))) {
-   napi_complete(n);
-   } else {
-   if (n->gro_list) {
-   /* flush too old packets
-* If HZ < 1000, flush all packets.
-*/
-   napi_gro_flush(n, HZ >= 1000);
-   }
-   list_add_tail(&n->poll_list, &repoll);
-   }
-   }
-
-   netpoll_poll_unlock(have);
+   budget -= napi_poll(n, &repoll);
}
 
if (!sd_has_rps_ipi_waiting(sd) &&



[Xen-devel] [0/4] net: net_rx_action fixes and clean-ups

2014-12-20 Thread Herbert Xu
On Sat, Dec 20, 2014 at 10:00:12AM -0800, Eric Dumazet wrote:
> 
> OK, but could you:
> 1) use pr_warn_once()
> 2) split the too long line
>   pr_warn_once("%s: Budget exhausted after napi rescheduled\n",
>n->dev ? n->dev->name : "backlog");

Sure, I'll clean it up a bit too.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



Re: [Xen-devel] net: Detect drivers that reschedule NAPI and exhaust budget

2014-12-19 Thread Herbert Xu
On Fri, Dec 19, 2014 at 09:40:00PM -0500, David Miller wrote:
> From: Eric Dumazet 
> Date: Fri, 19 Dec 2014 17:34:48 -0800
> 
> >> @@ -4620,7 +4620,11 @@ static void net_rx_action(struct softirq_action *h)
> >> */
> >>napi_gro_flush(n, HZ >= 1000);
> >>}
> >> -  list_add_tail(&n->poll_list, &repoll);
> >> +  /* Some drivers may have called napi_schedule
> >> +   * prior to exhausting their budget.
> >> +   */
> >> +  if (!WARN_ON_ONCE(!list_empty(&n->poll_list)))
> >> +  list_add_tail(&n->poll_list, &repoll);
> >>}
> >>}
> >>  
> > 
> > I do not think the stack trace will point to the buggy driver?
> > 
> > IMO it would be better to print a single line with the netdev name?
> 
> Right, we are already back from the poll routine and will just end
> up seeing the call trace leading to the software interrupt.

Good point Eric.

-- >8 --
The commit d75b1ade567ffab085e8adbbdacf0092d10cd09c (net: less
interrupt masking in NAPI) required drivers to leave poll_list
empty if the entire budget is consumed.

We have already had two broken drivers, so let's add a check for
this.

Signed-off-by: Herbert Xu 

diff --git a/net/core/dev.c b/net/core/dev.c
index f411c28..47fdc5c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4620,7 +4620,13 @@ static void net_rx_action(struct softirq_action *h)
 */
napi_gro_flush(n, HZ >= 1000);
}
-   list_add_tail(&n->poll_list, &repoll);
+   /* Some drivers may have called napi_schedule
+* prior to exhausting their budget.
+*/
+   if (unlikely(!list_empty(&n->poll_list)))
+   pr_warn("%s: Budget exhausted after 
napi rescheduled\n", n->dev ? n->dev->name : "backlog");
+   else
+   list_add_tail(&n->poll_list, &repoll);
}
}
 
Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



[Xen-devel] net: Detect drivers that reschedule NAPI and exhaust budget

2014-12-19 Thread Herbert Xu
On Sat, Dec 20, 2014 at 11:23:27AM +1100, Herbert Xu wrote:
>
> A similar bug exists in virtio_net.

In order to detect other drivers doing this, we should add something
like this.

-- >8 --
The commit d75b1ade567ffab085e8adbbdacf0092d10cd09c (net: less
interrupt masking in NAPI) required drivers to leave poll_list
empty if the entire budget is consumed.

We have already had two broken drivers, so let's add a check for
this.
 
Signed-off-by: Herbert Xu 

diff --git a/net/core/dev.c b/net/core/dev.c
index f411c28..88f9725 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4620,7 +4620,11 @@ static void net_rx_action(struct softirq_action *h)
 */
napi_gro_flush(n, HZ >= 1000);
}
-   list_add_tail(&n->poll_list, &repoll);
+   /* Some drivers may have called napi_schedule
+* prior to exhausting their budget.
+*/
+   if (!WARN_ON_ONCE(!list_empty(&n->poll_list)))
+   list_add_tail(&n->poll_list, &repoll);
    }
    }
 
Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



[Xen-devel] virtio_net: Fix napi poll list corruption

2014-12-19 Thread Herbert Xu
David Vrabel  wrote:
> After d75b1ade567ffab085e8adbbdacf0092d10cd09c (net: less interrupt
> masking in NAPI) the napi instance is removed from the per-cpu list
> prior to calling the n->poll(), and is only requeued if all of the
> budget was used.  This inadvertently broke netfront because netfront
> does not use NAPI correctly.

A similar bug exists in virtio_net.

-- >8 --
The commit d75b1ade567ffab085e8adbbdacf0092d10cd09c (net: less
interrupt masking in NAPI) breaks virtio_net in an insidious way.

It is now required that if the entire budget is consumed when poll
returns, the napi poll_list must remain empty.  However, like some
other drivers, virtio_net tries to do a last-ditch check, and if
there is more work it will call napi_schedule and then immediately
process some of this new work.  Should the entire budget be consumed
while processing such new work, we will violate the new caller
contract.

This patch fixes this by not touching any work when we reschedule
in virtio_net.

The worst part of this bug is that the list corruption causes other
napi users to be moved off-list.  In my case I was chasing a stall
in IPsec (IPsec uses netif_rx), and I only belatedly realised that
it was virtio_net that caused the stall, even though the virtio_net
poll was still functioning perfectly after IPsec stalled.

Signed-off-by: Herbert Xu 

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b8bd719..5ca9771 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -760,7 +760,6 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
container_of(napi, struct receive_queue, napi);
unsigned int r, received = 0;
 
-again:
received += virtnet_receive(rq, budget - received);
 
/* Out of packets? */
@@ -771,7 +770,6 @@ again:
napi_schedule_prep(napi)) {
virtqueue_disable_cb(rq->vq);
__napi_schedule(napi);
-   goto again;
}
    }

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
