from:"Paul Durrant"

RE: [PATCH 1/1] xen-netback: process malformed sk_buff correctly to avoid BUG_ON()

2018-03-28 Thread Paul Durrant

> -----Original Message-----
> From: Dongli Zhang [mailto:dongli.zh...@oracle.com]
> Sent: 28 March 2018 00:42
> To: xen-de...@lists.xenproject.org; linux-ker...@vger.kernel.org
> Cc: netdev@vger.kernel.org; Wei Liu <wei.l...@citrix.com>; Paul Durrant
> <paul.durr...@citrix.com>
> Subject: [PATCH 1/1] xen-netback: process malformed sk_buff correctly to
> avoid BUG_ON()
> 
> The "BUG_ON(!frag_iter)" in function xenvif_rx_next_chunk() is triggered if
> the received sk_buff is malformed, that is, when the sk_buff has pattern
> (skb->data_len && !skb_shinfo(skb)->nr_frags). Below is a sample call
> stack:
> 
> [  438.652658] ------------[ cut here ]------------
> [  438.652660] kernel BUG at drivers/net/xen-netback/rx.c:325!
> [  438.652714] invalid opcode:  [#1] SMP NOPTI
> [  438.652813] CPU: 0 PID: 2492 Comm: vif1.0-q0-guest Tainted: G   O
> 4.16.0-rc6+ #1
> [  438.652896] RIP: e030:xenvif_rx_skb+0x3c2/0x5e0 [xen_netback]
> [  438.652926] RSP: e02b:c90040877dc8 EFLAGS: 00010246
> [  438.652956] RAX: 0160 RBX: 0022 RCX:
> 0001
> [  438.652993] RDX: c900402890d0 RSI:  RDI:
> c90040889000
> [  438.653029] RBP: 88002b460040 R08: c90040877de0 R09:
> 0100
> [  438.653065] R10: 7ff0 R11: 0002 R12:
> c90040889000
> [  438.653100] R13: 8000 R14: 0022 R15:
> 8000
> [  438.653149] FS:  7f15603778c0() GS:88003040()
> knlGS:
> [  438.653188] CS:  e033 DS:  ES:  CR0: 80050033
> [  438.653219] CR2: 01832a08 CR3: 29c12000 CR4:
> 00042660
> [  438.653262] Call Trace:
> [  438.653284]  ? xen_hypercall_event_channel_op+0xa/0x20
> [  438.653313]  xenvif_rx_action+0x41/0x80 [xen_netback]
> [  438.653341]  xenvif_kthread_guest_rx+0xb2/0x2a8 [xen_netback]
> [  438.653374]  ? __schedule+0x352/0x700
> [  438.653398]  ? wait_woken+0x80/0x80
> [  438.653421]  kthread+0xf3/0x130
> [  438.653442]  ? xenvif_rx_action+0x80/0x80 [xen_netback]
> [  438.653470]  ? kthread_destroy_worker+0x40/0x40
> [  438.653497]  ret_from_fork+0x35/0x40
> 
> The issue is hit by xen-netback when there is bug with other networking
> interface (e.g., dom0 physical NIC), who has generated and forwarded
> malformed sk_buff to dom0 vifX.Y. It is possible to reproduce the issue on
> purpose with below sample code in a kernel module:
> 
> skb->dev = dev; // dev of vifX.Y
> skb->len = 386;
> skb->data_len = 352;
> skb->tail = 98;
> skb->end = 384;
> dev->netdev_ops->ndo_start_xmit(skb, dev);
> 
> This patch stops processing sk_buff immediately if it is detected as
> malformed, that is, pkt->frag_iter is NULL but there is still remaining
> pkt->remaining_len.
> 
> Signed-off-by: Dongli Zhang <dongli.zh...@oracle.com>
> ---
>  drivers/net/xen-netback/rx.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
> index b1cf7c6..289cc82 100644
> --- a/drivers/net/xen-netback/rx.c
> +++ b/drivers/net/xen-netback/rx.c
> @@ -369,6 +369,14 @@ static void xenvif_rx_data_slot(struct xenvif_queue
> *queue,
>   offset += len;
>   pkt->remaining_len -= len;
> 
> + if (unlikely(!pkt->frag_iter && pkt->remaining_len)) {
> + pkt->remaining_len = 0;
> + pkt->extra_count = 0;
> + pr_err_ratelimited("malformed sk_buff at %s\n",
> +queue->name);
> + break;
> + }
> +

This looks fine, but I think it would also be good to indicate the error to the 
frontend by setting rsp->status below. That should cause the frontend to bin 
the packet.

  Paul

>   } while (offset < XEN_PAGE_SIZE && pkt->remaining_len > 0);
> 
>   if (pkt->remaining_len > 0)
> --
> 2.7.4
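
To make the suggestion above concrete, here is a minimal userspace sketch, not the driver code. It assumes a simplified packet model (`pkt_model`, `fill_slot` and the `RSP_*` constants are hypothetical stand-ins; the real status codes are XEN_NETIF_RSP_OKAY and XEN_NETIF_RSP_ERROR). The point it illustrates is that the slot-filling loop both stops on a malformed packet and returns an error status that could be written to rsp->status.

```c
#include <stddef.h>

/* Local stand-ins for XEN_NETIF_RSP_OKAY / XEN_NETIF_RSP_ERROR. */
#define RSP_OKAY   0
#define RSP_ERROR (-1)

/* Hypothetical, simplified model of the per-packet copy state. */
struct pkt_model {
	size_t avail;         /* bytes actually present in the buffer */
	size_t remaining_len; /* bytes the length fields claim are left */
};

/*
 * Fill one slot of slot_size bytes. Returns the status that would be
 * written to the response: RSP_ERROR if the packet claims more data
 * than it carries, RSP_OKAY otherwise.
 */
static int fill_slot(struct pkt_model *pkt, size_t slot_size)
{
	size_t offset = 0;

	while (offset < slot_size && pkt->remaining_len > 0) {
		size_t len;

		if (pkt->avail == 0) {
			/* Malformed: nothing left to copy, but length remains. */
			pkt->remaining_len = 0;
			return RSP_ERROR;
		}

		len = slot_size - offset;
		if (len > pkt->avail)
			len = pkt->avail;
		if (len > pkt->remaining_len)
			len = pkt->remaining_len;

		offset += len;
		pkt->avail -= len;
		pkt->remaining_len -= len;
	}

	return RSP_OKAY;
}
```

With the values from the reproducer (len 386, data_len 352, so 34 bytes of linear data and no frags), the first pass copies 34 bytes and the second pass hits the malformed case.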

RE: [PATCH net-next v2] xen-netback: make copy batch size configurable

2017-12-21 Thread Paul Durrant

> -----Original Message-----
> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> Sent: 21 December 2017 17:24
> To: netdev@vger.kernel.org
> Cc: Joao Martins <joao.m.mart...@oracle.com>; Wei Liu
> <wei.l...@citrix.com>; Paul Durrant <paul.durr...@citrix.com>; xen-
> de...@lists.xenproject.org
> Subject: [PATCH net-next v2] xen-netback: make copy batch size
> configurable
> 
> Commit eb1723a29b9a ("xen-netback: refactor guest rx") refactored Rx
> handling and as a result decreased max grant copy ops from 4352 to 64.
> Before this commit it would drain the rx_queue (while there are
> enough slots in the ring to put packets) then copy to all pages and write
> responses on the ring. With the refactor we do almost the same albeit
> the last two steps are done every COPY_BATCH_SIZE (64) copies.
> 
> For big packets, the value of 64 means copying 3 packets best case scenario
> (17 copies) and worst-case only 1 packet (34 copies, i.e. if all frags
> plus head cross the 4k grant boundary) which could be the case when
> packets go from local backend process.
> 
> Instead of making it static to 64 grant copies, lets allow the user to
> select its value (while keeping the current as default) by introducing
> the `copy_batch_size` module parameter. This allows users to select
> the higher batches (i.e. for better throughput with big packets) as it
> was prior to the above mentioned commit.
> 
> Signed-off-by: Joao Martins <joao.m.mart...@oracle.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>
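
The packets-per-batch figures in the commit message are plain integer division. A small sketch of that arithmetic follows; `packets_per_batch` is a hypothetical helper, not something in the patch.

```c
/* Whole packets that fit in one batch of grant-copy operations. */
static unsigned int packets_per_batch(unsigned int batch_size,
				      unsigned int copies_per_packet)
{
	if (copies_per_packet == 0)
		return 0;
	return batch_size / copies_per_packet;
}
```

For the default batch of 64 this gives 64 / 17 = 3 packets in the best case and 64 / 34 = 1 packet in the worst case, matching the description. With the pre-refactor limit of 4352 copies the worst case would be 4352 / 34 = 128 packets.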

> ---
> Changes since v1:
>  * move rx_copy.{idx,op} reallocation to separate helper
>  Addressed Paul's comments:
>  * rename xenvif_copy_state#size field to batch_size
>  * argument `size` should be unsigned int
>  * vfree is safe with NULL
>  * realloc rx_copy.{idx,op} after copy op flush
> ---
>  drivers/net/xen-netback/common.h|  7 +--
>  drivers/net/xen-netback/interface.c | 16 +++-
>  drivers/net/xen-netback/netback.c   |  5 +
>  drivers/net/xen-netback/rx.c| 35
> ++-
>  4 files changed, 59 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-
> netback/common.h
> index a46a1e94505d..8e4eaf3a507d 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -129,8 +129,9 @@ struct xenvif_stats {
>  #define COPY_BATCH_SIZE 64
> 
>  struct xenvif_copy_state {
> - struct gnttab_copy op[COPY_BATCH_SIZE];
> - RING_IDX idx[COPY_BATCH_SIZE];
> + struct gnttab_copy *op;
> + RING_IDX *idx;
> + unsigned int batch_size;
>   unsigned int num;
>   struct sk_buff_head *completed;
>  };
> @@ -358,6 +359,7 @@ irqreturn_t xenvif_ctrl_irq_fn(int irq, void *data);
> 
>  void xenvif_rx_action(struct xenvif_queue *queue);
>  void xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff
> *skb);
> +int xenvif_rx_copy_realloc(struct xenvif_queue *queue, unsigned int size);
> 
>  void xenvif_carrier_on(struct xenvif *vif);
> 
> @@ -381,6 +383,7 @@ extern unsigned int rx_drain_timeout_msecs;
>  extern unsigned int rx_stall_timeout_msecs;
>  extern unsigned int xenvif_max_queues;
>  extern unsigned int xenvif_hash_cache_size;
> +extern unsigned int xenvif_copy_batch_size;
> 
>  #ifdef CONFIG_DEBUG_FS
>  extern struct dentry *xen_netback_dbg_root;
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> netback/interface.c
> index 78ebe494fef0..e12eb64ab0a9 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -518,6 +518,12 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>  {
>   int err, i;
> 
> + err = xenvif_rx_copy_realloc(queue, xenvif_copy_batch_size);
> + if (err) {
> + netdev_err(queue->vif->dev, "Could not alloc rx_copy\n");
> + goto err;
> + }
> +
>   queue->credit_bytes = queue->remaining_credit = ~0UL;
>   queue->credit_usec  = 0UL;
>   timer_setup(&queue->credit_timeout, xenvif_tx_credit_callback, 0);
> @@ -544,7 +550,7 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>queue->mmap_pages);
>   if (err) {
>   netdev_err(queue->vif->dev, "Could not reserve
> mmap_pages\n");
> - return -ENOMEM;
> + goto err;
>   }
> 
>   for (i = 0; i < MAX_PENDING_REQS; i++) {
> @@ -556,6 +562,11 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>   }
> 
>   return 0;
> +
> +err:
> + vfree(queue->rx_copy.op);
> + vfree(queue

RE: [PATCH] xen-netback: Fix logging message with spurious period after newline

2017-12-06 Thread Paul Durrant

> -----Original Message-----
> From: Joe Perches [mailto:j...@perches.com]
> Sent: 06 December 2017 06:40
> To: Wei Liu <wei.l...@citrix.com>; Paul Durrant <paul.durr...@citrix.com>
> Cc: xen-de...@lists.xenproject.org; netdev@vger.kernel.org; linux-
> ker...@vger.kernel.org
> Subject: [PATCH] xen-netback: Fix logging message with spurious period
> after newline
> 
> Using a period after a newline causes bad output.
> 
> Signed-off-by: Joe Perches <j...@perches.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  drivers/net/xen-netback/interface.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> netback/interface.c
> index d6dff347f896..78ebe494fef0 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -186,7 +186,7 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct
> net_device *dev)
>   /* Obtain the queue to be used to transmit this packet */
>   index = skb_get_queue_mapping(skb);
>   if (index >= num_queues) {
> - pr_warn_ratelimited("Invalid queue %hu for packet on
> interface %s\n.",
> + pr_warn_ratelimited("Invalid queue %hu for packet on
> interface %s\n",
>   index, vif->dev->name);
>   index %= num_queues;
>   }
> --
> 2.15.0

RE: [PATCH] xen-netfront: remove warning when unloading module

2017-11-20 Thread Paul Durrant

> -----Original Message-----
> From: Eduardo Otubo [mailto:ot...@redhat.com]
> Sent: 20 November 2017 10:41
> To: xen-de...@lists.xenproject.org
> Cc: netdev@vger.kernel.org; Paul Durrant <paul.durr...@citrix.com>; Wei
> Liu <wei.l...@citrix.com>; linux-ker...@vger.kernel.org;
> vkuzn...@redhat.com; cav...@redhat.com; che...@redhat.com;
> mga...@redhat.com; Eduardo Otubo <ot...@redhat.com>
> Subject: [PATCH] xen-netfront: remove warning when unloading module
> 
> When unloading module xen_netfront from guest, dmesg would output
> warning messages like below:
> 
>   [  105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use!
>   [  105.236839] deferring g.e. 0x903 (pfn 0x35805)
> 
> This problem relies on netfront and netback being out of sync. By the time
> netfront revokes the g.e.'s netback didn't have enough time to free all of
> them, hence displaying the warnings on dmesg.
> 
> The trick here is to make netfront to wait until netback frees all the g.e.'s
> and only then continue to cleanup for the module removal, and this is done
> by
> manipulating both device states.
> 
> Signed-off-by: Eduardo Otubo <ot...@redhat.com>
> ---
>  drivers/net/xen-netfront.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 8b8689c6d887..b948e2a1ce40 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -2130,6 +2130,17 @@ static int xennet_remove(struct xenbus_device
> *dev)
> 
>   dev_dbg(&dev->dev, "%s\n", dev->nodename);
> 
> + xenbus_switch_state(dev, XenbusStateClosing);
> + while (xenbus_read_driver_state(dev->otherend) !=
> XenbusStateClosing){
> + cpu_relax();
> + schedule();
> + }
> + xenbus_switch_state(dev, XenbusStateClosed);
> + while (dev->xenbus_state != XenbusStateClosed){
> + cpu_relax();
> + schedule();
> + }
> +

Waiting for closing should be ok but waiting for closed is risky. As soon as a 
backend is in the closed state then a toolstack can completely remove the 
backend xenstore area, resulting in a state of XenbusStateUnknown, which would 
cause your second loop to spin forever.

  Paul

>   xennet_disconnect_backend(info);
> 
>   unregister_netdev(info->netdev);
> --
> 2.13.6
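
One way to avoid the unbounded spin described above is to treat a vanished state as a terminal condition and to cap the number of polls. The following is a userspace sketch under those assumptions, not a proposed patch: `wait_for_state`, `read_state_fn` and the scripted reader are hypothetical, and the enum values are copied from enum xenbus_state.

```c
#include <stddef.h>

/* Values mirror enum xenbus_state from the Xen public headers. */
enum state_model {
	STATE_UNKNOWN = 0,
	STATE_CONNECTED = 4,
	STATE_CLOSING = 5,
	STATE_CLOSED = 6,
};

/* Hypothetical callback standing in for xenbus_read_driver_state(). */
typedef enum state_model (*read_state_fn)(void *ctx);

/*
 * Poll until the other end reaches `target`.
 * Returns 0 on success, -1 if the other end's state has vanished
 * (STATE_UNKNOWN), and -2 if max_polls is exhausted.
 */
static int wait_for_state(read_state_fn read_state, void *ctx,
			  enum state_model target, unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		enum state_model s = read_state(ctx);

		if (s == target)
			return 0;
		if (s == STATE_UNKNOWN)
			return -1;
	}
	return -2;
}

/* Scripted state source used only to exercise the loop. */
struct script {
	const enum state_model *states;
	size_t len;
	size_t pos;
};

static enum state_model script_read(void *ctx)
{
	struct script *s = ctx;

	if (s->pos < s->len)
		return s->states[s->pos++];
	return s->states[s->len - 1];
}
```

A real driver would sleep or wait on a watch rather than poll in a tight loop; the sketch only shows the exit conditions.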

RE: [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Paul Durrant

> -----Original Message-----
> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> Sent: 13 November 2017 16:34
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: netdev@vger.kernel.org; Wei Liu <wei.l...@citrix.com>; xen-
> de...@lists.xenproject.org
> Subject: Re: [PATCH net-next v1] xen-netback: make copy batch size
> configurable
> 
> On Mon, Nov 13, 2017 at 11:58:03AM +, Paul Durrant wrote:
> > On Mon, Nov 13, 2017 at 11:54:00AM +, Joao Martins wrote:
> > > On 11/13/2017 10:33 AM, Paul Durrant wrote:
> > > > On 11/10/2017 19:35 PM, Joao Martins wrote:
> 
> [snip]
> 
> > > >> diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-
> netback/rx.c
> > > >> index b1cf7c6f407a..793a85f61f9d 100644
> > > >> --- a/drivers/net/xen-netback/rx.c
> > > >> +++ b/drivers/net/xen-netback/rx.c
> > > >> @@ -168,11 +168,14 @@ static void xenvif_rx_copy_add(struct
> > > >> xenvif_queue *queue,
> > > >>   struct xen_netif_rx_request *req,
> > > >>   unsigned int offset, void *data, size_t 
> > > >> len)
> > > >>  {
> > > >> +  unsigned int batch_size;
> > > >>struct gnttab_copy *op;
> > > >>struct page *page;
> > > >>struct xen_page_foreign *foreign;
> > > >>
> > > >> -  if (queue->rx_copy.num == COPY_BATCH_SIZE)
> > > >> +  batch_size = min(xenvif_copy_batch_size, queue-
> >rx_copy.size);
> > > >
> > > > Surely queue->rx_copy.size and xenvif_copy_batch_size are always
> > > > identical? Why do you need this statement (and hence stack variable)?
> > > >
> > > This statement was to allow to be changed dynamically and would
> > > affect all newly created guests or running guests if value happened
> > > to be smaller than initially allocated. But I suppose I should make
> > > behaviour more consistent with the other params we have right now
> > > and just look at initially allocated one `queue->rx_copy.batch_size` ?
> >
> > Yes, that would certainly be consistent but I can see value in
> > allowing it to be dynamically tuned, so perhaps adding some re-allocation
> > code to allow the batch to be grown as well as shrunk might be nice.
> 
> The shrink one we potentially risk losing data, so we need to gate the
> reallocation whenever `rx_copy.num` is less than the new requested
> batch. Worst case means guestrx_thread simply uses the initial
> allocated value.

Can't you just re-alloc immediately after the flush (when num is guaranteed to 
be zero)?

  Paul
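
The re-alloc-after-flush idea can be sketched in userspace as below. This is an illustration only: `copy_state_model` and `copy_state_realloc` are hypothetical, malloc/free stand in for the kernel's vmalloc/vfree, and plain integer arrays stand in for the gnttab_copy and RING_IDX arrays.

```c
#include <stdlib.h>

/* Hypothetical userspace model of struct xenvif_copy_state. */
struct copy_state_model {
	int *op;                 /* stands in for struct gnttab_copy[] */
	unsigned int *idx;       /* stands in for RING_IDX[] */
	unsigned int batch_size; /* allocated capacity */
	unsigned int num;        /* ops queued since the last flush */
};

/*
 * Resize both arrays. Only legal right after a flush, when num == 0,
 * so no queued operation can be lost. Returns 0 on success, -1 if the
 * precondition fails or an allocation fails (old arrays are kept).
 */
static int copy_state_realloc(struct copy_state_model *st, unsigned int size)
{
	int *op;
	unsigned int *idx;

	if (st->num != 0 || size == 0)
		return -1;

	op = malloc(size * sizeof(*op));
	idx = malloc(size * sizeof(*idx));
	if (!op || !idx) {
		/* free(NULL) is a no-op, like vfree(NULL) in the kernel. */
		free(op);
		free(idx);
		return -1;
	}

	free(st->op);
	free(st->idx);
	st->op = op;
	st->idx = idx;
	st->batch_size = size;
	return 0;
}
```

Because the resize is refused while operations are queued, growing and shrinking are handled the same way and neither can drop pending copies.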

RE: [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Paul Durrant

> -----Original Message-----
> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> Sent: 13 November 2017 11:54
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: netdev@vger.kernel.org; Wei Liu <wei.l...@citrix.com>; xen-
> de...@lists.xenproject.org
> Subject: Re: [PATCH net-next v1] xen-netback: make copy batch size
> configurable
> 
> On 11/13/2017 10:33 AM, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> >> Sent: 10 November 2017 19:35
> >> To: netdev@vger.kernel.org
> >> Cc: Joao Martins <joao.m.mart...@oracle.com>; Wei Liu
> >> <wei.l...@citrix.com>; Paul Durrant <paul.durr...@citrix.com>; xen-
> >> de...@lists.xenproject.org
> >> Subject: [PATCH net-next v1] xen-netback: make copy batch size
> >> configurable
> >>
> >> Commit eb1723a29b9a ("xen-netback: refactor guest rx") refactored Rx
> >> handling and as a result decreased max grant copy ops from 4352 to 64.
> >> Before this commit it would drain the rx_queue (while there are
> >> enough slots in the ring to put packets) then copy to all pages and write
> >> responses on the ring. With the refactor we do almost the same albeit
> >> the last two steps are done every COPY_BATCH_SIZE (64) copies.
> >>
> >> For big packets, the value of 64 means copying 3 packets best case
> scenario
> >> (17 copies) and worst-case only 1 packet (34 copies, i.e. if all frags
> >> plus head cross the 4k grant boundary) which could be the case when
> >> packets go from local backend process.
> >>
> >> Instead of making it static to 64 grant copies, lets allow the user to
> >> select its value (while keeping the current as default) by introducing
> >> the `copy_batch_size` module parameter. This allows users to select
> >> the higher batches (i.e. for better throughput with big packets) as it
> >> was prior to the above mentioned commit.
> >>
> >> Signed-off-by: Joao Martins <joao.m.mart...@oracle.com>
> >> ---
> >>  drivers/net/xen-netback/common.h|  6 --
> >>  drivers/net/xen-netback/interface.c | 25
> -
> >>  drivers/net/xen-netback/netback.c   |  5 +
> >>  drivers/net/xen-netback/rx.c|  5 -
> >>  4 files changed, 37 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-
> >> netback/common.h
> >> index a46a1e94505d..a5fe36e098a7 100644
> >> --- a/drivers/net/xen-netback/common.h
> >> +++ b/drivers/net/xen-netback/common.h
> >> @@ -129,8 +129,9 @@ struct xenvif_stats {
> >>  #define COPY_BATCH_SIZE 64
> >>
> >>  struct xenvif_copy_state {
> >> -  struct gnttab_copy op[COPY_BATCH_SIZE];
> >> -  RING_IDX idx[COPY_BATCH_SIZE];
> >> +  struct gnttab_copy *op;
> >> +  RING_IDX *idx;
> >> +  unsigned int size;
> >
> > Could you name this batch_size, or something like that to make it clear
> what it means?
> >
> Yeap, will change it.
> 
> >>unsigned int num;
> >>struct sk_buff_head *completed;
> >>  };
> >> @@ -381,6 +382,7 @@ extern unsigned int rx_drain_timeout_msecs;
> >>  extern unsigned int rx_stall_timeout_msecs;
> >>  extern unsigned int xenvif_max_queues;
> >>  extern unsigned int xenvif_hash_cache_size;
> >> +extern unsigned int xenvif_copy_batch_size;
> >>
> >>  #ifdef CONFIG_DEBUG_FS
> >>  extern struct dentry *xen_netback_dbg_root;
> >> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> >> netback/interface.c
> >> index d6dff347f896..a558868a883f 100644
> >> --- a/drivers/net/xen-netback/interface.c
> >> +++ b/drivers/net/xen-netback/interface.c
> >> @@ -516,7 +516,20 @@ struct xenvif *xenvif_alloc(struct device *parent,
> >> domid_t domid,
> >>
> >>  int xenvif_init_queue(struct xenvif_queue *queue)
> >>  {
> >> +  int size = xenvif_copy_batch_size;
> >
> > unsigned int
> >>    int err, i;
> >> +  void *addr;
> >> +
> >> +  addr = vzalloc(size * sizeof(struct gnttab_copy));
> >
> > Does the memory need to be zeroed?
> >
> It doesn't need to be but given that xenvif_queue is zeroed (which included
> this
> region) thus thought I would leave the same way.

Ok.

> 
> >> +  if (!addr)
> >> +  goto err;
> >> +

RE: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Paul Durrant

> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 13 November 2017 10:50
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: Wei Liu <wei.l...@citrix.com>; xen-de...@lists.xenproject.org; 'Joao
> Martins' <joao.m.mart...@oracle.com>; netdev@vger.kernel.org
> Subject: Re: [Xen-devel] [PATCH net-next v1] xen-netback: make copy batch
> size configurable
> 
> >>> On 13.11.17 at 11:33, <paul.durr...@citrix.com> wrote:
> >> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> >> Sent: 10 November 2017 19:35
> >> --- a/drivers/net/xen-netback/netback.c
> >> +++ b/drivers/net/xen-netback/netback.c
> >> @@ -96,6 +96,11 @@ unsigned int xenvif_hash_cache_size =
> >> XENVIF_HASH_CACHE_SIZE_DEFAULT;
> >>  module_param_named(hash_cache_size, xenvif_hash_cache_size, uint,
> >> 0644);
> 
> Isn't the "owner-write" permission here ...
> 
> >> --- a/drivers/net/xen-netback/rx.c
> >> +++ b/drivers/net/xen-netback/rx.c
> >> @@ -168,11 +168,14 @@ static void xenvif_rx_copy_add(struct
> >> xenvif_queue *queue,
> >>   struct xen_netif_rx_request *req,
> >>   unsigned int offset, void *data, size_t len)
> >>  {
> >> +  unsigned int batch_size;
> >>struct gnttab_copy *op;
> >>struct page *page;
> >>struct xen_page_foreign *foreign;
> >>
> >> -  if (queue->rx_copy.num == COPY_BATCH_SIZE)
> >> +  batch_size = min(xenvif_copy_batch_size, queue->rx_copy.size);
> >
> > Surely queue->rx_copy.size and xenvif_copy_batch_size are always
> identical?
> > Why do you need this statement (and hence stack variable)?
> 
> ... the answer to your question?

Yes, I guess it could be... but since there's no re-alloc code for the arrays I 
wonder whether the intention was to make this dynamic or not.

  Paul

> 
> Jan
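
For readers following the exchange: a module parameter with mode 0644 can be rewritten at runtime, so the parameter and the size allocated at queue init can diverge. The min() in the v1 patch clamps to whichever is smaller. A tiny sketch of that clamp, with `effective_batch` as a hypothetical name:

```c
/* Hypothetical helper: never use more slots than were allocated. */
static unsigned int effective_batch(unsigned int param, unsigned int allocated)
{
	return param < allocated ? param : allocated;
}
```

This lets a running queue honour a smaller value immediately, but a larger value has no effect until the arrays are reallocated, which is the gap noted in the reply above.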

RE: [PATCH net-next v1] xen-netback: make copy batch size configurable

2017-11-13 Thread Paul Durrant

> -----Original Message-----
> From: Joao Martins [mailto:joao.m.mart...@oracle.com]
> Sent: 10 November 2017 19:35
> To: netdev@vger.kernel.org
> Cc: Joao Martins <joao.m.mart...@oracle.com>; Wei Liu
> <wei.l...@citrix.com>; Paul Durrant <paul.durr...@citrix.com>; xen-
> de...@lists.xenproject.org
> Subject: [PATCH net-next v1] xen-netback: make copy batch size
> configurable
> 
> Commit eb1723a29b9a ("xen-netback: refactor guest rx") refactored Rx
> handling and as a result decreased max grant copy ops from 4352 to 64.
> Before this commit it would drain the rx_queue (while there are
> enough slots in the ring to put packets) then copy to all pages and write
> responses on the ring. With the refactor we do almost the same albeit
> the last two steps are done every COPY_BATCH_SIZE (64) copies.
> 
> For big packets, the value of 64 means copying 3 packets best case scenario
> (17 copies) and worst-case only 1 packet (34 copies, i.e. if all frags
> plus head cross the 4k grant boundary) which could be the case when
> packets go from local backend process.
> 
> Instead of making it static to 64 grant copies, lets allow the user to
> select its value (while keeping the current as default) by introducing
> the `copy_batch_size` module parameter. This allows users to select
> the higher batches (i.e. for better throughput with big packets) as it
> was prior to the above mentioned commit.
> 
> Signed-off-by: Joao Martins <joao.m.mart...@oracle.com>
> ---
>  drivers/net/xen-netback/common.h|  6 --
>  drivers/net/xen-netback/interface.c | 25 -
>  drivers/net/xen-netback/netback.c   |  5 +
>  drivers/net/xen-netback/rx.c|  5 -
>  4 files changed, 37 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-
> netback/common.h
> index a46a1e94505d..a5fe36e098a7 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -129,8 +129,9 @@ struct xenvif_stats {
>  #define COPY_BATCH_SIZE 64
> 
>  struct xenvif_copy_state {
> - struct gnttab_copy op[COPY_BATCH_SIZE];
> - RING_IDX idx[COPY_BATCH_SIZE];
> + struct gnttab_copy *op;
> + RING_IDX *idx;
> + unsigned int size;

Could you name this batch_size, or something like that to make it clear what it 
means?

>   unsigned int num;
>   struct sk_buff_head *completed;
>  };
> @@ -381,6 +382,7 @@ extern unsigned int rx_drain_timeout_msecs;
>  extern unsigned int rx_stall_timeout_msecs;
>  extern unsigned int xenvif_max_queues;
>  extern unsigned int xenvif_hash_cache_size;
> +extern unsigned int xenvif_copy_batch_size;
> 
>  #ifdef CONFIG_DEBUG_FS
>  extern struct dentry *xen_netback_dbg_root;
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> netback/interface.c
> index d6dff347f896..a558868a883f 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -516,7 +516,20 @@ struct xenvif *xenvif_alloc(struct device *parent,
> domid_t domid,
> 
>  int xenvif_init_queue(struct xenvif_queue *queue)
>  {
> + int size = xenvif_copy_batch_size;

unsigned int

>   int err, i;
> + void *addr;
> +
> + addr = vzalloc(size * sizeof(struct gnttab_copy));

Does the memory need to be zeroed?

> + if (!addr)
> + goto err;
> + queue->rx_copy.op = addr;
> +
> + addr = vzalloc(size * sizeof(RING_IDX));

Likewise.

> + if (!addr)
> + goto err;
> + queue->rx_copy.idx = addr;
> + queue->rx_copy.size = size;
> 
>   queue->credit_bytes = queue->remaining_credit = ~0UL;
>   queue->credit_usec  = 0UL;
> @@ -544,7 +557,7 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>queue->mmap_pages);
>   if (err) {
>   netdev_err(queue->vif->dev, "Could not reserve
> mmap_pages\n");
> - return -ENOMEM;
> + goto err;
>   }
> 
>   for (i = 0; i < MAX_PENDING_REQS; i++) {
> @@ -556,6 +569,13 @@ int xenvif_init_queue(struct xenvif_queue *queue)
>   }
> 
>   return 0;
> +
> +err:
> + if (queue->rx_copy.op)
> + vfree(queue->rx_copy.op);

vfree is safe to be called with NULL.

> + if (queue->rx_copy.idx)
> + vfree(queue->rx_copy.idx);
> + return -ENOMEM;
>  }
> 
>  void xenvif_carrier_on(struct xenvif *vif)
> @@ -788,6 +808,9 @@ void xenvif_disconnect_ctrl(struct xenvif *vif)
>   */
>  void xenvif_deinit_queue(struct xenvif_queue *queue)
>  {
> +

RE: [PATCH net] xen-netback: correctly schedule rate-limited queues

2017-06-21 Thread Paul Durrant

> -----Original Message-----
> From: Wei Liu [mailto:wei.l...@citrix.com]
> Sent: 21 June 2017 10:21
> To: netdev@vger.kernel.org
> Cc: Xen-devel <xen-de...@lists.xenproject.org>; Paul Durrant
> <paul.durr...@citrix.com>; David Miller <da...@davemloft.net>; jean-
> lo...@dupond.be; Wei Liu <wei.l...@citrix.com>
> Subject: [PATCH net] xen-netback: correctly schedule rate-limited queues
> 
> Add a flag to indicate if a queue is rate-limited. Test the flag in
> NAPI poll handler and avoid rescheduling the queue if true, otherwise
> we risk locking up the host. The rescheduling will be done in the
> timer callback function.
> 
> Reported-by: Jean-Louis Dupond <jean-lo...@dupond.be>
> Signed-off-by: Wei Liu <wei.l...@citrix.com>
> Tested-by: Jean-Louis Dupond <jean-lo...@dupond.be>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  drivers/net/xen-netback/common.h| 1 +
>  drivers/net/xen-netback/interface.c | 6 +-
>  drivers/net/xen-netback/netback.c   | 6 +-
>  3 files changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-
> netback/common.h
> index 530586be05b4..5b1d2e8402d9 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -199,6 +199,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
>   unsigned long   remaining_credit;
>   struct timer_list credit_timeout;
>   u64 credit_window_start;
> + bool rate_limited;
> 
>   /* Statistics */
>   struct xenvif_stats stats;
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> netback/interface.c
> index 8397f6c92451..e322a862ddfe 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -106,7 +106,11 @@ static int xenvif_poll(struct napi_struct *napi, int
> budget)
> 
>   if (work_done < budget) {
>   napi_complete_done(napi, work_done);
> - xenvif_napi_schedule_or_enable_events(queue);
> + /* If the queue is rate-limited, it shall be
> +  * rescheduled in the timer callback.
> +  */
> + if (likely(!queue->rate_limited))
> + xenvif_napi_schedule_or_enable_events(queue);
>   }
> 
>   return work_done;
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> netback/netback.c
> index 602d408fa25e..5042ff8d449a 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -180,6 +180,7 @@ static void tx_add_credit(struct xenvif_queue
> *queue)
>   max_credit = ULONG_MAX; /* wrapped: clamp to
> ULONG_MAX */
> 
>   queue->remaining_credit = min(max_credit, max_burst);
> + queue->rate_limited = false;
>  }
> 
>  void xenvif_tx_credit_callback(unsigned long data)
> @@ -686,8 +687,10 @@ static bool tx_credit_exceeded(struct xenvif_queue
> *queue, unsigned size)
>   msecs_to_jiffies(queue->credit_usec / 1000);
> 
>   /* Timer could already be pending in rare cases. */
> - if (timer_pending(&queue->credit_timeout))
> + if (timer_pending(&queue->credit_timeout)) {
> + queue->rate_limited = true;
>   return true;
> + }
> 
>   /* Passed the point where we can replenish credit? */
>   if (time_after_eq64(now, next_credit)) {
> @@ -702,6 +705,7 @@ static bool tx_credit_exceeded(struct xenvif_queue
> *queue, unsigned size)
>   mod_timer(&queue->credit_timeout,
> next_credit);
>   queue->credit_window_start = next_credit;
> + queue->rate_limited = true;
> 
>   return true;
>   }
> --
> 2.11.0
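
The interaction between the poll handler, the credit check and the timer can be modelled in a few lines of userspace C. This is a sketch of the control flow only; `queue_model` and the function names are hypothetical, and `reschedule` stands in for xenvif_napi_schedule_or_enable_events().

```c
#include <stdbool.h>

/* Hypothetical model of the fields the patch touches. */
struct queue_model {
	bool rate_limited;      /* set when credit runs out */
	unsigned int scheduled; /* counts calls to the reschedule path */
};

/* Stand-in for xenvif_napi_schedule_or_enable_events(). */
static void reschedule(struct queue_model *q)
{
	q->scheduled++;
}

/* Tail of the poll handler: skip rescheduling while rate-limited. */
static void poll_done(struct queue_model *q)
{
	if (!q->rate_limited)
		reschedule(q);
}

/* Credit exhausted: mark the queue and leave the wake-up to the timer. */
static void credit_exceeded(struct queue_model *q)
{
	q->rate_limited = true;
}

/* Timer callback: replenish credit (clears the flag), then reschedule. */
static void credit_timer_fired(struct queue_model *q)
{
	q->rate_limited = false;
	reschedule(q);
}
```

While the flag is set, the poll handler completes without re-arming itself, so the queue stays idle until the timer fires instead of being polled repeatedly with no credit.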

RE: [PATCH net v3] xen-netback: fix race condition on XenBus disconnect

2017-03-09 Thread Paul Durrant

> -----Original Message-----
> From: Igor Druzhinin [mailto:igor.druzhi...@citrix.com]
> Sent: 09 March 2017 19:42
> To: netdev@vger.kernel.org; xen-de...@lists.xenproject.org
> Cc: Paul Durrant <paul.durr...@citrix.com>; jgr...@suse.com; Wei Liu
> <wei.l...@citrix.com>; Igor Druzhinin <igor.druzhi...@citrix.com>
> Subject: [PATCH net v3] xen-netback: fix race condition on XenBus
> disconnect
> 
> In some cases during XenBus disconnect event handling and subsequent
> queue resource release there may be some TX handlers active on
> other processors. Use RCU in order to synchronize with them.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhi...@citrix.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
> v3:
>  * Fix unintended semantic change in xenvif_get_ethtool_stats
>  * Dropped extra code
> 
> v2:
>  * Add protection for xenvif_get_ethtool_stats
>  * Additional comments and fixes
> ---
>  drivers/net/xen-netback/interface.c | 26 +-
>  drivers/net/xen-netback/netback.c   |  2 +-
>  drivers/net/xen-netback/xenbus.c| 20 ++--
>  3 files changed, 28 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> netback/interface.c
> index 829b26c..a3c018e 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -165,13 +165,17 @@ static int xenvif_start_xmit(struct sk_buff *skb,
> struct net_device *dev)
>  {
>   struct xenvif *vif = netdev_priv(dev);
>   struct xenvif_queue *queue = NULL;
> - unsigned int num_queues = vif->num_queues;
> + unsigned int num_queues;
>   u16 index;
>   struct xenvif_rx_cb *cb;
> 
>   BUG_ON(skb->dev != dev);
> 
> - /* Drop the packet if queues are not set up */
> + /* Drop the packet if queues are not set up.
> +  * This handler should be called inside an RCU read section
> +  * so we don't need to enter it here explicitly.
> +  */
> + num_queues = rcu_dereference(vif)->num_queues;
>   if (num_queues < 1)
>   goto drop;
> 
> @@ -222,18 +226,18 @@ static struct net_device_stats
> *xenvif_get_stats(struct net_device *dev)
>  {
>   struct xenvif *vif = netdev_priv(dev);
>   struct xenvif_queue *queue = NULL;
> + unsigned int num_queues;
>   u64 rx_bytes = 0;
>   u64 rx_packets = 0;
>   u64 tx_bytes = 0;
>   u64 tx_packets = 0;
>   unsigned int index;
> 
> - spin_lock(&vif->lock);
> - if (vif->queues == NULL)
> - goto out;
> + rcu_read_lock();
> + num_queues = rcu_dereference(vif)->num_queues;
> 
>   /* Aggregate tx and rx stats from each queue */
> - for (index = 0; index < vif->num_queues; ++index) {
> + for (index = 0; index < num_queues; ++index) {
>   queue = &vif->queues[index];
>   rx_bytes += queue->stats.rx_bytes;
>   rx_packets += queue->stats.rx_packets;
> @@ -241,8 +245,7 @@ static struct net_device_stats
> *xenvif_get_stats(struct net_device *dev)
>   tx_packets += queue->stats.tx_packets;
>   }
> 
> -out:
> - spin_unlock(&vif->lock);
> + rcu_read_unlock();
> 
>   vif->dev->stats.rx_bytes = rx_bytes;
>   vif->dev->stats.rx_packets = rx_packets;
> @@ -378,10 +381,13 @@ static void xenvif_get_ethtool_stats(struct
> net_device *dev,
>struct ethtool_stats *stats, u64 * data)
>  {
>   struct xenvif *vif = netdev_priv(dev);
> - unsigned int num_queues = vif->num_queues;
> + unsigned int num_queues;
>   int i;
>   unsigned int queue_index;
> 
> + rcu_read_lock();
> + num_queues = rcu_dereference(vif)->num_queues;
> +
>   for (i = 0; i < ARRAY_SIZE(xenvif_stats); i++) {
>   unsigned long accum = 0;
>   for (queue_index = 0; queue_index < num_queues;
> ++queue_index) {
> @@ -390,6 +396,8 @@ static void xenvif_get_ethtool_stats(struct
> net_device *dev,
>   }
>   data[i] = accum;
>   }
> +
> + rcu_read_unlock();
>  }
> 
>  static void xenvif_get_strings(struct net_device *dev, u32 stringset, u8 *
> data)
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> netback/netback.c
> index f9bcf4a..602d408 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -214,7 +214,7 @@ static void xenvif_fatal_tx_err(struct xenvif *vif)
>   netdev_err(vif->dev, "fatal error; disabling device\n");
>   v

RE: [PATCH net v2] xen-netback: fix race condition on XenBus disconnect

2017-03-06 Thread Paul Durrant

> -----Original Message-----
> From: Igor Druzhinin [mailto:igor.druzhi...@citrix.com]
> Sent: 03 March 2017 20:23
> To: netdev@vger.kernel.org; xen-de...@lists.xenproject.org
> Cc: Paul Durrant <paul.durr...@citrix.com>; jgr...@suse.com; Wei Liu
> <wei.l...@citrix.com>; Igor Druzhinin <igor.druzhi...@citrix.com>
> Subject: [PATCH net v2] xen-netback: fix race condition on XenBus
> disconnect
> 
> In some cases during XenBus disconnect event handling and subsequent
> queue resource release there may be some TX handlers active on
> other processors. Use RCU in order to synchronize with them.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhi...@citrix.com>
> ---
> v2:
>  * Add protection for xenvif_get_ethtool_stats
>  * Additional comments and fixes
> ---
>  drivers/net/xen-netback/interface.c | 29 ++---
>  drivers/net/xen-netback/netback.c   |  2 +-
>  drivers/net/xen-netback/xenbus.c| 20 ++--
>  3 files changed, 33 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> netback/interface.c
> index a2d32676..266b7cd 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -164,13 +164,17 @@ static int xenvif_start_xmit(struct sk_buff *skb,
> struct net_device *dev)
>  {
>   struct xenvif *vif = netdev_priv(dev);
>   struct xenvif_queue *queue = NULL;
> - unsigned int num_queues = vif->num_queues;
> + unsigned int num_queues;
>   u16 index;
>   struct xenvif_rx_cb *cb;
> 
>   BUG_ON(skb->dev != dev);
> 
> - /* Drop the packet if queues are not set up */
> + /* Drop the packet if queues are not set up.
> +  * This handler should be called inside an RCU read section
> +  * so we don't need to enter it here explicitly.
> +  */
> + num_queues = rcu_dereference(vif)->num_queues;
>   if (num_queues < 1)
>   goto drop;
> 
> @@ -221,18 +225,21 @@ static struct net_device_stats
> *xenvif_get_stats(struct net_device *dev)
>  {
>   struct xenvif *vif = netdev_priv(dev);
>   struct xenvif_queue *queue = NULL;
> + unsigned int num_queues;
>   u64 rx_bytes = 0;
>   u64 rx_packets = 0;
>   u64 tx_bytes = 0;
>   u64 tx_packets = 0;
>   unsigned int index;
> 
> - spin_lock(&vif->lock);
> - if (vif->queues == NULL)
> + rcu_read_lock();
> +
> + num_queues = rcu_dereference(vif)->num_queues;
> + if (num_queues < 1)
>   goto out;

Is this if clause worth it? All it does is jump over the for loop, which would 
not be executed anyway, since the initial test (0 < 0) would fail.
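
To illustrate the point, here is a minimal userspace sketch (not the driver code; struct demo_queue_stats and demo_sum_rx_bytes() are hypothetical names made up for this example). With a queue count of zero the loop condition is false on entry, so the sum comes out as 0 without any explicit guard:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-queue counters; not the kernel struct. */
struct demo_queue_stats {
	uint64_t rx_bytes;
};

/*
 * Sum rx_bytes over num_queues entries. When num_queues is 0 the loop
 * condition (0 < 0) is false on entry, so the body never runs and the
 * function returns 0 without needing a separate "if (num_queues < 1)".
 */
static uint64_t demo_sum_rx_bytes(const struct demo_queue_stats *queues,
				  unsigned int num_queues)
{
	uint64_t total = 0;
	unsigned int index;

	/* A NULL array is only acceptable when there is nothing to read. */
	assert(queues != NULL || num_queues == 0);

	for (index = 0; index < num_queues; ++index)
		total += queues[index].rx_bytes;

	return total;
}
```

Calling demo_sum_rx_bytes(NULL, 0) touches no memory and returns 0, which is the same result the guarded version would give.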

> 
>   /* Aggregate tx and rx stats from each queue */
> - for (index = 0; index < vif->num_queues; ++index) {
> + for (index = 0; index < num_queues; ++index) {
>   queue = &vif->queues[index];
>   rx_bytes += queue->stats.rx_bytes;
>   rx_packets += queue->stats.rx_packets;
> @@ -241,7 +248,7 @@ static struct net_device_stats
> *xenvif_get_stats(struct net_device *dev)
>   }
> 
>  out:
> - spin_unlock(&vif->lock);
> + rcu_read_unlock();
> 
>   vif->dev->stats.rx_bytes = rx_bytes;
>   vif->dev->stats.rx_packets = rx_packets;
> @@ -377,10 +384,16 @@ static void xenvif_get_ethtool_stats(struct
> net_device *dev,
>struct ethtool_stats *stats, u64 * data)
>  {
>   struct xenvif *vif = netdev_priv(dev);
> - unsigned int num_queues = vif->num_queues;
> + unsigned int num_queues;
>   int i;
>   unsigned int queue_index;
> 
> + rcu_read_lock();
> +
> + num_queues = rcu_dereference(vif)->num_queues;
> + if (num_queues < 1)
> + goto out;
> +

You have introduced a semantic change with the above if clause. The 
data array was previously zeroed if num_queues < 1. It appears that 
ethtool does actually allocate a zeroed array to pass in here, but I wonder 
whether it is still safer to have this function zero it anyway. 
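
A small userspace sketch of the difference, under the assumption that the array passed in may contain stale values (DEMO_NUM_STATS, struct demo_queue and both demo_fill_*() functions are hypothetical names for this example only):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical; stands in for ARRAY_SIZE(xenvif_stats). */
#define DEMO_NUM_STATS 3

/* Hypothetical per-queue statistics block. */
struct demo_queue {
	uint64_t stats[DEMO_NUM_STATS];
};

/*
 * Previous behaviour: the outer loop always runs, so every data[i] is
 * written. With num_queues == 0 each entry is set to 0.
 */
static void demo_fill_always(const struct demo_queue *queues,
			     unsigned int num_queues, uint64_t *data)
{
	unsigned int i, q;

	assert(data != NULL);

	for (i = 0; i < DEMO_NUM_STATS; i++) {
		uint64_t accum = 0;

		for (q = 0; q < num_queues; ++q)
			accum += queues[q].stats[i];
		data[i] = accum;
	}
}

/*
 * Behaviour with the early exit: when num_queues < 1 nothing is written,
 * so data[] keeps whatever the caller left in it.
 */
static void demo_fill_early_exit(const struct demo_queue *queues,
				 unsigned int num_queues, uint64_t *data)
{
	if (num_queues < 1)
		return;

	demo_fill_always(queues, num_queues, data);
}
```

If the caller always hands over a zeroed array the two are indistinguishable; the difference only shows when the array holds stale values.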

>   for (i = 0; i < ARRAY_SIZE(xenvif_stats); i++) {
>   unsigned long accum = 0;
>   for (queue_index = 0; queue_index < num_queues;
> ++queue_index) {
> @@ -389,6 +402,8 @@ static void xenvif_get_ethtool_stats(struct
> net_device *dev,
>   }
>   data[i] = accum;
>   }
> +out:
> + rcu_read_unlock();
>  }
> 
>  static void xenvif_get_strings(struct net_device *dev, u32 stringset, u8 *
> data)
> diff --git a/drivers/net/xen-netback/netback.c b/

RE: [PATCH] xen-netback: fix race condition on XenBus disconnect

2017-03-03 Thread Paul Durrant

> -----Original Message-----
> From: Igor Druzhinin
> Sent: 03 March 2017 13:54
> To: Paul Durrant <paul.durr...@citrix.com>; netdev@vger.kernel.org; xen-
> de...@lists.xenproject.org
> Cc: jgr...@suse.com; Wei Liu <wei.l...@citrix.com>
> Subject: Re: [PATCH] xen-netback: fix race condition on XenBus disconnect
> 
> On 03/03/17 09:18, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Igor Druzhinin [mailto:igor.druzhi...@citrix.com]
> >> Sent: 02 March 2017 22:57
> >> To: netdev@vger.kernel.org; xen-de...@lists.xenproject.org
> >> Cc: Paul Durrant <paul.durr...@citrix.com>; jgr...@suse.com; Wei Liu
> >> <wei.l...@citrix.com>; Igor Druzhinin <igor.druzhi...@citrix.com>
> >> Subject: [PATCH] xen-netback: fix race condition on XenBus disconnect
> >>
> >> In some cases during XenBus disconnect event handling and subsequent
> >> queue resource release there may be some TX handlers active on
> >> other processors. Use RCU in order to synchronize with them.
> >>
> >> Signed-off-by: Igor Druzhinin <igor.druzhi...@citrix.com>
> >> ---
> >>  drivers/net/xen-netback/interface.c | 13 -
> >>  drivers/net/xen-netback/xenbus.c| 17 +++--
> >>  2 files changed, 15 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> >> netback/interface.c
> >> index a2d32676..32e2cc6 100644
> >> --- a/drivers/net/xen-netback/interface.c
> >> +++ b/drivers/net/xen-netback/interface.c
> >> @@ -164,7 +164,7 @@ static int xenvif_start_xmit(struct sk_buff *skb,
> struct
> >> net_device *dev)
> >>  {
> >>struct xenvif *vif = netdev_priv(dev);
> >>struct xenvif_queue *queue = NULL;
> >> -  unsigned int num_queues = vif->num_queues;
> >
> > Do you not need an rcu_read_lock() around this and use of the
> num_queues value (as you have below)?
> >
> >> +  unsigned int num_queues = rcu_dereference(vif)->num_queues;
> >>u16 index;
> >>struct xenvif_rx_cb *cb;
> >>
> >> @@ -221,18 +221,21 @@ static struct net_device_stats
> >> *xenvif_get_stats(struct net_device *dev)
> >>  {
> >>struct xenvif *vif = netdev_priv(dev);
> >>struct xenvif_queue *queue = NULL;
> >> +  unsigned int num_queues;
> >>u64 rx_bytes = 0;
> >>u64 rx_packets = 0;
> >>u64 tx_bytes = 0;
> >>u64 tx_packets = 0;
> >>unsigned int index;
> >>
> >> -  spin_lock(&vif->lock);
> >> -  if (vif->queues == NULL)
> >> +  rcu_read_lock();
> >> +
> >> +  num_queues = rcu_dereference(vif)->num_queues;
> >> +  if (num_queues < 1)
> >>goto out;
> >>
> >>/* Aggregate tx and rx stats from each queue */
> >> -  for (index = 0; index < vif->num_queues; ++index) {
> >> +  for (index = 0; index < num_queues; ++index) {
> >>queue = &vif->queues[index];
> >>rx_bytes += queue->stats.rx_bytes;
> >>rx_packets += queue->stats.rx_packets;
> >> @@ -241,7 +244,7 @@ static struct net_device_stats
> >> *xenvif_get_stats(struct net_device *dev)
> >>}
> >>
> >>  out:
> >> -  spin_unlock(&vif->lock);
> >> +  rcu_read_unlock();
> >>
> >>vif->dev->stats.rx_bytes = rx_bytes;
> >>vif->dev->stats.rx_packets = rx_packets;
> >> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-
> >> netback/xenbus.c
> >> index d2d7cd9..76efb01 100644
> >> --- a/drivers/net/xen-netback/xenbus.c
> >> +++ b/drivers/net/xen-netback/xenbus.c
> >> @@ -495,26 +495,23 @@ static void backend_disconnect(struct
> >> backend_info *be)
> >>struct xenvif *vif = be->vif;
> >>
> >>if (vif) {
> >> +  unsigned int num_queues = vif->num_queues;
> >>unsigned int queue_index;
> >> -  struct xenvif_queue *queues;
> >>
> >>xen_unregister_watchers(vif);
> >>  #ifdef CONFIG_DEBUG_FS
> >>xenvif_debugfs_delif(vif);
> >>  #endif /* CONFIG_DEBUG_FS */
> >>xenvif_disconnect_data(vif);
> >> -  for (queue_index = 0;
> >> -   queue_index < vif->num_queues;
> >> -   ++queue_index)
> >> -  xenvif_d

RE: [PATCH] xen-netback: fix race condition on XenBus disconnect

2017-03-03 Thread Paul Durrant

> -----Original Message-----
> From: Igor Druzhinin
> Sent: 03 March 2017 13:56
> To: Paul Durrant <paul.durr...@citrix.com>; netdev@vger.kernel.org; xen-
> de...@lists.xenproject.org
> Cc: jgr...@suse.com; Wei Liu <wei.l...@citrix.com>
> Subject: Re: [PATCH] xen-netback: fix race condition on XenBus disconnect
> 
> On 03/03/17 09:18, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Igor Druzhinin [mailto:igor.druzhi...@citrix.com]
> >> Sent: 02 March 2017 22:57
> >> To: netdev@vger.kernel.org; xen-de...@lists.xenproject.org
> >> Cc: Paul Durrant <paul.durr...@citrix.com>; jgr...@suse.com; Wei Liu
> >> <wei.l...@citrix.com>; Igor Druzhinin <igor.druzhi...@citrix.com>
> >> Subject: [PATCH] xen-netback: fix race condition on XenBus disconnect
> >>
> >> In some cases during XenBus disconnect event handling and subsequent
> >> queue resource release there may be some TX handlers active on
> >> other processors. Use RCU in order to synchronize with them.
> >>
> >> Signed-off-by: Igor Druzhinin <igor.druzhi...@citrix.com>
> >> ---
> >>  drivers/net/xen-netback/interface.c | 13 -
> >>  drivers/net/xen-netback/xenbus.c| 17 +++--
> >>  2 files changed, 15 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> >> netback/interface.c
> >> index a2d32676..32e2cc6 100644
> >> --- a/drivers/net/xen-netback/interface.c
> >> +++ b/drivers/net/xen-netback/interface.c
> >> @@ -164,7 +164,7 @@ static int xenvif_start_xmit(struct sk_buff *skb,
> struct
> >> net_device *dev)
> >>  {
> >>struct xenvif *vif = netdev_priv(dev);
> >>struct xenvif_queue *queue = NULL;
> >> -  unsigned int num_queues = vif->num_queues;
> >
> > Do you not need an rcu_read_lock() around this and use of the
> num_queues value (as you have below)?
> 
> Huh, missed this one. Point is that xenvif_start_xmit is already in RCU
> read section.
> 

Ok. Probably worth a comment then since the rcu_deref looks wrong out on its 
own like that.

  Paul

> Igor
> 
> >
> >> +  unsigned int num_queues = rcu_dereference(vif)->num_queues;
> >>u16 index;
> >>struct xenvif_rx_cb *cb;
> >>
> >> @@ -221,18 +221,21 @@ static struct net_device_stats
> >> *xenvif_get_stats(struct net_device *dev)
> >>  {
> >>struct xenvif *vif = netdev_priv(dev);
> >>struct xenvif_queue *queue = NULL;
> >> +  unsigned int num_queues;
> >>u64 rx_bytes = 0;
> >>u64 rx_packets = 0;
> >>u64 tx_bytes = 0;
> >>u64 tx_packets = 0;
> >>unsigned int index;
> >>
> >> -  spin_lock(&vif->lock);
> >> -  if (vif->queues == NULL)
> >> +  rcu_read_lock();
> >> +
> >> +  num_queues = rcu_dereference(vif)->num_queues;
> >> +  if (num_queues < 1)
> >>goto out;
> >>
> >>/* Aggregate tx and rx stats from each queue */
> >> -  for (index = 0; index < vif->num_queues; ++index) {
> >> +  for (index = 0; index < num_queues; ++index) {
> >>queue = &vif->queues[index];
> >>rx_bytes += queue->stats.rx_bytes;
> >>rx_packets += queue->stats.rx_packets;
> >> @@ -241,7 +244,7 @@ static struct net_device_stats
> >> *xenvif_get_stats(struct net_device *dev)
> >>}
> >>
> >>  out:
> >> -  spin_unlock(&vif->lock);
> >> +  rcu_read_unlock();
> >>
> >>vif->dev->stats.rx_bytes = rx_bytes;
> >>vif->dev->stats.rx_packets = rx_packets;
> >> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-
> >> netback/xenbus.c
> >> index d2d7cd9..76efb01 100644
> >> --- a/drivers/net/xen-netback/xenbus.c
> >> +++ b/drivers/net/xen-netback/xenbus.c
> >> @@ -495,26 +495,23 @@ static void backend_disconnect(struct
> >> backend_info *be)
> >>struct xenvif *vif = be->vif;
> >>
> >>if (vif) {
> >> +  unsigned int num_queues = vif->num_queues;
> >>unsigned int queue_index;
> >> -  struct xenvif_queue *queues;
> >>
> >>xen_unregister_watchers(vif);
> >>  #ifdef CONFIG_DEBUG_FS
> >>xenvif_debugfs_delif(vif);
> >>  #endif /* CONFIG_DEBUG_FS */
> >

RE: [PATCH] xen-netback: fix race condition on XenBus disconnect

2017-03-03 Thread Paul Durrant

> -----Original Message-----
> From: Igor Druzhinin [mailto:igor.druzhi...@citrix.com]
> Sent: 02 March 2017 22:57
> To: netdev@vger.kernel.org; xen-de...@lists.xenproject.org
> Cc: Paul Durrant <paul.durr...@citrix.com>; jgr...@suse.com; Wei Liu
> <wei.l...@citrix.com>; Igor Druzhinin <igor.druzhi...@citrix.com>
> Subject: [PATCH] xen-netback: fix race condition on XenBus disconnect
> 
> In some cases during XenBus disconnect event handling and subsequent
> queue resource release there may be some TX handlers active on
> other processors. Use RCU in order to synchronize with them.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhi...@citrix.com>
> ---
>  drivers/net/xen-netback/interface.c | 13 -
>  drivers/net/xen-netback/xenbus.c| 17 +++--
>  2 files changed, 15 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> netback/interface.c
> index a2d32676..32e2cc6 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -164,7 +164,7 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct
> net_device *dev)
>  {
>   struct xenvif *vif = netdev_priv(dev);
>   struct xenvif_queue *queue = NULL;
> - unsigned int num_queues = vif->num_queues;

Do you not need an rcu_read_lock() around this and use of the num_queues value 
(as you have below)?

> + unsigned int num_queues = rcu_dereference(vif)->num_queues;
>   u16 index;
>   struct xenvif_rx_cb *cb;
> 
> @@ -221,18 +221,21 @@ static struct net_device_stats
> *xenvif_get_stats(struct net_device *dev)
>  {
>   struct xenvif *vif = netdev_priv(dev);
>   struct xenvif_queue *queue = NULL;
> + unsigned int num_queues;
>   u64 rx_bytes = 0;
>   u64 rx_packets = 0;
>   u64 tx_bytes = 0;
>   u64 tx_packets = 0;
>   unsigned int index;
> 
> - spin_lock(&vif->lock);
> - if (vif->queues == NULL)
> + rcu_read_lock();
> +
> + num_queues = rcu_dereference(vif)->num_queues;
> + if (num_queues < 1)
>   goto out;
> 
>   /* Aggregate tx and rx stats from each queue */
> - for (index = 0; index < vif->num_queues; ++index) {
> + for (index = 0; index < num_queues; ++index) {
>   queue = &vif->queues[index];
>   rx_bytes += queue->stats.rx_bytes;
>   rx_packets += queue->stats.rx_packets;
> @@ -241,7 +244,7 @@ static struct net_device_stats
> *xenvif_get_stats(struct net_device *dev)
>   }
> 
>  out:
> - spin_unlock(&vif->lock);
> + rcu_read_unlock();
> 
>   vif->dev->stats.rx_bytes = rx_bytes;
>   vif->dev->stats.rx_packets = rx_packets;
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-
> netback/xenbus.c
> index d2d7cd9..76efb01 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -495,26 +495,23 @@ static void backend_disconnect(struct
> backend_info *be)
>   struct xenvif *vif = be->vif;
> 
>   if (vif) {
> + unsigned int num_queues = vif->num_queues;
>   unsigned int queue_index;
> - struct xenvif_queue *queues;
> 
>   xen_unregister_watchers(vif);
>  #ifdef CONFIG_DEBUG_FS
>   xenvif_debugfs_delif(vif);
>  #endif /* CONFIG_DEBUG_FS */
>   xenvif_disconnect_data(vif);
> - for (queue_index = 0;
> -  queue_index < vif->num_queues;
> -  ++queue_index)
> - xenvif_deinit_queue(&vif->queues[queue_index]);
> 
> - spin_lock(&vif->lock);
> - queues = vif->queues;
>   vif->num_queues = 0;
> - vif->queues = NULL;
> - spin_unlock(&vif->lock);
> + synchronize_net();

So, num_queues is your RCU protected value, rather than the queues pointer, in 
which case I think you probably need to change code such as

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/net/xen-netback/netback.c?id=refs/tags/v4.10#n216

to be gated on num_queues.

Also, shouldn't xenvif_up(), xenvif_down() and xenvif_get_ethtool_stats() be 
using rcu_read_lock() and rcu_dereference() of num_queues as well?

  Paul

> 
> - vfree(queues);
> + for (queue_index = 0; queue_index < num_queues;
> ++queue_index)
> + xenvif_deinit_queue(&vif->queues[queue_index]);
> +
> + vfree(vif->queues);
> + vif->queues = NULL;
> 
>   xenvif_disconnect_ctrl(vif);
>   }
> --
> 1.8.3.1

[PATCH net 1/2] xen-netback: keep a local pointer for vif in backend_disconnect()

2017-03-02 Thread Paul Durrant

This patch replaces use of 'be->vif' with 'vif' and hence generally
makes the function look tidier. No semantic change.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/xenbus.c | 32 ++--
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index bb854f9..d82ddc9 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -492,24 +492,28 @@ static int backend_create_xenvif(struct backend_info *be)
 
 static void backend_disconnect(struct backend_info *be)
 {
-   if (be->vif) {
+   struct xenvif *vif = be->vif;
+
+   if (vif) {
unsigned int queue_index;
 
-   xen_unregister_watchers(be->vif);
+   xen_unregister_watchers(vif);
 #ifdef CONFIG_DEBUG_FS
-   xenvif_debugfs_delif(be->vif);
+   xenvif_debugfs_delif(vif);
 #endif /* CONFIG_DEBUG_FS */
-   xenvif_disconnect_data(be->vif);
-   for (queue_index = 0; queue_index < be->vif->num_queues; 
++queue_index)
-   xenvif_deinit_queue(&be->vif->queues[queue_index]);
-
-   spin_lock(&be->vif->lock);
-   vfree(be->vif->queues);
-   be->vif->num_queues = 0;
-   be->vif->queues = NULL;
-   spin_unlock(&be->vif->lock);
-
-   xenvif_disconnect_ctrl(be->vif);
+   xenvif_disconnect_data(vif);
+   for (queue_index = 0;
+queue_index < vif->num_queues;
+++queue_index)
+   xenvif_deinit_queue(&vif->queues[queue_index]);
+
+   spin_lock(&vif->lock);
+   vfree(vif->queues);
+   vif->num_queues = 0;
+   vif->queues = NULL;
+   spin_unlock(&vif->lock);
+
+   xenvif_disconnect_ctrl(vif);
}
 }
 
-- 
2.1.4

[PATCH net 2/2] xen-netback: don't vfree() queues under spinlock

2017-03-02 Thread Paul Durrant

This leads to a BUG of the following form:

[  174.512861] switch: port 2(vif3.0) entered disabled state
[  174.522735] BUG: sleeping function called from invalid context at
/home/build/linux-linus/mm/vmalloc.c:1441
[  174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch
[  174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: GW
4.10.0upstream-11073-g4977ab6-dirty #1
[  174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS V17.0
03/14/2011
[  174.525517] Call Trace:
[  174.526217]  show_stack+0x23/0x60
[  174.526899]  dump_stack+0x5b/0x88
[  174.527562]  ___might_sleep+0xde/0x130
[  174.528208]  __might_sleep+0x35/0xa0
[  174.528840]  ? _raw_spin_unlock_irqrestore+0x13/0x20
[  174.529463]  ? __wake_up+0x40/0x50
[  174.530089]  remove_vm_area+0x20/0x90
[  174.530724]  __vunmap+0x1d/0xc0
[  174.531346]  ? delete_object_full+0x13/0x20
[  174.531973]  vfree+0x40/0x80
[  174.532594]  set_backend_state+0x18a/0xa90
[  174.533221]  ? dwc_scan_descriptors+0x24d/0x430
[  174.533850]  ? kfree+0x5b/0xc0
[  174.534476]  ? xenbus_read+0x3d/0x50
[  174.535101]  ? xenbus_read+0x3d/0x50
[  174.535718]  ? xenbus_gather+0x31/0x90
[  174.536332]  ? ___might_sleep+0xf6/0x130
[  174.536945]  frontend_changed+0x6b/0xd0
[  174.537565]  xenbus_otherend_changed+0x7d/0x80
[  174.538185]  frontend_changed+0x12/0x20
[  174.538803]  xenwatch_thread+0x74/0x110
[  174.539417]  ? woken_wake_function+0x20/0x20
[  174.540049]  kthread+0xe5/0x120
[  174.540663]  ? xenbus_printf+0x50/0x50
[  174.541278]  ? __kthread_init_worker+0x40/0x40
[  174.541898]  ret_from_fork+0x21/0x2c
[  174.548635] switch: port 2(vif3.0) entered disabled state

This patch defers the vfree() until after the spinlock is released.

Reported-by: Juergen Gross <jgr...@suse.com>
Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Juergen Gross <jgr...@suse.com>
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/xenbus.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index d82ddc9..d2d7cd9 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -496,6 +496,7 @@ static void backend_disconnect(struct backend_info *be)
 
if (vif) {
unsigned int queue_index;
+   struct xenvif_queue *queues;
 
xen_unregister_watchers(vif);
 #ifdef CONFIG_DEBUG_FS
@@ -508,11 +509,13 @@ static void backend_disconnect(struct backend_info *be)
xenvif_deinit_queue(&vif->queues[queue_index]);
 
spin_lock(&vif->lock);
-   vfree(vif->queues);
+   queues = vif->queues;
vif->num_queues = 0;
vif->queues = NULL;
spin_unlock(&vif->lock);
 
+   vfree(queues);
+
xenvif_disconnect_ctrl(vif);
}
 }
-- 
2.1.4
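
The pattern used by the patch above can be sketched in userspace as follows. This is only an illustration under stated assumptions: struct demo_vif, demo_lock(), demo_unlock() and demo_vfree() are hypothetical stand-ins, the "lock" is a plain flag rather than a real spinlock, and demo_vfree() asserts that the lock is not held to mimic the might-sleep check that fired in the trace.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct xenvif. */
struct demo_vif {
	int lock_held;		/* models "in atomic context" */
	unsigned int num_queues;
	int *queues;
};

static void demo_lock(struct demo_vif *vif)
{
	assert(!vif->lock_held);
	vif->lock_held = 1;
}

static void demo_unlock(struct demo_vif *vif)
{
	assert(vif->lock_held);
	vif->lock_held = 0;
}

/* Stand-in for vfree(): it may sleep, so it must not run under the lock. */
static void demo_vfree(struct demo_vif *vif, void *ptr)
{
	assert(!vif->lock_held);
	free(ptr);
}

/*
 * Detach the queue array while holding the lock, then free it once the
 * lock has been dropped. Readers that take the lock see either the old
 * array or NULL, never a freed pointer.
 */
static void demo_release_queues(struct demo_vif *vif)
{
	int *queues;

	demo_lock(vif);
	queues = vif->queues;
	vif->num_queues = 0;
	vif->queues = NULL;
	demo_unlock(vif);

	demo_vfree(vif, queues);
}
```

Moving the demo_vfree() call back inside the locked region trips the assertion, which is the userspace analogue of the "sleeping function called from invalid context" report.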

[PATCH net 0/2] xen-netback: update memory leak fix to avoid BUG

2017-03-02 Thread Paul Durrant

Commit 9a6cdf52b85e "xen-netback: fix memory leaks on XenBus disconnect"
added missing code to fix a memory leak by calling vfree() in the
appropriate place.
Unfortunately subsequent commit f16f1df65f1c "xen-netback: protect
resource cleaning on XenBus disconnect" then wrapped this call to vfree()
in a spin lock, leading to a BUG due to incorrect context.

Patch #1 makes the existing code more readable
Patch #2 fixes the problem

Paul Durrant (2):
  xen-netback: keep a local pointer for vif in backend_disconnect()
  xen-netback: don't vfree() queues under spinlock

 drivers/net/xen-netback/xenbus.c | 31 +++
 1 file changed, 19 insertions(+), 12 deletions(-)

-- 
2.1.4

RE: BUG due to "xen-netback: protect resource cleaning on XenBus disconnect"

2017-03-02 Thread Paul Durrant

> -----Original Message-----
> From: Juergen Gross [mailto:jgr...@suse.com]
> Sent: 02 March 2017 12:13
> To: Wei Liu <wei.l...@citrix.com>
> Cc: Igor Druzhinin <igor.druzhi...@citrix.com>; xen-devel <xen-de...@lists.xenproject.org>; Linux Kernel Mailing List <linux-ker...@vger.kernel.org>; netdev@vger.kernel.org; Boris Ostrovsky
> <boris.ostrov...@oracle.com>; David Miller <da...@davemloft.net>; Paul
> Durrant <paul.durr...@citrix.com>
> Subject: Re: BUG due to "xen-netback: protect resource cleaning on XenBus
> disconnect"
> 
> On 02/03/17 13:06, Wei Liu wrote:
> > On Thu, Mar 02, 2017 at 12:56:20PM +0100, Juergen Gross wrote:
> >> With commits f16f1df65 and 9a6cdf52b we get in our Xen testing:
> >>
> >> [  174.512861] switch: port 2(vif3.0) entered disabled state
> >> [  174.522735] BUG: sleeping function called from invalid context at
> >> /home/build/linux-linus/mm/vmalloc.c:1441
> >> [  174.523451] in_atomic(): 1, irqs_disabled(): 0, pid: 28, name: xenwatch
> >> [  174.524131] CPU: 1 PID: 28 Comm: xenwatch Tainted: GW
> >> 4.10.0upstream-11073-g4977ab6-dirty #1
> >> [  174.524819] Hardware name: MSI MS-7680/H61M-P23 (MS-7680), BIOS
> V17.0
> >> 03/14/2011
> >> [  174.525517] Call Trace:
> >> [  174.526217]  show_stack+0x23/0x60
> >> [  174.526899]  dump_stack+0x5b/0x88
> >> [  174.527562]  ___might_sleep+0xde/0x130
> >> [  174.528208]  __might_sleep+0x35/0xa0
> >> [  174.528840]  ? _raw_spin_unlock_irqrestore+0x13/0x20
> >> [  174.529463]  ? __wake_up+0x40/0x50
> >> [  174.530089]  remove_vm_area+0x20/0x90
> >> [  174.530724]  __vunmap+0x1d/0xc0
> >> [  174.531346]  ? delete_object_full+0x13/0x20
> >> [  174.531973]  vfree+0x40/0x80
> >> [  174.532594]  set_backend_state+0x18a/0xa90
> >> [  174.533221]  ? dwc_scan_descriptors+0x24d/0x430
> >> [  174.533850]  ? kfree+0x5b/0xc0
> >> [  174.534476]  ? xenbus_read+0x3d/0x50
> >> [  174.535101]  ? xenbus_read+0x3d/0x50
> >> [  174.535718]  ? xenbus_gather+0x31/0x90
> >> [  174.536332]  ? ___might_sleep+0xf6/0x130
> >> [  174.536945]  frontend_changed+0x6b/0xd0
> >> [  174.537565]  xenbus_otherend_changed+0x7d/0x80
> >> [  174.538185]  frontend_changed+0x12/0x20
> >> [  174.538803]  xenwatch_thread+0x74/0x110
> >> [  174.539417]  ? woken_wake_function+0x20/0x20
> >> [  174.540049]  kthread+0xe5/0x120
> >> [  174.540663]  ? xenbus_printf+0x50/0x50
> >> [  174.541278]  ? __kthread_init_worker+0x40/0x40
> >> [  174.541898]  ret_from_fork+0x21/0x2c
> >> [  174.548635] switch: port 2(vif3.0) entered disabled state
> >>
> >> I believe calling vfree() when holding a spin_lock isn't a good idea.
> >>
> >
> > Use vfree_atomic instead?
> 
> Hmm, isn't this overkill here?
> 
> You can just set a local variable with the address and do vfree() after
> releasing the lock.
> 

Yep, that's what I was thinking. Patch coming shortly.

  Paul

> 
> Juergen

RE: [PATCH net] xen-netback: Use GFP_ATOMIC to allocate hash

2017-03-02 Thread Paul Durrant

> -----Original Message-----
> From: Anoob Soman [mailto:anoob.so...@citrix.com]
> Sent: 02 March 2017 10:50
> To: netdev@vger.kernel.org; xen-de...@lists.xenproject.org
> Cc: Paul Durrant <paul.durr...@citrix.com>; Wei Liu <wei.l...@citrix.com>;
> Anoob Soman <anoob.so...@citrix.com>
> Subject: [PATCH net] xen-netback: Use GFP_ATOMIC to allocate hash
> 
> Allocation of new_hash, inside xenvif_new_hash(), always happens
> in softirq context, so use GFP_ATOMIC instead of GFP_KERNEL for new
> hash allocation.
> 
> Signed-off-by: Anoob Soman <anoob.so...@citrix.com>

Acked-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  drivers/net/xen-netback/hash.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-
> netback/hash.c
> index e8c5ddd..3c4c58b 100644
> --- a/drivers/net/xen-netback/hash.c
> +++ b/drivers/net/xen-netback/hash.c
> @@ -39,7 +39,7 @@ static void xenvif_add_hash(struct xenvif *vif, const u8
> *tag,
>   unsigned long flags;
>   bool found;
> 
> - new = kmalloc(sizeof(*entry), GFP_KERNEL);
> + new = kmalloc(sizeof(*entry), GFP_ATOMIC);
>   if (!new)
>   return;
> 
> --
> 2.7.4

RE: [Xen-devel] [PATCH] xen-netback: vif counters from int/long to u64

2017-02-10 Thread Paul Durrant

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> Mart van Santen
> Sent: 10 February 2017 12:02
> To: Wei Liu <wei.l...@citrix.com>; Paul Durrant <paul.durr...@citrix.com>;
> xen-de...@lists.xenproject.org; netdev@vger.kernel.org
> Cc: Mart van Santen <m...@greenhost.nl>
> Subject: [Xen-devel] [PATCH] xen-netback: vif counters from int/long to u64
> 
> This patch fixes an issue where the type of counters in the queue(s)
> and interface are not in sync (queue counters are int, interface
> counters are long), causing incorrect reporting of tx/rx values
> of the vif interface and unclear counter overflows.
> This patch sets both counters to the u64 type.
> 
> Signed-off-by: Mart van Santen <m...@greenhost.nl>

Looks sensible to me.

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  drivers/net/xen-netback/common.h| 8 
>  drivers/net/xen-netback/interface.c | 8 
>  2 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-
> netback/common.h
> index 3ce1f7d..530586b 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -113,10 +113,10 @@ struct xenvif_stats {
>* A subset of struct net_device_stats that contains only the
>* fields that are updated in netback.c for each queue.
>*/
> - unsigned int rx_bytes;
> - unsigned int rx_packets;
> - unsigned int tx_bytes;
> - unsigned int tx_packets;
> + u64 rx_bytes;
> + u64 rx_packets;
> + u64 tx_bytes;
> + u64 tx_packets;
> 
>   /* Additional stats used by xenvif */
>   unsigned long rx_gso_checksum_fixup;
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> netback/interface.c
> index 5795213..50fa169 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -221,10 +221,10 @@ static struct net_device_stats
> *xenvif_get_stats(struct net_device *dev)
>  {
>   struct xenvif *vif = netdev_priv(dev);
>   struct xenvif_queue *queue = NULL;
> - unsigned long rx_bytes = 0;
> - unsigned long rx_packets = 0;
> - unsigned long tx_bytes = 0;
> - unsigned long tx_packets = 0;
> + u64 rx_bytes = 0;
> + u64 rx_packets = 0;
> + u64 tx_bytes = 0;
> + u64 tx_packets = 0;
>   unsigned int index;
> 
>   spin_lock(&vif->lock);
> --
> 2.1.4
> 
> 
> ___
> Xen-devel mailing list
> xen-de...@lists.xen.org
> https://lists.xen.org/xen-devel
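
As a rough illustration of why the counter widths in the patch above matter, the sketch below accumulates the same byte counts into a 32-bit and a 64-bit counter. It assumes unsigned int is 32 bits wide, which uint32_t models here; struct demo_counters and demo_account() are hypothetical names, not driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pair of counters tracking the same traffic. */
struct demo_counters {
	uint32_t narrow;	/* models the old per-queue counter width */
	uint64_t wide;		/* models the u64 counter after the change */
};

/*
 * Account 'bytes' in both counters. The 32-bit counter wraps modulo 2^32
 * once roughly 4 GiB have been counted, while the 64-bit one keeps going,
 * so the two drift apart as soon as the narrow one overflows.
 */
static void demo_account(struct demo_counters *c, uint32_t bytes)
{
	assert(c != NULL);

	c->narrow += bytes;
	c->wide += bytes;
}
```

After accounting UINT32_MAX bytes and then one more, the narrow counter is back at 0 while the wide counter holds 4294967296.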

[PATCH net-next] xen-netfront: reject short packets and handle non-linear packets

2017-01-25 Thread Paul Durrant

Sowmini points out two vulnerabilities in xen-netfront:

a) The code assumes that skb->len is at least ETH_HLEN.
b) The code assumes that at least ETH_HLEN octets are in the linear
   part of the socket buffer.

This patch adds tests for both of these, and in the case of the latter
pulls sufficient bytes into the linear area.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reported-by: Sowmini Varadhan <sowmini.varad...@oracle.com>
Tested-by: Sowmini Varadhan <sowmini.varad...@oracle.com>
---
Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: Juergen Gross <jgr...@suse.com>
---
 drivers/net/xen-netfront.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 40f26b6..0478809 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -567,6 +567,10 @@ static int xennet_start_xmit(struct sk_buff *skb, struct 
net_device *dev)
u16 queue_index;
struct sk_buff *nskb;
 
+   /* Basic sanity check */
+   if (unlikely(skb->len < ETH_HLEN))
+   goto drop;
+
/* Drop the packet if no queues are set up */
if (num_queues < 1)
goto drop;
@@ -609,6 +613,11 @@ static int xennet_start_xmit(struct sk_buff *skb, struct 
net_device *dev)
}
 
len = skb_headlen(skb);
+   if (unlikely(len < ETH_HLEN)) {
+   if (!__pskb_pull_tail(skb, ETH_HLEN - len))
+   goto drop;
+   len = ETH_HLEN;
+   }
 
spin_lock_irqsave(&queue->tx_lock, flags);
 
-- 
2.1.4
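
The two checks in the patch above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the driver code: struct demo_skb tracks only the total length and the linear (head) length, demo_pull_tail() is a hypothetical stand-in for __pskb_pull_tail() that just moves the boundary, and DEMO_ETH_HLEN mirrors the 14-byte Ethernet header length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Ethernet header length in bytes (ETH_HLEN is 14). */
#define DEMO_ETH_HLEN 14u

/* Hypothetical, heavily simplified model of a socket buffer. */
struct demo_skb {
	unsigned int len;	/* total bytes: linear + paged */
	unsigned int headlen;	/* bytes in the linear area */
};

/*
 * Model of pulling 'delta' bytes from the paged area into the linear
 * area. Fails if the paged area does not hold that many bytes.
 */
static bool demo_pull_tail(struct demo_skb *skb, unsigned int delta)
{
	assert(skb != NULL && skb->headlen <= skb->len);

	if (delta > skb->len - skb->headlen)
		return false;

	skb->headlen += delta;
	return true;
}

/* Returns true if the packet may be transmitted, false if it is dropped. */
static bool demo_xmit_checks(struct demo_skb *skb)
{
	/* (a) Reject packets shorter than an Ethernet header. */
	if (skb->len < DEMO_ETH_HLEN)
		return false;

	/* (b) Make sure the header sits in the linear area, pulling only
	 * the bytes that are missing.
	 */
	if (skb->headlen < DEMO_ETH_HLEN &&
	    !demo_pull_tail(skb, DEMO_ETH_HLEN - skb->headlen))
		return false;

	return true;
}
```

Note that only the missing ETH_HLEN - len bytes are pulled, so a buffer that already has part of the header in the linear area ends up with exactly ETH_HLEN linear bytes.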

RE: [Xen-devel] xennet_start_xmit assumptions

2017-01-25 Thread Paul Durrant

> -----Original Message-----
> From: Sowmini Varadhan [mailto:sowmini.varad...@oracle.com]
> Sent: 19 January 2017 11:14
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>; Wei Liu
> <wei.l...@citrix.com>; netdev@vger.kernel.org; xen-
> de...@lists.xenproject.org
> Subject: Re: [Xen-devel] xennet_start_xmit assumptions
> 
> On (01/19/17 09:36), Paul Durrant wrote:
> >
> > Hi Sowmini,
> >
> >   Sounds like a straightforward bug to me... netfront should be able
> > to handle an empty skb and clearly, if it's relying on skb_headlen()
> > being non-zero, that's not the case.
> >
> >   Paul
> 
> I see. Seems like there are 2 things broken here: recovering
> from skb->len = 0, and recovering from  the more complex
> case of (skb->len > 0 && skb_headlen(skb) == 0)
> 
> Do you folks want to take a shot at fixing this,
> since you know the code better? If you are interested,
> I can share my test program to help you reproduce the
> simpler skb->len == 0 case, but it's the fully non-linear
> skbs that may be more interesting to reproduce/fix.
> 
> I'll probably work on fixing packet_snd to return -EINVAL
> or similar when the len is zero this week.
> 

Sowmini,

  I knocked together the following patch, which seems to work for me:

---8<---
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 40f26b6..a957c89 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -567,6 +567,10 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	u16 queue_index;
 	struct sk_buff *nskb;

+	/* Drop packets that are not at least ETH_HLEN in length */
+	if (skb->len < ETH_HLEN)
+		goto drop;
+
 	/* Drop the packet if no queues are set up */
 	if (num_queues < 1)
 		goto drop;
@@ -609,6 +613,8 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	}

 	len = skb_headlen(skb);
+	if ((len < ETH_HLEN) && !__pskb_pull_tail(skb, ETH_HLEN))
+		goto drop;

 	spin_lock_irqsave(&queue->tx_lock, flags);

---8<---

  Making netfront cope with a fully non-linear skb looks like it would be quite 
intrusive and probably not worth it so I opted for just doing the ETH_HLEN 
pull-tail if necessary. Can you check it works for you?

  Paul
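
The two checks in the patch above can be modelled in plain userspace C. This is only a sketch of the decision logic, not kernel code: struct model_skb, MODEL_ETH_HLEN and model_prepare_xmit() are hypothetical names invented for illustration, and the model assumes the pull always succeeds, whereas the real __pskb_pull_tail() can fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Ethernet header length; same value as the kernel's ETH_HLEN. */
#define MODEL_ETH_HLEN 14

/* Hypothetical stand-in for the two sk_buff lengths that matter here. */
struct model_skb {
	size_t len;      /* total bytes: linear area plus fragments */
	size_t head_len; /* bytes in the linear area (what skb_headlen() reports) */
};

/*
 * Returns true if the packet may be transmitted, false if it must be dropped.
 * On success the linear area holds at least MODEL_ETH_HLEN bytes.
 */
bool model_prepare_xmit(struct model_skb *skb)
{
	assert(skb != NULL);

	/* First check: a packet shorter than an Ethernet header is dropped. */
	if (skb->len < MODEL_ETH_HLEN)
		return false;

	/*
	 * Second check: if the header is not fully linear, pull it in.
	 * Assumption: the pull cannot fail in this model.
	 */
	if (skb->head_len < MODEL_ETH_HLEN)
		skb->head_len = MODEL_ETH_HLEN;

	return true;
}
```

With this model an empty packet (len 0) is dropped, while a fully non-linear packet (len 100, head_len 0) is accepted once 14 bytes have been made linear.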

RE: [Xen-devel] xennet_start_xmit assumptions

2017-01-19 Thread Paul Durrant

> -----Original Message-----
> From: Sowmini Varadhan [mailto:sowmini.varad...@oracle.com]
> Sent: 19 January 2017 11:14
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>; Wei Liu
> <wei.l...@citrix.com>; netdev@vger.kernel.org; xen-
> de...@lists.xenproject.org
> Subject: Re: [Xen-devel] xennet_start_xmit assumptions
> 
> On (01/19/17 09:36), Paul Durrant wrote:
> >
> > Hi Sowmini,
> >
> >   Sounds like a straightforward bug to me... netfront should be able
> > to handle an empty skb and clearly, if it's relying on skb_headlen()
> > being non-zero, that's not the case.
> >
> >   Paul
> 
> I see. Seems like there are 2 things broken here: recovering
> from skb->len = 0, and recovering from  the more complex
> case of (skb->len > 0 && skb_headlen(skb) == 0)
> 
> Do you folks want to take a shot at fixing this,
> since you know the code better? If you are interested,
> I can share my test program to help you reproduce the
> simpler skb->len == 0 case, but it's the fully non-linear
> skbs that may be more interesting to reproduce/fix.

Sowmini,

Yeah, it would be useful to verify any change fixes the particular issue you're 
seeing so please share the program. For the non-empty non-linear case I'd hope 
that catching this and doing a pull of some sensible amount of header (which 
might coincide with the least amount that netback expects to see in the first 
frag) would be enough.
I can take a shot at a patch for this in the next few days; I'll add your 
'Reported-by' so you should get cc-ed.

Cheers,

  Paul

> 
> I'll probably work on fixing packet_snd to return -EINVAL
> or similar when the len is zero this week.
> 
> --Sowmini

RE: [Xen-devel] xennet_start_xmit assumptions

2017-01-19 Thread Paul Durrant

> -----Original Message-----
> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> Sent: 18 January 2017 19:25
> To: Sowmini Varadhan <sowmini.varad...@oracle.com>; Wei Liu
> <wei.l...@citrix.com>; Paul Durrant <paul.durr...@citrix.com>
> Cc: netdev@vger.kernel.org; xen-de...@lists.xenproject.org
> Subject: Re: [Xen-devel] xennet_start_xmit assumptions
> 
> On Wed, Jan 18, 2017 at 10:31:32AM -0500, Sowmini Varadhan wrote:
> > As I was playing around with pf_packet, I accidentally wrote
> > a buggy application program that bzero'ed the msghdr, then set
> > up the msg_name, msg_namelen correctly, and then did a sendmsg
> > on the pf_packet/SOCK_RAW fd.
> >
> > This causes packet_snd to set up an skb with a lot of issues,
> > e.g., skb->len = 0, skb_headlen(skb) is 0, etc. I think we can/should
> > drop the packet in packet_snd if the skb->len is 0, but there
> > may be other driver bugs going on:
> >
> > Turns out that ixgbe and sunvnet handle this problematic
> > skb correctly (they drop it and system remains stable),
> > but it creates a panic in xen_netfront (xennet_start_xmit()
> > hits a null pointer deref when xennet_make_first_txreq() returns
> > NULL)
> >
> > I'm new to the xen driver code, so I'm hoping that
> > the experts can comment here: reading the code in xennet_start_xmit,
> > it seems like it mandatorily requires the skb_headlen() to be
> > non-zero in order to create the first_tx? That may not always be
> > true, how does the code recover for purely non-linear skbs?

Hi Sowmini,

  Sounds like a straightforward bug to me... netfront should be able to handle 
an empty skb and clearly, if it's relying on skb_headlen() being non-zero, 
that's not the case.

  Paul

> >
> > --Sowmini
> 
> CC-ing the two folks from the MAINTAINERS file.

RE: [PATCH v2 2/2] xen-netback: protect resource cleaning on XenBus disconnect

2017-01-18 Thread Paul Durrant

> -----Original Message-----
> From: Igor Druzhinin [mailto:igor.druzhi...@citrix.com]
> Sent: 17 January 2017 20:50
> To: Wei Liu <wei.l...@citrix.com>
> Cc: Paul Durrant <paul.durr...@citrix.com>; xen-de...@lists.xenproject.org;
> netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Igor Druzhinin
> <igor.druzhi...@citrix.com>
> Subject: [PATCH v2 2/2] xen-netback: protect resource cleaning on XenBus
> disconnect
> 
> vif->lock is used to protect statistics gathering agents from using the
> queue structure during cleaning.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhi...@citrix.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  drivers/net/xen-netback/interface.c | 6 ++++--
>  drivers/net/xen-netback/xenbus.c    | 2 ++
>  2 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> netback/interface.c
> index 41c69b3..c48252a 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -230,18 +230,18 @@ static struct net_device_stats
> *xenvif_get_stats(struct net_device *dev)
>  {
>   struct xenvif *vif = netdev_priv(dev);
>   struct xenvif_queue *queue = NULL;
> - unsigned int num_queues = vif->num_queues;
>   unsigned long rx_bytes = 0;
>   unsigned long rx_packets = 0;
>   unsigned long tx_bytes = 0;
>   unsigned long tx_packets = 0;
>   unsigned int index;
> 
> + spin_lock(&vif->lock);
>   if (vif->queues == NULL)
>   goto out;
> 
>   /* Aggregate tx and rx stats from each queue */
> - for (index = 0; index < num_queues; ++index) {
> + for (index = 0; index < vif->num_queues; ++index) {
>   queue = &vif->queues[index];
>   rx_bytes += queue->stats.rx_bytes;
>   rx_packets += queue->stats.rx_packets;
> @@ -250,6 +250,8 @@ static struct net_device_stats
> *xenvif_get_stats(struct net_device *dev)
>   }
> 
>  out:
> + spin_unlock(&vif->lock);
> +
>   vif->dev->stats.rx_bytes = rx_bytes;
>   vif->dev->stats.rx_packets = rx_packets;
>   vif->dev->stats.tx_bytes = tx_bytes;
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-
> netback/xenbus.c
> index 3e99071..d82cd71 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -503,9 +503,11 @@ static void backend_disconnect(struct backend_info
> *be)
>   for (queue_index = 0; queue_index < be->vif-
> >num_queues; ++queue_index)
>   xenvif_deinit_queue(&be->vif-
> >queues[queue_index]);
> 
> + spin_lock(&be->vif->lock);
>   vfree(be->vif->queues);
>   be->vif->num_queues = 0;
>   be->vif->queues = NULL;
> + spin_unlock(&be->vif->lock);
> 
>   xenvif_disconnect_ctrl(be->vif);
>   }
> --
> 1.8.3.1

RE: [PATCH v2 1/2] xen-netback: fix memory leaks on XenBus disconnect

2017-01-18 Thread Paul Durrant

> -----Original Message-----
> From: Igor Druzhinin [mailto:igor.druzhi...@citrix.com]
> Sent: 17 January 2017 20:50
> To: Wei Liu <wei.l...@citrix.com>
> Cc: Paul Durrant <paul.durr...@citrix.com>; xen-de...@lists.xenproject.org;
> netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Igor Druzhinin
> <igor.druzhi...@citrix.com>
> Subject: [PATCH v2 1/2] xen-netback: fix memory leaks on XenBus
> disconnect
> 
> Eliminate memory leaks introduced several years ago by cleaning the
> queue resources which are allocated on XenBus connection event. Namely,
> queue
> structure array and pages used for IO rings.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhi...@citrix.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  drivers/net/xen-netback/xenbus.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-
> netback/xenbus.c
> index 6c57b02..3e99071 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -493,11 +493,20 @@ static int backend_create_xenvif(struct
> backend_info *be)
>  static void backend_disconnect(struct backend_info *be)
>  {
>   if (be->vif) {
> + unsigned int queue_index;
> +
>   xen_unregister_watchers(be->vif);
>  #ifdef CONFIG_DEBUG_FS
>   xenvif_debugfs_delif(be->vif);
>  #endif /* CONFIG_DEBUG_FS */
>   xenvif_disconnect_data(be->vif);
> + for (queue_index = 0; queue_index < be->vif-
> >num_queues; ++queue_index)
> + xenvif_deinit_queue(&be->vif-
> >queues[queue_index]);
> +
> + vfree(be->vif->queues);
> + be->vif->num_queues = 0;
> + be->vif->queues = NULL;
> +
>   xenvif_disconnect_ctrl(be->vif);
>   }
>  }
> @@ -1026,6 +1035,8 @@ static void connect(struct backend_info *be)
>  err:
>   if (be->vif->num_queues > 0)
>   xenvif_disconnect_data(be->vif); /* Clean up existing
> queues */
> + for (queue_index = 0; queue_index < be->vif->num_queues;
> ++queue_index)
> + xenvif_deinit_queue(&be->vif->queues[queue_index]);
>   vfree(be->vif->queues);
>   be->vif->queues = NULL;
>   be->vif->num_queues = 0;
> --
> 1.8.3.1

RE: [PATCH] xen-netback: fix memory leaks on XenBus disconnect

2017-01-13 Thread Paul Durrant

> -----Original Message-----
> From: Wei Liu [mailto:wei.l...@citrix.com]
> Sent: 13 January 2017 10:38
> To: Igor Druzhinin <igor.druzhi...@citrix.com>
> Cc: Wei Liu <wei.l...@citrix.com>; xen-de...@lists.xenproject.org; Paul
> Durrant <paul.durr...@citrix.com>; netdev@vger.kernel.org; linux-
> ker...@vger.kernel.org
> Subject: Re: [PATCH] xen-netback: fix memory leaks on XenBus disconnect
> 
> On Thu, Jan 12, 2017 at 05:51:56PM +0000, Igor Druzhinin wrote:
> > Eliminate memory leaks introduced several years ago by cleaning the
> queue
> > resources which are allocated on XenBus connection event. Namely, queue
> > structure array and pages used for IO rings.
> > vif->lock is used to protect statistics gathering agents from using the
> > queue structure during cleaning.
> >
> 
> There is code in netback_remove which eventually calls xenvif_free to
> free up the resources, maybe you should modify xenvif_free instead? That
> seems more symmetric to me. What do you think?

The connect code vallocs the queue array because the size is not known until 
then so it makes sense that disconnect vfrees it.

  Paul
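
The symmetry described above (allocate at connect time because only then is the queue count known, free at disconnect) can be sketched in userspace C. All names here (struct model_vif, model_connect(), model_disconnect()) are hypothetical, and calloc()/free() merely stand in for the kernel's vzalloc()/vfree(); no locking is modelled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-queue state; only the counters matter for this sketch. */
struct model_queue {
	unsigned long rx_bytes;
	unsigned long tx_bytes;
};

struct model_vif {
	struct model_queue *queues; /* NULL while disconnected */
	unsigned int num_queues;    /* 0 while disconnected */
};

/* The queue count is only known at connect time, so allocate here. */
int model_connect(struct model_vif *vif, unsigned int requested)
{
	assert(vif != NULL);

	if (requested == 0)
		return -1;

	vif->queues = calloc(requested, sizeof(*vif->queues));
	if (vif->queues == NULL) {
		vif->num_queues = 0;
		return -1;
	}
	vif->num_queues = requested;
	return 0;
}

/* Mirror of model_connect(): release the array and reset both fields. */
void model_disconnect(struct model_vif *vif)
{
	assert(vif != NULL);

	free(vif->queues); /* free(NULL) is a no-op, so a repeat call is harmless */
	vif->queues = NULL;
	vif->num_queues = 0;
}
```

Resetting both the pointer and the count in the disconnect path is what lets a reader such as a statistics routine detect the disconnected state with a simple NULL check.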

> 
> > Signed-off-by: Igor Druzhinin <igor.druzhi...@citrix.com>
> > ---
> >  drivers/net/xen-netback/interface.c |  6 ++++--
> >  drivers/net/xen-netback/xenbus.c    | 13 +++++++++++++
> >  2 files changed, 17 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> netback/interface.c
> > index e30ffd2..5795213 100644
> > --- a/drivers/net/xen-netback/interface.c
> > +++ b/drivers/net/xen-netback/interface.c
> > @@ -221,18 +221,18 @@ static struct net_device_stats
> *xenvif_get_stats(struct net_device *dev)
> >  {
> > struct xenvif *vif = netdev_priv(dev);
> > struct xenvif_queue *queue = NULL;
> > -   unsigned int num_queues = vif->num_queues;
> > unsigned long rx_bytes = 0;
> > unsigned long rx_packets = 0;
> > unsigned long tx_bytes = 0;
> > unsigned long tx_packets = 0;
> > unsigned int index;
> >
> > +   spin_lock(&vif->lock);
> > if (vif->queues == NULL)
> > goto out;
> >
> > /* Aggregate tx and rx stats from each queue */
> > -   for (index = 0; index < num_queues; ++index) {
> > +   for (index = 0; index < vif->num_queues; ++index) {
> > queue = &vif->queues[index];
> > rx_bytes += queue->stats.rx_bytes;
> > rx_packets += queue->stats.rx_packets;
> > @@ -241,6 +241,8 @@ static struct net_device_stats
> *xenvif_get_stats(struct net_device *dev)
> > }
> >
> >  out:
> > +   spin_unlock(&vif->lock);
> > +
> 
> Good catch, this is definitely needed. And it would probably be in a
> separate patch.
> 
> Wei.

RE: [PATCH] xen-netback: fix memory leaks on XenBus disconnect

2017-01-13 Thread Paul Durrant

> -----Original Message-----
> From: Igor Druzhinin [mailto:igor.druzhi...@citrix.com]
> Sent: 12 January 2017 17:52
> To: Wei Liu <wei.l...@citrix.com>; xen-de...@lists.xenproject.org; Paul
> Durrant <paul.durr...@citrix.com>
> Cc: netdev@vger.kernel.org; linux-ker...@vger.kernel.org; Igor Druzhinin
> <igor.druzhi...@citrix.com>
> Subject: [PATCH] xen-netback: fix memory leaks on XenBus disconnect
> 
> Eliminate memory leaks introduced several years ago by cleaning the queue
> resources which are allocated on XenBus connection event. Namely, queue
> structure array and pages used for IO rings.
> vif->lock is used to protect statistics gathering agents from using the
> queue structure during cleaning.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhi...@citrix.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

...although I was involved with discussions concerning this, so Wei should 
probably look it over too.

> ---
>  drivers/net/xen-netback/interface.c |  6 ++++--
>  drivers/net/xen-netback/xenbus.c    | 13 +++++++++++++
>  2 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-
> netback/interface.c
> index e30ffd2..5795213 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -221,18 +221,18 @@ static struct net_device_stats
> *xenvif_get_stats(struct net_device *dev)
>  {
>   struct xenvif *vif = netdev_priv(dev);
>   struct xenvif_queue *queue = NULL;
> - unsigned int num_queues = vif->num_queues;
>   unsigned long rx_bytes = 0;
>   unsigned long rx_packets = 0;
>   unsigned long tx_bytes = 0;
>   unsigned long tx_packets = 0;
>   unsigned int index;
> 
> + spin_lock(&vif->lock);
>   if (vif->queues == NULL)
>   goto out;
> 
>   /* Aggregate tx and rx stats from each queue */
> - for (index = 0; index < num_queues; ++index) {
> + for (index = 0; index < vif->num_queues; ++index) {
>   queue = &vif->queues[index];
>   rx_bytes += queue->stats.rx_bytes;
>   rx_packets += queue->stats.rx_packets;
> @@ -241,6 +241,8 @@ static struct net_device_stats
> *xenvif_get_stats(struct net_device *dev)
>   }
> 
>  out:
> + spin_unlock(&vif->lock);
> +
>   vif->dev->stats.rx_bytes = rx_bytes;
>   vif->dev->stats.rx_packets = rx_packets;
>   vif->dev->stats.tx_bytes = tx_bytes;
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-
> netback/xenbus.c
> index 3124eae..85b742e 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -493,11 +493,22 @@ static int backend_create_xenvif(struct
> backend_info *be)
>  static void backend_disconnect(struct backend_info *be)
>  {
>   if (be->vif) {
> + unsigned int queue_index;
> +
>   xen_unregister_watchers(be->vif);
>  #ifdef CONFIG_DEBUG_FS
>   xenvif_debugfs_delif(be->vif);
>  #endif /* CONFIG_DEBUG_FS */
>   xenvif_disconnect_data(be->vif);
> + for (queue_index = 0; queue_index < be->vif-
> >num_queues; ++queue_index)
> + xenvif_deinit_queue(&be->vif-
> >queues[queue_index]);
> +
> + spin_lock(&be->vif->lock);
> + vfree(be->vif->queues);
> + be->vif->num_queues = 0;
> + be->vif->queues = NULL;
> + spin_unlock(&be->vif->lock);
> +
>   xenvif_disconnect_ctrl(be->vif);
>   }
>  }
> @@ -1034,6 +1045,8 @@ static void connect(struct backend_info *be)
>  err:
>   if (be->vif->num_queues > 0)
>   xenvif_disconnect_data(be->vif); /* Clean up existing
> queues */
> + for (queue_index = 0; queue_index < be->vif->num_queues;
> ++queue_index)
> + xenvif_deinit_queue(&be->vif->queues[queue_index]);
>   vfree(be->vif->queues);
>   be->vif->queues = NULL;
>   be->vif->num_queues = 0;
> --
> 1.8.3.1

RE: [PATCH 2/3] xen: modify xenstore watch event interface

2017-01-06 Thread Paul Durrant

> -----Original Message-----
> From: Juergen Gross [mailto:jgr...@suse.com]
> Sent: 06 January 2017 15:06
> To: linux-ker...@vger.kernel.org; xen-de...@lists.xenproject.org
> Cc: boris.ostrov...@oracle.com; Juergen Gross <jgr...@suse.com>;
> konrad.w...@oracle.com; Roger Pau Monne <roger@citrix.com>; Wei Liu
> <wei.l...@citrix.com>; Paul Durrant <paul.durr...@citrix.com>;
> netdev@vger.kernel.org
> Subject: [PATCH 2/3] xen: modify xenstore watch event interface
> 
> Today a Xenstore watch event is delivered via a callback function
> declared as:
> 
> void (*callback)(struct xenbus_watch *,
>  const char **vec, unsigned int len);
> 
> As all watch events only ever come with two parameters (path and token)
> changing the prototype to:
> 
> void (*callback)(struct xenbus_watch *,
>  const char *path, const char *token);
> 
> is the natural thing to do.
> 
> Apply this change and adapt all users.
> 
> Cc: konrad.w...@oracle.com
> Cc: roger@citrix.com
> Cc: wei.l...@citrix.com
> Cc: paul.durr...@citrix.com
> Cc: netdev@vger.kernel.org
> 
> Signed-off-by: Juergen Gross <jgr...@suse.com>

xen-netback changes...

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  drivers/block/xen-blkback/xenbus.c |  6 +++---
>  drivers/net/xen-netback/xenbus.c   |  8 
>  drivers/xen/cpu_hotplug.c  |  5 ++---
>  drivers/xen/manage.c   |  6 +++---
>  drivers/xen/xen-balloon.c  |  2 +-
>  drivers/xen/xen-pciback/xenbus.c   |  2 +-
>  drivers/xen/xenbus/xenbus.h|  6 +++---
>  drivers/xen/xenbus/xenbus_client.c |  4 ++--
>  drivers/xen/xenbus/xenbus_dev_frontend.c   | 21 -
>  drivers/xen/xenbus/xenbus_probe.c  | 11 ---
>  drivers/xen/xenbus/xenbus_probe_backend.c  |  8 
>  drivers/xen/xenbus/xenbus_probe_frontend.c | 14 +++---
>  drivers/xen/xenbus/xenbus_xs.c | 29 ++---
>  include/xen/xenbus.h   |  6 +++---
>  14 files changed, 59 insertions(+), 69 deletions(-)
> 
> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-
> blkback/xenbus.c
> index 415e79b..8fe61b5 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -38,8 +38,8 @@ struct backend_info {
>  static struct kmem_cache *xen_blkif_cachep;
>  static void connect(struct backend_info *);
>  static int connect_ring(struct backend_info *);
> -static void backend_changed(struct xenbus_watch *, const char **,
> - unsigned int);
> +static void backend_changed(struct xenbus_watch *, const char *,
> + const char *);
>  static void xen_blkif_free(struct xen_blkif *blkif);
>  static void xen_vbd_free(struct xen_vbd *vbd);
> 
> @@ -661,7 +661,7 @@ static int xen_blkbk_probe(struct xenbus_device
> *dev,
>   * ready, connect.
>   */
>  static void backend_changed(struct xenbus_watch *watch,
> - const char **vec, unsigned int len)
> + const char *path, const char *token)
>  {
>   int err;
>   unsigned major;
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-
> netback/xenbus.c
> index 3124eae..d8a40fa 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -723,7 +723,7 @@ static int xen_net_read_mac(struct xenbus_device
> *dev, u8 mac[])
>  }
> 
>  static void xen_net_rate_changed(struct xenbus_watch *watch,
> - const char **vec, unsigned int len)
> +  const char *path, const char *token)
>  {
>   struct xenvif *vif = container_of(watch, struct xenvif, credit_watch);
>   struct xenbus_device *dev = xenvif_to_xenbus_device(vif);
> @@ -780,7 +780,7 @@ static void xen_unregister_credit_watch(struct xenvif
> *vif)
>  }
> 
>  static void xen_mcast_ctrl_changed(struct xenbus_watch *watch,
> -const char **vec, unsigned int len)
> +const char *path, const char *token)
>  {
>   struct xenvif *vif = container_of(watch, struct xenvif,
> mcast_ctrl_watch);
> @@ -855,8 +855,8 @@ static void unregister_hotplug_status_watch(struct
> backend_info *be)
>  }
> 
>  static void hotplug_status_changed(struct xenbus_watch *watch,
> -const char **vec,
> -unsigned int vec_size)
> +const char *path,
> +
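
The prototype change described in the commit message can be illustrated with a small userspace model: the core unpacks the two-element vector once, and each callback receives the path and token directly. Everything below is a hypothetical sketch (struct model_watch, model_dispatch() and model_record_event() are invented names); the index constants are assumed to mirror the path-first, token-second layout of a watch event.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout of a watch event vector: path first, token second. */
enum {
	MODEL_WATCH_PATH = 0,
	MODEL_WATCH_TOKEN = 1
};

struct model_watch;

/* New-style callback: path and token are passed as separate strings. */
typedef void (*model_watch_cb)(struct model_watch *watch,
			       const char *path, const char *token);

struct model_watch {
	const char *node;        /* path being watched */
	model_watch_cb callback; /* invoked once per event */
	const char *last_path;   /* recorded by model_record_event() */
	const char *last_token;
};

/*
 * Core-side dispatch: unpack the vector in one place so that no callback
 * needs to index into it. Returns 0 on success, -1 on a short vector.
 */
int model_dispatch(struct model_watch *watch, const char **vec,
		   unsigned int len)
{
	assert(watch != NULL && watch->callback != NULL);

	if (vec == NULL || len < 2)
		return -1;

	watch->callback(watch, vec[MODEL_WATCH_PATH], vec[MODEL_WATCH_TOKEN]);
	return 0;
}

/* Example callback: remember what was delivered. */
void model_record_event(struct model_watch *watch,
			const char *path, const char *token)
{
	watch->last_path = path;
	watch->last_token = token;
}
```

The callbacks no longer need to know the vector layout or check its length; that knowledge lives in the single dispatch function.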

RE: [Xen PATCH] xen-netback: fix error handling output

2016-11-08 Thread Paul Durrant

> -----Original Message-----
> From: Arnd Bergmann [mailto:a...@arndb.de]
> Sent: 08 November 2016 13:35
> To: David Vrabel <david.vra...@citrix.com>
> Cc: Arnd Bergmann <a...@arndb.de>; Wei Liu <wei.l...@citrix.com>; Paul
> Durrant <paul.durr...@citrix.com>; David S. Miller
> <da...@davemloft.net>; Juergen Gross <jgr...@suse.com>; Filipe Manco
> <filipe.ma...@neclab.eu>; xen-de...@lists.xenproject.org;
> netdev@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: [Xen PATCH] xen-netback: fix error handling output
> 
> The connect function prints an uninitialized error code after an
> earlier initialization was removed:
> 
> drivers/net/xen-netback/xenbus.c: In function 'connect':
> drivers/net/xen-netback/xenbus.c:938:3: error: 'err' may be used
> uninitialized in this function [-Werror=maybe-uninitialized]
> 
> This prints it as -EINVAL instead, which seems to be the most
> appropriate error code. Before the patch that caused the warning,
> this would print a positive number returned by vsscanf() instead,
> which is also wrong. We probably don't need a backport though,
> as fixing the warning here should be sufficient.
> 
> Fixes: f95842e7a9f2 ("xen: make use of xenbus_read_unsigned() in xen-
> netback")
> Fixes: 8d3d53b3e433 ("xen-netback: Add support for multiple queues")
> Signed-off-by: Arnd Bergmann <a...@arndb.de>

Yes, I'd say EINVAL was most appropriate.

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  drivers/net/xen-netback/xenbus.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-
> netback/xenbus.c
> index 7356e00fac54..bfed79877b8a 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -935,7 +935,7 @@ static void connect(struct backend_info *be)
>   "multi-queue-num-queues", 1);
>   if (requested_num_queues > xenvif_max_queues) {
>   /* buggy or malicious guest */
> - xenbus_dev_fatal(dev, err,
> + xenbus_dev_fatal(dev, -EINVAL,
>"guest requested %u queues, exceeding the
> maximum of %u.",
>requested_num_queues,
> xenvif_max_queues);
>   return;
> --
> 2.9.0

RE: [PATCH v3] xen-netback: prefer xenbus_scanf() over xenbus_gather()

2016-11-08 Thread Paul Durrant

> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 08 November 2016 07:46
> To: Paul Durrant <paul.durr...@citrix.com>; Wei Liu <wei.l...@citrix.com>
> Cc: xen-devel <xen-de...@lists.xenproject.org>; netdev@vger.kernel.org
> Subject: [PATCH v3] xen-netback: prefer xenbus_scanf() over
> xenbus_gather()
> 
> For single items being collected this should be preferred as being more
> typesafe (as the compiler can check format string and to-be-written-to
> variable match) and more efficient (requiring one less parameter to be
> passed).
> 
> Signed-off-by: Jan Beulich <jbeul...@suse.com>

LGTM

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
> v3: For consistency with other code don't consider zero an error
> (utilizing that xenbus_scanf() at present won't return zero).
> v2: Avoid commit message to continue from subject.
> ---
>  drivers/net/xen-netback/xenbus.c |   12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> --- 4.9-rc4/drivers/net/xen-netback/xenbus.c
> +++ 4.9-rc4-xen-netback-prefer-xenbus_scanf/drivers/net/xen-
> netback/xenbus.c
> @@ -889,16 +889,16 @@ static int connect_ctrl_ring(struct back
>   unsigned int evtchn;
>   int err;
> 
> - err = xenbus_gather(XBT_NIL, dev->otherend,
> - "ctrl-ring-ref", "%u", &val, NULL);
> - if (err)
> + err = xenbus_scanf(XBT_NIL, dev->otherend,
> +"ctrl-ring-ref", "%u", &val);
> + if (err < 0)
>   goto done; /* The frontend does not have a control ring */
> 
>   ring_ref = val;
> 
> - err = xenbus_gather(XBT_NIL, dev->otherend,
> - "event-channel-ctrl", "%u", &val, NULL);
> - if (err) {
> + err = xenbus_scanf(XBT_NIL, dev->otherend,
> +"event-channel-ctrl", "%u", &val);
> + if (err < 0) {
>   xenbus_dev_fatal(dev, err,
>"reading %s/event-channel-ctrl",
>dev->otherend);
> 
>

RE: [PATCH 06/12] xen: make use of xenbus_read_unsigned() in xen-netback

2016-11-01 Thread Paul Durrant

> -----Original Message-----
> From: Juergen Gross [mailto:jgr...@suse.com]
> Sent: 31 October 2016 16:48
> To: linux-ker...@vger.kernel.org; xen-de...@lists.xen.org
> Cc: David Vrabel <david.vra...@citrix.com>; boris.ostrov...@oracle.com;
> Juergen Gross <jgr...@suse.com>; Wei Liu <wei.l...@citrix.com>; Paul
> Durrant <paul.durr...@citrix.com>; netdev@vger.kernel.org
> Subject: [PATCH 06/12] xen: make use of xenbus_read_unsigned() in xen-
> netback
> 
> Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
> This requires to change the type of some reads from int to unsigned,
> but these cases have been wrong before: negative values are not allowed
> for the modified cases.
> 
> Cc: wei.l...@citrix.com
> Cc: paul.durr...@citrix.com

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> Cc: netdev@vger.kernel.org
> 
> Signed-off-by: Juergen Gross <jgr...@suse.com>
> ---
>  drivers/net/xen-netback/xenbus.c | 50 +++---
> --
>  1 file changed, 14 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-
> netback/xenbus.c
> index 8674e18..7356e00 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -785,12 +785,9 @@ static void xen_mcast_ctrl_changed(struct
> xenbus_watch *watch,
>   struct xenvif *vif = container_of(watch, struct xenvif,
> mcast_ctrl_watch);
>   struct xenbus_device *dev = xenvif_to_xenbus_device(vif);
> - int val;
> 
> - if (xenbus_scanf(XBT_NIL, dev->otherend,
> -  "request-multicast-control", "%d", &val) < 0)
> - val = 0;
> - vif->multicast_control = !!val;
> + vif->multicast_control = !!xenbus_read_unsigned(dev->otherend,
> + "request-multicast-control", 0);
>  }
> 
>  static int xen_register_mcast_ctrl_watch(struct xenbus_device *dev,
> @@ -934,12 +931,9 @@ static void connect(struct backend_info *be)
>   /* Check whether the frontend requested multiple queues
>* and read the number requested.
>*/
> - err = xenbus_scanf(XBT_NIL, dev->otherend,
> -"multi-queue-num-queues",
> -"%u", &requested_num_queues);
> - if (err < 0) {
> - requested_num_queues = 1; /* Fall back to single queue */
> - } else if (requested_num_queues > xenvif_max_queues) {
> + requested_num_queues = xenbus_read_unsigned(dev->otherend,
> + "multi-queue-num-queues", 1);
> + if (requested_num_queues > xenvif_max_queues) {
>   /* buggy or malicious guest */
>   xenbus_dev_fatal(dev, err,
>"guest requested %u queues, exceeding the
> maximum of %u.",
> @@ -1134,7 +1128,7 @@ static int read_xenbus_vif_flags(struct
> backend_info *be)
>   struct xenvif *vif = be->vif;
>   struct xenbus_device *dev = be->dev;
>   unsigned int rx_copy;
> - int err, val;
> + int err;
> 
>   err = xenbus_scanf(XBT_NIL, dev->otherend, "request-rx-copy",
> "%u",
>  &rx_copy);
> @@ -1150,10 +1144,7 @@ static int read_xenbus_vif_flags(struct
> backend_info *be)
>   if (!rx_copy)
>   return -EOPNOTSUPP;
> 
> - if (xenbus_scanf(XBT_NIL, dev->otherend,
> -  "feature-rx-notify", "%d", &val) < 0)
> - val = 0;
> - if (!val) {
> + if (!xenbus_read_unsigned(dev->otherend, "feature-rx-notify", 0)) {
>   /* - Reduce drain timeout to poll more frequently for
>*   Rx requests.
>* - Disable Rx stall detection.
> @@ -1162,34 +1153,21 @@ static int read_xenbus_vif_flags(struct
> backend_info *be)
>   be->vif->stall_timeout = 0;
>   }
> 
> - if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-sg",
> -  "%d", &val) < 0)
> - val = 0;
> - vif->can_sg = !!val;
> + vif->can_sg = !!xenbus_read_unsigned(dev->otherend, "feature-
> sg", 0);
> 
>   vif->gso_mask = 0;
> 
> - if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4",
> -  "%d", &val) < 0)
> - val = 0;
> - if (val)
> + if (xenbus_read_unsigned(dev->otherend, "feature-gso-tcpv4", 0))
>   vif->gso_mask |= GSO_BIT
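
The read-with-default pattern that this patch switches to can be sketched in userspace C. The code below is a model only: struct model_node and model_read_unsigned() are hypothetical, and a small in-memory table stands in for the store. It illustrates why the callers shrink to one line: a missing or unparsable node simply yields the supplied default.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical key/value entry standing in for one store node. */
struct model_node {
	const char *name;
	const char *value;
};

/*
 * Look up `name` and parse its value as an unsigned decimal number.
 * Returns default_val if the node is missing or is not a valid number.
 */
unsigned int model_read_unsigned(const struct model_node *nodes, size_t count,
				 const char *name, unsigned int default_val)
{
	size_t i;

	assert(name != NULL);
	assert(nodes != NULL || count == 0);

	for (i = 0; i < count; i++) {
		const char *text = nodes[i].value;
		unsigned long parsed;
		char *end;

		if (strcmp(nodes[i].name, name) != 0)
			continue;

		/* Reject signs and leading whitespace: digits only. */
		if (text[0] < '0' || text[0] > '9')
			return default_val;

		errno = 0;
		parsed = strtoul(text, &end, 10);
		if (errno != 0 || *end != '\0' || parsed > UINT_MAX)
			return default_val;

		return (unsigned int)parsed;
	}

	return default_val; /* node not present */
}
```

A caller can then write, for example, `can_sg = !!model_read_unsigned(nodes, n, "feature-sg", 0);` without a temporary variable or an explicit error branch.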

RE: [PATCH v2 RESEND] xen-netback: prefer xenbus_scanf() over xenbus_gather()

2016-10-25 Thread Paul Durrant

> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 25 October 2016 09:23
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: David Vrabel <david.vra...@citrix.com>; Wei Liu <wei.l...@citrix.com>;
> xen-de...@lists.xenproject.org; boris.ostrov...@oracle.com; Juergen Gross
> <jgr...@suse.com>; netdev@vger.kernel.org
> Subject: RE: [PATCH v2 RESEND] xen-netback: prefer xenbus_scanf() over
> xenbus_gather()
> 
> >>> On 25.10.16 at 09:52, <paul.durr...@citrix.com> wrote:
> >> From: Jan Beulich [mailto:jbeul...@suse.com]
> >> Sent: 24 October 2016 16:08
> >> --- 4.9-rc2/drivers/net/xen-netback/xenbus.c
> >> +++ 4.9-rc2-xen-netback-prefer-xenbus_scanf/drivers/net/xen-
> netback/xenbus.c
> >> @@ -889,16 +889,16 @@ static int connect_ctrl_ring(struct back
> >>unsigned int evtchn;
> >>int err;
> >>
> >> -  err = xenbus_gather(XBT_NIL, dev->otherend,
> >> -  "ctrl-ring-ref", "%u", &val, NULL);
> >> -  if (err)
> >> +  err = xenbus_scanf(XBT_NIL, dev->otherend,
> >> + "ctrl-ring-ref", "%u", &val);
> >> +  if (err <= 0)
> >
> > Looking at other uses of xenbus_scanf() in the same code I think the check
> > here should be if (err < 0). It's a nit, since xenbus_scanf() cannot return 
> > 0,
> > but it would be better for consistency I think.
> 
> Hmm, this goes back to the discussion following from
> https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg00678.html
> which in fact you had given your R-b back then. I continue to be
> of the opinion that callers should not leverage the fact that
> xenbus_scanf() can't return zero. They instead should check for
> an explicit success indicator (which only positive values are). But
> you're the maintainer of the code, so if you now think the same
> way David does, I guess I'll have to make the adjustment.
> 
> >>goto done; /* The frontend does not have a control ring */
> >>
> >>ring_ref = val;
> >>
> >> -  err = xenbus_gather(XBT_NIL, dev->otherend,
> >> -  "event-channel-ctrl", "%u", &val, NULL);
> >> -  if (err) {
> >> +  err = xenbus_scanf(XBT_NIL, dev->otherend,
> >> + "event-channel-ctrl", "%u", &val);
> >> +  if (err <= 0) {
> >>xenbus_dev_fatal(dev, err,
> >> "reading %s/event-channel-ctrl",
> >> dev->otherend);
> >> @@ -919,7 +919,7 @@ done:
> >>return 0;
> >>
> >>  fail:
> >> -  return err;
> >> +  return err ?: -ENODATA;
> >
> > I don't think you need this.
> 
> If the other change gets made, then indeed this isn't needed.

Yes, and that's why I prefer to opt for consistency with other code in this 
case.

  Paul
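
The return-value convention under discussion (a scanf-style helper that never returns 0, so that "err < 0" is a sufficient failure check) can be modelled in userspace C. model_scanf_uint() is a hypothetical helper written for this sketch; the particular error codes are assumptions chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Parse one unsigned int from `text`.
 * Returns the number of converted items (always 1 on success) or a
 * negative errno value. It never returns 0: "nothing converted" is
 * reported as an error instead.
 */
int model_scanf_uint(const char *text, unsigned int *out)
{
	int ret;

	assert(out != NULL);

	if (text == NULL)
		return -ENOENT; /* models a node that does not exist */

	ret = sscanf(text, "%u", out);
	if (ret <= 0)
		return -ERANGE; /* no conversion: distinct error, not 0 */

	return ret;
}
```

Because zero is never returned, a caller testing "err < 0" and a caller testing "err <= 0" behave identically, which is why the choice between them comes down to consistency with the surrounding code.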

> 
> Jan

RE: [PATCH v2 RESEND] xen-netback: prefer xenbus_scanf() over xenbus_gather()

2016-10-25 Thread Paul Durrant

> -----Original Message-----
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 24 October 2016 16:08
> To: Paul Durrant <paul.durr...@citrix.com>; Wei Liu <wei.l...@citrix.com>
> Cc: David Vrabel <david.vra...@citrix.com>; xen-de...@lists.xenproject.org;
> boris.ostrov...@oracle.com; Juergen Gross <jgr...@suse.com>;
> netdev@vger.kernel.org
> Subject: [PATCH v2 RESEND] xen-netback: prefer xenbus_scanf() over
> xenbus_gather()
> 
> For single items being collected this should be preferred as being more
> typesafe (as the compiler can check format string and to-be-written-to
> variable match) and more efficient (requiring one less parameter to be
> passed).
> 
> Signed-off-by: Jan Beulich <jbeul...@suse.com>
> ---
> v2: Avoid commit message to continue from subject.
> ---
>  drivers/net/xen-netback/xenbus.c |   14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> --- 4.9-rc2/drivers/net/xen-netback/xenbus.c
> +++ 4.9-rc2-xen-netback-prefer-xenbus_scanf/drivers/net/xen-
> netback/xenbus.c
> @@ -889,16 +889,16 @@ static int connect_ctrl_ring(struct back
>   unsigned int evtchn;
>   int err;
> 
> - err = xenbus_gather(XBT_NIL, dev->otherend,
> - "ctrl-ring-ref", "%u", &val, NULL);
> - if (err)
> + err = xenbus_scanf(XBT_NIL, dev->otherend,
> +"ctrl-ring-ref", "%u", &val);
> + if (err <= 0)

Looking at other uses of xenbus_scanf() in the same code I think the check here 
should be if (err < 0). It's a nit, since xenbus_scanf() cannot return 0, but 
it would be better for consistency I think.

>   goto done; /* The frontend does not have a control ring */
> 
>   ring_ref = val;
> 
> - err = xenbus_gather(XBT_NIL, dev->otherend,
> - "event-channel-ctrl", "%u", &val, NULL);
> - if (err) {
> + err = xenbus_scanf(XBT_NIL, dev->otherend,
> +"event-channel-ctrl", "%u", &val);
> + if (err <= 0) {
>   xenbus_dev_fatal(dev, err,
>"reading %s/event-channel-ctrl",
>dev->otherend);
> @@ -919,7 +919,7 @@ done:
>   return 0;
> 
>  fail:
> - return err;
> + return err ?: -ENODATA;

I don't think you need this.

  Paul

>  }
> 
>  static void connect(struct backend_info *be)
> 
>
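
The review point about "err < 0" versus "err <= 0" can be illustrated with a small userspace model. This is only a sketch, not the kernel function: model_xenbus_scanf() and read_ring_ref() are hypothetical names, and the mapping of "zero items parsed" to -ERANGE is an assumption about how a scanf-style helper avoids ever returning 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace model (hypothetical, not the kernel function): parse an
 * already-read xenstore value the way a scanf-style helper would.
 * Assumption: zero conversions are reported as -ERANGE, so the result
 * is either > 0 (items parsed) or < 0 (errno value), never 0.
 */
static int model_xenbus_scanf(const char *value, const char *fmt, ...)
{
	va_list ap;
	int ret;

	assert(fmt != NULL);
	if (!value)
		return -ENOENT;		/* node missing */

	va_start(ap, fmt);
	ret = vsscanf(value, fmt, ap);
	va_end(ap);

	if (ret <= 0)			/* no conversions, or EOF on empty input */
		return -ERANGE;
	return ret;
}

/* Caller pattern: "err < 0" is sufficient given the convention above. */
static int read_ring_ref(const char *value, unsigned int *ring_ref)
{
	unsigned int val;
	int err = model_xenbus_scanf(value, "%u", &val);

	if (err < 0)
		return err;
	*ring_ref = val;
	return 0;
}
```

With that convention a caller never sees 0, so a fallback such as "err ?: -ENODATA" on the failure path has nothing to catch.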

RE: [PATCH] xen-netback: fix type mismatch warning

2016-10-12 Thread Paul Durrant

> -Original Message-
> From: Arnd Bergmann [mailto:a...@arndb.de]
> Sent: 12 October 2016 10:54
> To: Wei Liu <wei.l...@citrix.com>; Paul Durrant <paul.durr...@citrix.com>
> Cc: Arnd Bergmann <a...@arndb.de>; David S. Miller
> <da...@davemloft.net>; David Vrabel <david.vra...@citrix.com>; xen-
> de...@lists.xenproject.org; netdev@vger.kernel.org; linux-
> ker...@vger.kernel.org
> Subject: [PATCH] xen-netback: fix type mismatch warning
> 
> With the latest rework of the xen-netback driver, we get a warning
> on ARM about the types passed into min():
> 
> drivers/net/xen-netback/rx.c: In function 'xenvif_rx_next_chunk':
> include/linux/kernel.h:739:16: error: comparison of distinct pointer types
> lacks a cast [-Werror]
> 
> The reason is that XEN_PAGE_SIZE is not size_t here. There
> is no actual bug, and we can easily avoid the warning using the
> min_t() macro instead of min().
> 
> Fixes: eb1723a29b9a ("xen-netback: refactor guest rx")
> Signed-off-by: Arnd Bergmann <a...@arndb.de>

LGTM

Acked-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  drivers/net/xen-netback/rx.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
> index 8e9ade6ccf18..aeb150258c6c 100644
> --- a/drivers/net/xen-netback/rx.c
> +++ b/drivers/net/xen-netback/rx.c
> @@ -337,9 +337,9 @@ static void xenvif_rx_next_chunk(struct
> xenvif_queue *queue,
>   frag_data += pkt->frag_offset;
>   frag_len -= pkt->frag_offset;
> 
> - chunk_len = min(frag_len, XEN_PAGE_SIZE - offset);
> - chunk_len = min(chunk_len,
> - XEN_PAGE_SIZE -
>   xen_offset_in_page(frag_data));
> + chunk_len = min_t(size_t, frag_len, XEN_PAGE_SIZE - offset);
> + chunk_len = min_t(size_t, chunk_len, XEN_PAGE_SIZE -
> +  xen_offset_in_page(frag_data));
> 
>   pkt->frag_offset += chunk_len;
> 
> --
> 2.9.0
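
A minimal userspace sketch of the chunk-length calculation follows, for readers who want to see why forcing a common type is harmless here. MODEL_PAGE_SIZE and MODEL_MIN_T are hypothetical stand-ins for XEN_PAGE_SIZE and min_t(); the stand-in macro is simplified and, unlike the kernel one, evaluates its arguments twice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for XEN_PAGE_SIZE; deliberately unsigned long,
 * which is a different type from size_t on some 32-bit targets. */
#define MODEL_PAGE_SIZE 4096UL

/* Simplified min_t(): cast both operands to one type, then compare.
 * Unlike the kernel macro, this version evaluates its arguments twice. */
#define MODEL_MIN_T(type, a, b) \
	((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Largest chunk that crosses neither the destination page boundary
 * (offset) nor the source page boundary (src_page_off). */
static size_t model_chunk_len(size_t frag_len, size_t offset,
			      size_t src_page_off)
{
	size_t chunk_len;

	assert(offset < MODEL_PAGE_SIZE && src_page_off < MODEL_PAGE_SIZE);

	chunk_len = MODEL_MIN_T(size_t, frag_len, MODEL_PAGE_SIZE - offset);
	chunk_len = MODEL_MIN_T(size_t, chunk_len,
				MODEL_PAGE_SIZE - src_page_off);
	return chunk_len;
}
```

Both operands are bounded by the page size, so the cast to size_t cannot truncate; that is consistent with the commit message's statement that there is no actual bug.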

RE: [PATCHv1 net] xen-netback: fix guest Rx stall detection (after guest Rx refactor)

2016-10-11 Thread Paul Durrant

> -Original Message-
> From: David Vrabel [mailto:david.vra...@citrix.com]
> Sent: 11 October 2016 16:48
> To: netdev@vger.kernel.org
> Cc: David Vrabel <david.vra...@citrix.com>; xen-de...@lists.xenproject.org;
> Paul Durrant <paul.durr...@citrix.com>; Wei Liu <wei.l...@citrix.com>
> Subject: [PATCHv1 net] xen-netback: fix guest Rx stall detection (after guest
> Rx refactor)
> 
> If a VIF has been ready for rx_stall_timeout (60s by default) and an
> Rx ring is drained of all requests an Rx stall will be incorrectly
> detected.  When this occurs and the guest Rx queue is empty, the Rx
> ring's event index will not be set and the frontend will not raise an
> event when new requests are placed on the ring, permanently stalling
> the VIF.
> 
> This is a regression introduced by eb1723a29b9a7 (xen-netback:
> refactor guest rx).
> 
> Fix this by reinstating the setting of queue->last_rx_time when
> placing a packet onto the guest Rx ring.
> 
> Signed-off-by: David Vrabel <david.vra...@citrix.com>
Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  drivers/net/xen-netback/rx.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
> index 8e9ade6..d69f2a9 100644
> --- a/drivers/net/xen-netback/rx.c
> +++ b/drivers/net/xen-netback/rx.c
> @@ -425,6 +425,8 @@ void xenvif_rx_skb(struct xenvif_queue *queue)
> 
>   xenvif_rx_next_skb(queue, );
> 
> + queue->last_rx_time = jiffies;
> +
>   do {
>   struct xen_netif_rx_request *req;
>   struct xen_netif_rx_response *rsp;
> --
> 2.1.4
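
The false stall described above can be reproduced in a small userspace model. This is a sketch under stated assumptions: struct model_queue and its fields are hypothetical, and the stall condition (ring empty and nothing delivered for longer than the timeout) is a simplified reading of the driver's check, not a copy of it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-queue state involved in stall detection. */
struct model_queue {
	unsigned long last_rx_time;	/* tick of last packet placed on the ring */
	unsigned long stall_timeout;	/* ticks of silence before declaring a stall */
	unsigned int req_prod;		/* requests posted by the frontend */
	unsigned int req_cons;		/* requests consumed by the backend */
};

/* A stall is declared when no requests are outstanding and nothing has
 * been delivered for longer than stall_timeout (wrap-safe comparison). */
static bool model_rx_stalled(const struct model_queue *q, unsigned long now)
{
	bool ring_empty = (q->req_prod - q->req_cons) < 1;
	bool timed_out = (long)(now - (q->last_rx_time + q->stall_timeout)) > 0;

	return ring_empty && timed_out;
}

/* Deliver one packet; 'stamp' selects whether last_rx_time is refreshed,
 * i.e. the behaviour with and without the fix. */
static void model_rx_skb(struct model_queue *q, unsigned long now, bool stamp)
{
	assert(q->req_prod != q->req_cons);	/* need a request to consume */
	q->req_cons++;
	if (stamp)
		q->last_rx_time = now;
}
```

Without the timestamp update, a queue that has just delivered a packet and drained the ring looks identical to one that has been idle since last_rx_time was last written.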

[PATCH net] xen-netback: (re-)create a debugfs node for hash information

2016-10-10 Thread Paul Durrant

From: Paul Durrant <paul.durr...@citrix.com>

It is useful to be able to see the hash configuration when running tests.
This patch adds a debugfs node for that purpose.

The original version of this patch (commit c0c64c152389) was reverted due
to build failures caused by a conflict with commit 0364a8824c02
("xen-netback: switch to threaded irq for control ring"). This new version
of the patch is nearly identical to the original, the only difference
being that creation of the debugfs node is predicated on 'ctrl_irq' being
non-zero rather than the now non-existent 'ctrl_task'.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
Cc: David S. Miller <da...@davemloft.net>
---
 drivers/net/xen-netback/common.h |  4 +++
 drivers/net/xen-netback/hash.c   | 68 
 drivers/net/xen-netback/xenbus.c | 37 --
 3 files changed, 107 insertions(+), 2 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index cf68149..3ce1f7d 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -407,4 +407,8 @@ u32 xenvif_set_hash_mapping(struct xenvif *vif, u32 gref, 
u32 len,
 
 void xenvif_set_skb_hash(struct xenvif *vif, struct sk_buff *skb);
 
+#ifdef CONFIG_DEBUG_FS
+void xenvif_dump_hash_info(struct xenvif *vif, struct seq_file *m);
+#endif
+
 #endif /* __XEN_NETBACK__COMMON_H__ */
diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-netback/hash.c
index 613bac0..e8c5ddd 100644
--- a/drivers/net/xen-netback/hash.c
+++ b/drivers/net/xen-netback/hash.c
@@ -360,6 +360,74 @@ u32 xenvif_set_hash_mapping(struct xenvif *vif, u32 gref, 
u32 len,
return XEN_NETIF_CTRL_STATUS_SUCCESS;
 }
 
+#ifdef CONFIG_DEBUG_FS
+void xenvif_dump_hash_info(struct xenvif *vif, struct seq_file *m)
+{
+   unsigned int i;
+
+   switch (vif->hash.alg) {
+   case XEN_NETIF_CTRL_HASH_ALGORITHM_TOEPLITZ:
+   seq_puts(m, "Hash Algorithm: TOEPLITZ\n");
+   break;
+
+   case XEN_NETIF_CTRL_HASH_ALGORITHM_NONE:
+   seq_puts(m, "Hash Algorithm: NONE\n");
+   /* FALLTHRU */
+   default:
+   return;
+   }
+
+   if (vif->hash.flags) {
+   seq_puts(m, "\nHash Flags:\n");
+
+   if (vif->hash.flags & XEN_NETIF_CTRL_HASH_TYPE_IPV4)
+   seq_puts(m, "- IPv4\n");
+   if (vif->hash.flags & XEN_NETIF_CTRL_HASH_TYPE_IPV4_TCP)
+   seq_puts(m, "- IPv4 + TCP\n");
+   if (vif->hash.flags & XEN_NETIF_CTRL_HASH_TYPE_IPV6)
+   seq_puts(m, "- IPv6\n");
+   if (vif->hash.flags & XEN_NETIF_CTRL_HASH_TYPE_IPV6_TCP)
+   seq_puts(m, "- IPv6 + TCP\n");
+   }
+
+   seq_puts(m, "\nHash Key:\n");
+
+   for (i = 0; i < XEN_NETBK_MAX_HASH_KEY_SIZE; ) {
+   unsigned int j, n;
+
+   n = 8;
+   if (i + n >= XEN_NETBK_MAX_HASH_KEY_SIZE)
+   n = XEN_NETBK_MAX_HASH_KEY_SIZE - i;
+
+   seq_printf(m, "[%2u - %2u]: ", i, i + n - 1);
+
+   for (j = 0; j < n; j++, i++)
+   seq_printf(m, "%02x ", vif->hash.key[i]);
+
+   seq_puts(m, "\n");
+   }
+
+   if (vif->hash.size != 0) {
+   seq_puts(m, "\nHash Mapping:\n");
+
+   for (i = 0; i < vif->hash.size; ) {
+   unsigned int j, n;
+
+   n = 8;
+   if (i + n >= vif->hash.size)
+   n = vif->hash.size - i;
+
+   seq_printf(m, "[%4u - %4u]: ", i, i + n - 1);
+
+   for (j = 0; j < n; j++, i++)
+   seq_printf(m, "%4u ", vif->hash.mapping[i]);
+
+   seq_puts(m, "\n");
+   }
+   }
+}
+#endif /* CONFIG_DEBUG_FS */
+
 void xenvif_init_hash(struct xenvif *vif)
 {
if (xenvif_hash_cache_size == 0)
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 7056404..8674e18 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -165,7 +165,7 @@ xenvif_write_io_ring(struct file *filp, const char __user 
*buf, size_t count,
return count;
 }
 
-static int xenvif_dump_open(struct inode *inode, struct file *filp)
+static int xenvif_io_ring_open(struct inode *inode, struct file *filp)
 {
int ret;
void *queue = NULL;
@@ -179,13 +179,35 @@ static int xenvif_dump_open(struct inode *inode, struct 
file *filp)
 
 static const struct file_operations xenvif_dbg_io_ring_ops_fops = {
.owner =

[PATCH net-next] MAINTAINERS: add myself as a maintainer of xen-netback

2016-10-07 Thread Paul Durrant

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 464437d..4491841 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13061,6 +13061,7 @@ F:  arch/arm64/include/asm/xen/
 
 XEN NETWORK BACKEND DRIVER
 M: Wei Liu <wei.l...@citrix.com>
+M: Paul Durrant <paul.durr...@citrix.com>
 L: xen-de...@lists.xenproject.org (moderated for non-subscribers)
 L: netdev@vger.kernel.org
 S: Supported
-- 
2.1.4

Reversion of "xen-netback: create a debugfs node for hash information"

2016-10-07 Thread Paul Durrant

Dave,

  I notice that you have made the above reversion of commit c0c64c15 (debugfs 
node) due to a build failure, despite the failure being caused by commit 
0364a882 (switch to threaded irq) which was made subsequently. I assume you 
want me to re-spin a new patch for the debugfs node to fix the build problem?

  Cheers,

Paul

[PATCH v2 net] xen-netback: make sure that hashes are not send to unaware frontends

2016-10-07 Thread Paul Durrant

In the case when a frontend only negotiates a single queue with xen-
netback it is possible for a skbuff with a s/w hash to result in a
hash extra_info segment being sent to the frontend even when no hash
algorithm has been configured. (The ndo_select_queue() entry point makes
sure the hash is not set if no algorithm is configured, but this entry
point is not called when there is only a single queue). This can result
in a frontend that is unable to handle extra_info segments being given
such a segment, causing it to crash.

This patch fixes the problem by clearing the hash in ndo_start_xmit()
instead, which is clearly guaranteed to be called irrespective of the
number of queues.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---

v2:
 - Simplified and re-based onto re-factored net branch
---
 drivers/net/xen-netback/interface.c | 20 +---
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c 
b/drivers/net/xen-netback/interface.c
index 4af532a..74dc2bf 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -149,17 +149,8 @@ static u16 xenvif_select_queue(struct net_device *dev, 
struct sk_buff *skb,
struct xenvif *vif = netdev_priv(dev);
unsigned int size = vif->hash.size;
 
-   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE) {
-   u16 index = fallback(dev, skb) % dev->real_num_tx_queues;
-
-   /* Make sure there is no hash information in the socket
-* buffer otherwise it would be incorrectly forwarded
-* to the frontend.
-*/
-   skb_clear_hash(skb);
-
-   return index;
-   }
+   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE)
+   return fallback(dev, skb) % dev->real_num_tx_queues;
 
xenvif_set_skb_hash(vif, skb);
 
@@ -208,6 +199,13 @@ static int xenvif_start_xmit(struct sk_buff *skb, struct 
net_device *dev)
cb = XENVIF_RX_CB(skb);
cb->expires = jiffies + vif->drain_timeout;
 
+   /* If there is no hash algorithm configured then make sure there
+* is no hash information in the socket buffer otherwise it
+* would be incorrectly forwarded to the frontend.
+*/
+   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE)
+   skb_clear_hash(skb);
+
xenvif_rx_queue_tail(queue, skb);
xenvif_kick_thread(queue);
 
-- 
2.1.4
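
The reasoning behind moving the clear into the transmit entry point can be sketched in userspace. All names below (model_vif, model_skb, model_stack_transmit and so on) are hypothetical; the only behaviour taken from the description above is that queue selection is skipped when there is a single queue, while the transmit entry point runs for every packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum model_hash_alg { MODEL_HASH_NONE, MODEL_HASH_TOEPLITZ };

struct model_skb {
	bool sw_hash;		/* a software hash is attached */
	uint32_t hash;
};

struct model_vif {
	enum model_hash_alg alg;
	unsigned int num_queues;
};

/* Queue selection: only reached when more than one queue exists. */
static unsigned int model_select_queue(const struct model_vif *vif,
				       const struct model_skb *skb)
{
	return skb->hash % vif->num_queues;
}

/* Transmit entry point: reached for every packet, so the hash is
 * cleared here when no algorithm has been configured. */
static void model_start_xmit(const struct model_vif *vif,
			     struct model_skb *skb)
{
	if (vif->alg == MODEL_HASH_NONE) {
		skb->sw_hash = false;
		skb->hash = 0;
	}
}

/* Stand-in for the network stack's transmit path. */
static unsigned int model_stack_transmit(const struct model_vif *vif,
					 struct model_skb *skb)
{
	unsigned int index = 0;

	assert(vif->num_queues >= 1);
	if (vif->num_queues > 1)
		index = model_select_queue(vif, skb);
	model_start_xmit(vif, skb);
	return index;
}

/* Would a hash extra_info segment be emitted for this packet? */
static bool model_sends_hash_extra(const struct model_skb *skb)
{
	return skb->sw_hash;
}
```

In the single-queue case the packet never passes through model_select_queue(), yet it still leaves model_stack_transmit() with no hash attached.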

RE: [PATCH net] xen-netback: make sure that hashes are not send to unaware frontends

2016-10-07 Thread Paul Durrant

> -Original Message-
> From: David Miller [mailto:da...@davemloft.net]
> Sent: 07 October 2016 06:38
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: netdev@vger.kernel.org; xen-de...@lists.xenproject.org; Wei Liu
> <wei.l...@citrix.com>
> Subject: Re: [PATCH net] xen-netback: make sure that hashes are not send
> to unaware frontends
> 
> From: Paul Durrant <paul.durr...@citrix.com>
> Date: Thu, 6 Oct 2016 15:47:10 +0100
> 
> > In the case when a frontend only negotiates a single queue with xen-
> > netback it is possible for a skbuff with a s/w hash to result in a
> > hash extra_info segment being sent to the frontend even when no hash
> > algorithm has been configured. (The ndo_select_queue() entry point
> > makes sure the hash is not set if no algorithm is configured, but this
> > entry point is not called when there is only a single queue). This can
> > result in a frontend that is unable to handle extra_info segments being
> > given such a segment, causing it to crash.
> >
> > This patch fixes the problem by gating whether the extra_info is sent
> > not only on the presence of a s/w hash, but also on whether the hash
> > algorithm has been configured.
> >
> > Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> > Cc: Wei Liu <wei.l...@citrix.com>
> 
> This doesn't apply cleanly to the current 'net' tree, please respin.
> 

Sure. V2 coming.

  Paul

> Thanks.

[PATCH net] xen-netback: make sure that hashes are not send to unaware frontends

2016-10-06 Thread Paul Durrant

In the case when a frontend only negotiates a single queue with xen-
netback it is possible for a skbuff with a s/w hash to result in a
hash extra_info segment being sent to the frontend even when no hash
algorithm has been configured. (The ndo_select_queue() entry point makes
sure the hash is not set if no algorithm is configured, but this entry
point is not called when there is only a single queue). This can result
in a frontend that is unable to handle extra_info segments being given
such a segment, causing it to crash.

This patch fixes the problem by gating whether the extra_info is sent
not only on the presence of a s/w hash, but also on whether the hash
algorithm has been configured.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/interface.c | 13 ++---
 drivers/net/xen-netback/netback.c   | 23 ++-
 2 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c 
b/drivers/net/xen-netback/interface.c
index fb50c6d..1034139 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -149,17 +149,8 @@ static u16 xenvif_select_queue(struct net_device *dev, 
struct sk_buff *skb,
struct xenvif *vif = netdev_priv(dev);
unsigned int size = vif->hash.size;
 
-   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE) {
-   u16 index = fallback(dev, skb) % dev->real_num_tx_queues;
-
-   /* Make sure there is no hash information in the socket
-* buffer otherwise it would be incorrectly forwarded
-* to the frontend.
-*/
-   skb_clear_hash(skb);
-
-   return index;
-   }
+   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE)
+   return fallback(dev, skb) % dev->real_num_tx_queues;
 
xenvif_set_skb_hash(vif, skb);
 
diff --git a/drivers/net/xen-netback/netback.c 
b/drivers/net/xen-netback/netback.c
index 3d0c989..2cd4a8e 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -168,6 +168,10 @@ static bool xenvif_rx_ring_slots_available(struct 
xenvif_queue *queue)
needed = DIV_ROUND_UP(skb->len, XEN_PAGE_SIZE);
if (skb_is_gso(skb))
needed++;
+   /* Assume the frontend is capable of handling the hash
+* extra_info at this point. This will only ever lead to an
+* accurate value or over-estimation.
+*/
if (skb->sw_hash)
needed++;
 
@@ -378,9 +382,8 @@ static void xenvif_gop_frag_copy(struct xenvif_queue 
*queue, struct sk_buff *skb
.npo = npo,
.head = *head,
.gso_type = XEN_NETIF_GSO_TYPE_NONE,
-   /* xenvif_set_skb_hash() will have either set a s/w
-* hash or cleared the hash depending on
-* whether the the frontend wants a hash for this skb.
+   /* xenvif_rx_action() will have cleared any hash if
+* the frontend is not capable of handling it.
 */
.hash_present = skb->sw_hash,
};
@@ -593,6 +596,14 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
   && (skb = xenvif_rx_dequeue(queue)) != NULL) {
queue->last_rx_time = jiffies;
 
+   /* If there is no hash algorithm configured make sure
+* there is no hash information in the socket buffer
+* otherwise it would be incorrectly forwarded to the
+* frontend.
+*/
+   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE)
+   skb_clear_hash(skb);
+
XENVIF_RX_CB(skb)->meta_slots_used = xenvif_gop_skb(skb, , 
queue);
 
__skb_queue_tail(, skb);
@@ -667,12 +678,6 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
}
 
if (skb->sw_hash) {
-   /* Since the skb got here via xenvif_select_queue()
-* we know that the hash has been re-calculated
-* according to a configuration set by the frontend
-* and therefore we know that it is legitimate to
-* pass it to the frontend.
-*/
if (resp->flags & XEN_NETRXF_extra_info)
extra->flags |= XEN_NETIF_EXTRA_FLAG_MORE;
else
-- 
2.1.4

RE: [Xen-devel] [PATCH v2 net-next 4/7] xen-netback: immediately wake tx queue when guest rx queue has space

2016-10-04 Thread Paul Durrant

> -Original Message-
> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> Sent: 04 October 2016 13:49
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: netdev@vger.kernel.org; xen-de...@lists.xenproject.org; Wei Liu
> <wei.l...@citrix.com>; David Vrabel <david.vra...@citrix.com>
> Subject: Re: [Xen-devel] [PATCH v2 net-next 4/7] xen-netback: immediately
> wake tx queue when guest rx queue has space
> 
> On Tue, Oct 04, 2016 at 02:29:15AM -0700, Paul Durrant wrote:
> > From: David Vrabel <david.vra...@citrix.com>
> >
> > When an skb is removed from the guest rx queue, immediately wake the
> > tx queue, instead of after processing them.
> 
> Please, could the description explain why?
> 

Is it not reasonably obvious that it improves parallelism between filling and 
draining the queue? I could add a comment if you think it needs spelling out.

  Paul

> >
> > Signed-off-by: David Vrabel <david.vra...@citrix.com> [re-based]
> > Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> > ---
> > Cc: Wei Liu <wei.l...@citrix.com>
> > ---
> >  drivers/net/xen-netback/rx.c | 24 
> >  1 file changed, 8 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/net/xen-netback/rx.c
> > b/drivers/net/xen-netback/rx.c index b0ce4c6..9548709 100644
> > --- a/drivers/net/xen-netback/rx.c
> > +++ b/drivers/net/xen-netback/rx.c
> > @@ -92,27 +92,21 @@ static struct sk_buff *xenvif_rx_dequeue(struct
> xenvif_queue *queue)
> > spin_lock_irq(>rx_queue.lock);
> >
> > skb = __skb_dequeue(>rx_queue);
> > -   if (skb)
> > +   if (skb) {
> > queue->rx_queue_len -= skb->len;
> > +   if (queue->rx_queue_len < queue->rx_queue_max) {
> > +   struct netdev_queue *txq;
> > +
> > +   txq = netdev_get_tx_queue(queue->vif->dev,
> queue->id);
> > +   netif_tx_wake_queue(txq);
> > +   }
> > +   }
> >
> > spin_unlock_irq(>rx_queue.lock);
> >
> > return skb;
> >  }
> >
> > -static void xenvif_rx_queue_maybe_wake(struct xenvif_queue *queue) -
> {
> > -   spin_lock_irq(>rx_queue.lock);
> > -
> > -   if (queue->rx_queue_len < queue->rx_queue_max) {
> > -   struct net_device *dev = queue->vif->dev;
> > -
> > -   netif_tx_wake_queue(netdev_get_tx_queue(dev, queue-
> >id));
> > -   }
> > -
> > -   spin_unlock_irq(>rx_queue.lock);
> > -}
> > -
> >  static void xenvif_rx_queue_purge(struct xenvif_queue *queue)  {
> > struct sk_buff *skb;
> > @@ -585,8 +579,6 @@ int xenvif_kthread_guest_rx(void *data)
> >  */
> > xenvif_rx_queue_drop_expired(queue);
> >
> > -   xenvif_rx_queue_maybe_wake(queue);
> > -
> > cond_resched();
> > }
> >
> > --
> > 2.1.4
> >
> >
> > ___
> > Xen-devel mailing list
> > xen-de...@lists.xen.org
> > https://lists.xen.org/xen-devel
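
For readers following the "why" question, here is a small userspace sketch of waking the producer at dequeue time. The queue structure and function names are hypothetical, and the stop condition on the producer side (stop once the byte count reaches the limit) is an assumption made for the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_QUEUE_SLOTS 16

/* Hypothetical byte-limited packet queue feeding the guest rx ring. */
struct model_rx_queue {
	size_t pkt_len[MODEL_QUEUE_SLOTS];
	unsigned int head, count;
	size_t queue_len;	/* bytes currently queued */
	size_t queue_max;	/* producer is stopped at or above this */
	bool tx_stopped;	/* producer (tx queue) is stopped */
};

/* Producer side: queue a packet, stop the producer once the limit is hit. */
static bool model_enqueue(struct model_rx_queue *q, size_t len)
{
	if (q->count == MODEL_QUEUE_SLOTS)
		return false;
	q->pkt_len[(q->head + q->count++) % MODEL_QUEUE_SLOTS] = len;
	q->queue_len += len;
	if (q->queue_len >= q->queue_max)
		q->tx_stopped = true;
	return true;
}

/* Consumer side: remove a packet and wake the producer immediately if
 * the queue has dropped below its limit. Returns the packet length, or
 * 0 if the queue was empty. */
static size_t model_dequeue(struct model_rx_queue *q)
{
	size_t len;

	if (q->count == 0)
		return 0;
	len = q->pkt_len[q->head];
	q->head = (q->head + 1) % MODEL_QUEUE_SLOTS;
	q->count--;
	assert(q->queue_len >= len);
	q->queue_len -= len;
	if (q->queue_len < q->queue_max)
		q->tx_stopped = false;
	return len;
}
```

The producer becomes runnable as soon as space exists, rather than after the consumer has finished processing the dequeued packet, which is the parallelism argument made in the reply.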

RE: [Xen-devel] [PATCH v2 net-next 5/7] xen-netback: process guest rx packets in batches

2016-10-04 Thread Paul Durrant

> -Original Message-
> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> Sent: 04 October 2016 13:48
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: netdev@vger.kernel.org; xen-de...@lists.xenproject.org; Wei Liu
> <wei.l...@citrix.com>; David Vrabel <david.vra...@citrix.com>
> Subject: Re: [Xen-devel] [PATCH v2 net-next 5/7] xen-netback: process
> guest rx packets in batches
> 
> On Tue, Oct 04, 2016 at 10:29:16AM +0100, Paul Durrant wrote:
> > From: David Vrabel <david.vra...@citrix.com>
> >
> > Instead of only placing one skb on the guest rx ring at a time,
> > process a batch of up-to 64.  This improves performance by ~10% in some
> tests.

I believe the tests are mainly throughput tests, but David would know the 
specifics.

> 
> And does it regress latency workloads?
> 

It shouldn't, although I have not run ping-pong tests to verify. If packets are 
only placed on the vif queue singly though then the batching should have no 
effect, since rx_action will complete and do the push as before.

  Paul

> What are those 'some tests' you speak of?
> 
> Thanks.
> >
> > Signed-off-by: David Vrabel <david.vra...@citrix.com> [re-based]
> > Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> > ---
> > Cc: Wei Liu <wei.l...@citrix.com>
> > ---
> >  drivers/net/xen-netback/rx.c | 15 ++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/xen-netback/rx.c
> > b/drivers/net/xen-netback/rx.c index 9548709..ae822b8 100644
> > --- a/drivers/net/xen-netback/rx.c
> > +++ b/drivers/net/xen-netback/rx.c
> > @@ -399,7 +399,7 @@ static void xenvif_rx_extra_slot(struct
> xenvif_queue *queue,
> > BUG();
> >  }
> >
> > -void xenvif_rx_action(struct xenvif_queue *queue)
> > +void xenvif_rx_skb(struct xenvif_queue *queue)
> >  {
> > struct xenvif_pkt_state pkt;
> >
> > @@ -425,6 +425,19 @@ void xenvif_rx_action(struct xenvif_queue
> *queue)
> > xenvif_rx_complete(queue, );
> >  }
> >
> > +#define RX_BATCH_SIZE 64
> > +
> > +void xenvif_rx_action(struct xenvif_queue *queue) {
> > +   unsigned int work_done = 0;
> > +
> > +   while (xenvif_rx_ring_slots_available(queue) &&
> > +  work_done < RX_BATCH_SIZE) {
> > +   xenvif_rx_skb(queue);
> > +   work_done++;
> > +   }
> > +}
> > +
> >  static bool xenvif_rx_queue_stalled(struct xenvif_queue *queue)  {
> > RING_IDX prod, cons;
> > --
> > 2.1.4
> >
> >
> > ___
> > Xen-devel mailing list
> > xen-de...@lists.xen.org
> > https://lists.xen.org/xen-devel

RE: [Xen-devel] [PATCH v2 net-next 2/7] xen-netback: retire guest rx side prefix GSO feature

2016-10-04 Thread Paul Durrant

> -Original Message-
> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> Sent: 04 October 2016 13:52
> To: Paul Durrant <paul.durr...@citrix.com>; annie...@oracle.com;
> joao.m.mart...@oracle.com
> Cc: netdev@vger.kernel.org; xen-de...@lists.xenproject.org; Wei Liu
> <wei.l...@citrix.com>
> Subject: Re: [Xen-devel] [PATCH v2 net-next 2/7] xen-netback: retire guest
> rx side prefix GSO feature
> 
> On Tue, Oct 04, 2016 at 10:29:13AM +0100, Paul Durrant wrote:
> > As far as I am aware only very old Windows network frontends make use
> > of this style of passing GSO packets from backend to frontend. These
> > frontends can easily be replaced by the freely available Xen Project
> > Windows PV network frontend, which uses the 'default' mechanism for
> > passing GSO packets, which is also used by all Linux frontends.
> 
> It is not that simple. Some companies have extra juice in their Windows
> frontends so can't easily swap over to the Xen Project one.

Ok, then those frontends will continue to work, but they won't get GSO packets 
any more. Prefix GSO has never been specified in the canonical netif header and 
so has been in a limbo state forever; such frontends have always been on 
borrowed time and only just happened to work against a Linux backend. If 
someone wants to actually specify prefix GSO properly then it could be added 
back in, but it should not be necessary now that the RX side req<->rsp identity 
relation is documented 
(http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/include/public/io/netif.h;hb=HEAD#l729).

> 
> Either way CC-ing Annie
> 
> Also would it make sense to CC the FreeBSD and NetBSD maintainers of their
> PV drivers just to make sure? (Or has that been confirmed)
> 

I could do that, but I'd hope that they would be subscribed to xen-devel and 
will chime in if there's likely to be a problem.

> >
> > NOTE: Removal of this feature will not cause breakage in old Windows
> >   frontends. They simply will no longer receive GSO packets - the
> >   packets instead being fragmented in the backend.
> 
> Did you also test this with SuSE/Novell Windows PV drivers?
> 

No, I don't have copies of these. Internal XenServer testing has not shown up 
any issues with 'legacy' PV drivers though (which do still have the prefix GSO 
code in).

  Paul

> Thanks.
> >
> > Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> > ---
> > Cc: Wei Liu <wei.l...@citrix.com>
> > ---
> >  drivers/net/xen-netback/common.h|  1 -
> >  drivers/net/xen-netback/interface.c |  4 ++--
> >  drivers/net/xen-netback/rx.c| 26 --
> >  drivers/net/xen-netback/xenbus.c| 21 -
> >  4 files changed, 2 insertions(+), 50 deletions(-)
> >
> > diff --git a/drivers/net/xen-netback/common.h
> > b/drivers/net/xen-netback/common.h
> > index b38fb2c..0ba5910 100644
> > --- a/drivers/net/xen-netback/common.h
> > +++ b/drivers/net/xen-netback/common.h
> > @@ -260,7 +260,6 @@ struct xenvif {
> >
> > /* Frontend feature information. */
> > int gso_mask;
> > -   int gso_prefix_mask;
> >
> > u8 can_sg:1;
> > u8 ip_csum:1;
> > diff --git a/drivers/net/xen-netback/interface.c
> > b/drivers/net/xen-netback/interface.c
> > index fb50c6d..211d542 100644
> > --- a/drivers/net/xen-netback/interface.c
> > +++ b/drivers/net/xen-netback/interface.c
> > @@ -319,9 +319,9 @@ static netdev_features_t
> > xenvif_fix_features(struct net_device *dev,
> >
> > if (!vif->can_sg)
> > features &= ~NETIF_F_SG;
> > -   if (~(vif->gso_mask | vif->gso_prefix_mask) & GSO_BIT(TCPV4))
> > +   if (~(vif->gso_mask) & GSO_BIT(TCPV4))
> > features &= ~NETIF_F_TSO;
> > -   if (~(vif->gso_mask | vif->gso_prefix_mask) & GSO_BIT(TCPV6))
> > +   if (~(vif->gso_mask) & GSO_BIT(TCPV6))
> > features &= ~NETIF_F_TSO6;
> > if (!vif->ip_csum)
> > features &= ~NETIF_F_IP_CSUM;
> > diff --git a/drivers/net/xen-netback/rx.c
> > b/drivers/net/xen-netback/rx.c index 03836aa..6bd7d6e 100644
> > --- a/drivers/net/xen-netback/rx.c
> > +++ b/drivers/net/xen-netback/rx.c
> > @@ -347,16 +347,6 @@ static int xenvif_gop_skb(struct sk_buff *skb,
> > gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
> > }
> >
> > -   /* Set up a GSO prefix descriptor, if necessary */
> > -   if ((1 << gso_type) & vif->gso_prefix_mask) {
> > -   RING_COPY_REQUEST(>rx, queue->rx.req

[PATCH v2 net-next 5/7] xen-netback: process guest rx packets in batches

2016-10-04 Thread Paul Durrant

From: David Vrabel <david.vra...@citrix.com>

Instead of only placing one skb on the guest rx ring at a time, process
a batch of up-to 64.  This improves performance by ~10% in some tests.

Signed-off-by: David Vrabel <david.vra...@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/rx.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index 9548709..ae822b8 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -399,7 +399,7 @@ static void xenvif_rx_extra_slot(struct xenvif_queue *queue,
BUG();
 }
 
-void xenvif_rx_action(struct xenvif_queue *queue)
+void xenvif_rx_skb(struct xenvif_queue *queue)
 {
struct xenvif_pkt_state pkt;
 
@@ -425,6 +425,19 @@ void xenvif_rx_action(struct xenvif_queue *queue)
xenvif_rx_complete(queue, );
 }
 
+#define RX_BATCH_SIZE 64
+
+void xenvif_rx_action(struct xenvif_queue *queue)
+{
+   unsigned int work_done = 0;
+
+   while (xenvif_rx_ring_slots_available(queue) &&
+  work_done < RX_BATCH_SIZE) {
+   xenvif_rx_skb(queue);
+   work_done++;
+   }
+}
+
 static bool xenvif_rx_queue_stalled(struct xenvif_queue *queue)
 {
RING_IDX prod, cons;
-- 
2.1.4
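
The bounded batch loop in this patch can be exercised in isolation with a userspace model. The ring structure below is hypothetical and assumes one ring slot per packet purely to keep the sketch short; only the loop shape and the limit of 64 come from the patch.

```c
#include <assert.h>

#define MODEL_BATCH_SIZE 64

/* Hypothetical ring model: one slot per packet for simplicity. */
struct model_ring {
	unsigned int pending;	/* packets waiting to be placed on the ring */
	unsigned int slots;	/* free ring slots */
};

static int model_slots_available(const struct model_ring *r)
{
	return r->pending > 0 && r->slots > 0;
}

/* Place a single packet on the ring. */
static void model_rx_one(struct model_ring *r)
{
	assert(model_slots_available(r));
	r->pending--;
	r->slots--;
}

/* Process up to MODEL_BATCH_SIZE packets per call; returns the count. */
static unsigned int model_rx_action(struct model_ring *r)
{
	unsigned int work_done = 0;

	while (model_slots_available(r) && work_done < MODEL_BATCH_SIZE) {
		model_rx_one(r);
		work_done++;
	}
	return work_done;
}
```

When only one packet is pending the loop exits after a single iteration, so in this model a lone packet is handled the same way as it would be without batching.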

[PATCH v2 net-next 0/7] xen-netback: guest rx side refactor

2016-10-04 Thread Paul Durrant

This series refactors the guest rx side of xen-netback:

- The code is moved into its own source module.

- The prefix variant of GSO handling is retired (since it is no longer
  in common use, and alternatives exist).

- The code is then simplified and modifications made to improve
  performance.

v2:
- Rebased onto refreshed net-next

David Vrabel (4):
  xen-netback: refactor guest rx
  xen-netback: immediately wake tx queue when guest rx queue has space
  xen-netback: process guest rx packets in batches
  xen-netback: batch copies for multiple to-guest rx packets

Paul Durrant (2):
  xen-netback: separate guest side rx code into separate module
  xen-netback: retire guest rx side prefix GSO feature

Ross Lagerwall (1):
  xen/netback: add fraglist support for to-guest rx

 drivers/net/xen-netback/Makefile|   2 +-
 drivers/net/xen-netback/common.h|  25 +-
 drivers/net/xen-netback/interface.c |   6 +-
 drivers/net/xen-netback/netback.c   | 754 
 drivers/net/xen-netback/rx.c| 628 ++
 drivers/net/xen-netback/xenbus.c|  21 -
 6 files changed, 643 insertions(+), 793 deletions(-)
 create mode 100644 drivers/net/xen-netback/rx.c

-- 
2.1.4

[PATCH v2 net-next 7/7] xen/netback: add fraglist support for to-guest rx

2016-10-04 Thread Paul Durrant

From: Ross Lagerwall <ross.lagerw...@citrix.com>

This allows full 64K skbuffs (with 1500 mtu ethernet, composed of 45
fragments) to be handled by netback for to-guest rx.

Signed-off-by: Ross Lagerwall <ross.lagerw...@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/interface.c |  2 +-
 drivers/net/xen-netback/rx.c| 38 -
 2 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c 
b/drivers/net/xen-netback/interface.c
index 211d542..4af532a 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -467,7 +467,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t 
domid,
dev->netdev_ops = _netdev_ops;
dev->hw_features = NETIF_F_SG |
NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
-   NETIF_F_TSO | NETIF_F_TSO6;
+   NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_FRAGLIST;
dev->features = dev->hw_features | NETIF_F_RXCSUM;
dev->ethtool_ops = _ethtool_ops;
 
diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index 8c8c5b5..8e9ade6 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -215,7 +215,8 @@ static unsigned int xenvif_gso_type(struct sk_buff *skb)
 struct xenvif_pkt_state {
struct sk_buff *skb;
size_t remaining_len;
-   int frag; /* frag == -1 => skb->head */
+   struct sk_buff *frag_iter;
+   int frag; /* frag == -1 => frag_iter->head */
unsigned int frag_offset;
struct xen_netif_extra_info extras[XEN_NETIF_EXTRA_TYPE_MAX - 1];
unsigned int extra_count;
@@ -237,6 +238,7 @@ static void xenvif_rx_next_skb(struct xenvif_queue *queue,
memset(pkt, 0, sizeof(struct xenvif_pkt_state));
 
pkt->skb = skb;
+   pkt->frag_iter = skb;
pkt->remaining_len = skb->len;
pkt->frag = -1;
 
@@ -293,20 +295,40 @@ static void xenvif_rx_complete(struct xenvif_queue *queue,
__skb_queue_tail(queue->rx_copy.completed, pkt->skb);
 }
 
+static void xenvif_rx_next_frag(struct xenvif_pkt_state *pkt)
+{
+   struct sk_buff *frag_iter = pkt->frag_iter;
+   unsigned int nr_frags = skb_shinfo(frag_iter)->nr_frags;
+
+   pkt->frag++;
+   pkt->frag_offset = 0;
+
+   if (pkt->frag >= nr_frags) {
+   if (frag_iter == pkt->skb)
+   pkt->frag_iter = skb_shinfo(frag_iter)->frag_list;
+   else
+   pkt->frag_iter = frag_iter->next;
+
+   pkt->frag = -1;
+   }
+}
+
 static void xenvif_rx_next_chunk(struct xenvif_queue *queue,
 struct xenvif_pkt_state *pkt,
 unsigned int offset, void **data,
 size_t *len)
 {
-   struct sk_buff *skb = pkt->skb;
+   struct sk_buff *frag_iter = pkt->frag_iter;
void *frag_data;
size_t frag_len, chunk_len;
 
+   BUG_ON(!frag_iter);
+
if (pkt->frag == -1) {
-   frag_data = skb->data;
-   frag_len = skb_headlen(skb);
+   frag_data = frag_iter->data;
+   frag_len = skb_headlen(frag_iter);
} else {
-   skb_frag_t *frag = &skb_shinfo(skb)->frags[pkt->frag];
+   skb_frag_t *frag = &skb_shinfo(frag_iter)->frags[pkt->frag];
 
frag_data = skb_frag_address(frag);
frag_len = skb_frag_size(frag);
@@ -322,10 +344,8 @@ static void xenvif_rx_next_chunk(struct xenvif_queue *queue,
pkt->frag_offset += chunk_len;
 
/* Advance to next frag? */
-   if (frag_len == chunk_len) {
-   pkt->frag++;
-   pkt->frag_offset = 0;
-   }
+   if (frag_len == chunk_len)
+   xenvif_rx_next_frag(pkt);
 
*data = frag_data;
*len = chunk_len;
-- 
2.1.4

[PATCH v2 net-next 6/7] xen-netback: batch copies for multiple to-guest rx packets

2016-10-04 Thread Paul Durrant

From: David Vrabel <david.vra...@citrix.com>

Instead of flushing the copy ops when a packet is complete, complete
packets when their copy ops are done.  This improves performance by
reducing the number of grant copy hypercalls.

Latency is still limited by the relatively small size of the copy
batch.

Signed-off-by: David Vrabel <david.vra...@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/common.h |  1 +
 drivers/net/xen-netback/rx.c | 27 +--
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 7d12a38..cf68149 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -132,6 +132,7 @@ struct xenvif_copy_state {
struct gnttab_copy op[COPY_BATCH_SIZE];
RING_IDX idx[COPY_BATCH_SIZE];
unsigned int num;
+   struct sk_buff_head *completed;
 };
 
 struct xenvif_queue { /* Per-queue data for xenvif */
diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index ae822b8..8c8c5b5 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -133,6 +133,7 @@ static void xenvif_rx_queue_drop_expired(struct xenvif_queue *queue)
 static void xenvif_rx_copy_flush(struct xenvif_queue *queue)
 {
unsigned int i;
+   int notify;
 
gnttab_batch_copy(queue->rx_copy.op, queue->rx_copy.num);
 
@@ -154,6 +155,13 @@ static void xenvif_rx_copy_flush(struct xenvif_queue *queue)
}
 
queue->rx_copy.num = 0;
+
+   /* Push responses for all completed packets. */
+   RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&queue->rx, notify);
+   if (notify)
+   notify_remote_via_irq(queue->rx_irq);
+
+   __skb_queue_purge(queue->rx_copy.completed);
 }
 
 static void xenvif_rx_copy_add(struct xenvif_queue *queue,
@@ -279,18 +287,10 @@ static void xenvif_rx_next_skb(struct xenvif_queue *queue,
 static void xenvif_rx_complete(struct xenvif_queue *queue,
   struct xenvif_pkt_state *pkt)
 {
-   int notify;
-
-   /* Complete any outstanding copy ops for this skb. */
-   xenvif_rx_copy_flush(queue);
-
-   /* Push responses and notify. */
+   /* All responses are ready to be pushed. */
queue->rx.rsp_prod_pvt = queue->rx.req_cons;
-   RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&queue->rx, notify);
-   if (notify)
-   notify_remote_via_irq(queue->rx_irq);
 
-   dev_kfree_skb(pkt->skb);
+   __skb_queue_tail(queue->rx_copy.completed, pkt->skb);
 }
 
 static void xenvif_rx_next_chunk(struct xenvif_queue *queue,
@@ -429,13 +429,20 @@ void xenvif_rx_skb(struct xenvif_queue *queue)
 
 void xenvif_rx_action(struct xenvif_queue *queue)
 {
+   struct sk_buff_head completed_skbs;
unsigned int work_done = 0;
 
+   __skb_queue_head_init(&completed_skbs);
+   queue->rx_copy.completed = &completed_skbs;
+
while (xenvif_rx_ring_slots_available(queue) &&
   work_done < RX_BATCH_SIZE) {
xenvif_rx_skb(queue);
work_done++;
}
+
+   /* Flush any pending copies and complete all skbs. */
+   xenvif_rx_copy_flush(queue);
 }
 
 static bool xenvif_rx_queue_stalled(struct xenvif_queue *queue)
-- 
2.1.4
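
Editorial sketch (not part of the patch): the effect of deferring the
flush can be modelled in user space by counting how often a batch is
issued. Everything below is hypothetical illustration: struct copy_model
and the model_* functions are invented names, only COPY_BATCH_SIZE (64)
is taken from the series, and the model assumes an empty batch costs no
hypercall, which is a simplification.

```c
#include <assert.h>

#define COPY_BATCH_SIZE 64  /* same value the series defines */

/* Hypothetical model of the copy state: counters only, no real copies. */
struct copy_model {
    unsigned int num;        /* copy ops queued but not yet issued */
    unsigned int completed;  /* packets waiting for their ops to be issued */
    unsigned int hypercalls; /* non-empty batches issued so far */
    unsigned int released;   /* packets released after a flush */
};

/* Issue the queued ops as one batch, then release completed packets. */
static void model_flush(struct copy_model *m)
{
    if (m->num) {
        m->hypercalls++;
        m->num = 0;
    }
    m->released += m->completed;
    m->completed = 0;
}

/* Queue one copy op, flushing first if the batch is already full. */
static void model_add_op(struct copy_model *m)
{
    if (m->num == COPY_BATCH_SIZE)
        model_flush(m);
    m->num++;
}

/* Mark a packet complete; the old behaviour flushed here every time. */
static void model_complete_packet(struct copy_model *m, int flush_per_packet)
{
    m->completed++;
    if (flush_per_packet)
        model_flush(m);
}

/* Process a run of packets and return the number of batches issued. */
static unsigned int model_run(unsigned int packets,
                              unsigned int ops_per_packet,
                              int flush_per_packet)
{
    struct copy_model m = {0};
    unsigned int p, o;

    for (p = 0; p < packets; p++) {
        for (o = 0; o < ops_per_packet; o++)
            model_add_op(&m);
        model_complete_packet(&m, flush_per_packet);
    }
    model_flush(&m);  /* final flush at the end of the rx action */
    return m.hypercalls;
}
```

Under these assumptions, 64 packets of two copy ops each cost 64 batches
when flushing per packet and 2 batches when flushing only on a full
batch or at the end of the run.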

[PATCH v2 net-next 3/7] xen-netback: refactor guest rx

2016-10-04 Thread Paul Durrant

From: David Vrabel <david.vra...@citrix.com>

Refactor the to-guest (rx) path to:

1. Push responses for completed skbs earlier, reducing latency.

2. Reduce the per-queue memory overhead by greatly reducing the
   maximum number of grant copy ops in each hypercall (from 4352 to
   64).  Each struct xenvif_queue is now only 44 kB instead of 220 kB.

3. Make the code more maintainable.

Signed-off-by: David Vrabel <david.vra...@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/common.h |  23 +-
 drivers/net/xen-netback/rx.c | 654 +++
 2 files changed, 254 insertions(+), 423 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 0ba5910..7d12a38 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -91,13 +91,6 @@ struct xenvif_rx_meta {
  */
 #define MAX_XEN_SKB_FRAGS (65536 / XEN_PAGE_SIZE + 1)
 
-/* It's possible for an skb to have a maximal number of frags
- * but still be less than MAX_BUFFER_OFFSET in size. Thus the
- * worst-case number of copy operations is MAX_XEN_SKB_FRAGS per
- * ring slot.
- */
-#define MAX_GRANT_COPY_OPS (MAX_XEN_SKB_FRAGS * XEN_NETIF_RX_RING_SIZE)
-
 #define NETBACK_INVALID_HANDLE -1
 
 /* To avoid confusion, we define XEN_NETBK_LEGACY_SLOTS_MAX indicating
@@ -133,6 +126,14 @@ struct xenvif_stats {
unsigned long tx_frag_overflow;
 };
 
+#define COPY_BATCH_SIZE 64
+
+struct xenvif_copy_state {
+   struct gnttab_copy op[COPY_BATCH_SIZE];
+   RING_IDX idx[COPY_BATCH_SIZE];
+   unsigned int num;
+};
+
 struct xenvif_queue { /* Per-queue data for xenvif */
unsigned int id; /* Queue ID, 0-based */
char name[QUEUE_NAME_SIZE]; /* DEVNAME-qN */
@@ -189,12 +190,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
unsigned long last_rx_time;
bool stalled;
 
-   struct gnttab_copy grant_copy_op[MAX_GRANT_COPY_OPS];
-
-   /* We create one meta structure per ring request we consume, so
-* the maximum number is the same as the ring size.
-*/
-   struct xenvif_rx_meta meta[XEN_NETIF_RX_RING_SIZE];
+   struct xenvif_copy_state rx_copy;
 
/* Transmit shaping: allow 'credit_bytes' every 'credit_usec'. */
unsigned long   credit_bytes;
@@ -358,6 +354,7 @@ int xenvif_dealloc_kthread(void *data);
 
 irqreturn_t xenvif_ctrl_irq_fn(int irq, void *data);
 
+void xenvif_rx_action(struct xenvif_queue *queue);
 void xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff *skb);
 
 void xenvif_carrier_on(struct xenvif *vif);
diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index 6bd7d6e..b0ce4c6 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -26,7 +26,6 @@
  * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
  * IN THE SOFTWARE.
  */
-
 #include "common.h"
 
 #include 
@@ -137,464 +136,299 @@ static void xenvif_rx_queue_drop_expired(struct xenvif_queue *queue)
}
 }
 
-struct netrx_pending_operations {
-   unsigned int copy_prod, copy_cons;
-   unsigned int meta_prod, meta_cons;
-   struct gnttab_copy *copy;
-   struct xenvif_rx_meta *meta;
-   int copy_off;
-   grant_ref_t copy_gref;
-};
-
-static struct xenvif_rx_meta *get_next_rx_buffer(
-   struct xenvif_queue *queue,
-   struct netrx_pending_operations *npo)
+static void xenvif_rx_copy_flush(struct xenvif_queue *queue)
 {
-   struct xenvif_rx_meta *meta;
-   struct xen_netif_rx_request req;
+   unsigned int i;
 
-   RING_COPY_REQUEST(&queue->rx, queue->rx.req_cons++, &req);
+   gnttab_batch_copy(queue->rx_copy.op, queue->rx_copy.num);
 
-   meta = npo->meta + npo->meta_prod++;
-   meta->gso_type = XEN_NETIF_GSO_TYPE_NONE;
-   meta->gso_size = 0;
-   meta->size = 0;
-   meta->id = req.id;
+   for (i = 0; i < queue->rx_copy.num; i++) {
+   struct gnttab_copy *op;
 
-   npo->copy_off = 0;
-   npo->copy_gref = req.gref;
+   op = &queue->rx_copy.op[i];
 
-   return meta;
+   /* If the copy failed, overwrite the status field in
+* the corresponding response.
+*/
+   if (unlikely(op->status != GNTST_okay)) {
+   struct xen_netif_rx_response *rsp;
+
+   rsp = RING_GET_RESPONSE(&queue->rx,
+   queue->rx_copy.idx[i]);
+   rsp->status = op->status;
+   }
+   }
+
+   queue->rx_copy.num = 0;
 }
 
-struct gop_frag_copy {
-   struct xenvif_queue *queue;
-   struct netrx_pending_operations *npo;
-   struct xenvif_rx_meta *meta;
-   int head;
-   int gso_type
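
Editorial sketch (not part of the patch): the figure of 4352 grant copy
ops quoted in the commit message can be reproduced from the two macros
shown in the common.h hunk. The helper names below are hypothetical, and
the inputs (a 4096-byte XEN_PAGE_SIZE and a 256-entry rx ring) are
assumptions chosen because they reproduce the quoted number; the message
itself does not state them.

```c
#include <assert.h>

/* Mirrors MAX_XEN_SKB_FRAGS: (65536 / XEN_PAGE_SIZE + 1). */
static unsigned int max_xen_skb_frags(unsigned int xen_page_size)
{
    return 65536 / xen_page_size + 1;
}

/* Mirrors the removed MAX_GRANT_COPY_OPS:
 * MAX_XEN_SKB_FRAGS * XEN_NETIF_RX_RING_SIZE. */
static unsigned int max_grant_copy_ops(unsigned int xen_page_size,
                                       unsigned int rx_ring_size)
{
    return max_xen_skb_frags(xen_page_size) * rx_ring_size;
}
```

With those assumed inputs the old worst case is 17 * 256 = 4352 ops per
queue, against the fixed COPY_BATCH_SIZE of 64 after the refactor.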

[PATCH v2 net-next 4/7] xen-netback: immediately wake tx queue when guest rx queue has space

2016-10-04 Thread Paul Durrant

From: David Vrabel <david.vra...@citrix.com>

When an skb is removed from the guest rx queue, immediately wake the
tx queue, instead of waking it only after the skbs have been processed.

Signed-off-by: David Vrabel <david.vra...@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/rx.c | 24 
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index b0ce4c6..9548709 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -92,27 +92,21 @@ static struct sk_buff *xenvif_rx_dequeue(struct xenvif_queue *queue)
spin_lock_irq(&queue->rx_queue.lock);
 
skb = __skb_dequeue(&queue->rx_queue);
-   if (skb)
+   if (skb) {
queue->rx_queue_len -= skb->len;
+   if (queue->rx_queue_len < queue->rx_queue_max) {
+   struct netdev_queue *txq;
+
+   txq = netdev_get_tx_queue(queue->vif->dev, queue->id);
+   netif_tx_wake_queue(txq);
+   }
+   }
 
spin_unlock_irq(&queue->rx_queue.lock);
 
return skb;
 }
 
-static void xenvif_rx_queue_maybe_wake(struct xenvif_queue *queue)
-{
-   spin_lock_irq(&queue->rx_queue.lock);
-
-   if (queue->rx_queue_len < queue->rx_queue_max) {
-   struct net_device *dev = queue->vif->dev;
-
-   netif_tx_wake_queue(netdev_get_tx_queue(dev, queue->id));
-   }
-
-   spin_unlock_irq(&queue->rx_queue.lock);
-}
-
 static void xenvif_rx_queue_purge(struct xenvif_queue *queue)
 {
struct sk_buff *skb;
@@ -585,8 +579,6 @@ int xenvif_kthread_guest_rx(void *data)
 */
xenvif_rx_queue_drop_expired(queue);
 
-   xenvif_rx_queue_maybe_wake(queue);
-
cond_resched();
}
 
-- 
2.1.4
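
Editorial sketch (not part of the patch): the backpressure rule this
change relies on can be modelled without the kernel. The names
struct rxq_model, model_enqueue() and model_dequeue() are hypothetical;
the model tracks a single "tx stopped" flag in place of the real
netif_tx_stop_queue()/netif_tx_wake_queue() calls and ignores locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-queue byte accounting. */
struct rxq_model {
    unsigned int len;   /* bytes queued, like rx_queue_len */
    unsigned int max;   /* threshold, like rx_queue_max */
    bool tx_stopped;    /* stands in for the netdev tx queue state */
};

/* Enqueue path: stop the tx queue once the backlog exceeds the limit. */
static void model_enqueue(struct rxq_model *q, unsigned int skb_len)
{
    q->len += skb_len;
    if (q->len > q->max)
        q->tx_stopped = true;
}

/* Dequeue path after this patch: wake as soon as there is room again. */
static void model_dequeue(struct rxq_model *q, unsigned int skb_len)
{
    q->len -= skb_len;
    if (q->len < q->max)
        q->tx_stopped = false;
}
```

In this model the queue is woken by the same call that frees the space,
rather than by a separate pass at the end of the kthread loop.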

[PATCH v2 net-next 1/7] xen-netback: separate guest side rx code into separate module

2016-10-04 Thread Paul Durrant

The netback source module has become very large and somewhat confusing.
This patch simply moves all code related to the backend to frontend (i.e.
guest side rx) data-path into a separate rx source module.

This patch contains no functional change; it is code movement and
minimal changes to avoid patch style-check issues.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/Makefile  |   2 +-
 drivers/net/xen-netback/netback.c | 754 
 drivers/net/xen-netback/rx.c  | 789 ++
 3 files changed, 790 insertions(+), 755 deletions(-)
 create mode 100644 drivers/net/xen-netback/rx.c

diff --git a/drivers/net/xen-netback/Makefile b/drivers/net/xen-netback/Makefile
index 11e02be..d49798a 100644
--- a/drivers/net/xen-netback/Makefile
+++ b/drivers/net/xen-netback/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_XEN_NETDEV_BACKEND) := xen-netback.o
 
-xen-netback-y := netback.o xenbus.o interface.o hash.o
+xen-netback-y := netback.o xenbus.o interface.o hash.o rx.o
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 3d0c989..47b4810 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -106,13 +106,6 @@ static void push_tx_responses(struct xenvif_queue *queue);
 
 static inline int tx_work_todo(struct xenvif_queue *queue);
 
-static struct xen_netif_rx_response *make_rx_response(struct xenvif_queue *queue,
-u16  id,
-s8   st,
-u16  offset,
-u16  size,
-u16  flags);
-
 static inline unsigned long idx_to_pfn(struct xenvif_queue *queue,
   u16 idx)
 {
@@ -155,571 +148,11 @@ static inline pending_ring_idx_t pending_index(unsigned i)
return i & (MAX_PENDING_REQS-1);
 }
 
-static bool xenvif_rx_ring_slots_available(struct xenvif_queue *queue)
-{
-   RING_IDX prod, cons;
-   struct sk_buff *skb;
-   int needed;
-
-   skb = skb_peek(&queue->rx_queue);
-   if (!skb)
-   return false;
-
-   needed = DIV_ROUND_UP(skb->len, XEN_PAGE_SIZE);
-   if (skb_is_gso(skb))
-   needed++;
-   if (skb->sw_hash)
-   needed++;
-
-   do {
-   prod = queue->rx.sring->req_prod;
-   cons = queue->rx.req_cons;
-
-   if (prod - cons >= needed)
-   return true;
-
-   queue->rx.sring->req_event = prod + 1;
-
-   /* Make sure event is visible before we check prod
-* again.
-*/
-   mb();
-   } while (queue->rx.sring->req_prod != prod);
-
-   return false;
-}
-
-void xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff *skb)
-{
-   unsigned long flags;
-
-   spin_lock_irqsave(&queue->rx_queue.lock, flags);
-
-   __skb_queue_tail(&queue->rx_queue, skb);
-
-   queue->rx_queue_len += skb->len;
-   if (queue->rx_queue_len > queue->rx_queue_max)
-   netif_tx_stop_queue(netdev_get_tx_queue(queue->vif->dev, queue->id));
-
-   spin_unlock_irqrestore(&queue->rx_queue.lock, flags);
-}
-
-static struct sk_buff *xenvif_rx_dequeue(struct xenvif_queue *queue)
-{
-   struct sk_buff *skb;
-
-   spin_lock_irq(&queue->rx_queue.lock);
-
-   skb = __skb_dequeue(&queue->rx_queue);
-   if (skb)
-   queue->rx_queue_len -= skb->len;
-
-   spin_unlock_irq(&queue->rx_queue.lock);
-
-   return skb;
-}
-
-static void xenvif_rx_queue_maybe_wake(struct xenvif_queue *queue)
-{
-   spin_lock_irq(&queue->rx_queue.lock);
-
-   if (queue->rx_queue_len < queue->rx_queue_max)
-   netif_tx_wake_queue(netdev_get_tx_queue(queue->vif->dev, queue->id));
-
-   spin_unlock_irq(&queue->rx_queue.lock);
-}
-
-
-static void xenvif_rx_queue_purge(struct xenvif_queue *queue)
-{
-   struct sk_buff *skb;
-   while ((skb = xenvif_rx_dequeue(queue)) != NULL)
-   kfree_skb(skb);
-}
-
-static void xenvif_rx_queue_drop_expired(struct xenvif_queue *queue)
-{
-   struct sk_buff *skb;
-
-   for(;;) {
-   skb = skb_peek(&queue->rx_queue);
-   if (!skb)
-   break;
-   if (time_before(jiffies, XENVIF_RX_CB(skb)->expires))
-   break;
-   xenvif_rx_dequeue(queue);
-   kfree_skb(skb);
-   }
-}
-
-struct netrx_pending_operations {
-   unsigned copy_prod, copy_cons;
-   unsigned meta_prod, meta_cons;
-   struct gnttab_copy *copy;
-   struct xenvif_rx_meta *meta;
-   int copy_off;
-   grant_ref_t copy_gref;
-};
-
-stati
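
Editorial sketch (not part of the patch): the slot estimate made by
xenvif_rx_ring_slots_available() in the moved code can be restated as
two small pure functions. The names slots_needed() and
slots_available() are hypothetical, MODEL_PAGE_SIZE assumes a 4096-byte
XEN_PAGE_SIZE, and the event/memory-barrier retry loop of the original
is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u  /* assumed value of XEN_PAGE_SIZE */

/* One slot per page of data, plus one extra slot each for GSO and
 * software hash information. */
static unsigned int slots_needed(unsigned int len, bool is_gso,
                                 bool has_sw_hash)
{
    unsigned int needed = (len + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;

    if (is_gso)
        needed++;
    if (has_sw_hash)
        needed++;
    return needed;
}

/* Ring indices are free-running 32-bit counters, so the unsigned
 * difference stays correct when the producer index wraps around. */
static bool slots_available(uint32_t req_prod, uint32_t req_cons,
                            unsigned int needed)
{
    return req_prod - req_cons >= needed;
}
```

The second function shows why the original compares `prod - cons`
against the requirement instead of comparing the two indices directly.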

[PATCH v2 net-next 2/7] xen-netback: retire guest rx side prefix GSO feature

2016-10-04 Thread Paul Durrant

As far as I am aware only very old Windows network frontends make use of
this style of passing GSO packets from backend to frontend. These
frontends can easily be replaced by the freely available Xen Project
Windows PV network frontend, which uses the 'default' mechanism for
passing GSO packets, which is also used by all Linux frontends.

NOTE: Removal of this feature will not cause breakage in old Windows
  frontends. They simply will no longer receive GSO packets - the
  packets instead being fragmented in the backend.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/common.h|  1 -
 drivers/net/xen-netback/interface.c |  4 ++--
 drivers/net/xen-netback/rx.c| 26 --
 drivers/net/xen-netback/xenbus.c| 21 -
 4 files changed, 2 insertions(+), 50 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index b38fb2c..0ba5910 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -260,7 +260,6 @@ struct xenvif {
 
/* Frontend feature information. */
int gso_mask;
-   int gso_prefix_mask;
 
u8 can_sg:1;
u8 ip_csum:1;
diff --git a/drivers/net/xen-netback/interface.c 
b/drivers/net/xen-netback/interface.c
index fb50c6d..211d542 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -319,9 +319,9 @@ static netdev_features_t xenvif_fix_features(struct net_device *dev,
 
if (!vif->can_sg)
features &= ~NETIF_F_SG;
-   if (~(vif->gso_mask | vif->gso_prefix_mask) & GSO_BIT(TCPV4))
+   if (~(vif->gso_mask) & GSO_BIT(TCPV4))
features &= ~NETIF_F_TSO;
-   if (~(vif->gso_mask | vif->gso_prefix_mask) & GSO_BIT(TCPV6))
+   if (~(vif->gso_mask) & GSO_BIT(TCPV6))
features &= ~NETIF_F_TSO6;
if (!vif->ip_csum)
features &= ~NETIF_F_IP_CSUM;
diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index 03836aa..6bd7d6e 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -347,16 +347,6 @@ static int xenvif_gop_skb(struct sk_buff *skb,
gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
}
 
-   /* Set up a GSO prefix descriptor, if necessary */
-   if ((1 << gso_type) & vif->gso_prefix_mask) {
-   RING_COPY_REQUEST(&queue->rx, queue->rx.req_cons++, &req);
-   meta = npo->meta + npo->meta_prod++;
-   meta->gso_type = gso_type;
-   meta->gso_size = skb_shinfo(skb)->gso_size;
-   meta->size = 0;
-   meta->id = req.id;
-   }
-
RING_COPY_REQUEST(&queue->rx, queue->rx.req_cons++, &req);
meta = npo->meta + npo->meta_prod++;
 
@@ -511,22 +501,6 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
while ((skb = __skb_dequeue()) != NULL) {
struct xen_netif_extra_info *extra = NULL;
 
-   if ((1 << queue->meta[npo.meta_cons].gso_type) &
-   vif->gso_prefix_mask) {
-   resp = RING_GET_RESPONSE(&queue->rx,
-queue->rx.rsp_prod_pvt++);
-
-   resp->flags = XEN_NETRXF_gso_prefix |
- XEN_NETRXF_more_data;
-
-   resp->offset = queue->meta[npo.meta_cons].gso_size;
-   resp->id = queue->meta[npo.meta_cons].id;
-   resp->status = XENVIF_RX_CB(skb)->meta_slots_used;
-
-   npo.meta_cons++;
-   XENVIF_RX_CB(skb)->meta_slots_used--;
-   }
-
queue->stats.tx_bytes += skb->len;
queue->stats.tx_packets++;
 
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index daf4c78..7056404 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -1135,7 +1135,6 @@ static int read_xenbus_vif_flags(struct backend_info *be)
vif->can_sg = !!val;
 
vif->gso_mask = 0;
-   vif->gso_prefix_mask = 0;
 
if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4",
 "%d", &val) < 0)
@@ -1143,32 +1142,12 @@ static int read_xenbus_vif_flags(struct backend_info *be)
if (val)
vif->gso_mask |= GSO_BIT(TCPV4);
 
-   if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4-prefix",
-"%d", &val) < 0)
-   val = 0;
-   if (val)
-   vif->gso_prefix_mask |= GSO_BIT(TCPV4);
-
if (xenbus_scanf(XBT_NIL, dev->otherend, "featur

RE: [PATCH net-next 0/7] xen-netback: guest rx side refactor

2016-10-04 Thread Paul Durrant

> -Original Message-
> From: David Miller [mailto:da...@davemloft.net]
> Sent: 04 October 2016 05:52
> To: Paul Durrant <paul.durr...@citrix.com>
> Cc: netdev@vger.kernel.org; xen-de...@lists.xenproject.org
> Subject: Re: [PATCH net-next 0/7] xen-netback: guest rx side refactor
> 
> From: Paul Durrant <paul.durr...@citrix.com>
> Date: Mon, 3 Oct 2016 08:31:05 +0100
> 
> > This series refactors the guest rx side of xen-netback:
> >
> > - The code is moved into its own source module.
> >
> > - The prefix variant of GSO handling is retired (since it is no longer
> >   in common use, and alternatives exist).
> >
> > - The code is then simplified and modifications made to improve
> >   performance.
> 
> This doesn't apply cleanly to net-next, please respin.

Sure. V2 coming up.

  Paul

[PATCH net-next 7/7] xen/netback: add fraglist support for to-guest rx

2016-10-03 Thread Paul Durrant

From: Ross Lagerwall <ross.lagerw...@citrix.com>

This allows full 64K skbuffs (with 1500 mtu ethernet, composed of 45
fragments) to be handled by netback for to-guest rx.

Signed-off-by: Ross Lagerwall <ross.lagerw...@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/interface.c |  2 +-
 drivers/net/xen-netback/rx.c| 38 -
 2 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 1a009e7..8fef4fe 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -476,7 +476,7 @@ struct xenvif *xenvif_alloc(struct device *parent, domid_t domid,
dev->netdev_ops = &xenvif_netdev_ops;
dev->hw_features = NETIF_F_SG |
NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
-   NETIF_F_TSO | NETIF_F_TSO6;
+   NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_FRAGLIST;
dev->features = dev->hw_features | NETIF_F_RXCSUM;
dev->ethtool_ops = &xenvif_ethtool_ops;
 
diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index 8c8c5b5..8e9ade6 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -215,7 +215,8 @@ static unsigned int xenvif_gso_type(struct sk_buff *skb)
 struct xenvif_pkt_state {
struct sk_buff *skb;
size_t remaining_len;
-   int frag; /* frag == -1 => skb->head */
+   struct sk_buff *frag_iter;
+   int frag; /* frag == -1 => frag_iter->head */
unsigned int frag_offset;
struct xen_netif_extra_info extras[XEN_NETIF_EXTRA_TYPE_MAX - 1];
unsigned int extra_count;
@@ -237,6 +238,7 @@ static void xenvif_rx_next_skb(struct xenvif_queue *queue,
memset(pkt, 0, sizeof(struct xenvif_pkt_state));
 
pkt->skb = skb;
+   pkt->frag_iter = skb;
pkt->remaining_len = skb->len;
pkt->frag = -1;
 
@@ -293,20 +295,40 @@ static void xenvif_rx_complete(struct xenvif_queue *queue,
__skb_queue_tail(queue->rx_copy.completed, pkt->skb);
 }
 
+static void xenvif_rx_next_frag(struct xenvif_pkt_state *pkt)
+{
+   struct sk_buff *frag_iter = pkt->frag_iter;
+   unsigned int nr_frags = skb_shinfo(frag_iter)->nr_frags;
+
+   pkt->frag++;
+   pkt->frag_offset = 0;
+
+   if (pkt->frag >= nr_frags) {
+   if (frag_iter == pkt->skb)
+   pkt->frag_iter = skb_shinfo(frag_iter)->frag_list;
+   else
+   pkt->frag_iter = frag_iter->next;
+
+   pkt->frag = -1;
+   }
+}
+
 static void xenvif_rx_next_chunk(struct xenvif_queue *queue,
 struct xenvif_pkt_state *pkt,
 unsigned int offset, void **data,
 size_t *len)
 {
-   struct sk_buff *skb = pkt->skb;
+   struct sk_buff *frag_iter = pkt->frag_iter;
void *frag_data;
size_t frag_len, chunk_len;
 
+   BUG_ON(!frag_iter);
+
if (pkt->frag == -1) {
-   frag_data = skb->data;
-   frag_len = skb_headlen(skb);
+   frag_data = frag_iter->data;
+   frag_len = skb_headlen(frag_iter);
} else {
-   skb_frag_t *frag = &skb_shinfo(skb)->frags[pkt->frag];
+   skb_frag_t *frag = &skb_shinfo(frag_iter)->frags[pkt->frag];
 
frag_data = skb_frag_address(frag);
frag_len = skb_frag_size(frag);
@@ -322,10 +344,8 @@ static void xenvif_rx_next_chunk(struct xenvif_queue *queue,
pkt->frag_offset += chunk_len;
 
/* Advance to next frag? */
-   if (frag_len == chunk_len) {
-   pkt->frag++;
-   pkt->frag_offset = 0;
-   }
+   if (frag_len == chunk_len)
+   xenvif_rx_next_frag(pkt);
 
*data = frag_data;
*len = chunk_len;
-- 
2.1.4
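
Editorial sketch (not part of the patch): the traversal order that
xenvif_rx_next_frag() introduces is head, then page frags, then each
buffer on the frag list in turn. The user-space model below only
illustrates that order. struct buf, struct cursor and the cursor_* and
count_pieces() functions are hypothetical stand-ins for sk_buff,
xenvif_pkt_state and the patch's helpers; no data is copied and lengths
are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: a linear head, nr_frags page
 * fragments, a frag_list (used on the first buffer only) and a next
 * pointer linking the buffers on that list. */
struct buf {
    unsigned int nr_frags;
    struct buf *frag_list;
    struct buf *next;
};

/* Hypothetical stand-in for struct xenvif_pkt_state. */
struct cursor {
    struct buf *first;  /* like pkt->skb */
    struct buf *iter;   /* like pkt->frag_iter */
    int frag;           /* -1 means the linear head of iter */
};

static void cursor_init(struct cursor *c, struct buf *first)
{
    c->first = first;
    c->iter = first;
    c->frag = -1;
}

/* Same shape as xenvif_rx_next_frag(): step to the next fragment and,
 * once the current buffer is exhausted, move on to the next buffer. */
static void cursor_advance(struct cursor *c)
{
    struct buf *iter = c->iter;

    c->frag++;
    if (c->frag >= (int)iter->nr_frags) {
        c->iter = (iter == c->first) ? iter->frag_list : iter->next;
        c->frag = -1;
    }
}

/* Count the pieces (heads and frags) visited before iter becomes NULL. */
static unsigned int count_pieces(struct buf *first)
{
    struct cursor c;
    unsigned int n = 0;

    cursor_init(&c, first);
    while (c.iter) {
        n++;
        cursor_advance(&c);
    }
    return n;
}
```

The model stops when the iterator runs out of buffers. The driver
instead keeps asking for chunks while pkt->remaining_len is non-zero,
so a packet whose length claims more data than its buffers hold would
reach the BUG_ON(!frag_iter) check in xenvif_rx_next_chunk().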

[PATCH net-next 0/7] xen-netback: guest rx side refactor

2016-10-03 Thread Paul Durrant

This series refactors the guest rx side of xen-netback:

- The code is moved into its own source module.

- The prefix variant of GSO handling is retired (since it is no longer
  in common use, and alternatives exist).

- The code is then simplified and modifications made to improve
  performance.

David Vrabel (4):
  xen-netback: refactor guest rx
  xen-netback: immediately wake tx queue when guest rx queue has space
  xen-netback: process guest rx packets in batches
  xen-netback: batch copies for multiple to-guest rx packets

Paul Durrant (2):
  xen-netback: separate guest side rx code into separate module
  xen-netback: retire guest rx side prefix GSO feature

Ross Lagerwall (1):
  xen/netback: add fraglist support for to-guest rx

 drivers/net/xen-netback/Makefile|   2 +-
 drivers/net/xen-netback/common.h|  25 +-
 drivers/net/xen-netback/interface.c |   6 +-
 drivers/net/xen-netback/netback.c   | 754 
 drivers/net/xen-netback/rx.c| 628 ++
 drivers/net/xen-netback/xenbus.c|  21 -
 6 files changed, 643 insertions(+), 793 deletions(-)
 create mode 100644 drivers/net/xen-netback/rx.c

-- 
2.1.4

[PATCH net-next 1/7] xen-netback: separate guest side rx code into separate module

2016-10-03 Thread Paul Durrant

The netback source module has become very large and somewhat confusing.
This patch simply moves all code related to the backend to frontend (i.e.
guest side rx) data-path into a separate rx source module.

This patch contains no functional change; it is code movement and
minimal changes to avoid patch style-check issues.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/Makefile  |   2 +-
 drivers/net/xen-netback/netback.c | 754 
 drivers/net/xen-netback/rx.c  | 789 ++
 3 files changed, 790 insertions(+), 755 deletions(-)
 create mode 100644 drivers/net/xen-netback/rx.c

diff --git a/drivers/net/xen-netback/Makefile b/drivers/net/xen-netback/Makefile
index 11e02be..d49798a 100644
--- a/drivers/net/xen-netback/Makefile
+++ b/drivers/net/xen-netback/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_XEN_NETDEV_BACKEND) := xen-netback.o
 
-xen-netback-y := netback.o xenbus.o interface.o hash.o
+xen-netback-y := netback.o xenbus.o interface.o hash.o rx.o
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index edbae0b..1f9d92e 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -106,13 +106,6 @@ static void push_tx_responses(struct xenvif_queue *queue);
 
 static inline int tx_work_todo(struct xenvif_queue *queue);
 
-static struct xen_netif_rx_response *make_rx_response(struct xenvif_queue *queue,
-u16  id,
-s8   st,
-u16  offset,
-u16  size,
-u16  flags);
-
 static inline unsigned long idx_to_pfn(struct xenvif_queue *queue,
   u16 idx)
 {
@@ -155,571 +148,11 @@ static inline pending_ring_idx_t pending_index(unsigned i)
return i & (MAX_PENDING_REQS-1);
 }
 
-static bool xenvif_rx_ring_slots_available(struct xenvif_queue *queue)
-{
-   RING_IDX prod, cons;
-   struct sk_buff *skb;
-   int needed;
-
-   skb = skb_peek(&queue->rx_queue);
-   if (!skb)
-   return false;
-
-   needed = DIV_ROUND_UP(skb->len, XEN_PAGE_SIZE);
-   if (skb_is_gso(skb))
-   needed++;
-   if (skb->sw_hash)
-   needed++;
-
-   do {
-   prod = queue->rx.sring->req_prod;
-   cons = queue->rx.req_cons;
-
-   if (prod - cons >= needed)
-   return true;
-
-   queue->rx.sring->req_event = prod + 1;
-
-   /* Make sure event is visible before we check prod
-* again.
-*/
-   mb();
-   } while (queue->rx.sring->req_prod != prod);
-
-   return false;
-}
-
-void xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff *skb)
-{
-   unsigned long flags;
-
-   spin_lock_irqsave(&queue->rx_queue.lock, flags);
-
-   __skb_queue_tail(&queue->rx_queue, skb);
-
-   queue->rx_queue_len += skb->len;
-   if (queue->rx_queue_len > queue->rx_queue_max)
-   netif_tx_stop_queue(netdev_get_tx_queue(queue->vif->dev, queue->id));
-
-   spin_unlock_irqrestore(&queue->rx_queue.lock, flags);
-}
-
-static struct sk_buff *xenvif_rx_dequeue(struct xenvif_queue *queue)
-{
-   struct sk_buff *skb;
-
-   spin_lock_irq(&queue->rx_queue.lock);
-
-   skb = __skb_dequeue(&queue->rx_queue);
-   if (skb)
-   queue->rx_queue_len -= skb->len;
-
-   spin_unlock_irq(&queue->rx_queue.lock);
-
-   return skb;
-}
-
-static void xenvif_rx_queue_maybe_wake(struct xenvif_queue *queue)
-{
-   spin_lock_irq(&queue->rx_queue.lock);
-
-   if (queue->rx_queue_len < queue->rx_queue_max)
-   netif_tx_wake_queue(netdev_get_tx_queue(queue->vif->dev, queue->id));
-
-   spin_unlock_irq(&queue->rx_queue.lock);
-}
-
-
-static void xenvif_rx_queue_purge(struct xenvif_queue *queue)
-{
-   struct sk_buff *skb;
-   while ((skb = xenvif_rx_dequeue(queue)) != NULL)
-   kfree_skb(skb);
-}
-
-static void xenvif_rx_queue_drop_expired(struct xenvif_queue *queue)
-{
-   struct sk_buff *skb;
-
-   for(;;) {
-   skb = skb_peek(&queue->rx_queue);
-   if (!skb)
-   break;
-   if (time_before(jiffies, XENVIF_RX_CB(skb)->expires))
-   break;
-   xenvif_rx_dequeue(queue);
-   kfree_skb(skb);
-   }
-}
-
-struct netrx_pending_operations {
-   unsigned copy_prod, copy_cons;
-   unsigned meta_prod, meta_cons;
-   struct gnttab_copy *copy;
-   struct xenvif_rx_meta *meta;
-   int copy_off;
-   grant_ref_t copy_gref;
-};
-
-stati

[PATCH net-next 3/7] xen-netback: refactor guest rx

2016-10-03 Thread Paul Durrant

From: David Vrabel <david.vra...@citrix.com>

Refactor the to-guest (rx) path to:

1. Push responses for completed skbs earlier, reducing latency.

2. Reduce the per-queue memory overhead by greatly reducing the
   maximum number of grant copy ops in each hypercall (from 4352 to
   64).  Each struct xenvif_queue is now only 44 kB instead of 220 kB.

3. Make the code more maintainable.

Signed-off-by: David Vrabel <david.vra...@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/common.h |  23 +-
 drivers/net/xen-netback/rx.c | 654 +++
 2 files changed, 254 insertions(+), 423 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index e16004a..adef482 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -91,13 +91,6 @@ struct xenvif_rx_meta {
  */
 #define MAX_XEN_SKB_FRAGS (65536 / XEN_PAGE_SIZE + 1)
 
-/* It's possible for an skb to have a maximal number of frags
- * but still be less than MAX_BUFFER_OFFSET in size. Thus the
- * worst-case number of copy operations is MAX_XEN_SKB_FRAGS per
- * ring slot.
- */
-#define MAX_GRANT_COPY_OPS (MAX_XEN_SKB_FRAGS * XEN_NETIF_RX_RING_SIZE)
-
 #define NETBACK_INVALID_HANDLE -1
 
 /* To avoid confusion, we define XEN_NETBK_LEGACY_SLOTS_MAX indicating
@@ -133,6 +126,14 @@ struct xenvif_stats {
unsigned long tx_frag_overflow;
 };
 
+#define COPY_BATCH_SIZE 64
+
+struct xenvif_copy_state {
+   struct gnttab_copy op[COPY_BATCH_SIZE];
+   RING_IDX idx[COPY_BATCH_SIZE];
+   unsigned int num;
+};
+
 struct xenvif_queue { /* Per-queue data for xenvif */
unsigned int id; /* Queue ID, 0-based */
char name[QUEUE_NAME_SIZE]; /* DEVNAME-qN */
@@ -189,12 +190,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
unsigned long last_rx_time;
bool stalled;
 
-   struct gnttab_copy grant_copy_op[MAX_GRANT_COPY_OPS];
-
-   /* We create one meta structure per ring request we consume, so
-* the maximum number is the same as the ring size.
-*/
-   struct xenvif_rx_meta meta[XEN_NETIF_RX_RING_SIZE];
+   struct xenvif_copy_state rx_copy;
 
/* Transmit shaping: allow 'credit_bytes' every 'credit_usec'. */
unsigned long   credit_bytes;
@@ -360,6 +356,7 @@ int xenvif_dealloc_kthread(void *data);
 
 int xenvif_ctrl_kthread(void *data);
 
+void xenvif_rx_action(struct xenvif_queue *queue);
 void xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff *skb);
 
 void xenvif_carrier_on(struct xenvif *vif);
diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index 6bd7d6e..b0ce4c6 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -26,7 +26,6 @@
  * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
  * IN THE SOFTWARE.
  */
-
 #include "common.h"
 
 #include 
@@ -137,464 +136,299 @@ static void xenvif_rx_queue_drop_expired(struct xenvif_queue *queue)
}
 }
 
-struct netrx_pending_operations {
-   unsigned int copy_prod, copy_cons;
-   unsigned int meta_prod, meta_cons;
-   struct gnttab_copy *copy;
-   struct xenvif_rx_meta *meta;
-   int copy_off;
-   grant_ref_t copy_gref;
-};
-
-static struct xenvif_rx_meta *get_next_rx_buffer(
-   struct xenvif_queue *queue,
-   struct netrx_pending_operations *npo)
+static void xenvif_rx_copy_flush(struct xenvif_queue *queue)
 {
-   struct xenvif_rx_meta *meta;
-   struct xen_netif_rx_request req;
+   unsigned int i;
 
-   RING_COPY_REQUEST(&queue->rx, queue->rx.req_cons++, &req);
+   gnttab_batch_copy(queue->rx_copy.op, queue->rx_copy.num);
 
-   meta = npo->meta + npo->meta_prod++;
-   meta->gso_type = XEN_NETIF_GSO_TYPE_NONE;
-   meta->gso_size = 0;
-   meta->size = 0;
-   meta->id = req.id;
+   for (i = 0; i < queue->rx_copy.num; i++) {
+   struct gnttab_copy *op;
 
-   npo->copy_off = 0;
-   npo->copy_gref = req.gref;
+   op = &queue->rx_copy.op[i];
 
-   return meta;
+   /* If the copy failed, overwrite the status field in
+* the corresponding response.
+*/
+   if (unlikely(op->status != GNTST_okay)) {
+   struct xen_netif_rx_response *rsp;
+
+   rsp = RING_GET_RESPONSE(&queue->rx,
+   queue->rx_copy.idx[i]);
+   rsp->status = op->status;
+   }
+   }
+
+   queue->rx_copy.num = 0;
 }
 
-struct gop_frag_copy {
-   struct xenvif_queue *queue;
-   struct netrx_pending_operations *npo;
-   struct xenvif_rx_meta *meta;
-   int head;
-   int gso_type;
-   int protocol;
-   int

[PATCH net-next 6/7] xen-netback: batch copies for multiple to-guest rx packets

2016-10-03 Thread Paul Durrant

From: David Vrabel <david.vra...@citrix.com>

Instead of flushing the copy ops when a packet is complete, complete
packets when their copy ops are done.  This improves performance by
reducing the number of grant copy hypercalls.

Latency is still limited by the relatively small size of the copy
batch.

Signed-off-by: David Vrabel <david.vra...@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/common.h |  1 +
 drivers/net/xen-netback/rx.c | 27 +--
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index adef482..5d40603 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -132,6 +132,7 @@ struct xenvif_copy_state {
struct gnttab_copy op[COPY_BATCH_SIZE];
RING_IDX idx[COPY_BATCH_SIZE];
unsigned int num;
+   struct sk_buff_head *completed;
 };
 
 struct xenvif_queue { /* Per-queue data for xenvif */
diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index ae822b8..8c8c5b5 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -133,6 +133,7 @@ static void xenvif_rx_queue_drop_expired(struct xenvif_queue *queue)
 static void xenvif_rx_copy_flush(struct xenvif_queue *queue)
 {
unsigned int i;
+   int notify;
 
gnttab_batch_copy(queue->rx_copy.op, queue->rx_copy.num);
 
@@ -154,6 +155,13 @@ static void xenvif_rx_copy_flush(struct xenvif_queue *queue)
}
 
queue->rx_copy.num = 0;
+
+   /* Push responses for all completed packets. */
+   RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&queue->rx, notify);
+   if (notify)
+   notify_remote_via_irq(queue->rx_irq);
+
+   __skb_queue_purge(queue->rx_copy.completed);
 }
 
 static void xenvif_rx_copy_add(struct xenvif_queue *queue,
@@ -279,18 +287,10 @@ static void xenvif_rx_next_skb(struct xenvif_queue *queue,
 static void xenvif_rx_complete(struct xenvif_queue *queue,
   struct xenvif_pkt_state *pkt)
 {
-   int notify;
-
-   /* Complete any outstanding copy ops for this skb. */
-   xenvif_rx_copy_flush(queue);
-
-   /* Push responses and notify. */
+   /* All responses are ready to be pushed. */
queue->rx.rsp_prod_pvt = queue->rx.req_cons;
-   RING_PUSH_RESPONSES_AND_CHECK_NOTIFY(&queue->rx, notify);
-   if (notify)
-   notify_remote_via_irq(queue->rx_irq);
 
-   dev_kfree_skb(pkt->skb);
+   __skb_queue_tail(queue->rx_copy.completed, pkt->skb);
 }
 
 static void xenvif_rx_next_chunk(struct xenvif_queue *queue,
@@ -429,13 +429,20 @@ void xenvif_rx_skb(struct xenvif_queue *queue)
 
 void xenvif_rx_action(struct xenvif_queue *queue)
 {
+   struct sk_buff_head completed_skbs;
unsigned int work_done = 0;
 
+   __skb_queue_head_init(&completed_skbs);
+   queue->rx_copy.completed = &completed_skbs;
+
while (xenvif_rx_ring_slots_available(queue) &&
   work_done < RX_BATCH_SIZE) {
xenvif_rx_skb(queue);
work_done++;
}
+
+   /* Flush any pending copies and complete all skbs. */
+   xenvif_rx_copy_flush(queue);
 }
 
 static bool xenvif_rx_queue_stalled(struct xenvif_queue *queue)
-- 
2.1.4
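
The effect of the batching described in the commit message above can be modelled in user space. The sketch below is only an illustration under stated assumptions: struct copy_batch, batch_add(), batch_flush() and batch_packet_done() are hypothetical names, an int stands in for struct gnttab_copy, a counter stands in for the grant copy hypercall, and the batch size is deliberately tiny. It is not the xen-netback code.

```c
#include <assert.h>
#include <stddef.h>

#define COPY_BATCH_SIZE 4	/* hypothetical; small so flushes are easy to see */

struct copy_batch {
	int op[COPY_BATCH_SIZE];	/* stand-in for struct gnttab_copy */
	unsigned int num;		/* ops queued since the last flush */
	unsigned int flushes;		/* pretend hypercalls issued so far */
	unsigned int pending_pkts;	/* packets waiting for their ops */
	unsigned int completed_pkts;	/* packets whose responses were pushed */
};

/* Issue all queued ops in one go, then complete every waiting packet. */
static void batch_flush(struct copy_batch *b)
{
	if (b->num > 0)
		b->flushes++;	/* one pretend hypercall for the whole batch */
	b->num = 0;

	b->completed_pkts += b->pending_pkts;
	b->pending_pkts = 0;
}

/* Queue one copy op; flush only when the batch is full. */
static void batch_add(struct copy_batch *b, int op)
{
	assert(b->num < COPY_BATCH_SIZE);
	b->op[b->num++] = op;

	if (b->num == COPY_BATCH_SIZE)
		batch_flush(b);
}

/* A packet has queued all of its ops: do not flush, just remember it. */
static void batch_packet_done(struct copy_batch *b)
{
	b->pending_pkts++;
}
```

With three packets of two ops each, this model issues two flushes (one when the batch fills, one final flush by the caller) where a flush per packet would issue three.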

[PATCH net-next 5/7] xen-netback: process guest rx packets in batches

2016-10-03 Thread Paul Durrant

From: David Vrabel <david.vra...@citrix.com>

Instead of only placing one skb on the guest rx ring at a time, process
a batch of up to 64.  This improves performance by ~10% in some tests.

Signed-off-by: David Vrabel <david.vra...@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/rx.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index 9548709..ae822b8 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -399,7 +399,7 @@ static void xenvif_rx_extra_slot(struct xenvif_queue *queue,
BUG();
 }
 
-void xenvif_rx_action(struct xenvif_queue *queue)
+void xenvif_rx_skb(struct xenvif_queue *queue)
 {
struct xenvif_pkt_state pkt;
 
@@ -425,6 +425,19 @@ void xenvif_rx_action(struct xenvif_queue *queue)
xenvif_rx_complete(queue, &pkt);
 }
 
+#define RX_BATCH_SIZE 64
+
+void xenvif_rx_action(struct xenvif_queue *queue)
+{
+   unsigned int work_done = 0;
+
+   while (xenvif_rx_ring_slots_available(queue) &&
+  work_done < RX_BATCH_SIZE) {
+   xenvif_rx_skb(queue);
+   work_done++;
+   }
+}
+
 static bool xenvif_rx_queue_stalled(struct xenvif_queue *queue)
 {
RING_IDX prod, cons;
-- 
2.1.4
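
The bounded loop added by the patch above can be tried out in isolation. This is a minimal user-space sketch, assuming a toy queue: struct toy_queue, slots_available(), deliver_one() and rx_action() are hypothetical stand-ins, and only the RX_BATCH_SIZE limit of 64 is taken from the patch.

```c
#include <assert.h>

#define RX_BATCH_SIZE 64	/* same limit as in the patch */

struct toy_queue {
	unsigned int backlog;		/* packets waiting to be delivered */
	unsigned int ring_slots;	/* free slots on the pretend ring */
};

/* True while there is both work to do and room to do it. */
static int slots_available(const struct toy_queue *q)
{
	return q->backlog > 0 && q->ring_slots > 0;
}

/* Deliver exactly one packet, consuming one ring slot. */
static void deliver_one(struct toy_queue *q)
{
	assert(q->backlog > 0 && q->ring_slots > 0);
	q->backlog--;
	q->ring_slots--;
}

/* Deliver at most RX_BATCH_SIZE packets per call; return the count. */
static unsigned int rx_action(struct toy_queue *q)
{
	unsigned int work_done = 0;

	while (slots_available(q) && work_done < RX_BATCH_SIZE) {
		deliver_one(q);
		work_done++;
	}

	return work_done;
}
```

The cap keeps a single call from monopolising the thread when the backlog is large; the caller is expected to call again while work remains.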

[PATCH net-next 4/7] xen-netback: immediately wake tx queue when guest rx queue has space

2016-10-03 Thread Paul Durrant

From: David Vrabel <david.vra...@citrix.com>

When an skb is removed from the guest rx queue, wake the tx queue
immediately, rather than waiting until after the rx queue has been
processed.

Signed-off-by: David Vrabel <david.vra...@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/rx.c | 24 
 1 file changed, 8 insertions(+), 16 deletions(-)

diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index b0ce4c6..9548709 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -92,27 +92,21 @@ static struct sk_buff *xenvif_rx_dequeue(struct xenvif_queue *queue)
spin_lock_irq(&queue->rx_queue.lock);
 
skb = __skb_dequeue(&queue->rx_queue);
-   if (skb)
+   if (skb) {
queue->rx_queue_len -= skb->len;
+   if (queue->rx_queue_len < queue->rx_queue_max) {
+   struct netdev_queue *txq;
+
+   txq = netdev_get_tx_queue(queue->vif->dev, queue->id);
+   netif_tx_wake_queue(txq);
+   }
+   }
 
spin_unlock_irq(&queue->rx_queue.lock);
 
return skb;
 }
 
-static void xenvif_rx_queue_maybe_wake(struct xenvif_queue *queue)
-{
-   spin_lock_irq(&queue->rx_queue.lock);
-
-   if (queue->rx_queue_len < queue->rx_queue_max) {
-   struct net_device *dev = queue->vif->dev;
-
-   netif_tx_wake_queue(netdev_get_tx_queue(dev, queue->id));
-   }
-
-   spin_unlock_irq(&queue->rx_queue.lock);
-}
-
 static void xenvif_rx_queue_purge(struct xenvif_queue *queue)
 {
struct sk_buff *skb;
@@ -585,8 +579,6 @@ int xenvif_kthread_guest_rx(void *data)
 */
xenvif_rx_queue_drop_expired(queue);
 
-   xenvif_rx_queue_maybe_wake(queue);
-
cond_resched();
}
 
-- 
2.1.4
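
The wake-on-dequeue idea in the patch above can be sketched without any kernel APIs. Everything named toy_* below is hypothetical, locking is omitted, and the stop condition in toy_enqueue() is an assumption of this sketch rather than something shown in the diff. Unlike the patch, which calls netif_tx_wake_queue() whenever the queue is below its limit, the sketch only counts a wakeup when the producer was actually stopped.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pkt {
	size_t len;
	struct toy_pkt *next;
};

struct toy_rx_queue {
	struct toy_pkt *head, *tail;
	size_t len_bytes;	/* bytes currently queued */
	size_t max_bytes;	/* producer is stopped above this */
	int producer_stopped;
	unsigned int wakeups;
};

/* Append a packet; stop the producer once the byte limit is exceeded. */
static void toy_enqueue(struct toy_rx_queue *q, struct toy_pkt *pkt)
{
	assert(pkt != NULL);
	pkt->next = NULL;
	if (q->tail)
		q->tail->next = pkt;
	else
		q->head = pkt;
	q->tail = pkt;

	q->len_bytes += pkt->len;
	if (q->len_bytes > q->max_bytes)
		q->producer_stopped = 1;
}

/* Remove the oldest packet and wake the producer as soon as there is room. */
static struct toy_pkt *toy_dequeue(struct toy_rx_queue *q)
{
	struct toy_pkt *pkt = q->head;

	if (pkt) {
		q->head = pkt->next;
		if (!q->head)
			q->tail = NULL;

		q->len_bytes -= pkt->len;
		if (q->len_bytes < q->max_bytes && q->producer_stopped) {
			q->producer_stopped = 0;
			q->wakeups++;
		}
	}

	return pkt;
}
```

Doing the check inside the dequeue path means the producer can resume while the consumer is still working through the packet it just removed.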

[PATCH net-next 2/7] xen-netback: retire guest rx side prefix GSO feature

2016-10-03 Thread Paul Durrant

As far as I am aware only very old Windows network frontends make use of
this style of passing GSO packets from backend to frontend. These
frontends can easily be replaced by the freely available Xen Project
Windows PV network frontend, which uses the 'default' mechanism for
passing GSO packets, which is also used by all Linux frontends.

NOTE: Removal of this feature will not cause breakage in old Windows
  frontends. They simply will no longer receive GSO packets - the
  packets instead being fragmented in the backend.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
---
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/common.h|  1 -
 drivers/net/xen-netback/interface.c |  4 ++--
 drivers/net/xen-netback/rx.c| 26 --
 drivers/net/xen-netback/xenbus.c| 21 -
 4 files changed, 2 insertions(+), 50 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 3a56268..e16004a 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -260,7 +260,6 @@ struct xenvif {
 
/* Frontend feature information. */
int gso_mask;
-   int gso_prefix_mask;
 
u8 can_sg:1;
u8 ip_csum:1;
diff --git a/drivers/net/xen-netback/interface.c 
b/drivers/net/xen-netback/interface.c
index 83deeeb..1a009e7 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -328,9 +328,9 @@ static netdev_features_t xenvif_fix_features(struct 
net_device *dev,
 
if (!vif->can_sg)
features &= ~NETIF_F_SG;
-   if (~(vif->gso_mask | vif->gso_prefix_mask) & GSO_BIT(TCPV4))
+   if (~(vif->gso_mask) & GSO_BIT(TCPV4))
features &= ~NETIF_F_TSO;
-   if (~(vif->gso_mask | vif->gso_prefix_mask) & GSO_BIT(TCPV6))
+   if (~(vif->gso_mask) & GSO_BIT(TCPV6))
features &= ~NETIF_F_TSO6;
if (!vif->ip_csum)
features &= ~NETIF_F_IP_CSUM;
diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index 03836aa..6bd7d6e 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -347,16 +347,6 @@ static int xenvif_gop_skb(struct sk_buff *skb,
gso_type = XEN_NETIF_GSO_TYPE_TCPV6;
}
 
-   /* Set up a GSO prefix descriptor, if necessary */
-   if ((1 << gso_type) & vif->gso_prefix_mask) {
-   RING_COPY_REQUEST(&queue->rx, queue->rx.req_cons++, &req);
-   meta = npo->meta + npo->meta_prod++;
-   meta->gso_type = gso_type;
-   meta->gso_size = skb_shinfo(skb)->gso_size;
-   meta->size = 0;
-   meta->id = req.id;
-   }
-
RING_COPY_REQUEST(&queue->rx, queue->rx.req_cons++, &req);
meta = npo->meta + npo->meta_prod++;
 
@@ -511,22 +501,6 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
while ((skb = __skb_dequeue()) != NULL) {
struct xen_netif_extra_info *extra = NULL;
 
-   if ((1 << queue->meta[npo.meta_cons].gso_type) &
-   vif->gso_prefix_mask) {
-   resp = RING_GET_RESPONSE(&queue->rx,
-queue->rx.rsp_prod_pvt++);
-
-   resp->flags = XEN_NETRXF_gso_prefix |
- XEN_NETRXF_more_data;
-
-   resp->offset = queue->meta[npo.meta_cons].gso_size;
-   resp->id = queue->meta[npo.meta_cons].id;
-   resp->status = XENVIF_RX_CB(skb)->meta_slots_used;
-
-   npo.meta_cons++;
-   XENVIF_RX_CB(skb)->meta_slots_used--;
-   }
-
queue->stats.tx_bytes += skb->len;
queue->stats.tx_packets++;
 
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index bacf6e0..6c57b02 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -1154,7 +1154,6 @@ static int read_xenbus_vif_flags(struct backend_info *be)
vif->can_sg = !!val;
 
vif->gso_mask = 0;
-   vif->gso_prefix_mask = 0;
 
if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4",
 "%d", &val) < 0)
@@ -1162,32 +1161,12 @@ static int read_xenbus_vif_flags(struct backend_info 
*be)
if (val)
vif->gso_mask |= GSO_BIT(TCPV4);
 
-   if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4-prefix",
-"%d", &val) < 0)
-   val = 0;
-   if (val)
-   vif->gso_prefix_mask |= GSO_BIT(TCPV4);
-
if (xenbus_scanf(XBT_NIL, dev->otherend, "featur

RE: [Xen-devel] [PATCH resend] xen-netback: switch to threaded irq for control ring

2016-09-22 Thread Paul Durrant

> -Original Message-
> From: Juergen Gross [mailto:jgr...@suse.com]
> Sent: 22 September 2016 11:39
> To: Paul Durrant <paul.durr...@citrix.com>; xen-de...@lists.xenproject.org;
> netdev@vger.kernel.org; linux-ker...@vger.kernel.org
> Cc: Wei Liu <wei.l...@citrix.com>
> Subject: Re: [Xen-devel] [PATCH resend] xen-netback: switch to threaded irq
> for control ring
> 
> On 22/09/16 12:31, Paul Durrant wrote:
> >> -Original Message-
> >> From: Juergen Gross [mailto:jgr...@suse.com]
> >> Sent: 22 September 2016 11:17
> >> To: Paul Durrant <paul.durr...@citrix.com>;
> >> xen-de...@lists.xenproject.org; net...@vger.kernel.orga
> >> <netdev@vger.kernel.org>; linux- ker...@vger.kernel.org
> >> Cc: Wei Liu <wei.l...@citrix.com>
> >> Subject: Re: [Xen-devel] [PATCH resend] xen-netback: switch to
> >> threaded irq for control ring
> >>
> >> On 22/09/16 11:09, Paul Durrant wrote:
> >>>> -Original Message-
> >>>> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf
> >>>> Of Juergen Gross
> >>>> Sent: 22 September 2016 10:03
> >>>> To: xen-de...@lists.xenproject.org; net...@vger.kernel.orga; linux-
> >>>> ker...@vger.kernel.org
> >>>> Cc: Juergen Gross <jgr...@suse.com>; Wei Liu <wei.l...@citrix.com>
> >>>> Subject: [Xen-devel] [PATCH resend] xen-netback: switch to threaded
> >>>> irq for control ring
> >>>>
> >>>> Instead of open coding it use the threaded irq mechanism in xen-
> netback.
> >>>>
> >>>> Signed-off-by: Juergen Gross <jgr...@suse.com>
> >>>
> >>> How have you tested this change?
> >>
> >> Only compile-tested and loaded the module. As this feature isn't
> >> being used in linux netfront AFAIK it is not easily testable. OTOH
> >> the code modification is rather limited and I've used the threaded
> >> irq in the Xen scsiback driver myself, so I'm rather confident it will 
> >> work.
> >>
> >
> > OK. How about doing the rx interrupt/task too so that it can be easily
> tested with a linux netfront?
> 
> I'd like to, but this would require some more work. The rx kthread isn't
> activated by an event only, but by a timer, too. This isn't easy to map to the
> threaded irq framework. If, however, you are confident the timer isn't really
> necessary I'd be happy to provide a patch switching the rx task to the
> threaded irq, too.
> 
> And to be honest: this wouldn't verify that the control ring related patch is
> really working. The mechanism itself _is_ working as it is already in use in
> xen-scsiback in a very similar environment.
> 

Ok. If you have confidence then...

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> 
> Juergen

RE: [Xen-devel] [PATCH resend] xen-netback: switch to threaded irq for control ring

2016-09-22 Thread Paul Durrant

> -Original Message-
> From: Juergen Gross [mailto:jgr...@suse.com]
> Sent: 22 September 2016 11:17
> To: Paul Durrant <paul.durr...@citrix.com>; xen-de...@lists.xenproject.org;
> net...@vger.kernel.orga <netdev@vger.kernel.org>; linux-
> ker...@vger.kernel.org
> Cc: Wei Liu <wei.l...@citrix.com>
> Subject: Re: [Xen-devel] [PATCH resend] xen-netback: switch to threaded irq
> for control ring
> 
> On 22/09/16 11:09, Paul Durrant wrote:
> >> -Original Message-
> >> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> >> Juergen Gross
> >> Sent: 22 September 2016 10:03
> >> To: xen-de...@lists.xenproject.org; net...@vger.kernel.orga; linux-
> >> ker...@vger.kernel.org
> >> Cc: Juergen Gross <jgr...@suse.com>; Wei Liu <wei.l...@citrix.com>
> >> Subject: [Xen-devel] [PATCH resend] xen-netback: switch to threaded
> >> irq for control ring
> >>
> >> Instead of open coding it use the threaded irq mechanism in xen-netback.
> >>
> >> Signed-off-by: Juergen Gross <jgr...@suse.com>
> >
> > How have you tested this change?
> 
> Only compile-tested and loaded the module. As this feature isn't being used
> in linux netfront AFAIK it is not easily testable. OTOH the code modification 
> is
> rather limited and I've used the threaded irq in the Xen scsiback driver
> myself, so I'm rather confident it will work.
> 

OK. How about doing the rx interrupt/task too so that it can be easily tested 
with a linux netfront?

  Paul

> 
> Juergen

[PATCH net-next] xen-netback: create a debugfs node for hash information

2016-08-17 Thread Paul Durrant

It is useful to be able to see the hash configuration when running tests.
This patch adds a debugfs node for that purpose.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/common.h |  4 +++
 drivers/net/xen-netback/hash.c   | 68 
 drivers/net/xen-netback/xenbus.c | 37 --
 3 files changed, 107 insertions(+), 2 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 84d6cbd..3a56268 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -412,4 +412,8 @@ u32 xenvif_set_hash_mapping(struct xenvif *vif, u32 gref, 
u32 len,
 
 void xenvif_set_skb_hash(struct xenvif *vif, struct sk_buff *skb);
 
+#ifdef CONFIG_DEBUG_FS
+void xenvif_dump_hash_info(struct xenvif *vif, struct seq_file *m);
+#endif
+
 #endif /* __XEN_NETBACK__COMMON_H__ */
diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-netback/hash.c
index fb87cb3..282b16d 100644
--- a/drivers/net/xen-netback/hash.c
+++ b/drivers/net/xen-netback/hash.c
@@ -369,6 +369,74 @@ u32 xenvif_set_hash_mapping(struct xenvif *vif, u32 gref, 
u32 len,
return XEN_NETIF_CTRL_STATUS_SUCCESS;
 }
 
+#ifdef CONFIG_DEBUG_FS
+void xenvif_dump_hash_info(struct xenvif *vif, struct seq_file *m)
+{
+   unsigned int i;
+
+   switch (vif->hash.alg) {
+   case XEN_NETIF_CTRL_HASH_ALGORITHM_TOEPLITZ:
+   seq_puts(m, "Hash Algorithm: TOEPLITZ\n");
+   break;
+
+   case XEN_NETIF_CTRL_HASH_ALGORITHM_NONE:
+   seq_puts(m, "Hash Algorithm: NONE\n");
+   /* FALLTHRU */
+   default:
+   return;
+   }
+
+   if (vif->hash.flags) {
+   seq_puts(m, "\nHash Flags:\n");
+
+   if (vif->hash.flags & XEN_NETIF_CTRL_HASH_TYPE_IPV4)
+   seq_puts(m, "- IPv4\n");
+   if (vif->hash.flags & XEN_NETIF_CTRL_HASH_TYPE_IPV4_TCP)
+   seq_puts(m, "- IPv4 + TCP\n");
+   if (vif->hash.flags & XEN_NETIF_CTRL_HASH_TYPE_IPV6)
+   seq_puts(m, "- IPv6\n");
+   if (vif->hash.flags & XEN_NETIF_CTRL_HASH_TYPE_IPV6_TCP)
+   seq_puts(m, "- IPv6 + TCP\n");
+   }
+
+   seq_puts(m, "\nHash Key:\n");
+
+   for (i = 0; i < XEN_NETBK_MAX_HASH_KEY_SIZE; ) {
+   unsigned int j, n;
+
+   n = 8;
+   if (i + n >= XEN_NETBK_MAX_HASH_KEY_SIZE)
+   n = XEN_NETBK_MAX_HASH_KEY_SIZE - i;
+
+   seq_printf(m, "[%2u - %2u]: ", i, i + n - 1);
+
+   for (j = 0; j < n; j++, i++)
+   seq_printf(m, "%02x ", vif->hash.key[i]);
+
+   seq_puts(m, "\n");
+   }
+
+   if (vif->hash.size != 0) {
+   seq_puts(m, "\nHash Mapping:\n");
+
+   for (i = 0; i < vif->hash.size; ) {
+   unsigned int j, n;
+
+   n = 8;
+   if (i + n >= vif->hash.size)
+   n = vif->hash.size - i;
+
+   seq_printf(m, "[%4u - %4u]: ", i, i + n - 1);
+
+   for (j = 0; j < n; j++, i++)
+   seq_printf(m, "%4u ", vif->hash.mapping[i]);
+
+   seq_puts(m, "\n");
+   }
+   }
+}
+#endif /* CONFIG_DEBUG_FS */
+
 void xenvif_init_hash(struct xenvif *vif)
 {
if (xenvif_hash_cache_size == 0)
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 6a31f26..bacf6e0 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -165,7 +165,7 @@ xenvif_write_io_ring(struct file *filp, const char __user 
*buf, size_t count,
return count;
 }
 
-static int xenvif_dump_open(struct inode *inode, struct file *filp)
+static int xenvif_io_ring_open(struct inode *inode, struct file *filp)
 {
int ret;
void *queue = NULL;
@@ -179,13 +179,35 @@ static int xenvif_dump_open(struct inode *inode, struct 
file *filp)
 
 static const struct file_operations xenvif_dbg_io_ring_ops_fops = {
.owner = THIS_MODULE,
-   .open = xenvif_dump_open,
+   .open = xenvif_io_ring_open,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
.write = xenvif_write_io_ring,
 };
 
+static int xenvif_read_ctrl(struct seq_file *m, void *v)
+{
+   struct xenvif *vif = m->private;
+
+   xenvif_dump_hash_info(vif, m);
+
+   return 0;
+}
+
+static int xenvif_ctrl_open(struct inode *inode, struct file *filp)
+{
+   return single_open(filp, xenvif_read_ctrl, inode-&

RE: [Xen-devel] [PATCH] xen-netback: correct return value checks on xenbus_scanf()

2016-07-07 Thread Paul Durrant

> -Original Message-
> From: netdev-ow...@vger.kernel.org [mailto:netdev-
> ow...@vger.kernel.org] On Behalf Of David Vrabel
> Sent: 07 July 2016 11:45
> To: Wei Liu; David Vrabel
> Cc: xen-de...@lists.xenproject.org; Jan Beulich; netdev@vger.kernel.org
> Subject: Re: [Xen-devel] [PATCH] xen-netback: correct return value checks
> on xenbus_scanf()
> 
> On 07/07/16 11:35, Wei Liu wrote:
> > On Thu, Jul 07, 2016 at 10:58:16AM +0100, David Vrabel wrote:
> >> On 07/07/16 08:57, Jan Beulich wrote:
> >>> Only a positive return value indicates success.
> >>
> >> This is not correct.
> >>
> >
> > Do you mean the commit message is not correct or the code is not
> > correct? If it is the former, do you have any suggestion to fix it?
> 
> This code is correct as-is, thus the commit message is wrong or misleading.
> 

Is that true? Jan is correct in saying that only >0 is an indicator of success 
according to the usual semantics of sscanf(). Personally I think the code would 
be clearer if the checks for failure were < 1 rather than <= 0.

  Paul

> David
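
For reference, the scanf(3) semantics referred to above can be checked in user space with the standard C sscanf(). The helper parse_int_or() below is a hypothetical name used only for illustration; the sketch says nothing about what xenbus_scanf() itself returns, which is the point under discussion in this thread.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse a decimal integer from s. Standard sscanf() returns the number of
 * items converted, so anything below 1 (0 for a matching failure, EOF for
 * empty input) is treated as failure and the fallback is returned.
 */
static int parse_int_or(const char *s, int fallback)
{
	int val;

	assert(s != NULL);

	if (sscanf(s, "%d", &val) < 1)
		return fallback;

	return val;
}
```

Writing the failure test as "< 1" makes it explicit that exactly one converted item is the only successful outcome for a single "%d".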

RE: [Xen-devel] [PATCH] xen-netback: correct return value checks on xenbus_scanf()

2016-07-07 Thread Paul Durrant

> -Original Message-
> From: Paul Durrant
> Sent: 07 July 2016 11:41
> To: Wei Liu; David Vrabel
> Cc: Jan Beulich; Wei Liu; xen-de...@lists.xenproject.org;
> netdev@vger.kernel.org
> Subject: RE: [Xen-devel] [PATCH] xen-netback: correct return value checks
> on xenbus_scanf()
> 
> > -Original Message-
> > From: netdev-ow...@vger.kernel.org [mailto:netdev-
> > ow...@vger.kernel.org] On Behalf Of Wei Liu
> > Sent: 07 July 2016 11:35
> > To: David Vrabel
> > Cc: Jan Beulich; Wei Liu; xen-de...@lists.xenproject.org;
> > netdev@vger.kernel.org
> > Subject: Re: [Xen-devel] [PATCH] xen-netback: correct return value checks
> > on xenbus_scanf()
> >
> > On Thu, Jul 07, 2016 at 10:58:16AM +0100, David Vrabel wrote:
> > > On 07/07/16 08:57, Jan Beulich wrote:
> > > > Only a positive return value indicates success.
> > >
> > > This is not correct.
> > >
> 
> If Xen's vsscanf follows the semantics of scanf(3) then 0 is a failure so I 
> think
> the comment is correct.
> 

s/Xen/the kernel/

>   Paul
> 
> >
> > Do you mean the commit message is not correct or the code is not
> > correct? If it is the former, do you have any suggestion to fix it?
> >
> > (I was going to just ack this because Paul already reviewed it)
> >
> > Wei.
> >
> > > David

RE: [Xen-devel] [PATCH] xen-netback: correct return value checks on xenbus_scanf()

2016-07-07 Thread Paul Durrant

> -Original Message-
> From: netdev-ow...@vger.kernel.org [mailto:netdev-
> ow...@vger.kernel.org] On Behalf Of Wei Liu
> Sent: 07 July 2016 11:35
> To: David Vrabel
> Cc: Jan Beulich; Wei Liu; xen-de...@lists.xenproject.org;
> netdev@vger.kernel.org
> Subject: Re: [Xen-devel] [PATCH] xen-netback: correct return value checks
> on xenbus_scanf()
> 
> On Thu, Jul 07, 2016 at 10:58:16AM +0100, David Vrabel wrote:
> > On 07/07/16 08:57, Jan Beulich wrote:
> > > Only a positive return value indicates success.
> >
> > This is not correct.
> >

If Xen's vsscanf follows the semantics of scanf(3) then 0 is a failure so I 
think the comment is correct.

  Paul

> 
> Do you mean the commit message is not correct or the code is not
> correct? If it is the former, do you have any suggestion to fix it?
> 
> (I was going to just ack this because Paul already reviewed it)
> 
> Wei.
> 
> > David

RE: [Xen-devel] [PATCH] xen-netback: prefer xenbus_write() over xenbus_printf() where possible

2016-07-07 Thread Paul Durrant

> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Jan
> Beulich
> Sent: 07 July 2016 08:58
> To: Wei Liu
> Cc: xen-de...@lists.xenproject.org; netdev@vger.kernel.org
> Subject: [Xen-devel] [PATCH] xen-netback: prefer xenbus_write() over
> xenbus_printf() where possible
> 
> ... as being the simpler variant.
> 
> Signed-off-by: Jan Beulich <jbeul...@suse.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  drivers/net/xen-netback/xenbus.c |   24 +---
>  1 file changed, 9 insertions(+), 15 deletions(-)
> 
> --- 4.7-rc6-prefer-xenbus_write.orig/drivers/net/xen-netback/xenbus.c
> +++ 4.7-rc6-prefer-xenbus_write/drivers/net/xen-netback/xenbus.c
> @@ -301,17 +301,15 @@ static int netback_probe(struct xenbus_d
>   }
> 
>   /* We support partial checksum setup for IPv6 packets */
> - err = xenbus_printf(xbt, dev->nodename,
> - "feature-ipv6-csum-offload",
> - "%d", 1);
> + err = xenbus_write(xbt, dev->nodename,
> +"feature-ipv6-csum-offload", "1");
>   if (err) {
>   message = "writing feature-ipv6-csum-offload";
>   goto abort_transaction;
>   }
> 
>   /* We support rx-copy path. */
> - err = xenbus_printf(xbt, dev->nodename,
> - "feature-rx-copy", "%d", 1);
> + err = xenbus_write(xbt, dev->nodename, "feature-rx-copy",
> "1");
>   if (err) {
>   message = "writing feature-rx-copy";
>   goto abort_transaction;
> @@ -321,24 +319,22 @@ static int netback_probe(struct xenbus_d
>* We don't support rx-flip path (except old guests who don't
>* grok this feature flag).
>*/
> - err = xenbus_printf(xbt, dev->nodename,
> - "feature-rx-flip", "%d", 0);
> + err = xenbus_write(xbt, dev->nodename, "feature-rx-flip",
> "0");
>   if (err) {
>   message = "writing feature-rx-flip";
>   goto abort_transaction;
>   }
> 
>   /* We support dynamic multicast-control. */
> - err = xenbus_printf(xbt, dev->nodename,
> - "feature-multicast-control", "%d", 1);
> + err = xenbus_write(xbt, dev->nodename,
> +"feature-multicast-control", "1");
>   if (err) {
>   message = "writing feature-multicast-control";
>   goto abort_transaction;
>   }
> 
> - err = xenbus_printf(xbt, dev->nodename,
> - "feature-dynamic-multicast-control",
> - "%d", 1);
> + err = xenbus_write(xbt, dev->nodename,
> +"feature-dynamic-multicast-control", "1");
>   if (err) {
>   message = "writing feature-dynamic-multicast-
> control";
>   goto abort_transaction;
> @@ -368,9 +364,7 @@ static int netback_probe(struct xenbus_d
>   if (err)
>   pr_debug("Error writing multi-queue-max-queues\n");
> 
> - err = xenbus_printf(XBT_NIL, dev->nodename,
> - "feature-ctrl-ring",
> - "%u", true);
> + err = xenbus_write(XBT_NIL, dev->nodename, "feature-ctrl-ring",
> "1");
>   if (err)
>   pr_debug("Error writing feature-ctrl-ring\n");
> 
> 
> 
> 
> 
> ___
> Xen-devel mailing list
> xen-de...@lists.xen.org
> https://lists.xen.org/xen-devel

RE: [Xen-devel] [PATCH] xen-netback: correct return value checks on xenbus_scanf()

2016-07-07 Thread Paul Durrant

> -Original Message-
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of Jan
> Beulich
> Sent: 07 July 2016 08:57
> To: Wei Liu
> Cc: xen-de...@lists.xenproject.org; netdev@vger.kernel.org
> Subject: [Xen-devel] [PATCH] xen-netback: correct return value checks on
> xenbus_scanf()
> 
> Only a positive return value indicates success.
> 
> Signed-off-by: Jan Beulich <jbeul...@suse.com>

Reviewed-by: Paul Durrant <paul.durr...@citrix.com>

> ---
>  drivers/net/xen-netback/xenbus.c |   26 +-
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
> --- 4.7-rc6-xenbus_scanf.orig/drivers/net/xen-netback/xenbus.c
> +++ 4.7-rc6-xenbus_scanf/drivers/net/xen-netback/xenbus.c
> @@ -741,7 +741,7 @@ static void xen_mcast_ctrl_changed(struc
>   int val;
> 
>   if (xenbus_scanf(XBT_NIL, dev->otherend,
> -  "request-multicast-control", "%d", &val) < 0)
> +  "request-multicast-control", "%d", &val) <= 0)
>   val = 0;
>   vif->multicast_control = !!val;
>  }
> @@ -890,7 +890,7 @@ static void connect(struct backend_info
>   err = xenbus_scanf(XBT_NIL, dev->otherend,
>  "multi-queue-num-queues",
>  "%u", &requested_num_queues);
> - if (err < 0) {
> + if (err <= 0) {
>   requested_num_queues = 1; /* Fall back to single queue */
>   } else if (requested_num_queues > xenvif_max_queues) {
>   /* buggy or malicious guest */
> @@ -1056,7 +1056,7 @@ static int connect_data_rings(struct bac
>   if (err < 0) {
>   err = xenbus_scanf(XBT_NIL, xspath,
>  "event-channel", "%u", _evtchn);
> - if (err < 0) {
> + if (err <= 0) {
>   xenbus_dev_fatal(dev, err,
>"reading %s/event-channel(-tx/rx)",
>xspath);
> @@ -1092,10 +1092,10 @@ static int read_xenbus_vif_flags(struct
>   err = xenbus_scanf(XBT_NIL, dev->otherend, "request-rx-copy",
> "%u",
>  &rx_copy);
>   if (err == -ENOENT) {
> - err = 0;
> + err = 1;
>   rx_copy = 0;
>   }
> - if (err < 0) {
> + if (err <= 0) {
>   xenbus_dev_fatal(dev, err, "reading %s/request-rx-copy",
>dev->otherend);
>   return err;
> @@ -1104,7 +1104,7 @@ static int read_xenbus_vif_flags(struct
>   return -EOPNOTSUPP;
> 
>   if (xenbus_scanf(XBT_NIL, dev->otherend,
> -  "feature-rx-notify", "%d", &val) < 0)
> +  "feature-rx-notify", "%d", &val) <= 0)
>   val = 0;
>   if (!val) {
>   /* - Reduce drain timeout to poll more frequently for
> @@ -1116,7 +1116,7 @@ static int read_xenbus_vif_flags(struct
>   }
> 
>   if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-sg",
> -  "%d", &val) < 0)
> +  "%d", &val) <= 0)
>   val = 0;
>   vif->can_sg = !!val;
> 
> @@ -1124,25 +1124,25 @@ static int read_xenbus_vif_flags(struct
>   vif->gso_prefix_mask = 0;
> 
>   if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4",
> -  "%d", &val) < 0)
> +  "%d", &val) <= 0)
>   val = 0;
>   if (val)
>   vif->gso_mask |= GSO_BIT(TCPV4);
> 
>   if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv4-
> prefix",
> -  "%d", &val) < 0)
> +  "%d", &val) <= 0)
>   val = 0;
>   if (val)
>   vif->gso_prefix_mask |= GSO_BIT(TCPV4);
> 
>   if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv6",
> -  "%d", &val) < 0)
> +  "%d", &val) <= 0)
>   val = 0;
>   if (val)
>   vif->gso_mask |= GSO_BIT(TCPV6);
> 
>   if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-gso-tcpv6-
> prefix",
> -  "%d", &val) < 0)
> +  "%d", &val) <= 0)
>   val = 0;
>   if (val)
>   vif->gso_prefix_mask |= GSO_BIT(TCPV6);
> @@ -1156,12 +1156,12 @@ static int read_xenbus_vif_flags(struct
>   }
> 
>   if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-no-csum-
> offload",
> -  "%d", &val) < 0)
> +  "%d", &val) <= 0)
>   val = 0;
>   vif->ip_csum = !val;
> 
>   if (xenbus_scanf(XBT_NIL, dev->otherend, "feature-ipv6-csum-
> offload",
> -  "%d", &val) < 0)
> +  "%d", &val) <= 0)
>   val = 0;
>   vif->ipv6_csum = !!val;
> 

[PATCH net-next] xen-netback: only deinitialized hash if it was initialized

2016-05-18 Thread Paul Durrant

A domain with a frontend that does not implement a control ring has been
seen to cause a crash during domain save. This was apparently because
the call to xenvif_deinit_hash() in xenvif_disconnect_ctrl() is made
regardless of whether a control ring was connected, and hence whether
xenvif_init_hash() was called.

This patch brings the call to xenvif_deinit_hash() in
xenvif_disconnect_ctrl() inside the if clause that checks whether the
control ring event channel was connected. This is sufficient to ensure
it is only called if xenvif_init_hash() was called previously.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reported-by: Boris Ostrovsky <boris.ostrov...@oracle.com>
Tested-by: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/interface.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 1c7f49b..83deeeb 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -780,9 +780,8 @@ void xenvif_disconnect_ctrl(struct xenvif *vif)
vif->ctrl_task = NULL;
}
 
-   xenvif_deinit_hash(vif);
-
if (vif->ctrl_irq) {
+   xenvif_deinit_hash(vif);
unbind_from_irqhandler(vif->ctrl_irq, vif);
vif->ctrl_irq = 0;
}
-- 
2.1.4
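
The fix described above amounts to guarding teardown with the same condition that guarded setup. The following user-space sketch illustrates that pattern only; struct toy_vif and the toy_* functions are hypothetical, and an assert() stands in for the crash seen when deinitialising state that was never initialised.

```c
#include <assert.h>

struct toy_vif {
	int ctrl_irq;		/* non-zero only while the control ring is connected */
	int hash_ready;		/* set by toy_init_hash(), cleared by toy_deinit_hash() */
	unsigned int deinit_calls;
};

static void toy_init_hash(struct toy_vif *vif)
{
	vif->hash_ready = 1;
}

/* Must only run after toy_init_hash(); the assert models the crash. */
static void toy_deinit_hash(struct toy_vif *vif)
{
	assert(vif->hash_ready);
	vif->hash_ready = 0;
	vif->deinit_calls++;
}

/* Connecting the control ring is the only place the hash is initialised. */
static void toy_connect_ctrl(struct toy_vif *vif, int irq)
{
	toy_init_hash(vif);
	vif->ctrl_irq = irq;
}

/* Tear down the hash only if the control ring was actually connected. */
static void toy_disconnect_ctrl(struct toy_vif *vif)
{
	if (vif->ctrl_irq) {
		toy_deinit_hash(vif);
		vif->ctrl_irq = 0;
	}
}
```

Calling toy_disconnect_ctrl() on a vif that never connected a control ring, or calling it twice, is then harmless.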

[PATCH net-next] xen-netback: correct length checks on hash copy_ops

2016-05-18 Thread Paul Durrant

The length checks on the grant table copy_ops for setting hash key and
hash mapping are checking the local 'len' value which is correct in
the case of the former but not the latter. This was picked up by
static analysis checks.

This patch replaces checks of 'len' with 'copy_op.len' in both cases
to correct the incorrect check, keep the two checks consistent, and to
make it clear what the checks are for.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reported-by: Dan Carpenter <dan.carpen...@oracle.com>
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/hash.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-netback/hash.c
index 392e392..fb87cb3 100644
--- a/drivers/net/xen-netback/hash.c
+++ b/drivers/net/xen-netback/hash.c
@@ -311,7 +311,7 @@ u32 xenvif_set_hash_key(struct xenvif *vif, u32 gref, u32 len)
 	if (len > XEN_NETBK_MAX_HASH_KEY_SIZE)
 		return XEN_NETIF_CTRL_STATUS_INVALID_PARAMETER;
 
-	if (len != 0) {
+	if (copy_op.len != 0) {
 		gnttab_batch_copy(&copy_op, 1);
 
 		if (copy_op.status != GNTST_okay)
@@ -359,7 +359,7 @@ u32 xenvif_set_hash_mapping(struct xenvif *vif, u32 gref, u32 len,
 		if (mapping[off++] >= vif->num_queues)
 			return XEN_NETIF_CTRL_STATUS_INVALID_PARAMETER;
 
-	if (len != 0) {
+	if (copy_op.len != 0) {
 		gnttab_batch_copy(&copy_op, 1);
 
 		if (copy_op.status != GNTST_okay)
-- 
2.1.4
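
To illustrate how a local length can go stale before such a check, here is a small user-space sketch. It assumes the validation step counts 'len' down in a loop before the check is reached; the hunk context above hints at this but does not show the loop itself, so treat that as an assumption. The demo_* names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model: validate 'len' entries, then decide whether to copy. */
struct demo_result {
	int copy_by_len;	/* decision taken from the consumed local 'len' */
	int copy_by_saved;	/* decision taken from the length captured up front */
};

static struct demo_result demo_check(uint32_t len)
{
	const uint32_t saved_len = len;	/* plays the role of copy_op.len */
	struct demo_result r;

	/* Assumed validation loop: it consumes 'len', which wraps on exit. */
	while (len-- != 0)
		;

	/* 'len' is now UINT32_MAX whatever it was on entry. */
	r.copy_by_len = (len != 0);
	r.copy_by_saved = (saved_len != 0);
	return r;
}
```

With an input of 0 the check on the consumed counter still says "copy", while the check on the value captured up front correctly says "nothing to copy".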

[PATCH net-next v4 0/4] xen-netback: support for control ring

2016-05-13 Thread Paul Durrant

My recent patch to import an up-to-date include/xen/interface/io/netif.h
from the Xen Project brought in the necessary definitions to support the
new control shared ring and protocol. This patch series updates xen-netback
to support the new ring.

Patch #1 adds the necessary boilerplate to map the control ring and handle
messages. No implementation of the new protocol is included in this patch
so that it can be kept to a reasonable size.

Patch #2 adds the protocol implementation.

Patch #3 adds support for passing hash values calculated by xen-netback to
capable frontends.

Patch #4 adds support for accepting hash values calculated by capable
frontends and using them to set the socket buffer hash.

Paul Durrant (4):
  xen-netback: add control ring boilerplate
  xen-netback: add control protocol implementation
  xen-netback: pass hash value to the frontend
  xen-netback: use hash value from the frontend

 drivers/net/xen-netback/Makefile|   2 +-
 drivers/net/xen-netback/common.h|  74 ++-
 drivers/net/xen-netback/hash.c  | 384 
 drivers/net/xen-netback/interface.c | 134 -
 drivers/net/xen-netback/netback.c   | 249 +--
 drivers/net/xen-netback/xenbus.c|  79 +++-
 6 files changed, 879 insertions(+), 43 deletions(-)
 create mode 100644 drivers/net/xen-netback/hash.c

-- 
2.1.4

[PATCH net-next v4 3/4] xen-netback: pass hash value to the frontend

2016-05-13 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new extra
info type that can be used to pass hash values between backend and guest
frontend.

This patch adds code to xen-netback to pass hash values calculated for
guest receive-side packets (i.e. netback transmit side) to the frontend.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/interface.c | 13 ++-
 drivers/net/xen-netback/netback.c   | 78 +++--
 2 files changed, 77 insertions(+), 14 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 5a39cdb..1c7f49b 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -158,8 +158,17 @@ static u16 xenvif_select_queue(struct net_device *dev, struct sk_buff *skb,
struct xenvif *vif = netdev_priv(dev);
unsigned int size = vif->hash.size;
 
-   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE)
-   return fallback(dev, skb) % dev->real_num_tx_queues;
+   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE) {
+   u16 index = fallback(dev, skb) % dev->real_num_tx_queues;
+
+   /* Make sure there is no hash information in the socket
+* buffer otherwise it would be incorrectly forwarded
+* to the frontend.
+*/
+   skb_clear_hash(skb);
+
+   return index;
+   }
 
xenvif_set_skb_hash(vif, skb);
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 6509d11..7c72510 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -168,6 +168,8 @@ static bool xenvif_rx_ring_slots_available(struct xenvif_queue *queue)
needed = DIV_ROUND_UP(skb->len, XEN_PAGE_SIZE);
if (skb_is_gso(skb))
needed++;
+   if (skb->sw_hash)
+   needed++;
 
do {
prod = queue->rx.sring->req_prod;
@@ -285,6 +287,8 @@ struct gop_frag_copy {
struct xenvif_rx_meta *meta;
int head;
int gso_type;
+   int protocol;
+   int hash_present;
 
struct page *page;
 };
@@ -331,8 +335,15 @@ static void xenvif_setup_copy_gop(unsigned long gfn,
npo->copy_off += *len;
info->meta->size += *len;
 
+   if (!info->head)
+   return;
+
/* Leave a gap for the GSO descriptor. */
-   if (info->head && ((1 << info->gso_type) & queue->vif->gso_mask))
+   if ((1 << info->gso_type) & queue->vif->gso_mask)
+   queue->rx.req_cons++;
+
+   /* Leave a gap for the hash extra segment. */
+   if (info->hash_present)
queue->rx.req_cons++;
 
info->head = 0; /* There must be something in this buffer now */
@@ -367,6 +378,11 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
.npo = npo,
.head = *head,
.gso_type = XEN_NETIF_GSO_TYPE_NONE,
+   /* xenvif_set_skb_hash() will have either set a s/w
+* hash or cleared the hash depending on
+* whether the frontend wants a hash for this skb.
+*/
+   .hash_present = skb->sw_hash,
};
unsigned long bytes;
 
@@ -555,6 +571,7 @@ void xenvif_kick_thread(struct xenvif_queue *queue)
 
 static void xenvif_rx_action(struct xenvif_queue *queue)
 {
+   struct xenvif *vif = queue->vif;
s8 status;
u16 flags;
struct xen_netif_rx_response *resp;
@@ -590,9 +607,10 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
gnttab_batch_copy(queue->grant_copy_op, npo.copy_prod);
 
while ((skb = __skb_dequeue(&rxq)) != NULL) {
+   struct xen_netif_extra_info *extra = NULL;
 
if ((1 << queue->meta[npo.meta_cons].gso_type) &
-   queue->vif->gso_prefix_mask) {
+   vif->gso_prefix_mask) {
resp = RING_GET_RESPONSE(&queue->rx,
 queue->rx.rsp_prod_pvt++);
 
@@ -610,7 +628,7 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
queue->stats.tx_bytes += skb->len;
queue->stats.tx_packets++;
 
-   status = xenvif_check_gop(queue->vif,
+   status = xenvif_check_gop(vif,
  XENVIF_RX_CB(skb)->meta_slots_used,
  &npo);
 
@@ -632,21 +650,57 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
flags);
 
if ((1 << queue->meta[npo.meta_cons].gs
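
As a rough illustration of the slot accounting touched by the first netback.c hunk above, here is a user-space sketch of the calculation: one slot per page of data, one more for a GSO descriptor and one more for the hash extra segment. The page size and the demo_* names are assumptions made for the example, not the driver's own definitions.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size for the example */

/* Round-up division, equivalent in spirit to the kernel's DIV_ROUND_UP(). */
static unsigned int demo_div_round_up(unsigned int n, unsigned int d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/* Slots needed on the rx ring for one packet, mirroring the hunk's logic. */
static unsigned int demo_slots_needed(unsigned int len, bool is_gso,
				      bool has_sw_hash)
{
	unsigned int needed = demo_div_round_up(len, DEMO_PAGE_SIZE);

	if (is_gso)
		needed++;	/* room for the GSO descriptor */
	if (has_sw_hash)
		needed++;	/* room for the hash extra segment */

	return needed;
}
```

Under these assumptions a 1500-byte packet with a software hash needs two slots rather than one, which is why the availability check has to account for the hash.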

[PATCH net-next v4 2/4] xen-netback: add control protocol implementation

2016-05-13 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new shared
ring (in addition to the rx and tx rings) for passing control messages
from a VM frontend driver to a backend driver.

A previous patch added the necessary boilerplate for mapping the control
ring from the frontend, should it be created. This patch adds
implementations for each of the defined protocol messages.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---

v4:
 - Remove calls to init_rcu_head() and destroy_rcu_head()

v3:
 - Remove unintentional label rename

v2:
 - Use RCU list for hash cache
---
 drivers/net/xen-netback/Makefile|   2 +-
 drivers/net/xen-netback/common.h|  46 +
 drivers/net/xen-netback/hash.c  | 384 
 drivers/net/xen-netback/interface.c |  24 +++
 drivers/net/xen-netback/netback.c   |  49 -
 5 files changed, 502 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/xen-netback/hash.c

diff --git a/drivers/net/xen-netback/Makefile b/drivers/net/xen-netback/Makefile
index e346e81..11e02be 100644
--- a/drivers/net/xen-netback/Makefile
+++ b/drivers/net/xen-netback/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_XEN_NETDEV_BACKEND) := xen-netback.o
 
-xen-netback-y := netback.o xenbus.o interface.o
+xen-netback-y := netback.o xenbus.o interface.o hash.o
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 093a12a..84d6cbd 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -220,6 +220,35 @@ struct xenvif_mcast_addr {
 
 #define XEN_NETBK_MCAST_MAX 64
 
+#define XEN_NETBK_MAX_HASH_KEY_SIZE 40
+#define XEN_NETBK_MAX_HASH_MAPPING_SIZE 128
+#define XEN_NETBK_HASH_TAG_SIZE 40
+
+struct xenvif_hash_cache_entry {
+   struct list_head link;
+   struct rcu_head rcu;
+   u8 tag[XEN_NETBK_HASH_TAG_SIZE];
+   unsigned int len;
+   u32 val;
+   int seq;
+};
+
+struct xenvif_hash_cache {
+   spinlock_t lock;
+   struct list_head list;
+   unsigned int count;
+   atomic_t seq;
+};
+
+struct xenvif_hash {
+   unsigned int alg;
+   u32 flags;
+   u8 key[XEN_NETBK_MAX_HASH_KEY_SIZE];
+   u32 mapping[XEN_NETBK_MAX_HASH_MAPPING_SIZE];
+   unsigned int size;
+   struct xenvif_hash_cache cache;
+};
+
 struct xenvif {
/* Unique identifier for this interface. */
domid_t  domid;
@@ -251,6 +280,8 @@ struct xenvif {
unsigned int num_queues; /* active queues, resource allocated */
unsigned int stalled_queues;
 
+   struct xenvif_hash hash;
+
struct xenbus_watch credit_watch;
struct xenbus_watch mcast_ctrl_watch;
 
@@ -353,6 +384,7 @@ extern bool separate_tx_rx_irq;
 extern unsigned int rx_drain_timeout_msecs;
 extern unsigned int rx_stall_timeout_msecs;
 extern unsigned int xenvif_max_queues;
+extern unsigned int xenvif_hash_cache_size;
 
 #ifdef CONFIG_DEBUG_FS
 extern struct dentry *xen_netback_dbg_root;
@@ -366,4 +398,18 @@ void xenvif_skb_zerocopy_complete(struct xenvif_queue *queue);
 bool xenvif_mcast_match(struct xenvif *vif, const u8 *addr);
 void xenvif_mcast_addr_list_free(struct xenvif *vif);
 
+/* Hash */
+void xenvif_init_hash(struct xenvif *vif);
+void xenvif_deinit_hash(struct xenvif *vif);
+
+u32 xenvif_set_hash_alg(struct xenvif *vif, u32 alg);
+u32 xenvif_get_hash_flags(struct xenvif *vif, u32 *flags);
+u32 xenvif_set_hash_flags(struct xenvif *vif, u32 flags);
+u32 xenvif_set_hash_key(struct xenvif *vif, u32 gref, u32 len);
+u32 xenvif_set_hash_mapping_size(struct xenvif *vif, u32 size);
+u32 xenvif_set_hash_mapping(struct xenvif *vif, u32 gref, u32 len,
+   u32 off);
+
+void xenvif_set_skb_hash(struct xenvif *vif, struct sk_buff *skb);
+
 #endif /* __XEN_NETBACK__COMMON_H__ */
diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-netback/hash.c
new file mode 100644
index 0000000..392e392
--- /dev/null
+++ b/drivers/net/xen-netback/hash.c
@@ -0,0 +1,384 @@
+/*
+ * Copyright (c) 2016 Citrix Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial po

[PATCH net-next v4 1/4] xen-netback: add control ring boilerplate

2016-05-13 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new shared
ring (in addition to the rx and tx rings) for passing control messages
from a VM frontend driver to a backend driver.

This patch adds the necessary code to xen-netback to map this new shared
ring, should it be created by a frontend, but does not add implementations
for any of the defined protocol messages. These are added in a subsequent
patch for clarity.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
---

v2:
 - Changed error handling style in connect_ctrl_ring()
---
 drivers/net/xen-netback/common.h|  28 +++---
 drivers/net/xen-netback/interface.c | 101 +---
 drivers/net/xen-netback/netback.c   |  99 +--
 drivers/net/xen-netback/xenbus.c|  79 
 4 files changed, 277 insertions(+), 30 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index f44b388..093a12a 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -260,6 +260,11 @@ struct xenvif {
struct dentry *xenvif_dbg_root;
 #endif
 
+   struct xen_netif_ctrl_back_ring ctrl;
+   struct task_struct *ctrl_task;
+   wait_queue_head_t ctrl_wq;
+   unsigned int ctrl_irq;
+
/* Miscellaneous private stuff. */
struct net_device *dev;
 };
@@ -285,10 +290,15 @@ struct xenvif *xenvif_alloc(struct device *parent,
 int xenvif_init_queue(struct xenvif_queue *queue);
 void xenvif_deinit_queue(struct xenvif_queue *queue);
 
-int xenvif_connect(struct xenvif_queue *queue, unsigned long tx_ring_ref,
-  unsigned long rx_ring_ref, unsigned int tx_evtchn,
-  unsigned int rx_evtchn);
-void xenvif_disconnect(struct xenvif *vif);
+int xenvif_connect_data(struct xenvif_queue *queue,
+   unsigned long tx_ring_ref,
+   unsigned long rx_ring_ref,
+   unsigned int tx_evtchn,
+   unsigned int rx_evtchn);
+void xenvif_disconnect_data(struct xenvif *vif);
+int xenvif_connect_ctrl(struct xenvif *vif, grant_ref_t ring_ref,
+   unsigned int evtchn);
+void xenvif_disconnect_ctrl(struct xenvif *vif);
 void xenvif_free(struct xenvif *vif);
 
 int xenvif_xenbus_init(void);
@@ -300,10 +310,10 @@ int xenvif_queue_stopped(struct xenvif_queue *queue);
 void xenvif_wake_queue(struct xenvif_queue *queue);
 
 /* (Un)Map communication rings. */
-void xenvif_unmap_frontend_rings(struct xenvif_queue *queue);
-int xenvif_map_frontend_rings(struct xenvif_queue *queue,
- grant_ref_t tx_ring_ref,
- grant_ref_t rx_ring_ref);
+void xenvif_unmap_frontend_data_rings(struct xenvif_queue *queue);
+int xenvif_map_frontend_data_rings(struct xenvif_queue *queue,
+  grant_ref_t tx_ring_ref,
+  grant_ref_t rx_ring_ref);
 
 /* Check for SKBs from frontend and schedule backend processing */
 void xenvif_napi_schedule_or_enable_events(struct xenvif_queue *queue);
@@ -318,6 +328,8 @@ void xenvif_kick_thread(struct xenvif_queue *queue);
 
 int xenvif_dealloc_kthread(void *data);
 
+int xenvif_ctrl_kthread(void *data);
+
 void xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff *skb);
 
 void xenvif_carrier_on(struct xenvif *vif);
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index f5231a2..78a10d2 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -128,6 +128,15 @@ irqreturn_t xenvif_interrupt(int irq, void *dev_id)
return IRQ_HANDLED;
 }
 
+irqreturn_t xenvif_ctrl_interrupt(int irq, void *dev_id)
+{
+   struct xenvif *vif = dev_id;
+
+   wake_up(&vif->ctrl_wq);
+
+   return IRQ_HANDLED;
+}
+
 int xenvif_queue_stopped(struct xenvif_queue *queue)
 {
struct net_device *dev = queue->vif->dev;
@@ -527,9 +536,66 @@ void xenvif_carrier_on(struct xenvif *vif)
rtnl_unlock();
 }
 
-int xenvif_connect(struct xenvif_queue *queue, unsigned long tx_ring_ref,
-  unsigned long rx_ring_ref, unsigned int tx_evtchn,
-  unsigned int rx_evtchn)
+int xenvif_connect_ctrl(struct xenvif *vif, grant_ref_t ring_ref,
+   unsigned int evtchn)
+{
+   struct net_device *dev = vif->dev;
+   void *addr;
+   struct xen_netif_ctrl_sring *shared;
+   struct task_struct *task;
+   int err = -ENOMEM;
+
+   err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
+                                &ring_ref, 1, &addr);
+   if (err)
+   goto err;
+
+   shared = (struct xen_netif_ctrl_sring *)addr;
+   BACK_RING_INIT(&vif->ctrl, shared, XEN_PAGE_SIZE);
+
+   init_waitqueue_head(&vif->ctrl_wq);
+
+   err = bind_interdomain_e

[PATCH net-next v4 4/4] xen-netback: use hash value from the frontend

2016-05-13 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new extra
info type that can be used to pass hash values between backend and guest
frontend.

This patch adds code to xen-netback to use the value in a hash extra
info fragment passed from the guest frontend in a transmit-side
(i.e. netback receive side) packet to set the skb hash accordingly.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/netback.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 7c72510..a5b5aad 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1509,6 +1509,33 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
}
}
 
+   if (extras[XEN_NETIF_EXTRA_TYPE_HASH - 1].type) {
+   struct xen_netif_extra_info *extra;
+   enum pkt_hash_types type = PKT_HASH_TYPE_NONE;
+
+   extra = &extras[XEN_NETIF_EXTRA_TYPE_HASH - 1];
+
+   switch (extra->u.hash.type) {
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV4:
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV6:
+   type = PKT_HASH_TYPE_L3;
+   break;
+
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV4_TCP:
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV6_TCP:
+   type = PKT_HASH_TYPE_L4;
+   break;
+
+   default:
+   break;
+   }
+
+   if (type != PKT_HASH_TYPE_NONE)
+   skb_set_hash(skb,
+*(u32 *)extra->u.hash.value,
+type);
+   }
+
XENVIF_TX_CB(skb)->pending_idx = pending_idx;
 
__skb_put(skb, data_len);
-- 
2.1.4
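
The classification done by the switch in the patch can be sketched on its own as follows. The enums are illustrative stand-ins with assumed values; they are not the definitions from netif.h or from the kernel's skbuff header.

```c
/* Illustrative stand-ins; the numeric values are assumptions. */
enum demo_ctrl_hash_type {
	DEMO_CTRL_HASH_IPV4,
	DEMO_CTRL_HASH_IPV4_TCP,
	DEMO_CTRL_HASH_IPV6,
	DEMO_CTRL_HASH_IPV6_TCP
};

enum demo_pkt_hash_type {
	DEMO_PKT_HASH_NONE,
	DEMO_PKT_HASH_L3,
	DEMO_PKT_HASH_L4
};

/* Address-only hashes map to L3, address-plus-port hashes map to L4. */
static enum demo_pkt_hash_type demo_classify(unsigned int frontend_type)
{
	switch (frontend_type) {
	case DEMO_CTRL_HASH_IPV4:
	case DEMO_CTRL_HASH_IPV6:
		return DEMO_PKT_HASH_L3;

	case DEMO_CTRL_HASH_IPV4_TCP:
	case DEMO_CTRL_HASH_IPV6_TCP:
		return DEMO_PKT_HASH_L4;

	default:
		/* Unknown type: the caller should leave the skb hash unset. */
		return DEMO_PKT_HASH_NONE;
	}
}
```

An unrecognised type falls through to "none", matching the patch's choice to call skb_set_hash() only when a known type was supplied.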

RE: [Xen-devel] [PATCH net-next v2 0/4] xen-netback: support for control ring

2016-05-12 Thread Paul Durrant

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-boun...@lists.xen.org] On Behalf Of
> David Miller
> Sent: 12 May 2016 16:52
> To: Paul Durrant
> Cc: xen-de...@lists.xenproject.org; netdev@vger.kernel.org
> Subject: Re: [Xen-devel] [PATCH net-next v2 0/4] xen-netback: support for
> control ring
> 
> From: Paul Durrant <paul.durr...@citrix.com>
> Date: Wed, 11 May 2016 16:16:26 +0100
> 
> > My recent patch to import an up-to-date include/xen/interface/io/netif.h
> > from the Xen Project brought in the necessary definitions to support the
> > new control shared ring and protocol. This patch series updates
> > xen-netback to support the new ring.
> 
> I lost my copy of your V3 cover letter, so I'm replying to this one, but
> be sure I tested V3 :-)
> 
> This doesn't build, please fix:
> 
> ERROR: "init_rcu_head" [drivers/net/xen-netback/xen-netback.ko] undefined!
> ERROR: "destroy_rcu_head" [drivers/net/xen-netback/xen-netback.ko] undefined!
> 

OK.

  Paul


[PATCH net] xen-netback: fix extra_info handling in xenvif_tx_err()

2016-05-12 Thread Paul Durrant

Patch 562abd39 "xen-netback: support multiple extra info fragments
passed from frontend" contained a mistake which can result in an
incorrect number of responses being generated when handling errors
encountered when processing packets containing extra info fragments.
This patch fixes the problem.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Reported-by: Jan Beulich <jbeul...@suse.com>
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/netback.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index b42f260..4412a57 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -711,6 +711,7 @@ static void xenvif_tx_err(struct xenvif_queue *queue,
 		if (cons == end)
 			break;
 		RING_COPY_REQUEST(&queue->tx, cons++, txp);
+		extra_count = 0; /* only the first frag can have extras */
 	} while (1);
 	queue->tx.req_cons = cons;
 }
-- 
2.1.4
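
A simplified user-space model of the response count may help show the effect of the fix. It assumes each pass through the error loop emits one response for the request plus one per extra info slot still counted; that is an assumption about the response helper, which is not shown in the diff. The demo_* name is hypothetical.

```c
#include <stdbool.h>

/*
 * Simplified model of the error path: 'requests' is the number of ring
 * requests making up the packet, 'extra_count' the number of extra info
 * slots attached to the first of them.
 */
static unsigned int demo_tx_err_responses(unsigned int requests,
					  unsigned int extra_count,
					  bool reset_after_first)
{
	unsigned int responses = 0;
	unsigned int i;

	for (i = 0; i < requests; i++) {
		/* Assumed: one response for the request, one per extra. */
		responses += 1 + extra_count;

		if (reset_after_first)
			extra_count = 0; /* only the first request has extras */
	}

	return responses;
}
```

In this model a packet of three requests with two extras yields five responses with the reset and nine without it; a single-request packet is unaffected either way.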

[PATCH net-next v3 4/4] xen-netback: use hash value from the frontend

2016-05-11 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new extra
info type that can be used to pass hash values between backend and guest
frontend.

This patch adds code to xen-netback to use the value in a hash extra
info fragment passed from the guest frontend in a transmit-side
(i.e. netback receive side) packet to set the skb hash accordingly.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/netback.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 7c72510..a5b5aad 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1509,6 +1509,33 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
}
}
 
+   if (extras[XEN_NETIF_EXTRA_TYPE_HASH - 1].type) {
+   struct xen_netif_extra_info *extra;
+   enum pkt_hash_types type = PKT_HASH_TYPE_NONE;
+
+   extra = &extras[XEN_NETIF_EXTRA_TYPE_HASH - 1];
+
+   switch (extra->u.hash.type) {
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV4:
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV6:
+   type = PKT_HASH_TYPE_L3;
+   break;
+
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV4_TCP:
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV6_TCP:
+   type = PKT_HASH_TYPE_L4;
+   break;
+
+   default:
+   break;
+   }
+
+   if (type != PKT_HASH_TYPE_NONE)
+   skb_set_hash(skb,
+*(u32 *)extra->u.hash.value,
+type);
+   }
+
XENVIF_TX_CB(skb)->pending_idx = pending_idx;
 
__skb_put(skb, data_len);
-- 
2.1.4

[PATCH net-next v3 0/4] xen-netback: support for control ring

2016-05-11 Thread Paul Durrant

My recent patch to import an up-to-date include/xen/interface/io/netif.h
from the Xen Project brought in the necessary definitions to support the
new control shared ring and protocol. This patch series updates xen-netback
to support the new ring.

Patch #1 adds the necessary boilerplate to map the control ring and handle
messages. No implementation of the new protocol is included in this patch
so that it can be kept to a reasonable size.

Patch #2 adds the protocol implementation.

Patch #3 adds support for passing hash values calculated by xen-netback to
capable frontends.

Patch #4 adds support for accepting hash values calculated by capable
frontends and using them to set the socket buffer hash.

[PATCH net-next v3 3/4] xen-netback: pass hash value to the frontend

2016-05-11 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new extra
info type that can be used to pass hash values between backend and guest
frontend.

This patch adds code to xen-netback to pass hash values calculated for
guest receive-side packets (i.e. netback transmit side) to the frontend.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/interface.c | 13 ++-
 drivers/net/xen-netback/netback.c   | 78 +++--
 2 files changed, 77 insertions(+), 14 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 5a39cdb..1c7f49b 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -158,8 +158,17 @@ static u16 xenvif_select_queue(struct net_device *dev, struct sk_buff *skb,
struct xenvif *vif = netdev_priv(dev);
unsigned int size = vif->hash.size;
 
-   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE)
-   return fallback(dev, skb) % dev->real_num_tx_queues;
+   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE) {
+   u16 index = fallback(dev, skb) % dev->real_num_tx_queues;
+
+   /* Make sure there is no hash information in the socket
+* buffer otherwise it would be incorrectly forwarded
+* to the frontend.
+*/
+   skb_clear_hash(skb);
+
+   return index;
+   }
 
xenvif_set_skb_hash(vif, skb);
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 6509d11..7c72510 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -168,6 +168,8 @@ static bool xenvif_rx_ring_slots_available(struct xenvif_queue *queue)
needed = DIV_ROUND_UP(skb->len, XEN_PAGE_SIZE);
if (skb_is_gso(skb))
needed++;
+   if (skb->sw_hash)
+   needed++;
 
do {
prod = queue->rx.sring->req_prod;
@@ -285,6 +287,8 @@ struct gop_frag_copy {
struct xenvif_rx_meta *meta;
int head;
int gso_type;
+   int protocol;
+   int hash_present;
 
struct page *page;
 };
@@ -331,8 +335,15 @@ static void xenvif_setup_copy_gop(unsigned long gfn,
npo->copy_off += *len;
info->meta->size += *len;
 
+   if (!info->head)
+   return;
+
/* Leave a gap for the GSO descriptor. */
-   if (info->head && ((1 << info->gso_type) & queue->vif->gso_mask))
+   if ((1 << info->gso_type) & queue->vif->gso_mask)
+   queue->rx.req_cons++;
+
+   /* Leave a gap for the hash extra segment. */
+   if (info->hash_present)
queue->rx.req_cons++;
 
info->head = 0; /* There must be something in this buffer now */
@@ -367,6 +378,11 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
.npo = npo,
.head = *head,
.gso_type = XEN_NETIF_GSO_TYPE_NONE,
+   /* xenvif_set_skb_hash() will have either set a s/w
+* hash or cleared the hash depending on
+* whether the frontend wants a hash for this skb.
+*/
+   .hash_present = skb->sw_hash,
};
unsigned long bytes;
 
@@ -555,6 +571,7 @@ void xenvif_kick_thread(struct xenvif_queue *queue)
 
 static void xenvif_rx_action(struct xenvif_queue *queue)
 {
+   struct xenvif *vif = queue->vif;
s8 status;
u16 flags;
struct xen_netif_rx_response *resp;
@@ -590,9 +607,10 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
gnttab_batch_copy(queue->grant_copy_op, npo.copy_prod);
 
while ((skb = __skb_dequeue(&rxq)) != NULL) {
+   struct xen_netif_extra_info *extra = NULL;
 
if ((1 << queue->meta[npo.meta_cons].gso_type) &
-   queue->vif->gso_prefix_mask) {
+   vif->gso_prefix_mask) {
resp = RING_GET_RESPONSE(&queue->rx,
 queue->rx.rsp_prod_pvt++);
 
@@ -610,7 +628,7 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
queue->stats.tx_bytes += skb->len;
queue->stats.tx_packets++;
 
-   status = xenvif_check_gop(queue->vif,
+   status = xenvif_check_gop(vif,
  XENVIF_RX_CB(skb)->meta_slots_used,
  &npo);
 
@@ -632,21 +650,57 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
flags);
 
if ((1 << queue->meta[npo.meta_cons].gs

[PATCH net-next v3 1/4] xen-netback: add control ring boilerplate

2016-05-11 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new shared
ring (in addition to the rx and tx rings) for passing control messages
from a VM frontend driver to a backend driver.

This patch adds the necessary code to xen-netback to map this new shared
ring, should it be created by a frontend, but does not add implementations
for any of the defined protocol messages. These are added in a subsequent
patch for clarity.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---

v2:
 - Changed error handling style in connect_ctrl_ring()
---
 drivers/net/xen-netback/common.h|  28 +++---
 drivers/net/xen-netback/interface.c | 101 +---
 drivers/net/xen-netback/netback.c   |  99 +--
 drivers/net/xen-netback/xenbus.c|  79 
 4 files changed, 277 insertions(+), 30 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index f44b388..093a12a 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -260,6 +260,11 @@ struct xenvif {
struct dentry *xenvif_dbg_root;
 #endif
 
+   struct xen_netif_ctrl_back_ring ctrl;
+   struct task_struct *ctrl_task;
+   wait_queue_head_t ctrl_wq;
+   unsigned int ctrl_irq;
+
/* Miscellaneous private stuff. */
struct net_device *dev;
 };
@@ -285,10 +290,15 @@ struct xenvif *xenvif_alloc(struct device *parent,
 int xenvif_init_queue(struct xenvif_queue *queue);
 void xenvif_deinit_queue(struct xenvif_queue *queue);
 
-int xenvif_connect(struct xenvif_queue *queue, unsigned long tx_ring_ref,
-  unsigned long rx_ring_ref, unsigned int tx_evtchn,
-  unsigned int rx_evtchn);
-void xenvif_disconnect(struct xenvif *vif);
+int xenvif_connect_data(struct xenvif_queue *queue,
+   unsigned long tx_ring_ref,
+   unsigned long rx_ring_ref,
+   unsigned int tx_evtchn,
+   unsigned int rx_evtchn);
+void xenvif_disconnect_data(struct xenvif *vif);
+int xenvif_connect_ctrl(struct xenvif *vif, grant_ref_t ring_ref,
+   unsigned int evtchn);
+void xenvif_disconnect_ctrl(struct xenvif *vif);
 void xenvif_free(struct xenvif *vif);
 
 int xenvif_xenbus_init(void);
@@ -300,10 +310,10 @@ int xenvif_queue_stopped(struct xenvif_queue *queue);
 void xenvif_wake_queue(struct xenvif_queue *queue);
 
 /* (Un)Map communication rings. */
-void xenvif_unmap_frontend_rings(struct xenvif_queue *queue);
-int xenvif_map_frontend_rings(struct xenvif_queue *queue,
- grant_ref_t tx_ring_ref,
- grant_ref_t rx_ring_ref);
+void xenvif_unmap_frontend_data_rings(struct xenvif_queue *queue);
+int xenvif_map_frontend_data_rings(struct xenvif_queue *queue,
+  grant_ref_t tx_ring_ref,
+  grant_ref_t rx_ring_ref);
 
 /* Check for SKBs from frontend and schedule backend processing */
 void xenvif_napi_schedule_or_enable_events(struct xenvif_queue *queue);
@@ -318,6 +328,8 @@ void xenvif_kick_thread(struct xenvif_queue *queue);
 
 int xenvif_dealloc_kthread(void *data);
 
+int xenvif_ctrl_kthread(void *data);
+
 void xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff *skb);
 
 void xenvif_carrier_on(struct xenvif *vif);
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index f5231a2..78a10d2 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -128,6 +128,15 @@ irqreturn_t xenvif_interrupt(int irq, void *dev_id)
return IRQ_HANDLED;
 }
 
+irqreturn_t xenvif_ctrl_interrupt(int irq, void *dev_id)
+{
+   struct xenvif *vif = dev_id;
+
+   wake_up(&vif->ctrl_wq);
+
+   return IRQ_HANDLED;
+}
+
 int xenvif_queue_stopped(struct xenvif_queue *queue)
 {
struct net_device *dev = queue->vif->dev;
@@ -527,9 +536,66 @@ void xenvif_carrier_on(struct xenvif *vif)
rtnl_unlock();
 }
 
-int xenvif_connect(struct xenvif_queue *queue, unsigned long tx_ring_ref,
-  unsigned long rx_ring_ref, unsigned int tx_evtchn,
-  unsigned int rx_evtchn)
+int xenvif_connect_ctrl(struct xenvif *vif, grant_ref_t ring_ref,
+   unsigned int evtchn)
+{
+   struct net_device *dev = vif->dev;
+   void *addr;
+   struct xen_netif_ctrl_sring *shared;
+   struct task_struct *task;
+   int err = -ENOMEM;
+
+   err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
+                                &ring_ref, 1, &addr);
+   if (err)
+   goto err;
+
+   shared = (struct xen_netif_ctrl_sring *)addr;
+   BACK_RING_INIT(&vif->ctrl, shared, XEN_PAGE_SIZE);
+
+   init_waitqueue_head(&vif->ctrl_wq);
+
+   err = bind_interdomain_evtchn_to

[PATCH net-next v3 2/4] xen-netback: add control protocol implementation

2016-05-11 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new shared
ring (in addition to the rx and tx rings) for passing control messages
from a VM frontend driver to a backend driver.

A previous patch added the necessary boilerplate for mapping the control
ring from the frontend, should it be created. This patch adds
implementations for each of the defined protocol messages.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---

v3:
 - Remove unintentional label rename

v2:
 - Use RCU list for hash cache
---
 drivers/net/xen-netback/Makefile|   2 +-
 drivers/net/xen-netback/common.h|  46 +
 drivers/net/xen-netback/hash.c  | 386 
 drivers/net/xen-netback/interface.c |  24 +++
 drivers/net/xen-netback/netback.c   |  49 -
 5 files changed, 504 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/xen-netback/hash.c

diff --git a/drivers/net/xen-netback/Makefile b/drivers/net/xen-netback/Makefile
index e346e81..11e02be 100644
--- a/drivers/net/xen-netback/Makefile
+++ b/drivers/net/xen-netback/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_XEN_NETDEV_BACKEND) := xen-netback.o
 
-xen-netback-y := netback.o xenbus.o interface.o
+xen-netback-y := netback.o xenbus.o interface.o hash.o
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 093a12a..84d6cbd 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -220,6 +220,35 @@ struct xenvif_mcast_addr {
 
 #define XEN_NETBK_MCAST_MAX 64
 
+#define XEN_NETBK_MAX_HASH_KEY_SIZE 40
+#define XEN_NETBK_MAX_HASH_MAPPING_SIZE 128
+#define XEN_NETBK_HASH_TAG_SIZE 40
+
+struct xenvif_hash_cache_entry {
+   struct list_head link;
+   struct rcu_head rcu;
+   u8 tag[XEN_NETBK_HASH_TAG_SIZE];
+   unsigned int len;
+   u32 val;
+   int seq;
+};
+
+struct xenvif_hash_cache {
+   spinlock_t lock;
+   struct list_head list;
+   unsigned int count;
+   atomic_t seq;
+};
+
+struct xenvif_hash {
+   unsigned int alg;
+   u32 flags;
+   u8 key[XEN_NETBK_MAX_HASH_KEY_SIZE];
+   u32 mapping[XEN_NETBK_MAX_HASH_MAPPING_SIZE];
+   unsigned int size;
+   struct xenvif_hash_cache cache;
+};
+
 struct xenvif {
/* Unique identifier for this interface. */
domid_t  domid;
@@ -251,6 +280,8 @@ struct xenvif {
unsigned int num_queues; /* active queues, resource allocated */
unsigned int stalled_queues;
 
+   struct xenvif_hash hash;
+
struct xenbus_watch credit_watch;
struct xenbus_watch mcast_ctrl_watch;
 
@@ -353,6 +384,7 @@ extern bool separate_tx_rx_irq;
 extern unsigned int rx_drain_timeout_msecs;
 extern unsigned int rx_stall_timeout_msecs;
 extern unsigned int xenvif_max_queues;
+extern unsigned int xenvif_hash_cache_size;
 
 #ifdef CONFIG_DEBUG_FS
 extern struct dentry *xen_netback_dbg_root;
@@ -366,4 +398,18 @@ void xenvif_skb_zerocopy_complete(struct xenvif_queue *queue);
 bool xenvif_mcast_match(struct xenvif *vif, const u8 *addr);
 void xenvif_mcast_addr_list_free(struct xenvif *vif);
 
+/* Hash */
+void xenvif_init_hash(struct xenvif *vif);
+void xenvif_deinit_hash(struct xenvif *vif);
+
+u32 xenvif_set_hash_alg(struct xenvif *vif, u32 alg);
+u32 xenvif_get_hash_flags(struct xenvif *vif, u32 *flags);
+u32 xenvif_set_hash_flags(struct xenvif *vif, u32 flags);
+u32 xenvif_set_hash_key(struct xenvif *vif, u32 gref, u32 len);
+u32 xenvif_set_hash_mapping_size(struct xenvif *vif, u32 size);
+u32 xenvif_set_hash_mapping(struct xenvif *vif, u32 gref, u32 len,
+   u32 off);
+
+void xenvif_set_skb_hash(struct xenvif *vif, struct sk_buff *skb);
+
 #endif /* __XEN_NETBACK__COMMON_H__ */
diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-netback/hash.c
new file mode 100644
index 0000000..47edfe9
--- /dev/null
+++ b/drivers/net/xen-netback/hash.c
@@ -0,0 +1,386 @@
+/*
+ * Copyright (c) 2016 Citrix Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHO

RE: [PATCH net-next v2 2/4] xen-netback: add control protocol implementation

2016-05-11 Thread Paul Durrant

> -----Original Message-----
> From: Paul Durrant [mailto:paul.durr...@citrix.com]
> Sent: 11 May 2016 16:16
> To: xen-de...@lists.xenproject.org; netdev@vger.kernel.org
> Cc: Paul Durrant; Wei Liu
> Subject: [PATCH net-next v2 2/4] xen-netback: add control protocol
> implementation
> 
> My recent patch to include/xen/interface/io/netif.h defines a new shared
> ring (in addition to the rx and tx rings) for passing control messages
> from a VM frontend driver to a backend driver.
> 
> A previous patch added the necessary boilerplate for mapping the control
> ring from the frontend, should it be created. This patch adds
> implementations for each of the defined protocol messages.
> 
> Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> Cc: Wei Liu <wei.l...@citrix.com>
> ---
> 
> v2:
>  - Use RCU list for hash cache
> ---
>  drivers/net/xen-netback/Makefile|   2 +-
>  drivers/net/xen-netback/common.h|  46 +
>  drivers/net/xen-netback/hash.c  | 386
> 
>  drivers/net/xen-netback/interface.c |  28 ++-
>  drivers/net/xen-netback/netback.c   |  49 -
>  5 files changed, 506 insertions(+), 5 deletions(-)
>  create mode 100644 drivers/net/xen-netback/hash.c
> 
> diff --git a/drivers/net/xen-netback/Makefile b/drivers/net/xen-
> netback/Makefile
> index e346e81..11e02be 100644
> --- a/drivers/net/xen-netback/Makefile
> +++ b/drivers/net/xen-netback/Makefile
> @@ -1,3 +1,3 @@
>  obj-$(CONFIG_XEN_NETDEV_BACKEND) := xen-netback.o
> 
> -xen-netback-y := netback.o xenbus.o interface.o
> +xen-netback-y := netback.o xenbus.o interface.o hash.o
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-
> netback/common.h
> index 093a12a..84d6cbd 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -220,6 +220,35 @@ struct xenvif_mcast_addr {
> 
>  #define XEN_NETBK_MCAST_MAX 64
> 
> +#define XEN_NETBK_MAX_HASH_KEY_SIZE 40
> +#define XEN_NETBK_MAX_HASH_MAPPING_SIZE 128
> +#define XEN_NETBK_HASH_TAG_SIZE 40
> +
> +struct xenvif_hash_cache_entry {
> + struct list_head link;
> + struct rcu_head rcu;
> + u8 tag[XEN_NETBK_HASH_TAG_SIZE];
> + unsigned int len;
> + u32 val;
> + int seq;
> +};
> +
> +struct xenvif_hash_cache {
> + spinlock_t lock;
> + struct list_head list;
> + unsigned int count;
> + atomic_t seq;
> +};
> +
> +struct xenvif_hash {
> + unsigned int alg;
> + u32 flags;
> + u8 key[XEN_NETBK_MAX_HASH_KEY_SIZE];
> + u32 mapping[XEN_NETBK_MAX_HASH_MAPPING_SIZE];
> + unsigned int size;
> + struct xenvif_hash_cache cache;
> +};
> +
>  struct xenvif {
>   /* Unique identifier for this interface. */
>   domid_t  domid;
> @@ -251,6 +280,8 @@ struct xenvif {
>   unsigned int num_queues; /* active queues, resource allocated */
>   unsigned int stalled_queues;
> 
> + struct xenvif_hash hash;
> +
>   struct xenbus_watch credit_watch;
>   struct xenbus_watch mcast_ctrl_watch;
> 
> @@ -353,6 +384,7 @@ extern bool separate_tx_rx_irq;
>  extern unsigned int rx_drain_timeout_msecs;
>  extern unsigned int rx_stall_timeout_msecs;
>  extern unsigned int xenvif_max_queues;
> +extern unsigned int xenvif_hash_cache_size;
> 
>  #ifdef CONFIG_DEBUG_FS
>  extern struct dentry *xen_netback_dbg_root;
> @@ -366,4 +398,18 @@ void xenvif_skb_zerocopy_complete(struct
> xenvif_queue *queue);
>  bool xenvif_mcast_match(struct xenvif *vif, const u8 *addr);
>  void xenvif_mcast_addr_list_free(struct xenvif *vif);
> 
> +/* Hash */
> +void xenvif_init_hash(struct xenvif *vif);
> +void xenvif_deinit_hash(struct xenvif *vif);
> +
> +u32 xenvif_set_hash_alg(struct xenvif *vif, u32 alg);
> +u32 xenvif_get_hash_flags(struct xenvif *vif, u32 *flags);
> +u32 xenvif_set_hash_flags(struct xenvif *vif, u32 flags);
> +u32 xenvif_set_hash_key(struct xenvif *vif, u32 gref, u32 len);
> +u32 xenvif_set_hash_mapping_size(struct xenvif *vif, u32 size);
> +u32 xenvif_set_hash_mapping(struct xenvif *vif, u32 gref, u32 len,
> + u32 off);
> +
> +void xenvif_set_skb_hash(struct xenvif *vif, struct sk_buff *skb);
> +
>  #endif /* __XEN_NETBACK__COMMON_H__ */
> diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-
> netback/hash.c
> new file mode 100644
> index 0000000..47edfe9
> --- /dev/null
> +++ b/drivers/net/xen-netback/hash.c
> @@ -0,0 +1,386 @@
> +/*
> + * Copyright (c) 2016 Citrix Systems Inc.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License ve

[PATCH net-next v2 1/4] xen-netback: add control ring boilerplate

2016-05-11 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new shared
ring (in addition to the rx and tx rings) for passing control messages
from a VM frontend driver to a backend driver.

This patch adds the necessary code to xen-netback to map this new shared
ring, should it be created by a frontend, but does not add implementations
for any of the defined protocol messages. These are added in a subsequent
patch for clarity.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---

v2:
 - Changed error handling style in connect_ctrl_ring()
---
 drivers/net/xen-netback/common.h|  28 +++---
 drivers/net/xen-netback/interface.c | 101 +---
 drivers/net/xen-netback/netback.c   |  99 +--
 drivers/net/xen-netback/xenbus.c|  79 
 4 files changed, 277 insertions(+), 30 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index f44b388..093a12a 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -260,6 +260,11 @@ struct xenvif {
struct dentry *xenvif_dbg_root;
 #endif
 
+   struct xen_netif_ctrl_back_ring ctrl;
+   struct task_struct *ctrl_task;
+   wait_queue_head_t ctrl_wq;
+   unsigned int ctrl_irq;
+
/* Miscellaneous private stuff. */
struct net_device *dev;
 };
@@ -285,10 +290,15 @@ struct xenvif *xenvif_alloc(struct device *parent,
 int xenvif_init_queue(struct xenvif_queue *queue);
 void xenvif_deinit_queue(struct xenvif_queue *queue);
 
-int xenvif_connect(struct xenvif_queue *queue, unsigned long tx_ring_ref,
-  unsigned long rx_ring_ref, unsigned int tx_evtchn,
-  unsigned int rx_evtchn);
-void xenvif_disconnect(struct xenvif *vif);
+int xenvif_connect_data(struct xenvif_queue *queue,
+   unsigned long tx_ring_ref,
+   unsigned long rx_ring_ref,
+   unsigned int tx_evtchn,
+   unsigned int rx_evtchn);
+void xenvif_disconnect_data(struct xenvif *vif);
+int xenvif_connect_ctrl(struct xenvif *vif, grant_ref_t ring_ref,
+   unsigned int evtchn);
+void xenvif_disconnect_ctrl(struct xenvif *vif);
 void xenvif_free(struct xenvif *vif);
 
 int xenvif_xenbus_init(void);
@@ -300,10 +310,10 @@ int xenvif_queue_stopped(struct xenvif_queue *queue);
 void xenvif_wake_queue(struct xenvif_queue *queue);
 
 /* (Un)Map communication rings. */
-void xenvif_unmap_frontend_rings(struct xenvif_queue *queue);
-int xenvif_map_frontend_rings(struct xenvif_queue *queue,
- grant_ref_t tx_ring_ref,
- grant_ref_t rx_ring_ref);
+void xenvif_unmap_frontend_data_rings(struct xenvif_queue *queue);
+int xenvif_map_frontend_data_rings(struct xenvif_queue *queue,
+  grant_ref_t tx_ring_ref,
+  grant_ref_t rx_ring_ref);
 
 /* Check for SKBs from frontend and schedule backend processing */
 void xenvif_napi_schedule_or_enable_events(struct xenvif_queue *queue);
@@ -318,6 +328,8 @@ void xenvif_kick_thread(struct xenvif_queue *queue);
 
 int xenvif_dealloc_kthread(void *data);
 
+int xenvif_ctrl_kthread(void *data);
+
 void xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff *skb);
 
 void xenvif_carrier_on(struct xenvif *vif);
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index f5231a2..78a10d2 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -128,6 +128,15 @@ irqreturn_t xenvif_interrupt(int irq, void *dev_id)
return IRQ_HANDLED;
 }
 
+irqreturn_t xenvif_ctrl_interrupt(int irq, void *dev_id)
+{
+   struct xenvif *vif = dev_id;
+
+   wake_up(&vif->ctrl_wq);
+
+   return IRQ_HANDLED;
+}
+
 int xenvif_queue_stopped(struct xenvif_queue *queue)
 {
struct net_device *dev = queue->vif->dev;
@@ -527,9 +536,66 @@ void xenvif_carrier_on(struct xenvif *vif)
rtnl_unlock();
 }
 
-int xenvif_connect(struct xenvif_queue *queue, unsigned long tx_ring_ref,
-  unsigned long rx_ring_ref, unsigned int tx_evtchn,
-  unsigned int rx_evtchn)
+int xenvif_connect_ctrl(struct xenvif *vif, grant_ref_t ring_ref,
+   unsigned int evtchn)
+{
+   struct net_device *dev = vif->dev;
+   void *addr;
+   struct xen_netif_ctrl_sring *shared;
+   struct task_struct *task;
+   int err = -ENOMEM;
+
+   err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
+                                &ring_ref, 1, &addr);
+   if (err)
+   goto err;
+
+   shared = (struct xen_netif_ctrl_sring *)addr;
+   BACK_RING_INIT(&vif->ctrl, shared, XEN_PAGE_SIZE);
+
+   init_waitqueue_head(&vif->ctrl_wq);
+
+   err = bind_interdomain_evtchn_to

[PATCH net-next v2 0/4] xen-netback: support for control ring

2016-05-11 Thread Paul Durrant

My recent patch to import an up-to-date include/xen/interface/io/netif.h
from the Xen Project brought in the necessary definitions to support the
new control shared ring and protocol. This patch series updates xen-netback
to support the new ring.

Patch #1 adds the necessary boilerplate to map the control ring and handle
messages. No implementation of the new protocol is included in this patch
so that it can be kept to a reasonable size.

Patch #2 adds the protocol implementation.

Patch #3 adds support for passing hash values calculated by xen-netback to
capable frontends.

Patch #4 adds support for accepting hash values calculated by capable
frontends and using them to set the socket buffer hash.

[PATCH net-next v2 4/4] xen-netback: use hash value from the frontend

2016-05-11 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new extra
info type that can be used to pass hash values between backend and guest
frontend.

This patch adds code to xen-netback to use the value in a hash extra
info fragment passed from the guest frontend in a transmit-side
(i.e. netback receive side) packet to set the skb hash accordingly.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/netback.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 7c72510..a5b5aad 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1509,6 +1509,33 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
}
}
 
+   if (extras[XEN_NETIF_EXTRA_TYPE_HASH - 1].type) {
+   struct xen_netif_extra_info *extra;
+   enum pkt_hash_types type = PKT_HASH_TYPE_NONE;
+
+   extra = &extras[XEN_NETIF_EXTRA_TYPE_HASH - 1];
+
+   switch (extra->u.hash.type) {
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV4:
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV6:
+   type = PKT_HASH_TYPE_L3;
+   break;
+
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV4_TCP:
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV6_TCP:
+   type = PKT_HASH_TYPE_L4;
+   break;
+
+   default:
+   break;
+   }
+
+   if (type != PKT_HASH_TYPE_NONE)
+   skb_set_hash(skb,
+*(u32 *)extra->u.hash.value,
+type);
+   }
+
XENVIF_TX_CB(skb)->pending_idx = pending_idx;
 
__skb_put(skb, data_len);
-- 
2.1.4
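
The classification done by the hunk above can be tried outside the kernel. The following is a minimal userspace sketch, not the driver code: the demo_* enums stand in for the _XEN_NETIF_CTRL_HASH_TYPE_* indices and for enum pkt_hash_types, their numeric values are made up, and struct demo_hash_extra is a hypothetical stand-in for the hash member of struct xen_netif_extra_info. It uses memcpy() rather than a pointer cast to read the 4-byte value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the protocol's hash type indices. */
enum demo_ctrl_hash_type {
	DEMO_HASH_TYPE_IPV4,
	DEMO_HASH_TYPE_IPV4_TCP,
	DEMO_HASH_TYPE_IPV6,
	DEMO_HASH_TYPE_IPV6_TCP,
};

/* Hypothetical stand-ins for the kernel's packet hash types. */
enum demo_pkt_hash_type {
	DEMO_PKT_HASH_NONE,
	DEMO_PKT_HASH_L3,
	DEMO_PKT_HASH_L4,
};

/* Hypothetical layout of the hash extra info segment. */
struct demo_hash_extra {
	uint8_t type;
	uint8_t value[4];
};

/* IP-only hashes are layer 3, TCP hashes are layer 4, anything else is
 * treated as "no usable hash".
 */
static enum demo_pkt_hash_type demo_classify(uint8_t type)
{
	switch (type) {
	case DEMO_HASH_TYPE_IPV4:
	case DEMO_HASH_TYPE_IPV6:
		return DEMO_PKT_HASH_L3;
	case DEMO_HASH_TYPE_IPV4_TCP:
	case DEMO_HASH_TYPE_IPV6_TCP:
		return DEMO_PKT_HASH_L4;
	default:
		return DEMO_PKT_HASH_NONE;
	}
}

/* Returns 1 and fills *hash and *kind if the segment carries a usable
 * hash, otherwise returns 0 and leaves both outputs untouched.
 */
static int demo_extract_hash(const struct demo_hash_extra *extra,
			     uint32_t *hash, enum demo_pkt_hash_type *kind)
{
	enum demo_pkt_hash_type k;

	assert(extra != NULL && hash != NULL && kind != NULL);

	k = demo_classify(extra->type);
	if (k == DEMO_PKT_HASH_NONE)
		return 0;

	memcpy(hash, extra->value, sizeof(*hash));
	*kind = k;
	return 1;
}
```

An unrecognised type is ignored rather than rejected, which mirrors the default branch in the patch: the packet is still processed, it just carries no hash.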

[PATCH net-next v2 2/4] xen-netback: add control protocol implementation

2016-05-11 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new shared
ring (in addition to the rx and tx rings) for passing control messages
from a VM frontend driver to a backend driver.

A previous patch added the necessary boilerplate for mapping the control
ring from the frontend, should it be created. This patch adds
implementations for each of the defined protocol messages.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---

v2:
 - Use RCU list for hash cache
---
 drivers/net/xen-netback/Makefile|   2 +-
 drivers/net/xen-netback/common.h|  46 +
 drivers/net/xen-netback/hash.c  | 386 
 drivers/net/xen-netback/interface.c |  28 ++-
 drivers/net/xen-netback/netback.c   |  49 -
 5 files changed, 506 insertions(+), 5 deletions(-)
 create mode 100644 drivers/net/xen-netback/hash.c

diff --git a/drivers/net/xen-netback/Makefile b/drivers/net/xen-netback/Makefile
index e346e81..11e02be 100644
--- a/drivers/net/xen-netback/Makefile
+++ b/drivers/net/xen-netback/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_XEN_NETDEV_BACKEND) := xen-netback.o
 
-xen-netback-y := netback.o xenbus.o interface.o
+xen-netback-y := netback.o xenbus.o interface.o hash.o
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 093a12a..84d6cbd 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -220,6 +220,35 @@ struct xenvif_mcast_addr {
 
 #define XEN_NETBK_MCAST_MAX 64
 
+#define XEN_NETBK_MAX_HASH_KEY_SIZE 40
+#define XEN_NETBK_MAX_HASH_MAPPING_SIZE 128
+#define XEN_NETBK_HASH_TAG_SIZE 40
+
+struct xenvif_hash_cache_entry {
+   struct list_head link;
+   struct rcu_head rcu;
+   u8 tag[XEN_NETBK_HASH_TAG_SIZE];
+   unsigned int len;
+   u32 val;
+   int seq;
+};
+
+struct xenvif_hash_cache {
+   spinlock_t lock;
+   struct list_head list;
+   unsigned int count;
+   atomic_t seq;
+};
+
+struct xenvif_hash {
+   unsigned int alg;
+   u32 flags;
+   u8 key[XEN_NETBK_MAX_HASH_KEY_SIZE];
+   u32 mapping[XEN_NETBK_MAX_HASH_MAPPING_SIZE];
+   unsigned int size;
+   struct xenvif_hash_cache cache;
+};
+
 struct xenvif {
/* Unique identifier for this interface. */
domid_t  domid;
@@ -251,6 +280,8 @@ struct xenvif {
unsigned int num_queues; /* active queues, resource allocated */
unsigned int stalled_queues;
 
+   struct xenvif_hash hash;
+
struct xenbus_watch credit_watch;
struct xenbus_watch mcast_ctrl_watch;
 
@@ -353,6 +384,7 @@ extern bool separate_tx_rx_irq;
 extern unsigned int rx_drain_timeout_msecs;
 extern unsigned int rx_stall_timeout_msecs;
 extern unsigned int xenvif_max_queues;
+extern unsigned int xenvif_hash_cache_size;
 
 #ifdef CONFIG_DEBUG_FS
 extern struct dentry *xen_netback_dbg_root;
@@ -366,4 +398,18 @@ void xenvif_skb_zerocopy_complete(struct xenvif_queue *queue);
 bool xenvif_mcast_match(struct xenvif *vif, const u8 *addr);
 void xenvif_mcast_addr_list_free(struct xenvif *vif);
 
+/* Hash */
+void xenvif_init_hash(struct xenvif *vif);
+void xenvif_deinit_hash(struct xenvif *vif);
+
+u32 xenvif_set_hash_alg(struct xenvif *vif, u32 alg);
+u32 xenvif_get_hash_flags(struct xenvif *vif, u32 *flags);
+u32 xenvif_set_hash_flags(struct xenvif *vif, u32 flags);
+u32 xenvif_set_hash_key(struct xenvif *vif, u32 gref, u32 len);
+u32 xenvif_set_hash_mapping_size(struct xenvif *vif, u32 size);
+u32 xenvif_set_hash_mapping(struct xenvif *vif, u32 gref, u32 len,
+   u32 off);
+
+void xenvif_set_skb_hash(struct xenvif *vif, struct sk_buff *skb);
+
 #endif /* __XEN_NETBACK__COMMON_H__ */
diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-netback/hash.c
new file mode 100644
index 0000000..47edfe9
--- /dev/null
+++ b/drivers/net/xen-netback/hash.c
@@ -0,0 +1,386 @@
+/*
+ * Copyright (c) 2016 Citrix Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IM

[PATCH net-next v2 3/4] xen-netback: pass hash value to the frontend

2016-05-11 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new extra
info type that can be used to pass hash values between backend and guest
frontend.

This patch adds code to xen-netback to pass hash values calculated for
guest receive-side packets (i.e. netback transmit side) to the frontend.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Acked-by: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/interface.c | 13 ++-
 drivers/net/xen-netback/netback.c   | 78 +++--
 2 files changed, 77 insertions(+), 14 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index 483080f..dcca498 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -158,8 +158,17 @@ static u16 xenvif_select_queue(struct net_device *dev, struct sk_buff *skb,
struct xenvif *vif = netdev_priv(dev);
unsigned int size = vif->hash.size;
 
-   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE)
-   return fallback(dev, skb) % dev->real_num_tx_queues;
+   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE) {
+   u16 index = fallback(dev, skb) % dev->real_num_tx_queues;
+
+   /* Make sure there is no hash information in the socket
+* buffer otherwise it would be incorrectly forwarded
+* to the frontend.
+*/
+   skb_clear_hash(skb);
+
+   return index;
+   }
 
xenvif_set_skb_hash(vif, skb);
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 6509d11..7c72510 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -168,6 +168,8 @@ static bool xenvif_rx_ring_slots_available(struct xenvif_queue *queue)
needed = DIV_ROUND_UP(skb->len, XEN_PAGE_SIZE);
if (skb_is_gso(skb))
needed++;
+   if (skb->sw_hash)
+   needed++;
 
do {
prod = queue->rx.sring->req_prod;
@@ -285,6 +287,8 @@ struct gop_frag_copy {
struct xenvif_rx_meta *meta;
int head;
int gso_type;
+   int protocol;
+   int hash_present;
 
struct page *page;
 };
@@ -331,8 +335,15 @@ static void xenvif_setup_copy_gop(unsigned long gfn,
npo->copy_off += *len;
info->meta->size += *len;
 
+   if (!info->head)
+   return;
+
/* Leave a gap for the GSO descriptor. */
-   if (info->head && ((1 << info->gso_type) & queue->vif->gso_mask))
+   if ((1 << info->gso_type) & queue->vif->gso_mask)
+   queue->rx.req_cons++;
+
+   /* Leave a gap for the hash extra segment. */
+   if (info->hash_present)
queue->rx.req_cons++;
 
info->head = 0; /* There must be something in this buffer now */
@@ -367,6 +378,11 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
.npo = npo,
.head = *head,
.gso_type = XEN_NETIF_GSO_TYPE_NONE,
+   /* xenvif_set_skb_hash() will have either set a s/w
+* hash or cleared the hash depending on
+* whether the frontend wants a hash for this skb.
+*/
+   .hash_present = skb->sw_hash,
};
unsigned long bytes;
 
@@ -555,6 +571,7 @@ void xenvif_kick_thread(struct xenvif_queue *queue)
 
 static void xenvif_rx_action(struct xenvif_queue *queue)
 {
+   struct xenvif *vif = queue->vif;
s8 status;
u16 flags;
struct xen_netif_rx_response *resp;
@@ -590,9 +607,10 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
gnttab_batch_copy(queue->grant_copy_op, npo.copy_prod);
 
while ((skb = __skb_dequeue(&rxq)) != NULL) {
+   struct xen_netif_extra_info *extra = NULL;
 
if ((1 << queue->meta[npo.meta_cons].gso_type) &
-   queue->vif->gso_prefix_mask) {
+   vif->gso_prefix_mask) {
resp = RING_GET_RESPONSE(&queue->rx,
 queue->rx.rsp_prod_pvt++);
 
@@ -610,7 +628,7 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
queue->stats.tx_bytes += skb->len;
queue->stats.tx_packets++;
 
-   status = xenvif_check_gop(queue->vif,
+   status = xenvif_check_gop(vif,
  XENVIF_RX_CB(skb)->meta_slots_used,
  &npo);
 
@@ -632,21 +650,57 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
flags);
 
if ((1 << queue->meta[npo.meta_cons].gs
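
The slot accounting in the xenvif_rx_ring_slots_available() hunk of this patch can be looked at in isolation: one ring slot per page of packet data, one more if the packet is GSO, and (new in this patch) one more if a software hash is to be passed to the frontend. Below is a minimal userspace sketch of that arithmetic. The demo_* names are hypothetical, and DEMO_PAGE_SIZE is assumed to be 4096 in place of XEN_PAGE_SIZE.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define DEMO_PAGE_SIZE 4096u /* assumed value of XEN_PAGE_SIZE */

/* Number of rx ring slots a packet of 'len' bytes needs:
 * ceil(len / page size) data slots, plus one extra-info slot each for
 * GSO metadata and for a hash value, when present.
 */
static unsigned int demo_slots_needed(unsigned int len, bool is_gso,
				      bool has_sw_hash)
{
	unsigned int needed;

	/* Guard the round-up below against unsigned overflow. */
	assert(len <= UINT_MAX - (DEMO_PAGE_SIZE - 1));

	needed = (len + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
	if (is_gso)
		needed++;
	if (has_sw_hash)
		needed++;

	return needed;
}
```

For example, a 1500-byte packet with a hash needs two slots, which is why the copy path in the same patch leaves a gap in the ring for the hash extra segment.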

RE: [PATCH net-next 1/4] xen-netback: add control ring boilerplate

2016-05-10 Thread Paul Durrant

> -----Original Message-----
> From: Wei Liu [mailto:wei.l...@citrix.com]
> Sent: 10 May 2016 14:29
> To: Paul Durrant
> Cc: xen-de...@lists.xenproject.org; netdev@vger.kernel.org; Wei Liu
> Subject: Re: [PATCH net-next 1/4] xen-netback: add control ring boilerplate
> 
> On Thu, May 05, 2016 at 12:19:27PM +0100, Paul Durrant wrote:
> [...]
> >
> > +static int connect_ctrl_ring(struct backend_info *be)
> > +{
> 
> Please use goto style error handling in this function.
> 

Will do.

> Other than this the code looks good.
> 

Thanks,

  Paul

> Wei.
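
For readers unfamiliar with the convention Wei is asking for, here is a minimal userspace sketch of goto-style error handling: each acquisition that can fail jumps to a label that releases only what was already acquired, in reverse order. This is not connect_ctrl_ring(); struct demo_ctrl, demo_connect() and the fail_at knob are hypothetical and exist only to make the unwind paths easy to exercise.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pair of resources standing in for a mapped ring and a
 * bound event channel.
 */
struct demo_ctrl {
	void *ring;
	void *irq;
};

/* fail_at: 0 = succeed, 1 = fail acquiring the ring, 2 = fail acquiring
 * the irq. Returns 0 on success or a negative errno value.
 */
static int demo_connect(struct demo_ctrl *c, int fail_at)
{
	int err;

	assert(c != NULL);
	c->ring = NULL;
	c->irq = NULL;

	c->ring = (fail_at == 1) ? NULL : malloc(64);
	if (!c->ring) {
		err = -ENOMEM;
		goto err;
	}

	c->irq = (fail_at == 2) ? NULL : malloc(16);
	if (!c->irq) {
		err = -ENOMEM;
		goto err_free_ring;
	}

	return 0;

err_free_ring:
	/* Undo only what was set up before the failing step. */
	free(c->ring);
	c->ring = NULL;
err:
	return err;
}
```

The benefit over nested if/else cleanup is that the success path reads straight down and every failure path shares one unwind sequence.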

RE: [PATCH net-next 2/4] xen-netback: add control protocol implementation

2016-05-10 Thread Paul Durrant

> -----Original Message-----
> From: Wei Liu [mailto:wei.l...@citrix.com]
> Sent: 10 May 2016 14:29
> To: Paul Durrant
> Cc: xen-de...@lists.xenproject.org; netdev@vger.kernel.org; Wei Liu
> Subject: Re: [PATCH net-next 2/4] xen-netback: add control protocol
> implementation
> 
> On Thu, May 05, 2016 at 12:19:28PM +0100, Paul Durrant wrote:
> > My recent patch to include/xen/interface/io/netif.h defines a new shared
> > ring (in addition to the rx and tx rings) for passing control messages
> > from a VM frontend driver to a backend driver.
> >
> > A previous patch added the necessary boilerplate for mapping the control
> > ring from the frontend, should it be created. This patch adds
> > implementations for each of the defined protocol messages.
> >
> > Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
> > Cc: Wei Liu <wei.l...@citrix.com>
> > ---
> >  drivers/net/xen-netback/Makefile|   2 +-
> >  drivers/net/xen-netback/common.h|  43 +
> >  drivers/net/xen-netback/hash.c  | 361
> 
> >  drivers/net/xen-netback/interface.c |  28 +++
> >  drivers/net/xen-netback/netback.c   |  49 -
> >  5 files changed, 480 insertions(+), 3 deletions(-)
> >  create mode 100644 drivers/net/xen-netback/hash.c
> >
> 
> Other than the issue mentioned by David, the code looks OK to me.
> 

Cool, thanks. I should have the RCU-based hash code done in the next day or so.

  Paul

> Wei.

RE: [PATCH net-next 2/4] xen-netback: add control protocol implementation

2016-05-09 Thread Paul Durrant

> -----Original Message-----
> From: David Miller [mailto:da...@davemloft.net]
> Sent: 07 May 2016 20:09
> To: Paul Durrant
> Cc: xen-de...@lists.xenproject.org; netdev@vger.kernel.org; Wei Liu
> Subject: Re: [PATCH net-next 2/4] xen-netback: add control protocol
> implementation
> 
> From: Paul Durrant <paul.durr...@citrix.com>
> Date: Thu, 5 May 2016 12:19:28 +0100
> 
> > +struct xenvif_hash_cache {
> > +   rwlock_t lock;
> 
> You really don't want to lock on every SKB hash computation like
> this, turn this into a spin lock for locking the write side and
> use RCU locking for lookup and usage.
>

Yes, that would be better. Will do.

Cheers,

  Paul
 
> Thanks.
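
The shape David is suggesting (writers serialised by a spin lock, lookups taking no lock at all) can be approximated in userspace with C11 atomics. The sketch below is only an analogy and does not use kernel RCU: readers do an acquire load of a published pointer, writers build an entry fully and then publish it under a spin lock. All demo_* names are hypothetical, the tag is a plain integer rather than the byte array used in the patch, and replaced entries are deliberately never freed, because freeing them safely is exactly the grace-period problem that RCU solves.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_SLOTS 4u

/* Immutable once published: readers never see a half-written entry. */
struct demo_entry {
	uint32_t tag; /* lookup key */
	uint32_t val; /* cached hash value */
};

struct demo_cache {
	atomic_flag lock; /* write-side spin lock */
	_Atomic(struct demo_entry *) slot[DEMO_SLOTS];
};

static void demo_cache_init(struct demo_cache *c)
{
	assert(c != NULL);
	atomic_flag_clear(&c->lock);
	for (unsigned int i = 0; i < DEMO_SLOTS; i++)
		atomic_init(&c->slot[i], NULL);
}

/* Read side: no lock, just an acquire load of the published pointer. */
static bool demo_cache_find(struct demo_cache *c, uint32_t tag, uint32_t *val)
{
	struct demo_entry *e;

	assert(c != NULL && val != NULL);
	e = atomic_load_explicit(&c->slot[tag % DEMO_SLOTS],
				 memory_order_acquire);
	if (e == NULL || e->tag != tag)
		return false;

	*val = e->val;
	return true;
}

/* Write side: build the entry first, then publish it under the lock. */
static int demo_cache_add(struct demo_cache *c, uint32_t tag, uint32_t val)
{
	struct demo_entry *e, *old;

	assert(c != NULL);
	e = malloc(sizeof(*e));
	if (e == NULL)
		return -1;
	e->tag = tag;
	e->val = val;

	while (atomic_flag_test_and_set_explicit(&c->lock,
						 memory_order_acquire))
		; /* spin until the lock is free */

	old = atomic_exchange_explicit(&c->slot[tag % DEMO_SLOTS], e,
				       memory_order_acq_rel);

	atomic_flag_clear_explicit(&c->lock, memory_order_release);

	/* 'old' is intentionally leaked: a concurrent reader may still be
	 * looking at it, and this sketch has no grace-period mechanism.
	 */
	(void)old;
	return 0;
}
```

The point of the split is that the per-packet lookup path touches no shared lock, so it scales with the number of queues, while the much rarer insertions pay for the serialisation.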

[PATCH net-next 1/4] xen-netback: add control ring boilerplate

2016-05-05 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new shared
ring (in addition to the rx and tx rings) for passing control messages
from a VM frontend driver to a backend driver.

This patch adds the necessary code to xen-netback to map this new shared
ring, should it be created by a frontend, but does not add implementations
for any of the defined protocol messages. These are added in a subsequent
patch for clarity.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/common.h|  28 +++---
 drivers/net/xen-netback/interface.c | 101 +---
 drivers/net/xen-netback/netback.c   |  99 +--
 drivers/net/xen-netback/xenbus.c|  75 ++
 4 files changed, 273 insertions(+), 30 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index f44b388..093a12a 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -260,6 +260,11 @@ struct xenvif {
struct dentry *xenvif_dbg_root;
 #endif
 
+   struct xen_netif_ctrl_back_ring ctrl;
+   struct task_struct *ctrl_task;
+   wait_queue_head_t ctrl_wq;
+   unsigned int ctrl_irq;
+
/* Miscellaneous private stuff. */
struct net_device *dev;
 };
@@ -285,10 +290,15 @@ struct xenvif *xenvif_alloc(struct device *parent,
 int xenvif_init_queue(struct xenvif_queue *queue);
 void xenvif_deinit_queue(struct xenvif_queue *queue);
 
-int xenvif_connect(struct xenvif_queue *queue, unsigned long tx_ring_ref,
-  unsigned long rx_ring_ref, unsigned int tx_evtchn,
-  unsigned int rx_evtchn);
-void xenvif_disconnect(struct xenvif *vif);
+int xenvif_connect_data(struct xenvif_queue *queue,
+   unsigned long tx_ring_ref,
+   unsigned long rx_ring_ref,
+   unsigned int tx_evtchn,
+   unsigned int rx_evtchn);
+void xenvif_disconnect_data(struct xenvif *vif);
+int xenvif_connect_ctrl(struct xenvif *vif, grant_ref_t ring_ref,
+   unsigned int evtchn);
+void xenvif_disconnect_ctrl(struct xenvif *vif);
 void xenvif_free(struct xenvif *vif);
 
 int xenvif_xenbus_init(void);
@@ -300,10 +310,10 @@ int xenvif_queue_stopped(struct xenvif_queue *queue);
 void xenvif_wake_queue(struct xenvif_queue *queue);
 
 /* (Un)Map communication rings. */
-void xenvif_unmap_frontend_rings(struct xenvif_queue *queue);
-int xenvif_map_frontend_rings(struct xenvif_queue *queue,
- grant_ref_t tx_ring_ref,
- grant_ref_t rx_ring_ref);
+void xenvif_unmap_frontend_data_rings(struct xenvif_queue *queue);
+int xenvif_map_frontend_data_rings(struct xenvif_queue *queue,
+  grant_ref_t tx_ring_ref,
+  grant_ref_t rx_ring_ref);
 
 /* Check for SKBs from frontend and schedule backend processing */
 void xenvif_napi_schedule_or_enable_events(struct xenvif_queue *queue);
@@ -318,6 +328,8 @@ void xenvif_kick_thread(struct xenvif_queue *queue);
 
 int xenvif_dealloc_kthread(void *data);
 
+int xenvif_ctrl_kthread(void *data);
+
 void xenvif_rx_queue_tail(struct xenvif_queue *queue, struct sk_buff *skb);
 
 void xenvif_carrier_on(struct xenvif *vif);
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index f5231a2..78a10d2 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -128,6 +128,15 @@ irqreturn_t xenvif_interrupt(int irq, void *dev_id)
return IRQ_HANDLED;
 }
 
+irqreturn_t xenvif_ctrl_interrupt(int irq, void *dev_id)
+{
+   struct xenvif *vif = dev_id;
+
+   wake_up(&vif->ctrl_wq);
+
+   return IRQ_HANDLED;
+}
+
 int xenvif_queue_stopped(struct xenvif_queue *queue)
 {
struct net_device *dev = queue->vif->dev;
@@ -527,9 +536,66 @@ void xenvif_carrier_on(struct xenvif *vif)
rtnl_unlock();
 }
 
-int xenvif_connect(struct xenvif_queue *queue, unsigned long tx_ring_ref,
-  unsigned long rx_ring_ref, unsigned int tx_evtchn,
-  unsigned int rx_evtchn)
+int xenvif_connect_ctrl(struct xenvif *vif, grant_ref_t ring_ref,
+   unsigned int evtchn)
+{
+   struct net_device *dev = vif->dev;
+   void *addr;
+   struct xen_netif_ctrl_sring *shared;
+   struct task_struct *task;
+   int err = -ENOMEM;
+
+   err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(vif),
+                                &ring_ref, 1, &addr);
+   if (err)
+   goto err;
+
+   shared = (struct xen_netif_ctrl_sring *)addr;
+   BACK_RING_INIT(&vif->ctrl, shared, XEN_PAGE_SIZE);
+
+   init_waitqueue_head(&vif->ctrl_wq);
+
+   err = bind_interdomain_evtchn_to

[PATCH net-next 2/4] xen-netback: add control protocol implementation

2016-05-05 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new shared
ring (in addition to the rx and tx rings) for passing control messages
from a VM frontend driver to a backend driver.

A previous patch added the necessary boilerplate for mapping the control
ring from the frontend, should it be created. This patch adds
implementations for each of the defined protocol messages.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/Makefile|   2 +-
 drivers/net/xen-netback/common.h|  43 +
 drivers/net/xen-netback/hash.c  | 361 
 drivers/net/xen-netback/interface.c |  28 +++
 drivers/net/xen-netback/netback.c   |  49 -
 5 files changed, 480 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/xen-netback/hash.c

diff --git a/drivers/net/xen-netback/Makefile b/drivers/net/xen-netback/Makefile
index e346e81..11e02be 100644
--- a/drivers/net/xen-netback/Makefile
+++ b/drivers/net/xen-netback/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_XEN_NETDEV_BACKEND) := xen-netback.o
 
-xen-netback-y := netback.o xenbus.o interface.o
+xen-netback-y := netback.o xenbus.o interface.o hash.o
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 093a12a..4959716 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -220,6 +220,32 @@ struct xenvif_mcast_addr {
 
 #define XEN_NETBK_MCAST_MAX 64
 
+#define XEN_NETBK_MAX_HASH_KEY_SIZE 40
+#define XEN_NETBK_MAX_HASH_MAPPING_SIZE 128
+#define XEN_NETBK_HASH_TAG_SIZE 40
+
+struct xenvif_hash_cache_entry {
+   u8 tag[XEN_NETBK_HASH_TAG_SIZE];
+   unsigned int len;
+   u32 val;
+   int seq;
+};
+
+struct xenvif_hash_cache {
+   rwlock_t lock;
+   struct xenvif_hash_cache_entry *entry;
+   atomic_t seq;
+};
+
+struct xenvif_hash {
+   unsigned int alg;
+   u32 flags;
+   u8 key[XEN_NETBK_MAX_HASH_KEY_SIZE];
+   u32 mapping[XEN_NETBK_MAX_HASH_MAPPING_SIZE];
+   unsigned int size;
+   struct xenvif_hash_cache cache;
+};
+
 struct xenvif {
/* Unique identifier for this interface. */
domid_t  domid;
@@ -251,6 +277,8 @@ struct xenvif {
unsigned int num_queues; /* active queues, resource allocated */
unsigned int stalled_queues;
 
+   struct xenvif_hash hash;
+
struct xenbus_watch credit_watch;
struct xenbus_watch mcast_ctrl_watch;
 
@@ -353,6 +381,7 @@ extern bool separate_tx_rx_irq;
 extern unsigned int rx_drain_timeout_msecs;
 extern unsigned int rx_stall_timeout_msecs;
 extern unsigned int xenvif_max_queues;
+extern unsigned int xenvif_hash_cache_size;
 
 #ifdef CONFIG_DEBUG_FS
 extern struct dentry *xen_netback_dbg_root;
@@ -366,4 +395,18 @@ void xenvif_skb_zerocopy_complete(struct xenvif_queue *queue);
 bool xenvif_mcast_match(struct xenvif *vif, const u8 *addr);
 void xenvif_mcast_addr_list_free(struct xenvif *vif);
 
+/* Hash */
+int xenvif_init_hash(struct xenvif *vif);
+void xenvif_deinit_hash(struct xenvif *vif);
+
+u32 xenvif_set_hash_alg(struct xenvif *vif, u32 alg);
+u32 xenvif_get_hash_flags(struct xenvif *vif, u32 *flags);
+u32 xenvif_set_hash_flags(struct xenvif *vif, u32 flags);
+u32 xenvif_set_hash_key(struct xenvif *vif, u32 gref, u32 len);
+u32 xenvif_set_hash_mapping_size(struct xenvif *vif, u32 size);
+u32 xenvif_set_hash_mapping(struct xenvif *vif, u32 gref, u32 len,
+   u32 off);
+
+void xenvif_set_skb_hash(struct xenvif *vif, struct sk_buff *skb);
+
 #endif /* __XEN_NETBACK__COMMON_H__ */
diff --git a/drivers/net/xen-netback/hash.c b/drivers/net/xen-netback/hash.c
new file mode 100644
index 0000000..054bfd9
--- /dev/null
+++ b/drivers/net/xen-netback/hash.c
@@ -0,0 +1,361 @@
+/*
+ * Copyright (c) 2016 Citrix Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version 2
+ * as published by the Free Software Foundation; or, when distributed
+ * separately from the Linux kernel or incorporated into other
+ * software packages, subject to the following license:
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this source file (the "Software"), to deal in the Software without
+ * restriction, including without limitation the rights to use, copy, modify,
+ * merge, publish, distribute, sublicense, and/or sell copies of the Software,
+ * and to permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NO

[PATCH net-next 3/4] xen-netback: pass hash value to the frontend

2016-05-05 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new extra
info type that can be used to pass hash values between backend and guest
frontend.

This patch adds code to xen-netback to pass hash values calculated for
guest receive-side packets (i.e. netback transmit side) to the frontend.
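
The slot accounting that this patch changes in xenvif_rx_ring_slots_available() can be sketched in isolation as follows. This is a simplified model and not the driver code: `sketch_skb` is a hypothetical stand-in for the few sk_buff properties involved, and the 4096-byte page size is an assumed value for XEN_PAGE_SIZE.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed value for illustration; the driver uses XEN_PAGE_SIZE. */
#define SKETCH_XEN_PAGE_SIZE 4096u

/* Hypothetical stand-in for the sk_buff properties the check looks at. */
struct sketch_skb {
	size_t len;   /* total packet length in bytes */
	bool is_gso;  /* packet needs a GSO descriptor */
	bool sw_hash; /* a software hash was set for this packet */
};

/*
 * One ring slot per page of data, plus one extra slot each for the GSO
 * descriptor and for the hash extra segment when they are present.
 */
static unsigned int sketch_rx_slots_needed(const struct sketch_skb *skb)
{
	unsigned int needed = (unsigned int)((skb->len + SKETCH_XEN_PAGE_SIZE - 1) /
					     SKETCH_XEN_PAGE_SIZE);

	if (skb->is_gso)
		needed++;
	if (skb->sw_hash)
		needed++;

	return needed;
}
```

Under these assumptions a 1500-byte packet with neither GSO nor a hash needs one slot, while an 8193-byte GSO packet carrying a hash needs five.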

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/interface.c | 13 ++-
 drivers/net/xen-netback/netback.c   | 78 +++--
 2 files changed, 77 insertions(+), 14 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index e54b475..b2d945f 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -158,8 +158,17 @@ static u16 xenvif_select_queue(struct net_device *dev, struct sk_buff *skb,
struct xenvif *vif = netdev_priv(dev);
unsigned int size = vif->hash.size;
 
-   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE)
-   return fallback(dev, skb) % dev->real_num_tx_queues;
+   if (vif->hash.alg == XEN_NETIF_CTRL_HASH_ALGORITHM_NONE) {
+   u16 index = fallback(dev, skb) % dev->real_num_tx_queues;
+
+   /* Make sure there is no hash information in the socket
+* buffer otherwise it would be incorrectly forwarded
+* to the frontend.
+*/
+   skb_clear_hash(skb);
+
+   return index;
+   }
 
xenvif_set_skb_hash(vif, skb);
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 6509d11..7c72510 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -168,6 +168,8 @@ static bool xenvif_rx_ring_slots_available(struct xenvif_queue *queue)
needed = DIV_ROUND_UP(skb->len, XEN_PAGE_SIZE);
if (skb_is_gso(skb))
needed++;
+   if (skb->sw_hash)
+   needed++;
 
do {
prod = queue->rx.sring->req_prod;
@@ -285,6 +287,8 @@ struct gop_frag_copy {
struct xenvif_rx_meta *meta;
int head;
int gso_type;
+   int protocol;
+   int hash_present;
 
struct page *page;
 };
@@ -331,8 +335,15 @@ static void xenvif_setup_copy_gop(unsigned long gfn,
npo->copy_off += *len;
info->meta->size += *len;
 
+   if (!info->head)
+   return;
+
/* Leave a gap for the GSO descriptor. */
-   if (info->head && ((1 << info->gso_type) & queue->vif->gso_mask))
+   if ((1 << info->gso_type) & queue->vif->gso_mask)
+   queue->rx.req_cons++;
+
+   /* Leave a gap for the hash extra segment. */
+   if (info->hash_present)
queue->rx.req_cons++;
 
info->head = 0; /* There must be something in this buffer now */
@@ -367,6 +378,11 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
.npo = npo,
.head = *head,
.gso_type = XEN_NETIF_GSO_TYPE_NONE,
+   /* xenvif_set_skb_hash() will have either set a s/w
+* hash or cleared the hash depending on
+* whether the frontend wants a hash for this skb.
+*/
+   .hash_present = skb->sw_hash,
};
unsigned long bytes;
 
@@ -555,6 +571,7 @@ void xenvif_kick_thread(struct xenvif_queue *queue)
 
 static void xenvif_rx_action(struct xenvif_queue *queue)
 {
+   struct xenvif *vif = queue->vif;
s8 status;
u16 flags;
struct xen_netif_rx_response *resp;
@@ -590,9 +607,10 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
gnttab_batch_copy(queue->grant_copy_op, npo.copy_prod);
 
while ((skb = __skb_dequeue(&rxq)) != NULL) {
+   struct xen_netif_extra_info *extra = NULL;
 
if ((1 << queue->meta[npo.meta_cons].gso_type) &
-   queue->vif->gso_prefix_mask) {
+   vif->gso_prefix_mask) {
resp = RING_GET_RESPONSE(&queue->rx,
 queue->rx.rsp_prod_pvt++);
 
@@ -610,7 +628,7 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
queue->stats.tx_bytes += skb->len;
queue->stats.tx_packets++;
 
-   status = xenvif_check_gop(queue->vif,
+   status = xenvif_check_gop(vif,
  XENVIF_RX_CB(skb)->meta_slots_used,
  &npo);
 
@@ -632,21 +650,57 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
flags);
 
if ((1 << queue->meta[npo.meta_cons].gs

[PATCH net-next 4/4] xen-netback: use hash value from the frontend

2016-05-05 Thread Paul Durrant

My recent patch to include/xen/interface/io/netif.h defines a new extra
info type that can be used to pass hash values between backend and guest
frontend.

This patch adds code to xen-netback to use the value in a hash extra
info fragment passed from the guest frontend in a transmit-side
(i.e. netback receive side) packet to set the skb hash accordingly.
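
The classification step in this patch, which decides whether a frontend-supplied hash counts as a layer-3 or layer-4 hash, can be sketched on its own as below. The enums are hypothetical stand-ins: the driver uses the _XEN_NETIF_CTRL_HASH_TYPE_* constants from netif.h and the kernel's enum pkt_hash_types, and the numeric values here are assumed purely for illustration.

```c
/* Hypothetical stand-ins for the _XEN_NETIF_CTRL_HASH_TYPE_* constants. */
enum sketch_ctrl_hash_type {
	SKETCH_HASH_TYPE_IPV4,
	SKETCH_HASH_TYPE_IPV4_TCP,
	SKETCH_HASH_TYPE_IPV6,
	SKETCH_HASH_TYPE_IPV6_TCP,
};

/* Hypothetical stand-in for the kernel's enum pkt_hash_types. */
enum sketch_pkt_hash_type {
	SKETCH_PKT_HASH_TYPE_NONE,
	SKETCH_PKT_HASH_TYPE_L3,
	SKETCH_PKT_HASH_TYPE_L4,
};

/*
 * Address-only hashes map to L3, hashes that also cover TCP ports map to
 * L4, and anything unrecognised maps to NONE so the caller can skip
 * setting a hash on the packet.
 */
static enum sketch_pkt_hash_type sketch_classify_hash(unsigned int type)
{
	switch (type) {
	case SKETCH_HASH_TYPE_IPV4:
	case SKETCH_HASH_TYPE_IPV6:
		return SKETCH_PKT_HASH_TYPE_L3;

	case SKETCH_HASH_TYPE_IPV4_TCP:
	case SKETCH_HASH_TYPE_IPV6_TCP:
		return SKETCH_PKT_HASH_TYPE_L4;

	default:
		return SKETCH_PKT_HASH_TYPE_NONE;
	}
}
```

Returning NONE for unknown types mirrors the patch's behaviour of only calling skb_set_hash() when a recognised type was supplied.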

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/netback.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 7c72510..a5b5aad 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1509,6 +1509,33 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
}
}
 
+   if (extras[XEN_NETIF_EXTRA_TYPE_HASH - 1].type) {
+   struct xen_netif_extra_info *extra;
+   enum pkt_hash_types type = PKT_HASH_TYPE_NONE;
+
+   extra = &extras[XEN_NETIF_EXTRA_TYPE_HASH - 1];
+
+   switch (extra->u.hash.type) {
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV4:
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV6:
+   type = PKT_HASH_TYPE_L3;
+   break;
+
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV4_TCP:
+   case _XEN_NETIF_CTRL_HASH_TYPE_IPV6_TCP:
+   type = PKT_HASH_TYPE_L4;
+   break;
+
+   default:
+   break;
+   }
+
+   if (type != PKT_HASH_TYPE_NONE)
+   skb_set_hash(skb,
+*(u32 *)extra->u.hash.value,
+type);
+   }
+
XENVIF_TX_CB(skb)->pending_idx = pending_idx;
 
__skb_put(skb, data_len);
-- 
2.1.4

[PATCH net-next 0/4] xen-netback: support for control ring

2016-05-05 Thread Paul Durrant

My recent patch to import an up-to-date include/xen/interface/io/netif.h
from the Xen Project brought in the necessary definitions to support the
new control shared ring and protocol. This patch series updates xen-netback
to support the new ring.

Patch #1 adds the necessary boilerplate to map the control ring and handle
messages. No implementation of the new protocol is included in this patch
so that it can be kept to a reasonable size.

Patch #2 adds the protocol implementation.

Patch #3 adds support for passing hash values calculated by xen-netback to
capable frontends.

Patch #4 adds support for accepting hash values calculated by capable
frontends and using them to set the socket buffer hash.

[PATCH net-next 3/3] xen-netback: reduce log spam

2016-03-10 Thread Paul Durrant

Remove the "prepare for reconnect" pr_info in xenbus.c. It's largely
uninteresting and the states of the frontend and backend can easily be
observed by watching the (o)xenstored log.

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---
 drivers/net/xen-netback/xenbus.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 39a303d..bd182cd 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -511,8 +511,6 @@ static void set_backend_state(struct backend_info *be,
switch (state) {
case XenbusStateInitWait:
case XenbusStateConnected:
-   pr_info("%s: prepare for reconnect\n",
-   be->dev->nodename);
backend_switch_state(be, XenbusStateInitWait);
break;
case XenbusStateClosing:
-- 
2.1.4

[PATCH net-next 1/3] xen-netback: re-import canonical netif header

2016-03-10 Thread Paul Durrant

The canonical netif header (in the Xen source repo) and the Linux variant
have diverged significantly. Recently much documentation has been added to
the canonical header which is highly useful for developers making
modifications to either xen-netfront or xen-netback. This patch therefore
re-imports the canonical header in its entirety.

To maintain compatibility and some style consistency with the old Linux
variant, the header was stripped of its emacs boilerplate, and
post-processed and copied into place with the following commands:

ed -s netif.h << EOF
H
,s/NETTXF_/XEN_NETTXF_/g
,s/NETRXF_/XEN_NETRXF_/g
,s/NETIF_/XEN_NETIF_/g
,s/XEN_XEN_/XEN_/g
,s/netif/xen_netif/g
,s/xen_xen_/xen_/g
,s/^typedef.*$//g
,s/^/${TAB}/g
w
$
w
EOF

indent --line-length 80 --linux-style netif.h \
-o include/xen/interface/io/netif.h

Signed-off-by: Paul Durrant <paul.durr...@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: David Vrabel <david.vra...@citrix.com>
Cc: Wei Liu <wei.l...@citrix.com>
---
 include/xen/interface/io/netif.h | 861 ++-
 1 file changed, 766 insertions(+), 95 deletions(-)

diff --git a/include/xen/interface/io/netif.h b/include/xen/interface/io/netif.h
index 252ffd4..4f20dbc 100644
--- a/include/xen/interface/io/netif.h
+++ b/include/xen/interface/io/netif.h
@@ -1,16 +1,34 @@
 /**
- * netif.h
+ * xen_netif.h
  *
  * Unified network-device I/O interface for Xen guest OSes.
  *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
  * Copyright (c) 2003-2004, Keir Fraser
  */
 
-#ifndef __XEN_PUBLIC_IO_NETIF_H__
-#define __XEN_PUBLIC_IO_NETIF_H__
+#ifndef __XEN_PUBLIC_IO_XEN_NETIF_H__
+#define __XEN_PUBLIC_IO_XEN_NETIF_H__
 
-#include <xen/interface/io/ring.h>
-#include <xen/interface/grant_table.h>
+#include "ring.h"
+#include "../grant_table.h"
 
 /*
  * Older implementation of Xen network frontend / backend has an
@@ -38,10 +56,10 @@
  * that it cannot safely queue packets (as it may not be kicked to send them).
  */
 
- /*
+/*
  * "feature-split-event-channels" is introduced to separate guest TX
- * and RX notificaion. Backend either doesn't support this feature or
- * advertise it via xenstore as 0 (disabled) or 1 (enabled).
+ * and RX notification. Backend either doesn't support this feature or
+ * advertises it via xenstore as 0 (disabled) or 1 (enabled).
  *
  * To make use of this feature, frontend should allocate two event
  * channels for TX and RX, advertise them to backend as
@@ -118,151 +136,804 @@
  */
 
 /*
- * This is the 'wire' format for packets:
- *  Request 1: xen_netif_tx_request  -- XEN_NETTXF_* (any flags)
- * [Request 2: xen_netif_extra_info](only if request 1 has XEN_NETTXF_extra_info)
- * [Request 3: xen_netif_extra_info](only if request 2 has XEN_NETIF_EXTRA_MORE)
- *  Request 4: xen_netif_tx_request  -- XEN_NETTXF_more_data
- *  Request 5: xen_netif_tx_request  -- XEN_NETTXF_more_data
+ * "feature-multicast-control" and "feature-dynamic-multicast-control"
+ * advertise the capability to filter ethernet multicast packets in the
+ * backend. If the frontend wishes to take advantage of this feature then
+ * it may set "request-multicast-control". If the backend only advertises
+ * "feature-multicast-control" then "request-multicast-control" must be set
+ * before the frontend moves into the connected state. The backend will
+ * sample the value on this state transition and any subsequent change in
+ * value will have no effect. However, if the backend also advertises
+ * "feature-dynamic-multicast-control" then "request-multicast-control"
+ * may be set by the frontend at any time. In this case, the backend will
+ * watch the value and re-sample on watch events.
+ *
+
