Re: [RFC V4 PATCH 0/8] Packed ring layout for vhost

2018-05-21 Thread Wei Xu
On Mon, May 21, 2018 at 10:33:30AM +0800, Jason Wang wrote:
> 
> 
> On 2018-05-21 00:25, Wei Xu wrote:
> >On Wed, May 16, 2018 at 08:32:13PM +0800, Jason Wang wrote:
> >>Hi all:
> >>
> >>This RFC implements the packed ring layout. The code was tested with
> >>Tiwei's RFC V3 at https://lkml.org/lkml/2018/4/25/34. Some fixups and
> >>tweaks were needed on top of Tiwei's code to make it run for event
> >>index.
> >Could you please show the changes on top of Tiwei's code, to make it easier
> >for others to test?
> 
> Please try Tiwei's V4 instead of just waiting for the fixup. It should work
> as long as you don't try zerocopy or vIOMMU.

Yeah, actually v3 from both of you works well on my test bed, except for
resetting.

> 
> 
> >>Pktgen reports about 20% improvement on PPS (event index is off). More
> >>testing is ongoing.
> >>
> >>Notes for tester:
> >>
> >>- Starting from this version, vhost needs qemu co-operation to work
> >>   correctly. Alternatively, you can comment out the packed-specific code
> >>   for GET/SET_VRING_BASE.
> >Do you mean the code in vhost_virtqueue_start/stop?
> 
> For qemu, probably.
> 
> >Both Tiwei's and your v3
> >happen to work correctly, which should not be relied on since the rings are
> >definitely different.
> 
> I don't understand this, you mean reset work?

No, currently we have not handled vhost start/stop separately for the split
and packed rings, which is what I am working on now.

Wei

> 
> Thanks
> 
> >
> >Wei
> >
> >>Changes from V3:
> >>- Fix math on event idx checking
> >>- Sync last avail wrap counter through GET/SET_VRING_BASE
> >>- remove desc_event prefix in the driver/device structure
> >>
> >>Changes from V2:
> >>- do not use & in checking desc_event_flags
> >>- off should be most significant bit
> >>- remove the workaround of mergeable buffer for dpdk prototype
> >>- id should be in the last descriptor in the chain
> >>- keep _F_WRITE for write descriptor when adding used
> >>- device flags updating should use ADDR_USED type
> >>- return error on unexpected unavail descriptor in a chain
> >>- return false in vhost_ve_avail_empty if descriptor is available
> >>- track last seen avail_wrap_counter
> >>- correctly examine available descriptor in get_indirect_packed()
> >>- vhost_idx_diff should return u16 instead of bool
> >>
> >>Changes from V1:
> >>
> >>- Refactor vhost used elem code to avoid open coding on used elem
> >>- Event suppression support (compile test only).
> >>- Indirect descriptor support (compile test only).
> >>- Zerocopy support.
> >>- vIOMMU support.
> >>- SCSI/VSOCK support (compile test only).
> >>- Fix several bugs
> >>
> >>Jason Wang (8):
> >>   vhost: move get_rx_bufs to vhost.c
> >>   vhost: hide used ring layout from device
> >>   vhost: do not use vring_used_elem
> >>   vhost_net: do not explicitly manipulate vhost_used_elem
> >>   vhost: vhost_put_user() can accept metadata type
> >>   virtio: introduce packed ring defines
> >>   vhost: packed ring support
> >>   vhost: event suppression for packed ring
> >>
> >>  drivers/vhost/net.c| 136 ++
> >>  drivers/vhost/scsi.c   |  62 +--
> >>  drivers/vhost/vhost.c  | 861 -
> >>  drivers/vhost/vhost.h  |  47 +-
> >>  drivers/vhost/vsock.c  |  42 +-
> >>  include/uapi/linux/virtio_config.h |   9 +
> >>  include/uapi/linux/virtio_ring.h   |  32 ++
> >>  7 files changed, 928 insertions(+), 261 deletions(-)
> >>
> >>-- 
> >>2.7.4
> >>
> 


Re: [RFC V4 PATCH 0/8] Packed ring layout for vhost

2018-05-20 Thread Jason Wang



On 2018-05-21 00:25, Wei Xu wrote:
> On Wed, May 16, 2018 at 08:32:13PM +0800, Jason Wang wrote:
>> Hi all:
>>
>> This RFC implements the packed ring layout. The code was tested with
>> Tiwei's RFC V3 at https://lkml.org/lkml/2018/4/25/34. Some fixups and
>> tweaks were needed on top of Tiwei's code to make it run for event
>> index.
> Could you please show the changes on top of Tiwei's code, to make it easier
> for others to test?

Please try Tiwei's V4 instead of just waiting for the fixup. It should
work as long as you don't try zerocopy or vIOMMU.

>> Pktgen reports about 20% improvement on PPS (event index is off). More
>> testing is ongoing.
>>
>> Notes for tester:
>>
>> - Starting from this version, vhost needs qemu co-operation to work
>>   correctly. Alternatively, you can comment out the packed-specific code
>>   for GET/SET_VRING_BASE.
> Do you mean the code in vhost_virtqueue_start/stop?

For qemu, probably.

> Both Tiwei's and your v3
> happen to work correctly, which should not be relied on since the rings are
> definitely different.

I don't understand this, you mean reset work?

Thanks

> Wei
>
>> Changes from V3:
>> - Fix math on event idx checking
>> - Sync last avail wrap counter through GET/SET_VRING_BASE
>> - remove desc_event prefix in the driver/device structure
>>
>> Changes from V2:
>> - do not use & in checking desc_event_flags
>> - off should be most significant bit
>> - remove the workaround of mergeable buffer for dpdk prototype
>> - id should be in the last descriptor in the chain
>> - keep _F_WRITE for write descriptor when adding used
>> - device flags updating should use ADDR_USED type
>> - return error on unexpected unavail descriptor in a chain
>> - return false in vhost_ve_avail_empty if descriptor is available
>> - track last seen avail_wrap_counter
>> - correctly examine available descriptor in get_indirect_packed()
>> - vhost_idx_diff should return u16 instead of bool
>>
>> Changes from V1:
>>
>> - Refactor vhost used elem code to avoid open coding on used elem
>> - Event suppression support (compile test only).
>> - Indirect descriptor support (compile test only).
>> - Zerocopy support.
>> - vIOMMU support.
>> - SCSI/VSOCK support (compile test only).
>> - Fix several bugs
>>
>> Jason Wang (8):
>>   vhost: move get_rx_bufs to vhost.c
>>   vhost: hide used ring layout from device
>>   vhost: do not use vring_used_elem
>>   vhost_net: do not explicitly manipulate vhost_used_elem
>>   vhost: vhost_put_user() can accept metadata type
>>   virtio: introduce packed ring defines
>>   vhost: packed ring support
>>   vhost: event suppression for packed ring
>>
>>  drivers/vhost/net.c                | 136 ++
>>  drivers/vhost/scsi.c               |  62 +--
>>  drivers/vhost/vhost.c              | 861 -
>>  drivers/vhost/vhost.h              |  47 +-
>>  drivers/vhost/vsock.c              |  42 +-
>>  include/uapi/linux/virtio_config.h |   9 +
>>  include/uapi/linux/virtio_ring.h   |  32 ++
>>  7 files changed, 928 insertions(+), 261 deletions(-)
>>
>> --
>> 2.7.4





Re: [RFC V4 PATCH 0/8] Packed ring layout for vhost

2018-05-20 Thread Wei Xu
On Wed, May 16, 2018 at 08:32:13PM +0800, Jason Wang wrote:
> Hi all:
> 
> This RFC implements the packed ring layout. The code was tested with
> Tiwei's RFC V3 at https://lkml.org/lkml/2018/4/25/34. Some fixups and
> tweaks were needed on top of Tiwei's code to make it run for event
> index.

Could you please show the changes on top of Tiwei's code, to make it easier
for others to test?

> 
> Pktgen reports about 20% improvement on PPS (event index is off). More
> testing is ongoing.
> 
> Notes for tester:
> 
> - Starting from this version, vhost needs qemu co-operation to work
>   correctly. Alternatively, you can comment out the packed-specific code
>   for GET/SET_VRING_BASE.

Do you mean the code in vhost_virtqueue_start/stop? Both Tiwei's and your v3
happen to work correctly, which should not be relied on since the rings are
definitely different.

Wei

> 
> Changes from V3:
> - Fix math on event idx checking
> - Sync last avail wrap counter through GET/SET_VRING_BASE
> - remove desc_event prefix in the driver/device structure
> 
> Changes from V2:
> - do not use & in checking desc_event_flags
> - off should be most significant bit
> - remove the workaround of mergeable buffer for dpdk prototype
> - id should be in the last descriptor in the chain
> - keep _F_WRITE for write descriptor when adding used
> - device flags updating should use ADDR_USED type
> - return error on unexpected unavail descriptor in a chain
> - return false in vhost_ve_avail_empty if descriptor is available
> - track last seen avail_wrap_counter
> - correctly examine available descriptor in get_indirect_packed()
> - vhost_idx_diff should return u16 instead of bool
> 
> Changes from V1:
> 
> - Refactor vhost used elem code to avoid open coding on used elem
> - Event suppression support (compile test only).
> - Indirect descriptor support (compile test only).
> - Zerocopy support.
> - vIOMMU support.
> - SCSI/VSOCK support (compile test only).
> - Fix several bugs
> 
> Jason Wang (8):
>   vhost: move get_rx_bufs to vhost.c
>   vhost: hide used ring layout from device
>   vhost: do not use vring_used_elem
>   vhost_net: do not explicitly manipulate vhost_used_elem
>   vhost: vhost_put_user() can accept metadata type
>   virtio: introduce packed ring defines
>   vhost: packed ring support
>   vhost: event suppression for packed ring
> 
>  drivers/vhost/net.c| 136 ++
>  drivers/vhost/scsi.c   |  62 +--
>  drivers/vhost/vhost.c  | 861 -
>  drivers/vhost/vhost.h  |  47 +-
>  drivers/vhost/vsock.c  |  42 +-
>  include/uapi/linux/virtio_config.h |   9 +
>  include/uapi/linux/virtio_ring.h   |  32 ++
>  7 files changed, 928 insertions(+), 261 deletions(-)
> 
> -- 
> 2.7.4
> 


[RFC V4 PATCH 0/8] Packed ring layout for vhost

2018-05-16 Thread Jason Wang
Hi all:

This RFC implements the packed ring layout. The code was tested with
Tiwei's RFC V3 at https://lkml.org/lkml/2018/4/25/34. Some fixups and
tweaks were needed on top of Tiwei's code to make it run for event
index.

Pktgen reports about 20% improvement on PPS (event index is off). More
testing is ongoing.

Notes for tester:

- Starting from this version, vhost needs qemu co-operation to work
  correctly. Alternatively, you can comment out the packed-specific code
  for GET/SET_VRING_BASE.

Changes from V3:
- Fix math on event idx checking
- Sync last avail wrap counter through GET/SET_VRING_BASE
- remove desc_event prefix in the driver/device structure
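
As an illustration of the second item: one way to carry a wrap counter
alongside the last avail index in a single ring-base value is to reserve
the top bit of a 16-bit field. This is only a sketch. The bit position and
the names vring_base_pack() and vring_base_unpack() are assumptions made
for this example and are not taken from this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: bits 0-14 hold the index, bit 15 the wrap counter. */
#define BASE_WRAP_SHIFT 15
#define BASE_IDX_MASK   ((1u << BASE_WRAP_SHIFT) - 1)

/* Fold the index and the wrap counter into one value (hypothetical helper). */
static inline uint32_t vring_base_pack(uint16_t last_avail_idx, bool wrap)
{
	return (last_avail_idx & BASE_IDX_MASK) |
	       ((uint32_t)wrap << BASE_WRAP_SHIFT);
}

/* Split a packed value back into index and wrap counter. */
static inline void vring_base_unpack(uint32_t base, uint16_t *idx, bool *wrap)
{
	*idx = (uint16_t)(base & BASE_IDX_MASK);
	*wrap = (base >> BASE_WRAP_SHIFT) & 1;
}
```

With a layout like this, userspace could save and restore both values in
one GET/SET_VRING_BASE round trip, at the cost of limiting the ring index
to 15 bits.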

Changes from V2:
- do not use & in checking desc_event_flags
- off should be most significant bit
- remove the workaround of mergeable buffer for dpdk prototype
- id should be in the last descriptor in the chain
- keep _F_WRITE for write descriptor when adding used
- device flags updating should use ADDR_USED type
- return error on unexpected unavail descriptor in a chain
- return false in vhost_ve_avail_empty if descriptor is available
- track last seen avail_wrap_counter
- correctly examine available descriptor in get_indirect_packed()
- vhost_idx_diff should return u16 instead of bool
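
For context on the wrap counter and index items above, here is a minimal
sketch of how a device might decide that a packed-ring descriptor is
available, and how a distance between two ring indices can be computed as a
u16. The AVAIL (bit 7) and USED (bit 15) flag positions follow the virtio
1.1 packed ring description. The helper names desc_is_avail() and
ring_idx_diff() are hypothetical and are not the functions in this series.

```c
#include <stdbool.h>
#include <stdint.h>

#define DESC_F_AVAIL (1u << 7)   /* written by the driver */
#define DESC_F_USED  (1u << 15)  /* written by the device */

/*
 * A descriptor is available when its AVAIL bit matches the wrap counter
 * the device expects and its USED bit does not.
 */
static inline bool desc_is_avail(uint16_t flags, bool wrap_counter)
{
	bool avail = !!(flags & DESC_F_AVAIL);
	bool used = !!(flags & DESC_F_USED);

	return avail == wrap_counter && used != wrap_counter;
}

/*
 * Distance from old_idx to new_idx in a ring of num entries, accounting
 * for wrap-around. Both indices are assumed to be less than num.
 */
static inline uint16_t ring_idx_diff(uint16_t num, uint16_t old_idx,
				     uint16_t new_idx)
{
	if (new_idx >= old_idx)
		return (uint16_t)(new_idx - old_idx);
	return (uint16_t)(num - old_idx + new_idx);
}
```

Returning a count rather than a bool lets the caller compare the distance
against an event offset instead of only learning whether the indices
differ.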

Changes from V1:

- Refactor vhost used elem code to avoid open coding on used elem
- Event suppression support (compile test only).
- Indirect descriptor support (compile test only).
- Zerocopy support.
- vIOMMU support.
- SCSI/VSOCK support (compile test only).
- Fix several bugs

Jason Wang (8):
  vhost: move get_rx_bufs to vhost.c
  vhost: hide used ring layout from device
  vhost: do not use vring_used_elem
  vhost_net: do not explicitly manipulate vhost_used_elem
  vhost: vhost_put_user() can accept metadata type
  virtio: introduce packed ring defines
  vhost: packed ring support
  vhost: event suppression for packed ring

 drivers/vhost/net.c                | 136 ++
 drivers/vhost/scsi.c               |  62 +--
 drivers/vhost/vhost.c              | 861 -
 drivers/vhost/vhost.h              |  47 +-
 drivers/vhost/vsock.c              |  42 +-
 include/uapi/linux/virtio_config.h |   9 +
 include/uapi/linux/virtio_ring.h   |  32 ++
 7 files changed, 928 insertions(+), 261 deletions(-)

-- 
2.7.4


