Re: [Qemu-devel] Question about libvhost-user and vhost-user-bridge.c

2019-08-15 Thread William Tu
Hi Stefan,

Thanks for your reply.

On Thu, Aug 15, 2019 at 7:07 AM Stefan Hajnoczi  wrote:
>
> On Wed, Aug 14, 2019 at 10:54:34AM -0700, William Tu wrote:
> > Hi,
> >
> > I'm using libvhost-user.a to write a vhost backend, in order to receive
> > and send packets from/to VMs on behalf of OVS. I started by reading
> > vhost-user-bridge.c. I can now get past the initialization stage, seeing
> > .queue_set_started get invoked.
> >
> > However, I am stuck at receiving packets from the VM.
> > So is it correct to do:
> > 1) check vu_queue_empty, started, and avail_bytes, and if OK, then
>
> This step can be skipped because vu_queue_pop() returns NULL if there
> are no virtqueue elements available.
>
> > 2) elem = vu_queue_pop(&dev->vudev, vq, sizeof(VuVirtqElement));
> > 3) the packet payload should be at elem->in_sg->iov_base + hdrlen, or
> > at elem->out_sg?
>
> The driver->device buffers are elem->out_sg and the device->driver
> buffers are elem->in_sg.

OK, thanks. Then for the vswitch to receive packets from the guest, I should
look at the driver->device buffers, i.e. elem->out_sg.
>
> Device implementations must not make assumptions about the layout of
> out_sg and in_sg (e.g. you cannot assume that in_sg[0].iov_len ==
> sizeof(struct virtio_net_hdr), and you must handle the case where
> in_sg[0].iov_len == 1).

OK, so I might need to copy it into a single contiguous buffer.
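
Roughly something like the sketch below, based on the libvhost-user API that
vhost-user-bridge.c uses; forward_packet() and the 64KB bounce buffer are
placeholders rather than OVS code, and the header length depends on the
negotiated features:

/* Sketch only: drain the guest's TX queue and hand each frame to the
 * vswitch as one contiguous buffer. */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <linux/virtio_net.h>   /* or QEMU's standard-headers/ copy */
#include "libvhost-user.h"

/* Placeholder for the OVS-side handler (not a real OVS API). */
static void forward_packet(const uint8_t *pkt, size_t len)
{
    (void)pkt;
    (void)len;
}

static void drain_tx_queue(VuDev *vdev, VuVirtq *vq)
{
    /* 12 bytes with VIRTIO_NET_F_MRG_RXBUF (or VIRTIO_F_VERSION_1),
     * 10 bytes otherwise. */
    size_t hdrlen = vu_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF) ?
                    sizeof(struct virtio_net_hdr_mrg_rxbuf) :
                    sizeof(struct virtio_net_hdr);
    VuVirtqElement *elem;

    /* vu_queue_pop() returns NULL when the queue is empty, so no separate
     * vu_queue_empty()/avail-bytes check is needed. */
    while ((elem = vu_queue_pop(vdev, vq, sizeof(VuVirtqElement)))) {
        uint8_t buf[65536];
        size_t len = 0;

        /* Driver->device data is in out_sg; the header and the Ethernet
         * frame may be split across the iovecs in arbitrary ways, so
         * linearize them first. */
        for (unsigned int i = 0; i < elem->out_num; i++) {
            size_t n = elem->out_sg[i].iov_len;
            if (n > sizeof(buf) - len) {
                n = sizeof(buf) - len;
            }
            memcpy(buf + len, elem->out_sg[i].iov_base, n);
            len += n;
        }

        /* The Ethernet frame starts right after the virtio-net header. */
        if (len > hdrlen) {
            forward_packet(buf + hdrlen, len - hdrlen);
        }

        vu_queue_push(vdev, vq, elem, 0);   /* nothing written back */
        free(elem);
    }
    vu_queue_notify(vdev, vq);
}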

>
> > I tried to hex dump the iov_base, but the content doesn't look like it
> > contains an Ethernet header. I saw that in vubr_backend_recv_cb in
> > vhost-user-bridge.c we create another iovec and call
> > recvmsg(vubr->backend_udp_sock, &msg, 0);
> > I don't think I have to create a backend UDP socket, am I correct?
>
> Please see the VIRTIO specification for details of the virtio-net rx/tx
> virtqueue formats:
> https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-2050006
>
> I think you may need to handle the struct virtio_net_hdr that comes
> before the Ethernet header.
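
For reference, a sketch of that header's layout (field names following
<linux/virtio_net.h>; all multi-byte fields are little-endian, and with
VIRTIO_NET_F_MRG_RXBUF or VIRTIO_F_VERSION_1 a 16-bit num_buffers field
follows, making the header 12 bytes instead of 10):

#include <stdint.h>

/* Per-packet header defined by the VIRTIO spec; the Ethernet frame starts
 * immediately after it. */
struct virtio_net_hdr {
    uint8_t  flags;        /* e.g. VIRTIO_NET_HDR_F_NEEDS_CSUM */
    uint8_t  gso_type;     /* VIRTIO_NET_HDR_GSO_NONE for plain frames */
    uint16_t hdr_len;      /* length of the headers GSO should replicate */
    uint16_t gso_size;     /* GSO segment size */
    uint16_t csum_start;   /* where checksumming starts */
    uint16_t csum_offset;  /* checksum field offset from csum_start */
};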

Thanks, I will look at it.
William
>
> Stefan



[Qemu-devel] Question about libvhost-user and vhost-user-bridge.c

2019-08-14 Thread William Tu
Hi,

I'm using libvhost-user.a to write a vhost backend, in order to receive and
send packets from/to VMs on behalf of OVS. I started by reading vhost-user-bridge.c.
I can now get past the initialization stage, seeing .queue_set_started get invoked.

However, I am stuck at receiving packets from the VM.
So is it correct to do:
1) check vu_queue_empty, started, and avail_bytes, and if OK, then
2) elem = vu_queue_pop(&dev->vudev, vq, sizeof(VuVirtqElement));
3) the packet payload should be at elem->in_sg->iov_base + hdrlen, or
at elem->out_sg?

I tried to hex dump the iov_base, but the content doesn't look like it
contains an Ethernet header. I saw that in vubr_backend_recv_cb in
vhost-user-bridge.c we create another iovec and call
recvmsg(vubr->backend_udp_sock, &msg, 0);
I don't think I have to create a backend UDP socket, am I correct?

Thanks
William



Re: [Qemu-devel] Low shared memory throughput at VM when using PCI mapping

2012-05-31 Thread William Tu
This is just an update in case you are interested in the outcome. It turns
out that my MTRR (Memory Type Range Register) configuration does not take
effect, so the shared memory region is always uncacheable. My shared memory
is located at 0xf2400000, and the MTRR settings are below:

 cat /proc/mtrr
reg00: base=0x0e0000000 ( 3584MB), size=  512MB, count=1: uncachable
reg01: base=0x0f2400000 ( 3876MB), size=    4MB, count=1: write-back

The first entry actually ranges from 0xe0000000 to 0xffffffff, which covers
0xf2400000. Even if I follow the Linux documentation (Documentation/mtrr.txt)
and create an overlapping MTRR, as listed below, 0xf2400000 is still
uncacheable:

// overlapping way:
reg00: base=0x0f2400000 ( 3876MB), size=    4MB, count=1: write-back
reg01: base=0x0e0000000 ( 3584MB), size=  512MB, count=1: uncachable

In the end, I tried excluding the shared memory region (0xf2400000) from the
uncacheable region, and 0xf2400000 finally became cacheable as I wished!

// non-overlapping MTRR list:
reg00: base=0x0e0000000 ( 3584MB), size=  256MB, count=1: uncachable
reg01: base=0x0f0000000 ( 3840MB), size=   32MB, count=1: uncachable
reg02: base=0x0f2000000 ( 3872MB), size=    4MB, count=1: uncachable
reg03: base=0x0f2800000 ( 3880MB), size=    8MB, count=1: uncachable
reg04: base=0x0f3000000 ( 3888MB), size=   16MB, count=1: uncachable
reg05: base=0x0f4000000 ( 3904MB), size=   64MB, count=1: uncachable
reg06: base=0x0f8000000 ( 3968MB), size=  128MB, count=1: uncachable

I also found one weird thing: I have to set up this MTRR at the very
beginning (after boot, before I touch the shared memory at all). Once I
touch the shared memory while its MTRR type is uncacheable, then even if I
change its MTRR entry to write-back afterwards, the kernel still treats it
as uncacheable and memory reads/writes show high latency. Does anyone run
into similar cases?
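
For what it's worth, besides writing to /proc/mtrr by hand,
Documentation/mtrr.txt also documents an ioctl interface, so the entry can be
added programmatically right after boot. A minimal sketch using the base/size
from the numbers above (on kernels with PAT, the ioremap_* variant chosen in
the guest still has to agree with the MTRR type):

/* Sketch: add a write-back MTRR for the shared-memory BAR via the ioctl
 * interface from Documentation/mtrr.txt.  Run as root, before the region
 * is touched through an uncacheable mapping. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <asm/mtrr.h>

int main(void)
{
    struct mtrr_sentry entry = {
        .base = 0xf2400000,        /* guest-physical BAR2 address */
        .size = 4 * 1024 * 1024,   /* 4MB region */
        .type = MTRR_TYPE_WRBACK,
    };
    int fd = open("/proc/mtrr", O_WRONLY);

    if (fd < 0) {
        perror("open /proc/mtrr");
        return 1;
    }
    if (ioctl(fd, MTRRIOC_ADD_ENTRY, &entry) < 0) {
        perror("MTRRIOC_ADD_ENTRY");
        close(fd);
        return 1;
    }
    close(fd);
    return 0;
}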

Here is my updated experiment of writing 400MB and reading 4MB:
------------------------------------------
 op   ,  ioremap type   ,  jiffies
------------------------------------------
 read ,  ioremap_nocache,      304
 write,  ioremap_nocache,     3336
 read ,  ioremap_wc     ,      309
 write,  ioremap_wc     ,       23
 read ,  ioremap_cache  ,       30
 write,  ioremap_cache  ,       22
------------------------------------------


Regards,
William (Cheng-Chun Tu)

On Wed, May 30, 2012 at 4:56 PM, William Tu u9012...@gmail.com wrote:
 Hi Folks,

 I'm using PCI device pass-through to pass a network device to a VM.
 Since one of my additional requirements is to share memory between the
 VM and the host, I pre-allocate memory on the host (say physaddr 0x100)
 and put this address into BAR2 of the network device's PCI configuration
 space (a similar idea to ivshmem).

 The VM boots up and the device inside the VM shows me a new BAR2 address
 as its guest physical address (say addr 0x200). I assume KVM
 automatically sets up the guest-physical to host-physical mapping in its
 EPT for me, so that I can use ioremap(0x200, size) in the VM to access
 the memory on the host.

 However, I found that this memory seems to be **uncacheable**, as its
 read/write speed is quite slow. Frank and Cam suggested that using
 ioremap_wc can speed things up quite a bit:
 http://comments.gmane.org/gmane.comp.emulators.qemu/69172

 In my case, ioremap_wc is indeed fast, but write combining only helps
 write throughput. To increase both read and write speed, I tried
 ioremap_cache and ioremap_nocache, but both show the same speed.

 Here is my experiment of writing 400MB and reading 4MB:
 ------------------------------------------
  op   ,  ioremap type   ,  jiffies
 ------------------------------------------
  read ,  ioremap_nocache,      304
  write,  ioremap_nocache,     3336
  read ,  ioremap_wc     ,      309
  write,  ioremap_wc     ,       23
  read ,  ioremap_cache  ,      302
  write,  ioremap_cache  ,     3284
 ------------------------------------------

 Since all memory reads have the same speed, I guess the range of shared
 memory is marked as uncacheable in the VM. Then I configured the MTRR in
 the VM to set this region as write-back.

 cat /proc/mtrr
 reg00: base=0x0e0000000 ( 3584MB), size=  512MB, count=1: uncachable
 reg01: base=0x0f2400000 ( 3876MB), size=    4MB, count=1: write-back
 <-- my shared memory addr at BAR2

 Sadly this does not improve my read/write performance, and ioremap_cache
 and ioremap_nocache still show the same numbers. I'm now checking why the
 MTRR does not take effect and also making sure the shared memory is
 cacheable on both the host and the VM. Any comments or suggestions are
 appreciated!


 Regards,
 William (Cheng-Chun Tu)



[Qemu-devel] Low shared memory throughput at VM when using PCI mapping

2012-05-30 Thread William Tu
Hi Folks,

I'm using PCI device pass-through to pass a network device to a VM.
Since one of my additional requirements is to share memory between the
VM and the host, I pre-allocate memory on the host (say physaddr 0x100)
and put this address into BAR2 of the network device's PCI configuration
space (a similar idea to ivshmem).

The VM boots up and the device inside the VM shows me a new BAR2 address
as its guest physical address (say addr 0x200). I assume KVM
automatically sets up the guest-physical to host-physical mapping in its
EPT for me, so that I can use ioremap(0x200, size) in the VM to access
the memory on the host.

However, I found that this memory seems to be **uncacheable**, as its
read/write speed is quite slow. Frank and Cam suggested that using
ioremap_wc can speed things up quite a bit:
http://comments.gmane.org/gmane.comp.emulators.qemu/69172

In my case, ioremap_wc is indeed fast, but write combining only helps
write throughput. To increase both read and write speed, I tried
ioremap_cache and ioremap_nocache, but both show the same speed.
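
For context, the guest-side mapping being compared boils down to something
like the sketch below; the driver skeleton and the vendor/device IDs are
placeholders, and only the pci_*/ioremap_* calls are the stock kernel APIs:

/* Guest-side sketch: map BAR2 of the passed-through NIC with a chosen
 * caching attribute so read/write latency can be compared. */
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/io.h>

static void __iomem *shm;
static resource_size_t shm_len;

static int shm_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
    resource_size_t bar2 = pci_resource_start(pdev, 2);
    int ret = pci_enable_device(pdev);

    if (ret)
        return ret;

    shm_len = pci_resource_len(pdev, 2);

    /* Pick the attribute to compare:
     *   ioremap_nocache() -> uncached, slow reads and writes
     *   ioremap_wc()      -> write-combining, speeds up writes only
     *   ioremap_cache()   -> write-back, but only if MTRR/PAT permit it
     */
    shm = ioremap_wc(bar2, shm_len);
    if (!shm) {
        pci_disable_device(pdev);
        return -ENOMEM;
    }
    return 0;
}

static void shm_remove(struct pci_dev *pdev)
{
    iounmap(shm);
    pci_disable_device(pdev);
}

static const struct pci_device_id shm_ids[] = {
    { PCI_DEVICE(0x1234, 0x5678) },   /* placeholder vendor/device ID */
    { 0, }
};

static struct pci_driver shm_driver = {
    .name     = "shm_bar2",
    .id_table = shm_ids,
    .probe    = shm_probe,
    .remove   = shm_remove,
};

static int __init shm_init(void)
{
    return pci_register_driver(&shm_driver);
}

static void __exit shm_exit(void)
{
    pci_unregister_driver(&shm_driver);
}

module_init(shm_init);
module_exit(shm_exit);
MODULE_LICENSE("GPL");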

Here is my experiment of writing 400MB and reading 4MB:
------------------------------------------
 op   ,  ioremap type   ,  jiffies
------------------------------------------
 read ,  ioremap_nocache,      304
 write,  ioremap_nocache,     3336
 read ,  ioremap_wc     ,      309
 write,  ioremap_wc     ,       23
 read ,  ioremap_cache  ,      302
 write,  ioremap_cache  ,     3284
------------------------------------------

Since all memory reads have the same speed, I guess the range of shared
memory is marked as uncacheable in the VM. Then I configured the MTRR in
the VM to set this region as write-back.

 cat /proc/mtrr
reg00: base=0x0e0000000 ( 3584MB), size=  512MB, count=1: uncachable
reg01: base=0x0f2400000 ( 3876MB), size=    4MB, count=1: write-back
<-- my shared memory addr at BAR2

Sadly this does not improve my read/write performance, and ioremap_cache
and ioremap_nocache still show the same numbers. I'm now checking why the
MTRR does not take effect and also making sure the shared memory is
cacheable on both the host and the VM. Any comments or suggestions are
appreciated!


Regards,
William (Cheng-Chun Tu)