Re: [Qemu-devel] Question about libvhost-user and vhost-user-bridge.c
Hi Stefan,

Thanks for your reply.

On Thu, Aug 15, 2019 at 7:07 AM Stefan Hajnoczi wrote:
> On Wed, Aug 14, 2019 at 10:54:34AM -0700, William Tu wrote:
> > Hi,
> >
> > I'm using libvhost-user.a to write a vhost backend, in order to receive
> > and send packets from/to VMs from OVS. I started by reading
> > vhost-user-bridge.c. I can now pass the initialization stage, seeing
> > .queue_set_started get invoked.
> >
> > However, I am stuck at receiving packets from the VM. Is it correct to do:
> > 1) check vu_queue_empty, started, and avail_bytes; if OK, then
>
> This step can be skipped because vu_queue_pop() returns NULL if there
> are no virtqueue elements available.
>
> > 2) elem = vu_queue_pop(&vubr->vudev, vq, sizeof(VuVirtqElement));
> > 3) the packet payload should be at elem->in_sg->iov_base + hdrlen? Or
> > at elem->out_sg?
>
> The driver->device buffers are elem->out_sg and the device->driver
> buffers are elem->in_sg.

OK, thanks. Then for the vswitch to receive from QEMU, I should look at the
driver->device buffers (elem->out_sg).

> Device implementations must not make assumptions about the layout of
> out_sg and in_sg (e.g. you cannot assume that in_sg[0]->iov_len ==
> sizeof(struct virtio_net_hdr) and you must handle the case where
> in_sg[0]->iov_len == 1).

OK, so I might need to copy the data into a single contiguous buffer.

> > I tried to hex dump the iov_base, but the content doesn't look like an
> > Ethernet header. I saw that in vubr_backend_recv_cb in
> > vhost-user-bridge.c we create another iovec and call
> > recvmsg(vubr->backend_udp_sock, &msg, 0);
> > I don't think I have to create a backend UDP socket, am I correct?
>
> Please see the VIRTIO specification for details of the virtio-net rx/tx
> virtqueue formats:
> https://docs.oasis-open.org/virtio/virtio/v1.1/cs01/virtio-v1.1-cs01.html#x1-2050006
>
> I think you may need to handle the struct virtio_net_hdr that comes
> before the Ethernet header.

Thanks, will look at it.

William

> Stefan
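Stefan's warning about not assuming anything about the sg layout suggests gathering the guest's scattered buffers into one contiguous buffer before parsing them, while skipping a header that may itself straddle iovec boundaries. A minimal sketch of that gather step, using plain struct iovec so it is independent of libvhost-user (the helper name iov_gather is illustrative, not from the thread):

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Copy up to buflen bytes from a scatter list into a contiguous buffer,
 * skipping the first `skip` bytes (e.g. the virtio-net header).  The
 * header may straddle iovec boundaries -- including the pathological
 * iov_len == 1 case Stefan mentions -- so `skip` is consumed
 * incrementally.  Returns the number of payload bytes copied. */
static size_t iov_gather(const struct iovec *sg, unsigned num,
                         size_t skip, void *buf, size_t buflen)
{
    size_t copied = 0;
    for (unsigned i = 0; i < num && copied < buflen; i++) {
        const char *base = sg[i].iov_base;
        size_t len = sg[i].iov_len;
        if (skip >= len) {          /* this iovec is entirely header */
            skip -= len;
            continue;
        }
        base += skip;               /* header ends inside this iovec */
        len -= skip;
        skip = 0;
        if (len > buflen - copied) {
            len = buflen - copied;  /* truncate to the caller's buffer */
        }
        memcpy((char *)buf + copied, base, len);
        copied += len;
    }
    return copied;
}
```

With libvhost-user, this would be called as iov_gather(elem->out_sg, elem->out_num, hdrlen, buf, buflen) after vu_queue_pop() returns a non-NULL element, where hdrlen depends on the negotiated features.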
[Qemu-devel] Question about libvhost-user and vhost-user-bridge.c
Hi,

I'm using libvhost-user.a to write a vhost backend, in order to receive and
send packets from/to VMs from OVS. I started by reading vhost-user-bridge.c.
I can now pass the initialization stage, seeing .queue_set_started get
invoked.

However, I am stuck at receiving packets from the VM. Is it correct to do:

1) check vu_queue_empty, started, and avail_bytes; if OK, then
2) elem = vu_queue_pop(&vubr->vudev, vq, sizeof(VuVirtqElement));
3) the packet payload should be at elem->in_sg->iov_base + hdrlen? Or at
   elem->out_sg?

I tried to hex dump the iov_base, but the content doesn't look like an
Ethernet header. I saw that in vubr_backend_recv_cb in vhost-user-bridge.c
we create another iovec and call recvmsg(vubr->backend_udp_sock, &msg, 0);
I don't think I have to create a backend UDP socket, am I correct?

Thanks
William
Re: [Qemu-devel] Low shared memory throughput at VM when using PCI mapping
This is just an update, in case you are interested in the outcome.

It turns out that my MTRR (Memory Type Range Register) configuration does
not take effect, so the shared memory region is always uncachable. My shared
memory is located at 0xf2400000, and the MTRR settings are below:

cat /proc/mtrr
reg00: base=0x0e0000000 ( 3584MB), size= 512MB, count=1: uncachable
reg01: base=0x0f2400000 ( 3876MB), size=   4MB, count=1: write-back

The first entry actually ranges from 0xe0000000 to 0xffffffff, which covers
0xf2400000. Even if I follow the Linux documentation (Documentation/mtrr.txt)
and create an overlapping MTRR as listed below, 0xf2400000 is still
uncachable:

// overlapping MTRR list:
reg00: base=0x0f2400000 ( 3876MB), size=   4MB, count=1: write-back
reg01: base=0x0e0000000 ( 3584MB), size= 512MB, count=1: uncachable

In the end, I excluded the shared memory region (0xf2400000) from the
uncachable ranges, and it finally becomes cacheable as I wished!

// non-overlapping MTRR list:
reg00: base=0x0e0000000 ( 3584MB), size= 256MB, count=1: uncachable
reg01: base=0x0f0000000 ( 3840MB), size=  32MB, count=1: uncachable
reg02: base=0x0f2000000 ( 3872MB), size=   4MB, count=1: uncachable
reg03: base=0x0f2800000 ( 3880MB), size=   8MB, count=1: uncachable
reg04: base=0x0f3000000 ( 3888MB), size=  16MB, count=1: uncachable
reg05: base=0x0f4000000 ( 3904MB), size=  64MB, count=1: uncachable
reg06: base=0x0f8000000 ( 3968MB), size= 128MB, count=1: uncachable

I also found one weird thing: I have to set up this MTRR at the very
beginning (after boot, before I do anything to this shared memory). Once I
touch the shared memory while its MTRR is uncachable, even though I modify
its MTRR entry to write-back afterwards, the kernel still treats it as
uncachable and the memory reads/writes show high latency. Does anyone run
into similar cases?
Here is my updated experiment of writing 400MB and reading 4MB:
--
op   , ioremap type   , jiffies
--
read , ioremap_nocache,  304
write, ioremap_nocache, 3336
read , ioremap_wc     ,  309
write, ioremap_wc     ,   23
read , ioremap_cache  ,   30
write, ioremap_cache  ,   22
--

Regards,
William (Cheng-Chun Tu)

On Wed, May 30, 2012 at 4:56 PM, William Tu u9012...@gmail.com wrote:

Hi Folks,

I'm using PCI device pass-through to pass a network device to a VM. Since
one of my additional requirements is to share memory between the VM and the
host, I pre-allocate a memory region on the host (say physaddr 0x100) and
put this address into BAR2 of the network device's PCI configuration space
(similar idea to ivshmem). The KVM guest boots up, and the device inside the
VM shows me a new BAR2 address as its guest-physical address (say 0x200). I
assume KVM automatically sets up the guest-physical to host-physical
mappings in its EPT for me, so that I can use ioremap(0x200, size) in the VM
to access the memory on the host.

However, I found that this memory seems to be ** uncacheable ** as its
read/write speed is quite slow. Frank and Cam suggest that using ioremap_wc
can speed things up quite a bit:
http://comments.gmane.org/gmane.comp.emulators.qemu/69172

In my case, ioremap_wc is indeed fast, but write combining only improves
write throughput. To increase both read and write speed, I tried
ioremap_cache and ioremap_nocache, but both show the same speed. Here is my
experiment of writing 400MB and reading 4MB:
--
op   , ioremap type   , jiffies
--
read , ioremap_nocache,  304
write, ioremap_nocache, 3336
read , ioremap_wc     ,  309
write, ioremap_wc     ,   23
read , ioremap_cache  ,  302
write, ioremap_cache  , 3284
--

Since all memory reads have the same speed, I guess the range of shared
memory is marked as uncacheable in the VM. I then configured the MTRR in the
VM to set this region as write-back.
cat /proc/mtrr
reg00: base=0x0e0000000 ( 3584MB), size= 512MB, count=1: uncachable
reg01: base=0x0f2400000 ( 3876MB), size=   4MB, count=1: write-back  <-- my shared memory addr at BAR2

Sadly, this does not improve my read/write performance, and ioremap_cache
and ioremap_nocache still show the same numbers. I'm now checking why the
MTRR does not take any effect, and also making sure the shared memory is
cacheable in both the host and the VM. Any comments or suggestions are
appreciated!

Regards,
William (Cheng-Chun Tu)
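The behavior described in the update at the top of this thread matches the variable-range MTRR precedence rules in the Intel SDM: when two variable ranges cover the same physical address and one is UC, the effective type is UC, so an overlapping write-back entry can never win against the 512MB uncachable one; carving the 4MB region out of the UC ranges is the only fix. A small model of that rule (ranges taken from the listings in the update; the def_wb parameter stands in for the MTRRdefType MSR's default type, which the thread does not show):

```c
#include <stdbool.h>
#include <stdint.h>

#define MB (1024ULL * 1024ULL)

struct mtrr {
    uint64_t base, size;
    bool write_back;   /* true = WB, false = UC */
};

/* Effective memory type of `addr` under variable MTRRs: if any
 * covering range is UC, the result is UC (UC takes precedence over
 * WB per the Intel SDM); WB only if every covering range is WB.
 * Uncovered addresses fall back to the default type `def_wb`. */
static bool effective_wb(const struct mtrr *regs, int n,
                         uint64_t addr, bool def_wb)
{
    bool covered = false, wb = true;
    for (int i = 0; i < n; i++) {
        if (addr >= regs[i].base && addr < regs[i].base + regs[i].size) {
            covered = true;
            wb = wb && regs[i].write_back;
        }
    }
    return covered ? wb : def_wb;
}
```

This is why the overlapping configuration left 0xf2400000 (3876MB) uncachable, while the non-overlapping list, which leaves a 4MB hole at 3876MB, lets the region take the default type instead.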
[Qemu-devel] Low shared memory throughput at VM when using PCI mapping
Hi Folks,

I'm using PCI device pass-through to pass a network device to a VM. Since
one of my additional requirements is to share memory between the VM and the
host, I pre-allocate a memory region on the host (say physaddr 0x100) and
put this address into BAR2 of the network device's PCI configuration space
(similar idea to ivshmem). The KVM guest boots up, and the device inside the
VM shows me a new BAR2 address as its guest-physical address (say 0x200). I
assume KVM automatically sets up the guest-physical to host-physical
mappings in its EPT for me, so that I can use ioremap(0x200, size) in the VM
to access the memory on the host.

However, I found that this memory seems to be ** uncacheable ** as its
read/write speed is quite slow. Frank and Cam suggest that using ioremap_wc
can speed things up quite a bit:
http://comments.gmane.org/gmane.comp.emulators.qemu/69172

In my case, ioremap_wc is indeed fast, but write combining only improves
write throughput. To increase both read and write speed, I tried
ioremap_cache and ioremap_nocache, but both show the same speed. Here is my
experiment of writing 400MB and reading 4MB:
--
op   , ioremap type   , jiffies
--
read , ioremap_nocache,  304
write, ioremap_nocache, 3336
read , ioremap_wc     ,  309
write, ioremap_wc     ,   23
read , ioremap_cache  ,  302
write, ioremap_cache  , 3284
--

Since all memory reads have the same speed, I guess the range of shared
memory is marked as uncacheable in the VM. I then configured the MTRR in the
VM to set this region as write-back.

cat /proc/mtrr
reg00: base=0x0e0000000 ( 3584MB), size= 512MB, count=1: uncachable
reg01: base=0x0f2400000 ( 3876MB), size=   4MB, count=1: write-back  <-- my shared memory addr at BAR2

Sadly, this does not improve my read/write performance, and ioremap_cache
and ioremap_nocache still show the same numbers. I'm now checking why the
MTRR does not take any effect, and also making sure the shared memory is
cacheable in both the host and the VM. Any comments or suggestions are
appreciated!

Regards,
William (Cheng-Chun Tu)
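For reference, MTRR entries can be added from userspace by writing a command string to /proc/mtrr, as described in the kernel's Documentation/mtrr.txt. A sketch that builds such a string (mtrr_add_cmd is a hypothetical helper, not from the thread; actually writing the string to /proc/mtrr requires root, and per the update in this thread the entry must be in place before the region is first touched):

```c
#include <stdint.h>
#include <stdio.h>

/* Build the command string that Documentation/mtrr.txt documents for
 * adding an entry via /proc/mtrr, e.g.
 *   base=0xf2400000 size=0x400000 type=write-back
 * which would then be written to /proc/mtrr as root:
 *   echo "base=0xf2400000 size=0x400000 type=write-back" > /proc/mtrr
 * Valid types include "write-back", "write-combining", "uncachable".
 * Returns the snprintf result (chars written, or negative on error). */
static int mtrr_add_cmd(char *out, size_t outlen,
                        uint64_t base, uint64_t size, const char *type)
{
    return snprintf(out, outlen, "base=0x%llx size=0x%llx type=%s",
                    (unsigned long long)base,
                    (unsigned long long)size, type);
}
```

Base and size must be aligned per the MTRR hardware rules (size a power of two, base aligned to size), which the 4MB region at 0xf2400000 satisfies.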