Hi Dave,

You were exactly right. It turned out to be a memory error, and to figure that out I had to go to the qemu log in /var/lib/libvirt/qemu/<vm name>.log, as well as ovs-vswitchd.log in /var/run/openvswitch.
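(For anyone chasing a similar silent failure at VM start, a minimal sketch of pulling the tail of those two logs, using the paths above; "myvm" is a placeholder domain name, and ovs-vswitchd.log may live under /var/log/openvswitch on other installs:)

```shell
# Dump the last lines of the two logs that exposed the memory error.
# "myvm" is a placeholder libvirt domain name -- substitute your own.
dump_vm_logs() {
    vm="$1"
    for f in "/var/lib/libvirt/qemu/${vm}.log" \
             "/var/run/openvswitch/ovs-vswitchd.log"; do
        if [ -r "$f" ]; then
            echo "=== last 20 lines of $f ==="
            tail -n 20 "$f"
        else
            echo "(no readable log at $f)"
        fi
    done
}

dump_vm_logs myvm
```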
So, fortunately, I found a blog by this fellow: https://osinstom.github.io/en/tutorial/configuring-ovs-dpdk-with-vm/#running-kvm-machine

It was one of only a few places on the web I could find with an exhaustive xml configuration. I suspected it was an xml issue, because I could get this to work if I ran a qemu-kvm command with options appended, in a bash script. I had also compared my script with what virsh was generating (by typing "ps -ef | grep qemu" after virsh launched the VM). So I did a diff between his xml and mine.

My ORIGINAL xml file had this:

<memory unit='KiB'>3145728</memory>
<currentMemory unit='KiB'>3145728</currentMemory>
<memoryBacking>
  <hugepages>
    <page size='1048576' unit='KiB' nodeset='0'/>
  </hugepages>
  <locked/>
</memoryBacking>
<vcpu placement='static'>2</vcpu>
<numatune>
  <memory mode='strict' nodeset='0'/>
</numatune>
<os>
  <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
  <boot dev='hd'/>
</os>
<features>
  <acpi/>
  <apic/>
  <vmport state='off'/>
</features>
<cpu mode='host-passthrough' check='none'>
  <topology sockets='1' cores='2' threads='1'/>
  <numa>
    <cell id='0' cpus='0-1' memory='3145728' unit='KiB' memAccess='shared'/>
  </numa>
</cpu>

And, sure enough, just as you have below, this is different from what you and he both had, where host-model was being used instead of host-passthrough. So this is the change I made to get the VM to come up (I also doubled the memory from 3 GB to 6 GB):
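(A tiny sketch of that comparison workflow: in practice you would `virsh dumpxml <domain> > mine.xml` and diff it against the known-good config; the here-docs below just stand in for the two files, showing the one-line <cpu> difference that mattered:)

```shell
# Stand-ins for "virsh dumpxml <domain>" output and the blog's config;
# only the <cpu> lines that differed are shown.
cat > old-cpu.xml <<'EOF'
<cpu mode='host-passthrough' check='none'>
EOF
cat > new-cpu.xml <<'EOF'
<cpu mode='host-model' check='partial'>
  <model fallback='allow'/>
EOF
diff -u old-cpu.xml new-cpu.xml || true   # diff exits 1 when files differ
```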
<memory unit='KiB'>6291456</memory>
<currentMemory unit='KiB'>6291456</currentMemory>
<memoryBacking>
  <hugepages>
    <page size='1048576' unit='KiB' nodeset='0'/>
  </hugepages>
  <locked/>
</memoryBacking>
<vcpu placement='static'>2</vcpu>
<numatune>
  <memory mode='strict' nodeset='0'/>
</numatune>
<os>
  <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
  <boot dev='hd'/>
</os>
<features>
  <acpi/>
  <apic/>
  <vmport state='off'/>
</features>
<cpu mode='host-model' check='partial'>
  <model fallback='allow'/>
  <topology sockets='1' cores='2' threads='1'/>
  <numa>
    <cell id='0' cpus='0-1' memory='6291456' unit='KiB' memAccess='shared'/>
  </numa>
</cpu>

Now, one thing the fellow in the blog is doing that I am (still) not doing is cpu pinning with <cputune>. He has this section:

<cputune>
  <shares>4096</shares>
  <vcpupin vcpu='0' cpuset='14'/>
  <vcpupin vcpu='1' cpuset='15'/>
  <emulatorpin cpuset='11,13'/>
</cputune>

I know OpenStack uses this concept of vcpu pinning with its vcpu_pin_set option, so I am somewhat familiar with it. But I have not had time to study this section closely enough to be sure I understand it, and therefore have not used it (I know things can go sideways if you do not pin properly). For instance, I don't know what the emulatorpin cpuset is, versus the vcpu assignments.

Also, I had to increase my allocation to 8 hugepages (1 GB each) to get this VM to crank:
- The VM uses 6 hugepages.
- OpenVSwitch uses 1 hugepage (initialized with 1024, with a max limit of 2048).
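(On the emulatorpin question: per the libvirt domain XML format, <vcpupin> binds an individual guest vCPU thread to specific host CPUs, while <emulatorpin> binds QEMU's own emulator and I/O threads, keeping them from stealing cycles from the pinned vCPUs. A hypothetical, untested adaptation to the 4-CPU node shown below might look like this; note the chosen cores must not overlap the OVS PMD cores, which the logs below show on cores 2-3:)

```xml
<cputune>
  <!-- Pin each guest vCPU to its own host core (hypothetical choice of
       cores 0 and 1, to stay clear of the PMD cores 2-3). -->
  <vcpupin vcpu='0' cpuset='0'/>
  <vcpupin vcpu='1' cpuset='1'/>
  <!-- Pin QEMU's emulator/I/O threads; ideally these get a core of their
       own, but on a 4-core box they have to share with the vCPUs here. -->
  <emulatorpin cpuset='0-1'/>
</cputune>
```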
So when I run the hugepages dump without the VM running, it looks like this:

./check-numa.sh
available: 1 nodes (0)
node 0 cpus: 0 1 2 3
node 0 size: 15954 MB
node 0 free: 2429 MB
node distances:
node   0
  0:  10

                 Node 0  Total
AnonHugePages         0      0
HugePages_Total    8192   8192
HugePages_Free     7168   7168
HugePages_Surp        0      0

And when I start the VM, it looks like this:

./check-numa.sh
available: 1 nodes (0)
node 0 cpus: 0 1 2 3
node 0 size: 15954 MB
node 0 free: 1923 MB
node distances:
node   0
  0:  10

                 Node 0  Total
AnonHugePages         0      0
HugePages_Total    8192   8192
HugePages_Free     1024   1024
HugePages_Surp        0      0

(numastat reports these values in MB, so HugePages_Total 8192 corresponds to the 8 x 1 GB pages.)

One of the KEY THINGS I did not understand was why the "whole VM" needed to be backed by hugepages, just to get the benefit of the DPDK ports (which are on hugepages via OVS). Having a 16 GB RAM box I am testing on, I can *barely* fit this VM and OpenVSwitch - and maybe one small 1 GB RAM VM - on the box, and still have enough non-hugepage RAM left to run the operating system (CentOS with gnome, which eats up some RAM, plus some OpenStack compute node services running on it).

On 9/21/20, 5:13 PM, "discuss on behalf of David Christensen" <ovs-discuss-boun...@openvswitch.org on behalf of d...@linux.vnet.ibm.com> wrote:

On 9/14/20 1:20 PM, Wittling, Mark (CCI-Atlanta) wrote:
> I did some more testing today on this issue. I will include some more
> information in case anyone is able to provide a suggestion on how to fix
> this.
>
> Next, we will add our dpdkvhost1 vhostuser port.
>
> Let's dump the log of ovs-vswitchd.log - which shows everything to be in
> order.
>
> 2020-09-14T20:04:28.237Z|00217|bridge|INFO|bridge br-tun: deleted
> interface dpdkvhost1 on port 2
>
> 2020-09-14T20:04:28.237Z|00218|dpif_netdev|INFO|Core 2 on numa node 0
> assigned port 'dpdk0' rx queue 0 (measured processing cycles 0).
> 2020-09-14T20:05:19.296Z|00219|dpdk|INFO|VHOST_CONFIG: vhost-user
> server: socket created, fd: 49
>
> 2020-09-14T20:05:19.296Z|00220|netdev_dpdk|INFO|Socket
> /var/run/openvswitch/dpdkvhost1 created for vhost-user port dpdkvhost1
>
> 2020-09-14T20:05:19.296Z|00221|dpdk|INFO|VHOST_CONFIG: bind to
> /var/run/openvswitch/dpdkvhost1
>
> 2020-09-14T20:05:19.296Z|00222|dpif_netdev|INFO|Core 2 on numa node 0
> assigned port 'dpdkvhost1' rx queue 0 (measured processing cycles 0).
>
> 2020-09-14T20:05:19.296Z|00223|dpif_netdev|INFO|Core 3 on numa node 0
> assigned port 'dpdk0' rx queue 0 (measured processing cycles 0).
>
> 2020-09-14T20:05:19.296Z|00224|bridge|INFO|bridge br-tun: added
> interface dpdkvhost1 on port 2
>
> Next, we will launch our virtual machine with the virt-manager GUI.
> Here is my xml file snippet:
>
> <interface type='vhostuser'>
>   <mac address='52:54:00:d1:ba:7a'/>
>   <source type='unix' path='/var/run/openvswitch/dpdkvhost1' mode='client'/>
>   <model type='virtio'/>
>   <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
> </interface>

Have you enabled memAccess in your VM's configuration? Something like:

<cpu mode='host-model' check='partial'>
  <model fallback='allow'/>
  <numa>
    <cell id='0' cpus='0-7' memory='8388608' unit='KiB' memAccess='shared'/>
  </numa>
</cpu>

And what about hugepage setup?

<memoryBacking>
  <hugepages>
    <page size='1048576' unit='KiB' nodeset='0'/>
  </hugepages>
</memoryBacking>

I recall having a non-obvious issue without the "memAccess" setting in place.

Dave

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss