[Bug 1787680] Re: guest fails to start if another running guest has an interface in the same bridge
Someone smarter than me here helped me figure this out.

On this hypervisor there is also an OVS bridge, br1. Some time ago it had a bunch of vnetX ports added to it, which have all since been deleted. However, they still remained in OVSDB:

root@demo01kvm01:~# ovs-vsctl show
3d2868d6-cd85-4039-ad45-079535b7a1a5
    Bridge "br1"
        Controller "tcp:127.0.0.1:6633"
        Port "eth4"
            Interface "eth4"
        Port "eth3"
            Interface "eth3"
        Port "vnet2"
            Interface "vnet2"
                error: "could not open network device vnet2 (No such device)"
        Port "eth5"
            Interface "eth5"
        Port "eth2"
            Interface "eth2"
        Port "br1"
            Interface "br1"
                type: internal
        Port "vnet3"
            Interface "vnet3"
                error: "could not open network device vnet3 (No such device)"
        Port "vnet4"
            Interface "vnet4"
                error: "could not open network device vnet4 (No such device)"
        Port "vnet1"
            Interface "vnet1"
                error: "could not open network device vnet1 (No such device)"
    ovs_version: "2.5.4"

Since the devices had been deleted and no longer existed, libvirtd was happily allocating the name vnet1 to the next device it needed. As soon as that device was created, OVS took control of it, since the name was already in its database, so libvirtd was not able to add it to br0. The error message just said that the resource is busy, but there was no way to tell that it was busy because it was already part of br1.

Once I manually deleted the vnet1 port from OVS (even though the device doesn't exist), everything worked fine and the domain started properly.
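For anyone hitting the same conflict, the stale entries can be picked out of the `ovs-vsctl show` output mechanically. The following is only a minimal sketch: the function name `stale_ovs_ports` is made up for illustration, and it assumes the indented layout and the "No such device" error text shown in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of OVS): read `ovs-vsctl show` output on stdin
# and print "<bridge> <port>" for every port whose interface reports
# "No such device", i.e. a port left in OVSDB after its device was deleted.
stale_ovs_ports() {
    awk '
        $1 == "Bridge" { gsub(/"/, "", $2); bridge = $2 }   # remember current bridge
        $1 == "Port"   { gsub(/"/, "", $2); port = $2 }     # remember current port
        $1 == "error:" && /No such device/ { print bridge, port }
    '
}
```

A cautious way to use it is to print the cleanup commands first and run them only after review, for example: `ovs-vsctl show | stale_ovs_ports | while read -r br port; do echo ovs-vsctl del-port "$br" "$port"; done`.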
root@demo01kvm01:~# ovs-vsctl del-port br1 vnet1
root@demo01kvm01:~# ovs-vsctl show
3d2868d6-cd85-4039-ad45-079535b7a1a5
    Bridge "br1"
        Controller "tcp:127.0.0.1:6633"
        Port "eth4"
            Interface "eth4"
        Port "eth3"
            Interface "eth3"
        Port "vnet2"
            Interface "vnet2"
                error: "could not open network device vnet2 (No such device)"
        Port "eth5"
            Interface "eth5"
        Port "eth2"
            Interface "eth2"
        Port "br1"
            Interface "br1"
                type: internal
        Port "vnet3"
            Interface "vnet3"
                error: "could not open network device vnet3 (No such device)"
        Port "vnet4"
            Interface "vnet4"
                error: "could not open network device vnet4 (No such device)"
    ovs_version: "2.5.4"
root@demo01kvm01:~# virsh start demo01inmon01
Domain demo01inmon01 started

So I'm not sure if this counts as a bug, since this is just a conflict with OVS. But it comes from trying to reuse device names, and allocating something that was previously used and still exists in OVSDB.

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1787680

Title:
  guest fails to start if another running guest has an interface in the same bridge

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1787680/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
Re: [Bug 1787680] Re: guest fails to start if another running guest has an interface in the same bridge
Most operations are asynchronous: virsh only tells libvirtd what to do and then waits for a reply, so stracing virsh won't help.

You could enable libvirtd debug logging and take a look at the log.
See https://libvirt.org/logging.html
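As a rough illustration of the suggestion above, debug logging is switched on through the `log_filters` and `log_outputs` settings in libvirtd.conf. The sketch below is an assumption-laden example: the helper name `enable_libvirtd_debug` is hypothetical, the default paths are the usual ones but may differ on a given system, and the filter string is just one commonly suggested set.

```shell
#!/bin/sh
# Hypothetical helper: append debug-logging settings to a libvirtd config file.
# $1 = config file (assumed default: /etc/libvirt/libvirtd.conf)
# $2 = log file    (assumed default: /var/log/libvirt/libvirtd.log)
enable_libvirtd_debug() {
    conf=${1:-/etc/libvirt/libvirtd.conf}
    log=${2:-/var/log/libvirt/libvirtd.log}
    # Level 1 is debug; the noisier object/json/event modules are kept at
    # level 4 (errors only) so the log stays readable.
    cat >>"$conf" <<EOF
log_filters="1:qemu 1:libvirt 4:object 4:json 4:event 1:util"
log_outputs="1:file:$log"
EOF
}
```

After running it as root, the daemon has to be restarted for the settings to take effect (the service name varies between releases), and the failing `virsh start` can then be retried while watching the log file.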
[Bug 1787680] Re: guest fails to start if another running guest has an interface in the same bridge
Here's what I'm getting in dmesg:

[Tue Aug 21 13:50:41 2018] audit: type=1400 audit(1534873858.383:598): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libvirt-2d5accd3-7992-4270-a175-4a6a03f22a1e" pid=26347 comm="apparmor_parser"
[Tue Aug 21 13:50:41 2018] audit: type=1400 audit(1534873858.383:599): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libvirt-2d5accd3-7992-4270-a175-4a6a03f22a1e//qemu_bridge_helper" pid=26347 comm="apparmor_parser"
[Tue Aug 21 13:50:41 2018] device vnet1 entered promiscuous mode
[Tue Aug 21 13:50:41 2018] device vnet1 left promiscuous mode
[Tue Aug 21 13:50:42 2018] audit: type=1400 audit(1534873859.003:600): apparmor="STATUS" operation="profile_remove" profile="unconfined" name="libvirt-2d5accd3-7992-4270-a175-4a6a03f22a1e" pid=26422 comm="apparmor_parser"

There are no denies, just STATUS. I'm not sure why vnet1 tries to go promiscuous. It's not configured to do that anywhere, at least as far as I know, and it's not supposed to be.

Also, here's an strace for the virsh start, if that helps in any way.
root@demo01kvm01:~# strace virsh start demo01inmon01
execve("/usr/bin/virsh", ["virsh", "start", "demo01inmon01"], [/* 22 vars */]) = 0
brk(NULL) = 0x55e33209a000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=30694, ...}) = 0
mmap(NULL, 30694, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ff8401bd000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/usr/lib/x86_64-linux-gnu/libvirt-lxc.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\f\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=10168, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff8401bc000
mmap(NULL, 2105392, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ff83fd9d000
mprotect(0x7ff83fd9f000, 2093056, PROT_NONE) = 0
mmap(0x7ff83ff9e000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1000) = 0x7ff83ff9e000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/usr/lib/x86_64-linux-gnu/libvirt-qemu.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\v\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=14264, ...}) = 0
mmap(NULL, 2109488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ff83fb99000
mprotect(0x7ff83fb9b000, 2097152, PROT_NONE) = 0
mmap(0x7ff83fd9b000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7ff83fd9b000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libreadline.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220/\1\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=282392, ...}) = 0
mmap(NULL, 2382904, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ff83f953000
mprotect(0x7ff83f99, 2097152, PROT_NONE) = 0
mmap(0x7ff83fb9, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3d000) = 0x7ff83fb9
mmap(0x7ff83fb98000, 3128, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7ff83fb98000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/usr/lib/x86_64-linux-gnu/libvirt.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0005\6\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=3693920, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff8401bb000
mmap(NULL, 5794856, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ff83f3cc000
mprotect(0x7ff83f737000, 2097152, PROT_NONE) = 0
mmap(0x7ff83f937000, 110592, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x36b000) = 0x7ff83f937000
mmap(0x7ff83f952000, 3112, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7ff83f952000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/usr/lib/x86_64-linux-gnu/libxml2.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\326\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=1809656, ...}) = 0
mmap(NULL, 3910072, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ff83f011000
mprotect(0x7ff83f1c2000, 2093056, PROT_NONE) = 0
mmap(0x7ff83f3c1000, 40960, PROT_READ|PROT_WRITE,
[Bug 1787680] Re: guest fails to start if another running guest has an interface in the same bridge
Ok, it seems that in this combination demo01gravwell01 is the one already up, as it has a live network section. demo01inmon01 doesn't; I assume it was not started due to that issue.

It all looks quite normal for a bridge setup. Trying to recreate with that:

$ sudo brctl addbr testbr0
$ cat corp-net.xml
corp_net
$ virsh net-define corp-net.xml
Network corp_net defined from corp-net.xml
$ virsh net-start corp_net
Network corp_net started

Obviously this is a rather empty bridge, but I only want to know if I can start more than one guest on it.
...
I set this in a uvt template:

$ uvt-kvm create --template template-corpnet.xml --password ubuntu bionic-test2 arch=amd64 release=bionic label=daily
$ uvt-kvm create --template template-corpnet.xml --password ubuntu bionic-test2 arch=amd64 release=bionic label=daily

As expected, both work, with the status being:

$ brctl show testbr0
bridge name     bridge id               STP enabled     interfaces
testbr0         8000.fe5400dead42       no              vnet1
                                                        vnet2
$ virsh net-info corp_net
Name:           corp_net
UUID:           d352751a-4b27-41ed-957e-8a83c105d2db
Active:         yes
Persistent:     yes
Autostart:      no
Bridge:         testbr0
$ virsh net-dumpxml corp_net
corp_net d352751a-4b27-41ed-957e-8a83c105d2db

So it is all working for me.
Do you have anything in dmesg, like an AppArmor deny or anything else, that might provide a hint as to what is failing?
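For readers who want to repeat the reproduction, a libvirt network that simply hands guests an existing host bridge can be defined roughly as follows. This is a sketch under assumptions, not the file used in this thread: the helper name `write_bridge_net_xml` is hypothetical, and the XML assumes a plain `<forward mode="bridge"/>` network with no other options.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal libvirt network definition that maps a
# network name onto an already existing host bridge.
# $1 = network name, $2 = host bridge name, $3 = output file
write_bridge_net_xml() {
    name=$1
    bridge=$2
    out=$3
    cat >"$out" <<EOF
<network>
  <name>$name</name>
  <forward mode="bridge"/>
  <bridge name="$bridge"/>
</network>
EOF
}
```

With the names from the steps above this would be `write_bridge_net_xml corp_net testbr0 corp-net.xml`, followed by the same `virsh net-define` and `virsh net-start` commands.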
[Bug 1787680] Re: guest fails to start if another running guest has an interface in the same bridge
Here are the outputs requested:

root@demo01kvm01:~# virsh dumpxml demo01inmon01
demo01inmon01 2d5accd3-7992-4270-a175-4a6a03f22a1e 16336896 16336896 4 hvm destroy restart destroy /usr/bin/kvm-spice

root@demo01kvm01:~# virsh dumpxml demo01gravwell01
demo01gravwell01 c22ab9ae-fd95-47e6-a11e-a7559882327e 16336896 16336896 4 /machine hvm destroy restart destroy /usr/bin/kvm-spice libvirt-c22ab9ae-fd95-47e6-a11e-a7559882327e libvirt-c22ab9ae-fd95-47e6-a11e-a7559882327e

root@demo01kvm01:~# virsh net-dumpxml corp_net
corp_net f934fe15-4abf-47aa-98b9-d7d924132246

root@demo01kvm01:~# virsh net-info corp_net
Name:           corp_net
UUID:           f934fe15-4abf-47aa-98b9-d7d924132246
Active:         yes
Persistent:     yes
Autostart:      yes
Bridge:         br0

Thanks a lot.
[Bug 1787680] Re: guest fails to start if another running guest has an interface in the same bridge
Odd, that works fine for me (as expected after all this time).

Could you share the full XML of the two guests you use for testing, as well as the XML description of your network?
If you have them linked to the default network, that would be:

$ virsh dumpxml demo01inmon01
$ virsh dumpxml demo01inmon02
$ virsh net-dumpxml default

If you have other guest and/or network names, please adapt.

Also, a net-info on the network might be useful:

$ virsh net-info default

Again, adapt the network name if it differs.

Maybe we can spot a difference between your setup and the common one that triggers this.

** Changed in: libvirt (Ubuntu)
       Status: New => Incomplete