from:"Laine Stump"

Re: how to add LD_PRELOAD in the xml

2022-12-29 Thread Laine Stump


On 12/26/22 3:44 PM, Marc wrote:


How should I add LD_PRELOAD to the domain XML? I guess within:


 


but this is generating syntax errors.


https://libvirt.org/kbase/qemu-passthrough-security.html

Note the bit in the example that shows:



So try:
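
The XML that followed is not preserved in this copy. As a rough sketch only, the linked kbase page describes a QEMU passthrough namespace with a <qemu:commandline> element that accepts <qemu:env name='...' value='...'/> children, and the namespace must be declared on the <domain> element. The helper name, the minimal domain, and the library path below are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI libvirt uses for QEMU-specific passthrough elements.
QEMU_NS = "http://libvirt.org/schemas/domain/qemu/1.0"


def add_qemu_env(domain_xml: str, name: str, value: str) -> str:
    """Return domain XML with a <qemu:env> entry added under <qemu:commandline>."""
    # Make the serializer use the "qemu" prefix instead of an auto-generated one.
    ET.register_namespace("qemu", QEMU_NS)
    root = ET.fromstring(domain_xml)
    cmdline = root.find(f"{{{QEMU_NS}}}commandline")
    if cmdline is None:
        cmdline = ET.SubElement(root, f"{{{QEMU_NS}}}commandline")
    ET.SubElement(cmdline, f"{{{QEMU_NS}}}env", {"name": name, "value": value})
    return ET.tostring(root, encoding="unicode")


# Hypothetical minimal domain and library path, for illustration only.
example = add_qemu_env(
    "<domain type='kvm'><name>demo</name></domain>",
    "LD_PRELOAD",
    "/usr/lib64/libexample.so",
)
print(example)
```

The printed XML carries the xmlns:qemu declaration on <domain>, which is the bit that is easy to forget when editing by hand.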

Re: Predictable and consistent net interface naming in guests

2022-12-08 Thread Laine Stump


On 12/8/22 11:15 AM, Julia Suvorova wrote:

On Thu, Nov 3, 2022 at 9:26 AM Amnon Ilan  wrote:




On Thu, Nov 3, 2022 at 12:13 AM Amnon Ilan  wrote:




On Wed, Nov 2, 2022 at 6:47 PM Laine Stump  wrote:


On 11/2/22 11:58 AM, Igor Mammedov wrote:

On Wed, 2 Nov 2022 15:20:39 +
Daniel P. Berrangé  wrote:


On Wed, Nov 02, 2022 at 04:08:43PM +0100, Igor Mammedov wrote:

On Wed, 2 Nov 2022 10:43:10 -0400
Laine Stump  wrote:


On 11/1/22 7:46 AM, Igor Mammedov wrote:

On Mon, 31 Oct 2022 14:48:54 +
Daniel P. Berrangé  wrote:


On Mon, Oct 31, 2022 at 04:32:27PM +0200, Edward Haas wrote:

Hi Igor and Laine,

I would like to revive a 2 years old discussion [1] about consistent network
interfaces in the guest.

That discussion mentioned that a guest PCI address may change in two cases:
- The PCI topology changes.
- The machine type changes.

Usually, the machine type is not expected to change, especially if one
wants to allow migrations between nodes.
I would hope to argue this should not be problematic in practice, because
guest images would be made per a specific machine type.

Regarding the PCI topology, I am not sure I understand what changes
need to occur to the domxml for a defined guest PCI address to change.
The only thing that I can think of is a scenario where hotplug/unplug is
used,
but even then I would expect existing devices to preserve their PCI address
and the plug/unplug device to have a reserved address managed by the one
acting on it (the management system).

Could you please help clarify in which scenarios the PCI topology can cause
a mess to the naming of interfaces in the guest?

Are there any plans to add the acpi_index support?


This was implemented a year & a half ago

 https://libvirt.org/formatdomain.html#network-interfaces

though due to QEMU limitations this only works for the old
i440fx chipset, not Q35 yet.


Q35 should work partially too. In its case acpi-index support
is limited to hotplug enabled root-ports and PCIe-PCI bridges.
One also has to enable ACPI PCI hotplug (it's enabled by default
on recent machine types) for it to work (i.e. it's not supported
in native PCIe hotplug mode).

So if mgmt can put nics on root-ports/bridges, then acpi-index
should just work on Q35 as well.


With only a few exceptions (e.g. the first ich9 audio device, which is
placed directly on the root bus at 00:1B.0 because that is where the
ich9 audio device is located on actual Q35 hardware), libvirt will
automatically put all PCI devices (including network interfaces) on a
pcie-root-port.

After seeing reports that "acpi index doesn't work with Q35
machinetypes" I just assumed that was correct and didn't try it. But
after seeing the "should work partially" statement above, I tried it
just now and an <interface> of a Q35 guest that had its PCI address
auto-assigned by libvirt (and so was placed on a pcie-root-port) and
had <acpi index='4'/> was given the name "eno4". So what exactly is it
that *doesn't* work?


From QEMU side:
acpi-index requires:
   1. acpi pci hotplug enabled (which is default on relatively new q35
      machine types)
   2. hotpluggable pci bus (root-port, various pci bridges)
   3. NIC can be cold or hotplugged, guest should pick up acpi-index of
      the device currently plugged into slot
what doesn't work:
   1. device attached to host-bridge directly (work in progress) (q35)
   2. devices attached to any PXB port and any hierarchy hanging off it
      (there are no plans to make it work) (q35, pc)


I'd say this is still relatively important, as the PXBs are needed
to create a NUMA placement aware topology for guests, and I'd say it
is undesirable to lose acpi-index if a guest is updated to be NUMA
aware, or if a guest image can be deployed in either normal or NUMA
aware setups.


It's not only Q35 but also PC.
We basically do not generate ACPI hierarchy for PXBs at all,
so neither ACPI hotplug nor the dependent acpi-index would work.
It's been so for many years and no one has asked to enable
ACPI hotplug on them so far.


I'm guessing (based on absolutely 0 information :-)) that there would be
more demand for acpi-index (and the resulting predictable interface
names) than for acpi hotplug for NUMA-aware setup.



My guess is similar, but it is still desirable to have both (i.e. support 
ACPI-indexing/hotplug with Numa-aware)
Adding @Peter Xu to check if our setups for SAP require NUMA-aware topology

How big of a project would it be to enable ACPI-indexing/hotplug with PXB?


Why would you need to add acpi hotplug on pxb?


Adding +Julia Suvorova and +Tsirkin, Michael to help answer this question

Thanks,
Amnon



Since native PCI was improved, we can still compromise on switching to 
native-PCI-hotplug when PXB is required (and no fixed indexing)


Native hotplug works on pxb as is, without disabling acpi hotplug.


Are you saying you can add an acpi-index to a device plugged into a pxb, 
that index will be recognized (and used to name the dev

Re: Need help

2022-12-02 Thread Laine Stump


On 12/2/22 7:43 AM, Gk Gk wrote:

Hi,

We have an openstack platform and we are trying to get the network 
details of the guest vm on the hypervisors using the python libvirt 
library (domain.interfaceStats) . But in cases of SR-IOV vms, the 
interface is not being reported by the above tool.  The interface in 
this case is "hostdev" in the guest xml definition. Can anyone let me 
know how to find out the sr-iov network interface details of the guest vm ?


Support for reporting stats of SRIOV VF-backed "hostdev" interfaces was 
added with commit b295f06d, which first appeared in libvirt-6.9.0. 
Because there is no named "netdev" type interface on the host side, you 
need to use the MAC address of the interface when calling 
virDomainInterfaceStats().


Unfortunately, the method used to get the stats for these interfaces 
(see the commit log for details) isn't supported by all SRIOV netdev 
drivers. The Intel 82576 I have uses the igb driver, which doesn't 
support it, so all the stats show up as "0". Even the ixgbe driver 
doesn't support it. According to the commit log if you have a 
sufficiently new (but not *too* new!) Mellanox card, then you may be in 
luck.
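
As a rough sketch of the MAC-based lookup described above: pull the MAC of each hostdev interface out of the domain XML, then pass it to interfaceStats() in place of a device name. The helper names are made up for illustration, and `dom` is assumed to be a libvirt.virDomain from the Python bindings (any object with XMLDesc() and interfaceStats() will do).

```python
import xml.etree.ElementTree as ET

# Field order of the tuple returned by the Python binding's interfaceStats().
STAT_FIELDS = ("rx_bytes", "rx_packets", "rx_errs", "rx_drop",
               "tx_bytes", "tx_packets", "tx_errs", "tx_drop")


def hostdev_interface_macs(domain_xml):
    """Return the MAC addresses of all <interface type='hostdev'> devices."""
    root = ET.fromstring(domain_xml)
    macs = []
    for iface in root.findall("./devices/interface[@type='hostdev']"):
        mac = iface.find("mac")
        if mac is not None and mac.get("address"):
            macs.append(mac.get("address"))
    return macs


def hostdev_interface_stats(dom):
    """Map each hostdev interface MAC to a dict of its counters.

    `dom` is expected to behave like a libvirt.virDomain object.
    """
    stats = {}
    for mac in hostdev_interface_macs(dom.XMLDesc(0)):
        # The MAC address is passed where a host-side device name would
        # normally go, since hostdev interfaces have no such device.
        stats[mac] = dict(zip(STAT_FIELDS, dom.interfaceStats(mac)))
    return stats
```

With the real bindings, `dom` would come from something like libvirt.open("qemu:///system").lookupByName(...). All-zero counters more likely point at the driver limitation described above than at a bug in the script.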

Re: Predictable and consistent net interface naming in guests

2022-11-02 Thread Laine Stump


On 11/2/22 11:58 AM, Igor Mammedov wrote:

On Wed, 2 Nov 2022 15:20:39 +
Daniel P. Berrangé  wrote:


On Wed, Nov 02, 2022 at 04:08:43PM +0100, Igor Mammedov wrote:

On Wed, 2 Nov 2022 10:43:10 -0400
Laine Stump  wrote:
   

On 11/1/22 7:46 AM, Igor Mammedov wrote:

On Mon, 31 Oct 2022 14:48:54 +
Daniel P. Berrangé  wrote:
 

On Mon, Oct 31, 2022 at 04:32:27PM +0200, Edward Haas wrote:

Hi Igor and Laine,

I would like to revive a 2 years old discussion [1] about consistent network
interfaces in the guest.

That discussion mentioned that a guest PCI address may change in two cases:
- The PCI topology changes.
- The machine type changes.

Usually, the machine type is not expected to change, especially if one
wants to allow migrations between nodes.
I would hope to argue this should not be problematic in practice, because
guest images would be made per a specific machine type.

Regarding the PCI topology, I am not sure I understand what changes
need to occur to the domxml for a defined guest PCI address to change.
The only thing that I can think of is a scenario where hotplug/unplug is
used,
but even then I would expect existing devices to preserve their PCI address
and the plug/unplug device to have a reserved address managed by the one
acting on it (the management system).

Could you please help clarify in which scenarios the PCI topology can cause
a mess to the naming of interfaces in the guest?

Are there any plans to add the acpi_index support?


This was implemented a year & a half ago

https://libvirt.org/formatdomain.html#network-interfaces

though due to QEMU limitations this only works for the old
i440fx chipset, not Q35 yet.


Q35 should work partially too. In its case acpi-index support
is limited to hotplug enabled root-ports and PCIe-PCI bridges.
One also has to enable ACPI PCI hotplug (it's enabled by default
on recent machine types) for it to work (i.e. it's not supported
in native PCIe hotplug mode).

So if mgmt can put nics on root-ports/bridges, then acpi-index
should just work on Q35 as well.


With only a few exceptions (e.g. the first ich9 audio device, which is
placed directly on the root bus at 00:1B.0 because that is where the
ich9 audio device is located on actual Q35 hardware), libvirt will
automatically put all PCI devices (including network interfaces) on a
pcie-root-port.

After seeing reports that "acpi index doesn't work with Q35
machinetypes" I just assumed that was correct and didn't try it. But
after seeing the "should work partially" statement above, I tried it
just now and an <interface> of a Q35 guest that had its PCI address
auto-assigned by libvirt (and so was placed on a pcie-root-port) and
had <acpi index='4'/> was given the name "eno4". So what exactly is it
that *doesn't* work?


From QEMU side:
acpi-index requires:
  1. acpi pci hotplug enabled (which is default on relatively new q35
     machine types)
  2. hotpluggable pci bus (root-port, various pci bridges)
  3. NIC can be cold or hotplugged, guest should pick up acpi-index of
     the device currently plugged into slot
what doesn't work:
  1. device attached to host-bridge directly (work in progress) (q35)
  2. devices attached to any PXB port and any hierarchy hanging off it
     (there are no plans to make it work) (q35, pc)


I'd say this is still relatively important, as the PXBs are needed
to create a NUMA placement aware topology for guests, and I'd say it
is undesirable to lose acpi-index if a guest is updated to be NUMA
aware, or if a guest image can be deployed in either normal or NUMA
aware setups.


It's not only Q35 but also PC.
We basically do not generate ACPI hierarchy for PXBs at all,
so neither ACPI hotplug nor the dependent acpi-index would work.
It's been so for many years and no one has asked to enable
ACPI hotplug on them so far.


I'm guessing (based on absolutely 0 information :-)) that there would be 
more demand for acpi-index (and the resulting predictable interface 
names) than for acpi hotplug for NUMA-aware setup.


Anyway, it sounds like (*within the confines of how libvirt constructs 
the PCI topology*) we actually have functional parity of acpi-index 
between 440fx and Q35.
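
For anyone wanting to try this, here is a small sketch of adding the acpi index to an interface definition, assuming the <acpi index='N'/> sub-element documented in the formatdomain page linked above. The helper names and the sample interface are illustrative. The "eno<N>" name matches the test in this thread, but what the guest actually chooses depends on its own udev/systemd naming policy.

```python
import xml.etree.ElementTree as ET


def set_acpi_index(interface_xml: str, index: int) -> str:
    """Return interface XML with its <acpi index='N'/> child set to `index`."""
    iface = ET.fromstring(interface_xml)
    acpi = iface.find("acpi")
    if acpi is None:
        acpi = ET.SubElement(iface, "acpi")
    acpi.set("index", str(index))
    return ET.tostring(iface, encoding="unicode")


def expected_guest_name(index: int) -> str:
    """Name a guest using onboard-index naming would be expected to pick."""
    return f"eno{index}"


# Hypothetical interface attached to libvirt's default network; no PCI
# address is given, so libvirt would auto-assign one.
sample = "<interface type='network'><source network='default'/></interface>"
print(set_acpi_index(sample, 4))
print(expected_guest_name(4))
```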

Re: Predictable and consistent net interface naming in guests

2022-11-02 Thread Laine Stump


On 11/1/22 7:46 AM, Igor Mammedov wrote:

On Mon, 31 Oct 2022 14:48:54 +
Daniel P. Berrangé  wrote:


On Mon, Oct 31, 2022 at 04:32:27PM +0200, Edward Haas wrote:

Hi Igor and Laine,

I would like to revive a 2 years old discussion [1] about consistent network
interfaces in the guest.

That discussion mentioned that a guest PCI address may change in two cases:
- The PCI topology changes.
- The machine type changes.

Usually, the machine type is not expected to change, especially if one
wants to allow migrations between nodes.
I would hope to argue this should not be problematic in practice, because
guest images would be made per a specific machine type.

Regarding the PCI topology, I am not sure I understand what changes
need to occur to the domxml for a defined guest PCI address to change.
The only thing that I can think of is a scenario where hotplug/unplug is
used,
but even then I would expect existing devices to preserve their PCI address
and the plug/unplug device to have a reserved address managed by the one
acting on it (the management system).

Could you please help clarify in which scenarios the PCI topology can cause
a mess to the naming of interfaces in the guest?

Are there any plans to add the acpi_index support?


This was implemented a year & a half ago

   https://libvirt.org/formatdomain.html#network-interfaces

though due to QEMU limitations this only works for the old
i440fx chipset, not Q35 yet.


Q35 should work partially too. In its case acpi-index support
is limited to hotplug enabled root-ports and PCIe-PCI bridges.
One also has to enable ACPI PCI hotplug (it's enabled by default
on recent machine types) for it to work (i.e. it's not supported
in native PCIe hotplug mode).

So if mgmt can put nics on root-ports/bridges, then acpi-index
should just work on Q35 as well.


With only a few exceptions (e.g. the first ich9 audio device, which is 
placed directly on the root bus at 00:1B.0 because that is where the 
ich9 audio device is located on actual Q35 hardware), libvirt will 
automatically put all PCI devices (including network interfaces) on a 
pcie-root-port.


After seeing reports that "acpi index doesn't work with Q35 
machinetypes" I just assumed that was correct and didn't try it. But 
after seeing the "should work partially" statement above, I tried it 
just now and an <interface> of a Q35 guest that had its PCI address 
auto-assigned by libvirt (and so was placed on a pcie-root-port) and 
had <acpi index='4'/> was given the name "eno4". So what exactly is it 
that *doesn't* work?

Re: Predictable and consistent net interface naming in guests

2022-10-31 Thread Laine Stump

On 10/31/22 2:21 PM, Edward Haas wrote:

On Mon, Oct 31, 2022 at 6:55 PM Andrea Bolognani wrote:

On Mon, Oct 31, 2022 at 04:32:27PM +0200, Edward Haas wrote:
 > That discussion mentioned that a guest PCI address may change in two cases:
 > - The PCI topology changes.
 > - The machine type changes.
 >
 > Usually, the machine type is not expected to change, especially if one
 > wants to allow migrations between nodes.
 > I would hope to argue this should not be problematic in practice, because
 > guest images would be made per a specific machine type.

The machine type might not change from q35 to i440fx and vice versa,
but since the domain XML is constructed every time a KubeVirt VM is
started, the machine type might be q35-6.0 on one boot and q35-7.0
the next one if a KubeVirt upgrade that comes with a new version of
QEMU has happened in between.

This is unlikely to make a difference in terms of PCI addresses seen
in the guest OS, but it's still not accurate to say that the machine
type will not change.

Thank you for the clarification.
It makes me wonder now what are the actual implications of
the machine type change.

Live migration is a separate matter, as the machine type will
definitely not change while the VM is running.

 > Regarding the PCI topology, I am not sure I understand what changes
 > need to occur to the domxml for a defined guest PCI address to change.
 > The only thing that I can think of is a scenario where hotplug/unplug is
 > used,
 > but even then I would expect existing devices to preserve their PCI address
 > and the plug/unplug device to have a reserved address managed by the one
 > acting on it (the management system).
 >
 > Could you please help clarify in which scenarios the PCI topology can cause
 > a mess to the naming of interfaces in the guest?

A change in libvirt (again, due to a KubeVirt upgrade in between two
boots of the same VM) might result in different PCI addresses being
assigned to devices despite the same input XML.

We generally try fairly hard to avoid this kind of situation, but we
can only really guarantee stable PCI addresses for the lifetime of a
VM that has been defined and can't promise that the same input XML
will result in the same guest ABI when using different versions of
libvirt.

I would expect the PCI addresses that have been explicitly set in the
domxml [2] to be honored. We cannot assume that?

*If* the PCI address has been set in the original XML, that address will 
be honored any and every time a new domain is defined from that XML. 
Alternately, if a domain is defined once (without explicitly specifying 
any PCI addresses) and then run multiple times from the same definition, 
libvirt will auto-generate PCI addresses at initial definition time, and 
then use those same addresses each time the domain is run.

The issue is that no management application, including KubeVirt, is 
explicitly setting the PCI addresses of devices (and we believe that 
hands-off practice should continue), *AND* KubeVirt is re-defining the 
domain each time it is run, without querying libvirt for (and so never 
saving) the PCI addresses that were assigned to the devices. So each 
time the domain is stopped, all the PCI address info from that run is 
thrown away. And each time the domain is re-started (by re-defining it 
from the original XML that has no PCI address info), libvirt starts from 
scratch assigning addresses based on the information it receives from 
KubeVirt. And if the conditions have changed, then addresses are 
assigned differently.

The potential situation Andrea described, where the PCI addresses could 
be changed merely due to an upgrade of KubeVirt/libvirt/qemu from one 
run to the next in spite of being fed the same (address-less) XML, is 
actually extremely rare (I don't remember such a case) but theoretically 
it could happen. The more common change would be if a device was added 
or removed during one run of the guest, and then remained added/removed 
the next time it was run - that could change the PCI addresses of one or 
more of the remaining devices, depending on their ordering in the XML.

So, libvirt provides two avenues to maintaining stable PCI addresses 
(and thus, network device names) across multiple runs of a domain: 
either define once, run many, or else query the XML of the running 
domain and use that XML (containing PCI addresses) the next time the 
domain is started. But KubeVirt doesn't use either of these (and if 
memory serves me correctly, it really can't due to its design). And 
delegating management of PCI addresses to KubeVirt is pushing too much 
complexity out to KubeVirt.
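
To make the second avenue concrete, here is a minimal sketch of what "query the XML of the running domain" could look like for network interfaces: collect the PCI address libvirt assigned to each one, keyed by MAC, so it can be saved and fed back in later. The function name and sample XML are illustrative; a real management application would more likely keep the whole XML than pick it apart like this.

```python
import xml.etree.ElementTree as ET

PCI_FIELDS = ("domain", "bus", "slot", "function")


def interface_pci_addresses(domain_xml):
    """Map each interface MAC address to the PCI address found in the XML."""
    root = ET.fromstring(domain_xml)
    result = {}
    for iface in root.findall("./devices/interface"):
        mac = iface.find("mac")
        addr = iface.find("address[@type='pci']")
        if mac is None or addr is None:
            # No MAC or no assigned PCI address: nothing to record.
            continue
        result[mac.get("address")] = {k: addr.get(k) for k in PCI_FIELDS}
    return result


# Hypothetical XML as it might be reported for a running guest.
sample = """
<domain type='kvm'>
  <devices>
    <interface type='network'>
      <mac address='52:54:00:12:34:56'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
  </devices>
</domain>
"""
print(interface_pci_addresses(sample))
```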

I mainly referred to that input option, not to the expectation that the 
generated

configuration (of the domxml) to be

Re: Libvirt virsh : Error starting network, cannot execute binary /usr/sbin/iptables

2022-08-16 Thread Laine Stump


On 8/15/22 1:00 PM, Pascal wrote:

Hi,

I am a bit lost and hope someone can help me. I am running Debian 
bookworm (testing) with last updates.


$ sudo apt policy libvirt-daemon
libvirt-daemon:
   Installé : 8.5.0-1
   Candidat : 8.5.0-1
  Table de version :
  *** 8.5.0-1 100
     100 /var/lib/dpkg/status


I am unable to start default network , and get an error related to 
iptables :


$ sudo virsh net-start default
erreur :Impossible de démarrer le réseau default
erreur :internal error: Failed to apply firewall rules 
/usr/sbin/iptables -w --table filter --list-rules: libvirt:  erreur : 
cannot execute binary /usr/sbin/iptables: Aucun fichier ou dossier de ce 
type


Sorry for the French, it says "impossible to start default network" and 
"no such file or folder" at the end.

libvirt execs the iptables command to install packet filtering rules 
that are part of the setup of virtual networks with forward mode of 
"nat", "route", and for those virtual networks that have no forward mode 
at all (aka "isolated"). libvirt's default network uses NAT to make all 
virtual machines appear to be at the IP address of the host's external 
ethernet interface, so it requires the iptables command.


Although I've been working on patches to make it possible for libvirt to 
use /usr/sbin/nft to add & remove packet filter rules, libvirt still 
uses iptables to add its rules, so if you have a virtual network that 
needs to add packet filtering rules (or a guest that has nwfilter rules), 
you'll need the iptables command, and if it's not found then libvirt 
will fail at runtime (as you've seen).


Beyond that, if you install the libvirt-daemon-driver-nwfilter or 
libvirt-daemon-driver-network packages, then iptables *should* be a hard 
prerequisite listed in the package manifest; if your distro's packaging 
system is allowing you to remove the iptables package without also 
removing those libvirt packages (which is what you say you were able to 
do), or allowing you to install those libvirt packages without 
(semi)automatically pulling in iptables (which you also say you've 
done), then that is a bug in the distro's packaging files for libvirt.


It's unclear from your original message if you really require libvirt's 
default network in any of your guests. If you don't, then you can just 
not install or start the libvirt-daemon-driver-network package, and 
don't try to start the default network, and you should be okay without 
the iptables package on your system. Doing that doesn't really gain you 
anything though (other than a minuscule amount of disk space).


These days /usr/sbin/iptables is a symbolic link to either iptables-nft 
(which uses the nftables API to put all the rules into special nft 
tables named "filter" and "nat", so you actually see the iptables rules 
as nft rules when you run "nft list ruleset") or to 
/usr/sbin/iptables-legacy (which uses the old iptables API to install 
rules that *don't* show up in "nft list ruleset"). In either case 
though, packets are processed using the same nft packet matching code in 
the kernel. There is some amount of info about these two variants here:



https://developers.redhat.com/blog/2020/08/18/iptables-the-two-variants-and-their-relationship-with-nftables

The end of it all though is that even if you have some userspace 
programs that "use iptables", they are in the end using nftables in the 
kernel.


It is true I removed iptables because I want to use only nftables (I 
removed both the ufw and iptables packages (apt remove), and enabled the 
nftables service before the error appeared). Before this, all was fine, 
but when I enabled nftables, all VMs disappeared from virt-manager.


Are the guests completely disappearing from the list? Or is it that 
you're unable to start the guests? A guest would only completely 
disappear spontaneously from virt-manager (i.e. be completely removed 
from the list of all guests) if something went wrong while parsing the 
guest config during libvirtd (or virtqemud) startup. Look in the system 
logs for any errors during libvirtd startup (or virtqemud startup if 
your debian has switched to modular daemons). I can't think of anything 
that could/should change in the parsing of the config due to enabling 
the nftables service (which libvirt knows nothing about).



I uninstalled KVM related packages and reinstalled, still the same.

I also installed back iptables, but strangely I still get the same 
error, although binary /usr/sbin/iptables is there.


Now that certainly makes no sense! :-/ Possibly /usr/sbin/iptables is a 
symbolic link to a file that doesn't exist (because there was an 
additional iptables package, e.g. iptables-nft or iptables-legacy, that 
you didn't reinstall?)
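
One quick way to test the dangling-symlink theory is sketched below. The function name is made up, and it only looks at the path itself, so it says nothing about permissions or whether the target can actually be executed.

```python
import os


def describe_binary(path):
    """Classify `path` as missing, a dangling symlink, a symlink, or a file."""
    if not os.path.lexists(path):
        # Nothing at all at this path, not even a broken link.
        return "missing"
    if os.path.islink(path) and not os.path.exists(path):
        # exists() follows links, so False here means the target is gone.
        return "dangling symlink -> " + os.readlink(path)
    if os.path.islink(path):
        return "symlink -> " + os.path.realpath(path)
    return "regular file"


print(describe_binary("/usr/sbin/iptables"))
```

A "dangling symlink" result would fit the "no such file" error appearing even though /usr/sbin/iptables shows up in a directory listing.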


I tried many things with no luck, restarted libvirtd service, recreated 
the network, etc...


Has anyone some idea about what is happening here ? is there some 
incompatibility with nftables (firewalld service is disabled) and libvirt ?


As I've said

Re: libvirt can't setup simple bridged network?

2022-08-15 Thread Laine Stump


On 8/15/22 10:11 AM, Ian Pilcher wrote:

I feel like I'm taking crazy pills!  I'm reading the libvirt network XML
format documentation[1], and I can't figure out how to create a simple
bridged network - no NAT, no routing, no OVS, no  macvtap, etc.  I.e.,
just a Linux bridge with a single physical interface attached.

None of the 3 scenarios listed for <forward mode='bridge'> describe the
simple setup that I'm trying to create, so it looks like I'll need to
create the bridge separately.  (It's not hard to do, it just seems like
such a weird gap in the functionality.)


[1] https://libvirt.org/formatnetwork.html



libvirt's virtual network driver historically only creates networks that 
don't touch (and potentially mess up) the existing host system network 
config. But attaching a physical host system ethernet to a bridge 
requires moving the ethernet device's IP config over to the bridge, so 
that was considered "out of scope" for libvirt's network driver.


Back in 2008-2009, libvirt added an "interface driver" whose purpose was 
to configure/reconfigure host system network interfaces to, for example, 
attach a host ethernet to a bridge device, or add a vlan interface based 
on a host ethernet (and then attach that vlan interface to a bridge). 
This was initially supported on Fedora/CentOS/RHEL platforms using a (at 
the time new) library called netcf. After several years of floundering, 
I proposed in 2020 that we essentially admit failure and deprecate the 
netcf library (and libvirt's use of it). I don't have the energy to 
rehash the entire list of reasons here, but my message proposing the 
deprecation and listing all the reasons, is here:


https://listman.redhat.com/archives/libvir-list/2020-December/212781.html

These days (and even before, for the most part) if you want a bridge 
attached to a host system ethernet, it's recommended that you set that 
up using whatever host system network config you're using (e.g., 
NetworkManager, systemd-networkd, ifcfg files, /etc/network/interfaces 
file), and then either define your guest interfaces with <interface 
type='bridge'>, or if you want to use <interface type='network'> and 
refer to that with a libvirt network name, create a libvirt network 
with <forward mode='bridge'/> (which expects that a bridge device will 
have already been created in the host system network config).
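
A sketch of the second option, assuming the bridge (called "br0" here purely as an example) already exists in the host network config: a libvirt network that does nothing but point at that bridge, and a guest interface that refers to the network by name. The network name "host-bridge" and both helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def bridge_network_xml(network_name, bridge_name):
    """Build network XML that forwards to an existing host bridge."""
    net = ET.Element("network")
    ET.SubElement(net, "name").text = network_name
    ET.SubElement(net, "forward", {"mode": "bridge"})
    # The bridge must already exist; libvirt will not create it.
    ET.SubElement(net, "bridge", {"name": bridge_name})
    return ET.tostring(net, encoding="unicode")


def guest_interface_xml(network_name):
    """Build a guest <interface> that refers to the network by name."""
    iface = ET.Element("interface", {"type": "network"})
    ET.SubElement(iface, "source", {"network": network_name})
    return ET.tostring(iface, encoding="unicode")


print(bridge_network_xml("host-bridge", "br0"))
print(guest_interface_xml("host-bridge"))
```

The first string is what would be handed to "virsh net-define". The benefit of the indirection is that guests only name the network, so the host bridge can be renamed without touching every guest definition.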

Re: Domain XML and VLAN tagging

2022-06-16 Thread Laine Stump


On 6/16/22 3:24 AM, Peter Krempa wrote:

On Thu, Jun 16, 2022 at 09:20:21 +0200, Gionatan Danti wrote:

Hi all,
from here [1]:

"Network connections that support guest-transparent VLAN tagging include 1)
type='bridge' interfaces connected to an Open vSwitch bridge Since 0.10.0 ,
2) SRIOV Virtual Functions (VF) used via type='hostdev' (direct device
assignment) Since 0.10.0 , and 3) SRIOV VFs used via type='direct' with
mode='passthrough' (macvtap "passthru" mode) Since 1.3.5 . All other
connection types, including standard linux bridges and libvirt's own virtual
networks, do not support it."

Do I read it correctly that when used on a classical linux bridge these vlan
tags do nothing? If so, is it due to something related to the underlying
bridge device (ie: incomplete support for vlan filtering) or is it because
libvirt lacks the necessary "plumbing" to use advanced bridge features?


AFAIK it was simply never implemented. There's also an upstream feature
request for this:

https://gitlab.com/libvirt/libvirt/-/issues/157



When VLAN tagging was first implemented, Linux host bridges didn't have 
this capability - the only way to get guest traffic transparently tagged 
in that case was by having the bridge attached to a host VLAN interface 
rather than directly to the physical ethernet (resulting in the traffic 
from all guests attached to the bridge being tagged/untagged). A few 
years later support for tagging on individual host bridge ports was added 
to the Linux bridge driver, but there was never enough push for the 
feature to get it added to libvirt.
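
For comparison, here is a sketch of what guest-transparent tagging looks like in one of the cases that is supported (an interface on an Open vSwitch bridge), using the <vlan><tag id='N'/></vlan> element from the domain XML docs. The function name, bridge name, and tag id are only examples. The same <vlan> element is what the quoted documentation says is not supported on a standard Linux bridge.

```python
import xml.etree.ElementTree as ET


def ovs_vlan_interface_xml(bridge, tag_id):
    """Build an interface on an Open vSwitch bridge with a single VLAN tag."""
    if not 0 <= tag_id <= 4095:
        # 802.1Q VLAN IDs are a 12-bit field.
        raise ValueError("VLAN tag id must fit in 12 bits")
    iface = ET.Element("interface", {"type": "bridge"})
    ET.SubElement(iface, "source", {"bridge": bridge})
    ET.SubElement(iface, "virtualport", {"type": "openvswitch"})
    vlan = ET.SubElement(iface, "vlan")
    ET.SubElement(vlan, "tag", {"id": str(tag_id)})
    return ET.tostring(iface, encoding="unicode")


# Hypothetical bridge name and VLAN id.
print(ovs_vlan_interface_xml("ovsbr0", 42))
```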


"Patches are welcome" of course!

Re: vepa-mode directly attached interface

2022-06-11 Thread Laine Stump


On 6/11/22 8:54 AM, Gionatan Danti wrote:

Hi all,
I just realized that libvirt default for directly attached interface is 
vepa mode.


I discovered it now because virt-manager automatically enables bridge 
mode, while cockpit-machines gets the default vepa mode. This is 
unfortunate because, VEPA-capable switches being very rare, it means 
that using cockpit to configure directly attached interfaces causes 
guests to not talk to each other.


I have some questions:

- can libvirt default be changed/configured somehow (ie: to 
automatically create bridge-mode directly attached interface when no 
mode is specified)?


libvirt's defaults are defaults, and can't be changed with config. 
libvirt also will never change the hardcoded default, because that 
could break existing installations.


I'm not sure why vepa was chosen as the default when macvtap support was 
added to libvirt, but my best guess is that it was just the bias of the 
person doing the work who assumed their usage would be the most common 
(IIRC it was done by someone who was specifically wanting to support 
connection to VEPA-capable switches, and since it was a new feature 
nobody else had experience with / opinions about which mode would be 
most common, so the reviewers just accepted this default).




- how can I use virsh to discover machines with vepa-mode interfaces 
(virsh domiflist  does not return the interface mode)?


I guess you'll need to do a "virsh dumpxml --inactive" for each guest 
and parse it out of there.
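
The parsing half of that could be sketched as follows: given one guest's XML (e.g. the output of "virsh dumpxml --inactive" captured by a script), list its type='direct' interfaces and their modes, treating a missing mode attribute as the vepa default discussed above. The function name and sample XML are illustrative.

```python
import xml.etree.ElementTree as ET


def macvtap_modes(domain_xml):
    """Return (mac, host_dev, mode) for every type='direct' interface."""
    root = ET.fromstring(domain_xml)
    found = []
    for iface in root.findall("./devices/interface[@type='direct']"):
        source = iface.find("source")
        mac = iface.find("mac")
        host_dev = source.get("dev") if source is not None else None
        # No mode attribute means libvirt falls back to its default, vepa.
        mode = source.get("mode", "vepa") if source is not None else "vepa"
        found.append((mac.get("address") if mac is not None else None,
                      host_dev, mode))
    return found


def vepa_interfaces(domain_xml):
    """Only the interfaces that end up in vepa mode."""
    return [entry for entry in macvtap_modes(domain_xml) if entry[2] == "vepa"]


# Hypothetical guest with one explicit bridge-mode and one defaulted interface.
sample = """
<domain type='kvm'>
  <devices>
    <interface type='direct'>
      <mac address='52:54:00:00:00:01'/>
      <source dev='eth0' mode='bridge'/>
    </interface>
    <interface type='direct'>
      <mac address='52:54:00:00:00:02'/>
      <source dev='eth0'/>
    </interface>
  </devices>
</domain>
"""
print(vepa_interfaces(sample))
```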




- can I change the interface type at runtime (virt-xml  --edit 
--network type=direct,source.mode=bridge works for inactive domains only)?


No, I think the mode of the macvtap interface is set when the interface 
is created, and can't be changed later (i.e. it's a limitation of 
macvtap, not libvirt).

Re: Problem calling 'virsh' in a script

2022-05-15 Thread Laine Stump


On 5/15/22 11:48 AM, Digimer wrote:

Hi all,

   I've got a series of programs that monitor various things on a CentOS 
Stream 8 VM host. All of these scripts work when called directly. 
However, when I have a parent program that calls all the little programs 
in series, I found that some virsh calls hang.


Is your script being called from a libvirt "hook" script 
(https://libvirt.org/hooks.html)? If so, that won't work - a libvirt hook 
script is called from within libvirt, and can't call back into libvirt.


Other than that, is there anything different about the context the 
script is being run from vs. the context you're directly running virsh from?




   Initially, there were two scripts that were hanging repeatedly. One 
called 'virsh net-list --all --name', so I changed it to check for 
configs in '/etc/libvirt/qemu/networks/', and that script started 
working. The other script though calls 'virsh list --all', and that 
can't be easily swapped out, so I really need to find the source of 
these hangs.


   Whenever the hang happens, about 30~45 seconds later, I see 
'libvirtd[1643714]: Cannot recv data: Connection reset by peer'.


   I think the issue is striking other scripts that run, but this 
scenario is happening predictably and consistently right now.


   I thought it might be a concurrent connect limit or a problem with 
how many times virsh is called by a script, so I wrote a test script 
that kept calling 'virsh list --all' each second, but it was close to 
100 calls without hanging, far more than all the calls in my scripts 
combined, so I don't think that's it.


Any advice/guidance would be very much appreciated!

--
Digimer
Papers and Projects:https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of Einstein’s brain 
than in the near certainty that people of equal talent have lived and died in cotton 
fields and sweatshops." - Stephen Jay Gould

Re: Updating domains definitions via API

2022-05-15 Thread Laine Stump


On 5/14/22 6:42 PM, Darragh Bailey wrote:

Hi,

On Sat 14 May 2022, 21:11 Laine Stump wrote:


Caveat - I'm completely unfamiliar with ruby and the libvirt-ruby API
bindings.

If there is a problem that causes the domain config to not be updated,
libvirt will return an error. So I would suspect one of the two things
is happening:


Thanks, that's what I was expecting should happen, just wanted to be 
sure that there wasn't some other behaviour in place for compatibility 
reasons.


1) there may be a problem in the libvirt-ruby bindings that causes the
error reported by the call to libvirt (in whatever C code is behind the
ruby bindings) to not be properly propagated to ruby. I would hope
this isn't the case, but "bugs happen", so it should be considered as a
possibility.


A quick look suggests that the code raises an exception if the 
dom pointer returned is NULL, so I think the bindings are correct. But I 
will check that the version of ruby-libvirt I have installed matches 
the source code I looked at.


2) As I said in my earlier mail, any changes that are made will take
effect the next time the domain is destroyed and restarted. This also
means that the changes won't be reflected in the "live/status" XML of
the domain until that time. If you want to see the new configuration
after you've made changes, you should add the VIR_DOMAIN_XML_INACTIVE
flag when requesting the domain XML. Possibly you haven't included this
flag, and that's why you think that your change hasn't taken effect?


Ah, I forgot to outline where in the lifecycle the update is taking 
place. The domain isn't running when the code attempts to update the 
definition.


Does that still mean that the VIR_DOMAIN_XML_INACTIVE flag is needed? I 
was assuming when the domain is inactive the XML changes would be 
reflected immediately.


No, your thinking was correct - if the domain isn't active, then the 
change should take effect immediately, and there is no difference 
whether or not you have VIR_DOMAIN_XML_INACTIVE.


I've never done anything directly with the nvram setting (just accepted 
whatever virt-manager put in there), but from your other message, it 
sounds like you've found a bona fide libvirt bug (either that, or I just 
don't know enough about how the nvram settings work :-)). Can you file 
an issue at https://gitlab.com/groups/libvirt/-/issues ?




Oddly I thought during some experiments when the added NVRAM XML element 
was ignored, the updated number of CPUs which was in the same XML 
definition passed in was applied.


Another indication that it's a bug - updates to the domain config are 
always an all-or-nothing thing.


Will dig further tomorrow or Monday on the version of ruby-libvirt 
installed into my rvm dev env as well as checking passing in the flag.


I'm sure it'll turn out to be something obvious that I'm overlooking.

Thanks,
--
Darragh

Re: Updating domains definitions via API

2022-05-14 Thread Laine Stump


On 5/14/22 3:23 PM, Darragh Bailey wrote:


On Fri, 13 May 2022 at 00:17, Darragh Bailey <daragh.bai...@gmail.com> wrote:


Hi,

On Thu 12 May 2022, 21:34 Laine Stump <la...@redhat.com> wrote:

The virDomainDefineXMLFlags API (and also the older/deprecated
virDomainDefineXML API) doesn't require that the domain first be
undefined (with one notable exception - see below[*]). If you define a
domain that already exists with the same name and uuid, then the effect
is to "redefine" (or "update" if you prefer) the existing domain of that
name. If the domain is currently active, then the changes will take
effect the next time the domain is shut down ("Destroy"ed in libvirt API
parlance) and re-started.


Unfortunately trying to call this via ruby-libvirt doesn't appear to 
behave as expected. It appears that if I add an nvram element without a 
loader element to the os block, the following code block will execute 
without issue but also without changing the domain XML:


# Apply XML changes directly
if descr_changed
   begin
     env[:ui].info("Updating domain definition due to configuration change")
     new_descr = String.new
     xml_descr.write new_descr
     # env[:machine].provider.driver.connection.client provides access
     # to the ruby-libvirt connection object
     libvirt_domain = env[:machine].provider.driver.connection.client.define_domain_xml(new_descr, 1)
     domain = env[:machine].provider.driver.connection.servers.get(env[:machine].id.to_s)

   rescue Fog::Errors::Error => e
     raise Errors::UpdateServerError, error_message: e.message
   end
end

I'm bypassing fog-libvirt and calling the ruby-libvirt connection 
object directly to try to avoid any unexpected interference when 
testing the API call, e.g. 
https://libvirt.org/ruby/api/Libvirt/Connect.html#method-i-define_domain_xml


So not only is there no exception thrown if the XML change is ignored, 
I've noticed there doesn't appear to be an easy way to check using the 
API either, in that 
https://libvirt.org/ruby/api/Libvirt/Domain.html#method-i-updated-3F 
doesn't indicate that the domain has been updated whether I call it on 
the libvirt_domain object returned, or an instance from before the 
define_domain_xml call is made.


It appears that the only way is to perform a call to get the xml 
definition of the libvirt_domain object returned in the above block and 
see if that matches the xml that was sent, if not, error.


Is this the expected usage of the API? Or should the call to 
define_domain_xml raise an exception if it cannot update the domain XML 
(as opposed to a schema validation error, which does appear to be 
detected when I did something stupid)?


Caveat - I'm completely unfamiliar with ruby and the libvirt-ruby API 
bindings.


If there is a problem that causes the domain config to not be updated, 
libvirt will return an error. So I would suspect one of the two things 
is happening:


1) there may be a problem in the libvirt-ruby bindings that causes the 
error reported by the call to libvirt (in whatever C code is behind the 
ruby bindings) to not be properly propagated to ruby. I would hope 
this isn't the case, but "bugs happen", so it should be considered as a 
possibility.


2) As I said in my earlier mail, any changes that are made will take 
effect the next time the domain is destroyed and restarted. This also 
means that the changes won't be reflected in the "live/status" XML of 
the domain until that time. If you want to see the new configuration 
after you've made changes, you should add the VIR_DOMAIN_XML_INACTIVE 
flag when requesting the domain XML. Possibly you haven't included this 
flag, and that's why you think that your change hasn't taken effect?
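
The check described in (2) can also be done from the command line. What follows is only a sketch: the helper name persistent_xml_has is made up for illustration, and it assumes virsh is on the PATH and can reach the hypervisor. "virsh dumpxml --inactive" is the virsh counterpart of passing VIR_DOMAIN_XML_INACTIVE.

```shell
# Hypothetical helper (not part of libvirt): succeed if the persistent
# ("inactive") XML of domain $1 contains the fixed string $2.
persistent_xml_has() {
    # --inactive asks for the stored config rather than the live/status XML
    virsh dumpxml --inactive "$1" | grep -F -q -- "$2"
}

# Example use (the domain name "demo" is an assumption):
#   persistent_xml_has demo '<nvram' && echo "nvram element was stored"
```

Searching for a fixed string keeps the check independent of whatever runtime-only details libvirt adds to the live XML.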

Re: Updating domains definitions via API

2022-05-12 Thread Laine Stump


On 5/12/22 4:03 PM, Darragh Bailey wrote:

Hi,


Looking into a bug in vagrant-libvirt where an error during the update 
will cause the domain to be completely discarded.


https://github.com/vagrant-libvirt/vagrant-libvirt/issues/949 



Basically I think it stems from doing an undefine -> create-with-new-XML 
process: if there is an issue with the new XML (due to the KVM module 
not being loaded or something similar) it will be rejected, and 
unfortunately the same problem is likely to prevent the old definition 
from being restored as well.


I'm looking around to try and see if there is an API (specifically in 
ruby-libvirt) for updating the domain definition, so that if the new XML 
is rejected at least the old definition remains, and so far I'm drawing 
a blank.


Is the only option here to write using a temporary domain name, then 
remove the old domain and rename the new definition to the old domain?


Or have I missed the obvious API analogous to the edit functionality?


The virDomainDefineXMLFlags API (and also the older/deprecated 
virDomainDefineXML API) doesn't require that the domain first be 
undefined (with one notable exception - see below[*]). If you define a 
domain that already exists with the same name and uuid, then the effect 
is to "redefine" (or "update" if you prefer) the existing domain of that 
name. If the domain is currently active, then the changes will take 
effect the next time the domain is shut down ("Destroy"ed in libvirt API 
parlance) and re-started.


If any error is encountered during this redefinition, then no changes 
are made to the existing domain definition.


[*] The exception to this - if you attempt to define a domain that has 
the same name as an existing domain but a different uuid (or the same 
uuid but a different name), that is an error.
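
From the shell, the same redefine-in-place behaviour is available through "virsh define". A rough sketch, assuming virsh is on the PATH; the helper name and the sed-based edit are illustrative only and not a recommended way to edit domain XML:

```shell
# Hypothetical helper: redefine an existing domain in place, with no
# undefine step. $1 = domain name, $2 = sed expression applied to the
# persistent XML. Runs in a subshell so the cleanup trap stays local.
redefine_domain() (
    dom=$1 expr=$2
    tmp=$(mktemp) || exit 1
    trap 'rm -f "$tmp" "$tmp.new"' EXIT
    # Start from the stored config, which keeps the name and uuid unchanged
    virsh dumpxml --inactive "$dom" > "$tmp" || exit 1
    sed -e "$expr" "$tmp" > "$tmp.new" || exit 1
    # If this fails, the existing definition is left as it was
    virsh define "$tmp.new"
)

# Example use (domain name and edit are assumptions):
#   redefine_domain demo 's|<vcpu>2</vcpu>|<vcpu>4</vcpu>|'
```

Because the XML is taken from the existing definition, name and uuid always match, so the exception noted in [*] does not come into play.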

Re: macvtap with disconnected physical interface

2022-05-03 Thread Laine Stump


On 5/3/22 4:31 AM, Daniel P. Berrangé wrote:

On Mon, May 02, 2022 at 03:42:05AM +0200, Gionatan Danti wrote:

Dear list,
I just discovered the hard way that if the lower-level physical interface of
a macvlan bridge is disconnected (ie: by unplugging the eth cable, resulting
in no carrier), inter-guest network traffic from all virtual machines bound
to the disconnected interface is dropped.

This behavior surprises me, as with the classic bridges I can disconnect the
underlying physical interface without causing any harm to inter-guest
traffic.

Am I doing something wrong, or is this really the expected behavior? If so,
can I force the macvtap interfaces to bridge traffic even when the
underlying physical interface is disconnected?


Can you share the <interface> configuration for your guest NIC so we
can see how it is set up.


I can't say that I've ever tried this, since my only reason for using 
macvtap is to provide the guests with direct connectivity to the 
physical network, and unplugging the physdev negates that. The behavior 
you describe doesn't surprise me all that much though, since the 
physical device in the case of a host bridge isn't an integral part of 
the bridge (it's just one more device attached to a port), while the 
physical device and macvlan bridge are much more closely associated.


I'm Cc'ing Michael Tsirkin to see if he has more authoritative 
information on whether or not the macvtaps connected to a macvlan bridge 
can communicate amongst themselves when the physdev is disconnected.


In the meantime, is there a reason you don't want to just use a standard 
host bridge that's not connected to any physdev? The one thing I can 
think of is that you might not want to allow communication between the 
host and guests, but as long as the bridge itself isn't given an IP 
address, that won't be possible (at least at the level of IP).

Re: Network interface element not working

2022-04-05 Thread Laine Stump


On 4/5/22 10:28 AM, Ian Pilcher wrote:

On 4/4/22 21:10, Laine Stump wrote:
That's not what the <interface> element in a <forward> is used for. 
Its actual use is (in my opinion) not all that useful, which has led 
to people assuming other functionality for it that doesn't exist.


Ah.  Thanks for clarifying that.

Anyway, if you want to have a bridge device that is directly attached 
to a physical ethernet, then you should set up a bridge in the host OS 
outside the scope of libvirt, with the physical ethernet attached to 
it, and then configure your libvirt guests to use that bridge with, e.g.


That is how I normally do things.  In this case, I'm "piggy backing" on
a pre-existing automation setup that uses libvirt to set up the bridges.
I'm using a hook script to add the physical interface to the bridge when
the virtual network is started, which seems to be working.


Umm... That's how it was intended to *not* work :-P. The whole point of 
the self-managed libvirt virtual networks is to setup a separate subnet 
that has all traffic routed via IP to the physical network. I'm 
impressed you've found a hack that works for you[*], but just be 
prepared to "pick up the pieces" if it breaks :-)


(I hope you've removed the DHCP service, DNS service, and for that 
matter even the IP address from the libvirt network definition. Also, is 
the host ethernet being used (i.e. does it have an IP address) prior to 
attaching it to the bridge? Normally when an ethernet is attached to a 
bridge, its IP configuration is moved to the bridge device, and the 
ethernet has no IP. I'm not sure what kind of connectivity oddities 
would show up if you left the IP address on the ethernet itself)

Re: Network interface element not working

2022-04-04 Thread Laine Stump


On 4/4/22 2:08 PM, Ian Pilcher wrote:

I've added an interface element to a libvirt network, but it isn't
working.  The interface is not being added to the bridge, even after the
system is rebooted.


That's not what the <interface> element in a <forward> is used for. Its 
actual use is (in my opinion) not all that useful, which has led to 
people assuming other functionality for it that doesn't exist.


The *actual* use of the <interface> element is simply to add an extra 
iptables rule that will drop all traffic originating from a guest and 
outbound to the real network if the interface it uses for egress doesn't 
match the one listed in the <interface> element. It doesn't attach this 
egress interface to the network's bridge, and it doesn't modify the 
next-hop routing of the traffic (which is the more common mistaken 
belief of its function).


Anyway, if you want to have a bridge device that is directly attached to 
a physical ethernet, then you should set up a bridge in the host OS 
outside the scope of libvirt, with the physical ethernet attached to it, 
and then configure your libvirt guests to use that bridge with, e.g.



  
  ...
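
A guest <interface> definition that attaches to an existing host bridge generally looks like the output of the sketch below. The function name and the bridge name br0 are assumptions for illustration, and the <model> line is optional.

```shell
# Hypothetical helper: print a guest <interface> element that attaches
# to an existing host bridge ($1, e.g. br0).
bridge_interface_xml() {
    cat <<EOF
<interface type='bridge'>
  <source bridge='$1'/>
  <model type='virtio'/>
</interface>
EOF
}

# Example use (domain "demo" and bridge "br0" are assumptions):
#   bridge_interface_xml br0 > /tmp/iface.xml
#   virsh attach-device demo /tmp/iface.xml --config
```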




# virsh net-dumpxml ocp4-net

   ocp4-net
   b5852945-9889-4d22-ba61-879125316cec
   
     
   
     
     
   
   
   
   
   


# brctl show
bridge name     bridge id               STP enabled     interfaces
virbr-ocp4      8000.52540099           yes             vnet0
virbr0          8000.525400a7ce7f       yes
virbr1          8000.52540051eb1f       yes             vnet1

# rpm -q libvirt
libvirt-8.0.0-2.module_el8.6.0+1087+b42c8331.x86_64

Any ideas?

Re: networking question

2022-02-25 Thread Laine Stump





On 2/25/22 11:05 AM, Martin Kletzander wrote:

On Thu, Feb 24, 2022 at 07:07:18PM +0100, Natxo Asenjo wrote:

hi,

I have an issue with one host at a customer's site. I think this cannot
work, but I would like to ask you just in case I am confused.

host:
eno1: 172.20.10.x/24 management interface gw 172.20.10.254
bridge-service: 0.0.0.0/24


I may be misunderstanding this "freehand" description of your setup, but 
it sounds like you have a Linux host bridge device that is attached to 
eno1 and to a VM (via a tap device), and that the bridge device has no 
IP address, but the host's ethernet device *does* have an IP address 
configured. I haven't ever tried the setup this way, but I do know that 
normally you should have *no IP* on the ethernet device, and an IP on 
the bridge device (every set of instructions I've ever seen for adding a 
bridge to an ethernet have included the step of "move the ethernet IP 
config over to the bridge device config").


If the config actually does have IP address on the ethernet but not on 
the bridge, try swapping that.
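
Done by hand, the swap could look roughly like the sketch below. It only prints the iproute2 commands rather than running them, because removing the address from the interface you are logged in through will drop the connection. The function name and the example address are placeholders (the real address is elided in the description above), and a permanent change belongs in the distro's network configuration instead.

```shell
# Hypothetical helper: print (not run) the commands that move an IPv4
# address from an ethernet device ($1) to the bridge it is attached to ($2).
# $3 is the address in CIDR form.
plan_ip_move() {
    printf 'ip addr del %s dev %s\n' "$3" "$1"
    printf 'ip addr add %s dev %s\n' "$3" "$2"
}

# The address below is a made-up example
plan_ip_move eno1 bridge-service 172.20.10.5/24
```

Any default route that pointed out of the ethernet device would also have to be re-added on the bridge afterwards.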





tun0: openvpn tunnel to external data center
internal-bridge: x.x.x.x/28 ; routed subnet that goes to openvpn tun0

on vm:
eth0: x.x.x.x/28 on internal-bridge (default gw)
eth1: 172.20.10.x/24 bridge-service gw 172.20.10.254 (same as eno1)

Connectivity to and from openvpn (from and to datacenter) is perfect. All
vms are directly reachable from our management services, no natting.


From hypervisor I can ping the gw, from vm I cannot ping 172.20.10.254.


My gut feeling is that this cannot work because traffic for the 
hypervisor for subnet 172.20.10.x/24 flows through eno1, but for vm 
through the bridge-loggin interface. So that cannot work.



I am not sure, but I would try to see where the packets are really going
through by using wireshark/tshark or tcpdump.

The only thing that I can come up with on the spot is that it is trying
to go through a different interface at some point due to reverse path
filtering (settings for that are in /proc/sys/net/ipv4/conf/*/rp_filter);
it might be routed elsewhere anywhere along the way.  But it is hard to say
without knowing how all the networks are connected.  Maybe I'm just bad
at understanding your situation, for me it is usually better to see this
stuff happen in wireshark.  But I figured I'd at least let you know one
idea, which we had an issue with before as well.
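
To look at the reverse path filtering settings mentioned above, something like the following sketch lists the value for each interface. The function name is made up, and the optional directory argument exists only so the helper can be pointed somewhere other than /proc.

```shell
# Hypothetical helper: print "<interface>=<rp_filter value>" for every
# interface found under $1 (default: /proc/sys/net/ipv4/conf).
# 0 = no source validation, 1 = strict, 2 = loose; the kernel applies the
# higher of the "all" entry and the per-interface entry.
show_rp_filter() {
    base=${1:-/proc/sys/net/ipv4/conf}
    for f in "$base"/*/rp_filter; do
        [ -r "$f" ] || continue
        iface=$(basename "$(dirname "$f")")
        printf '%s=%s\n' "$iface" "$(cat "$f")"
    done
}

show_rp_filter
```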

Hope that helps,
Martin


Should we just ask the customer to give us different subnets for the host
and the vm?

TIA.
--
regards,
Natxo

Re: SSH VM from outside, but not from host

2022-02-16 Thread Laine Stump


On 2/16/22 4:40 AM, Peter Crowther wrote:
... hang on.  Why does the *bridge* have an IP address?  Think of a 
bridge as being like a switch; it has no address of its own.


It's not the IP address of the bridge, it's the IP address of the 
"default / built-in" port of the bridge. The standard way to configure a 
Linux host bridge is to attach the host's physical ethernet to the 
bridge, and move the IP config from the ethernet device to the bridge 
device. This is because each Linux host bridge has a single port 
(netdev) that is connected to the routing stack of the host's kernel. So 
traffic comes in the ethernet, to the port on the bridge that's 
connected to the ethernet, and then sent out of the bridge via this 
"built-in" port up to the host's IP stack for either reception by the 
host, or routing by IP. Since this built-in port is "closer" to the host 
kernel, it makes sense for the IP config to be there (at least that's 
how I think about it).



The comment I have about the *original* problem is this: what's being 
described sounds exactly like what would happen if the guest config was 
using <interface type='direct'> rather than <interface type='bridge'>. 
Because the description talks about being connected via a bridge, I at 
first assumed that the connection is <interface type='bridge'>, but 
then just now realized that although it is pointless to use 
type='direct' (a macvtap device) to connect via a bridge, it still would 
work (except host<->guest communication wouldn't work), so it's at least 
worth asking if possibly type='direct' was used by mistake.


https://wiki.libvirt.org/page/TroubleshootMacvtapHostFail

Probably not the issue here, but I thought I should throw it out there 
just in case :-)




Cheers,

Peter

On Tue, 15 Feb 2022 at 20:21, Wolf wrote:


On 15 Feb 2022, at 20:04, Peter Crowther <peter.crowt...@melandra.com> wrote:


And eno1 and eno2 are *both* connected to the same external
switch, yes?


Correct, where each NIC has its ip access-list.
XX1.XX1.XX1.150 and XX2.XX2.XX2.100 are on separate NICs.

When I ping the VM, XX2.XX2.XX2.100, from the host, XX1.XX1.XX1.150,
the host pings itself.

Thanks!

Wolf





On Tue, 15 Feb 2022 at 17:17, Wolf <ort_libv...@bergersen.no> wrote:

 Hi!

1) I have two network ports on my server.
 -      eno1 has the IP: XX1.XX1.XX1.150

 -      bridge0 has the IP: XX2.XX2.XX2.100
        and has the interface member: port eno2.
        eno2 is not set up with an IP address.

2) The host runs on IP: XX1.XX1.XX1.150

3) A VM uses the bridge: bridge0, and has the IP: XX2.XX2.XX2.100

I have a problem with this setup:
I can ssh the VM on XX2.XX2.XX2.100 from outside, but from the
host, XX1.XX1.XX1.150, I can't ssh the VM on XX2.XX2.XX2.100.

Have I set up this wrong or is it something I can do to solve
this?

Thanks!

Wolf

Re: Public IP on virtual machine network issue

2022-02-15 Thread Laine Stump

On 2/14/22 10:18 AM, Tom Ammon wrote:

Laine,

Though I can't remember the particulars, I have a vague memory of the 
sysctl settings in that article indeed solving the problem of traffic 
not being forwarded on the bridge when I had configured no filtering on 
the guest - hence my attempt to share what worked for me. Perhaps it 
would be good to update that page.

Yeah, I had completely forgotten about its existence until two 
unrelated references were suddenly made to it in the last week.

I looked around for a link to create 
an account on the libvirt wiki but could find none. I'm happy to go do 
some more research around the items you mentioned and add a quick note 
to that page to keep from leading people astray in the future, if I 
could get an account on the wiki. Do you know how I would do that?

I actually tried to update the article after this second reference, and 
found that my password no longer works. A while back the decision was 
made to deprecate the wiki and slowly move content into "knowledgebase" 
articles that are included in the project git repo, and I think the wiki 
may have been made read-only at that time. I had planned to ask about 
that in IRC yesterday, but either forgot, or it was too late to catch 
anyone by the time I asked (I've even forgotten what happened yesterday :-/)

Anyway, even in the days when the wiki was "active", automatic account 
creation was disabled to prevent spam articles, so creating an account 
required sending a message to danpb asking for an account; these days I 
think he'd just say "don't bother - it's going away anyway".

Thanks anyway for the offer to update it though (and also for piping in 
with the idea in the first place - hopefully my response didn't come off 
as discouraging responses - even though it wasn't the source of the 
problem this time, next time yours might be the idea that solves the 
issue :-)).

I'll try to take care of the wiki article in the next day or two.

Thanks,
Tom

On Mon, Feb 14, 2022 at 8:12 AM Laine Stump <la...@redhat.com> wrote:

On 2/13/22 5:38 PM, Tom Ammon wrote:
 > Can you post the output of iptables -L?
 >
 > By default, the bridge module in the kernel sends packets traversing the
 > bridge to iptables (in the FORWARD chain I believe) for processing. So
 > if you have configured a DENY policy on the FORWARD chain, or are
 > otherwise filtering in the forward chain, you'll be affecting packets
 > traversing the bridge. Check out this page for details on how to change
 > this behavior:
 > https://wiki.libvirt.org/page/Net.bridge.bridge-nf-call_and_sysctl.conf

That information is *very* out of date; the situation has changed quite
a lot since that was written in 2014.

Packets traversing a bridge device are now only filtered if the
br_netfilter module is loaded, which isn't done by default. It *is*
autoloaded if certain types of iptables rules are added (I can't remember
the details of the type of rule though - there was a bug in iptables a
year or so ago where autoload of br_netfilter was triggered by libvirt
attempting to *remove* a rule of whatever type it was).

Anyway, unless "lsmod | grep br_netfilter" shows that you have
br_netfilter loaded, this entire path is a red herring (if you do have
it loaded, unload it, and try to figure out why it was loaded).

(Interestingly, this is the 2nd time this particular outdated page has
come up in the last week. Has something else broken somewhere that's
causing people to search out this page?)

 >
 > Tom
 >
 > On Sun, Feb 13, 2022 at 4:08 PM Marcin Groszek <mar...@voipplus.net> wrote:
 >
 >     I have been struggling with this for weeks and I was unable to find an
 >     answer on line. Perhaps someone here can help me.
 >
 >     Oracle linux 8 running virtualization:
 >
 >     hardware node has a public IP address on interface bridge0 and physical
 >     eno1 is a member of the bridge0
 >
 >     a virtual OS has interface bridged to lan and source is bridge0, Ip
 >     address of virtual OS is also a public from same class as the
 >     hardware node.
 >
 >     I can route in and out of virtual, I can ping from hardware node to
 >     virtual and vice versa, so the routing works as it should, sort of.
 >
 >     Whe

Re: Public IP on virtual machine network issue

2022-02-14 Thread Laine Stump





On 2/13/22 5:38 PM, Tom Ammon wrote:

Can you post the output of iptables -L?

By default, the bridge module in the kernel sends packets traversing the 
bridge to iptables (in the FORWARD chain I believe) for processing. So 
if you have configured a DENY policy on the FORWARD chain, or are 
otherwise filtering in the forward chain, you'll be affecting packets 
traversing the bridge. Check out this page for details on how to change 
this behavior: 
https://wiki.libvirt.org/page/Net.bridge.bridge-nf-call_and_sysctl.conf 



That information is *very* out of date; the situation has changed quite 
a lot since that was written in 2014.


Packets traversing a bridge device are now only filtered if the 
br_netfilter module is loaded, which isn't done by default. It *is* 
autoloaded if certain types of iptables rules are added (I can't remember 
the details of the type of rule though - there was a bug in iptables a 
year or so ago where autoload of br_netfilter was triggered by libvirt 
attempting to *remove* a rule of whatever type it was).


Anyway, unless "lsmod | grep br_netfilter" shows that you have 
br_netfilter loaded, this entire path is a red herring (if you do have 
it loaded, unload it, and try to figure out why it was loaded).
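
The same check can be scripted. A small sketch, assuming a Linux host: it reads /proc/modules (the file lsmod gets its list from) instead of calling lsmod, and the function name and optional file argument are made up for illustration.

```shell
# Hypothetical helper: return 0 if br_netfilter is listed in the module
# list file $1 (default: /proc/modules), non-zero otherwise.
br_netfilter_loaded() {
    grep -q '^br_netfilter ' "${1:-/proc/modules}" 2>/dev/null
}

if br_netfilter_loaded; then
    echo "br_netfilter is loaded: bridged packets may be passing through iptables"
else
    echo "br_netfilter is not loaded"
fi
```

Anchoring the pattern at the start of the line avoids matching the "bridge" entry, which names br_netfilter in its list of dependent modules.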


(Interestingly, this is the 2nd time this particular outdated page has 
come up in the last week. Has something else broken somewhere that's 
causing people to search out this page?)




Tom

On Sun, Feb 13, 2022 at 4:08 PM Marcin Groszek wrote:


I have been struggling with this for weeks and I was unable to find an
answer on line. Perhaps someone here can help me.

Oracle linux 8 running virtualization:

hardware node has a public IP address on interface bridge0 and physical
eno1 is a member of the bridge0

a virtual OS has interface bridged to lan and source is bridge0, Ip
address of virtual OS is also a public from same class as the
hardware node.

I can route in and out of virtual, I can ping from hardware node to
virtual and vice versa, so the routing works as it should, sort of.

When I try tracepath or traceroute from outside to virtual I get !H on
the last hop.

I get the same result (!H) when I try to do the same from hardware node
to virtual.

Also, when I telnet (TCP) to a specific port on virtual where I have a
daemon LISTENING OR NOT I get: No route to host. Same experiment works
just fine for ssh port.

Firewalld is not running, and I just have very basic iptables rules, like
allowing an external address block to ssh to hardware node and to virtual,
dropping connections from all other sources.

This issue presented itself when I attempted to set up a galera node on
virtual and port 4567 is responding but 4568 and  are not, but the
daemons are running and I can clearly see lsof showing "LISTENING"

I capture the traffic and the tcp as well as udp are getting to the
virtual. Is there a preconfigured netfiltering that I am not aware of?

What am I missing?




-- 
Best Regards:

Marcin Groszek
Business Voip Resource.
http://www.voipplus.net 



--
-
Tom Ammon
M: (737) 400-9042
thomasam...@gmail.com 
-

Re: frequent network collapse possibly due to bridging

2022-01-24 Thread Laine Stump





On 1/24/22 4:35 AM, Martin Kletzander wrote:

On Fri, Jan 21, 2022 at 08:42:58AM -0600, Hakan E. Duran wrote:

Hi,

I would like some help to troubleshoot the problem I have been having
lately with my VM host, which contains 5 VMs, one of which is for
pi-hole, unbound services. It has been a relatively common occurrence in
the last few weeks for me to find that the host machine has lost its
network when I get back home from work. Restoring the VM/VMs does not fix
the problem, the host needs to be restarted for a fix, otherwise there
is both loss of name resolution, as well as an internet connection; I
cannot ping even IPs such as 8.8.8.8. Since I use the pi-hole VM as the DNS
server for my LAN, this means that my whole LAN gets disconnected from
internet, until the host machine is rebooted. The host machine has a
little complicated network setup: the two gigabit connections are bonded
and bridged to the VMs; however this set up has been serving me so well
for several years now. The problem, on the other hand, appeared a few
weeks ago. This doesn't happen every day but often enough to be annoying
and disruptive for my family.



Always good to check what has changed those weeks ago, but I understand
it is difficult to find out what you were updating and where.


My question is, how can I troubleshoot this problem and figure out
whether it is truly due to network bridging somehow collapsing or not? I
tried to find some log files but all I could find were the
/var/log/libvirt/qemu/$VM files, and the particular log file for the pi-hole
VM reported the following lines; however, I am not sure if they are
associated with a real crash or just due to shutting down and restarting
the host (please excuse the word-wrapping):

char device redirected to /dev/pts/2 (label charserial0)
qxl_send_events: spice-server bug: guest stopped, ignoring
2022-01-20T23:41:17.012445Z qemu-system-x86_64: terminating on signal 15 from pid 1 (/sbin/init)


Probably restarting the host as it got SIGTERM'd by init.  Maybe it was
restarted at a bad time and there is some inconsistency on the disk?
Using something like libvirt-guests which can manage your machines when
rebooting would be a good idea.


2022-01-20 23:41:17.716+: shutting down, reason=crashed
2022-01-20 23:42:46.059+: starting up libvirt version: 7.10.0, qemu
version: 6.2.0, kernel: 5.10.89-1-MANJARO, hostname: -redacted-

Please excuse my ignorance but is there a way to restart the
networking without rebooting the host machine? This will not solve my


You can do:

virsh net-destroy <network>
virsh net-start <network>

but depending on what the network looks like, how it is set up etc. you
might need to restart some of the VMs or manually plug them in.


The connection between any guest tap device and a host bridge device 
will be broken by virsh net-destroy, and not restored by virsh net-start 
(because the network driver has no good way of notifying the QEMU driver 
that it has restarted a network). This is something that's been on my 
"list of annoying things I should fix some day" for a very long time, 
but I've never been motivated enough to figure out a clean solution.


In the meantime, if you destroy/start a network, you can get all the 
guest tap devices reconnected by restarting libvirtd:


   systemctl restart libvirtd.service

or if you're using split daemons:

   systemctl restart virtqemud.service

One of the things the QEMU driver does when it's initializing is to 
check where each guest tap device *should* be connected, compare that to 
where it *is* connected, and if those don't match then fix it.
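
Put together, the sequence could be wrapped up as in the sketch below. The function name is made up, and the unit to restart is left to the caller since it depends on whether the host runs the monolithic libvirtd or the split daemons.

```shell
# Hypothetical helper: restart a libvirt network ($1), then restart the
# daemon ($2, default libvirtd.service) so that the QEMU driver reconnects
# the guest tap devices to the bridge.
bounce_network() {
    net=$1
    unit=${2:-libvirtd.service}
    virsh net-destroy "$net" || return 1
    virsh net-start "$net" || return 1
    systemctl restart "$unit"
}

# Example use (the network name "default" is an assumption):
#   bounce_network default
#   bounce_network default virtqemud.service
```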

Re: configure path name of ebtables executable

2021-11-17 Thread Laine Stump





On 11/17/21 4:52 PM, Michael Ströder wrote:

On 11/17/21 20:28, Laine Stump wrote:

On 11/17/21 1:39 PM, Michael Ströder wrote:
Is it possible to configure the full path name of the ebtables 
executable used somewhere in libvirt's config?


That's done in meson.build when you're building from source. Look for 
"optional_programs".


Noted. Thanks.

Background: I'd like to avoid automatic alternatives implementation 
to mess up libvirt installation.


See also:
https://bugzilla.opensuse.org/show_bug.cgi?id=1192799


I don't think the problem is what is being suggested in that bug.


Yes, it is.


The suggestion in that bug is that the problem is because "libvirt need 
/sbin/ebtables point to ebtables-legacy", which is definitely not the 
case. libvirt only requires that /sbin/ebtables point to a binary that 
correctly understands and acts on any valid ebtables command. If the 
binary pointed to by /sbin/ebtables doesn't do that, then it shouldn't 
be pointed to by /sbin/ebtables.
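
To see which binary /sbin/ebtables really ends up at, the symlink chain can be walked one hop at a time. A minimal sketch; the function name is made up and the 20-hop limit is an arbitrary safety net against symlink loops.

```shell
# Hypothetical helper: print every hop of a symlink chain starting at $1,
# one path per line, ending with the final target.
resolve_chain() {
    path=$1
    hops=0
    echo "$path"
    while [ -L "$path" ] && [ "$hops" -lt 20 ]; do
        target=$(readlink "$path")
        case $target in
            /*) path=$target ;;                     # absolute link target
            *)  path=$(dirname "$path")/$target ;;  # relative to the link's directory
        esac
        hops=$((hops + 1))
        echo "$path"
    done
}

# Example use:
#   resolve_chain /sbin/ebtables
```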



follow the symlink from /sbin/ebtables to (probably)
/etc/alternatives/ebtables
[..]
This sounds more like SUSE has some sort of special off-brand
alternative that doesn't understand all valid ebtables commands


openSUSE Tumbleweed now uses libalternatives for ebtables (see my 
comment#2) and thus /sbin/ebtables was linked to /usr/bin/alts. Yes, 
something's broken there and I was looking for a work-around.


Anyway, I think you'd be better off fixing the problem at the source
rather than trying to make some special local build of libvirt to
work around the problem.


IMHO a libvirtd.conf option would be great to avoid relying on this 
alternatives stuff.


There are many things within libvirt that we could do ourselves in order 
to avoid relying on some other package (or add a config knob to point at 
something different to do the work), but that just makes more code that 
must be maintained forever. Working around bugs in other packages with 
package-local fixes and config knobs is a never-ending unwinnable battle 
once you start, and leads to unnecessarily complicated code and 
technical debt (I say this from painful experience - although it's a bit 
of a whatabout-ism, if I had the time (I unfortunately don't :-)) I 
would tell the story of iptablesAddOutputFixUdpChecksum, just as one 
example).


I think a better road in this case (and most other cases) would be to 
fix the package that is broken (sounds like libalternatives). If it is 
breaking libvirt, it's likely breaking other things as well.

Re: configure path name of ebtables executable

2021-11-17 Thread Laine Stump





On 11/17/21 1:39 PM, Michael Ströder wrote:

HI!

Is it possible to configure the full path name of the ebtables 
executable used somewhere in libvirt's config?


That's done in meson.build when you're building from source. Look for 
"optional_programs".




Background: I'd like to avoid automatic alternatives implementation to 
mess up libvirt installation.


See also:
https://bugzilla.opensuse.org/show_bug.cgi?id=1192799


I don't think the problem is what is being suggested in that bug.
Your claim about /etc/alternatives in comment 3 doesn't make any sense - 
I have ebtables-2.0.11 installed on my Fedora machine, and it is using 
/etc/alternatives, and I don't get that error message.


Try running this command:


   /sbin/ebtables -t nat -N xyzzy

and see if it gives you the same error. If it does, then follow the 
symlink from /sbin/ebtables to (probably) /etc/alternatives/ebtables and 
to the final destination from there - you should either end up with 
/usr/sbin/ebtables-legacy or /usr/sbin/ebtables-nft. If you don't, then 
your "alternatives" stuff is messed up, and you need to fix that.
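
For anyone who wants to see every hop of that symlink chain rather than just the end of it, here is a minimal sketch. The function name follow_links is made up for this example, and it assumes a readlink command is available (it is on any normal Linux install):

```shell
# Print every hop of a symlink chain, one path per line, ending at the
# first path that is not a symlink. "follow_links" is a hypothetical
# helper name, not something shipped by libvirt or ebtables.
follow_links() {
    path=$1
    hops=0
    # Cap the number of hops so a symlink loop cannot spin forever.
    while [ -L "$path" ] && [ "$hops" -lt 40 ]; do
        printf '%s\n' "$path"
        target=$(readlink "$path")
        case $target in
            /*) path=$target ;;                      # absolute target
            *)  path=$(dirname "$path")/$target ;;   # relative to the link's directory
        esac
        hops=$((hops + 1))
    done
    printf '%s\n' "$path"
}

# Usage (on the affected host): follow_links /sbin/ebtables
```

If only the final destination matters, "readlink -f /sbin/ebtables" gives that in one step.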


I just checked the current source code for ebtables and the word 
"subcommand" doesn't appear anywhere in it. This sounds more like SUSE 
has some sort of special off-brand alternative that doesn't understand 
all valid ebtables commands (because the command being complained about 
in the error message in the bug *is* a valid ebtables commandline); 
maybe something that puts a shell script wrapper around calling 
ebtables or something? If so, you need to switch away from that 
alternative, or somebody needs to fix the alternative.


Anyway, I think you'd be better off fixing the problem at the source 
rather than trying to make some special local build of libvirt to work 
around the problem.

Re: another upgrade another vm issue

2021-10-30 Thread Laine Stump





On 10/30/21 6:57 AM, daggs wrote:

Greetings Michal,


Sent: Friday, October 29, 2021 at 11:36 AM
From: "Michal Prívozník" 
To: "daggs" , libvirt-users@redhat.com
Subject: Re: another upgrade another vm issue

On 10/28/21 8:40 PM, daggs wrote:

Greetings,

so I've upgraded my server and yet again, one of my VMs lost functionality.
there is no usable sound card.
xml: https://dpaste.com/CVR5M75VH
in vm: https://snipboard.io/aZ7Dcf.jpg

outputs:
utils_server /home/igor # qemu-system-x86_64 --version
QEMU emulator version 6.0.0
Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers
utils_server /home/igor # libvirtd --version
libvirtd (libvirt) 7.8.0
utils_server /home/igor #

any ideas?

Thanks.




I suspect it's related to:

   

in the domain XML. Selecting a backend might help.

   https://libvirt.org/formatdomain.html#audio-backends

Michal




I've diffed the current xml with the one last known to work, they both have the 
same entry


Can you possibly downgrade QEMU, libvirt, and the host kernel 
independently to see if downgrading just one (or two) of them fixes the 
problem?


(I would normally have suggested diffing the qemu commandline from the 
logs in /var/log/libvirt/qemu/${guestname}.log to see if the VFIO 
portion was unchanged (and thus counting out libvirt from the possible 
causes of the problem), but the commandlines have recently switched to 
using JSON syntax, so they will probably be different anyway).


Since your audio device is assigned with VFIO, and libvirt has nothing 
more to do with that than just unbinding the host driver and binding the 
vfio-pci driver, then putting the info on the commandline for QEMU, I'd 
put money on the problem being either in QEMU or the kernel, in which 
case you'd probably get more useful advice from the 
vfio-us...@redhat.com mailing list than here.

Re: internal error: Network is already in use by interface virbr0

2021-09-17 Thread Laine Stump


On 9/17/21 7:11 AM, Michael Ströder wrote:

HI!

I'm using libvirt 7.7.0 on openSUSE Tumbleweed.

Until recently everything just worked. But now my virtual NAT network is
not usable anymore.

Starting a VM I get this error:

# virsh start ae-dir-suse-p1
error: Failed to start domain 'ae-dir-suse-p1'
error: Requested operation is not valid: network 'vnet1' is not active

Manually starting the network does not work either:

# virsh net-start vnet1
error: Failed to start network vnet1
error: internal error: Network is already in use by interface virbr0

In syslog I find:

libvirtd: 6325: error : networkCheckRouteCollision:296 : internal error:
Network is already in use by interface virbr0

What does that mean? Where to look for a config error?



Someone else (probably libvirt itself, for libvirt's "default" network) 
has created a bridge device called virbr0, and has set it up on the same 
subnet that you are trying to use for your network, which is confusingly 
called "vnet1".


My first guess would be that either:

1) you previously did not have the libvirt-daemon-config-network package 
(which adds the default network to the host and sets it to auto-start) 
and now it is installed, and being started before the "vnet1" network.


or

2) you *did* have libvirt-daemon-config-network (and thus the libvirt 
default network) installed, but for whatever reason your libvirtd was 
starting your "vnet1" network first, then failing to start the "default" 
network.


To fix this problem, change the subnet used by one or the other of the 
networks such that they don't conflict (e.g. if both have 
192.168.122.0/24, change one of them to 192.168.123.0/24) using "virsh 
net-edit $networkname", then virsh net-destroy and virsh net-start both 
networks.
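
To check whether two subnets actually collide before editing anything, something like the following rough sketch may help. The function names are invented for this example, it only handles IPv4, and the CIDR strings have to be filled in by hand from what "virsh net-dumpxml $networkname" shows for each network:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# Runs in a subshell (note the parentheses) so the IFS change stays local.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed (exit status 0) when two CIDR subnets overlap.
# Both arguments look like 192.168.122.0/24. Hypothetical helper.
subnets_overlap() {
    a_ip=$(ip_to_int "${1%/*}"); a_len=${1#*/}
    b_ip=$(ip_to_int "${2%/*}"); b_len=${2#*/}
    # Compare using the shorter (less specific) of the two prefixes.
    len=$a_len
    [ "$b_len" -lt "$len" ] && len=$b_len
    # A /0 prefix covers everything, so it overlaps with anything.
    if [ "$len" -eq 0 ]; then return 0; fi
    mask=$(( (4294967295 << (32 - len)) & 4294967295 ))
    [ $(( a_ip & mask )) -eq $(( b_ip & mask )) ]
}

# Usage:
#   subnets_overlap 192.168.122.0/24 192.168.123.0/24 || echo "no conflict"
```

This only compares the two ranges you give it; it does not look at the host's routing table the way libvirt's own check does.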


Alternately, if you never use libvirt's default network, just 
net-destroy it and set it to not auto-start:


   virsh net-destroy default
   virsh net-autostart default --disable
   virsh net-start vnet1

Or, another alternate possibility - instead of using your own NAT 
network, just edit your guest config to use libvirt's default network 
and set net-autostart disable for your vnet1 network (if you have any 
extra config in that network, like static DHCP addresses, you'll want to 
move it into the default network, then net-destroy/net-start the default 
network so those changes take effect).


Once you've changed the config, it should persist through future reboots 
of your host.




When restarting libvirtd itself I see the following syslog warning message:

libvirtd: 5733: warning : netcfIfaceRegister:1287 : Failed to initialize
libnetcontrol.  Management of interface devices is disabled


That's unrelated and unimportant. Most probably the libvirt APIs 
affected by that have *never* operated properly on openSUSE (it uses a 
"netcf-lookalike" that, AFAIU, accepts the same XML as netcf and has 
some of the same functions, but was never really "complete", and is 
probably just as much abandonware as netcf (which was abandoned, by me, 
last year)), and it's likely that you never had the libnetcontrol 
package installed at all, and have always had this warning but just 
didn't notice it until now.

Re: virsh domifaddr domain does not show static IP

2021-09-10 Thread Laine Stump


On 9/9/21 10:07 PM, Kaushal Shriyan wrote:


Thanks Laine for the detailed explanation. The below command worked. 
Thanks a lot and appreciate it.


virsh domifaddr testdobssbahrainms --source agent

 Name       MAC address          Protocol     Address
-------------------------------------------------------------------------------
 lo         00:00:00:00:00:00    ipv4         127.0.0.1/8
 eth0       52:14:00:74:11:14    ipv4         192.168.0.113/24



Is there a way to find out the Static IP address if the KVM Guest VM 
instance is shut off? Thanks in advance.


Maybe?

In the case of --source agent, the guest must be queried, and the guest 
is no longer running so it can't answer. In the other cases, libvirt is 
looking for the name of the tap device used by the guest interface, but 
the tap device no longer exists. So libvirt doesn't provide any method.


However, if the guest was recently running, there may still be an entry 
left in the arp cache, and you would be able to see it by grepping (on 
the host, of course) for the guest's MAC address, like this:


   arp -an | grep 52:14:00:74:11:14

That's not going to work if the guest has been down for longer than the 
timeout of the arp cache though (or if the guest hasn't communicated 
with the host in any manner for that amount of time).
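
On hosts where the old "arp" command isn't installed, the same lookup can be done against the output of "ip neigh show". A small sketch, with a made-up function name, that assumes the usual "ADDRESS dev IFACE lladdr MAC STATE" line layout:

```shell
# Read "ip neigh show" style lines on stdin and print the IP address of
# every entry whose link-layer address matches the given MAC
# (case-insensitive). "ips_for_mac" is a hypothetical helper name.
ips_for_mac() {
    awk -v mac="$1" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "lladdr" && tolower($(i + 1)) == tolower(mac))
                    print $1
        }'
}

# Usage (on the host):
#   ip neigh show | ips_for_mac 52:14:00:74:11:14
```

The same caveat applies: once the entry has aged out of the neighbor table, there is nothing left to find.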

Re: virsh domifaddr domain does not show static IP

2021-09-09 Thread Laine Stump





On 9/9/21 2:11 PM, Kaushal Shriyan wrote:

Hi,

I have assigned static IP for all the below KVM Guest VM's. Is there a 
way to find out the IP of the below VM's from virsh utility or any 
other utility? virsh domifaddr testdobssbahrainms does not show the 
static IP.


By default, "virsh domifaddr" will attempt to look up the MAC address of 
the guest interface in the table of IP addresses leased from the 
libvirt-managed dhcp server for the network the interface is connected 
to. If the interface has a statically configured IP, or if it isn't 
connected to a libvirt-managed virtual network, then no IP addresses 
will be found. You can change this behavior with the "--source" option:


   '--source arp'   - looks for the MAC address in the host's ARP table
   '--source agent' - queries the guest agent (which must have been
                      installed)

Each of these has varying levels of reliability and success, depending 
on your specific setup. e.g., if the guests don't have the guest agent 
installed, that method will fail, and if the guest can't be trusted, 
then any information it sends also can't be trusted. Alternately, it is 
possible for an external device on the network to poison the ARP table 
with bad information, and it's also possible that the guest simply has 
no entry in the ARP table (if the host has never attempted to 
communicate with it, or if it's connected via a host interface/bridge 
that itself has no IP address.)


Hmm, and I guess --source arp also wouldn't work for guests connected 
via macvtap, since guest<->host communication isn't supported in that 
case (and so the guest could never show up in the host's ARP table).


Anyway if your guests are connected to a libvirt virtual network or to a 
Linux host bridge that has an IP address on the host, then "--source 
arp" should work for you.
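
If you don't know in advance which source will work for a given guest, one option is to try them in turn. This is only a sketch: the function name and the VIRSH override variable are made up for this example, and the order (lease, then arp, then agent) is an arbitrary choice, not anything libvirt prescribes:

```shell
# Try each domifaddr source in turn and print the first one that returns
# any address lines. VIRSH is a hypothetical override so the command can
# be swapped out (e.g. for testing); it defaults to plain "virsh".
guest_addrs() {
    domain=$1
    for source in lease arp agent; do
        # Keep only rows that carry an address, dropping header/separator.
        out=$(${VIRSH:-virsh} domifaddr "$domain" --source "$source" \
                  2>/dev/null | grep -E 'ipv[46]' || true)
        if [ -n "$out" ]; then
            printf 'source=%s\n%s\n' "$source" "$out"
            return 0
        fi
    done
    return 1
}

# Usage:
#   guest_addrs testdobssbahrainms
```

Keep in mind the reliability caveats above: an answer from "arp" or "agent" is only as trustworthy as the network or the guest it came from.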




# virsh list --all
 Id   Name                       State
------------------------------------------
 1    testdobssbahrainms         running
 2    testdosstomcatpibms        running
 3    testdobsstomcatkineticms   running
 4    testdobsstomcatmsbms       running
 5    testdobsstomcatfdms        running
 6    testdobsstomcathsbcnetms   running
 7    testdobsstomcatdbbms       running
 8    testdobssapigeedev         running
#
# virt-install --version
2.2.1
# cat /etc/redhat-release
CentOS Stream release 8

# virsh domifaddr testdobssbahrainms
 Name       MAC address          Protocol     Address
-------------------------------------------------------------------------------

Please guide. Thanks in advance.

Best Regards,

Kaushal

Re: issues with vm after upgrade

2021-08-29 Thread Laine Stump

(Alex - search for your name down in the middle of this - there is one 
question for you. You can probably save your neurons the trouble of 
reading the rest)


On 8/28/21 6:56 AM, daggs wrote:

Greetings Laine,


Sent: Wednesday, August 25, 2021 at 7:53 PM
From: "Laine Stump" 
To: "daggs" 
Cc: "Martin Kletzander" , libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

On 8/20/21 12:07 PM, daggs wrote:

Greetings Laine,


Sent: Monday, August 16, 2021 at 12:57 AM
From: "Laine Stump" 
To: "daggs" 
Cc: "Martin Kletzander" , libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade



On 8/14/21 6:05 AM, daggs wrote:

Greetings Martin,


Sent: Thursday, August 12, 2021 at 2:07 PM
From: "daggs" 
To: "Martin Kletzander" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade


Sent: Thursday, August 12, 2021 at 11:49 AM
From: "Martin Kletzander" 
To: "daggs" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

On Wed, Aug 11, 2021 at 08:53:10PM +0200, daggs wrote:

Greetings Martin,



Sent: Wednesday, August 11, 2021 at 6:08 PM
From: "daggs" 
To: "Martin Kletzander" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

Greetings Martin,


Sent: Wednesday, August 11, 2021 at 4:13 PM
From: "Martin Kletzander" 
To: "daggs" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

On Wed, Aug 11, 2021 at 03:09:34PM +0200, daggs wrote:

Greetings Martin,


Sent: Wednesday, August 11, 2021 at 10:14 AM
From: "Martin Kletzander" 
To: "daggs" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade



[...]



2) To your issue with starting the domain it would be good to know what
   is the error you get from virsh (or however you are starting the
   domain) and the debug logs of libvirtd, ideally just for the part of
   the domain starting.

that is the issue, there wasn't any error. the vm just didn't boot.


Oh, so I misunderstood.  What was the state of the VM in libvirt?
"paused" or "running"?  Was there serial console working?

it was marked as running and there was no serial



That's a pity we could not examine what was actually happening.




I can diff the original xml with the new one to see the diffs and post them 
here if you wish



Would be nice to see if there are any differences.  The newly created
one works then?



I'll send it later today



here: https://dpaste.com/5VBUU8Z9W



Unfortunately there are many differences there.  The machine type
changes _something_ in qemu, there is different PCI(e) topology, and I
do not think I will be able to figure this out without the non-working
machine.

So if your current setup works for you right now I'd leave figuring out
the previous issue to others, if there is anyone wanting to figure out
if there is some libvirt issue.

Have a nice day



my current setup works beside the hdmi audio, this I still need to investigate.

thanks for your help.

Dagg



just to update, I've solved the sound issue, frankly, I don't understand how 
the guest showed a soundcard in the first place.
from what I gather, libvirt sets the -nodefaults flag to prepare the vm's 
properties from scratch.
in this situation, the sound card is a function in the host machine's pci tree.
when libvirt created the pci tree for the guest, it placed the card as a 
function of a device as well, in my case 02:00.2
however it didn't create a device at 02:00.0.


Are you basing this claim on the libvirt XML? Or on what you see with
lspci in the guest?

When libvirt is assigning PCI addresses to devices in a guest, it will
never auto-assign a non-0 function. This will only happen if the user
explicitly requests it (and even then, iirc, libvirt should generate an
error if function 0 of the same slot has no device - something to the
effect of "no device on function 0 of a multifunction device").

Anyway, when I looked back at the XML diff you posted earlier (see
below), I didn't see any hostdev device assigned to 02:00.2. What I
*did* see was that in both the old and the new version of the diff, the
hostdev devices were assigned to function 0 of different *slots* on a
dmi-to-pci-bridge controller, which should cause no problems (unless
there is a bug in QEMU's dmi-to-pci-bridge). (The important thing,
though, is that there is no hostdev device on a non-0 function, and when
it is on a non-0 slot, that's because it's on a dmi-to-pci-bridge (which
has 32 slots)).



I saw it in guest,


But I didn't see it in the XML diffs that you had posted.

as mentioned below, here is the xml of the new vm but with the sound problem: 
https://dpaste.com/BB9EDY6BK
the relevant entry is at https://dpaste.com/BB9EDY6BK#l

Re: issues with vm after upgrade

2021-08-25 Thread Laine Stump

On 8/20/21 12:07 PM, daggs wrote:

Greetings Laine,

Sent: Monday, August 16, 2021 at 12:57 AM
From: "Laine Stump" 
To: "daggs" 
Cc: "Martin Kletzander" , libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

On 8/14/21 6:05 AM, daggs wrote:

Greetings Martin,

Sent: Thursday, August 12, 2021 at 2:07 PM
From: "daggs" 
To: "Martin Kletzander" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

Sent: Thursday, August 12, 2021 at 11:49 AM
From: "Martin Kletzander" 
To: "daggs" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

On Wed, Aug 11, 2021 at 08:53:10PM +0200, daggs wrote:

Greetings Martin,

Sent: Wednesday, August 11, 2021 at 6:08 PM
From: "daggs" 
To: "Martin Kletzander" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

Greetings Martin,

Sent: Wednesday, August 11, 2021 at 4:13 PM
From: "Martin Kletzander" 
To: "daggs" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

On Wed, Aug 11, 2021 at 03:09:34PM +0200, daggs wrote:

Greetings Martin,

Sent: Wednesday, August 11, 2021 at 10:14 AM
From: "Martin Kletzander" 
To: "daggs" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

[...]

2) To your issue with starting the domain it would be good to know what
  is the error you get from virsh (or however you are starting the
  domain) and the debug logs of libvirtd, ideally just for the part of
  the domain starting.

that is the issue, there wasn't any error. the vm just didn't boot.

Oh, so I misunderstood.  What was the state of the VM in libvirt?
"paused" or "running"?  Was there serial console working?

it was marked as running and there was no serial

That's a pity we could not examine what was actually happening.

I can diff the original xml with the new one to see the diffs and post them 
here if you wish

Would be nice to see if there are any differences.  The newly created
one works then?

I'll send it later today

here: https://dpaste.com/5VBUU8Z9W

Unfortunately there are many differences there.  The machine type
changes _something_ in qemu, there is different PCI(e) topology, and I
do not think I will be able to figure this out without the non-working
machine.

So if your current setup works for you right now I'd leave figuring out
the previous issue to others, if there is anyone wanting to figure out
if there is some libvirt issue.

Have a nice day

my current setup works beside the hdmi audio, this I still need to investigate.

thanks for your help.

Dagg

just to update, I've solved the sound issue, frankly, I don't understand how 
the guest showed a soundcard in the first place.
from what I gather, libvirt sets the -nodefaults flag to prepare the vm's 
properties from scratch.
in this situation, the sound card is a function in the host machine's pci tree.
when libvirt created the pci tree for the guest, it placed the card as a 
function of a device as well, in my case 02:00.2
however it didn't create a device at 02:00.0.

Are you basing this claim on the libvirt XML? Or on what you see with
lspci in the guest?

When libvirt is assigning PCI addresses to devices in a guest, it will
never auto-assign a non-0 function. This will only happen if the user
explicitly requests it (and even then, iirc, libvirt should generate an
error if function 0 of the same slot has no device - something to the
effect of "no device on function 0 of a multifunction device").

Anyway, when I looked back at the XML diff you posted earlier (see
below), I didn't see any hostdev device assigned to 02:00.2. What I
*did* see was that in both the old and the new version of the diff, the
hostdev devices were assigned to function 0 of different *slots* on a
dmi-to-pci-bridge controller, which should cause no problems (unless
there is a bug in QEMU's dmi-to-pci-bridge). (The important thing,
though, is that there is no hostdev device on a non-0 function, and when
it is on a non-0 slot, that's because it's on a dmi-to-pci-bridge (which
has 32 slots)).

I saw it in guest,

But I didn't see it in the XML diffs that you had posted.

I'd assume that if libvirt defines a device on a specific bdf, the guest will 
not change it.

That's not exactly true - the bus "number" in libvirt isn't given to 
qemu as an actual number, but as an alphanumeric device id (called 
"alias name" in libvirt XML). QEMU doesn't have any concept of "bus 
number", because (afaiu) there is no way to convey such info to the 
guest firmware/OS; instead, QEMU creates a topology of interconnected 
controllers, the firmware and/or OS traverses this topology and assigns 
numbers to the encountered control

Re: issues with vm after upgrade

2021-08-15 Thread Laine Stump

On 8/14/21 6:05 AM, daggs wrote:

Greetings Martin,

Sent: Thursday, August 12, 2021 at 2:07 PM
From: "daggs" 
To: "Martin Kletzander" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

Sent: Thursday, August 12, 2021 at 11:49 AM
From: "Martin Kletzander" 
To: "daggs" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

On Wed, Aug 11, 2021 at 08:53:10PM +0200, daggs wrote:

Greetings Martin,

Sent: Wednesday, August 11, 2021 at 6:08 PM
From: "daggs" 
To: "Martin Kletzander" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

Greetings Martin,

Sent: Wednesday, August 11, 2021 at 4:13 PM
From: "Martin Kletzander" 
To: "daggs" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

On Wed, Aug 11, 2021 at 03:09:34PM +0200, daggs wrote:

Greetings Martin,

Sent: Wednesday, August 11, 2021 at 10:14 AM
From: "Martin Kletzander" 
To: "daggs" 
Cc: d...@berrange.com, libvirt-users@redhat.com
Subject: Re: issues with vm after upgrade

[...]

2) To your issue with starting the domain it would be good to know what
 is the error you get from virsh (or however you are starting the
 domain) and the debug logs of libvirtd, ideally just for the part of
 the domain starting.

that is the issue, there wasn't any error. the vm just didn't boot.

Oh, so I misunderstood.  What was the state of the VM in libvirt?
"paused" or "running"?  Was there serial console working?

it was marked as running and there was no serial

That's a pity we could not examine what was actually happening.

I can diff the original xml with the new one to see the diffs and post them 
here if you wish

Would be nice to see if there are any differences.  The newly created
one works then?

I'll send it later today

here: https://dpaste.com/5VBUU8Z9W

Unfortunately there are many differences there.  The machine type
changes _something_ in qemu, there is different PCI(e) topology, and I
do not think I will be able to figure this out without the non-working
machine.

So if your current setup works for you right now I'd leave figuring out
the previous issue to others, if there is anyone wanting to figure out
if there is some libvirt issue.

Have a nice day

my current setup works beside the hdmi audio, this I still need to investigate.

thanks for your help.

Dagg

just to update, I've solved the sound issue, frankly, I don't understand how 
the guest showed a soundcard in the first place.
from what I gather, libvirt sets the -nodefaults flag to prepare the vm's 
properties from scratch.
in this situation, the sound card is a function in the host machine's pci tree.
when libvirt created the pci tree for the guest, it placed the card as a 
function of a device as well, in my case 02:00.2
however it didn't create a device at 02:00.0.

Are you basing this claim on the libvirt XML? Or on what you see with 
lspci in the guest?

When libvirt is assigning PCI addresses to devices in a guest, it will 
never auto-assign a non-0 function. This will only happen if the user 
explicitly requests it (and even then, iirc, libvirt should generate an 
error if function 0 of the same slot has no device - something to the 
effect of "no device on function 0 of a multifunction device").

Anyway, when I looked back at the XML diff you posted earlier (see 
below), I didn't see any hostdev device assigned to 02:00.2. What I 
*did* see was that in both the old and the new version of the diff, the 
hostdev devices were assigned to function 0 of different *slots* on a 
dmi-to-pci-bridge controller, which should cause no problems (unless 
there is a bug in QEMU's dmi-to-pci-bridge). (The important thing, 
though, is that there is no hostdev device on a non-0 function, and when 
it is on a non-0 slot, that's because it's on a dmi-to-pci-bridge (which 
has 32 slots)).

On the topic of having a dmi-to-pci-bridge show up in your XML: I don't 
remember what versions the changes were in (it was at least a year or 
two ago), but only a fairly old version of libvirt would do that - 1) 
recent libvirt will assume that any hostdev PCI device is a PCIe device, 
so it will add a pcie-root-port and assign the hostdev device to slot 0 
of that root-port, and even before that 2) we switched from using 
dmi-to-pci-bridge to using pcie-to-pci-bridge quite some time ago as well.

So if you're generating new XML based on config that doesn't have pci 
controllers already in it, and you're seeing hostdevs (or any other PCI 
devices) assigned to an automatically-added dmi-to-pci-bridge, then your 
libvirt version is severely out of date.

On 8/11/21 2:53 PM, daggs wrote:
>> From: "daggs" 
>>> From: "Martin Kletzander" 
>>> On Wed, Aug 11, 2021 at 03:09:34PM +0200, daggs wrote:
 I can diff the original xml with the new one to see the diffs and 
post them here if you wish

>>>
>>> Would be

Re: Sharing dhcp leases between multiple host systems

2021-08-04 Thread Laine Stump





On 8/4/21 8:12 AM, Michael Ablassmeier wrote:

hi,

assume i have multiple host systems which spin up virtual machines using
the vagrant/vagrant-libvirt provider. Both host systems have a defined
network (which has the same name on both hosts) which the first network
interface of the virtual machine is assigned to.

During boot of the virtual machine, the first network device is
configured via DHCP and vagrant uses the mac address table or libvirt
dhcp leases table to find out about the IP address that was assigned to
the virtual machine: From that point on, i can reach the virtual machine
locally on the host system.

This works nicely if the network is a libvirt NAT network, as the IP
addresses are unique on both host systems.

Now i want to change the situation and provide routed addresses, thus i
want to make sure that an IP that is assigned for a virtual machine on
host A is not re-used on host B to not have IP address conflicts.


If you are using routed networking rather than NAT, then the routed 
network on each host will have to use a different subnet anyway[*], so 
there is no chance of any conflict in IP addresses.


[*] while you *could* use the same subnet for the routed virtual network 
on both hosts, each host would then only be able to reach the guests on 
its own virtual network, and 3rd parties would need to point their 
routing tables at one host or the other for that single subnet, and so 
would only be able to reach the guests on one of the hosts, but not the 
other.




What im searching for is the "libvirt" way to have a central lease file
between multiple hosts for the same network (without having another
layer like OVS/OVN).

What i guess would work is:

  1) share /var/lib/libvirt/dnsmasq between both host systems, of course
  means the virtual bridge for the network has to have the same
  name on both systems.

  2) replace /usr/libexec/libvirt_leaseshelper with my own version, that
  stores the leases in an central place.

  3) a way that exists and i dont know about?

Option 2) sounds the best for me, but i currently dont see a way to
specify the dhcp-script used for a network on libvirt side .. any
opinions on this?

Using libvirt 7.x and alike from the centos 8 advanced virtualization
stream.

thanks,
 - michael

Re: attach a pcie root port as hostdev

2021-08-04 Thread Laine Stump





On 8/4/21 3:21 AM, Jiatong Shen wrote:

Hello community,

I am working on plugging a xilinx fpga card as a hostdev.  This fpga 
card has 2 functions, each has a different role, for some reason, I 
would like to attach both of them but do not want to attach them 
separately, so I have tried to attach a pci bridge where fpga cards 
connect to. but got the following  error,


error: Failed to attach device from bridge.xml
error: internal error: Non-endpoint PCI devices cannot be assigned to guests 



so, my question is is it possible to attach a pci-bridge as hostdev? 
thank you.


No, it isn't possible. VFIO only supports attaching endpoint devices.

You will need to attach each of your endpoint devices separately. (If 
the issue is that you want both devices to show up on the same root port 
in the guest, that is easily accomplished by manually setting the 
guest-side pci address of the devices to have the same bus/slot, 
but different function; this does mean that you can't hotplug, but will 
deliver the desired topology. Note that although I've heard of people 
who thought that they needed devices to be on the same slot, I've never 
seen any evidence that this was actually required by any OS or driver)

Re: issue when not using acpi indices in libvirt 7.4.0 and qemu 6.0.0

2021-06-23 Thread Laine Stump

On 6/23/21 7:37 PM, Riccardo Ravaioli wrote:
On Wed, 23 Jun 2021 at 18:59, Daniel P. Berrangé wrote:

[...]
So your config here does NOT list any ACPI indexes

Exactly, I don't list any ACPI indices.

 > After upgrading to libvirt 7.4.0 and qemu 6.0.0, the XML snippet
above
 > yielded:
 > - ens1 for the first virtio interface => OK
 > - rename4 for the second virtio interface => **KO**

(this is reminiscent of what would sometimes happen back in the "bad old 
days" of ethN NIC naming.)

 > - ens3 for the PCI passthrough interface  => OK

With the older libvirt + qemu, the guest (Debian) was setting the device 
names to ens1, ens2, and ens3 (through some sort of renaming, apparently 
by udev). The names that these interfaces would normally get (e.g. in 
Fedora or RHEL8) would be enp1s1, enp1s2, and enp1s3.

With the newer libvirt + qemu, the guest still has the names set by 
systemd (?) to enp1s1, enp1s2, and enp1s3.

So from libvirt's POV, nothing should have changed upon upgrade,
as we wouldn't be setting any ACPI indexes by default.

Right. If ACPI indexes had been turned on, I would have expected the 
names to be, e.g., eno1, eno2, eno3. That would require explicitly adding 
the option to the qemu commandline, though, and it isn't there (see below).

Can you show the QEMU command line from /var/log/libvirt/qemu/$GUEST.log
both before and after the libvirt upgrade.

Sure, here it is before the upgrade: https://pastebin.com/ZzKd2uRJ 

-netdev tap,fd=50,id=hostnet0 \
-device virtio-net-pci,csum=off,netdev=hostnet0,id=net0,mac=52:54:00:aa:cc:05,bus=pci.1,addr=0x1 \

-netdev tap,fd=51,id=hostnet1 \
-device virtio-net-pci,csum=off,netdev=hostnet1,id=net1,mac=52:54:00:aa:bb:81,bus=pci.1,addr=0x2 \

[...]
-device vfio-pci,host=:0d:00.0,id=hostdev0,bus=pci.1,addr=0x3 \

And here after the upgrade: https://pastebin.com/EMu6Jgat 

-netdev tap,fd=55,id=hostnet0 \
-device 
virtio-net-pci,csum=off,netdev=hostnet0,id=net0,mac=52:54:00:aa:cc:a0,bus=pci.1,addr=0x1 
\

-netdev tap,fd=56,id=hostnet1 \
-device 
virtio-net-pci,csum=off,netdev=hostnet1,id=net1,mac=52:54:00:aa:bb:a1,bus=pci.1,addr=0x2 
\

[...]

So there is no change in the qemu commandline for the virtio-net 
devices, nor for the hostdev.
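
A comparison like this can also be done mechanically rather than by eye. What follows is only a minimal Python sketch, not anything libvirt provides: the helper names parse_device_args and diff_devices are made up for illustration, and it assumes the two command lines have been copied out of the guest log as plain text with backslash line continuations.

```python
import shlex


def parse_device_args(cmdline):
    """Return {device id: {property: value}} for every '-device' argument."""
    # Join backslash-continued lines, then split the way a shell would.
    tokens = shlex.split(cmdline.replace("\\\n", " "))
    devices = {}
    for i, tok in enumerate(tokens[:-1]):
        if tok != "-device":
            continue
        model, *props = tokens[i + 1].split(",")
        fields = {"model": model}
        for prop in props:
            key, _, value = prop.partition("=")
            fields[key] = value
        # Fall back to the model name if the device has no id= property.
        devices[fields.get("id", model)] = fields
    return devices


def diff_devices(before, after):
    """Return {device id: {property: (old, new)}} for properties that differ."""
    changes = {}
    for dev_id in sorted(set(before) | set(after)):
        old, new = before.get(dev_id, {}), after.get(dev_id, {})
        changed = {
            key: (old.get(key), new.get(key))
            for key in sorted(set(old) | set(new))
            if old.get(key) != new.get(key)
        }
        if changed:
            changes[dev_id] = changed
    return changes
```

Fed the two virtio-net command lines quoted above, a diff of this kind should report only the "mac" property as changed, which is the detail discussed further down.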

(BTW, you say that your vfio-assigned device is SRIOV, but it isn't - 
it's a standard ethernet device - "Intel Corporation I210 Gigabit 
Network Connection" - this has no effect on the current conversation, 
just FYI).

Since the name of the devices hasn't changed to "enoBLAH", I think the 
whole ACPI index thing is a red herring - ACPI indexes aren't being set 
and the device names aren't being set based on the non-existent ACPI 
indexes. There is something else going on (seemingly tied to the device 
renaming that udev (?) is doing from enpXsY to ensZ).

I notice that you're apparently redefining this domain from scratch each 
time it is started.

1) The machinetype changes from pc-i440fx-5.2 to pc-i440fx-6.0, implying 
that each time the domain is started, it is being told to use the 
generic machinetype "pc", which is then canonicalized to "the newest 
pc-i440fx-based machinetype" before starting the guest.

2) The MAC address has been changed for the two virtio-net cards, but 
not to some random number as would happen if you were allowing libvirt 
to generate the MAC addresses itself.

It's common for OSes to notice a new MAC address and attempt to give the 
interface a new name. Perhaps this is happening and whoever/whatever is 
doing that is screwing things up. Or it's possible there is some minor 
change in the machinetype from pc-i440fx-5.2 to pc-i440fx-6.0 that is 
causing this renaming to behave differently.

If you really need your guests to be stable, you shouldn't just use "pc" 
as the machinetype every time the guest is started, but instead save the 
canonicalized machinetype listed in the XML when you initially define 
the domain, and use that canonicalized machinetype on all future starts 
of the domain. Likewise, you should retain the exact MAC addresses that 
are used for all the NICs when the domain is originally defined and 
started for the first time, and use those exact same MAC addresses in 
subsequent starts. That way you are guaranteed (modulo any bugs) that 
the guest is presented with the exact same hardware each time it boots. 
If you use "virsh define" and "virsh start" (rather than "virsh create" 
- I can't be certain this is what you're doing, but there are clues 
indicating it might be the case) then all these details are 
automatically preserved for you within libvirt's persistent domain 
configuration.
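
If the domain really does have to be redefined each time, the two details worth carrying over can be pulled out of the XML that libvirt reports for the domain (e.g. from "virsh dumpxml"). Here is a minimal Python sketch of that, using only the standard library; the function name stable_hardware_facts is made up for illustration, and the input is assumed to be ordinary domain XML with the machine type on the <type> element under <os> and one <mac address='...'/> per <interface>.

```python
import xml.etree.ElementTree as ET


def stable_hardware_facts(domain_xml):
    """Extract the canonicalized machine type and NIC MAC addresses.

    These are the values to reuse verbatim on later starts so that the
    guest is presented with the same hardware.
    """
    root = ET.fromstring(domain_xml)
    os_type = root.find("./os/type")
    machine = os_type.get("machine") if os_type is not None else None
    macs = [mac.get("address") for mac in root.findall("./devices/interface/mac")]
    return {"machine": machine, "macs": macs}
```

The returned machine type would go back into the XML in place of "pc", and the MAC addresses into the matching interfaces, the next time the domain is created.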

One other comment - I don't remember the exact location, but I recall 
from a long long time ago that udev saves the information about names 
that it gives to NICs "somewhere". You may want to find and clear out 
that cache of info in the guest to get

Re: KVM Virtual Machine Network - Guest-guest/VM-VM only network (no host/hypervisor access, no outbound connectivity)

2021-06-14 Thread Laine Stump





On 6/11/21 7:22 PM, Eduardo Lúcio Amorim Costa wrote:
I know that with the *virsh* command I can create several types of 
networks (a "NAT network", for example) as we can see in these URLs...


KVM network management 
KVM default NAT-based networking 
 (page 33)


*QUESTION:* How can I create a network (*lan_n*) where only guests/VMs 
have connectivity, with no outbound connectivity and no host/hypervisor 
connectivity?


(Just to be sure I'm understanding correctly - you want the guests on 
this network to have connectivity to each other, but not guest<->host, 
and nothing beyond the host, correct?)


Normally the guests would get their DHCP-assigned IP address from the 
host, and use the host for DNS, but since you want to forbid 
guest<->host communication, that implies that either one of the guests 
on the network will act as DHCP/DNS server, or that the guests will have 
statically configured IP addresses.


That being the case, all you really need is to define a libvirt virtual 
network that has no IP address on the host, e.g.:


   <network>
     <name>super-isolated</name>
   </network>

(It *might* be necessary to add "ipv6='yes'" immediately after "network" 
in order for IPv6 connectivity to work, but I'm not sure  and don't have 
a setup to try it right now).
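
For anyone generating this kind of definition from a script, here is a minimal Python sketch that builds the same network XML (a <network> containing nothing but a <name>, so no <ip> and no <forward>). The function name isolated_network_xml and its ipv6 flag are made up for illustration; the ipv6='yes' attribute is the untested suggestion from the paragraph above, not something verified here.

```python
import xml.etree.ElementTree as ET


def isolated_network_xml(name, ipv6=False):
    """Build XML for a libvirt network with no host IP and no forwarding."""
    network = ET.Element("network")
    if ipv6:
        # Untested suggestion from the thread: may be needed for guest IPv6.
        network.set("ipv6", "yes")
    ET.SubElement(network, "name").text = name
    return ET.tostring(network, encoding="unicode")
```

The resulting string can be written to a file and handed to "virsh net-define", followed by "virsh net-start".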




*NOTE:* The connectivity to other resources will be provided by a 
*pfSense* firewall server that will have access to another network 
(*wan_n*) with outbound connectivity and other resources.


Yes, this is a common config - have a "super-isolated" network for all 
the guests + the firewall VM, and then the firewall VM has a 2nd 
interface that connects everyone to the outside.




|Network layout... [N]wan_n ↕ [I]wan_n [V]pfsense_vm [I]lan_n ↕ [N]lan_n 
↕ . ↕ ↕ ↕ [V]some_vm_0 [V]some_vm_1 
[V]some_vm_4 [V]some_vm_2 [V]some_vm_5 [V]some_vm_3 _ [N] - Network; _ 
[I] - Network Interface; _ [V] - Virtual Machine. |


Sigh. Stupid email client formatting - your original ASCII diagram 
looked nice, but just look at what Thunderbird did to it when I hit 
reply :-/ (fortunately I didn't need to refer to it)

Re: Virtual Network API for QEMU

2021-03-28 Thread Laine Stump


On 3/27/21 8:39 AM, Radek Simko wrote:

Hi,
According to this support matrix 
https://libvirt.org/hvsupport.html#virNetworkDriver 


there is no support for any APIs other than hypervisor ones for qemu.
For example virConnectNumOfNetworks is not supported.


I'm afraid I don't understand your question. Which hypervisor are you 
using that you think virConnectNumOfNetworks isn't supported? The only 
possible meaning I can get from the above sentences is that you think 
virConnectNumOfNetworks isn't supported when qemu is the hypervisor, 
which is definitely *not* true.


As a matter of fact, essentially *all* of the functions in the matrix are 
supported when qemu is the hypervisor, pretty much every one of them 
ever since their original introduction (e.g., the function you reference 
has been supported since libvirt 0.2.0, which was released in February 
2007).


Are you possibly misinterpreting the contents of the support matrix?

Is there any particular reason this is not supported? Has any 
development in that area been attempted in the past? Would contributions 
adding support be welcomed?


Thanks,

Radek Simko

Re: Updating dnsmasq options with virsh net-update

2021-03-17 Thread Laine Stump


On 3/17/21 1:51 PM, brent s. wrote:

On 3/17/21 13:19, Alex Crawford wrote:

I'm trying to take advantage of libvirt's support for passing through
options to dnsmasq, but I'm
having trouble getting it to take effect. I have a network already
created and I'm trying to use net-update to add the options, but it's
not clear to me what section I should specify. By the way, is there a
good way to list the available sections? I've been resorting to reading
the code.
Working in a different direction, I tried using net-edit to make the
changes but they seem to have been silently discarded:

     $ virsh -c qemu:///system net-edit crawford-libvirt-67v2h
     Network crawford-libvirt-67v2h XML configuration edited.
     $ virsh -c qemu:///system net-dumpxml crawford-libvirt-67v2h | grep
--count 
     0

Can anyone tell me what I'm doing wrong or how this feature was intended
to be used? Thank you.


https://wiki.libvirt.org/page/Networking#Applying_modifications_to_the_network



-Alex


The last time I tried using net-update, if I recall it didn't support
full editing.


That is correct, and it is by design. When I added the virNetworkUpdate 
API I started with exactly that idea, but during discussions we decided 
against allowing such freeform changing of anything and everything in 
the network's config (I don't remember the arguments in either direction 
now, but I definitely remember the discussion happening :-))



I had to net-edit the network in question and restart it
(to do exactly what you're trying to do, I should note!). I don't think
net-update lets you edit the root element's namespace (which is what you
need to do for e.g. <dnsmasq:options> to not be eaten).


<dnsmasq:options> is in some ways even beyond just "editing the root 
element's namespace" - it is adding opaque stuff into the dnsmasq 
commandline that will have effects that can't be comprehended by 
libvirt's network driver - it could do something that completely 
counteracts what libvirt has purposefully added.


But I digress. You are correct that <dnsmasq:options> can't be changed 
with virsh net-update.


The good news, though, is that you can safely net-destroy and then 
net-start the network, and get full connectivity of all your guests 
(whose tap devices have just been disconnected from the network's bridge 
by the restart) back by just restarting libvirtd.service (at least if 
you have a libvirt that is newer than a couple years old). This means 
that, aside from the short disruption in connectivity during the time 
between "virsh net-destroy $net" and "systemctl restart 
libvirtd.service", the effect will be the same as if you had been able 
to do the modification with virsh net-update.
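
The sequence described above is short enough to script. This is only a sketch in Python, with made-up helper names (restart_network_commands, restart_network); it defaults to a dry run that prints the commands, since actually running them needs root and briefly disconnects the guests on that network.

```python
import subprocess


def restart_network_commands(net):
    """Commands, in order, to restart a network and reattach guest taps."""
    return [
        ["virsh", "net-destroy", net],
        ["virsh", "net-start", net],
        # Restarting libvirtd reconnects the guests' tap devices to the bridge.
        ["systemctl", "restart", "libvirtd.service"],
    ]


def restart_network(net, dry_run=True):
    """Print the commands (dry run) or execute them, stopping on failure."""
    commands = restart_network_commands(net)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands
```

The config change itself (via "virsh net-edit") would be made before running this.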




For reference, the modified root element looks like this:

<network xmlns:dnsmasq='http://libvirt.org/schemas/network/dnsmasq/1.0'>

Re: Bridge and VLAN trunk

2021-03-11 Thread Laine Stump


On 3/11/21 5:53 AM, Gionatan Danti wrote:

Dear list,
I have a question about the best use of bridge, vlan trunk and libvirt.

When dealing with virtual machines bound to a specific vlan, I generally 
use a straightforward approach:

eth -> bridge -> vm (for untagged traffic)
eth -> eth.10 -> bridge -> vm
eth -> eth.nn -> bridge -> vm

Now I am faced with enabling vlan trunking for a specific vm (a 
virtualized firewall). The simpler approach would be:

eth -> bridge -> vm (for the vm needing trunk)
eth -> bridge.10 -> macvtap -> vm

The issue with the above method is that any VM on the main untagged vlan 
needs to be bound to the "plain" bridge, having access to *any* traffic 
of *any* other vlan. If this is ok (and the desired behavior) for the 
firewall, it is clearly wrong for the other VMs.


A simple fix would be to use ebtables to block/drop vlan tagged traffic 
on the main bridge for any virtual adapter except the required one (ie: 
the firewall virtual interface). It works, but I wonder if other 
preferred approaches exist.


For example, I tested another more convoluted setup:
eth -> bridge -> firewall vm
eth -> bridge.10 -> macvtap -> vm
eth -> bridge -> veth0 -> veth1 -> other bridge with vlan filtering on 
-> vm


The last row shows the use of a veth virtual interface, configurable via ip 
link. Enabling vlan filtering on the second bridge (rather than on the 
first) is to keep vlan filtering simple: rather than enabling all 
required vlans on the first bridge, I simply enable only untagged traffic 
on the second one.


Does libvirt support bridge vlan filtering natively? Reading the docs, 
it seems to be supported only on Open vSwitch or SR-IOV based adapters.


That's correct. Support was added to the Linux host-bridge device a few 
years ago for per-port VLAN tagging/filtering, but there hasn't been 
anyone sufficiently compelled (by their own needs or by their altruistic 
instincts) to support that. It likely would be fairly straightforward to 
do once someone dove into it - all the necessary config attributes are 
already there, so it would just involve recognizing and acting on them 
when a guest interface connected to a bridge that was a standard Linux 
host bridge (of course in reality there will likely be some unexpected 
incompatibility that will make it more difficult, but at least *in 
theory* it would be simple).


So, if you can program in C and are willing to dig into the online docs 
for setting per-port attributes for Linux host bridges and implementing 
them (iirc via sending netlink messages) then feel free to start 
hacking, and check in on irc.oftc.net in the #virt channel if you have 
questions. Otherwise, I would recommend installing Open vSwitch. I don't 
have a link handy, but I've seen a few HOWTOs floating around, and 
followed one of them a few years ago to set it up on Fedora and RHEL 
test machines.

Re: Static DHCP lease never distributed

2021-02-10 Thread Laine Stump


On 2/9/21 8:28 PM, Brooks Swinnerton wrote:

Hi there,

I have a libvirt network defined as so:


     customers
     
     


At first I was suspicious that your problem could be related to the 
 (because not many people use it), but I added 
that to my own test, and it still worked.


Same for macTableManager='libvirt' - this has been around a few years 
longer, but I suspect that also nobody uses it (in spite of the promise 
of performance gains, it turns out that if something isn't the default, 
most people just never know that it exists).




     
     
         
             ip='10.0.0.10' />


After reading a bit, I thought maybe we were missing the statement for 
an "empty" range that dnsmasq requires when there are only static 
addresses, but then I tried out a test and it worked. Just out of 
curiosity I looked back through libvirt's history and found that bug 
was fixed in September 2010! (and has remained fixed since then)


(going back up) Then I noticed you have , which is 
also very uncommonly used. So I tried adding that to my config and 
restarting the net. I am *still* unable to get it to fail.




         
     


And that network is attached to a virtual machine:

     
       
       
       
       function='0x0'/>

     

But for some reason when the domain starts, it never gets an address. If 
I tcpdump the bridge that was created by the network I can see it 
sending out discover packets, but dnsmasq never seems to respond:


01:26:25.039987 02:99:92:43:eb:b8 > ff:ff:ff:ff:ff:ff, ethertype IPv4 
(0x0800), length 342: (tos 0x0, ttl 64, id 0, offset 0, flags [none], 
proto UDP (17), length 328)
     0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 
02:99:92:43:eb:b8, length 300, xid 0xc7283f76, secs 228, Flags [none]

           Client-Ethernet-Address 02:99:92:43:eb:b8
           Vendor-rfc1048 Extensions
             Magic Cookie 0x63825363
             DHCP-Message Option 53, length 1: Discover
             Client-ID Option 61, length 7: ether 02:99:92:43:eb:b8
             MSZ Option 57, length 2: 576
             Parameter-Request Option 55, length 7:
               Subnet-Mask, Default-Gateway, Domain-Name-Server, Hostname
               Domain-Name, BR, NTP
             Vendor-Class Option 60, length 3: "d-i"


Despite that appearing to be the correct mac address.

Looking in /var/lib/libvirt/dnsmasq/customers.hostsfile, it's returning 
what I would expect to be there:


02:99:92:43:eb:b8,10.0.0.10,dhcp-test

If I add a  stanza to the configuration, that does appear to 
work, so it seems this is only related to static addresses.


Do you mean that if you add a  that the guests can then get their 
static addresses? Or are they given different (dynamic) addresses within 
the range?




This is libvirtd 5.8.0.


That's a year and a half old, but nothing in this area of the code has 
changed in a way that should affect this behavior.


Of course it could be that your specific kernel isn't properly dealing 
with one of the above "unusual" options I mentioned. If I were in your 
position, I would try removing macTableManager='libvirt', isolated='yes'/>, and  (remember you need to destroy 
and restart the network after any change).


If none of those helps, you may want to start looking into iptables 
rules (although the fact that it works with a range enabled implies that 
there are no iptables issues).

Re: [resent] virt-manager connection fails with 'qemu unexpectedly closed the monitor'

2020-12-22 Thread Laine Stump


On 12/21/20 6:15 AM, John Paul Adrian Glaubitz wrote:

(CC'ing a friend who's run into the problem as well)

Hi Laine!

On 12/20/20 7:37 PM, Laine Stump wrote:

The first step would be to look in /var/log/libvirt/qemu/$guestname.log.
If qemu is encountering some error, it should be logging a message there
before it exits (unless the error is a segfault; in that case I guess
you'll need to look for a coredump). It will also list exactly the qemu
commandline that was used.


Except for some unrelated warnings, qemu does not emit any error message:

[...]



2020-12-21 11:10:37.580+: starting up libvirt version: 6.9.0, package: 1+b2 
(amd64 / i386 Build Daemon (x86-ubc-01) 
 Mon, 07 Dec 2020 09:45:52 +), 
qemu version: 5.2.0Debian 1:5.2+dfsg-2, kernel: 5.9.0-5-amd64, hostname: 
z6.physik.fu-berlin.de


At least you're running pretty much up-to-date versions of both libvirt and 
qemu, so you won't have to suffer through someone saying "come back when 
you've updated your software!" :-)


(although I suppose it's possible you've run into some recent regression 
that I haven't heard of, and all the people who have heard of it aren't 
responding because they've gone off on their Christmas holidays already).




LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
HOME=/var/lib/libvirt/qemu/domain-1-windows7 \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-windows7/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-windows7/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-windows7/.config \
QEMU_AUDIO_DRV=spice \
/usr/bin/qemu-system-x86_64 \
-name guest=windows7,debug-threads=on \
-S \
-object 
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-windows7/master-key.aes
 \
-machine 
pc-q35-5.2,accel=kvm,usb=off,vmport=off,dump-guest-core=off,memory-backend=pc.ram
 \
-cpu 
Broadwell-IBRS,vme=on,ss=on,vmx=on,pdcm=on,f16c=on,rdrand=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaveopt=on,pdpe1gb=on,abm=on,ibpb=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff
 \
-m 4096 \
-object memory-backend-ram,id=pc.ram,size=4294967296 \
-overcommit mem-lock=off \
-smp 2,sockets=2,cores=1,threads=1 \
-uuid bb32a4b5-7b2c-4ac2-9a65-8fd2bb9cdb86 \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=33,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=localtime,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-hpet \
-no-shutdown \
-global ICH9-LPC.disable_s3=1 \
-global ICH9-LPC.disable_s4=1 \
-boot strict=on \
-device 
pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2
 \
-device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 \
-device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 \
-device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 \
-device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7 \
-device 
ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d
 \
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1 \
-device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2 \
-device virtio-serial-pci,id=virtio-serial0,bus=pci.2,addr=0x0 \
-blockdev 
'{"driver":"file","filename":"/data/kvm/windows7.img","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}'
 \
-blockdev 
'{"node-name":"libvirt-1-format","read-only":false,"driver":"raw","file":"libvirt-1-storage"}'
 \
-device ide-hd,bus=ide.0,drive=libvirt-1-format,id=sata0-0-0,bootindex=1 \
-netdev tap,fd=35,id=hostnet0 \
-device e1000e,netdev=hostnet0,id=net0,mac=52:54:00:4b:c5:26,bus=pci.1,addr=0x0 
\
-chardev pty,id=charserial0 \
-device isa-serial,chardev=charserial0,id=serial0 \
-chardev spicevmc,id=charchannel0,name=vdagent \
-device 
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0
 \
-device usb-tablet,id=input0,bus=usb.0,port=1 \
-spice port=5900,addr=127.0.0.1,disable-ticketing,seamless-migration=on \
-device 
qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pcie.0,addr=0x1
 \
-device ich9-intel-hda,id=sound0,bus=pcie.0,addr=0x1b \
-device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \
-chardev spicevmc,id=charredir0,name=usbredir \
-device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 \
-chardev spicevmc,id=charredir1,name=usbredir \
-device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 \
-device virtio-balloon-pci,id=balloon0,bus=pci.3,addr=0x0 \
-sandbox 
on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on


Maybe try running this command dir

Re: [resent] virt-manager connection fails with 'qemu unexpectedly closed the monitor'

2020-12-20 Thread Laine Stump


On 12/20/20 6:53 AM, John Paul Adrian Glaubitz wrote:

Hi!

I recently ran into a problem when connecting to libvirtd 6.9.0 on Debian 
unstable
and trying to import an existing image with Windows 7.

Upon finishing the wizard and starting the instance, the import process fails
with the following error message:

Unable to complete install: 'internal error: qemu unexpectedly closed the 
monitor'

Traceback (most recent call last):
   File "/usr/share/virt-manager/virtManager/asyncjob.py", line 65, in 
cb_wrapper
 callback(asyncjob, *args, **kwargs)
   File "/usr/share/virt-manager/virtManager/createvm.py", line 2081, in 
_do_async_install
 installer.start_install(guest, meter=meter)
   File "/usr/share/virt-manager/virtinst/install/installer.py", line 731, in 
start_install
 domain = self._create_guest(
   File "/usr/share/virt-manager/virtinst/install/installer.py", line 679, in 
_create_guest
 domain = self.conn.createXML(install_xml or final_xml, 0)
   File "/usr/lib64/python3.8/site-packages/libvirt.py", line 4366, in createXML
 raise libvirtError('virDomainCreateXML() failed')
libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor

Since this error message is rather generic, I don't know where to start 
debugging.

Does anyone know how to increase verbosity here to get an error message that 
might be
more helpful?



The first step would be to look in /var/log/libvirt/qemu/$guestname.log. 
If qemu is encountering some error, it should be logging a message there 
before it exits (unless the error is a segfault; in that case I guess 
you'll need to look for a coredump). It will also list exactly the qemu 
commandline that was used. If the message there isn't illuminating 
enough and you come back here, it would be helpful to "bring along" that 
qemu commandline (and any accompanying error) as well as the output of 
"virsh dumpxml $guestname".

Re: DNS forwarding for guest domains on isolated network

2020-11-11 Thread Laine Stump


On 11/11/20 3:40 AM, Jörg Kastning wrote:

Hi @all,

I'm having trouble realizing my use case and hope somebody can help me.

# Use case

For a home lab I want to deploy several guest domains. These domains 
must not have a direct or NAT connection to the internet or my LAN. They 
should only be able to reach my LAN and the internet through a proxy.


# What I've done

I've created the following virtual switch in isolated mode:

$ sudo virsh net-dumpxml private1

   private1
   THE-UUID
   
   
   
   
     
   
     
   


I've setup a guest domain that serves as a proxy and several other guests.

# My issue

Name resolution for *.private1 works fine on this network. But I'm not 
able to resolve domains from the outside world like github.com.


This behavior is intentional:

  https://gitlab.com/libvirt/libvirt/-/commit/513122ae93



I understood that libvirt is forwarding dns resolution requests to the 
hosts nameserver configured in /etc/resolv.conf in case the dnsmasq 
instance for the virtual network is not able to resolve the name.


Not for isolated networks, because a DNS request could be used to break 
out of an isolated network (by using "IP over DNS")




My guess is that in my setup this doesn't work because the virtual switch 
is in isolated mode, right?


When DNS traffic is forwarded by a DNS server, it is at application 
level, not IP level, so any filtering of forwarded traffic on the switch 
is not involved.




# My questions

  * What can I do to achieve my use case described above?

  * Is it possible to use the isolated mode here or do I have to use a 
different mode?


"no-resolv" will always be in the dnsmasq config file for an isolated 
network, and there isn't any way to remove it (other than using a 
different kind of network). And since there is not (as far as I know) a 
different dnsmasq option to counteract a "no-resolv" that's already 
there, you can't eliminate the effect of no-resolv by adding something 
to the conf file with <dnsmasq:options>. A few things to try:


1) try adding a <forwarder> element in the <dns> section of 
the network, pointing to your normal DNS server. Possibly that directive 
to dnsmasq will make a "side run" around the restriction on forwarding 
(this can also have "domain='blah'" added, in which case it only 
forwards requests for names within the 'blah' domain).


   https://libvirt.org/formatnetwork.html#elementsAddress

2) use a  network, but also add in nwfilter rules 
that only allow traffic on the local network.


   https://libvirt.org/formatnwfilter.html

3) again, use , but also manually add a rule to 
the host iptables that rejects all traffic from the guest network 
outbound on the host's egress interface.


It's important that the guest domains could only connect to the internet 
by using the proxy.



Have you tried putting the guests

Re: consume existing tap device when libvirt / qemu run as different users

2020-11-06 Thread Laine Stump


On 11/4/20 6:34 AM, Miguel Duarte de Mora Barroso wrote:

Hello,

I'm having some doubts about consuming an existing - already
configured - tap device from libvirt (with `managed='no' ` attribute
set).

In KubeVirt, we want to have the consumer side of the tap device run
without the NET_ADMIN capability, which requires the UID / GID of the
tap creator / opener to match, as per the kernel code in [0]. As such,
we create the tap device (with the qemu user / group on behalf of
qemu), which will ultimately be the tap consumer.

This leads me to question: why is libvirt opening / calling
`ioctl(..., TUNSETIFF, ...) ` on the tap device when it already exists
- [1] & [2] ? Why can't the tap device (already configured) be left
alone, and let qemu consume it ?

The above is problematic for KubeVirt, since our setup currently has
libvirt running as root (while qemu runs as a different user), which
is preventing us from removing NET_ADMIN (libvirt & qemu run as
different users).



Miguel also brought this question up in the #virt channel on 
irc.oftc.net, and we discussed it a bit there.



I wondered if possibly the uid of the qemu process was irrelevant, since 
it is libvirtd that's opening and configuring the device (and in his 
case libvirtd is running as root, just without the NET_ADMIN capability) 
- as Miguel points out in his references, the kernel allows a process 
running under the same uid as the owner of net device to perform ioctls 
on that device, even if that process doesn't have NET_ADMIN.



Miguel tried setting (actually "leaving" :-)) ownership of the tap 
device to root when it was created; libvirtd was then able to open and 
configure the device; it passed the open file descriptor to qemu 
(running as user qemu), which consumed and used it without problem.



So, the answer is that the pre-created / unmanaged tap/macvtap devices 
must be owned by the same uid as the libvirtd process, *not* the same 
uid as the qemu process, because libvirtd is the process that operates 
on the device itself (qemu just sends and receives on an already-opened 
file descriptor).




Thanks in advance for your time,
Miguel

[0] - 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/tun.c?id=4ef8451b332662d004df269d4cdeb7d9f31419b5#n574

[1] - 
https://github.com/libvirt/libvirt/blob/99a1cfc43889c6d425a64013a12b234dde8cff1e/src/qemu/qemu_interface.c#L453

[2] - 
https://github.com/libvirt/libvirt/blob/v6.0.0/src/util/virnetdevtap.c#L274

Re: Encrypting boot partition Libvirt not showing the OS booting up

2020-10-12 Thread Laine Stump


On 10/12/20 1:10 PM, john doe wrote:

On 10/12/2020 5:14 PM, Michal Privoznik wrote:

On 10/12/20 4:27 PM, john doe wrote:

On 10/12/2020 4:09 PM, Peter Krempa wrote:

On Mon, Oct 12, 2020 at 16:05:43 +0200, Michal Privoznik wrote:

On 10/12/20 2:14 PM, john doe wrote:




I privately sent the requested XML file to Peter Krempa.
Peter Krempa privately answered me back, suggesting to add the
following in the domain xml file:


Solving things privately doesn't help the community.


Additionally it doesn't help solving the problem, since it's now opaque
to others what the problem might be.



 under 


I've suggested this as the outputs I've got privately hinted that the
console (as in virsh console) didn't get to asking for the password,
while the manually-started-qemu did.

Thus the problem actually doesn't have to do with encryption or
whatever, but the console plainly doesn't work.



such as ...

   
     hvm
     
     
   



Try adding:

 /usr/share/seabios/bios.bin


Darn, this should have been sgabios: /usr/share/sgabios/sgabios.bin
but if your seabios is new enough (v1.11.0 and newer) then this is not
needed as seabios itself is capable of serial interface. And looking at
earlier e-mails in the thread you have v1.12.0-1, so you're good and
don't need to add  at all.

But honestly, I don't know why you are not getting the console. Could it
be that you are getting the console and the qemu is waiting for your
input, i.e. what happens if you type in the password?



Nothing happened at all when I tried typing the password.
Yes, so am I; I'm totally lost as to why it does not work.

How can I find the command libvirt is passing to qemu?


The qemu command issued by libvirt can be found at the end of 
/var/log/libvirt/qemu/${guestname}.log




Are you at least able to reproduce the issue (Debian Buster for host and
guest)?

--
John Doe

Re: [libvirt] SRIOV configuration

2020-09-24 Thread Laine Stump

Edward and I have had a multi-day private conversation in IRC on the 
topic of this mail. I was planning to update this thread with an email, 
but forgot until now :-/



On 9/24/20 10:54 AM, Daniel P. Berrangé wrote:

On Mon, Sep 21, 2020 at 06:04:36PM +0300, Edward Haas wrote:

The PCI addresses appearing on the domxml are not the same as the ones
mapped/detected in the VM itself. I compared the domxml on the host
and the lspci in the VM while the VM runs.

Can you clarify what you are comparing here ?

The PCI slot / function in the libvirt XML should match, but the "bus"
number in libvirt XML is just an index referencing the 
element in the libvirt XML.  So the "bus" number won't directly match
what's reported in the guest OS. If you want to correlate, you need
to look at the  on the  to translate the libvirt
"bus" number.



Right. The bus number that is visible in the guest is 100% controlled by 
the device firmware (and possibly the guest OS?), and there is no way 
for qemu to explicitly set it, and thus no way for libvirt to guarantee 
that the bus number in libvirt XML will be what is seen in the guest OS; 
the bus number in the XML only has meaning within the XML - you can find 
which controller a device is connected to by looking for the PCI 
controller that has the same "index" as the device's "bus".
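
That lookup (device "bus" to controller "index") is easy to do by script when there are a lot of controllers. Below is a minimal Python sketch that does it for network interfaces; the function name interface_controllers is made up for illustration, and it assumes normal domain XML as printed by "virsh dumpxml", where the address "bus" is written in hex (e.g. 0x01) and the controller "index" in decimal.

```python
import xml.etree.ElementTree as ET


def interface_controllers(domain_xml):
    """Map each interface MAC to (XML bus number, model of that PCI controller).

    The bus number is only an index into the XML's own list of PCI
    controllers; it is not the bus number the guest OS will report.
    """
    root = ET.fromstring(domain_xml)
    controllers = {
        int(ctrl.get("index")): ctrl.get("model")
        for ctrl in root.findall("./devices/controller[@type='pci']")
    }
    result = {}
    for iface in root.findall("./devices/interface"):
        mac = iface.find("mac")
        addr = iface.find("address[@type='pci']")
        if mac is None or addr is None:
            continue
        bus = int(addr.get("bus"), 16)
        result[mac.get("address")] = (bus, controllers.get(bus))
    return result
```

Knowing which controller a device hangs off of is what lets you follow the topology in the guest, where the bus numbering may come out differently.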






This occurs only when SRIOV is defined, messing up also the other
"regular" vnics.
Somehow, everything comes up and runs (with the SRIOV interface as
well) on the first boot (even though the PCI addresses are not in
sync), but additional boots cause the VM to mess up the interfaces
(not all are detected).



Actually we looked at this offline, and the "messing up" that's 
occurring is not due to any change in PCI address from one boot to the 
next. The entire problem is caused by the guest OS using traditional 
"eth0" and "eth1" netdev names, and making the incorrect assumption that 
those names are stable from one boot to the next. In fact, it is a 
long-known problem that, due to a race between kernel code initializing 
devices and user processes giving them names, the ordering of ethN 
device names can change from one boot to the next *even with completely 
identical hardware and no configuration changes*. Here is a good 
description of that problem, and of systemd's solution to it 
("predictable network device names"):



https://www.freedesktop.org/wiki/Software/systemd/PredictableNetworkInterfaceNames/


Edward's inquiry was initiated by this bugzilla:


  https://bugzilla.redhat.com/show_bug.cgi?id=1874096


You can see in the "first boot" and "second boot" ifconfig output that 
the same ethernet device has the "altname" enp2s1, and the same device 
has the altname enp3s0 during both runs; these names are given by 
systemd's "predictable network device name" algorithm (which bases the 
netdev name on the PCI address of the device). But the race between 
kernel and userspace causes the "ethN" names to be assigned differently 
during one boot and the next.
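To make the link between PCI address and name concrete, here is a small Python sketch of the path-based part of that naming scheme. It is a simplification under stated assumptions: it only handles plain PCI ethernet devices, ignores onboard/hotplug-slot names and other special cases, and the function name pci_path_name and the example addresses are hypothetical.

```python
def pci_path_name(pci_address, multifunction=False):
    """Approximate the path-based name for a PCI ethernet device.

    pci_address is the guest-visible address in 'dddd:bb:ss.f' form (hex).
    Simplified sketch: bus and slot are rendered in decimal, the domain is
    only included when non-zero, and the function suffix is only added for
    a non-zero function or a multifunction device.
    """
    domain_s, bus_s, rest = pci_address.split(":")
    slot_s, func_s = rest.split(".")
    domain, bus, slot, func = (
        int(x, 16) for x in (domain_s, bus_s, slot_s, func_s)
    )
    name = "en"
    if domain:
        name += f"P{domain}"
    name += f"p{bus}s{slot}"
    if func or multifunction:
        name += f"f{func}"
    return name


# Hypothetical guest-side addresses that would yield the two names above.
print(pci_path_name("0000:02:01.0"))
print(pci_path_name("0000:03:00.0"))
```

Because the name depends only on where the device sits on the bus, it comes out the same on every boot regardless of which device the kernel happens to probe first.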



In order to have predictable netdev names, the OS image needs to stop 
setting net.ifnames=0 on the kernel command line. If they like, they can 
give their own more descriptive names to the devices (methods are 
described in the above systemd document), but they need to stop relying 
on ethN device names.



(note that this experience did uncover another bug in libvirt, which 
*might* contribute to the racy code flip flopping from boot to boot, but 
still isn't the root cause of the problem - in this case libvirtd is 
running privileged, but inside a container, and the container doesn't 
have full access to the devices' PCI config data in sysfs (you can see 
this when you run "lspci -v" inside the container; you'll notice 
"Capabilities: <access denied>"). One result of this is that libvirt 
mistakenly determines the VF is a conventional PCI device (not PCIe), so 
it auto-adds a pcie-to-pci-bridge, and plugs the VF into that 
controller. I'm guessing that makes device initialization take slightly 
longer or something, changing the results of the race. I'm looking into 
changing the test for PCIe vs. conventional PCI, but again that isn't 
the real problem here)




This is what the domxml hostdev section looks like:
```
 
   
   
 
   
   
   
 
```

Is there something we are missing or we misconfigured?
Tested with 6.0.0-16.fc31

My second question is: Can libvirt avoid accessing the PF (as we do
not need mac and other options).

I'm not sure, probably a question for Laine.



The entire point of <interface type='hostdev'> is to be able to set the 
MAC address (and optionally the vlan tag) of a VF when assigning it to a 
guest, and the only way to set those is via the PF. If you use plain 
<hostdev>, then libvirt has no idea that the device is a VF, so it 
doesn't look for or try to access its PF.



So, you're doing the right thing - since your container has no access to 
the PF, you need to set the MAC address / vlan tag outside the container

Re: cant create network with virt-manager

2020-09-21 Thread Laine Stump


On 9/21/20 3:58 PM, Marko Horn wrote:


Hello list,
I can't create an isolated network with virt-manager.

installed version virt-manager 3.0.0
installed version libvirt 6.2.0

output in error-message:

Error creating virtual network: internal error: Failed to apply firewall 
rules /sbin/iptables -w --table filter --insert LIBVIRT_INP 
--in-interface virbr3 --protocol tcp --destination-port 67 --jump 
ACCEPT: iptables: No chain/target/match by that name.


I may be remembering incorrectly (I think there might have been a 
similar bug of the same vintage), but it is possibly caused by this bug:


https://bugzilla.redhat.com/show_bug.cgi?id=1813830

It was fixed upstream in libvirt-6.4.0. If you're building from upstream 
yourself, then grab the latest master. If you're running a downstream 
distro build, ask them to backport the patches detailed in the above 
(Fedora) bug report.






Traceback (most recent call last):
   File "/usr/share/virt-manager/virtManager/asyncjob.py", line 66, in 
cb_wrapper

     callback(asyncjob, *args, **kwargs)
   File "/usr/share/virt-manager/virtManager/createnet.py", line 428, in 
_async_net_create

     netobj.create()
   File "/usr/lib/python3.7/site-packages/libvirt.py", line 3174, in create
     if ret == -1: raise libvirtError ('virNetworkCreate() failed', 
net=self)
libvirt.libvirtError: internal error: Failed to apply firewall rules 
/sbin/iptables -w --table filter --insert LIBVIRT_INP --in-interface 
virbr3 --protocol tcp --destination-port 67 --jump ACCEPT: iptables: No 
chain/target/match by that name


any ideas?

thank you
marko

Re: libvirt binding

2020-09-09 Thread Laine Stump


On 9/9/20 3:08 AM, Shashwat shagun wrote:

is the connection object a connection pool or just a single connection?
Can it be used concurrently?


libvirt is fully threadsafe (modulo any unknown errors, of course!). 
There can be many connections at once, from the same or from different 
processes on the same or different machines.

Re: Isolated bridge does not bridge

2020-09-09 Thread Laine Stump


On 9/9/20 7:13 AM, Paul van der Vlis wrote:

Hello,

I want to do some testing and I have removed two VMs from the bridge
that connects them to the internet, and added them to another isolated
bridge that's not connected to the internet. The problem is that I cannot
reach the other host in the isolated network.

Something like this:

virsh shutdown kvm66
virsh shutdown kvm68

brctl delif br0 vnet10 vnet6  # the interfaces of kvm66 and kvm68
brctl addbr br1
brctl addif br1 vnet10 vnet6


The delif and addif commands won't do anything if the guests are not 
running (you've done "virsh shutdown", but that will either take some 
time, or never be honored, depending on how the guest OS deals with 
ACPI, I think).





Then I've replaced br0 to br1 in the XML of both VM's with "virsh edit".


Just be certain that each guest is completely inactive (doesn't 
show up in the output of "virsh list") either when you edit, or at some 
point after you've edited it (i.e. there must be a complete "virtual 
powercycle" of the guest for the changes to take effect).




Then I did start the VM's using the serial console (no network):
virsh start --console kvm66
virsh start --console kvm68

I cannot ping from one machine to the other. Why??


I guess you're using  ... right?

Since the bridge devices were created and are managed outside libvirt's 
control, you need to do more than just create a bridge to get the 
connected guests talking to each other. In particular, if the guests are 
getting their IP addresses from DHCP, then you need to assign an IP 
address to the bridge device, and run a DHCP server that is listening on 
the bridge. (I'm curious what you used as the argument of the ping 
command, if the guests didn't have an IP address...)


(Aside from that, a bridge created with brctl will disappear when the 
host is rebooted, and not be recreated until you again enter the commands.)


If you want a simple way to create a bridge, start a dnsmasq instance to 
serve DHCP, and add iptables rules to prevent the guests from breaking 
out of the isolated bridge, *and* as a bonus *re*create all of that 
every time you reboot the host, you can create an isolated libvirt 
virtual network, with a config file like the one here:



https://libvirt.org/formatnetwork.html#examplesPrivate

(editing to your taste for bridge name and IPv4 and IPv6 addresses). Put 
that in a file (e.g. net.xml) and run (as root) "virsh net-define 
net.xml; virsh net-start private; virsh net-autostart private".
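For reference, here is a minimal Python sketch that generates a net.xml of that general shape: a network with a bridge and an IPv4 address with a DHCP range, and no forward element, which is what makes it isolated. The network name "private" matches the virsh commands above; the bridge name and all addresses are placeholder assumptions to be edited to taste, and the helper name isolated_network_xml is hypothetical.

```python
import xml.etree.ElementTree as ET


def isolated_network_xml(name, bridge, ip, netmask, dhcp_start, dhcp_end):
    """Build XML for a libvirt virtual network with no <forward> element."""
    net = ET.Element("network")
    ET.SubElement(net, "name").text = name
    ET.SubElement(net, "bridge", name=bridge)
    ip_el = ET.SubElement(net, "ip", address=ip, netmask=netmask)
    dhcp = ET.SubElement(ip_el, "dhcp")
    ET.SubElement(dhcp, "range", start=dhcp_start, end=dhcp_end)
    return ET.tostring(net, encoding="unicode")


# Placeholder values; only the name "private" comes from the commands above.
xml_text = isolated_network_xml(
    "private", "virbr2",
    "192.168.152.1", "255.255.255.0",
    "192.168.152.2", "192.168.152.254",
)
print(xml_text)
```

Writing the printed text to net.xml gives a file that can be passed to "virsh net-define" as described above.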


Then define your guest interfaces with this:

   
 
 ...

Re: network config not working on newer libvirt

2020-09-08 Thread Laine Stump

On 9/6/20 12:02 PM, daggs wrote:

Greetings Laine,

When you say "the vm", you mean the one running libreelec, that is
trying to get an IP address, correct?

yes, you are correct.

I guess Broadcom.home is the IP of the VM that's running the dhcp
server? (I should have suggested using "tcpdump -n -e -v" :-/)

frankly, I have no idea who is Broadcom.home.

It's just some name tcpdump used to replace the IP address of one of the 
machines, and since it's the source IP of a DHCP reply packet, it most 
likely is the IP of the DHCP server.

here is the requested dump: https://dpaste.com/849DMX9ND

What I see in that dump is that the DHCP client (MAC address 
52:54:00:5a:4c:8c, hostname "streamer") repeatedly sends the exact same 
DHCP request (6 times), and the DHCP server responds to each of these 
requests (alternating between sending the response to the client's MAC 
with a destination IP already set, and to the broadcast MAC + IP 
addresses), interspersed with several ARP requests directed at the MAC 
address of the client asking who has the IP that the server just 
suggested (so it's doing something different from what I described in my 
previous message - rather than using ARP to verify that an IP isn't 
already in use prior to assigning it, it's assuming it has full 
authority over IP addresses in the broadcast domain, assigning that IP 
to the client without checking for prior use, and then sending the ARP 
request to see if the client actually decided to use it.)

Eventually the client gives up (because it hasn't seen any valid DHCP 
responses) and gives itself an IP on the 169.254.0.0/16 network, then 
goes about the process of looking for other devices to connect to using 
that IP.
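A quick way to recognize that fallback when looking at a guest's address is to test whether it falls in the 169.254.0.0/16 range. This is a small Python sketch using only the standard library; the helper name is_self_assigned and the sample addresses are illustrative.

```python
import ipaddress

# The self-assigned (link-local) IPv4 range mentioned above.
LINK_LOCAL_V4 = ipaddress.ip_network("169.254.0.0/16")


def is_self_assigned(addr):
    """True if addr is in the range a client uses after DHCP has failed."""
    return ipaddress.ip_address(addr) in LINK_LOCAL_V4


print(is_self_assigned("169.254.12.7"))
print(is_self_assigned("10.0.0.40"))
```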

Was this dump taken on the host of the tap device of the client 
(libreelec aka streamer)? If so, I can only see two options: 1) there is 
something in iptables or ebtables (or nftables, if you have that on the 
host) blocking the DHCP response packets from going out the tap 
interface, or 2) there is something in the guest itself blocking the 
traffic or preventing the packet from passing.

For (1) you'd need to run "ebtables -L; iptables -S; nft list ruleset" 
and look for something suspicious.

For (2) can you try changing both the libreelec and the DHCP server vm's 
ethernet device models from virtio to e1000? (or e1000e if they are q35 
machinetypes)? If that works, then change one or the other back and see 
if it stops working.

should I add another nic with static ip and try to trace the pkts 
from there?

You mean so you can ssh to the client/libreelec and run tcpdump there 
against the interface that's doing dhcp? Is tcpdump even available on 
libreelec? I know it's very limited, and has no simple facilities for 
adding new packages. If it has tcpdump though, then sure. The only 
problem is that you would probably not be able to get tcpdump running 
via that interface quick enough to see the initial boottime dhcp 
exchange; instead you'll probably need to go into the UI and bring the 
other interface down/up to trigger a new DHCP cycle.

(BTW, if everything works when the client has a static IP address, then 
that proves there is no problem related to ARP requests/responses - that 
much is required in order for even a static IP to work)

Re: network config not working on newer libvirt

2020-09-05 Thread Laine Stump


On 9/4/20 6:47 PM, daggs wrote:

Greetings Laine,



I would start troubleshooting by making sure that the dhcp server is
running, and that you can communicate between the machine with DHCP
server and the guest once a manual IP is assigned. Then use tcpdump or
wireshark at different places on the path between those two to see how
far the DHCP request is getting out, whether a response is being sent by
the server, and if so how far the response is getting back (i.e. on the
host, run tcpdump on the guest's tap device; if you see the DHCP request
there, then run tcpdump on the bridge, if you see it there, run it on
the tap device for the guest, if you see it there, then run tcpdump
inside the guest; then check the dhcp server logs to see if it's
receiving requests). While you're doing all of this, you can also be
noticing whether or not a DHCP response is arriving at each step (and if
you see the response, you can skip looking further ahead in the packet
path, since you know by inference that it made it all the way to the
DHCP server). Once you find the point that the packet is blocked, you'll
be better able to determine why.




alright, I'll try that, thanks.



I've run tcpdump on the vm's tap device, here is what I see:


When you say "the vm", you mean the one running libreelec, that is 
trying to get an IP address, correct?



01:42:15.404754 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:5a:4c:8c (oui Unknown), length 548
01:42:15.405075 IP Broadcom.Home.bootps > 10.0.0.40.bootpc: BOOTP/DHCP, Reply, length 300


I guess Broadcom.home is the IP of the VM that's running the dhcp 
server? (I should have suggested using "tcpdump -n -e -v" :-/)




01:42:15.735893 STP 802.1d, Config, Flags [none], bridge-id 8000.52:54:00:6b:1b:92.8003, length 35
01:42:17.718941 STP 802.1d, Config, Flags [none], bridge-id 8000.52:54:00:6b:1b:92.8003, length 35
01:42:17.846918 IP6 fe80::fc54:ff:fe5a:4c8c > ff02::2: ICMP6, router solicitation, length 16
01:42:19.702944 STP 802.1d, Config, Flags [none], bridge-id 8000.52:54:00:6b:1b:92.8003, length 35
01:42:20.450441 ARP, Request who-has 10.0.0.40 tell Broadcom.Home, length 28

I think the issue is this:
01:42:20.450441 ARP, Request who-has 10.0.0.40 tell Broadcom.Home, length 28

I'm not sure if this is expected but looks like my dhcp server ignores it.
any thoughts on the matter?


It looks strange, but is normal. What usually happens is this:

1) The guest sends a DHCP Discover Request, suggesting that it would 
like to use the address 10.0.0.40. (These details will be revealed once 
you add "-v" to your tcpdump commandline.)



2) The DHCP server says to itself "Hmm, this guy wants to use 10.0.0.40, 
which is okay with me, but first I should see if someone else is using 
it", so it sends out an ARP request for 10.0.0.40. Then just to be sure, 
it sends another.


(at this point, if the server is dnsmasq and it hasn't received an ARP 
response, for some reason it sends an ICMP echo request to 10.0.0.40 (the 
requested/suggested IP) with destination MAC address of the client that 
just sent the DHCP request. No idea why. It won't be answered though 
(unless the client actually still had a lease on that address and was 
just renewing; but the DHCP server would know it if that's what was 
happening, so...))


3) If the server doesn't receive any response to the ARP request, then 
it will send a DHCP response to the requested IP + client MAC saying 
"Yes, you can use that IP address."


4) I'm not sure why (because it's been > 20 years since I last read the 
DHCP RFC), but in the case I just looked at on my host (which is using 
dnsmasq as the server, and dhclient as the client), the same request and 
response are sent/received at the same IP+MAC addresses a 2nd time.


5) at this point everybody agrees on the new IP address, the client sets 
its IP address, the server updates its leases table, and life carries on.


But to back up for a minute - it's completely normal for the DHCP server 
to send out an ARP request and get no response. I think things are going 
south sometime after that. Are you seeing a DHCP reply at all? If you 
don't see it on the libreelec (client) machine's tap device, check if 
you see it going *out* on the DHCP server's tap. If it's not there, then 
you'll need to debug inside the guest running the DHCP server.


Before this packet is received, the guest doesn't yet know that's its IP 
address, but it does know that's its MAC address, and it's waiting for a 
DHCP reply, so it takes the info from the reply, then sends another 
request, this time including all the options it received in the first reply.


4) Now

Re: network config not working on newer libvirt

2020-09-04 Thread Laine Stump


On 9/4/20 12:38 AM, daggs wrote:

Greetings,

up until a year ago, I was running a server with Debian 10 (stable) on it with 
the latest versions of libvirt, qemu and kernel 4.19.x Debian 10 had to offer 
(both libvirt and qemu versions were really old).

the network config was simple, one of the vm acted as a router and provided the 
ip for both the host and the vm.
I've recently switched distro and now I'm running latest stable libvirt, qemu 
and kernel 5.4.x


You haven't said which distro, nor what the exact libvirt version is 
(probably won't matter in this case, but in general "libvirt 
x.y.z" is more useful than "latest stable libvirt").


If you have full connectivity once you've manually assigned IP 
addresses, then you don't have any routing problems, so that can be 
counted out. (Anyway, DHCP packets never go beyond the local network).


In that case, you've most likely either got a firewall problem on host 
or guest, or a problem with your dhcp server.




I've tried to reinstate the network config on the new distro but I cannot get 
ip via dhcp for the second vm.
if I assign manual ip and gateway, I have access to the outside world.


From where? The host? or the guests?



here are the relevant dumps:
network on the router vm:
 
   
   
   
   
   
 

the other vm
 
   
   
   
   
   
 

and finally:

   default
   61bc1a72-bd02-408a-b88e-dec696742c20
   
   



Your config is for a bridge that's created by libvirt, but with no 
iptables rules, no dnsmasq instance, and no IP address on the host. So 
any DHCP server config is outside libvirt's realm, as are any iptables 
or nftables rules, so in this case there is nothing to look at in the 
libvirt config for either of these issues.


I would start troubleshooting by making sure that the dhcp server is 
running, and that you can communicate between the machine with DHCP 
server and the guest once a manual IP is assigned. Then use tcpdump or 
wireshark at different places on the path between those two to see how 
far the DHCP request is getting out, whether a response is being sent by 
the server, and if so how far the response is getting back (i.e. on the 
host, run tcpdump on the guest's tap device; if you see the DHCP request 
there, then run tcpdump on the bridge, if you see it there, run it on 
the tap device for the guest, if you see it there, then run tcpdump 
inside the guest; then check the dhcp server logs to see if it's 
receiving requests). While you're doing all of this, you can also be 
noticing whether or not a DHCP response is arriving at each step (and if 
you see the response, you can skip looking further ahead in the packet 
path, since you know by inference that it made it all the way to the 
DHCP server). Once you find the point that the packet is blocked, you'll 
be better able to determine why.
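When repeating the capture at each point along the path, it can help to reduce the tcpdump text output to "did I see a request, did I see a reply". The following Python sketch does that for lines in tcpdump's default one-line format; the helper names summarize_capture and diagnose are hypothetical, and the matching is deliberately crude (it only looks for the "BOOTP/DHCP, Request" and "BOOTP/DHCP, Reply" markers).

```python
import re

# Matches the DHCP marker in tcpdump's default text output.
DHCP_LINE = re.compile(r"BOOTP/DHCP, (Request|Reply)")


def summarize_capture(lines):
    """Count DHCP requests and replies seen at one capture point."""
    counts = {"Request": 0, "Reply": 0}
    for line in lines:
        match = DHCP_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def diagnose(counts):
    """Suggest where to look next, following the procedure above."""
    if counts["Request"] == 0:
        return "no request seen here: capture closer to the client"
    if counts["Reply"] == 0:
        return "request seen but no reply: capture closer to the server"
    return "request and reply both seen at this point"


# Sample lines in the format tcpdump prints on a tap device.
sample = [
    "01:42:15.404754 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: "
    "BOOTP/DHCP, Request from 52:54:00:5a:4c:8c (oui Unknown), length 548",
    "01:42:15.405075 IP Broadcom.Home.bootps > 10.0.0.40.bootpc: "
    "BOOTP/DHCP, Reply, length 300",
    "01:42:20.450441 ARP, Request who-has 10.0.0.40 tell Broadcom.Home, "
    "length 28",
]
counts = summarize_capture(sample)
print(counts, "->", diagnose(counts))
```

Note that the ARP line is not counted, since it is not a DHCP packet.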





as it is possible I'm missing a kernel config, here is the output of lsmod:
vfio_pci               49152  6
vfio_virqfd            16384  1 vfio_pci
vfio_iommu_type1       32768  2
vfio                   28672  16 vfio_iommu_type1,vfio_pci
ip6table_nat           16384  1
iptable_nat            16384  1
ebtables               24576  0
bridge                143360  0
stp                    16384  1 bridge
llc                    16384  2 bridge,stp
cfg80211              647168  0
x86_pkg_temp_thermal   20480  0
kvm_intel             237568  8
vhost_net              24576  3
vhost                  36864  1 vhost_net
tap                    24576  1 vhost_net
kvm                   663552  1 kvm_intel
tun                    53248  8 vhost_net
r8152                  73728  0
nct6775                57344  0
mei_me                 32768  0
hwmon_vid              20480  1 nct6775
irqbypass              16384  22 vfio_pci,kvm
mii                    16384  1 r8152
mei                    77824  1 mei_me
coretemp               16384  0
efivarfs               16384  1

and the .config at https://dpaste.com/9ZUCBDE9R

any ideas how to fix it?

Thanks,

Dagg.

Re: bridge + SR-IOV guests with KVM

2020-09-01 Thread Laine Stump

On 8/27/20 4:12 AM, Philipp Rosenberger wrote:

Hi,

I managed to get SR-IOV with an Intel I350 NIC to work. For this I
followed the documentation on this page:
https://wiki.libvirt.org/page/Networking#Assignment_from_a_pool_of_SRIOV_VFs_in_a_libvirt_.3Cnetwork.3E_definition

But as I have more VMs than VFs on the NIC I also have a bridge which
serves the other guests. As I run Debian Buster as host I followed the
documentation here:

https://wiki.libvirt.org/page/Networking#Debian.2FUbuntu_Bridging

Do I correctly assume that you're attaching the PF of the SRIOV card to
the bridge? (in particular, the PF of the same port that the VFs are from)

If I use only the SRIOV everything works as expected. All guests can be
reached from the network and the guests and host can reach each other.
The same goes for a sole bridged environment.

I think you missed a paragraph here - I'm inferring that at this point
you meant to say "But when I have both guests connected with an assigned
VF and guests connected via a virtio-net device connected to the bridge
via a tap device, the VF-connected guests cannot communicate with the
bridge-connected guests." Is that correct, or am I inferring too much?

As I dived into the issue I found an answer from the Intel community:
https://community.intel.com/t5/Ethernet-Products/82599-VF-to-Linux-host-bridge/td-p/351802

By default all the ports on a Linux host bridge should have flood and
learning turned on, and I would have thought that (if manually adding a
mac address to the linux host bridge is enough to make the traffic flow)
having flood+learning turned on would be enough to get the bridge
to learn the proper port for traffic destined to the VF.

Really, my first assumption would have been that the switch screwing up
was the switch in the SRIOV card incorrectly sending traffic for the
bridged guests directly out the PF's physical port rather than to
the Linux host bridge, and that the solution to make everything
work would be to somehow add the MAC address of the *bridged* guests
into the fdb of the SRIOV card.

[at this point I read further through your message and follow the link
to the slides...]

Ooohh. from slide 33, I see that *is* what's being done. It sounds
like that's what you *are* doing - adding an fdb entry to the internal
switch in the SRIOV card, *not* to the Linux host bridge (as I had
thought right up until 3 paragraphs ago), correct?

It again is interesting though - makes it sound like the switch in the
SRIOV card has no learning and no flood.

It says I need to add the "VF mac addresses and eth0 mac address to
bridge forwarding database".

That definitely is what's said in the comments from the Intel forum. But
isn't that the opposite of what they're saying on slide 33 of the
presentation you linked to? My understanding is that it's saying that
you need to add the *bridged* guest interface's MAC address to the fdb
on the SRIOV card's switch.

I have done this with the following command:
bridge fdb add 52:54:00:3c:1c:e6 dev eth_lan0

The mac address is from my VM which is on the bridge. And the eth_lan0
interface is the physical interface of my bridge and also the PF of my
I350 NIC.

Ah, okay, this verifies my first assumption (that the bridge is attached
to the PF). It also verifies that you did what's suggested in slide 33
of the presentation, not what was suggested in the Intel forum post. Is
that correct?

This seems to work. But doing this manually is annoying and a bit of a
hassle when creating new VMs on the bridge.

Is there a way to let libvirt do this work?

Well, if you create a libvirt network for your bridge device, something
like this:

bridgenet

(this is an "unmanaged" network, i.e. libvirt expects the bridge to
already exist, doesn't add any iptables rules or dnsmasq instance - it
just creates tap devices and connects them to the already-existing
bridge), and then configure all your guest interfaces with:

...

then a network hook will be called each time one of these interfaces is
added or removed, and that script can add/remove the fdb entry. Hooks
are described here:

https://libvirt.org/hooks.html

If you have an executable file named /etc/libvirt/hooks/network (or a
file of your own naming in /etc/libvirt/hooks/network.d if your libvirt
is 6.5.0 or newer) it will be called anytime a guest interface is added
or removed from a libvirt network. The arguments will describe the
current action, and stdin of the script will receive the full XML config
of the network, as well as the xml for a , which contains
the interface's MAC address among other things. This should be enough
information to derive the proper "bridge fdb add" command. (you'll want
a similar clause in the same hook script that takes action when the
interface is removed from the network).
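As a starting point, here is a minimal Python sketch of the logic such a hook could use to turn its inputs into a "bridge fdb" command. Several things here are assumptions to verify against hooks.html for your libvirt version: the operation names ("plugged"/"unplugged" and "port-created"/"port-deleted"), the layout of the XML on stdin (a wrapper element containing either an interface or a networkport element with a mac child), and the sample XML itself. The device name eth_lan0 and the MAC address are the ones from the command earlier in this thread; the function names are hypothetical. The sketch only builds the command and leaves running it to the caller.

```python
import xml.etree.ElementTree as ET

UPLINK_DEV = "eth_lan0"  # PF attached to the bridge, as named in this thread

# Assumed operation names; check hooks.html for the ones your version uses.
ADD_OPS = {"plugged", "port-created"}
DEL_OPS = {"unplugged", "port-deleted"}


def guest_mac(hook_xml):
    """Extract the guest interface's MAC from the XML passed on stdin."""
    root = ET.fromstring(hook_xml)
    # Look only under the per-interface element, so that a <mac> belonging
    # to the network definition itself is not picked up by mistake.
    for path in (".//interface/mac", ".//networkport/mac"):
        mac = root.find(path)
        if mac is not None and mac.get("address"):
            return mac.get("address")
    return None


def fdb_command(operation, hook_xml, dev=UPLINK_DEV):
    """Return the 'bridge fdb' argv for this hook call, or None."""
    if operation in ADD_OPS:
        verb = "add"
    elif operation in DEL_OPS:
        verb = "del"
    else:
        return None
    mac = guest_mac(hook_xml)
    if mac is None:
        return None
    return ["bridge", "fdb", verb, mac, "dev", dev]


# Hypothetical stdin contents, invented for illustration.
SAMPLE_XML = """
<hookData>
  <interface type='network'>
    <mac address='52:54:00:3c:1c:e6'/>
    <source network='bridgenet'/>
  </interface>
  <network>
    <name>bridgenet</name>
    <mac address='52:54:00:00:00:01'/>
  </network>
</hookData>
"""
print(fdb_command("plugged", SAMPLE_XML))
print(fdb_command("unplugged", SAMPLE_XML))
```

In a real hook script the operation would come from sys.argv[2], the XML from sys.stdin.read(), and the returned list could be handed to subprocess.run().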

If you have trouble with the script, you can look

Re: support for live migration with PCI passthrough devices

2020-08-25 Thread Laine Stump


On 8/25/20 8:56 AM, Michal Privoznik wrote:

On 8/25/20 1:40 PM, Henry lol wrote:

Hi guys,

I'm wondering whether libvirt supports live migration for the VM with 
PCI passthrough devices.
or it must be assumed before live migration that all passthrough 
devices be unplugged?


Unfortunately, this is still not supported. The problem is that PCI 
devices themselves are not capable of dumping their internal state and 
restoring from it on destination.


There is a long thread started last month that discusses what the 
interface should look like, but at this point I guess we are still far 
away from it:


https://www.redhat.com/archives/libvir-list/2020-July/msg00675.html



If so, all unplugged devices should be manually hot-plugged to the VM 
after migration??


This is the usual mode of operation, yes.


There is one exception to this - if the device to be migrated is the VF 
of an SRIOV-capable network card, then there is machinery in qemu to 
automatically unplug the device before migration starts, and then 
automatically plug in a new similar device on the other end after the 
guest has started there.


This all works in concert with a virtio-net emulated network device that 
is bonded together in the guest (linux only) to provide a single network 
device in the guest - when the VF is visible and working, it will be 
preferred for all traffic, but if the VF is missing then the virtio-net 
emulated device will be used.


The result is that under normal operation the vfio-assigned VF is used 
for all network traffic, but during migration when the VF has been 
unplugged, the virtio-net device is used, so there is no disruption in 
network traffic, just a bit of degraded performance during migration.


Here is a description of how this is configured in a libvirt domain 
definition:



https://www.libvirt.org/formatdomain.html#teaming-a-virtio-hostdev-nic-pair


This is all fairly new, and unfortunately doesn't work properly for all 
SRIOV cards.

Re: multiple vms with same PCI passthrough

2020-08-18 Thread Laine Stump


On 8/18/20 3:09 PM, Laine Stump wrote:

On 8/17/20 8:40 PM, Daniel Black wrote:



This, for 4 pci devices, confused libvirtd in the meantime; however, it 
was still functional.


Aug 18 10:31:27 grit libvirtd[106082]: internal error: failed to get 
number of host interfaces: unspecified error - errors in loading some 
config files
Aug 18 10:31:55 grit libvirtd[106082]: internal error: failed to get 
number of host interfaces: unspecified error - errors in loading some 
config files
Aug 18 10:32:17 grit libvirtd[106082]: internal error: failed to get 
number of host interfaces: unspecified error - errors in loading some 
config files
Aug 18 10:32:32 grit libvirtd[106082]: internal error: failed to get 
number of host interfaces: unspecified error - errors in loading some 
config files


Unless you are assigning network devices, those errors are unrelated 
(and even then I'm not sure how they would be related). Those errors are 
saying that the very old and mysterious augeas + xslt parser of ifcfg 
files in the netcf package encountered an error while parsing said 
files. You *might* be able to get a better idea of the source of the 
problem by running "NETCF_DEBUG=1 ncftool list --all", but in the end 
it's not going to affect assignment of devices to guests using 




Oh, I just noticed that you're running on ubuntu. That means the error 
is coming from an equally old and mysterious augeas + xslt parser of 
*/etc/network/interfaces*, not ifcfg files. The same comment otherwise 
applies though.

Re: multiple vms with same PCI passthrough

2020-08-18 Thread Laine Stump


On 8/17/20 8:40 PM, Daniel Black wrote:



This, for 4 pci devices, confused libvirtd in the meantime; however, it was 
still functional.


Aug 18 10:31:27 grit libvirtd[106082]: internal error: failed to get 
number of host interfaces: unspecified error - errors in loading some 
config files
Aug 18 10:31:55 grit libvirtd[106082]: internal error: failed to get 
number of host interfaces: unspecified error - errors in loading some 
config files
Aug 18 10:32:17 grit libvirtd[106082]: internal error: failed to get 
number of host interfaces: unspecified error - errors in loading some 
config files
Aug 18 10:32:32 grit libvirtd[106082]: internal error: failed to get 
number of host interfaces: unspecified error - errors in loading some 
config files


Unless you are assigning network devices, those errors are unrelated 
(and even then I'm not sure how they would be related). Those errors are 
saying that the very old and mysterious augeas + xslt parser of ifcfg 
files in the netcf package encountered an error while parsing said 
files. You *might* be able to get a better idea of the source of the 
problem by running "NETCF_DEBUG=1 ncftool list --all", but in the end 
it's not going to affect assignment of devices to guests using

Re: multiple vms with same PCI passthrough

2020-08-17 Thread Laine Stump


On 8/8/20 11:53 PM, Daniel Black wrote:


In attempting to isolate vfio-pci problems between two different guest 
instances, the creation of a second guest (with existing guest shutdown) 
resulted in:


Aug 09 12:43:23 grit libvirtd[6716]: internal error: Device :01:00.3 
is already in use
Aug 09 12:43:23 grit libvirtd[6716]: internal error: Device :01:00.3 
is already in use
Aug 09 12:43:23 grit libvirtd[6716]: Failed to allocate PCI device list: 
internal error: Device :01:00.3 is already in use


Hmm. Normally the error that would be logged if a device is already in 
use would say something like this:


error: Failed to start domain Win10-GPU
error: Requested operation is not valid: PCI device :05:00.0 is in
   use by driver QEMU, domain F30

So you're encountering this in an unexpected place.



Compiled against library: libvirt 6.1.0
Using library: libvirt 6.1.0
Using API: QEMU 6.1.0
Running hypervisor: QEMU 4.2.1
(fc32 default install)

The upstream code also seems to test definitions rather than active 
uses of the PCI device.



That isn't the case. You're misunderstanding what devices are on the 
list. (see below for details)




My potentially naive patch to correct this (but not the failing test 
cases) would be:


diff --git a/src/util/virpci.c b/src/util/virpci.c
index 47c671daa0..a00c5e6f44 100644
--- a/src/util/virpci.c
+++ b/src/util/virpci.c
@@ -1597,7 +1597,7 @@ int
  virPCIDeviceListAdd(virPCIDeviceListPtr list,
                      virPCIDevicePtr dev)
  {
-    if (virPCIDeviceListFind(list, dev)) {
+    if (virPCIDeviceBusContainsActiveDevices(dev, list)) {
          virReportError(VIR_ERR_INTERNAL_ERROR,
                         _("Device %s is already in use"), dev->name);
          return -1;

Is this too simplistic or undesirable a feature request/implementation?


Only devices that are currently in use by a guest (activePCIHostdevs), 
or that libvirt is in the process of detaching from the guest + vfio and 
rebinding to the device's host driver (inactivePCIHostdevs) are on 
either list of PCI devices maintained by libvirt. Once a device is 
completely detached from the guest and (if "managed='yes'" was set in 
the XML config) rebound to the natural host driver for the device, it 
is removed from the list and can be used elsewhere.


I just tested this with an assigned GPU + soundcard on two guests to 
verify that it works properly. (I'm running the latest upstream master 
though, so it's not an exact replication of your test)





I'd be more than grateful if someone carries this through as I'm unsure 
when I may get time for this.



Can you provide the XML for your  in the two guests, and the 
exact sequence of commands that lead to this error? There is definitely 
either a bug in the code, or a bug in what you're doing. By seeing the 
sequence of events, we can either attempt to replicate it, or let you 
know what change you need to make to your workflow to eliminate the error.

Re: Post-firewall hook to insert custom rules?

2020-08-17 Thread Laine Stump


On 8/17/20 5:15 AM, Gunnar Niels wrote:

Hello, I have a set of iptables rules that I need to insert *after* libvirt
has set up all of its firewall rules. Is there a hook that I can tap 
into in
order to run something like a custom script to make sure this happens? 
Any ideas?


-GN



You should be able to use a libvirt network hook script to do this:


https://libvirt.org/hooks.html

Basically you put an executable script in /etc/libvirt/hooks/network 
Once the network is started, the hook will be called with this commandline:


/etc/libvirt/hooks/network network_name started begin -

stdin will contain the entire network XML definition in case you want 
details, or want to extract some task-specific metadata from the network 
definition (syntax for that is here: 
https://libvirt.org/formatnetwork.html#elementsMetadata )


The same script will be called before the network is started, after it's 
shut down, and whenever a guest interface is attached or detached from 
the network - the details are in the web page linked above.
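
As a rough illustration of the hook described above, here is a minimal Python sketch of an /etc/libvirt/hooks/network script. It assumes the argument order shown in the command line above (network name, operation, sub-operation, extra) and that the network XML arrives on stdin. The iptables rule it builds is a purely hypothetical placeholder, and the helper names are made up for this example.

```python
#!/usr/bin/env python3
"""Sketch of a libvirt network hook that adds rules once a network has started."""
import subprocess
import sys
import xml.etree.ElementTree as ET


def bridge_name(xml_text):
    """Return the name attribute of the first <bridge> element found, or None."""
    root = ET.fromstring(xml_text)
    bridge = root.find(".//bridge")
    return bridge.get("name") if bridge is not None else None


def build_rules(argv, xml_text):
    """Return a list of iptables commands (as argument lists) for this hook call.

    argv mirrors the hook arguments: [network_name, operation, sub_operation, extra].
    """
    if len(argv) < 2 or argv[1] != "started":
        return []  # only act after libvirt has finished adding its own rules
    bridge = bridge_name(xml_text)
    if bridge is None:
        return []
    # Hypothetical example rule; substitute whatever custom rules you need.
    return [["iptables", "-I", "FORWARD", "-i", bridge, "-j", "ACCEPT"]]


def main(argv):
    """Run the rules for a 'started' call; do nothing for any other operation."""
    if len(argv) < 2 or argv[1] != "started":
        return
    for cmd in build_rules(argv, sys.stdin.read()):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Keeping the rule construction in build_rules() separate from the code that runs it makes the script easy to dry-run against a saved copy of the network XML before installing it as a hook.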

Re: pass-though Intel gpu int o a vm

2020-08-12 Thread Laine Stump




On 8/12/20 12:46 PM, daggs wrote:

what is the proper uri to use?


If you run virt-manager on some other machine, then select "File / New 
Connection", you will get a dialog box that will prompt you for the info 
needed, and create the proper URI for you. Usually it is 
"qemu+ssh://root@$remotehost/system".





Sent: Wednesday, August 12, 2020 at 7:23 PM
From: "Erik Skultety" 
To: "daggs" 
Cc: libvirt-users@redhat.com
Subject: Re: pass-though Intel gpu int o a vm


PS: You can even try assigning the GPU from within virt-manager which produces
the right XML bits for libvirt. I've never tried assigning an integrated Intel
GPU on my laptop (for obvious reasons), so I can't give you the kind of answer
guaranteeing this would work 100%.



the system in question is a headless server so virt-manager is not an option.
thanks for the suggestion though.


virt-manager can connect to libvirtd running on the headless server machine
remotely, so I still think it could be of help to you ;).

Erik

Re: ipv6 NAT; accept_ra errors and about network choice

2020-08-11 Thread Laine Stump


On 8/10/20 11:23 PM, Ian Wienand wrote:

Hello,

Firstly THANK YOU for the IPv6 NAT support merged in 6.5.  It has been
almost impossible to get IPv6 into a VM on a laptop that switches
between wifi and wired (dock) connections, because you can not add a
wifi interface to a bridge.  I know NAT is against the IPv6 end-to-end
xen but it makes this "just work" for the vast majority of people like
me who need to ssh/curl/talk to ipv6 only hosts!

So I installed 6.6.0 from the virt-preview repos on Fedora 32 to
eagerly test it out.

My network config looks like

   
   network
...  
   
 
   
   
   
   
   
 
   
 
   
   
   
  

The first problem I hit was trying to start that network:

  error: internal error: Check the host setup: enabling IPv6 forwarding
  with RA routes without accept_ra set to 2 is likely to cause routes
  loss. Interfaces to look at: wlp4s0

wlp4s0 is my wifi card that is configured by NetworkManager in a
completely unremarkable fashion.  By default it gets an ipv6 via SLAAC
from my router.  This feels a bit like the unresolved bug [1] which
says that systemd-networkd is handling the RA's in userspace for
... reasons [2].  It's unclear to me if NetworkManager is doing
similar.


Yes, and yes. The only reason I haven't done something about this is 
that I'm undecided *what* to do. On one hand it seems many (most) 
systems are handling RAs with a userspace process, so it doesn't matter 
that it's disabled in the kernel. On the other hand, the person who 
added this check must have had a valid reason for going to the trouble 
of adding it, rather than just documenting that you needed to set 
accept_ra to 2 for some set of interfaces (I forget right now exactly 
which ones, and I'm trying to wind my brain down for the end of the day, 
so don't want to go look it up :-).


I can see 3 possibilities:

1) completely remove the check, with the idea that while it was a good 
thing at the time, it's now obsolete.


2) have a config item (in /etc/libvirt/network.conf (which doesn't 
currently exist) maybe?) to let people manually disable the check.


3) try to make libvirt's code intelligent, and look for clues that RAs 
are handled elsewhere (someone would need to figure out what those 
"clues" are).




I feel like this must be a red-herring.  My wired interface has the
same setting of 0

  $ cat /proc/sys/net/ipv6/conf/enp0s31f6/accept_ra
  0

and is similarly just a very standard auto-configured NetworkManager
interface.  When I "net-start" the network whilst on wifi libvirt
doesn't seem to care about that interface (I presume it only looks at
the active one?).  When I dock and turn off wifi, ipv6 connectivity
continues to work through enp0s31f6, so I don't think the accept_ra
really matters in this case.


Because you're using NetworkManager. I've confirmed with [some NM 
person, I forget who or in what venue] that NM handles RAs itself, so 
accept_ra should be turned off in the kernel (it's not harmful if it's 
on as far as I know, it just does nothing useful)




I feel like this message is incorrect, and being as I've done nothing
special to my underlying interfaces probably going to be wrong for a
lot of people trying this?  Does anyone know the details of this
message and see why it would be required in this situation?


It isn't. We just need to decide which of the ways listed above to fix it.



The other thing that I'd like to expand the documentation on, if I can
get some clarity, is the choice of network.  It seems like it has to
be a /64, and it seems like the best choice is within fc00::/7, or at
least that is what has been assigned for private networks like this
[3]?


"locally assigned" addresses in IPv6 are... different. I've been trying 
to figure this out myself (in order to *automatically* assign a network 
address to a libvirt virtual network, as Dan suggested in the cover 
letter for the IPv6 NAT patches), and I *think* you need to at least set 
the lowest bit of the first byte of the address (that's the "locally 
assigned" bit). So that would mean that all networks should be somewhere 
within FD00::/8 (but please correct me if I'm wrong!)
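
To make the address arithmetic concrete, here is a small Python sketch using the standard ipaddress module. It checks that FD00::/8 is the half of FC00::/7 with the "locally assigned" bit set, and picks a random /64 inside it. The layout used (0xFD, then a 40-bit random global ID, then a 16-bit subnet ID) follows RFC 4193; the helper names are only illustrative and this is not how libvirt itself chooses addresses.

```python
import ipaddress
import secrets

ULA = ipaddress.ip_network("fc00::/7")    # the whole unique-local range
LOCAL = ipaddress.ip_network("fd00::/8")  # the half with the "locally assigned" bit set


def is_locally_assigned(net):
    """True if net lies in fc00::/7 and the lowest bit of its first byte is set."""
    first_byte = net.network_address.packed[0]
    return net.subnet_of(ULA) and bool(first_byte & 0x01)


def random_ula_64():
    """Pick a random /64: 0xfd, a 40-bit global ID, and a 16-bit subnet ID of 0."""
    global_id = secrets.token_bytes(5)
    # 1 + 5 + 2 + 8 = 16 bytes, i.e. one packed IPv6 address
    prefix = bytes([0xFD]) + global_id + bytes(2) + bytes(8)
    return ipaddress.IPv6Network((prefix, 64))


if __name__ == "__main__":
    print(random_ula_64())
```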




The only problem with this is that I think glibc filters this range so
nothing prefers IPv6.


What?? Exactly what isn't preferring IPv6? Do you mean outbound 
connections that would be to an IPv6 address will be nixed in favor of 
an IPv4 address if the source IP of the connection was going to be in 
FC00::/7? Or something else? Do you have a reference for this?



 Is this the range expected to be used for ipv6
NAT?  If so, would a patch to drop some documentation breadcrumbs
about setting gai.conf or something be useful?


The man page for gai.conf *implies* that glibc is following the 
preference rules suggested in RFC3484, which was written prior to 
RFC4193, so it seems strange that it would give any special treatment to 
addresses in that range. Does it behave in the same way if you use 
FD00::... instead?

Re: Problem with xen config

2020-08-10 Thread Laine Stump


On 8/10/20 4:38 AM, Christoph wrote:


xen_platform_pci=1


This setting is not supported in the libvirt libxl driver, but AFAICT
libxl sets the default to 'true' for HVM guests.


pci_msitranslate=1


This is also not supported in libvirt and unfortunately defaults to
'false'.


pci_permissive = 1


Patches have been submitted for this setting but so far there has not
been agreement on the schema change

https://www.redhat.com/archives/libvir-list/2020-April/msg01230.html


Hmm, so I see that discussion stalled after I asked a few questions, and 
then you (Jim) made a few suggestions about where the attribute should 
go and what it should be named. (note for newcomers - the original patch 
messages linked above were at the end of April, but the replies 
discussing them were in May, so they don't show up directly in the 
primitive mailman archive page for the original messages; to see them 
you need to go here:


https://www.redhat.com/archives/libvir-list/2020-May/thread.html

then search for "permissive" in the page - this will show you followups 
to all the original patch mails.)



I'm not sure who should be pushing the discussion/decision - I've 
already said my piece, and would be just as happy with anything as long 
as it doesn't interfere with usage for qemu. Maybe someone else has a 
stronger opinion.

Re: virsh attach-interface auto up

2020-08-08 Thread Laine Stump


On 8/8/20 11:26 PM, brent s. wrote:

On 8/8/20 9:42 AM, Marc Roos wrote:
  
I am doing a virsh detach-interface and an attach-interface. Is it

possible to automatically bring the interface up after attaching it?



By coincidence, I was just playing with this with the python API.

The interface being brought up automatically, if I understand your
question correctly, depends on the OS of the guest having hotplug
support for the NIC you have selected for it.

It takes a couple seconds, but I can generally detach and re-attach an
interface in CentOS Linux, for instance, with only about a 2-5 second
hiccup in network traffic.


Yes, by default any hotplugged interface will be online when it's 
attached. You need to modify the XML of the interface to have it plugged 
in with an offline status. If it's not coming up in the guest, then 
that's something in the guest OS, not the emulated interface's online 
status, and will need to be taken care of in the guest OS.
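
For illustration, the interface XML for attaching a NIC with its link initially offline could be generated with a sketch like the one below. The link element with state='down' is the libvirt domain XML setting for this; the network name, MAC address, virtio model and helper name are placeholders chosen for the example, and the resulting XML would be saved to a file and passed to something like "virsh attach-device".

```python
import xml.etree.ElementTree as ET


def interface_xml(network, mac, link_up=True):
    """Build <interface> XML; add <link state='down'/> when link_up is False."""
    iface = ET.Element("interface", type="network")
    ET.SubElement(iface, "source", network=network)
    ET.SubElement(iface, "mac", address=mac)
    ET.SubElement(iface, "model", type="virtio")
    if not link_up:
        # Without this element the hotplugged interface defaults to online.
        ET.SubElement(iface, "link", state="down")
    return ET.tostring(iface, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values, not taken from the thread.
    print(interface_xml("default", "52:54:00:12:34:56", link_up=False))
```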

Re: Libvirtd Fails to Launch First Time

2020-08-08 Thread Laine Stump


On 8/8/20 10:44 AM, Ken Swenson wrote:

Hello,

I'm having a quite odd issue with libvirtd. I have it set to start on 
boot via systemd service, however it seems to fail and the service 
'succeeds' and does not continue running the daemon. There is nothing in 
the journal logs however the libvirtd debug logs seem to show


2020-08-08 13:55:26.362+: 1386: debug : virCommandRunAsync:2619
: About to run ip link set lo netns -1
2020-08-08 13:55:26.362+: 1386: debug : virFileClose:134 :
Closed fd 26
2020-08-08 13:55:26.362+: 1386: debug : virFileClose:134 :
Closed fd 28
2020-08-08 13:55:26.362+: 1386: debug : virFileClose:134 :
Closed fd 30
2020-08-08 13:55:26.362+: 1386: debug : virCommandRunAsync:2621
: Command result 0, with PID 1610
2020-08-08 13:55:26.390+: 1386: debug : virCommandRun:2461 :
Result fatal signal 2, stdout: '' stderr: 'RTNETLINK answers: No
such process

I am not sure if this is relevant to the problem or not as I do not see 
anything else in the logs that indicates an issue during the launch 
process, other than many file descriptors closing but I assume that is 
normal. If I then start libvirtd a second time after it has been 
launched once it runs fine. I don't believe it is a race condition on 
boot as if I disable the auto start and manually start it it will still 
fail in the same way the first time.


If anyone has any ideas on what I may be able to check to fix this I 
would really appreciate it. It has been bugging me for some time now.


The command "ip link set lo netns -1" is called by the LXC driver 
(during its driver init) as a check to see if the OS supports network 
namespaces. This is done in the function lxcCheckNetNsSupport(), which 
was added to libvirt's LXC driver in commit 99ded85f4 in 2008, has 0 
comments about how it works, and has been functionally unchanged since then.


It appears that on your system for whatever reason the attempt to send 
the netlink message that sets the netns for lo to -1 fails the first 
time it is run, causing lxcCheckNetNsSupport() to return false. I don't 
see how this would cause libvirtd to exit though (I *guess* that's what 
you're saying happens?). I'm guessing this is a red herring.



What is the practical problem you have? After this "success" are you not 
able to start guests or run commands with virsh? Or were you just 
surprised that the libvirtd process wasn't there when you looked? If the 
latter, that could be due to using socket activation - libvirtd is no 
longer running all the time; systemd sets up the listening sockets and 
starts libvirtd when needed; if 120 seconds go by and there are no 
guests running and no management clients connected to libvirtd, then it 
will automatically exit.




(BTW, I'm probably the least knowledgeable person about kernel 
namespaces around, but I thought that all Linux systems have had 
namespace support for a very long time. Does anybody ever actually 
disable it? Do we really need this check?)

Re: Routed network can't reach outside network

2020-07-27 Thread Laine Stump

On 7/23/20 6:14 PM, Rui Correia wrote:

On Thu, Jul 23, 2020 at 10:36 PM Rui Correia wrote:

Thanks for the headsup. I'll ask the Manjaro guys about the nft.
Hopefully they'll know if nft is installed and running.

Well, that was fast.
I've asked the guys and they told me Manjaro KDE doesn't come with 'nft' 
installed by default.
Then I searched the installed packages by nft and the only thing 
installed is the libnftnl package which seems to be related to nft but 
not nft itself.

So, I guess my system only has firewall iptables and ufw installed.
Hope this helps. I could run wireshark but I wouldn't know what to look for.
Any tips?


Back in your original message, you said this:

On 7/19/20 6:54 AM, Rui Correia wrote:

The host can ping the Debian VM and the Debian VM can ping the host but
the Debian VM cannot ping the router 10.0.0.1 or any ip address on the internet.

But in a later message you say this:

On 7/23/20 10:34 AM, Rui Correia wrote:
> But, for testing purposes (trying to reach the VM's from the KVM host)
> I don't need those static routes, right? Because right now I'd be ok
> if I could reach the VM's from the KVM host and right now I can't.

So which is correct?

It will probably make no difference (unless traffic leaving your "KVM 
Host" isn't actually using the interface named "wlo1", and in that case 
it makes *all* the difference!), but I would change this to simply:

The purpose of the "forward dev" is commonly misunderstood as having 
something to do with routing, but it doesn't - it only serves to add an 
iptables rule that will block traffic if it's coming from or going to 
any interface other than (in this case) "wlo1". ie. it's a security 
knob, not a routing knob; if you're not concerned about rogue guests 
then at best it's just creating extra overhead for each packet, and at 
worst it could be blocking traffic if it's misconfigured.

As for checking with wireshark/tcpdump, mainly the intent is just to 
see, when you send a packet from one end or the other, whether a 
corresponding packet shows up in the output of wireshark/tcpdump. As an 
example, let's say that you are trying to ping (from your original 
diagram) "desktop manjaro" (10.0.0.11) from "debian 10 VM" (10.2.2.10). 
First start a ping in a shell on "debian 10 VM", then run a command like 
this (as root) on the KVM Host:

tcpdump -i virbr2 -n host 10.2.2.10

You should at least see one icmp "echo request" packet for each ping 
that is sent. You might even see an icmp response (and if so, hopefully 
it is an icmp echo reply, rather than destination unreachable or 
something like that).

If you see the outbound icmp echo request and an echo reply, then the 
problem is on your host or in the guest. If you see an echo request but 
no echo reply, then look at the next step out - wlo1 interface on the 
KVM host:

tcpdump -i wlo1 -n host 10.2.2.10

You should still see the outbound echo request. If not, then again your 
problem is on the KVM host. If you see the echo request, but no reply, 
then you need to go look on "manjaro desktop". Run the same tcpdump 
command there (as root), but replace "wlo1" with whatever is the name of 
the ethernet device on that host connecting it to the network.

At this point you may see an echo request *and* an outgoing echo 
response, but not see that response back at the KVM host. That's when 
you'll want to rerun tcpdump telling it to display the MAC address of 
the packets:

   tcpdump -i  -e -n host 10.2.2.10

Now you can look at the MAC address in the tcpdump output - it should 
contain the MAC of the KVM host, *not* the MAC of your router. If it has 
the MAC of your router, then you haven't added a routing table entry to 
the manjaro desktop's network config. Do that.

(or, possibly you just want to add a route to the router. That will 
work, but will result in a lot of duplicated traffic and ICMP redirect 
packets from the router to the manjaro desktop).

Anyway, there are many paths this can take, but that gives you an idea 
of how to use tcpdump. (You could do the same thing with wireshark; it's 
just a lot more overhead and lots of info when you really need very 
little, and it also requires that wireshark be installed and a desktop 
session open on all the machines involved.)
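
The order of checks above can be summarized as a small decision function. This is only a sketch of the reasoning, not a diagnostic tool: each argument is a (saw_request, saw_reply) pair that you fill in by hand after watching tcpdump at that capture point, and the branches marked as assumptions are cases the walk-through above does not spell out.

```python
def next_step(virbr, wlo1, remote):
    """Suggest where to look next, given what tcpdump showed at each point.

    Each argument is a (saw_request, saw_reply) pair of booleans:
      virbr  - capture on the guest-facing bridge of the KVM host
      wlo1   - capture on the KVM host's physical interface
      remote - capture on the machine being pinged
    """
    if not virbr[0]:
        # Assumption: no request on the bridge means the guest never sent it.
        return "check the guest"
    if virbr[1]:
        return "problem is on the KVM host or in the guest"
    if not wlo1[0]:
        return "problem is on the KVM host"
    if wlo1[1]:
        # Assumption: the reply reached the host's NIC but not the bridge.
        return "problem is on the KVM host"
    if not remote[0]:
        # Assumption: the request was lost between the two machines.
        return "check the network between the hosts"
    if remote[1]:
        return "compare destination MAC of the reply (tcpdump -e) with the KVM host's MAC"
    return "check the remote machine"


if __name__ == "__main__":
    # Request seen everywhere, reply only seen leaving the remote machine.
    print(next_step((True, False), (True, False), (True, True)))
```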

Re: host and vm on isolated network, there is ip (via dhcp) but not ping

2020-07-21 Thread Laine Stump


On 7/20/20 12:38 PM, daggs wrote:

Greetings,

I've setup an vm with openwrt in it, defined a isolated lan between the vm and 
the host and booted the vm up.
I see the vm is up, made sure the vnic is visible in both the host and guest 
and added it to the br in the guest.
I've issued an dhcpd call on the vnic (labeled vnic0) in the host and got an 
ip, see:
dagg@NCC-5001D ~ $ dhcpcd vnet0


You didn't run "dhcpd" (which is a dhcp server) on the host, you ran 
"dhcpcd", which is a dhcp *client*. So you've ended up assigning an IP 
address to the tap device on the host. I guess the dhcp server that's 
issuing this IP address is part of openwrt in the guest?


A tap device on the host that is attached to a bridge is merely a 
conduit between the guest's emulated NIC and  the bridge device on the 
host, and should not have its own IP address (although it may work in 
certain cases, yours apparently being one of them, since you say the 
same setup works on a debian 10 host; hmm - maybe in the debian host you 
had been running dhcpcd on the bridge device rather than the tap?). In 
general when there is a bridged connection on the host, the IP address 
for the guest should be on the emulated network device *in the guest*, 
and the IP address for the host side of that connection should be on the 
bridge device in the host, *not* the tap device.


Now if the openwrt guest and the host are the only two entities 
communicating on this connection, then you could put an IP address on 
the tap device directly, but in that case you wouldn't want the tap to 
be attached to a bridge anyway. If that's the case, just define the 
interface in the guest as something like this:


   
  
  

  
  


The IP address inside  will set the IP of the *host* side of the 
tap device. You can also add routes to the host's routing table inside 
. See https://libvirt.org/formatdomain.html#ipconfig for details 
(it is very important to remember that the / *inside the 
 element* is used to set the IP address of the host side of the 
tap. An / as a toplevel subelement of  is intended 
to set those properties *in the guest*, and won't work at all in the 
case of qemu, since the hypervisor in that case has no visibility into 
the guest's IP network configuration).



DUID 00:01:00:01:23:dd:d8:5b:e0:d5:5e:d9:f2:e2
vnet0: IAID 00:10:20:bf
vnet0: rebinding lease of 192.168.1.130
vnet0: probing address 192.168.1.130/24
vnet0: soliciting an IPv6 router
vnet0: leased 192.168.1.130 for 43200 seconds
vnet0: adding route to 192.168.1.0/24
vnet0: adding default route via 192.168.1.1
forked to background, child pid 26279
dagg@NCC-5001D ~ $ ifconfig
virtsw0: flags=4163  mtu 1500
 ether 52:54:00:3e:3f:88  txqueuelen 1000  (Ethernet)
 RX packets 123098  bytes 16327962 (15.5 MiB)
 RX errors 0  dropped 0  overruns 0  frame 0
 TX packets 6  bytes 252 (252.0 B)
 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vnet0: flags=4163  mtu 1500
 inet 192.168.1.130  netmask 255.255.255.0  broadcast 192.168.1.255
 inet6 fe80::fc54:ff:fe10:20bf  prefixlen 64  scopeid 0x20
 ether fe:54:00:10:20:bf  txqueuelen 1000  (Ethernet)
 RX packets 45  bytes 8002 (7.8 KiB)
 RX errors 0  dropped 0  overruns 0  frame 0
 TX packets 39  bytes 2676 (2.6 KiB)
 TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

dagg@NCC-5001D ~ $ ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
^C
--- 192.168.1.1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1018ms

the vm's xml can be found at https://pastebin.com/1gXBGcPb
virtsw0 is defined as follows:

   virtsw0
   c8eb15a3-cc5c-4bd6-8f3b-5790792ddccc
   
   


the os is gentoo, the versions are libvirt-6.2.0 qemu-5.0.0.
I have another server running debian 10 with the same virtsw0 definition, there 
the connection is working.



Check the iptables rules on both hosts and both guests to see if there 
are any differences.



/var/lib/libvirt/dnsmasq/virtsw0.macs has only [] in it, can that be the issue?


Since in your case the host is a dhcp *client*, that is irrelevant. I'm 
actually surprised that the file exists at all, since you have no  
section in your network definition, so dnsmasq shouldn't even be run.




Thanks,

Dagg.

Re: Could you please help with questions about the net failover feature

2020-07-08 Thread Laine Stump

On 7/8/20 10:02 AM, Ken Cox wrote:

On 7/8/20 1:30 AM, Stefan Assmann wrote:

On 2020-07-06 10:01, Laine Stump wrote:

On 7/6/20 5:10 AM, Yalan Zhang wrote:

Hi Laine,

For the feature testing before, I only test the linux bridge setting as
in 2), it works.
Now I tried 1), to use macvtap bridge mode connected to the PF, it can
not work as the hostdev interface can not get dhcp ip address on the
guest.
Check on host, the /var/log/messages and dmesg both says:

"Jul  6 04:54:45 dell-per730-xx kernel: ixgbe :82:00.1 enp130s0f1: 1
Spoofed packets detected
..
Jul  6 04:56:17 dell-per730-xx kernel: ixgbe :82:00.1 enp130s0f1: 1
Spoofed packets detected
Jul  6 04:56:54 dell-per730-xx kernel: ixgbe :82:00.1 enp130s0f1: 1
Spoofed packets detected
"
(enp130s0f1 is the PF's interface name, and :82:00.1 is the PF's pci
address)
# rpm -q kernel
kernel-4.18.0-193.4.1.el8_2.x86_64

Could you please help to confirm if this is a kernel bug?  Thank you
very much!

Interesting. I'm not sure if this is expected behavior, or if it's
improper behavior and it just hasn't been tested before (obviously based
on my earlier recommendation, I think it *should* be able to work like
this, and I *thought* I had tried it, but maybe I just imagined it :-/).

I'm Cc'ing Stefan Assmann to see if he has an opinion on whether or not
this should work. For his convenience, here is a summary of the config:
The setup is that there is a bridge-mode macvtap interface on the PF,
and one of the VF's has been given the same MAC address as the macvtap.
The macvtap interface is connected to an emulated NIC in the guest, and
the VF is assigned to the guest with VFIO.

IIUC, the problem is using the same mac address for macvtap and for a 
VF.  This is what's causing the spoofed packets.

How is it a spoof? Because one interface detects a packet not 
originating from itself that has its own MAC address?

Is this check done by the kernel, or by the firmware on the card?

Both interfaces are specifically and consciously configured with the 
same MAC address (since that is a requirement for the simplified bonding 
provided by the virtio-net "failover" feature).

If DHCP isn't working, then I guess the guest is sending a DHCP discover 
packet out through the VF. How is this packet triggering anti-spoof 
protection, since it is the legitimate MAC address of that interface?

Or am I misinterpreting what's going on? (the log message just says a 
spoofed packet was detected, so it could be some other packet triggering 
the log, and this is all just a red herring...)

I'll try this today on my setup, which uses I350 cards (igb driver).

You have two choices for the backup virtio interface:

1) it can be a macvtap device connected to the PF of the same SRIOV 
device.

2) it can be a standard tap device connected to a Linux host bridge
(created outside libvirt in the host system network config) that is
attached to the PF (or alternately one of the VFs that isn't being used
for VMs, or to another physical ethernet adapter on the host that is
connected to the same network).

---
Best Regards,
Yalan Zhang
IRC: yalzhang

On Sun, Mar 22, 2020 at 6:50 AM Laine Stump wrote:

 On 3/21/20 1:08 AM, Yalan Zhang wrote:

  > In my understanding, the standby and primary hostdev interface
 may be in
  > different subnet.

 There is only one hostdev device in the team pair (that will be 
the one

 with  since it needs to be unplugged
 during migration). The other device must be a virtio device 
(the one
 with ). And no, they cannot be on 
different

 subnets. They must both connect into the same ethernet "collision
 domain", such that the guest could assign the same IP address 
to either

 of them and be able to communicate on the network.

 There is some explanation of the use case for this option. and 
some

 example config, here:

 https://www.libvirt.org/formatdomain.html#elementsTeaming

  > I'm not sure whether it is correct. Could you please help to
 explain?
  > Thank you in advance.
  >
  > For example, primary hostdev is connected to vf-pool with
 ,
  > while the standby is connected to NAT network with " forward
 dev='eth0'".
  > The standby interface will get ip as 192.168.122.x, but after
 NAT, it
  > will be in the same subnet of the vf.
   >
  > So after the VF is unplugged, the packet will still 
broadcast in the
  > same subnet, and the vm will get the packet as the standby 
share the

  > same mac. Right?

 No, not right :-)

 The VF of an SRIOV network adapter is connected directly to the
 physical
 network, and will have an IP address that is on that network. Tap
 devices plugged into the default network (or any other libvirt 
network
 based on a bridge device

Re: Why wireless interface cannot be attached to a Linux host bridge?

2020-07-07 Thread Laine Stump


On 7/7/20 12:29 PM, Ken D'Ambrosio wrote:

On 2020-07-07 11:26, ryotaro kobayashi wrote:



I'm from japan and using machine translation, so I apologize if it's 
hard to read.

[...]
P.S.  The machine translation worked very, very well.


Scary! :-) I didn't even notice that until you mentioned it in your 
response, and it all looked very natural :-O

Re: Why wireless interface cannot be attached to a Linux host bridge?

2020-07-07 Thread Laine Stump


On 7/7/20 11:26 AM, ryotaro kobayashi wrote:

Hello, everyone.

I'm from japan and using machine translation, so I apologize if it's 
hard to read.


I am currently trying to build a virtual environment using Ubuntu and kvm.

However, I found out from the following page that the virtual machine 
cannot use the bridge network because I am using a wireless network.


https://wiki.libvirt.org/page/Networking

I am having trouble with this because my PC is using a wireless LAN.

On that page it says "wireless interfaces cannot be attached to a Linux 
host bridge". Can you tell me why this is so?


Is it a limitation of the NIC driver for the wireless LAN?


Not really.


Or is it a limitation of libvirt?


Definitely not. Completely out of libvirt's control.


It is a limitation of

1) the way that almost all wireless connections work (only traffic 
to/from a single MAC address is allowed on a particular wireless 
connection).


(Yes, I know there are some wireless modes that allow multiple MAC 
addresses on a single association. But those modes aren't supported by 
most wireless APs or clients.)


2) Because of (1), the Linux kernel doesn't allow wireless network 
devices to be attached to a Linux host bridge device.



Some people have had success with IPv6 by enabling proxy ARP on the 
bridge device and wireless interface (without directly attaching them to 
each other), then manually adding a host route pointing to the guest's 
IP. It might be possible to do something similar for IPv4 using proxy 
ARP, but I personally haven't played with the idea.

Re: Could you please help with questions about the net failover feature

2020-07-06 Thread Laine Stump

On 7/6/20 5:10 AM, Yalan Zhang wrote:

Hi Laine,

For the feature testing before, I only test the linux bridge setting as 
in 2), it works.
Now I tried 1), to use macvtap bridge mode connected to the PF, it can 
not work as the hostdev interface can not get dhcp ip address on the guest.

Check on host, the /var/log/messages and dmesg both says:

"Jul  6 04:54:45 dell-per730-xx kernel: ixgbe :82:00.1 enp130s0f1: 1 
Spoofed packets detected

..
Jul  6 04:56:17 dell-per730-xx kernel: ixgbe :82:00.1 enp130s0f1: 1 
Spoofed packets detected
Jul  6 04:56:54 dell-per730-xx kernel: ixgbe :82:00.1 enp130s0f1: 1 
Spoofed packets detected

"
(enp130s0f1 is the PF's interface name, and :82:00.1 is the PF's pci 
address)

# rpm -q kernel
kernel-4.18.0-193.4.1.el8_2.x86_64

Could you please help to confirm if this is a kernel bug?  Thank you 
very much!

Interesting. I'm not sure if this is expected behavior, or if it's 
improper behavior and it just hasn't been tested before (obviously based 
on my earlier recommendation, I think it *should* be able to work like 
this, and I *thought* I had tried it, but maybe I just imagined it :-/).

I'm Cc'ing Stefan Assmann to see if he has an opinion on whether or not 
this should work. For his convenience, here is a summary of the config: 
The setup is that there is a bridge-mode macvtap interface on the PF, 
and one of the VF's has been given the same MAC address as the macvtap. 
the macvtap interface is connected to an emulated NIC in the guest, and 
the VF is assigned to the guest with VFIO.

I'll try this today on my setup, which uses I350 cards (igb driver).

You have two choices for the backup virtio interface:

1) it can be a macvtap device connected to the PF of the same SRIOV device.

2) it can be a standard tap device connected to a Linux host bridge
(created outside libvirt in the host system network config) that is
attached to the PF (or alternately one of the VFs that isn't being used
for VMs, or to another physical ethernet adapter on the host that is
connected to the same network).

---
Best Regards,
Yalan Zhang
IRC: yalzhang

On Sun, Mar 22, 2020 at 6:50 AM Laine Stump wrote:

On 3/21/20 1:08 AM, Yalan Zhang wrote:

 > In my understanding, the standby and primary hostdev interface
may be in
 > different subnet.

There is only one hostdev device in the team pair (that will be the one
with <teaming type='transient'/> since it needs to be unplugged
during migration). The other device must be a virtio device (the one
with <teaming type='persistent'/>). And no, they cannot be on different
subnets. They must both connect into the same ethernet "collision
domain", such that the guest could assign the same IP address to either
of them and be able to communicate on the network.

There is some explanation of the use case for this option, and some
example config, here:

https://www.libvirt.org/formatdomain.html#elementsTeaming

 > I'm not sure whether it is correct. Could you please help to explain?
 > Thank you in advance.
 >
 > For example, primary hostdev is connected to vf-pool with ,
 > while the standby is connected to NAT network with " forward dev='eth0'".
 > The standby interface will get ip as 192.168.122.x, but after NAT, it
 > will be in the same subnet of the vf.
 >
 > So after the VF is unplugged, the packet will still broadcast in the
 > same subnet, and the vm will get the packet as the standby share the
 > same mac. Right?

No, not right :-)

The VF of an SRIOV network adapter is connected directly to the
physical
network, and will have an IP address that is on that network. Tap
devices plugged into the default network (or any other libvirt network
based on a bridge device that is created/managed by libvirt) have no
direct connection to the physical network, and are on a different
subnet. The fact that traffic from the guest *seems* to be coming from
an IP on the physical subnet is meaningless. The *guest* needs to be
able to use both NICs using the same IP address, and anything plugged
into the default network will need to have an IP address on a different
subnet from the perspective of the guest.

You have two choices for the backup virtio interface:

1) it can be a macvtap device connected to the PF of the same SRIOV
device.

2) it can be a standard tap device connected to a Linux host bridge
(created outside libvirt in the host system network config) that is
attached to the PF (or alternately to one of the VFs that isn't being
used for VMs, or to another physical ethernet adapter on the host that is
connected to the same network).

It is simplest to have the same name refer to the connection on the
source and destination hosts of a migration. That can be handled by

Re: Virtual Bridge "Network" for Sandbox

2020-06-29 Thread Laine Stump


On 6/29/20 12:43 PM, Paul O'Rorke wrote:

Thanks Laine,

I will take a look at Open vSwitch, it looks interesting.

I am a generalist, I need to know enough about a lot of things to get 
many different tasks done, but do not have the in depth knowledge 
required to "patch" anything.  If I manage to wrangle a working solution 
should I post it?


Even a list of the steps you took to implement it manually, external to 
libvirt, would be useful. Maybe that would inspire someone else to add 
support in libvirt virtual networks. We used to put stuff like that in 
the wiki, but I think the preferred location has changed / is changing 
and I'm not sure at the moment what the new norm is.




Needless to say I would be supportive of said feature being implemented 
by those more competent than I...


Jocularity aside, thanks for the heads up on Open vSwitch.

*Paul O'Rorke*


On 2020-06-29 9:13 a.m., Laine Stump wrote:

On 6/29/20 11:01 AM, Paul O'Rorke wrote:

Hi all,

I couldn't find any documentation on this, hopefully someone can 
point me in the right direction.


I recently set up a sand-boxed environment for our developers. There 
are domain controller(s), workstations and servers in there.  The 
whole thing is running on a single host using a "Virtual Network" 
defined in virt-manager on that host.


Now I find I want to add more guests and there are not enough 
resources on this one host.  Can I somehow make this Virtual Network 
available to two hosts?  I do not want to move to a bridged network 
and have to physically join the two hosts with a discrete link when 
they are already on the same subnet at the host level.


Is that possible?


You might be able to do this using OpenvSwitch (iow "probably can, but I 
don't know the details" :-)) but libvirt doesn't have anything to set 
it up for you; you would need to create and configure the OVS switch 
outside of libvirt, then attach the libvirt guests to that switch 
(using "<interface type='bridge'> ... <virtualport type='openvswitch'/> 
...")


I've idly thought about having this as a libvirt feature over the 
years, but as I never have that many guests, it was never a personal 
priority, and it wasn't immediately clear what was the best way to 
handle, e.g. DHCP, and routing to the outside. Definitely "patches are 
welcome" though :-)

Re: Virtual Bridge "Network" for Sandbox

2020-06-29 Thread Laine Stump


On 6/29/20 11:01 AM, Paul O'Rorke wrote:

Hi all,

I couldn't find any documentation on this, hopefully someone can point 
me in the right direction.


I recently set up a sand-boxed environment for our developers. There are 
domain controller(s), workstations and servers in there.  The whole 
thing is running on a single host using a "Virtual Network" defined in 
virt-manager on that host.


Now I find I want to add more guests and there are not enough resources 
on this one host.  Can I somehow make this Virtual Network available to 
two hosts?  I do not want to move to a bridged network and have to 
physically join the two hosts with a discrete link when they are already 
on the same subnet at the host level.


Is that possible?


You might be able to do this using OpenvSwitch (iow "probably can, but I 
don't know the details" :-)) but libvirt doesn't have anything to set it 
up for you; you would need to create and configure the OVS switch 
outside of libvirt, then attach the libvirt guests to that switch (using 
"<interface type='bridge'> ... <virtualport type='openvswitch'/> ...")


I've idly thought about having this as a libvirt feature over the years, 
but as I never have that many guests, it was never a personal priority, 
and it wasn't immediately clear what was the best way to handle, e.g. 
DHCP, and routing to the outside. Definitely "patches are welcome" 
though :-)
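
For what it's worth, the guest-side half of that might look roughly like the sketch below. It assumes an OVS bridge that already exists outside libvirt; the bridge name "ovsbr0" and the helper name are made up for illustration, and the Python wrapper is only there to check that the XML parses and to show which bridge it points at.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# "ovsbr0" is a hypothetical bridge name, i.e. whatever was created
# outside libvirt with something like `ovs-vsctl add-br ovsbr0`.
OVS_INTERFACE_XML = """
<interface type='bridge'>
  <source bridge='ovsbr0'/>
  <virtualport type='openvswitch'/>
  <model type='virtio'/>
</interface>
"""


def ovs_bridge_of(interface_xml: str) -> Optional[str]:
    """Return the bridge name if the interface is an OVS port, else None."""
    iface = ET.fromstring(interface_xml.strip())
    port = iface.find("virtualport")
    if port is None or port.get("type") != "openvswitch":
        return None
    source = iface.find("source")
    return source.get("bridge") if source is not None else None


print(ovs_bridge_of(OVS_INTERFACE_XML))
```

Making that bridge span two hosts (tunnels, DHCP, routing to the outside) is the part that still has to be configured in OVS itself; nothing in this snippet does that.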

Re: No outbound connectivity from guest VM(fedora 32)

2020-06-09 Thread Laine Stump


On 6/8/20 8:55 AM, Justin Stephenson wrote:

On Mon, Jun 8, 2020 at 5:09 AM Daniel P. Berrangé  wrote:


On Fri, Jun 05, 2020 at 01:27:08PM -0400, Justin Stephenson wrote:

Hi,

I recently installed a fresh install of Fedora 32 and I am having
trouble with my virtual machine networking, I can ssh and connect into
my guest VMs from my host, but the guest VMs cannot ping out to the
internet.

I am using the "default" NAT virtual network, the interesting thing is
I have made no configuration changes on my host or in the guest VMs,
simply created and installed two VMs(Fedora and RHEL8) in Fedora where
the VMs are having the same issue.

I am happy to provide any logs or command output if that would help.


Do you have "podman" installed on your host ? As there is an issue
with podman loading "br_netfilter" which is harming libvirt default
network traffic..


Hi, yes I am using podman for some development tasks. However I don't
see any br_netfilter module loaded:

  # lsmod | grep br_netfilter
  # grep 'netfilter' /proc/modules

I'm not sure if it matters but my host laptop is also connected wirelessly.


Since it's not the "problem du jour" with F32, here's a few other things 
you can try:


1) Try "systemctl restart libvirtd.service" (which reloads libvirt's 
iptables rules), and then start the VM again to see if the problem is 
solved. (If this fixes it, then something that is starting after 
libvirtd.service is adding a firewall rule that blocks the outbound 
guest traffic)


2) You say this was a fresh install of F32. Have you run dnf update to 
make sure you have all post-release updates to libvirt and firewalld 
packages? If not, try that first.


(BTW, can you ssh from guest to host?)

3) See if you can ping from the guest to the outside network. If you can 
ping but can't ssh, then again there is a firewall problem. Make sure 
the libvirt zone exists in firewalld config, and that virbr0 is a part 
of that zone. (Aside from allowing inbound dns, dhcp and ssh from guests 
to the host, the libvirt zone has a default "ACCEPT" policy, which will 
allow packets to be forwarded from the guest through the host. If virbr0 
is on a different zone, then the default policy won't be ACCEPT, and 
forwarded traffic will be rejected. All libvirt networks are put into 
firewalld's "libvirt" zone by default, so this should always be the case.)


Beyond those suggestions, I'm not sure what else to recommend, other 
than that you might get a quicker response on troubleshooting like this 
by logging into irc.oftc.net and joining the #virt channel :-)

Re: macvtap direct

2020-05-18 Thread Laine Stump

On 5/18/20 10:51 PM, Subhendu Ghosh wrote:

On Thu, May 14, 2020 at 1:32 PM Laine Stump <la...@redhat.com> wrote:

On 5/13/20 12:52 AM, Subhendu Ghosh wrote:
 > Hi
 >
 > Couple of questions around macvtap direct usage:
 >
 > 1) is the document here current?
 > https://libvirt.org/formatnetwork.html#examplesDirect

Yes. None of that has changed in any major way in many years.

kernelNewbies documents macvtap bridge mode as: VMs and host can all talk 
to each other without an external bridge

Correct. The VMs can talk to each other, but they can't talk to the host.

External bridge/switch is only needed for VEPA mode with hairpin.

VEPA is a special IBM thing that requires a particular kind of switch. 
bridge mode macvtap still doesn't allow direct communication between 
host and guest - it requires a switch that hairpins traffic, or for the 
host to have a separate macvlan interface that is attached to the 
ethernet device (so that it is a peer to the guests' macvtap devices, 
and they can communicate with it).

https://virt.kernelnewbies.org/MacVTap

Perhaps the original development of macvtap to support VEPA influenced 
the early docs and was never reviewed after bridge mode matured?

No.

If you are able to communicate between your host and guests that are 
connected only via a macvtap bridge mode connection, then either your 
switch is hairpinning the traffic, or you have a separate macvlan 
interface for the host that is attached to the ethernet. There was no 
"change in design after early docs that was never reviewed" - the 
design/implementation of macvtap is as it is documented.

(Just because your claim made me doubt myself, I checked again on a 
Fedora 32 host and verified that it still works as it always has).
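
In case it helps, the "separate macvlan interface for the host" option could be sketched like this. It only builds the iproute2 command lines and prints them rather than running them; the device names and the address are placeholders, not anything from this thread.

```python
import shlex
from typing import List


def host_macvlan_commands(parent: str, name: str, cidr: str) -> List[List[str]]:
    """Build the `ip` commands that give the host its own macvlan
    interface on `parent`, in bridge mode, so that it becomes a peer of
    the guests' macvtap devices on the same physical device."""
    return [
        ["ip", "link", "add", "link", parent, "name", name,
         "type", "macvlan", "mode", "bridge"],
        ["ip", "addr", "add", cidr, "dev", name],
        ["ip", "link", "set", name, "up"],
    ]


# Hypothetical values: ens4f0 is the physical NIC the guests' macvtap
# devices sit on, macvlan0 is the new host-side interface.
for argv in host_macvlan_commands("ens4f0", "macvlan0", "192.168.1.10/24"):
    print(shlex.join(argv))
```

The host then has to send its guest-bound traffic through that macvlan interface (for example by using the address configured on it) instead of through the physical device directly.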

 >
 > I have been able to get host to guest network traffic without any
 > special configuration or switch since Fedora 28 when I first started
 > using it. Using  requires switch port
mirroring, but
 > just using  doesn't.

If that is the case, then either your guest and host have a secondary
network connection, or your switch is mirroring traffic and you just
didn't know about it. The inability to do direct host<->guest
communication is inherent in the design of macvtap interfaces.

 >
 > 2) do any of the language libraries make assumptions that libvirt
 > networks must have a  attribute? Foreman's Ruby
 > interface to libvirt errors out with attempting to build a VM on
a KVM
 > host with a network defined with 
 > https://projects.theforeman.org/issues/25890

The 2nd line in the log attached to that issue report says this:

  >Call to virNetworkGetBridgeName failed: internal error: network
'macvtap-net' does not have a bridge name.

So, your application (or whatever this "Foreman's Ruby interface to
libvirt" is) has called virNetworkGetBridgeName() (whatever it's called
in the Ruby bindings), and since you have a macvtap network, which has
no bridge device, libvirt sent back an error. You need to find whatever
in your code is calling virNetworkGetBridgeName().

Re: macvtap direct

2020-05-14 Thread Laine Stump


On 5/13/20 12:52 AM, Subhendu Ghosh wrote:

Hi

Couple of questions around macvtap direct usage:

1) is the document here current?
https://libvirt.org/formatnetwork.html#examplesDirect


Yes. None of that has changed in any major way in many years.



I have been able to get host to guest network traffic without any 
special configuration or switch since Fedora 28 when I first started 
using it. Using  requires switch port mirroring, but 
just using  doesn't.



If that is the case, then either your guest and host have a secondary 
network connection, or your switch is mirroring traffic and you just 
didn't know about it. The inability to do direct host<->guest 
communication is inherent in the design of macvtap interfaces.





2) do any of the language libraries make assumptions that libvirt 
networks must have a  attribute? Foreman's Ruby 
interface to libvirt errors out with attempting to build a VM on a KVM 
host with a network defined with 

https://projects.theforeman.org/issues/25890


The 2nd line in the log attached to that issue report says this:

>Call to virNetworkGetBridgeName failed: internal error: network 
'macvtap-net' does not have a bridge name.


So, your application (or whatever this "Foreman's Ruby interface to 
libvirt" is) has called virNetworkGetBridgeName() (whatever it's called 
in the Ruby bindings), and since you have a macvtap network, which has 
no bridge device, libvirt sent back an error. You need to find whatever 
in your code is calling virNetworkGetBridgeName().
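
As a rough illustration of how a caller could avoid that error, the sketch below checks the network XML for a <bridge> element before asking for a bridge name. The two XML samples and the helper name are invented for the example; with the libvirt Python bindings the XML would come from the network object's XMLDesc() call, and bridgeName() is the call that fails for a macvtap network.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Illustrative definitions only: a macvtap ("direct") network has no
# <bridge> element, a libvirt-managed NAT network does.
MACVTAP_NET = """
<network>
  <name>macvtap-net</name>
  <forward mode='bridge'>
    <interface dev='eth0'/>
  </forward>
</network>
"""

NAT_NET = """
<network>
  <name>default</name>
  <forward mode='nat'/>
  <bridge name='virbr0'/>
</network>
"""


def bridge_name_or_none(network_xml: str) -> Optional[str]:
    """Return the bridge device name, or None if the network has none."""
    bridge = ET.fromstring(network_xml.strip()).find("bridge")
    return bridge.get("name") if bridge is not None else None


for xml in (MACVTAP_NET, NAT_NET):
    print(bridge_name_or_none(xml))
```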

Re: Libvirt APIs for creating virtual networks

2020-05-08 Thread Laine Stump


On 5/7/20 2:18 PM, Santhosh Kumar Gunturu wrote:


Does libvirt have any capability to get statistics from the DHCP 
server? How many packets received/sent?

Is there a way to get those statistics if the APIs are not available ?


libvirt spawns a dnsmasq instance for each network to handle DHCP 
services, so libvirt itself doesn't see any of that traffic. A quick 
search for "statistic" in the dnsmasq man page shows that when you send 
the dnsmasq process a SIGUSR1, it will write a bunch of statistics to the 
system log, and that you can apparently also get some of the data by 
sending a special DNS request to the server. Just search for 
"statistics" in "man dnsmasq". I don't know if any of the information 
they're logging is what you're after, but as far as I can see, that's 
what is available and how to get to it.
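
A minimal sketch of the SIGUSR1 route, assuming you know where the pid file for the network's dnsmasq lives (the path in the comment is my assumption about a typical install, so check your own system), and that you then read the result from the system log yourself:

```python
import os
import signal


def request_dnsmasq_stats(pid_file: str) -> int:
    """Read a pid from `pid_file` and send that process SIGUSR1, which
    per the dnsmasq man page makes it log its statistics.
    Returns the pid that was signalled."""
    with open(pid_file) as f:
        pid = int(f.read().strip())
    os.kill(pid, signal.SIGUSR1)
    return pid


# Hypothetical usage, to be run as root. The pid file location is an
# assumption, not something libvirt's API hands you:
#   request_dnsmasq_stats("/run/libvirt/network/default.pid")
```

The statistics then show up wherever dnsmasq's log messages go on the host (journal or syslog); nothing comes back through libvirt.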

Re: Libvirt APIs for creating virtual networks

2020-04-30 Thread Laine Stump


On 4/28/20 12:01 PM, Daniel P. Berrangé wrote:

On Tue, Apr 28, 2020 at 08:51:45AM -0700, Santhosh Kumar Gunturu wrote:

Okay. Thanks.

Do we have any facility APIs to set the DHCP Options via XML ?
Default gateway ?


libvirt has no supported method of specifying a default gateway other 
than the IP of the bridge device on the virtualization host itself, and 
DHCP clients on these networks will always end up getting their default 
gateway set to the IP address of that bridge. Fortunately (for you :-) 
that's not because libvirt is explicitly setting that address in the 
dnsmasq config file, but just because that's what dnsmasq does when no 
gateway address is specified in the config file.


You would set this in the dnsmasq.conf file with dhcp-option, e.g.:

  dhcp-option=option:router,192.168.122.5

and recent libvirt (5.6.0 and newer) allows adding arbitrary lines to 
the dnsmasq.conf files it creates for its networks, using the 
<dnsmasq:options> element in the network XML. For details on how to do 
this, look at:


  https://libvirt.org/formatnetwork.html#elementsNamespaces
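
Roughly, a network definition carrying such a passthrough line might look like the sketch below. The addresses are just the usual default-network values plus the example gateway from above, the helper name is invented, and dnsmasq spells named options as option:<name>. The Python part only parses the XML and lists the lines that would be handed to dnsmasq.

```python
import xml.etree.ElementTree as ET
from typing import List

DNSMASQ_NS = "http://libvirt.org/schemas/network/dnsmasq/1.0"

# Example values only; the xmlns:dnsmasq declaration on <network> is
# what makes the <dnsmasq:options> element valid.
NETWORK_XML = """
<network xmlns:dnsmasq='http://libvirt.org/schemas/network/dnsmasq/1.0'>
  <name>default</name>
  <forward mode='nat'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254'/>
    </dhcp>
  </ip>
  <dnsmasq:options>
    <dnsmasq:option value='dhcp-option=option:router,192.168.122.5'/>
  </dnsmasq:options>
</network>
"""


def dnsmasq_passthrough_lines(network_xml: str) -> List[str]:
    """Return the raw dnsmasq.conf lines requested in the network XML."""
    net = ET.fromstring(network_xml.strip())
    return [opt.get("value") for opt in net.iter(f"{{{DNSMASQ_NS}}}option")]


print(dnsmasq_passthrough_lines(NETWORK_XML))
```

libvirt does not validate these lines, so a typo here can keep dnsmasq (and therefore the network) from starting.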




Dns-server ?


Not exact, but

  <forwarder addr='8.8.8.8'/>

is *kind of* what you're looking for. It doesn't set the IP address sent 
back in the dhcp reply, but sets up the dnsmasq instance listening for 
the network to forward all requests on to 8.8.8.8 (you can also refine 
this to forward the requests for only certain domains, by adding 
"domain='example.com'" to the <forwarder> element).


(If you *really* need to have the guest send DNS requests directly to 
the upstream DNS server rather than via dnsmasq, then you would need to 
use <dnsmasq:options> to set something like 
"dhcp-option=option:dns-server,8.8.8.8")



domain-name ?


domain can be set with "<domain name='...'/>".
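
Put together, the DNS-related bits of a network definition could look something like this sketch. The domain names and the second forwarder address are placeholders I made up; only 8.8.8.8 comes from the discussion above.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# Placeholder values: "example.com" as the network's domain, and a
# second forwarder that only handles one (hypothetical) domain.
NETWORK_XML = """
<network>
  <name>default</name>
  <forward mode='nat'/>
  <domain name='example.com'/>
  <dns>
    <forwarder addr='8.8.8.8'/>
    <forwarder domain='corp.example' addr='192.168.1.1'/>
  </dns>
  <ip address='192.168.122.1' netmask='255.255.255.0'/>
</network>
"""


def dns_settings(network_xml: str) -> Tuple[Optional[str], List[Tuple[Optional[str], str]]]:
    """Return (domain name, [(forwarder domain or None, forwarder addr)])."""
    net = ET.fromstring(network_xml.strip())
    domain = net.find("domain")
    forwarders = [(f.get("domain"), f.get("addr"))
                  for f in net.findall("dns/forwarder")]
    return (domain.get("name") if domain is not None else None, forwarders)


print(dns_settings(NETWORK_XML))
```

A forwarder without a domain attribute catches everything not matched by a more specific one.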




Everything is controlled through the XML document described here:

https://libvirt.org/formatnetwork.html

We don't have separate APIs for each piece of info - just the one
virNetworkDefineXML API that takes the XML document.

Regards,
Daniel

Re: plug pre-created tap devices to libvirt guests

2020-04-06 Thread Laine Stump


On 4/6/20 9:54 AM, Daniel P. Berrangé wrote:

On Mon, Apr 06, 2020 at 03:47:01PM +0200, Miguel Duarte de Mora Barroso wrote:

Hi all,

I'm aware that it is possible to plug pre-created macvtap devices to
libvirt guests - tracked in RFE [0].

My interpretation of the wording in [1] and [2] is that it is also
possible to plug pre-created tap devices into libvirt guests - that
would be a requirement to allow kubevirt to run with less capabilities
in the pods that encapsulate the VMs.

I took a look at the libvirt code ([3] & [4]), and, from my limited
understanding, I got the impression that plugging existing interfaces
via `managed='no' ` is only possible for macvtap interfaces.



No, it works for standard tap devices as well.


The reason the BZs and commit logs talk mostly about macvtap rather than 
tap is because 1) that's what kubevirt people had asked for and 2) it 
already *mostly* worked for tap devices, so most of the work was related 
to macvtap (my memory is already fuzzy, but I think there were a couple 
of privileged operations we still tried to do for standard tap devices 
even if they were precreated). Standard disclaimer: I often misremember, 
so this memory could be wrong! But precreated tap devices definitely do work.
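
For reference, a guest interface that uses a precreated device might look like the sketch below. The device name "mytap0" is a placeholder for a tap device that something else (with the needed privileges) created beforehand; the helper is only there to show which attribute carries the "don't create it for me" meaning.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# "mytap0" is a hypothetical, already existing tap device.
PRECREATED_TAP_XML = """
<interface type='ethernet'>
  <target dev='mytap0' managed='no'/>
  <model type='virtio'/>
</interface>
"""


def precreated_device(interface_xml: str) -> Optional[str]:
    """Return the device name if libvirt is told not to create it
    (managed='no'), else None."""
    target = ET.fromstring(interface_xml.strip()).find("target")
    if target is None or target.get("managed") != "no":
        return None
    return target.get("dev")


print(precreated_device(PRECREATED_TAP_XML))
```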



I think though that when someone from kubevirt actually tried using a 
precreated macvtap device, they found that their precreated device 
wasn't visible at all to the unprivileged libvirtd in the pod, because 
it was in a different network namespace, or something like that. So 
there may still be more work to do (or, again, my info might be out of 
date and they figured out a proper solution).





Would you be able to shed some light into this ? Is it possible on
libvirt-5.6.0 to plug pre-created tap devices to libvirt guests ?

[0] - https://bugzilla.redhat.com/show_bug.cgi?id=1723367

This links to the following message, which illustrates how to use pre-created
tap and macvtap devices:

   https://www.redhat.com/archives/libvir-list/2019-August/msg01256.html

Laine: it would be useful to add something like this short guide to the
knowledge base docs



You mean the wiki? Sure, I can do that.


(BTW - that was admirable reading / searching / responding - 7 minutes 
and it wasn't even your patch! How do you do that? :-))

Re: Could you please help with questions about the net failover feature

2020-03-21 Thread Laine Stump


On 3/21/20 1:08 AM, Yalan Zhang wrote:

In my understanding, the standby and primary hostdev interface may be in 
different subnet.


There is only one hostdev device in the team pair (that will be the one 
with <teaming type='transient'/> since it needs to be unplugged 
during migration). The other device must be a virtio device (the one 
with <teaming type='persistent'/>). And no, they cannot be on different 
subnets. They must both connect into the same ethernet "collision 
domain", such that the guest could assign the same IP address to either 
of them and be able to communicate on the network.


There is some explanation of the use case for this option, and some 
example config, here:


https://www.libvirt.org/formatdomain.html#elementsTeaming

I'm not sure whether it is correct. Could you please help to explain? 
Thank you in advance.


For example, primary hostdev is connected to vf-pool with , 
while the standby is connected to NAT network with " forward dev='eth0'".
The standby interface will get ip as 192.168.122.x, but after NAT, it 
will be in the same subnet of the vf.

So after the VF is unplugged, the packet will still broadcast in the 
same subnet, and the vm will get the packet as the standby share the 
same mac. Right?


No, not right :-)

The VF of an SRIOV network adapter is connected directly to the physical 
network, and will have an IP address that is on that network. Tap 
devices plugged into the default network (or any other libvirt network 
based on a bridge device that is created/managed by libvirt) have no 
direct connection to the physical network, and are on a different 
subnet. The fact that traffic from the guest *seems* to be coming from 
an IP on the physical subnet is meaningless. The *guest* needs to be 
able to use both NICs using the same IP address, and anything plugged 
into the default network will need to have an IP address on a different 
subnet from the perspective of the guest.


You have two choices for the backup virtio interface:

1) it can be a macvtap device connected to the PF of the same SRIOV device.

2) it can be a standard tap device connected to a Linux host bridge 
(created outside libvirt in the host system network config) that is 
attached to the PF (or alternately to one of the VFs that isn't being 
used for VMs, or to another physical ethernet adapter on the host that is 
connected to the same network).



It is simplest to have the same name refer to the connection on the 
source and destination hosts of a migration. That can be handled by 
creating a libvirt network to refer to the bridge device created outside 
libvirt (or to the PF directly if you're going to use macvtap).


For example, if you're going to use macvtap, and the PF's name on the 
host is ens4f0, you'd just create this network:


  
persistent-net

  

   

any guest interface with this:

 
   
   
   
   
   
 

will get a macvtap device that's connected to ens4f0 in bridge mode.

Or, if your host has a bridge device called br0 that is directly 
attached to the physical network (in whatever manner, it doesn't 
matter), you can define the network this way:


  
persistent-net


   

The XML for the guest interface would be the same.
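
As a hedged sketch of those definitions, following the layout in the libvirt network and teaming docs: both variants of "persistent-net" plus a guest interface that works with either. ens4f0 and br0 are the names used above; the MAC address and the alias "ua-backup0" are placeholders.

```python
import xml.etree.ElementTree as ET

# Variant 1: macvtap in bridge mode on the PF.
MACVTAP_NET = """
<network>
  <name>persistent-net</name>
  <forward mode='bridge'>
    <interface dev='ens4f0'/>
  </forward>
</network>
"""

# Variant 2: an existing host bridge created outside libvirt.
BRIDGE_NET = """
<network>
  <name>persistent-net</name>
  <forward mode='bridge'/>
  <bridge name='br0'/>
</network>
"""

# Guest side: identical for both variants, since it only names the network.
GUEST_INTERFACE = """
<interface type='network'>
  <source network='persistent-net'/>
  <mac address='00:11:22:33:44:55'/>
  <model type='virtio'/>
  <teaming type='persistent'/>
  <alias name='ua-backup0'/>
</interface>
"""


def network_name(network_xml: str) -> str:
    """Name of a libvirt network definition."""
    return ET.fromstring(network_xml.strip()).findtext("name")


def interface_network(interface_xml: str) -> str:
    """Name of the network a guest interface refers to."""
    return ET.fromstring(interface_xml.strip()).find("source").get("network")


for net in (MACVTAP_NET, BRIDGE_NET):
    print(network_name(net) == interface_network(GUEST_INTERFACE))
```

The point is that the guest config never mentions ens4f0 or br0, so the two hosts can differ in how "persistent-net" is implemented.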

Then for the vfio (transient) interface, you could also define a network:

   
 transient-net
 
   
 
   

and instead of using  in the guest config, you 
would use this:




  
  [1]
  
  
   

Even if the device names change on the other host (the destination of 
the migration), as long as the other host has networks named 
"persistent-net" and "transient-net" that are of similar types (macvtap 
or bridged for persistent-net, and hostdev for transient-net) then 
libvirt will be able to migrate the guest from one host to the other 
with no mangling of the XML required.
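
A sketch of the hostdev side and of the pairing rules, again assuming the layout from the libvirt teaming docs. The MAC address and the alias "ua-backup0" are placeholders, and the checker is just an illustration of what has to match between the two interfaces, not something libvirt provides.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Pool of VFs belonging to the PF ens4f0.
TRANSIENT_NET = """
<network>
  <name>transient-net</name>
  <forward mode='hostdev' managed='yes'>
    <pf dev='ens4f0'/>
  </forward>
</network>
"""

PERSISTENT_IFACE = """
<interface type='network'>
  <source network='persistent-net'/>
  <mac address='00:11:22:33:44:55'/>
  <model type='virtio'/>
  <teaming type='persistent'/>
  <alias name='ua-backup0'/>
</interface>
"""

TRANSIENT_IFACE = """
<interface type='network'>
  <source network='transient-net'/>
  <mac address='00:11:22:33:44:55'/>
  <teaming type='transient' persistent='ua-backup0'/>
</interface>
"""


def _attr(elem: ET.Element, tag: str, attr: str) -> Optional[str]:
    """Attribute `attr` of child `tag`, or None if either is missing."""
    child = elem.find(tag)
    return child.get(attr) if child is not None else None


def is_team_pair(persistent_xml: str, transient_xml: str) -> bool:
    """True if the two interfaces share a MAC and the transient one
    points at the persistent one's alias."""
    p = ET.fromstring(persistent_xml.strip())
    t = ET.fromstring(transient_xml.strip())
    mac = _attr(p, "mac", "address")
    alias = _attr(p, "alias", "name")
    return (_attr(p, "teaming", "type") == "persistent"
            and _attr(t, "teaming", "type") == "transient"
            and mac is not None and mac == _attr(t, "mac", "address")
            and alias is not None and alias == _attr(t, "teaming", "persistent"))


print(is_team_pair(PERSISTENT_IFACE, TRANSIENT_IFACE))
```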

Re: can hotplug vcpus to running Windows 10 guest, but not unplug

2020-02-17 Thread Laine Stump


On 2/14/20 11:17 AM, Gianluca Cecchi wrote:
On Fri, Feb 14, 2020 at 4:54 PM Lentes, Bernd wrote:



qemu-kvm-2.11.2-5.18.1.x86_64

[...]

I found a table on

https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.3/html/virtual_machine_management_guide/cpu_hot_plug
saying that hotplugging is possible but no hotunplugging.
But i don't know how recent this information is and if RedHat uses
libvirt/qemu.

RHV uses a special version of qemu-kvm named qemu-kvm-rhev.
oVirt, the upstream product of RHV, uses a rebuilt package named 
qemu-kvm-ev.


Just to make sure there's no misunderstanding about the content of these 
special versions of the qemu-kvm package, I wanted to point out that the 
qemu-kvm-rhev/qemu-kvm-ev used by RHV/oVirt (and also OpenStack) are 
actually *closer* to upstream qemu, not further away, than the standard 
qemu-kvm package in the same release of RHEL/CentOS. Everything in *all* 
the different builds of the package is upstream, but the -rhev/-ev 
versions of the packages have a more aggressive rebase-from-upstream 
schedule, and also have more not-yet-in-the-rebase features that are 
backported from later upstream releases. The result is that the standard 
RHEL/CentOS qemu-kvm package is more stable (since it mostly only gets 
bugfixes), while the -rhev/-ev packages have more new features (at the 
risk of encountering regressions due to the new code in those features).


Backporting of new features to a downstream release can sometimes mean 
that a feature not present in qemu-kvm-x.y.z upstream *is* present in 
qemu-kvm-rhev-x.y.z, so looking at the upstream documentation might lead 
you to believe the package you're using doesn't have feature X, when 
actually it does. But before that can happen, the feature must have 
already gone upstream and be available there (just in a slightly 
higher-numbered, but earlier-released, version). "Upstream first" isn't 
just a nice idea, it's the rule (and a way of life)! :-)

Re: Bridge-less VM

2020-01-17 Thread Laine Stump


(look towards the bottom of the message for the "hidden lead" :-)


On 1/16/20 1:16 PM, Rob Roschewsk wrote:

I'm trying to create a free standing VM that doesn't connect to a bridge.

This is supposedly doable according to the WIKI:
https://libvirt.org/formatdomain.html#elementsNICSEthernet

But with a config similar to:

   
     
     
   

When starting the domain I get the error:
error: internal error: process exited while connecting to monitor: 
2020-01-16T18:08:04.788860Z qemu-system-x86_64: -netdev 
tap,id=hostnet0,vhost=on,vhostfd=26: could not open /dev/net/tun: 
Operation not permitted


Checked permissions on /dev/net/tun and it's 666

If I just configure it as a "bridge" connection the domain starts. Then 
I can use brctl to remove it from the bridge to get what I want. That 
just proves it's possible but with extra steps (Shout out to Rick and Morty)


Thoughts??
Running Ubuntu 16.04.1 Kernel 4.15.0-74
libvirt 1.3.1-1ubuntu10.27


Oh, I see - you're running a version of libvirt that was released 
exactly 4 years ago today! You should notice in the web page you 
referenced above that <target dev='blah' managed='no'/> has only been 
supported since libvirt 5.7.0, which was released just last September. 
What you're running is incredibly ancient, and there is *a lot* of stuff 
documented on libvirt.org that it doesn't support (although we do try to 
note the minimum version required for any new feature).


Aside from missing the very recent "managed='no'" feature (which I think 
you probably don't need or even care about - you'd rather just let 
libvirt create the tap device for you, right?), the version of libvirt 
you're running doesn't even contain the very *old* commit 9c17d665fd 
(from March of 2016) which removed the necessity to run qemu as root 
when using <interface type='ethernet'>.


So you have two choices:

1) upgrade to a newer libvirt (something in the 5.x.0 or even 6.0.0 if 
possible). In this case you'll also probably want to remove the 
<target dev='blah' managed='no'/> line (unless you really do want to give 
the device a specific name (controlled by "dev='blah'") and/or precreate 
the tap device yourself (controlled by "managed='no'")).


2) change /etc/libvirt/qemu.conf to tell libvirt it should run qemu as 
user root and not clear the capabilities privileges for the qemu process 
(done by uncommenting 'user = "root"' and 'clear_emulator_capabilities = 
0'). This is a *VERY* bad idea, since it allows the qemu process to 
run as root, meaning that if a virtual machine finds an exploit in your 
(*also* very old) qemu binary and "breaks out", it will have full root 
access to your host. Also in this case you will need to manually create 
the tap device beforehand (before commit 9c17d665fd libvirt would not 
auto-create a tap device for <interface type='ethernet'>).


I hope I've convinced you to take choice (1)!



qemu 1:2.5+dfsg-5ubuntu10.41

Thanks,
--> Rob

Re: [libvirt-users] Connecting a VM to an existing OVS bridge

2020-01-10 Thread Laine Stump


On 1/4/20 4:48 AM, Amir Sela wrote:

Hi,
I have an existing OVS bridge, that I can see in ovs-vsctl and use
for other purposes.


Does the bridge show up when you run "ovs-vsctl list-br"? Both OVS 
bridges I have on my system are seen in that list. I created them both 
with "ovs-vsctl add-br BLAH". How did you create your bridges?




I've edited the machine's XML as instructed in
http://docs.openvswitch.org/en/latest/howto/libvirt/

When I try to start the VM, i get
error: Cannot get interface MTU on 'ovsbr': No such device


Is your OVS switch named "ovsbr"?



Any ideas?

(Note: I can't see the ovs switch in brctl show or any other regular
kernel tool, should it appear there?)


On my Fedora 31 system at least, OVS devices are not visible in "brctl 
show", but they *are* visible with "ip link show". For example:


19: ovs-system:  mtu 1500 qdisc noop state DOWN 
mode DEFAULT group default qlen 1000

link/ether 0a:f7:2a:85:08:7c brd ff:ff:ff:ff:ff:ff
20: ovsbr0:  mtu 1500 qdisc noop state DOWN mode 
DEFAULT group default qlen 1000

link/ether 96:92:7a:1d:a6:4c brd ff:ff:ff:ff:ff:ff




Versions:
openvswitch-2.10.1-3.fc30.x86_64
libvirt-daemon-5.1.0-9.fc30.x86_64


The only difference on my system is the versions - I'm running Fedora 
31, the openvswitch package is at 2.12, and libvirt is 6.0.0 
(unreleased), but this is the same setup I've had for at least a few 
years. The only other package on the system with "openvswitch" in the 
name is "network-scripts-openvswitch", and I doubt that would have any 
effect like what you're talking about.


___
libvirt-users mailing list
libvirt-users@redhat.com
https://www.redhat.com/mailman/listinfo/libvirt-users

Re: [libvirt-users] Error in launching chasis

2019-12-17 Thread Laine Stump


On 12/17/19 5:52 AM, abhishek jain wrote:



Hi

I am new to libvirt and am starting a chassis, but I'm getting the following error:

qemu-system-x86_64: -netdev 
tap,id=net0,ifname=tap01,vhost=on,script=no,downscript=no: tap: open 
vhost char device failed: Operation not permitted


You aren't using libvirt, you are running the qemu-system-x86_64 command 
directly.



What could be the reason


Since you're not actually using libvirt in this example, this isn't the 
best place to ask, but based on the error message, I'd say that you are 
running qemu-system-x86_64 as an unprivileged user, and qemu is trying 
to open the device /dev/vhost-net, but can't due to the lack of privileges.


If you used a libvirt-based management application to start your qemu 
(e.g. virsh, virt-manager, cockpit, ovirt, openstack) then they would 
call the libvirtd daemon, which is running with full root privileges and 
would open the vhost-net device, then pass that to an unprivileged qemu. 
If you really must continue using qemu directly, then you'll need to 
either stop using vhost=on, or run qemu as root (and take all the 
security risks associated with that).





Regards
Abhishek


Re: [libvirt-users] What's the best way to make use of VLAN interfaces with VMs?

2019-11-27 Thread Laine Stump


On 11/26/19 11:07 PM, Richard Achmatowicz wrote:

Hello

I have a problem with attaching VMs to a VLAN interface.

Here is my setup: I have several physical hosts connected by a physical 
switch.  Each host has two NICs leading to the switch, which have been 
combined into a team, team0. Each host has a bridge br1, which has 
team0 as a slave. So communication between hosts is based on the IP 
address of bridge br1 on each host.


Up until recently, using libvirt and KVM, I was creating VMs which had 
one interface attached to the default virtual network and one interface 
attached to the bridge:


virt-install ... --network network=default --network bridge=br1 ...

I would then statically assign an IP address to the bridge interface on 
the guest when installing the OS.


A few days ago, a VLAN was introduced to split up the network. I created 
a new VLAN interface br1.600 on each of the hosts. My initial attempt 
was to try this:


virt-install ... --network network=default --network bridge=br1.600 ...

which did not work. It then dawned on me that a VLAN interface and a 
bridge aren't treated the same. So I started to look for ways to allow 
my VMs to bind to this new interface.


This would seem to be a common situation. What is the best way to work 
around this?


Both the host bridge and the host VLAN interface already have their 
assigned IP addresses and appear like this in libvirt:


[root@clusterdev01 ]# ifconfig
br1: flags=4163  mtu 1500
     inet 192.168.0.110  netmask 255.255.255.0  broadcast 192.168.0.255
     inet6 fe80::1e98:ecff:fe1b:276d  prefixlen 64  scopeid 0x20
     ether 1c:98:ec:1b:27:6d  txqueuelen 1000  (Ethernet)
     RX packets 833772  bytes 2976958254 (2.7 GiB)
     RX errors 0  dropped 0  overruns 0  frame 0
     TX packets 331237  bytes 23335124 (22.2 MiB)
     TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

br1.600: flags=4163  mtu 1500
     inet 192.168.1.110  netmask 255.255.255.0  broadcast 192.168.1.255
     inet6 fe80::1e98:ecff:fe1b:276d  prefixlen 64  scopeid 0x20
     ether 1c:98:ec:1b:27:6d  txqueuelen 1000  (Ethernet)
     RX packets 189315  bytes 9465744 (9.0 MiB)
     RX errors 0  dropped 0  overruns 0  frame 0
     TX packets 302  bytes 30522 (29.8 KiB)
     TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@clusterdev01]# virsh iface-list --all
  Name State  MAC Address
---
  br1  active 1c:98:ec:1b:27:6d
  br1.600   active 1c:98:ec:1b:27:6d

[root@clusterdev01 sysadmin]# virsh iface-dumpxml br1.600

   
     
   
   
     
   
   
   
     
   


I tried following some suggestions which wrapped the vlan interface in a 
bridge interface, but it ended up trashing the br1.600 interface which 
was originally defined on the host.


Is there a failsafe way to deal with such a situation? Am I doing 
something completely wrong here? I would like br1.600 to behave like 
br1.


Any suggestions or advice greatly appreciated.



I guess what you need is for all the traffic from your guests to go out 
on the physical network tagged with vlan id 600, and you want that to be 
transparent to the guests, right?


The simplest way to handle this is to create a vlan interface off of the 
ethernet that you have attached to br1 (not br1 itself), so it would be 
named something like "eth0.600", and then create a new bridge (call it, 
say "br600") and attach eth0.600 to br600. Then your guests would be 
created with "--network bridge=br600"
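
For anyone who wants to script the steps above, here is a minimal sketch that only builds the iproute2 command lines; it does not run them. The function name and the defaults ("eth0", "br600") are illustrative, and in the setup described in the question the parent device would presumably be team0 rather than eth0. Interfaces created this way do not persist across a reboot, so the host's normal network configuration would still need the equivalent entries.

```python
import shlex


def vlan_bridge_commands(parent="eth0", vlan_id=600, bridge=None):
    """Return iproute2 commands that create <parent>.<vlan_id> and a bridge on top of it."""
    vlan_if = f"{parent}.{vlan_id}"
    bridge = bridge or f"br{vlan_id}"
    return [
        # Tagged sub-interface on the device attached to br1, not on br1 itself.
        ["ip", "link", "add", "link", parent, "name", vlan_if,
         "type", "vlan", "id", str(vlan_id)],
        # New, empty bridge that the guests will connect to.
        ["ip", "link", "add", "name", bridge, "type", "bridge"],
        # Attach the vlan interface to the new bridge, then bring both up.
        ["ip", "link", "set", "dev", vlan_if, "master", bridge],
        ["ip", "link", "set", "dev", vlan_if, "up"],
        ["ip", "link", "set", "dev", bridge, "up"],
    ]


if __name__ == "__main__":
    # Print the commands for review; run them by hand (as root) once they look right.
    for argv in vlan_bridge_commands(parent="team0"):
        print(shlex.join(argv))
```

The guests would then be created with "--network bridge=br600" as described above.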


(Note that Linux host bridges do now support vlan tagging (and maybe 
even trunking) at the port level, but libvirt hasn't added support for 
it. (in other words, "Patches Welcome!" :-))



Re: [libvirt-users] Confused setting up a "Virtual Server Hosting" config

2019-10-23 Thread Laine Stump

On 10/23/19 12:43 AM, Paul O'Rorke wrote:

Hi list,

Can anyone advise me on the correct/best set up for Virtual Server Hosting?

I have a guest in my server room that I wish to migrate to a dedicated server I
rented in an offsite data centre. I rented a box with one NIC and
one public IP. I installed KVM on it and a guest. (both Ubuntu 18.04
LTS server edition). I am struggling to get the networking right.

Essentially I want the "Virtual Server Hosting" config mentioned here:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html-single/virtualization_administration_guide/index#sub-sect-routed-mode

I have not had any luck setting that up. It is listed in the "Routed"
section but the graphic says the virtual switch should be in bridged mode.

I also tried using macvtap, and since I have only one guest was
expecting to be able to just use the host IP

No, you will need one IP for the host, and one IP for the guest in
either bridged mode or for macvtap.

but it looks like the data
centre have restricted packets to the MAC address of the host NIC.

Yes, there is that restriction too. Usually hosting providers will lock
down the MAC addresses they allow through ports, in order to prevent
hostile clients from doing MAC spoofing to capture other clients' traffic.

When set up I can ping the public IP (it is both the host and the guest?)

No. An IP address refers to one entity. It can be the host or the guest,
but not both.

but
not their gateway. Should a macvtap not be presenting the MAC address
of the host NIC to the router and thus allowing packets from the guest?

No, that is not what macvtap does. It creates a virtual NIC (macvtap
device) that is connected directly to the physical NIC, and traffic from
that device is injected directly into the output queue of the physical
device, MAC address and all.

I clearly have a lack of understanding of how this is working and how it
is meant to work. When I tried the same thing on my hardware/network I
can create multiple guests that all use the macvtap interface and I have
no problems getting connectivity to the outside world.

Because on your own network you have no MAC address locking on your
switch port, and have multiple IP addresses available (one for each
guest) from the local DHCP server.

Before I approach the data centre about this I want to be sure I
understand what I am doing. I ultimately want to host a mail server
and several different web servers as guests all behind this one host. I
would alias their public IPs to the host NIC and use IPtables to route
traffic based on destination IP.

The only reason you would want iptables to be involved is if you were
limited to only 1 IP address for the host + all the guests. In that case
you could use *port* forwarding to cause incoming traffic to the host on
particular TCP ports to be forwarded to different guests:

https://wiki.libvirt.org/page/Networking#Forwarding_Incoming_Connections
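
As a rough illustration of the port-forwarding idea (not a copy of what the wiki page contains), the sketch below builds one DNAT rule and one matching FORWARD rule per forwarded port. The function name, the guest address 192.168.122.2, and the port numbers are made-up examples; the rules are only printed, and on a real host they would have to be re-applied whenever libvirt reloads its own rules.

```python
import shlex


def port_forward_rules(host_port, guest_ip, guest_port, proto="tcp"):
    """Return iptables commands forwarding host_port on the host to guest_ip:guest_port."""
    return [
        # Rewrite the destination of incoming connections to the guest.
        ["iptables", "-t", "nat", "-I", "PREROUTING", "-p", proto,
         "--dport", str(host_port),
         "-j", "DNAT", "--to-destination", f"{guest_ip}:{guest_port}"],
        # Inserted at the top of FORWARD so it is evaluated before any REJECT rules.
        ["iptables", "-I", "FORWARD", "-d", guest_ip, "-p", proto,
         "--dport", str(guest_port), "-j", "ACCEPT"],
    ]


if __name__ == "__main__":
    # Hypothetical example: web traffic on host port 80 goes to a guest at 192.168.122.2.
    for argv in port_forward_rules(80, "192.168.122.2", 80):
        print(shlex.join(argv))
```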

Does that make sense? Can anyone suggest the right way to achieve this?

No, not really :-)

If you can only get a single IP address, then you'll need to look at the
above link. If you can get the hosting provider to sell you extra IP
addresses / MAC addresses (usually extra IPs cost money but MAC
addresses are free, they just want to know what they are - you will need
one *of each* for each guest), then you should put a bridge on your
host's ethernet, and connect all the guests to that bridge, configuring
each with its unique IP address / MAC address / default route info given
to you by the hosting provider. You can use this as a reference to
configure the host and guests:

https://wiki.libvirt.org/page/Networking#Debian.2FUbuntu_Bridging

(you could also avoid setting up the bridge and just use macvtap bridge
mode as you say you've done on your own network. The only limitation of
that is that it doesn't permit direct communication between the host and
the guests. If that limitation is okay with you, then that's fine.)


Re: [libvirt-users] Live migrate + change interface name

2019-10-22 Thread Laine Stump


On 10/17/19 4:26 PM, Marc Roos wrote:


Is it possible to do a live migrate of a guest, having on the from host
a source_device=eth2 and to host a source_dev=eth1?


What management tool are you using that the syntax is "source_device=eth2"?

Are you maybe just paraphrasing your config, and what you actually have 
is something like this?:


   
 
 ...
   

?

If so, the way to make this easily migratable is to create a network on 
both hosts that points to the desired physical device, e.g. on host 1:




  direct-net
  

  


and on host2:

 (same thing, but use 'eth1')

After net-define-ing the networks, you'll need to net-autostart and 
net-start them. Then in your guest's interface config, you would use this:


 
   ...

You will then be able to migrate from one host to the other without 
needing to modify your XML during the migration.
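
The XML in this message was lost by the archive, so here is a hedged sketch of what the two network definitions and the guest interface could look like. It assumes a macvtap "bridge" mode forward, which is a guess since the original mode is not recoverable; the helper names are illustrative. The output of direct_network_xml() would be saved to a file and passed to "virsh net-define" on each host.

```python
import xml.etree.ElementTree as ET


def direct_network_xml(name, host_dev, mode="bridge"):
    """Build a libvirt network that points at one physical device on this host."""
    network = ET.Element("network")
    ET.SubElement(network, "name").text = name
    # mode="bridge" is an assumption; use whatever mode the guest's
    # existing direct interface already uses.
    forward = ET.SubElement(network, "forward", {"mode": mode})
    ET.SubElement(forward, "interface", {"dev": host_dev})
    return ET.tostring(network, encoding="unicode")


def guest_interface_xml(network_name):
    """Build the guest <interface> that refers to the network by name only."""
    iface = ET.Element("interface", {"type": "network"})
    ET.SubElement(iface, "source", {"network": network_name})
    return ET.tostring(iface, encoding="unicode")


if __name__ == "__main__":
    print(direct_network_xml("direct-net", "eth2"))  # define on host 1
    print(direct_network_xml("direct-net", "eth1"))  # define on host 2
    print(guest_interface_xml("direct-net"))         # identical on both hosts
```

Because the guest config names only the network, the same domain XML is valid on either host, which is what makes the migration work without edits.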



Re: [libvirt-users] KVM NAT stops from working

2019-09-03 Thread Laine Stump


On 9/2/19 10:31 AM, Francesc Guasch wrote:

Hi. First of all thank you for the work you are doing with libvirt.
I am not sure this is the right place to ask, I'd appreciate
if you can give me any hint or directions.

I have several similar KVM Linux boxes and one of them has a really
strange behavior with the KVM NAT: it just suddenly stops
working.

This is a Linux Ubuntu Server 19.04 with
  - libvirt-bin 4.0.0
  - qemu-kvm 1:2.11

Everything works fine and then suddenly the virtual machines
can't reach outside. If I run a tcpdump in the host I see
the NAT isn't working.

When the server just boots I can see the packets with the
server address going out:

     x.y.z.w.49138 > 8.8.8.8.53

Then, maybe some hours or days later, instead of the server
address I see the internal domain's address:


     192.168.122.33.19132 > 8.8.8.8.53
     ^^

I try to restart the iptables but it won't help.

Any hints ? Thank you very much


1) On a freshly booted machine with running clients connected to 
libvirt's default network (and successfully sending/receiving traffic, 
of course :-), get a dump of all active iptables rules with


   iptables-save >iptables-working.txt

2) At whatever later time when you notice that the NAT is no longer 
working properly, get another dump of all the rules with


   iptables-save >iptables-broken.txt

and compare those two files to see what has changed.

Most likely some other piece of software (a firewall management utility 
maybe?) has loaded a new rule that takes precedence over one of the 
rules added by libvirt.


If seeing the rule that was added doesn't point you at the culprit, you 
can see if restarting libvirtd will fix your problem - whenever libvirtd 
is restarted, all iptables rules associated with libvirt's virtual 
networks are reloaded (which will put them back at the beginning of the 
chain, thus fixing any broken precedence).
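
Comparing the two dumps by eye is tedious because iptables-save output includes timestamps and packet counters that always differ. The sketch below is one possible way to filter that noise out before diffing; the function names are illustrative, and it assumes the two files were produced as in steps 1 and 2.

```python
import difflib
import re

# Matches the "[packets:bytes]" counters at the end of ":CHAIN POLICY [0:0]" lines.
_COUNTERS = re.compile(r"\s*\[\d+:\d+\]$")


def normalize(dump):
    """Drop comments and counters so only tables, chains and rules remain."""
    lines = []
    for line in dump.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # comment lines carry timestamps that differ on every run
        if line.startswith(":"):
            line = _COUNTERS.sub("", line)
        lines.append(line)
    return lines


def rule_changes(working, broken):
    """Return '+'/'-' prefixed lines for rules added to or removed from the working dump."""
    diff = list(difflib.unified_diff(normalize(working), normalize(broken),
                                     lineterm="", n=0))
    body = diff[2:]  # skip the '---' and '+++' header lines
    return [line for line in body if not line.startswith("@@")]


if __name__ == "__main__":
    with open("iptables-working.txt") as f_ok, open("iptables-broken.txt") as f_bad:
        for change in rule_changes(f_ok.read(), f_bad.read()):
            print(change)
```

Since the diff is order-sensitive, a rule that merely moved within a chain shows up as one removal plus one addition, which matters here because precedence is the suspected problem.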



Re: [libvirt-users] RLIMIT_MEMLOCK in container environment

2019-08-24 Thread Laine Stump


On 8/24/19 3:08 AM, Dan Kenigsberg wrote:



On Fri, 23 Aug 2019, 0:27 Laine Stump <la...@redhat.com> wrote:


(Adding Alex Williamson to Cc so he can correct any mistakes)

On 8/22/19 4:39 PM, Ihar Hrachyshka wrote:
 > On Thu, Aug 22, 2019 at 12:01 PM Laine Stump <la...@redhat.com> wrote:
 >>
 >> On 8/22/19 10:56 AM, Ihar Hrachyshka wrote:
 >>> On Thu, Aug 22, 2019 at 2:24 AM Daniel P. Berrangé <berra...@redhat.com> wrote:
 >>>>
 >>>> On Wed, Aug 21, 2019 at 01:37:21PM -0700, Ihar Hrachyshka wrote:
 >>>>> Hi all,
 >>>>>
 >>>>> KubeVirt uses libvirtd to manage qemu VMs represented as
Kubernetes
 >>>>> API resources. In this case, libvirtd is running inside an
 >>>>> unprivileged pod, with some host mounts / capabilities added
to the
 >>>>> pod, needed by libvirtd and other services.
 >>>>>
 >>>>> One of the capabilities libvirtd requires for successful startup
 >>>>> inside a pod is SYS_RESOURCE. This capability is used to adjust
 >>>>> RLIMIT_MEMLOCK ulimit value depending on devices attached to the
 >>>>> managed guest, both on startup and during hotplug. AFAIU the
need to
 >>>>> lock the memory is to avoid pages being pushed out from RAM
into swap.
 >>
 >>
 >> I recall successfully testing GPU assignment from an unprivileged
 >> libvirtd several years ago by setting a high enough ulimit for
the uid
 >> used to run libvirtd in advance (. I think we check if the current
 >> setting is high enough, and don't try to set it unless we think
we need to.
 >>
 >
 > The PR I linked to in the original email does just that: it starts
 > libvirtd; then, if domain is going to use VFIO, sets ulimit of
 > libvirtd process to VM memory size + 1Gb (mimicking libvirt code) +
 > 256Mb (to stay conservative) using prlimit() syscall; then
defines the
 > domain.

So you're making an educated guess, which is essentially what
libvirt is
doing (based on advice from other people with better information than
us, but still a guess).

 >
 >> If I understand you correctly, you're saying that in your case
it's okay
 >> for the memlock limit to be lower than we try to set it to,
because swap
 >> is disabled anyway, is that correct?
 >>
 >
 > I'm honestly not exactly sure about the reason why we need to set the
 > limit, but I assume it's because of swap. I can be totally
confused on
 > that part though.


What I understand from an IRC conversation with Alex just now is that
increasing RLIMIT_MEMLOCK isn't done just to prevent any of the pages
being swapped out. It's done because "all GPAs (Guest Physical
Addresses) that could potentially be DMA targets need to have fixed
mappings through the iommu, therefore all need to be allocated and
mappings fixed [...] setting rlimit allows us to perform all the
necessary pins within the user's locked memory limit".

So even if swap is disabled, it still needs to be done (either by
libvirt, or by someone else who has the necessary privileges and
control
over the libvirtd process).


 >>>>
 >>>> Libvirt shouldn't set RLIMIT_MEMLOCK by default, unless there's
 >>>> something in the XML that requires it - one of
 >>>
 >>> You are right, sorry. We add SYS_RESOURCE only for particular
domains.
 >>>
 >>>>
 >>>>    - hard limit memory value is present
 >>>>    - host PCI device passthrough is requested
 >>>
 >>> We are using passthrough
 >>
 >> (If you want to make Alex happy, use the term "VFIO device
assignment"
 >> rather than passthrough :-).)
 >>
 >
 > Not sure who Alex is but I'll try to make everyone happy! :)

The Alex I'm referring to is the Alex I just Cc'ed. He is the VFIO
maintainer.


 >>> to pass SR-IOV NIC VFs into guests. We also
 >>> plan to do the same for GPUs in the near future.
 >>
 >>   >>> I believe we would benefit from one of the following
features on
 >>   >>> libvirt side (or both):
 >>   >>>
 >>   >>> a) expose the memory lock value calculated by libvirtd through
 >>   >>> libvirt ABI so that we can use it when calling prlimit()
on libvirtd
 >>   >

Re: [libvirt-users] RLIMIT_MEMLOCK in container environment

2019-08-22 Thread Laine Stump

(Adding Alex Williamson to Cc so he can correct any mistakes)

On 8/22/19 4:39 PM, Ihar Hrachyshka wrote:

On Thu, Aug 22, 2019 at 12:01 PM Laine Stump wrote:

On 8/22/19 10:56 AM, Ihar Hrachyshka wrote:

On Thu, Aug 22, 2019 at 2:24 AM Daniel P. Berrangé wrote:

On Wed, Aug 21, 2019 at 01:37:21PM -0700, Ihar Hrachyshka wrote:

Hi all,

KubeVirt uses libvirtd to manage qemu VMs represented as Kubernetes
API resources. In this case, libvirtd is running inside an
unprivileged pod, with some host mounts / capabilities added to the
pod, needed by libvirtd and other services.

One of the capabilities libvirtd requires for successful startup
inside a pod is SYS_RESOURCE. This capability is used to adjust
RLIMIT_MEMLOCK ulimit value depending on devices attached to the
managed guest, both on startup and during hotplug. AFAIU the need to
lock the memory is to avoid pages being pushed out from RAM into swap.

I recall successfully testing GPU assignment from an unprivileged
libvirtd several years ago by setting a high enough ulimit for the uid
used to run libvirtd in advance (. I think we check if the current
setting is high enough, and don't try to set it unless we think we need to.

The PR I linked to in the original email does just that: it starts
libvirtd; then, if domain is going to use VFIO, sets ulimit of
libvirtd process to VM memory size + 1Gb (mimicking libvirt code) +
256Mb (to stay conservative) using prlimit() syscall; then defines the
domain.

So you're making an educated guess, which is essentially what libvirt is
doing (based on advice from other people with better information than
us, but still a guess).

If I understand you correctly, you're saying that in your case it's okay
for the memlock limit to be lower than we try to set it to, because swap
is disabled anyway, is that correct?

I'm honestly not exactly sure about the reason why we need to set the
limit, but I assume it's because of swap. I can be totally confused on
that part though.

What I understand from an IRC conversation with Alex just now is that
increasing RLIMIT_MEMLOCK isn't done just to prevent any of the pages
being swapped out. It's done because "all GPAs (Guest Physical
Addresses) that could potentially be DMA targets need to have fixed
mappings through the iommu, therefore all need to be allocated and
mappings fixed [...] setting rlimit allows us to perform all the
necessary pins within the user's locked memory limit".

So even if swap is disabled, it still needs to be done (either by
libvirt, or by someone else who has the necessary privileges and control
over the libvirtd process).
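
To make the arithmetic discussed above concrete, here is a small sketch of the "guest memory + 1 GiB + 256 MiB" estimate and of applying it to an already-running libvirtd with prlimit(). This is not libvirt or KubeVirt code; the function names and the 256 MiB default margin simply follow the description in this thread, and the caller is assumed to hold CAP_SYS_RESOURCE (or be root).

```python
import resource

GIB = 1024 ** 3
MIB = 1024 ** 2


def memlock_limit_bytes(guest_memory_kib, margin_bytes=256 * MIB):
    """Estimate the locked-memory limit: guest RAM + 1 GiB + a conservative margin."""
    return guest_memory_kib * 1024 + GIB + margin_bytes


def raise_memlock(pid, limit_bytes):
    """Set soft and hard RLIMIT_MEMLOCK on another process; returns the previous limits.

    Linux only. Raising the hard limit of another process requires privileges.
    """
    return resource.prlimit(pid, resource.RLIMIT_MEMLOCK, (limit_bytes, limit_bytes))


if __name__ == "__main__":
    # Example: a guest with 4 GiB of RAM (libvirt expresses memory in KiB by default).
    print(memlock_limit_bytes(4 * 1024 * 1024))
```

As explained above, the limit is needed so that all guest memory can be pinned for DMA, so it applies whether or not the host has swap enabled.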

Libvirt shouldn't set RLIMIT_MEMLOCK by default, unless there's
something in the XML that requires it - one of

You are right, sorry. We add SYS_RESOURCE only for particular domains.

- hard limit memory value is present
- host PCI device passthrough is requested

We are using passthrough

(If you want to make Alex happy, use the term "VFIO device assignment"
rather than passthrough :-).)

Not sure who Alex is but I'll try to make everyone happy! :)

The Alex I'm referring to is the Alex I just Cc'ed. He is the VFIO
maintainer.

to pass SR-IOV NIC VFs into guests. We also
plan to do the same for GPUs in the near future.

>>> I believe we would benefit from one of the following features on
>>> libvirt side (or both):
>>>
>>> a) expose the memory lock value calculated by libvirtd through
>>> libvirt ABI so that we can use it when calling prlimit() on libvirtd
>>> process;
>>> b) allow to disable setrlimit() calls via libvirtd config file knob
>>> or domain definition.

(b) sounds much more reasonable, as long as qemu doesn't complain (I
don't know whether or not it checks)

Slightly related to this - I'm currently working on patches to avoid
making any ioctl calls that would fail in an unprivileged libvirtd when
using tap/macvtap devices. ATM, I'm doing this by adding an attribute
"unmanaged='yes'" to the interface element. The idea is that if
someone sets unmanaged='yes', they're stating that the caller (i.e.
kubevirt) is responsible for all device setup, and that libvirt should
just use it without further setup. A similar approach could be applied
to hostdev devices - if unmanaged is set, we assume that the caller has
done everything to make the associated device usable.

(Of course this all makes me realize the inanity of adding a for interfaces when hostdevs already have
and . So
to prevent setting the locklimit for hostdev, would we make a new
setting like ? Sigh. I
*hate* trying to make config consistent :-/)

(alternately, we could just automatically fail the attempt to set the
lock limit in a graceful manner and allow the guest to continue)

If that's something maintainers feel good about, I am all for it since
it simplifies the implementation.

Well, after

Re: [libvirt-users] RLIMIT_MEMLOCK in container environment

2019-08-22 Thread Laine Stump

On 8/22/19 10:56 AM, Ihar Hrachyshka wrote:

On Thu, Aug 22, 2019 at 2:24 AM Daniel P. Berrangé  wrote:

On Wed, Aug 21, 2019 at 01:37:21PM -0700, Ihar Hrachyshka wrote:

Hi all,

KubeVirt uses libvirtd to manage qemu VMs represented as Kubernetes
API resources. In this case, libvirtd is running inside an
unprivileged pod, with some host mounts / capabilities added to the
pod, needed by libvirtd and other services.

One of the capabilities libvirtd requires for successful startup
inside a pod is SYS_RESOURCE. This capability is used to adjust
RLIMIT_MEMLOCK ulimit value depending on devices attached to the
managed guest, both on startup and during hotplug. AFAIU the need to
lock the memory is to avoid pages being pushed out from RAM into swap.

I recall successfully testing GPU assignment from an unprivileged 
libvirtd several years ago by setting a high enough ulimit for the uid 
used to run libvirtd in advance (. I think we check if the current 
setting is high enough, and don't try to set it unless we think we need to.

If I understand you correctly, you're saying that in your case it's okay 
for the memlock limit to be lower than we try to set it to, because swap 
is disabled anyway, is that correct?

Libvirt shouldn't set RLIMIT_MEMLOCK by default, unless there's
something in the XML that requires it - one of

You are right, sorry. We add SYS_RESOURCE only for particular domains.

  - hard limit memory value is present
  - host PCI device passthrough is requested

We are using passthrough 

(If you want to make Alex happy, use the term "VFIO device assignment" 
rather than passthrough :-).)

to pass SR-IOV NIC VFs into guests. We also
plan to do the same for GPUs in the near future.

>>> I believe we would benefit from one of the following features on
>>> libvirt side (or both):
>>>
>>> a) expose the memory lock value calculated by libvirtd through
>>> libvirt ABI so that we can use it when calling prlimit() on libvirtd
>>> process;
>>> b) allow to disable setrlimit() calls via libvirtd config file knob
>>> or domain definition.

(b) sounds much more reasonable, as long as qemu doesn't complain (I 
don't know whether or not it checks)

Slightly related to this - I'm currently working on patches to avoid 
making any ioctl calls that would fail in an unprivileged libvirtd when 
using tap/macvtap devices. ATM, I'm doing this by adding an attribute 
"unmanaged='yes'" to the interface  element. The idea is that if 
someone sets unmanaged='yes', they're stating that the caller (i.e. 
kubevirt) is responsible for all device setup, and that libvirt should 
just use it without further setup. A similar approach could be applied 
to hostdev devices - if unmanaged is set, we assume that the caller has 
done everything to make the associated device usable.

(Of course this all makes me realize the inanity of adding a dev='blah' unmanaged='yes'/> for interfaces when hostdevs already have 
 and . So 
to prevent setting the locklimit for hostdev, would we make a new 
setting like ? Sigh. I 
*hate* trying to make config consistent :-/)

(alternately, we could just automatically fail the attempt to set the 
lock limit in a graceful manner and allow the guest to continue)

BTW, I'm guessing that you use <hostdev> to assign the SRIOV VFs rather 
than <interface type='hostdev'>, correct? The latter would require that 
you have enough capabilities to set MAC addresses on the VFs (that's the 
entire point of using <interface type='hostdev'> instead of plain <hostdev>)


Re: [libvirt-users] UDP broadcasts vs. nat Masquerading issue

2019-07-04 Thread Laine Stump


On 6/28/19 10:23 AM, Nikolai Zhubr wrote:

Hi all,

I'm observing an issue that as soon as libvirt starts, UDP broadcasts 
going through physical network (and unrelated to any virtualization) get 
broken. Specifically, windows neighbourhood browsing through samba's 
nmbd starts suffering badly (Samba is running on this same box).


At the moment I'm running a quite outdated version 1.2.9 of libvirt, but 
other than this issue, it does its job pretty well, so I'd first 
consider some patching/backporting rather than totally replacing it with 
a new one. Anyway, I first need to better understand what is going on 
and what is wrong with it.

This could also be related somewhat to
https://www.redhat.com/archives/libvir-list/2013-September/msg01311.html
but I suppose it is not exactly that thing.

I've already figured the source of trouble is anyway related to these 
rules added:


-A POSTROUTING -o br0 -j MASQUERADE
-A POSTROUTING -o enp0s25 -j MASQUERADE
-A POSTROUTING -o virbr2_nic -j MASQUERADE
-A POSTROUTING -o vnet0 -j MASQUERADE


*None* of those rules were added by libvirt (unless your build of 
libvirt, in addition to being ancient, has also been heavily hacked by a 
third party with downstream-only patches, although I can't imagine how 
the rules you show could possibly be a useful addition).


Detailed documentation of what iptables rules are added by libvirt can 
be found here:



   https://libvirt.org/firewall.html

(particularly in the "virtual network" section).

The masquerade rules added by libvirt are based on the IP address of the 
NATed subnet, e.g.:


-A LIBVIRT_PRT -s 192.168.12.0/24 ! -d 192.168.12.0/24 -p tcp \
   -j MASQUERADE --to-ports 1024-65535
-A LIBVIRT_PRT -s 192.168.12.0/24 ! -d 192.168.12.0/24 -p udp \
   -j MASQUERADE --to-ports 1024-65535
-A LIBVIRT_PRT -s 192.168.12.0/24 ! -d 192.168.12.0/24 \
   -j MASQUERADE

(this is from a less ancient version of libvirt that puts the rules on 
its own private chain, but in the version you are using the rule would 
be the same, except that "LIBVIRT_PRT" would be replaced with "POSTROUTING")
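
The distinction drawn above (subnet-based rules versus per-interface rules) can be checked mechanically. The following is only a heuristic sketch with made-up names: it labels a MASQUERADE rule as "subnet-based" when it has the "-s NET ! -d NET" shape shown in the example, and as "interface-based" when it matches only on an output interface. It cannot prove which program added a rule.

```python
import shlex


def _value_after(tokens, flag):
    """Return the token following flag, or None if flag is absent or last."""
    if flag not in tokens:
        return None
    idx = tokens.index(flag) + 1
    return tokens[idx] if idx < len(tokens) else None


def masquerade_style(rule):
    """Classify one iptables-save rule line by how its MASQUERADE match is written."""
    tokens = shlex.split(rule)
    if _value_after(tokens, "-j") != "MASQUERADE":
        return "not-masquerade"
    src = _value_after(tokens, "-s")
    dst = _value_after(tokens, "-d")
    # The example rules exclude traffic that stays inside the subnet: "! -d NET".
    negated_dst = "-d" in tokens and tokens[tokens.index("-d") - 1] == "!"
    if src is not None and src == dst and negated_dst:
        return "subnet-based"
    if "-o" in tokens and src is None:
        return "interface-based"
    return "other"


if __name__ == "__main__":
    print(masquerade_style("-A POSTROUTING -o br0 -j MASQUERADE"))
    print(masquerade_style(
        "-A LIBVIRT_PRT -s 192.168.12.0/24 ! -d 192.168.12.0/24 -j MASQUERADE"))
```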


You can verify my "counter-claim" by running "virsh net-destroy" for all 
of your libvirt networks, and seeing that the offending rules haven't 
been removed.


In short, you need to look elsewhere for the culprit.




Here, virbr2_nic and vnet0 are used by libvirt for arranging network 
configurations for VMs, ok. However, br0 is a main interface of this 
host with primary ip address, with enp0s25 being a physical nic of this 
host, and it is used for all sorts of regular (unrelated to 
virtualization) communications. Also, br0 is used for attaching bridged 
(as opposed to NATed) VMs managed by libvirt.


Clearly, libvirt somehow chooses to set up masquerading for literally 
all existing network interfaces here (except lo),


It's clear that the rules are there. It's not clear that they were added 
by libvirt.


but I can't see a real 
reason for the first two rules in the list above. Furthermore, they 
corrupt UDP broadcats coming from outside and reaching this host 
(through enp0s25/br0) such that source address gets replaced by this 
hosts primary address (as per masquerading). I've verified this by 
arranging a hand-crafted UDP listener and printing the respective source 
addresses as seen by normal userspace.


Now I've discovered that I can "eliminate" the problem by either:

1. Removing "-A POSTROUTING -o br0 -j MASQUERADE" (manually)
2. Inserting "-A POSTROUTING -s 192.168.0.0/24 -d 192.168.0.255/32 -j 
ACCEPT"

(Of course correcting rules by hand is not a solution, just a test)

So the question is, what should the correct rules ideally look like? And is 
this issue known/fixed in the most current libvirt?


Except for putting the libvirt-added rules in their own private chains 
(appearing in libvirt 5.1.0, released on Feb 1, 2019), the iptables 
rules added by libvirt to support its virtual networks didn't materially 
change in > 10 years. Your email is the first time I've ever seen such 
rules attributed to libvirt so, as I said above, I think you need to 
take a deeper dive into your host system's config.



Good luck!


Re: [libvirt-users] Easy solution for custom firewall rules-

2019-06-03 Thread Laine Stump


On 6/2/19 10:02 PM, Joshua Kramer wrote:

Nakta wrote:

libvirt's nwfilter module can achieve that.


I read over those resources and I did what I thought would be correct,
but it's not having any effect.

I created a new nwfilter like this:

   
 
   
   
 
   
   
 
   
   
 
   
   
 
   


I then associated that filter with the Interface device on the VM
server within KVM... and shutdown/restart that VM.
  
   
   
   
   
   
 

After this, nothing happens.  I did 'ebtables --list', and the new
rules aren't there.


Try "ebtables -t nat -L", although as I said in the other message I just 
posted, it's not going to do what you need anyway, because these rules 
will be applied *in addition to* the network's iptables rules, not 
*instead of*.



Re: [libvirt-users] Easy solution for custom firewall rules- is it possible?

2019-06-03 Thread Laine Stump

> Am Donnerstag, den 30.05.2019, 21:44 -0400 schrieb Joshua Kramer:
>> Hello All-
>>
>> I've looked in several places and haven't found an answer to this
>> question: is it possible to have libvirt add custom rules to iptables
>> for virtual network interfaces?  I took a look at the "Firewall and
>> Network Filtering in Libvirt" page and it seems overly complicated
>> for
>> what I want to do.
>>
>> Given an interface virbr2 and its network 192.168.4.0/24, libvirt
>> installs the following rules in iptables.  Essentially, these rules
>> will drop any packets for the interface virbr2 where the source or
>> destination is not on the 192.168.4.0/24 network.
>>
>> -P FORWARD ACCEPT
>> -A FORWARD -d 192.168.4.0/24 -o virbr2 -j ACCEPT
>> -A FORWARD -s 192.168.4.0/24 -i virbr2 -j ACCEPT
>> -A FORWARD -i virbr2 -o virbr2 -j ACCEPT
>> -A FORWARD -o virbr2 -j REJECT --reject-with icmp-port-unreachable
>> -A FORWARD -i virbr2 -j REJECT --reject-with icmp-port-unreachable
>>
>> I have a VPN server on the 4/24 network- and it hands out addresses
>> in
>> the 8/24 network.  So I would like libvirt to also create the
>> following rules in iptables:
>>
>> -A FORWARD -d 192.168.8.0/24 -o virbr2 -j ACCEPT
>> -A FORWARD -s 192.168.8.0/24 -i virbr2 -j ACCEPT
>>
>> I've tried creating direct rules in firewalld for the FORWARD_direct
>> chain.  Firewalld happily creates those rules, but they are never
>> reached, because they fall AFTER the libvirt rules.  I've also tried
>> creating an IP address on the virbr2 interface in the 8/24 network,
>> but that doesn't work either.  How can I get this done?
>>

On 5/31/19 10:42 AM, nakata wrote:

Hi,

libvirt's nwfilter module can achieve that.

In general it is true that nwfilter can add iptables (or more commonly 
ebtables) rules. But I don't think it can do what is being requested by 
Joshua since those rules are not only specific to the guest interface, 
but also are applied *in addition to* the iptables rules added by the 
libvirt network, not *instead of*.

Much of what nwfilter does is via the ebtables "nat" table, not 
input/output/forward. So any rules you added using nwfilter would only 
be traversed by traffic going in or out of a particular guest's tap 
interface, and even after it passed that, it would *still* be subject to 
the iptables rules added by the network (which may be more restrictive - 
in this case, even though the packets with the source address on the VPN 
would be able to get past the guest's tap device onto the bridge, they 
would then be subject to the iptables rules when trying to leave the 
bridge).

I'm currently working on opt-out patches to disable that functionality
if wished. I also don't use firewalld.

What are you trying to "opt out" of? If your libvirt network has 
<forward mode='open'/> then there will be no iptables rules created at 
all. (Similarly, if you don't like the DHCP service that is set up for a 
libvirt virtual network, remove the <dhcp> section from the network's 
config before you start it, and if you don't like the provided DNS 
server, add <dns enable='no'/> to the network config).

(<forward mode='open'/> and <dns enable='no'/> have been in libvirt 
since v2.2.0. It has always been possible to disable the auto-configured 
dhcp server by removing the <dhcp> section).

It's both paternalizing and annoying and takes away user flexibility in
exchange for nothing.

libvirt's virtual networks aren't taking away flexibility for nothing. 
They are a convenience that was added specifically to make it as simple 
as possible to set up a usable network connection that fits the needs of 
95% of users. In order to make it as foolproof as possible, by design it 
has just a few presets with limited configuration (although the amount 
of configuation possible has grown a bit over the years as various 
things have shown themselves to be commonly needed). Anyone with needs 
more complex than what can be satisfied by libvirt's few preset modes 
and configuration knobs can easily setup their own bridge device, 
iptables rules, and dnsmasq instance outside of libvirt in the host 
system config (or you can turn various parts of it off, using the 
options described above, and still use the rest of what libvirt creates).

So, if you just need flexibility for iptables rules, use 
<forward mode='open'/>. If you need full flexibility, then don't use a 
libvirt virtual network - set up your own bridge, and configure the guest 
interfaces with <interface type='bridge'>.
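
For the "own bridge" case, the guest side could be as small as the sketch below. It assumes a host bridge named br0 that was created outside of libvirt; that name and the virtio model are placeholders rather than anything taken from this thread.

```xml
<interface type='bridge'>
  <!-- br0 is a hypothetical host bridge managed outside of libvirt -->
  <source bridge='br0'/>
  <!-- device model is an assumption; use whatever the guest supports -->
  <model type='virtio'/>
</interface>
```

With this kind of interface libvirt only attaches the guest's tap device to the named bridge; it does not create the bridge itself.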

anyways
Check the nwfilter page to write own filters for the beginning:
https://libvirt.org/formatnwfilter.html#nwfwrite

some more info:
https://www.redhat.com/archives/libvir-list/2010-June/msg00762.html

This is a very informative and useful email. For that reason, it was 
formatted and put into libvirt's official documentation here:

   https://libvirt.org/firewall.html

There have been a few changes/additions over the years (although not 
many!) so that is a better reference document.

(if that email is referenced somewhere in the wiki or something, we 
should change it to point to the docs).

Re: [libvirt-users] PCI passthrough and abstraction

2019-04-25 Thread Laine Stump


On 4/25/19 10:14 AM, Mauricio Tavares wrote:

So I am reading through , and am wondering what is the difference between




and




if I am trying to give full access to a NIC? Which one exposes more of the card?


I also answered this on IRC, but just in case someone is looking through 
the email archives and comes across this message:


There are a couple of differences between <hostdev> and 
<interface type='hostdev'>:


1) <hostdev> will work with "many" different devices (as long as they 
are a PCI endpoint device, and don't share an IOMMU group with other 
devices that must remain in use by the host), but 
<interface type='hostdev'> can only be used for a PCI device that is a 
Virtual Function (VF) of an SR-IOV capable network card (if you're unsure 
whether or not your device qualifies, then it almost definitely *doesn't*).


2) In preparation for using VFIO device assignment to assign the device 
to a guest, both types of device will be automatically unbound from 
their host driver and bound to the vfio-pci driver *if "managed='yes'" 
is set*. In addition, <interface type='hostdev'> will set the MAC 
address (and optionally the vlan tag) on the VF device before assigning 
it to the guest. This isn't done when you use <hostdev>, the result 
being that your guest will end up with a network device that has a 
random MAC address that is different each time it is started.


So if you are assigning VFs from an SR-IOV netcard, then you *really* 
want to use <interface type='hostdev'>. For all other cases of PCI 
device assignment (including netcards that aren't SR-IOV VFs), you 
*must* use <hostdev>.
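
To make the two forms concrete, here is a side-by-side sketch. The PCI addresses, MAC address, and vlan id are invented for illustration; substitute the values for your own hardware.

```xml
<!-- Generic PCI device assignment: works for any assignable PCI
     endpoint device, but libvirt sets no MAC address or vlan tag. -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <!-- hypothetical host PCI address of the device -->
    <address domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
  </source>
</hostdev>

<!-- SR-IOV VF assignment: libvirt sets the MAC address (and the
     optional vlan tag) on the VF before handing it to the guest. -->
<interface type='hostdev' managed='yes'>
  <source>
    <!-- hypothetical host PCI address of the VF -->
    <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x1'/>
  </source>
  <mac address='52:54:00:6d:90:02'/>
  <vlan>
    <tag id='42'/>
  </vlan>
</interface>
```

Only one of the two would be used for a given device; they are shown together here just for comparison.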


As for "managed='yes'" - if you are never using the device directly on 
the host, it's highly recommended that you bind the device to the 
vfio-pci driver during the host bootup, and set "managed='no'" in the 
XML - probably 80% or more of the bugs we encounter with device 
assignment happen when we're unbinding the device from the host net (or 
whatever) driver, and binding it to vfio-pci or vice-versa - all sorts 
of timing issues that are essentially out of our control.




On Wed, Apr 24, 2019 at 1:13 PM Mauricio Tavares  wrote:


When you pass a device in the pci chain (after virsh
nodedev-dettach'ing it from host) to the guest, how much is passed
without being emulated/abstracted?


___
libvirt-users mailing list
libvirt-users@redhat.com
https://www.redhat.com/mailman/listinfo/libvirt-users




Re: [libvirt-users] Network hooks for ethernet interfaces

2019-04-10 Thread Laine Stump


On 4/9/19 11:35 AM, Ruben Kerkhof wrote:

On Tue, Apr 9, 2019 at 5:10 PM Michal Privoznik  wrote:


On 4/9/19 4:38 PM, Ruben Kerkhof wrote:

Hi all,

I have a hook script, /etc/libvirt/hooks/network, that doesn't seem to
be called when I attach an interface with type 'ethernet' with this
xml snippet:


  
  
  
  


https://www.libvirt.org/hooks.html#intro says
"A network is started or stopped or an interface is plugged/unplugged
to/from the network (since 1.2.2)".

While I don't have a network defined in xml, I'd expect this to work
just as well for 'ethernet' type interfaces. Am I wrong?



Hotplugging an 'ethernet' type of interface doesn't really relate to any
libvirt network. Hence libvirt doesn't call the 'network' hook script. If
you continue reading you'll see what the 'network' hook is fed with
(on stdin): info on the domain in question AND the network where the event
occurred. But there is no network, is there?


No not in the libvirt sense there isn't, you're right.


But maybe you can work around this by waiting for
DEVICE_ADDED/DEVICE_REMOVED events? What is it that you're trying to solve?


I'd like to enable proxy_arp on the interface among other things.
I can easily do this from the same script that adds the interface
though, so I have a workaround, but a hook that triggers on all
interface events felt cleaner.


Also keep in mind that the hook scripts aren't an officially supported 
part of the API, and are thus liable to change without warning. As an 
example, danpb has proposed changing the network hook:


 https://www.redhat.com/archives/libvir-list/2019-March/msg01280.html

Once this goes in, any network hook script that uses the plugged and 
unplugged hooks will no longer work; you would instead need to use the 
port-created and port-deleted hooks.



Re: [libvirt-users] Guest interface names not same as configured name

2019-04-04 Thread Laine Stump


On 4/4/19 2:02 PM, PR PR wrote:

Hi,

I am creating a guest with following description for interfaces in the 
xml using virsh create xml command. For some reason, the guest interface 
names in the VM don't match the names specified in the xml. Is there a 
way to make guest interface names predictable?


For qemu (which you've indicated you're using) there isn't any way to 
set the name of the network device in the guest from the libvirt config. 
There is no visibility of the <guest> element into the guest 
OS; all that the guest can know is that there is an e1000 ethernet 
device plugged into PCI bus 0 slot 3, and it has MAC address 
52:54:00:17:0b:e7. Any determination of name must be done within the 
guest OS.
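
What the libvirt config *can* pin down are the properties the guest does see: the device model, the MAC address, and the PCI address. A sketch using the values mentioned above follows; the interface type and source network are assumptions, since the original XML did not survive in the archive.

```xml
<interface type='network'>
  <!-- assumed source; the original interface definition is not available -->
  <source network='default'/>
  <!-- fixed MAC address, visible to the guest -->
  <mac address='52:54:00:17:0b:e7'/>
  <model type='e1000'/>
  <!-- fixed guest PCI address: bus 0, slot 3 -->
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
```

The guest OS can then key its own naming rules on the MAC or PCI address, using whatever mechanism that OS provides.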


The <guest> element is only used by the LXC and openvz 
drivers. It really should be flagged as an error during validation for 
other drivers (patches welcome :-).





Following is the qemu version on host

dpkg --list | grep -i qemu
ii  qemu-kvm                              1:2.11+dfsg-1ubuntu7.10
         amd64        QEMU Full virtualization on x86 hardware


     
       
       
       
       
       function='0x0'/>

     
     
       
       
       
       
       
       function='0x0'/>

     

Thanks





Re: [libvirt-users] Error starting domain: internal error: Unable to add port vnet0 to OVS bridge br0

2019-03-26 Thread Laine Stump

I added libvirt-users@redhat.com back to the Cc for this response. 
Please don't remove the list address when responding to postings on a 
mailing list. A message to the list is *much* more likely to reach 
someone who knows the answer than is a private message to a single person.



On 3/26/19 10:03 AM, Harsh Gondaliya wrote:
Mar 26 19:25:01 dpdk-OptiPlex-5040 libvirtd.service: 20423: error : 
virCommandWait:2553 : internal error: Child process (ovs-vsctl 
--timeout=5 -- --if-exists del-port vnet0 -- add-port br0 vnet0 -- set 
Interface vnet0 'external-ids:attached-mac="52:54:00:90:c6:c3"' -- set 
Interface vnet0 
'external-ids:iface-id="a9700eff-03a7-4c47-a112-429fc20677a2"' -- set 
Interface vnet0 
'external-ids:vm-id="41b4eef0-b820-41da-9034-9de22e1379e0"' -- set 
Interface vnet0 external-ids:iface-status=active) unexpected exit status 
126:

*libvirt:  error : cannot execute binary ovs-vsctl: Permission denied*

Mar 26 19:25:01 dpdk-OptiPlex-5040 kernel: [ 1932.243181] audit: 
type=1400 audit(1553608501.701:59): apparmor="DENIED" operation="exec" 
profile="/usr/sbin/libvirtd" name="/usr/local/bin/ovs-vsctl" pid=20679 
comm="libvirtd" requested_mask="x" denied_mask="x" fsuid=0 ouid=0


AppArmor is prohibiting it for some reason. I don't run debian or 
ubuntu, so I don't have any idea how AppArmor works. Possibly someone 
else on the list knows (or maybe you could search for help on AppArmor 
somewhere).





Mar 26 19:25:01 dpdk-OptiPlex-5040 libvirtd.service: 20423: debug : 
virCommandRun:2280 : Result status 0, stdout: '' stderr: 'libvirt:  
error : cannot execute binary ovs-vsctl: Permission denied#012'
Mar 26 19:25:01 dpdk-OptiPlex-5040 libvirtd.service: 20423: error : 
virNetDevOpenvswitchAddPort:155 : internal error: Unable to add port 
vnet0 to OVS bridge br0
Mar 26 19:25:01 dpdk-OptiPlex-5040 NetworkManager[1096]:   
[1553608501.7126] devices removed (path: /sys/devices/virtual/net/vnet0, 
iface: vnet0)



libvirt does not have permissions to execute ovs-vsctl. How can I get 
this issue sorted out?


On Wed, Mar 20, 2019 at 12:10 AM Laine Stump <la...@redhat.com> wrote:


On 3/15/19 3:21 AM, Harsh Gondaliya wrote:
 > I have installed OVS from sources using the installation steps mentioned
 > on this link: http://docs.openvswitch.org/en/latest/intro/install/general/
 >
 > I had installed libvirt, KVM, QEMU and all the necessary packages using
 > apt-get. My KVM-QEMU hypervisor has been running well.
 >
 > To add a VM with the port attached to OVS bridge I changed the XML
 > domain file as per the instructions on this page:
 > http://docs.openvswitch.org/en/latest/howto/libvirt/
 >
 > But when I start the VM using the Virtual Machine Manager I get
 > the following error:
 > *Error starting domain: internal error: Unable to add port vnet0 to OVS
 > bridge br0*

libvirt is creating a tap device, then running ovs-vsctl to attempt to
attach it to the configured switch. To see what command is run, and what
error is output, add this to your /etc/libvirt/libvirtd.conf:

    log_filters="1:util.command 1:util.netdevopenvswitch"
    log_outputs="1:syslog:libvirtd.service"

and restart the libvirt service, then attempt to start your guest while
watching the system logs. You will see an ovs-vsctl command run by
virCommandRunAsync. That command and its output should give you a clue
to what is missing from the locally-built openvswitch vs the official
package installed with apt-get.


 > Traceback (most recent call last):
 >    File "/usr/share/virt-manager/virtManager/asyncjob.py", line 90, in cb_wrapper
 >      callback(asyncjob, *args, **kwargs)
 >    File "/usr/share/virt-manager/virtManager/asyncjob.py", line 126, in tmpcb
 >      callback(*args, **kwargs)
 >    File "/usr/share/virt-manager/virtManager/libvirtobject.py", line 83, in newfn
 >      ret = fn(self, *args, **kwargs)
 >    File "/usr/share/virt-manager/virtManager/domain.py", line 1402, in startup
 >      self._backend.create()
 >    File "/usr/lib/python2.7/dist-packages/libvirt.py", line 1035, in create
 >      if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)
 > libvirtError: internal error: Unable to add port vnet0 to OVS bridge br0
 >
 > My output for ovs-vsctl show:
 > 3c28f516-dd5c-43cf-bea1-7c068668d1f6
 >      Bridge "br0"
 >          Port "enp0s31f6"
 >              Interface "enp0s31f6"
 >          Port "br0"
 >

Re: [libvirt-users] vlan tagging for openVSwitch

2019-03-14 Thread Laine Stump


On 3/13/19 6:52 AM, lejeczek wrote:

hi everyone,

I'm trying to get vlans tagged in libvirt as my switch's end (yes
traffic will be leaving the host and into network switches) allows only
tagged vlans.

But with network as such:

...

   
     



I responded to the bug you filed at bugzilla.redhat.com, but I'll 
respond here too in case someone comes across this message in the future.


If you want untagged traffic from the guest to be tagged as it is going 
onto the OVS switch, then you do not want "trunk='yes'" here. Either set 
trunk='no', or just leave it out.


If you set trunk='yes' then (as I understand it) traffic tagged with id 
55 will be allowed through the port, but the tag won't be removed or 
added in either direction.
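
For reference, a network definition along those lines might look like the sketch below. The bridge name ovsbr0 and tag id 55 come from this thread; the network name and portgroup name are placeholders, and the overall layout is an assumption because the original XML was lost from the archive.

```xml
<network>
  <!-- hypothetical network name -->
  <name>ovs-net</name>
  <forward mode='bridge'/>
  <bridge name='ovsbr0'/>
  <virtualport type='openvswitch'/>
  <!-- hypothetical portgroup name -->
  <portgroup name='vlan-55' default='yes'>
    <!-- no trunk attribute: untagged guest traffic is tagged with
         id 55 as it goes onto the OVS switch -->
    <vlan>
      <tag id='55'/>
    </vlan>
  </portgroup>
</network>
```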




   
     
   


and guest as:

     
   
   
   
     

When the guest is fully initialized vSwitch shows:

...

_uuid   : b3c130db-fa84-49f8-9cf5-824ec8cf3b81
bond_downdelay  : 0
bond_fake_iface : false
bond_mode   : []
bond_updelay    : 0
external_ids    : {}
fake_bridge : false
interfaces  : [35c0a914-a21a-43d7-9f63-adacffbb62bc]
lacp    : []
mac : []
name    : "ovsbr0"
other_config    : {}
qos : []
statistics  : {}
status  : {}
tag : []
trunks  : []
vlan_mode   : []

No tags, no trunks, no vlan mode???

Is there something I missed (in docs though I sroogled extensively)?

I also tried to add mode='trunk' into <vlan> and virsh does not
complain but next time I edit the guest the mode bit is gone.


There is no such attribute "mode='trunk'". The accepted attributes for 
the <vlan> element can be found at 
https://libvirt.org/formatnetwork.html - search for "vlan" within that page.




My vSwitch's bridge has only one phys iface (into the net switch) and I
tried setting that iface with tag/no tag, with vlan_mode/no vlan_mode
but if guest is up with above libvirt's vSwitch initialization then
guest cannot ping net switch no matter the setting for phys iface.

I'm on Centos 7.6 with libvirt-4.5.0-10.el7_6.4.x86_64 &
openvswitch-2.0.0-7.el7.x86_64.

What can be the problem here?

many thanks, L.











Re: [libvirt-users] How to insert a dummy NIC

2019-03-11 Thread Laine Stump


On 3/11/19 5:05 AM, wf...@niif.hu wrote:

Hi,

I have to host (with KVM) an appliance which does not use its second and
third NIC.  They have to be present in the guest, but they'd better stay
totally disconnected from anything in the host.  "Second" and "third"
apparently means bus order.  Let's consider virtio devices only.  I think
the best technical solution is adding -device virtio-net-pci,addr=0x3 and
similar options to the KVM command line, without any corresponding
-netdev options (better ideas welcome).  QEMU emits "Warning: nic
virtio-net-pci.2 has no peer" messages, but that's expected.  I can even
do this much using the <qemu:commandline> element, but libvirt assigns
the 0x3 address to other virtio devices, leading to collision.  Is there
a way to "reserve" a bus address for such manually added devices without
assigning explicit addresses to all other devices in the configuration?



I think qemu is going to be upset by anything that has no backend to the 
emulated device.


As for libvirt reserving addresses that (from its point of view) are 
otherwise unused - no, there's no way to do that; if a PCI address isn't 
used by a device in the libvirt config, it is considered fair game for 
assigning to a new device, and we've never considered such an option.




Things I also tried (and found inadequate):

* Using "generic ethernet connection" for the dummy NICs.  Close, but
   requires extra permissions for accessing /dev/net/tun, and technically
   feels a little inferior to using a peerless network device like above.


What version is your libvirt? Extra permissions for qemu using 
type='ethernet' (beyond what's required for a type='network' or 
type='bridge') have not been required since libvirt-1.3.3, released on 
April 6, 2016 (this was the result of commit 9c17d665f). If your libvirt 
is that old, you *really* should update to something newer. If it's 
*not* that old, then you're just working with out of date documentation.





* TCP tunnel server.  Even more inferior, does not require extra
   permissions but leaves even looser ends (listening sockets).  Also, the
   RelaxNG grammar does not let me specify a model for this interface
   type, so maintaining bus order with respect to the virtio interfaces is
   impossible.  A grammar bug?

* Using a dummy VLAN in the bridge.  This is what I temporarily settled
   for, but this requires global agreement and still technically inferior,
   so I'd like to move away.

* A <network> without forwarding.  Still inferior, and also requires
   configuration sharing across the host cluster.



Does it matter if the interface is online or not?

I would recommend using an expansion of this:

   
  
  ...

Your guest will have a network device in the desired position, qemu will 
be satisfied that the device has a backend, libvirt will know there is a 
device there so it won't give the PCI address away to somebody else, and 
the tap device will be IFF_DOWN, so there will be no possibility of 
network traffic accidentally leaking into the host (which would already 
be nearly impossible unless someone separately assigned an IP address to 
the host side of the tap device).
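
The XML that accompanied this recommendation was lost from the archive. As a rough guess at its shape, assuming an interface of type='ethernet' with no script configured, it could be expanded into something like this. The slot number comes from the addr=0x3 mentioned in the question; the <link> element is my own addition and was not necessarily in the original.

```xml
<interface type='ethernet'>
  <model type='virtio'/>
  <!-- assumption, not from the original message: also present the
       link to the guest as down -->
  <link state='down'/>
  <!-- explicit guest PCI address so the device keeps its bus position -->
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
```

Because the device is part of the libvirt config, its PCI address is accounted for when libvirt assigns addresses to other devices.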



Re: [libvirt-users] libvirt 5.0.0 - LXC container still in "virsh list" output after shutdown

2019-02-25 Thread Laine Stump


On 2/25/19 11:12 AM, mxs kolo wrote:

https://bugzilla.redhat.com/show_bug.cgi?id=1681180


Since you are able to post to a mailing list, and also know what the 
problem is and how to patch it, how about sending a patch (based on git 
master of libvirt source, and using git send-email) to 
libvir-l...@redhat.com?


  https://libvirt.org/hacking.html#patches


Re: [libvirt-users] script called from qemu hook freezes.

2019-01-04 Thread Laine Stump

On 1/4/19 11:27 AM, daggs wrote:

Greetings Peter,

Sent: Friday, January 04, 2019 at 4:47 PM
From: "Peter Krempa" 
To: daggs 
Cc: libvirt-users@redhat.com
Subject: Re: [libvirt-users] script called from qemu hook freezes.

On Thu, Jan 03, 2019 at 18:07:58 +0100, daggs wrote:

Greetings,

I'm executing an external script when the qemu hook is called with start or 
release, the script is rather simple, upon start it iterates over the output of 
lsusb -t and for each device, it looks if it should be added to the vm we 
started, if so, it attaches it to the vm as follows:
virsh --connect qemu:///system "${cmd}" "${domain}" /dev/stdin << END

END

where cmd is attach-device, domain is the vm's name, busnum and devnum come 
from the output of the lsusb -t.
my issue is that upon the first attach attempt, the cmd hangs, I need to kill 
it and after that I cannot perform any virsh cmd, I must restart the host.
if I try to execute the same cmd after the vm is up, it works great.

why does the attach process get stuck? do I need to execute it at a different 
stage?

Hook scripts shall never call any libvirt API (even through virsh). At
the point when the hook script is called the VM startup process is
paused until the script returns. If the script attempts to modify the VM
it gets stuck as the VM is locked at that point.

You either need to add the device prior to startup, but AFAIK that is
not possible with a hook script or after but the script needs to return.

So you either fork off a process which will wait for the startup to
finish from the hook script or write an APP using the libvirt API which
will wait for the VM start event and then execute what's necessary.

I see, is there another way to do what I need (on startup usb hotplug) or maybe 
optional hotplug (e.g. don't fail if the device isn't present)?

I've never used it, but does the "startupPolicy" attribute do what you 
need? Search for that string at:

   https://libvirt.org/formatdomain.html#elementsHostDevSubsys
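
For a USB host device, that attribute goes on the <source> element. A minimal sketch follows, with made-up vendor and product ids; I haven't tested this myself, so treat it as a starting point only.

```xml
<hostdev mode='subsystem' type='usb' managed='yes'>
  <!-- startupPolicy='optional': the device is allowed to be absent
       when the guest starts -->
  <source startupPolicy='optional'>
    <!-- hypothetical ids; take the real ones from lsusb on the host -->
    <vendor id='0x1234'/>
    <product id='0xbeef'/>
  </source>
</hostdev>
```

This would go in the persistent domain config, so nothing needs to be attached from a hook script at startup.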


Re: [libvirt-users] assigning PCI addresses with bus > 0x09

2019-01-04 Thread Laine Stump


On 1/3/19 5:22 AM, Riccardo Ravaioli wrote:
On Thu, 20 Dec 2018 at 15:39, Laine Stump <la...@redhat.com> wrote:


I think you're right. Each bus requires some amount of IO space, and I
thought I recalled someone saying that all of the available IO space is
exhausted after 7 or 8 buses. [...]


Laine,

Do you have by any chance a link to a page explaining this in more details?
Thanks again! :)


No, sorry. There are mentions of it in bugzilla records (e.g. 
https://bugzilla.redhat.com/show_bug.cgi?id=106 ) but all the info I 
have is just recalled from email and irc conversations over the last 3-4 
years. Basically the amount of IO address space is limited, and SeaBIOS 
allocates some minimum-sized chunk of that for each PCI controller that 
is probed and "seems to need IO address space for its devices". After 8 
or so controllers, all the space is used up.


You can avoid the IO address space limit if you're using a PCI Express 
based machinetype, *and* all PCIe devices, but you'll run up against 
different limits that need to be worked around in a different way - PCIe 
devices are required to be usable with no IO address space (and so QEMU 
specifically creates the PCIe controllers beyond pcie-root without any). 
But each PCIe controller only has a single slot (with 8 functions), each 
consumes a "bus number" and the bus number is an 8-bit value, so you're 
limited to 256 total (x8 if you don't need hotplug and don't mind 
manually assigning addresses).


