On 8/26/25 12:11 PM, Thomas Lamprecht wrote:
On 12/08/2025 11:59, Dominik Csapak wrote:
By default, PCI devices are bound to the 'vfio-pci' driver and reset.
For most devices this is necessary, but there are a few exceptions,
e.g.:
* some Mellanox NICs have support for the driver 'mlx5_vfio_pci'
* Intel Flex GPUs have support for 'i915_vfio_pci'
* (maybe some more I don't know about)
Both of these drivers take over the role of the vfio-pci driver themselves,
so no rebinding or resetting is necessary. Those drivers usually have more
functionality than the default vfio driver, like support for
live-migration.
To be able to configure that on our side, introduce the 'keep-driver'
option for 'hostpciX', which will not rebind/reset the device.
Signed-off-by: Dominik Csapak <d.csa...@proxmox.com>
---
Sending as RFC, since I'm not sure if we want to go with this (generic)
approach, or if we e.g. want to make special configs/cases for drivers we
know. The pro of this approach is that we don't have to add more drivers in
the future; the con is that it has some potential to confuse users when
it does not work the way they thought it would.
The main question for whether this approach is OK is whether we
ever want to support loading a specific driver explicitly.
If that is very unlikely, we can go this exact route; otherwise we could at
least prepare for that possibility while still avoiding the need for
a specific driver list, e.g. by using an option like:
driver=<vfio|keep>
Where vfio is the default.
No hard feelings though, we can still transform a keep-driver
option to such an option in the future with the small cost of
backward compat, but if you see no downside to the above approach,
and especially if you could imagine us loading a specific driver
already, then it might be good to go for that route right away.
If not, I can apply that patch as is, albeit in that case I'd want
a followup (see below).
Your suggestion with driver=... makes total sense and is easily
extensible, so I'll do that for the next version.
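
For illustration only, a rough sketch of how the driver=<vfio|keep> idea
suggested above could be modelled. This is not the next version of the patch:
the option name 'driver', the helper names parse_hostpci_sketch and
needs_vfio_rebind, and the simplified property-string parsing are all
assumptions; the real code would go through PVE::JSONSchema and
prepare_pci_device instead.

```perl
use strict;
use warnings;

# Hypothetical schema fragment for a 'driver' option (assumption, not part
# of the posted patch). 'vfio' keeps today's behavior, 'keep' leaves the
# currently bound driver alone.
my $driver_opt = {
    type => 'string',
    enum => ['vfio', 'keep'],
    optional => 1,
    default => 'vfio',
};

# Stand-in for the real property-string parser: splits 'a,b=c' style
# strings, treating a bare value as the 'host' key.
sub parse_hostpci_sketch {
    my ($value) = @_;
    my $res = {};
    for my $part (split(/,/, $value)) {
        if ($part =~ m/^([^=]+)=(.*)$/) {
            $res->{$1} = $2;
        } else {
            $res->{host} = $part;
        }
    }
    # apply the default and validate against the enum
    my $driver = $res->{driver} // $driver_opt->{default};
    die "invalid driver '$driver'\n"
        if !grep { $_ eq $driver } @{ $driver_opt->{enum} };
    $res->{driver} = $driver;
    return $res;
}

# True if the device should be rebound to vfio-pci and reset.
sub needs_vfio_rebind {
    my ($device) = @_;
    return (($device->{driver} // 'vfio') eq 'vfio') ? 1 : 0;
}
```

With this shape, an existing config without the option would still get the
vfio default, and further enum values could be added later without a new
option.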
src/PVE/QemuServer/PCI.pm | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/src/PVE/QemuServer/PCI.pm b/src/PVE/QemuServer/PCI.pm
index e7a9a610..84a56998 100644
--- a/src/PVE/QemuServer/PCI.pm
+++ b/src/PVE/QemuServer/PCI.pm
@@ -124,6 +124,13 @@ EODESCR
optional => 1,
description => "Override PCI subsystem device ID visible to guest",
},
+ 'keep-driver' => {
+ type => 'boolean',
+ optional => 1,
+ default => 0,
+ description => "If this is set, does not bind the device to vfio-pci and does not reset"
+ . "the device. Useful for VF that already have the correct driver loaded.",
"does not" sounds a bit odd to me here, maybe rather something like:
'If set, the device will neither be bound to vfio-pci nor reset. This is useful
for VF devices that already have the correct driver loaded.'
+ },
};
PVE::JSONSchema::register_format('pve-qm-hostpci', $hostpci_fmt);
@@ -736,7 +743,7 @@ sub prepare_pci_device {
if !PVE::SysFSTools::check_iommu_support();
die "no pci device info for device '$pciid'\n" if !$info;
- if ($device->{nvidia}) {
+ if ($device->{nvidia} || $device->{'keep-driver'}) {
I'd encourage adding (or extending) a cfg2cmd test for new options, even
if it doesn't allow full coverage it can still be useful to have to
catch more regressions (especially with perl).
Of course. I did not do it for an RFC because I did not expect it to go
in like this without discussion/refinement anyway. The next version
will have cfg2cmd tests (though the option does not change anything on the
command line currently, so the tests will only check that the config
is parseable).
# nothing to do
} elsif (my $mdev = $device->{mdev}) {
my $uuid = generate_mdev_uuid($vmid, $index);
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel