On 4/15/25 06:05, Alex Williamson wrote:
> On Mon, 14 Apr 2025 01:23:56 +0800
> Tomita Moeko <tomitamo...@gmail.com> wrote:
>
>> On 4/10/25 15:34, Cédric Le Goater wrote:
>>> + Corvin
>>>
>>> On 4/9/25 19:18, Alex Williamson wrote:
>>>> On Wed, 26 Mar 2025 01:22:39 +0800
>>>> Tomita Moeko <tomitamo...@gmail.com> wrote:
>>>>
>>>>> So far, all Intel VGA adapters, including discrete GPUs like A770 and
>>>>> B580, were treated as IGD devices. While this had no functional impact,
>>>>> an error about "unsupported IGD device" is printed when passing through
>>>>> Intel discrete GPUs.
>>>>>
>>>>> Since IGD devices must be at "0000:00:02.0", let's check the host PCI
>>>>> address when probing.
>>>>>
>>>>> Signed-off-by: Tomita Moeko <tomitamo...@gmail.com>
>>>>> ---
>>>>>  hw/vfio/igd.c | 23 +++++++++--------------
>>>>>  1 file changed, 9 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c
>>>>> index 265fffc2aa..ff250017b0 100644
>>>>> --- a/hw/vfio/igd.c
>>>>> +++ b/hw/vfio/igd.c
>>>>> @@ -53,6 +53,13 @@
>>>>>   * headless setup is desired, the OpRegion gets in the way of that.
>>>>>   */
>>>>> +static bool vfio_is_igd(VFIOPCIDevice *vdev)
>>>>> +{
>>>>> +    return vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) &&
>>>>> +           vfio_is_vga(vdev) &&
>>>>> +           vfio_pci_host_match(&vdev->host, "0000:00:02.0");
>>>>> +}
>>>>
>>>> vfio-pci devices can also be specified via sysfsdev= rather than host=,
>>>> so at a minimum I think we'd need to test against vdev->vbasedev.name,
>>>> as other callers of vfio_pci_host_match do.  For example, building a
>>>> local PCIHostDeviceAddress and comparing it to name.  This is also not
>>>> foolproof though if we start taking advantage of devices passed by fd.
>>>>
>>>> Could we instead rely on PCIe capabilities?  A discrete GPU should
>>>> identify as either an endpoint or legacy endpoint, and IGD should
>>>> identify as a root complex integrated endpoint, or maybe older versions
>>>> would lack the PCIe capability altogether.
>>>
>>> Is maintaining a list of PCI IDs for Intel GPU devices, as Corvin was
>>> proposing in [1], not a viable solution?
>>>
>>> Thanks,
>>>
>>> C.
>>>
>>> [1] https://lore.kernel.org/qemu-devel/20250206121341.118337-1-corvin.koe...@gmail.com/
>>
>> I checked the Intel docs; maintaining a device ID list is probably the
>> only possible way. But given that Intel is moving to the xe driver, the
>> notion of "generation" becomes unclear, so I'd like to propose a list
>> with quirk flags for IGD:
>>
>> static const struct igd_device igd_devices[] = {
>>     INTEL_SNB_IDS(IGD_DEVICE, OPREGION_QUIRK | BDSM_QUIRK),
>>     INTEL_TGL_IDS(IGD_DEVICE, OPREGION_QUIRK | BDSM64_QUIRK),
>> };
>>
>> Matching against the list is more time-consuming than the current
>> switch-case, so it's better to add a new field to cache the result.
>>
>> I will go with Corvin's first two patches, with the reordering suggested
>> by Cornelia.
>
> If I recall the discussion correctly, Corvin's series was mapping device
> IDs to generations, where I had the concern that it creates ongoing
> overhead to sync with the i915 driver to create new mappings.  There
> was a suggestion that newer hardware has a register that reports the
> generation, so maybe we only need to maintain the mapping table up to
> the point where we can rely on getting the generation information from
> hardware (with the massive caveat that Intel could drop that generation
> register in the future, or maybe already has).
>
> The above table, however, suggests yet another use case for the table: a
> mapping of quirks to specific devices.  It seems this once again
> introduces the maintenance issue.  Why would it not be sufficient to
> determine the quirks based on the generation alone?  Thanks,
>
> Alex
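[For reference, Alex's device/port-type idea could look roughly like the sketch below. It is a self-contained illustration, not QEMU code: `pcie_device_type` and the raw config-space-snapshot interface are made up for the example. The type encodings themselves come from the PCIe spec (0x0 = PCIe endpoint, 0x1 = legacy endpoint, 0x9 = root complex integrated endpoint); a device with no PCI Express capability yields -1.]

```c
#include <assert.h>
#include <stdint.h>

#define PCI_STATUS          0x06
#define PCI_STATUS_CAP_LIST 0x10
#define PCI_CAPABILITY_LIST 0x34
#define PCI_CAP_ID_EXP      0x10

/* Device/Port Type encodings (PCIe Capabilities register, bits 7:4) */
#define PCI_EXP_TYPE_ENDPOINT 0x0  /* PCIe endpoint (discrete GPU)        */
#define PCI_EXP_TYPE_LEG_END  0x1  /* legacy endpoint                     */
#define PCI_EXP_TYPE_RC_END   0x9  /* root complex integrated endpoint    */

/*
 * Walk the capability list in a raw 256-byte config space snapshot and
 * return the PCIe device/port type, or -1 if there is no PCI Express
 * capability at all (possible on very old IGD).
 */
static int pcie_device_type(const uint8_t *cfg)
{
    uint8_t pos;
    int ttl = 48; /* guard against malformed capability loops */

    if (!(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST)) {
        return -1;
    }
    for (pos = cfg[PCI_CAPABILITY_LIST] & ~3; pos && ttl--;
         pos = cfg[pos + 1] & ~3) {
        if (cfg[pos] == PCI_CAP_ID_EXP) {
            uint16_t flags = cfg[pos + 2] | (cfg[pos + 3] << 8);
            return (flags >> 4) & 0xf;
        }
    }
    return -1;
}
```

[On the A770 config dump attached at the end of this mail, the capability pointer at 0x34 is 0x40 (vendor-specific), chaining to the Express capability at 0x70 with flags 0x0002, i.e. type 0, a plain endpoint, matching the lspci "Express (v2) Endpoint" line.]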
The generation register also exists on discrete GPUs. In the new xe
driver [1], the Battlemage discrete GPU shares the same logic for reading
the GMD_ID_DISPLAY register; the driver itself uses an is_dgfx flag
mapped from the device ID.

In QEMU, we need to know whether the device is a supported IGD device
before applying any IGD-specific quirk, especially for legacy mode. The
most feasible way, I think, is to check whether the kernel exposes
VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION on the device, as only IGD has an
OpRegion.

The i915 driver [2] and the Arrow Lake datasheet [3] show that Intel has
removed the BDSM register by making the DSM range part of BAR2 from
Meteor Lake onwards. QEMU only needs to quirk that register for IGD
devices up to Raptor Lake, meaning the device list is fixed for now.

By the way, I think we should only support legacy mode up to Gen 9, as
Intel only provides VBIOS/CSM support up to that generation, and SeaBIOS
cannot handle the 64-bit BDSM register. I'm also wondering whether VGA
really works on newer generations.

Maybe we can continue with the current igd_gen, but implement logic like:

    if (!intel graphics)
        return;
    if (!has VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION)
        return;
    setup_opregion();   // make x-igd-opregion automatically enabled?
    if (gen <= 9)
        setup_legacy_mode();
    if (gen >= 6 && gen <= 9)
        setup_32bit_bdsm();
    else if (gen >= 9 && gen <= 12)
        setup_64bit_bdsm();
    // ...
    // optional quirks like lpc bridge id

A table can also be used to precisely track all the Gen 6-12 devices.
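[For what it's worth, that flow condenses into a small compilable sketch. Everything here is hypothetical naming for illustration, not existing QEMU code: `igd_quirk_flags` and the flag constants are made up, `has_opregion` stands in for probing VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION, and the generation boundaries simply mirror the pseudocode above.]

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical quirk flags, mirroring the proposed flow */
enum {
    IGD_QUIRK_OPREGION    = 1 << 0, /* expose the OpRegion to the guest    */
    IGD_QUIRK_LEGACY_MODE = 1 << 1, /* VBIOS/CSM support, Gen 9 and older  */
    IGD_QUIRK_BDSM_32     = 1 << 2, /* 32-bit BDSM register (Gen 6-9)      */
    IGD_QUIRK_BDSM_64     = 1 << 3, /* 64-bit BDSM register (up to Gen 12) */
};

/*
 * Map a device generation to the quirks that apply.  'has_opregion'
 * stands in for checking that the kernel exposes the IGD OpRegion
 * region, which it only does for IGD devices.
 */
static uint32_t igd_quirk_flags(int gen, bool has_opregion)
{
    uint32_t flags = 0;

    if (!has_opregion) {
        return 0; /* not an IGD device, apply nothing */
    }
    flags |= IGD_QUIRK_OPREGION;

    if (gen <= 9) {
        flags |= IGD_QUIRK_LEGACY_MODE;
    }
    if (gen >= 6 && gen <= 9) {
        flags |= IGD_QUIRK_BDSM_32;
    } else if (gen >= 9 && gen <= 12) {
        flags |= IGD_QUIRK_BDSM_64;
    }
    return flags;
}
```

[A real implementation would compute this once at probe time and cache it in a field on the device, as suggested earlier in the thread, rather than re-deriving it at each quirk site.]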
[1] https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/xe/xe_pci.c#L630
[2] https://github.com/torvalds/linux/blob/69b8923f5003664e3ffef102e73333edfa2abdcf/drivers/gpu/drm/i915/gem/i915_gem_stolen.c#L918
[3] https://edc.intel.com/content/www/us/en/design/publications/core-ultra-p200s-series-processors-cfg-mem-registers/d2-f0-processor-graphics-registers/

Thanks,
Moeko

Attached is a config space dump of an Intel A770 discrete GPU for reference:

03:00.0 VGA compatible controller: Intel Corporation DG2 [Arc A770] (rev 08) (prog-if 00 [VGA controller])
	Subsystem: Intel Corporation Device 1020
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin ? routed to IRQ 181
	IOMMU group: 19
	Region 0: Memory at 81000000 (64-bit, non-prefetchable) [size=16M]
	Region 2: Memory at 6000000000 (64-bit, prefetchable) [size=16G]
	Expansion ROM at 82000000 [disabled] [size=2M]
	Capabilities: [40] Vendor Specific Information: Len=0c <?>
	Capabilities: [70] Express (v2) Endpoint, IntMsgNum 0
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0W TEE-IO-
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes, LnkDisable- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1
			TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp+ 10BitTagReq+ OBFF Not Supported, ExtFmt+ EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS- TPHComp- ExtTPHComp-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
			 AtomicOpsCtl: ReqEn-
			 IDOReq- IDOCompl- LTR+ EmergencyPowerReductionReq-
			 10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
		LnkCap2: Supported Link Speeds: 2.5GT/s, Crosslink- Retimer- 2Retimers- DRS-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
			 EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit+
		Address: 00000000fee008b8  Data: 0000
		Masking: 00000000  Pending: 00000000
	Capabilities: [d0] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
		ARICap:	MFVC- ACS-, Next Function: 0
		ARICtl:	MFVC- ACS-, Function Group: 0
	Capabilities: [420 v1] Physical Resizable BAR
		BAR 2: current size: 16GB, supported: 256MB 512MB 1GB 2GB 4GB 8GB 16GB
	Capabilities: [400 v1] Latency Tolerance Reporting
		Max snoop latency: 15728640ns
		Max no snoop latency: 15728640ns
	Kernel driver in use: i915
	Kernel modules: i915, xe

00: 86 80 a0 56 07 00 10 00 08 00 00 03 10 00 00 00
10: 04 00 00 81 00 00 00 00 0c 00 00 00 60 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 20 10
30: 00 00 00 82 40 00 00 00 00 00 00 00 ff 00 00 00
40: 09 70 0c 01 03 00 00 00 00 00 00 00 00 00 00 00
50: c0 f5 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 10 ac 02 00 20 80 00 10 10 09 08 00 11 0c 40 00
80: 00 00 11 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 12 08 13 00 00 04 00 00 02 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 05 d0 81 01
b0: b8 08 e0 fe 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 01 00 03 48 08 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 47 01 c0 ff 03 00 00 00 00 00 00 00 00 00 00 00