On 4/15/25 06:05, Alex Williamson wrote:
> On Mon, 14 Apr 2025 01:23:56 +0800
> Tomita Moeko <tomitamo...@gmail.com> wrote:
> 
>> On 4/10/25 15:34, Cédric Le Goater wrote:
>>> + Corvin
>>>
>>> On 4/9/25 19:18, Alex Williamson wrote:  
>>>> On Wed, 26 Mar 2025 01:22:39 +0800
>>>> Tomita Moeko <tomitamo...@gmail.com> wrote:
>>>>  
>>>>> So far, all Intel VGA adapters, including discrete GPUs like A770 and
>>>>> B580, were treated as IGD devices. While this had no functional impact,
>>>>> a error about "unsupported IGD device" will be printed when passthrough
>>>>> Intel discrete GPUs.
>>>>>
>>>>> Since IGD devices must be at "0000:00:02.0", let's check the host PCI
>>>>> address when probing.
>>>>>
>>>>> Signed-off-by: Tomita Moeko <tomitamo...@gmail.com>
>>>>> ---
>>>>>   hw/vfio/igd.c | 23 +++++++++--------------
>>>>>   1 file changed, 9 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c
>>>>> index 265fffc2aa..ff250017b0 100644
>>>>> --- a/hw/vfio/igd.c
>>>>> +++ b/hw/vfio/igd.c
>>>>> @@ -53,6 +53,13 @@
>>>>>    * headless setup is desired, the OpRegion gets in the way of that.
>>>>>    */
>>>>>   +static bool vfio_is_igd(VFIOPCIDevice *vdev)
>>>>> +{
>>>>> +    return vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) &&
>>>>> +           vfio_is_vga(vdev) &&
>>>>> +           vfio_pci_host_match(&vdev->host, "0000:00:02.0");
>>>>> +}  
>>>>
>>>> vfio-pci devices can also be specified via sysfsdev= rather than host=,
>>>> so at a minimum I think we'd need to test against vdev->vbasedev.name,
>>>> as other callers of vfio_pci_host_match do.  For example building a
>>>> local PCIHostDeviceAddress and comparing it to name.  This is also not
>>>> foolproof though if we start taking advantage of devices passed by fd.
>>>>
>>>> Could we instead rely PCIe capabilities?  A discrete GPU should
>>>> identify as either an endpoint or legacy endpoint and IGD should
>>>> identify as a root complex integrated endpoint, or maybe older versions
>>>> would lack the PCIe capability altogether.  
>>>
>>> Maintaining a list of PCI IDs for Intel GPU devices as Corvin was
>>> proposing in [1] is not a viable solution ?
>>>
>>> Thanks,
>>>
>>> C.
>>>
>>> [1] 
>>> https://lore.kernel.org/qemu-devel/20250206121341.118337-1-corvin.koe...@gmail.com/
>>>   
>>
>> I checked Intel doc, probably maintaining an device ID list is the only
>> possible way. But given that intel is moving to xe driver, generation
>> becomes unclear, I'd like to propose a list with quirk flags for igd.
>>
>> static const struct igd_device igd_devices[] = {
>>     INTEL_SNB_IDS(IGD_DEVICE, OPREGION_QUIRK | BDSM_QUIRK),
>>     INTEL_TGL_IDS(IGD_DEVICE, OPREGION_QUIRK | BDSM64_QUIRK),
>> }
>>
>> Matching in the list is more time consuming than current switch-case,
>> it's better to have a new field to cache it.
>>
>> I will go with Corvin's first 2 patches with reordering suggested by
>> Cornelia.
> 
> If I recall the discussion correctly, Corvin's series was mapping device
> IDs to generation, where I had the concern that it creates ongoing
> overhead to sync with the i915 driver to create new mappings.  There
> was a suggestion that newer hardware has a register that reports the
> generation, so maybe we only need to manage creating the mapping table
> up to the point we can rely on getting the generation information from
> hardware (with the massive caveat that Intel could drop that generation
> register in the future, or maybe already has).
> 
> The above table however suggests yet another use case of the table, a
> mapping of quirks to specific devices.  It seems this once again
> introduces the maintenance issue.  Why would it not be sufficient to
> determine the quirks based on the generation alone?  Thanks,
> 
> Alex

The generation register also exists on discrete GPUs. In the new xe
driver [1], the Battlemage discrete GPU shares the same logic reading
GMD_ID_DISPLAY register. The driver itself uses is_dgfx bit mapped to
device id. In QEMU, we need to know whether the device is a supported
IGD device first before applying the IGD-specific quirk, especially
for legacy mode.

The most feasible way is to check if kernel exposes VFIO_REGION_SUBTYPE_
INTEL_IGD_OPREGION on that device I think, as only IGD has OpRegion.

i915 driver [2] and Arrow Lake datasheet [3] shows that Intel has
removed the BDSM register by making the DSM range part of BAR2 since
Meteor Lake and onwards. QEMU only need to quirk on the register for
IGD devices until Raptor Lake, meaning that the device list is fixed
for now.

By the way, for legacy mode, I think we should only support it until
Gen 9, as Intel only provide VBIOS or CSM support until that generation,
and seabios cannot handle 64 bit BDSM register. I'm also wondering if
VGA really works on newer generations.

Maybe we can continue with current igd_gen, but implement a logic like:
    if (!intel graphics)
        return;
    if (!has VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION)
        return;
    setup_opregion();  // make x-igd-opregion automatically enabled?
    if (gen <= 9)
        setup_legacy_mode();
    if (gen >= 6 && gen <=9)
        setup_32bit_bdsm():
    else if (gen >= 9 && gen <= 12)
        setup_64bit_bdsm();
    // ...
    // optional quirks like lpc bridge id

A table can also be used to precisely track all the gen 6-12 devices.

[1] 
https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/xe/xe_pci.c#L630
[2] 
https://github.com/torvalds/linux/blob/69b8923f5003664e3ffef102e73333edfa2abdcf/drivers/gpu/drm/i915/gem/i915_gem_stolen.c#L918
[3] 
https://edc.intel.com/content/www/us/en/design/publications/core-ultra-p200s-series-processors-cfg-mem-registers/d2-f0-processor-graphics-registers/

Thanks,
Moeko

Attached a config space dump of Intel A770 discrete GPU for reference

03:00.0 VGA compatible controller: Intel Corporation DG2 [Arc A770] (rev 08) 
(prog-if 00 [VGA controller])
        Subsystem: Intel Corporation Device 1020
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- 
<MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin ? routed to IRQ 181
        IOMMU group: 19
        Region 0: Memory at 81000000 (64-bit, non-prefetchable) [size=16M]
        Region 2: Memory at 6000000000 (64-bit, prefetchable) [size=16G]
        Expansion ROM at 82000000 [disabled] [size=2M]
        Capabilities: [40] Vendor Specific Information: Len=0c <?>
        Capabilities: [70] Express (v2) Endpoint, IntMsgNum 0
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, 
L1 <1us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ 
SlotPowerLimit 0W TEE-IO-
                DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 128 bytes, MaxReadReq 128 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq+ AuxPwr- 
TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit 
Latency L0s <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, LnkDisable- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1
                        TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR+
                         10BitTagComp+ 10BitTagReq+ OBFF Not Supported, ExtFmt+ 
EETLPPrefix-
                         EmergencyPowerReduction Not Supported, 
EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
                         AtomicOpsCtl: ReqEn-
                         IDOReq- IDOCompl- LTR+ EmergencyPowerReductionReq-
                         10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
                LnkCap2: Supported Link Speeds: 2.5GT/s, Crosslink- Retimer- 
2Retimers- DRS-
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, 
EnterModifiedCompliance- ComplianceSOS-
                         Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB 
preshoot
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- 
EqualizationPhase1-
                         EqualizationPhase2- EqualizationPhase3- 
LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit+
                Address: 00000000fee008b8  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [d0] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
PME(D0+,D1-,D2-,D3hot+,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [420 v1] Physical Resizable BAR
                BAR 2: current size: 16GB, supported: 256MB 512MB 1GB 2GB 4GB 
8GB 16GB
        Capabilities: [400 v1] Latency Tolerance Reporting
                Max snoop latency: 15728640ns
                Max no snoop latency: 15728640ns
        Kernel driver in use: i915
        Kernel modules: i915, xe
00: 86 80 a0 56 07 00 10 00 08 00 00 03 10 00 00 00
10: 04 00 00 81 00 00 00 00 0c 00 00 00 60 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 20 10
30: 00 00 00 82 40 00 00 00 00 00 00 00 ff 00 00 00
40: 09 70 0c 01 03 00 00 00 00 00 00 00 00 00 00 00
50: c0 f5 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 10 ac 02 00 20 80 00 10 10 09 08 00 11 0c 40 00
80: 00 00 11 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 12 08 13 00 00 04 00 00 02 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 05 d0 81 01
b0: b8 08 e0 fe 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 01 00 03 48 08 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 47 01 c0 ff 03 00 00 00 00 00 00 00 00 00 00 00


Reply via email to