Hi Alex,
I believe in the patch file, this
+ (pdev->subsystem_device == 0x0c19 ||
+pdev->subsystem_device == 0x0c10))
has to be changed to:
+ (pdev->subsystem_device == 0xce19 ||
+pdev->subsystem_device == 0xcc10))
because our SSIDs are "ea50:ce19" and "ea50:cc10" respectively, and another one
would be "ea50:cc08".
I will apply that patch and report the results soon, together with the patch
file that I actually applied.
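To make the intended match explicit, here is a minimal user-space sketch of the check as I understand it. It is not the kernel patch itself: `struct mock_pci_dev`, the `MOCK_*` names and `mock_needs_no_ats()` are hypothetical stand-ins for illustration, and the subsystem vendor test against 0xea50 is my assumption based on our SSIDs. The third SSID ("ea50:cc08") is left out because the fragment above only lists two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few struct pci_dev fields the check reads. */
struct mock_pci_dev {
	uint16_t vendor;
	uint16_t device;
	uint16_t subsystem_vendor;
	uint16_t subsystem_device;
};

#define MOCK_VENDOR_ID_ATI   0x1002 /* value of PCI_VENDOR_ID_ATI */
#define MOCK_DEVICE_ID_RAVEN 0x15d8 /* iGPU device ID used in the patch */
#define MOCK_SSVID_PLATFORM  0xea50 /* assumption: vendor half of our SSIDs */

/* Returns true only for the iGPU on the two platform variants listed above. */
static bool mock_needs_no_ats(const struct mock_pci_dev *pdev)
{
	if (pdev->vendor != MOCK_VENDOR_ID_ATI ||
	    pdev->device != MOCK_DEVICE_ID_RAVEN)
		return false;

	if (pdev->subsystem_vendor != MOCK_SSVID_PLATFORM)
		return false;

	/* Corrected subsystem device IDs: 0xce19 and 0xcc10, not 0x0c19/0x0c10. */
	return pdev->subsystem_device == 0xce19 ||
	       pdev->subsystem_device == 0xcc10;
}
```

With the IDs as originally written (0x0c19 and 0x0c10), a board reporting "ea50:ce19" would not match, so the quirk would never be applied on our units.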
-----Original Message-----
From: Deucher, Alexander
Sent: Monday, 30 November 2020 19:36
To: Merger, Edgar [AUTOSOL/MAS/AUGS] ; Huang, Ray
; Kuehling, Felix
Cc: Will Deacon ; linux-ker...@vger.kernel.org;
linux-...@vger.kernel.org; iommu@lists.linux-foundation.org; Bjorn Helgaas
; Joerg Roedel ; Zhu, Changfeng
Subject: RE: [EXTERNAL] Re: [PATCH] PCI: Mark AMD Raven iGPU ATS as broken
[AMD Public Use]
> -----Original Message-----
> From: Merger, Edgar [AUTOSOL/MAS/AUGS]
> Sent: Thursday, November 26, 2020 4:24 AM
> To: Deucher, Alexander ; Huang, Ray
> ; Kuehling, Felix
> Cc: Will Deacon ; linux-ker...@vger.kernel.org;
> linux- p...@vger.kernel.org; iommu@lists.linux-foundation.org; Bjorn
> Helgaas ; Joerg Roedel ; Zhu,
> Changfeng
> Subject: RE: [EXTERNAL] Re: [PATCH] PCI: Mark AMD Raven iGPU ATS as
> broken
>
> Alex,
>
> This is pretty much the same patch as what I have received from Joerg
> previously, except that it is tied to the particular Emerson platform
> and its derivatives (listed with Subsystem IDs).
Right. As per my original point, I don't want to disable ATS on all Picasso
chips because doing so would break GPU compute on them, so I'd like to apply
this quirk as narrowly as possible.
>
> Below patch was what Joerg provided me and I successfully tested.
>
> This diff to the kernel should do that:
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index f70692ac79c5..3911b0ec57ba 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5176,6 +5176,8 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x6900, quirk_amd_harvest_no_ats);
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7312, quirk_amd_harvest_no_ats);
>  /* AMD Navi14 dGPU */
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7340, quirk_amd_harvest_no_ats);
> +/* AMD Raven platform iGPU */
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x15d8, quirk_amd_harvest_no_ats);
>  #endif /* CONFIG_PCI_ATS */
>
>  /* Freescale PCIe doesn't support MSI in RC mode */
>
> So far I have seen this issue on two instances of this chip, but I
> must admit that I have tested only two of them to this extent. So I guess
> it is not one bad chip in particular, but the chips we use are from the
> same production lot, so it might be a systematic problem of that
> production lot?
>
> UEFI-Setup shows:
> Processor Family: 17h
> Processor Model: 20h - 2Fh
> CPUID: 00820F01
> Microcode Patch Level: 8200103
>
> Looking at the chip-die I found that this is a fully qualified IP
> Silicon (according to Ryzen Embedded R1000 SOC Interlock).
> YE1305C9T20FG
> BI2015SUY
> 9JB6496P00123
> 2016 AMD
> DIFFUSED IN USA
> MADE IN CHINA
>
> Currently used SBIOS is a branch from "EmbeddedPI-FP5 1.2.0.3RC3".
>
> In the future our SBIOS might merge with EmbeddedPI-FP5_1.2.0.5RC3.
>
I think it's more likely an sbios issue, so hopefully the new release fixes it.
Alex
>
>
>
> -----Original Message-----
> From: Deucher, Alexander
> Sent: Wednesday, 25 November 2020 17:08
> To: Merger, Edgar [AUTOSOL/MAS/AUGS] ;
> Huang, Ray ; Kuehling, Felix
>
> Cc: Will Deacon ; linux-ker...@vger.kernel.org;
> linux- p...@vger.kernel.org; iommu@lists.linux-foundation.org; Bjorn
> Helgaas ; Joerg Roedel ; Zhu,
> Changfeng
> Subject: RE: [EXTERNAL] Re: [PATCH] PCI: Mark AMD Raven iGPU ATS as
> broken
>
> [AMD Public Use]
>
> > -----Original Message-----
> > From: Merger, Edgar [AUTOSOL/MAS/AUGS]
>
> > Sent: Wednesday, November 25, 2020 5:04 AM
> > To: Deucher, Alexander ; Huang, Ray
> > ; Kuehling, Felix
> > Cc: Will Deacon ; linux-ker...@vger.kernel.org;
> > linux- p...@vger.kernel.org; iommu@lists.linux-foundation.org; Bjorn
> > Helgaas ; Joerg Roedel ; Zhu,
> > Changfeng
> > Subject: RE: [EXTERNAL] Re: [PATCH] PCI: Mark AMD Raven iGPU ATS as
> > broken
> >
> > I also have other problems with this unit when the IOMMU is enabled
> > and pci=noats is not set as a kernel parameter.
> >
> > [ 2004.265906] amdgpu :0b:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on gfx (-110).
> > [ 2004.266024] [drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110).
> >
>
> Is this seen on all instances of this chip or only specific silicon?
> I.e., could this be a bad chip? Would it be possible to test a newer
> sbios? I think the attached patch should work if we can't get it
> fixed on the platform side. It should only enable the quirk on your
> partic