Hi Samuel,

On Mon, Apr 03, 2017 at 01:38:08PM -0700, Samuel Sieb wrote:
> I filed a bug in bugzilla, but I wasn't sure what category to put it
> in, so I suspect I ended up picking one that doesn't get looked at
> much.
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=195051
> 
> The issue is that on a specific Acer laptop with a dual-core A9, if
> I don't disable the IOMMU using iommu=off, it has immediate and
> rapidly fatal filesystem corruption by the time a user logs into the
> desktop. What led me to try that was at one point I noticed an error
> message about the iommu in the logs.  However, I did not have a
> chance to save that due to the corruption obliterating the log
> files.

You have a system based on the AMD Stoney platform, on which the PCI-ATS
feature of the GPU is broken, as we recently found out.

Can you please test whether the attached patch fixes the issue on your
machine?

From 09cbdcbbd23f0823e7651b4f35b13ae633b3fbe2 Mon Sep 17 00:00:00 2001
From: Joerg Roedel <jroe...@suse.de>
Date: Tue, 28 Mar 2017 13:20:27 +0200
Subject: [PATCH] PCI: Blacklist AMD Stoney GPU devices for ATS

ATS is broken on these devices. Under invalidation load, the
GPU does not reply to invalidations anymore, causing
Completion-wait loop timeouts on the AMD IOMMU driver side.
Fix it by not enabling ATS on these devices.

Note that the commit mentioned below is not broken; it just
triggers the issue because it can cause invalidation
storms on devices.

Fixes: b1516a14657a ('iommu/amd: Implement flush queue')
Reported-by: Daniel Drake <dr...@endlessm.com>
Cc: Alexander Deucher <alexander.deuc...@amd.com>
Signed-off-by: Joerg Roedel <jroe...@suse.de>
---
 drivers/pci/ats.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index eeb9fb2..711bdb2 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -17,10 +17,18 @@
 
 #include "pci.h"
 
+static const struct pci_device_id broken_ats_tbl[] = {
+       { PCI_DEVICE(PCI_VENDOR_ID_ATI, 0x98e4) }, /* AMD Stoney GPU part */
+       { 0 }
+};
+
 void pci_ats_init(struct pci_dev *dev)
 {
        int pos;
 
+       if (pci_match_id(broken_ats_tbl, dev))
+               return;
+
        pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ATS);
        if (!pos)
                return;
-- 
1.9.1
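
For readers unfamiliar with the pattern, here is a rough user-space sketch of the kind of check the patch adds: walk a zero-terminated ID table and skip ATS setup when the device matches. It is not the kernel's pci_match_id() implementation. The names dev_id, broken_ats_ids and ats_is_blacklisted, and the EXAMPLE_* ID values, are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Placeholder IDs for illustration only; not real hardware IDs. */
#define EXAMPLE_VENDOR 0xabcd
#define EXAMPLE_DEVICE 0x0123

/* Hypothetical, much-reduced stand-in for struct pci_device_id. */
struct dev_id {
	uint16_t vendor;
	uint16_t device;
};

/* Zero-terminated table, in the same spirit as broken_ats_tbl. */
static const struct dev_id broken_ats_ids[] = {
	{ EXAMPLE_VENDOR, EXAMPLE_DEVICE },
	{ 0, 0 }	/* sentinel: marks the end of the table */
};

/* Return 1 if the vendor/device pair is listed in the table, else 0. */
int ats_is_blacklisted(uint16_t vendor, uint16_t device)
{
	const struct dev_id *id;

	/* Stop at the all-zero sentinel entry. */
	for (id = broken_ats_ids; id->vendor || id->device; id++) {
		if (id->vendor == vendor && id->device == device)
			return 1;
	}
	return 0;
}
```

A caller would test ats_is_blacklisted(vendor, device) first and return early on a match, just as pci_ats_init() does in the patch before it looks for the ATS capability.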
