Hi Joerg,

On 2022/5/21 08:21, Yian Chen wrote:
Notifier calling chain uses priority to determine the execution
order of the notifiers or listeners registered to the chain.
PCI bus device hot add utilizes the notification mechanism.

The current code sets low priority (INT_MIN) to Intel
dmar_pci_bus_notifier and postpones DMAR decoding after adding
new device into IOMMU. The result is that struct device pointer
cannot be found in DRHD search for the new device's DMAR/IOMMU.
Subsequently, the device is put under the "catch-all" IOMMU
instead of the correct one. This could cause system hang when
device TLB invalidation is sent to the wrong IOMMU. Invalidation
timeout error and hard lockup have been observed and data
inconsistency/crash may occur as well.

This patch fixes the issue by setting a positive priority(1) for
dmar_pci_bus_notifier while the priority of IOMMU bus notifier
uses the default value(0), therefore DMAR decoding will be in
advance of DRHD search for a new device to find the correct IOMMU.
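
To illustrate the ordering argument above, here is a minimal user-space
sketch of a priority-sorted notifier chain. It is not the kernel
implementation: the struct layout, chain_register(), chain_call_order()
and order_for_dmar_priority() are simplified, hypothetical stand-ins, and
the names "iommu" and "dmar" only label the two listeners discussed here.
The one assumption carried over from the kernel is that higher-priority
entries are called first.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct notifier_block (assumption). */
struct notifier_block {
	const char *name;
	int priority;
	struct notifier_block *next;
};

/* Keep the chain sorted by descending priority; entries with equal
 * priority stay in registration order. */
static void chain_register(struct notifier_block **head,
			   struct notifier_block *nb)
{
	while (*head && (*head)->priority >= nb->priority)
		head = &(*head)->next;
	nb->next = *head;
	*head = nb;
}

/* Record the order in which the listeners would be called, as a
 * comma-separated list of names written into buf. */
static void chain_call_order(const struct notifier_block *head,
			     char *buf, size_t len)
{
	assert(len > 0);
	buf[0] = '\0';
	for (; head; head = head->next) {
		strncat(buf, head->name, len - strlen(buf) - 1);
		if (head->next)
			strncat(buf, ",", len - strlen(buf) - 1);
	}
}

/* Hypothetical helper: register an "iommu" listener at the default
 * priority (0) and a "dmar" listener at dmar_priority, then report
 * the resulting call order. */
void order_for_dmar_priority(int dmar_priority, char *buf, size_t len)
{
	struct notifier_block *head = NULL;
	struct notifier_block iommu = { "iommu", 0, NULL };
	struct notifier_block dmar = { "dmar", dmar_priority, NULL };

	chain_register(&head, &iommu);
	chain_register(&head, &dmar);
	chain_call_order(head, buf, len);
}
```

In this sketch, calling order_for_dmar_priority() with INT_MIN places
"dmar" after "iommu", while a priority of 1 places it first, which is the
ordering the patch relies on.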

Following is a 2-step example that triggers the bug by simulating
PCI device hot add behavior in Intel Sapphire Rapids server.

echo 1 > /sys/bus/pci/devices/0000:6a:01.0/remove
echo 1 > /sys/bus/pci/rescan

Fixes: 59ce0515cdaf ("iommu/vt-d: Update DRHD/RMRR/ATSR device scope")
Cc: sta...@vger.kernel.org # v3.15+
Reported-by: Zhang, Bernice <bernice.zh...@intel.com>
Signed-off-by: Jacob Pan <jacob.jun....@linux.intel.com>
Signed-off-by: Yian Chen <yian.c...@intel.com>
---
This is a quick fix for the bug reported. Intel internally evaluated
another redesigned solution that eliminates dmar pci bus notifier to
simplify the workflow of pci hotplug and improve its runtime efficiency.

However, since the fix also needs to apply downstream, and the complexity
of the pci hotplug workflow change may significantly increase the
engineering effort to downstream the patch, the choice is to submit this
simple patch to help the deployment of this bug fix.

Yian has been working on using the IOMMU bus notifier to solve this problem.
It turns out that, because of the following facts, we would need to refactor
the IOMMU core and the Intel DMAR code:

- Interrupt remapping also requires Intel DMAR code. Therefore, when
  IOMMU is not enabled, the PCI bus notifier in DMAR is still required.
- The IOMMU PCI bus notifier calls .probe_device(), which lacks the
  information about whether the device was hot-added or present at
  static boot.

Considering that the problem described here is a serious problem,
because users can easily damage the system by writing sysfs files on
some platforms, we need a quick fix for both upstream and stable
kernels. The refactoring code will be discussed in a separate series.

What do you think? If you agree, I can queue it in my next pull request
for fixes.

Best regards,
baolu

---

  drivers/iommu/intel/dmar.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 4de960834a1b..497c5bd95caf 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -383,7 +383,7 @@ static int dmar_pci_bus_notifier(struct notifier_block *nb,
static struct notifier_block dmar_pci_bus_nb = {
        .notifier_call = dmar_pci_bus_notifier,
-       .priority = INT_MIN,
+       .priority = 1,
  };
static struct dmar_drhd_unit *
