When a device or driver misbehaves, DMA fault events can arrive much faster than we can print them, locking up the system and making it impossible to stop the source of the problem. Ratelimit the printing of events to help recovery.

Tested-by: Aaro Koskinen <[email protected]>
Signed-off-by: Jean-Philippe Brucker <[email protected]>
---
Aiming for v5.14 rather than v5.13, since it mainly fixes a nuisance
during development/debug. Conflicts with "iommu/arm-smmu-v3: Add stall
support for platform devices" currently on the list [1], because they
both change arm_smmu_evtq_thread(). This patch is based on [1].

I encountered this while developing SVA on hardware, although the
problem is not specific to SVA or stall. The device driver didn't
properly stop DMA, and the SMMU would flood the event queue with
translation faults. Without rate limiting I was unable to even reset
the device. Note that this is not a problem for normal SVA operations,
since userspace cannot cause DMA to print kernel messages. Aaro
Koskinen reported a similar problem [2].

[1] https://lore.kernel.org/linux-iommu/[email protected]/
[2] https://lore.kernel.org/linux-iommu/[email protected]/
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 39bdb4264248..2792382ad3bd 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1518,6 +1518,8 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 	struct arm_smmu_device *smmu = dev;
 	struct arm_smmu_queue *q = &smmu->evtq.q;
 	struct arm_smmu_ll_queue *llq = &q->llq;
+	static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,
+				      DEFAULT_RATELIMIT_BURST);
 	u64 evt[EVTQ_ENT_DWORDS];
 
 	do {
@@ -1525,7 +1527,7 @@ static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 			u8 id = FIELD_GET(EVTQ_0_ID, evt[0]);
 
 			ret = arm_smmu_handle_evt(smmu, evt);
-			if (!ret)
+			if (!ret || !__ratelimit(&rs))
 				continue;
 
 			dev_info(smmu->dev, "event 0x%02x received:\n", id);
-- 
2.31.1
