On Tue, 2018-02-13 at 17:35 +0100, Joerg Roedel wrote:
> On Mon, Feb 12, 2018 at 04:48:23PM +0000, Dmitry Safonov wrote:
> > dmar_fault() reports/handles/cleans DMAR faults in a cycle one-by-one.
> > The nuisance is that it's set as an irq handler and runs with disabled
> > interrupts - which works OK if you have only a couple of DMAR faults,
> > but becomes a problem if your intel iommu has plenty of mappings.
>
> I don't think that a work-queue is the right solution here, it adds a
> long delay until the log is processed. With high fault rates the error
> log will overflow during that delay.
>
> Here is what I think you should do instead to fix the soft-lockups:
>
> First, unmask the fault reporting irq so that you will get subsequent
> irqs. Then:
>
> 	* For Primary Fault Reporting just cycle once through all
> 	  supported fault recording registers.
>
> 	* For Advanced Fault Reporting, read start and end pointer of
> 	  the log and process all entries.
>
> After that return from the fault handler and let the next irq handle
> additional faults that might have been recorded while the previous
> handler was running.
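
For reference, the Primary Fault Reporting case above might be sketched
roughly as follows. This is a userspace model only, not the intel-iommu
code: struct fault_reg, struct fault_unit, NUM_FAULT_REGS and
handle_faults_once() are made-up names, and the register layout is
simplified to a flag plus source id and address. The point is just that
the handler visits each register at most once and then returns, so the
work per irq is bounded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical register count; real hardware reports its own value. */
#define NUM_FAULT_REGS 8

/* Simplified, hypothetical stand-in for one fault recording register. */
struct fault_reg {
	bool     fault;     /* set when a fault has been recorded */
	uint16_t source_id; /* requester that caused the fault */
	uint64_t addr;      /* faulting address */
};

struct fault_unit {
	struct fault_reg regs[NUM_FAULT_REGS];
};

/*
 * Single pass over all fault recording registers. Each register is
 * looked at exactly once, so the loop runs NUM_FAULT_REGS iterations
 * no matter how quickly new faults arrive. Faults recorded in a
 * register that was already visited are left for the next call.
 * Returns the number of faults handled in this pass.
 */
static int handle_faults_once(struct fault_unit *u)
{
	int handled = 0;

	assert(u != NULL);

	for (int i = 0; i < NUM_FAULT_REGS; i++) {
		struct fault_reg *r = &u->regs[i];

		if (!r->fault)
			continue;

		printf("fault: source id %04x addr 0x%llx\n",
		       (unsigned int)r->source_id,
		       (unsigned long long)r->addr);

		/* Clear the flag so the register can be reused. */
		r->fault = false;
		handled++;
	}

	return handled;
}
```

In this model the caller would invoke handle_faults_once() once per
interrupt; anything recorded after its register was visited gets picked
up by the following call rather than by looping inside the handler.
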
Ok, will re-do this way, thanks.

> And of course, ratelimiting the fault printouts is always a good
> idea.

-- 
Dima