Re: [PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

2013-04-24 Thread Don Dutile
On 04/24/2013 06:46 AM, Joerg Roedel wrote: On Tue, Apr 23, 2013 at 09:22:45AM -0400, Don Dutile wrote: Given other threads on this mailing list (and I've seen crashes with the same problem) where this type of logging during a flood of IOMMU errors will lock up the machine, is there something that

Re: [PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

2013-04-24 Thread Joerg Roedel
On Tue, Apr 23, 2013 at 09:22:45AM -0400, Don Dutile wrote: Given other threads on this mailing list (and I've seen crashes with the same problem) where this type of logging during a flood of IOMMU errors will lock up the machine, is there something that can be done to break the do-while loop

Re: [PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

2013-04-23 Thread Don Dutile
On 04/18/2013 12:28 PM, Joerg Roedel wrote: On Thu, Apr 18, 2013 at 11:13:19AM -0500, Suravee Suthikulanit wrote: This workaround is required for both event log and ppr log. Your patch is only taking care of the event log. Right, thanks for the notice. Here is the updated patch. From

Re: [PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

2013-04-22 Thread Suravee Suthikulanit
On 4/18/2013 3:06 PM, Joerg Roedel wrote: Yes, but the irq-thread function itself executes the handler function repeatedly until the IRQTF_RUNTHREAD bit is cleared. And every new interrupt will set this bit again. So when there is a new interrupt while our handler function runs the handler will
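
The re-run behaviour described in this message can be modelled in a few lines of user-space C. This is only a sketch of the idea, not kernel code: `struct irq_model`, `raise_irq`, `handler` and `irq_thread_model` are hypothetical names, and the `run_thread` field merely stands in for the IRQTF_RUNTHREAD bit. The assumption is that the flag is cleared just before each handler run, so an interrupt arriving during the run sets it again and forces another pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names come from the kernel. */
struct irq_model {
    bool run_thread;   /* stands in for the IRQTF_RUNTHREAD bit */
    int  pending_hw;   /* interrupts that will arrive while the handler runs */
    int  handler_runs; /* how many times the handler has been executed */
};

/* A new interrupt sets the flag again, whether or not the handler is running. */
static void raise_irq(struct irq_model *m)
{
    m->run_thread = true;
}

/* One pass of the handler; simulates one interrupt arriving mid-run. */
static void handler(struct irq_model *m)
{
    m->handler_runs++;
    if (m->pending_hw > 0) {
        m->pending_hw--;
        raise_irq(m);
    }
}

/* Run the handler repeatedly until the flag stays clear after a pass. */
static int irq_thread_model(struct irq_model *m)
{
    assert(m != NULL);
    while (m->run_thread) {
        m->run_thread = false; /* cleared before the handler runs */
        handler(m);
    }
    return m->handler_runs;
}
```

With `run_thread` initially set and two interrupts arriving during handler runs, the model executes the handler three times, so no interrupt raised during a run is left unserviced.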

Re: [PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

2013-04-18 Thread Joerg Roedel
On Thu, Apr 18, 2013 at 01:56:42PM -0500, Suthikulpanit, Suravee wrote: On 4/18/2013 1:35 PM, Joerg Roedel wrote: According to the "kernel/irq/handle.c:irq_wake_thread()", I thought that for the threaded IRQ, if the system gets a new interrupt from the device while the thread is

Re: [PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

2013-04-18 Thread Suravee Suthikulpanit
On 4/18/2013 1:35 PM, Joerg Roedel wrote: On Thu, Apr 18, 2013 at 11:59:58AM -0500, Suthikulpanit, Suravee wrote: One last concern I have for this patch is the case when we re-enable the interrupt, then another interrupt happens while we are processing the log and sets the bit. If the interrupt

Re: [PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

2013-04-18 Thread Joerg Roedel
On Thu, Apr 18, 2013 at 11:59:58AM -0500, Suthikulpanit, Suravee wrote: One last concern I have for this patch is the case when we re-enable the interrupt, then another interrupt happens while we are processing the log and sets the bit. If the interrupt thread doesn't check this right before

Re: [PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

2013-04-18 Thread Suravee Suthikulanit
Joerg, One last concern I have for this patch is the case when we re-enable the interrupt, then another interrupt happens while we are processing the log and sets the bit. If the interrupt thread doesn't check this right before the thread exits the handler, we could still end up leaving the

Re: [PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

2013-04-18 Thread Joerg Roedel
On Thu, Apr 18, 2013 at 11:13:19AM -0500, Suravee Suthikulanit wrote: This workaround is required for both event log and ppr log. Your patch is only taking care of the event log. Right, thanks for the notice. Here is the updated patch. From cebe04596989c4b9001e2c1571c4fb219ea37b99 Mon Sep

Re: [PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

2013-04-18 Thread Suravee Suthikulanit
Joerg, This workaround is required for both event log and ppr log. Your patch is only taking care of the event log. Suravee On 4/18/2013 11:02 AM, Joerg Roedel wrote: On Mon, Apr 15, 2013 at 02:07:46AM -0500, suravee.suthikulpa...@amd.com wrote: drivers/iommu/amd_iommu.c | 145

Re: [PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

2013-04-18 Thread Joerg Roedel
On Mon, Apr 15, 2013 at 02:07:46AM -0500, suravee.suthikulpa...@amd.com wrote: drivers/iommu/amd_iommu.c | 145 + That is way too much for a simple erratum workaround, and too much for a stable backport. I queued the patch below instead, which has

[PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

2013-04-15 Thread suravee.suthikulpanit
From: Suravee Suthikulpanit The IOMMU interrupt handling in the bottom half must clear the PPR log interrupt and event log interrupt bits to re-enable the interrupt. This is done by writing 1 to the memory-mapped register to clear the bit. Due to a hardware bug, if the driver tries to clear this bit
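
The write-1-to-clear convention mentioned in this description can be illustrated with a small user-space model. The snippet above is cut off before it says what goes wrong, so the failure mode here (a write that has no effect) is purely an assumption for illustration. All names (`struct status_reg`, `reg_read`, `reg_write_w1c`, `clear_and_verify`) and the bit positions are hypothetical, and the read-back with a bounded retry is one possible way to confirm a clear took effect, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define EVT_INT_BIT (1u << 1)
#define PPR_INT_BIT (1u << 6)

/* Model of a status register with write-1-to-clear (W1C) semantics. */
struct status_reg {
    uint32_t bits;        /* current register contents */
    int      lost_writes; /* assumed failure mode: next N writes do nothing */
};

static uint32_t reg_read(const struct status_reg *r)
{
    return r->bits;
}

/* Each 1 in mask clears that bit; 0 bits leave the register untouched. */
static void reg_write_w1c(struct status_reg *r, uint32_t mask)
{
    if (r->lost_writes > 0) {
        r->lost_writes--; /* simulate a write that had no effect */
        return;
    }
    r->bits &= ~mask;
}

/*
 * Write the mask, read the register back, and retry while the bits are
 * still set. The retry count is bounded so a stuck bit cannot spin forever.
 * Returns true once the masked bits read back as zero.
 */
static bool clear_and_verify(struct status_reg *r, uint32_t mask, int max_tries)
{
    assert(r != NULL);
    for (int i = 0; i < max_tries; i++) {
        reg_write_w1c(r, mask);
        if ((reg_read(r) & mask) == 0)
            return true;
    }
    return false;
}
```

Clearing only `EVT_INT_BIT` leaves `PPR_INT_BIT` untouched, which is the point of W1C: the driver can acknowledge one log's interrupt without disturbing the other.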
