Re: [PATCH v2] iommu/amd: Use report_iommu_fault()
On Sat, Aug 21, 2021 at 06:44:39PM +0300, Lennert Buytenhek wrote: > > > - EVENT_FLAG_I unset, but the request was a translation request > > > (EVENT_FLAG_TR set) or the target page was not present > > > (EVENT_FLAG_PR unset): call report_iommu_fault(), but the RW > > > bit will be invalid, so don't try to map it to a > > > IOMMU_FAULT_{READ,WRITE} code > > > > So, why do we need to call report_iommu_fault() for this case? > > My understanding is we only have IOMMU_FAULT_[READ|WRITE]. > > So, if we can't identify whether the DMA is read / write, > > we should not need to call report_iommu_fauilt(), is it? > > I don't think that we should just altogether avoid logging the subset > of page faults for which we can't determine the read/write direction > on AMD platforms. > > E.g. "access to an unmapped address" (which will have PR=0, and thus we > won't know if it was a read or a write access) is just as much of a page > fault as "write to a read-only page" (which will have PR=1, and thus the > RW bit will be accurate) is, and for RAS purposes, both events are > equally interesting, and important to know about. > > It's true that we currently don't have a way of signaling to > report_iommu_fault() (and by extension, to the io_page_fault > tracepoint) that we're not sure whether the offending access was a read > or a write, but I think we can just add a bit to include/linux/iommu.h > to indicate that, something along the lines of: > >/* iommu fault flags */ >#define IOMMU_FAULT_READ0x0 >#define IOMMU_FAULT_WRITE 0x1 > +#define IOMMU_FAULT_RW_UNKNOWN 0x2 > > (Cc'ing Ohad Ben-Cohen, who originally added this API.) > > I don't think that it would be a good idea to just not signal the page > faults for which we don't know the read/write direction. I had another look at this, and from some testing, it seems that, contrary to what the datasheet suggests, the RW (read/write direction) bit in logged I/O page faults is actually accurate for both faulting accesses to present pages and faulting accesses to non-present pages. I made a few hacks to the ixgbe driver to intentionally cause several different types of I/O page faults, thereby turning an ixgbe NIC into a poor man's I/O page fault generator, and these are the resulting logged I/O page faults, on a Ryzen 3700X system: Read from non-present page: ixgbe :25:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0x1bbc8f040 flags=0x] => flags indicate PR(esent)=0, RW=0 Write to non-present page: ixgbe :25:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0x1bdcb70c0 flags=0x0020] => flags indicate PR(esent)=0, RW=1 Read from write-only page: ixgbe :25:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xbbcc1c40 flags=0x0050] => flags indicate PR(esent)=1, RW=0 (and PE(rmission violation)=1) Write to read-only page: ixgbe :25:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xbdcb70c0 flags=0x0070] => flags indicate PR(esent)=1, RW=1 (and PE(rmission violation)=1) In other words, it seems that the RW bit is reliable even for PR=0 type faults. I assume that the restriction mentioned in the docs regarding RW and PR ("RW is only meaningful when PR=1, TR=0, and I=0" from AMD I/O Virtualization Technology (IOMMU) Specification, revision 3.00, Table 55: IO_PAGE_FAULT Event Log Buffer Entry Fields) is either confused or refers to a restriction in older hardware that has since been lifted. I'll resubmit the patch to unconditionally pass through the RW bit. ixgbe hacks: to cause reads from non-present pages (in the TX path): --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -8232,7 +8232,7 @@ static int ixgbe_tx_map(struct ixgbe_ring *tx_ring, dma_unmap_len_set(tx_buffer, len, size); dma_unmap_addr_set(tx_buffer, dma, dma); - tx_desc->read.buffer_addr = cpu_to_le64(dma); + tx_desc->read.buffer_addr = cpu_to_le64(dma + BIT(32)); while (unlikely(size > IXGBE_MAX_DATA_PER_TXD)) { tx_desc->read.cmd_type_len = to cause writes to non-present pages (in the RX path): --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -1604,7 +1604,7 @@ void ixgbe_alloc_rx_buffers(struct ixgbe_ring *rx_ring, u16 cleaned_count) * Refresh the desc even if buffer_addrs didn't change * because each write-back erases this info. */ - rx_desc->read.pkt_addr = cpu_to_le64(bi->dma + bi->page_offset); + rx_desc->read.pkt_addr = cpu_to_le64(bi->dma + bi->page_offset + BIT(32)); rx_desc++; bi++; to cause reads from write-only pages (in the TX path): --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++
Re: [PATCH v2] iommu/amd: Use report_iommu_fault()
On Thu, Aug 05, 2021 at 11:26:25AM -0500, Suthikulpanit, Suravee wrote: > Lennert, Hi Suravee, > > - EVENT_FLAG_I unset, but the request was a translation request > > (EVENT_FLAG_TR set) or the target page was not present > > (EVENT_FLAG_PR unset): call report_iommu_fault(), but the RW > > bit will be invalid, so don't try to map it to a > > IOMMU_FAULT_{READ,WRITE} code > > So, why do we need to call report_iommu_fault() for this case? > My understanding is we only have IOMMU_FAULT_[READ|WRITE]. > So, if we can't identify whether the DMA is read / write, > we should not need to call report_iommu_fauilt(), is it? I don't think that we should just altogether avoid logging the subset of page faults for which we can't determine the read/write direction on AMD platforms. E.g. "access to an unmapped address" (which will have PR=0, and thus we won't know if it was a read or a write access) is just as much of a page fault as "write to a read-only page" (which will have PR=1, and thus the RW bit will be accurate) is, and for RAS purposes, both events are equally interesting, and important to know about. It's true that we currently don't have a way of signaling to report_iommu_fault() (and by extension, to the io_page_fault tracepoint) that we're not sure whether the offending access was a read or a write, but I think we can just add a bit to include/linux/iommu.h to indicate that, something along the lines of: /* iommu fault flags */ #define IOMMU_FAULT_READ0x0 #define IOMMU_FAULT_WRITE 0x1 +#define IOMMU_FAULT_RW_UNKNOWN 0x2 (Cc'ing Ohad Ben-Cohen, who originally added this API.) I don't think that it would be a good idea to just not signal the page faults for which we don't know the read/write direction. Thanks, Lennert > > - EVENT_FLAG_I unset, the request is a transaction request (EVENT_FLAG_TR > >unset) and the target page was present (EVENT_FLAG_PR set): call > >report_iommu_fault(), and use the RW bit to set IOMMU_FAULT_{READ,WRITE} > > > > So I don't think we can merge the test for EVENT_FLAG_I with the > > test for EVENT_FLAG_TR/EVENT_FLAG_PR. > > The only condition that we would report_iommu_fault is > I=0, TR=0, PR=1, isn't it. So we should be able to just check if PR=1. > > > > We could do something like this, if you'd prefer: > > > > #define IS_IOMMU_MEM_TRANSACTION(flags) \ > > (((flags) & EVENT_FLAG_I) == 0) > > > > #define IS_RW_FLAG_VALID(flags) \ > > (((flags) & (EVENT_FLAG_TR | EVENT_FLAG_PR)) == EVENT_FLAG_PR) > > > > #define IS_WRITE_REQUEST(flags) \ > > (IS_RW_FLAG_VALID(flags) && (flags & EVENT_FLAG_RW)) > > > > And then do something like: > > > > if (dev_data && IS_IOMMU_MEM_TRANSACTION(flags)) { > > if (!report_iommu_fault(_data->domain->domain, >dev, > > address, > > IS_WRITE_REQUEST(flags) ? > > IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) > > Actually, IS_WRITE_REQUEST() == 0 could mean: > - I=0, TR=0, PR=1 and RW=0: This is fine. > - I=0, (TR=1 or PR=0), and we should not be calling report_iommu_fault() here > since we cannot specify READ/WRITE here. > > Thanks, > Suravee ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu/amd: Use report_iommu_fault()
Lennert, On 7/29/2021 9:32 PM, Lennert Buytenhek wrote: We have three cases to handle: - EVENT_FLAG_I set: IRQ remapping fault, don't call report_iommu_fault() - EVENT_FLAG_I unset, but the request was a translation request (EVENT_FLAG_TR set) or the target page was not present (EVENT_FLAG_PR unset): call report_iommu_fault(), but the RW bit will be invalid, so don't try to map it to a IOMMU_FAULT_{READ,WRITE} code So, why do we need to call report_iommu_fault() for this case? My understanding is we only have IOMMU_FAULT_[READ|WRITE]. So, if we can't identify whether the DMA is read / write, we should not need to call report_iommu_fauilt(), is it? - EVENT_FLAG_I unset, the request is a transaction request (EVENT_FLAG_TR unset) and the target page was present (EVENT_FLAG_PR set): call report_iommu_fault(), and use the RW bit to set IOMMU_FAULT_{READ,WRITE} So I don't think we can merge the test for EVENT_FLAG_I with the test for EVENT_FLAG_TR/EVENT_FLAG_PR. The only condition that we would report_iommu_fault is I=0, TR=0, PR=1, isn't it. So we should be able to just check if PR=1. We could do something like this, if you'd prefer: #define IS_IOMMU_MEM_TRANSACTION(flags) \ (((flags) & EVENT_FLAG_I) == 0) #define IS_RW_FLAG_VALID(flags) \ (((flags) & (EVENT_FLAG_TR | EVENT_FLAG_PR)) == EVENT_FLAG_PR) #define IS_WRITE_REQUEST(flags) \ (IS_RW_FLAG_VALID(flags) && (flags & EVENT_FLAG_RW)) And then do something like: if (dev_data && IS_IOMMU_MEM_TRANSACTION(flags)) { if (!report_iommu_fault(_data->domain->domain, >dev, address, IS_WRITE_REQUEST(flags) ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ)) Actually, IS_WRITE_REQUEST() == 0 could mean: - I=0, TR=0, PR=1 and RW=0: This is fine. - I=0, (TR=1 or PR=0), and we should not be calling report_iommu_fault() here since we cannot specify READ/WRITE here. Thanks, Suravee goto out; } ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v2] iommu/amd: Use report_iommu_fault()
On Fri, Jul 30, 2021 at 05:32:16AM +0300, Lennert Buytenhek wrote: > > > This patch makes iommu/amd call report_iommu_fault() when an I/O page > > > fault occurs, which has two effects: > > > > > > 1) It allows device drivers to register a callback to be notified of > > > I/O page faults, via the iommu_set_fault_handler() API. > > > > > > 2) It triggers the io_page_fault tracepoint in report_iommu_fault() > > > when an I/O page fault occurs. > > > > > > I'm mainly interested in (2). We have a daemon with some rasdaemon-like > > > functionality for handling platform errors, and being able to be notified > > > of I/O page faults for initiating corrective action is very useful -- and > > > receiving such events via event tracing is a lot nicer than having to > > > scrape them from kmsg. > > > > > > A number of other IOMMU drivers already use report_iommu_fault(), and > > > I/O page faults on those IOMMUs therefore already seem to trigger this > > > tracepoint -- but this isn't (yet) the case for AMD-Vi and Intel DMAR. > > > > > > I copied the logic from the other callers of report_iommu_fault(), where > > > if that function returns zero, the driver will have handled the fault, > > > in which case we avoid logging information about the fault to the printk > > > buffer from the IOMMU driver. > > > > > > With this patch I see io_page_fault event tracing entries as expected: > > > > > > irq/24-AMD-Vi-48[002] 978.554289: io_page_fault: > > > IOMMU:[drvname] :05:00.0 iova=0x91482640 flags=0x > > > irq/24-AMD-Vi-48[002] 978.554294: io_page_fault: > > > IOMMU:[drvname] :05:00.0 iova=0x91482650 flags=0x > > > irq/24-AMD-Vi-48[002] 978.554299: io_page_fault: > > > IOMMU:[drvname] :05:00.0 iova=0x91482660 flags=0x > > > irq/24-AMD-Vi-48[002] 978.554305: io_page_fault: > > > IOMMU:[drvname] :05:00.0 iova=0x91482670 flags=0x > > > irq/24-AMD-Vi-48[002] 978.554310: io_page_fault: > > > IOMMU:[drvname] :05:00.0 iova=0x91482680 flags=0x > > > irq/24-AMD-Vi-48[002] 978.554315: io_page_fault: > > > IOMMU:[drvname] :05:00.0 iova=0x914826a0 flags=0x > > > > > > For determining IOMMU_FAULT_{READ,WRITE}, I followed the AMD IOMMU > > > spec, but I haven't tested that bit of the code, as the page faults I > > > encounter are all to non-present (!EVENT_FLAG_PR) mappings, in which > > > case EVENT_FLAG_RW doesn't make sense. > > > > > > Signed-off-by: Lennert Buytenhek > > > --- > > > Changes since v1 RFC: > > > > > > - Don't call report_iommu_fault() for IRQ remapping faults. > > >(Suggested by Joerg Roedel.) > > > > > > drivers/iommu/amd/amd_iommu_types.h | 4 > > > drivers/iommu/amd/iommu.c | 29 + > > > 2 files changed, 33 insertions(+) > > > > > > diff --git a/drivers/iommu/amd/amd_iommu_types.h > > > b/drivers/iommu/amd/amd_iommu_types.h > > > index 94c1a7a9876d..2f2c6630c24c 100644 > > > --- a/drivers/iommu/amd/amd_iommu_types.h > > > +++ b/drivers/iommu/amd/amd_iommu_types.h > > > @@ -138,6 +138,10 @@ > > > #define EVENT_DOMID_MASK_HI 0xf > > > #define EVENT_FLAGS_MASK0xfff > > > #define EVENT_FLAGS_SHIFT 0x10 > > > +#define EVENT_FLAG_TR0x100 > > > +#define EVENT_FLAG_RW0x020 > > > +#define EVENT_FLAG_PR0x010 > > > +#define EVENT_FLAG_I 0x008 > > > /* feature control bits */ > > > #define CONTROL_IOMMU_EN0x00ULL > > > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c > > > index a7d6d78147b7..d9fb2c22d44a 100644 > > > --- a/drivers/iommu/amd/iommu.c > > > +++ b/drivers/iommu/amd/iommu.c > > > > What if we introduce: > > > > +/* > > + * AMD I/O Virtualization Technology (IOMMU) Specification, > > + * revision 3.00, section 2.5.3 ("IO_PAGE_FAULT Event") says > > + * that the RW ("read-write") bit is only valid if the I/O > > + * page fault was caused by a memory transaction request > > + * referencing a page that was marked present. > > + */ > > +#define IO_PAGE_FAULT_MEM_MASK \ > > + (EVENT_FLAG_TR | EVENT_FLAG_PR | EVENT_FLAG_I) > > +#define IS_IOMMU_MEM_TRANSACTION(x)\ > > + ((x & IO_PAGE_FAULT_MEM_MASK) == EVENT_FLAG_PR) > > > > Note that this should have already checked w/ EVENT_FLAG_I == 0. > > > > > > > @@ -484,6 +484,34 @@ static void amd_iommu_report_page_fault(u16 devid, > > > u16 domain_id, > > > if (pdev) > > > dev_data = dev_iommu_priv_get(>dev); > > > + /* > > > + * If this is a DMA fault (for which the I(nterrupt) bit will > > > + * be unset), allow report_iommu_fault() to prevent logging it. > > > + */ > > > + if (dev_data && ((flags & EVENT_FLAG_I) == 0)) { > > > + int report_flags; > > > + > > > + /* > > > + * AMD I/O Virtualization Technology (IOMMU) Specification,
Re: [PATCH v2] iommu/amd: Use report_iommu_fault()
On Wed, Jul 28, 2021 at 04:51:27PM -0500, Suthikulpanit, Suravee wrote: > Lennert, Hi Suravee, > > This patch makes iommu/amd call report_iommu_fault() when an I/O page > > fault occurs, which has two effects: > > > > 1) It allows device drivers to register a callback to be notified of > > I/O page faults, via the iommu_set_fault_handler() API. > > > > 2) It triggers the io_page_fault tracepoint in report_iommu_fault() > > when an I/O page fault occurs. > > > > I'm mainly interested in (2). We have a daemon with some rasdaemon-like > > functionality for handling platform errors, and being able to be notified > > of I/O page faults for initiating corrective action is very useful -- and > > receiving such events via event tracing is a lot nicer than having to > > scrape them from kmsg. > > > > A number of other IOMMU drivers already use report_iommu_fault(), and > > I/O page faults on those IOMMUs therefore already seem to trigger this > > tracepoint -- but this isn't (yet) the case for AMD-Vi and Intel DMAR. > > > > I copied the logic from the other callers of report_iommu_fault(), where > > if that function returns zero, the driver will have handled the fault, > > in which case we avoid logging information about the fault to the printk > > buffer from the IOMMU driver. > > > > With this patch I see io_page_fault event tracing entries as expected: > > > > irq/24-AMD-Vi-48[002] 978.554289: io_page_fault: > > IOMMU:[drvname] :05:00.0 iova=0x91482640 flags=0x > > irq/24-AMD-Vi-48[002] 978.554294: io_page_fault: > > IOMMU:[drvname] :05:00.0 iova=0x91482650 flags=0x > > irq/24-AMD-Vi-48[002] 978.554299: io_page_fault: > > IOMMU:[drvname] :05:00.0 iova=0x91482660 flags=0x > > irq/24-AMD-Vi-48[002] 978.554305: io_page_fault: > > IOMMU:[drvname] :05:00.0 iova=0x91482670 flags=0x > > irq/24-AMD-Vi-48[002] 978.554310: io_page_fault: > > IOMMU:[drvname] :05:00.0 iova=0x91482680 flags=0x > > irq/24-AMD-Vi-48[002] 978.554315: io_page_fault: > > IOMMU:[drvname] :05:00.0 iova=0x914826a0 flags=0x > > > > For determining IOMMU_FAULT_{READ,WRITE}, I followed the AMD IOMMU > > spec, but I haven't tested that bit of the code, as the page faults I > > encounter are all to non-present (!EVENT_FLAG_PR) mappings, in which > > case EVENT_FLAG_RW doesn't make sense. > > > > Signed-off-by: Lennert Buytenhek > > --- > > Changes since v1 RFC: > > > > - Don't call report_iommu_fault() for IRQ remapping faults. > >(Suggested by Joerg Roedel.) > > > > drivers/iommu/amd/amd_iommu_types.h | 4 > > drivers/iommu/amd/iommu.c | 29 + > > 2 files changed, 33 insertions(+) > > > > diff --git a/drivers/iommu/amd/amd_iommu_types.h > > b/drivers/iommu/amd/amd_iommu_types.h > > index 94c1a7a9876d..2f2c6630c24c 100644 > > --- a/drivers/iommu/amd/amd_iommu_types.h > > +++ b/drivers/iommu/amd/amd_iommu_types.h > > @@ -138,6 +138,10 @@ > > #define EVENT_DOMID_MASK_HI 0xf > > #define EVENT_FLAGS_MASK 0xfff > > #define EVENT_FLAGS_SHIFT 0x10 > > +#define EVENT_FLAG_TR 0x100 > > +#define EVENT_FLAG_RW 0x020 > > +#define EVENT_FLAG_PR 0x010 > > +#define EVENT_FLAG_I 0x008 > > /* feature control bits */ > > #define CONTROL_IOMMU_EN0x00ULL > > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c > > index a7d6d78147b7..d9fb2c22d44a 100644 > > --- a/drivers/iommu/amd/iommu.c > > +++ b/drivers/iommu/amd/iommu.c > > What if we introduce: > > +/* > + * AMD I/O Virtualization Technology (IOMMU) Specification, > + * revision 3.00, section 2.5.3 ("IO_PAGE_FAULT Event") says > + * that the RW ("read-write") bit is only valid if the I/O > + * page fault was caused by a memory transaction request > + * referencing a page that was marked present. > + */ > +#define IO_PAGE_FAULT_MEM_MASK \ > + (EVENT_FLAG_TR | EVENT_FLAG_PR | EVENT_FLAG_I) > +#define IS_IOMMU_MEM_TRANSACTION(x)\ > + ((x & IO_PAGE_FAULT_MEM_MASK) == EVENT_FLAG_PR) > > Note that this should have already checked w/ EVENT_FLAG_I == 0. > > > > @@ -484,6 +484,34 @@ static void amd_iommu_report_page_fault(u16 devid, u16 > > domain_id, > > if (pdev) > > dev_data = dev_iommu_priv_get(>dev); > > + /* > > +* If this is a DMA fault (for which the I(nterrupt) bit will > > +* be unset), allow report_iommu_fault() to prevent logging it. > > +*/ > > + if (dev_data && ((flags & EVENT_FLAG_I) == 0)) { > > + int report_flags; > > + > > + /* > > +* AMD I/O Virtualization Technology (IOMMU) Specification, > > +* revision 3.00, section 2.5.3 ("IO_PAGE_FAULT Event") says > > +* that the RW ("read-write") bit is only valid if the I/O > > +* page
Re: [PATCH v2] iommu/amd: Use report_iommu_fault()
Lennert, On 7/26/2021 11:31 AM, Lennert Buytenhek wrote: This patch makes iommu/amd call report_iommu_fault() when an I/O page fault occurs, which has two effects: 1) It allows device drivers to register a callback to be notified of I/O page faults, via the iommu_set_fault_handler() API. 2) It triggers the io_page_fault tracepoint in report_iommu_fault() when an I/O page fault occurs. I'm mainly interested in (2). We have a daemon with some rasdaemon-like functionality for handling platform errors, and being able to be notified of I/O page faults for initiating corrective action is very useful -- and receiving such events via event tracing is a lot nicer than having to scrape them from kmsg. A number of other IOMMU drivers already use report_iommu_fault(), and I/O page faults on those IOMMUs therefore already seem to trigger this tracepoint -- but this isn't (yet) the case for AMD-Vi and Intel DMAR. I copied the logic from the other callers of report_iommu_fault(), where if that function returns zero, the driver will have handled the fault, in which case we avoid logging information about the fault to the printk buffer from the IOMMU driver. With this patch I see io_page_fault event tracing entries as expected: irq/24-AMD-Vi-48[002] 978.554289: io_page_fault: IOMMU:[drvname] :05:00.0 iova=0x91482640 flags=0x irq/24-AMD-Vi-48[002] 978.554294: io_page_fault: IOMMU:[drvname] :05:00.0 iova=0x91482650 flags=0x irq/24-AMD-Vi-48[002] 978.554299: io_page_fault: IOMMU:[drvname] :05:00.0 iova=0x91482660 flags=0x irq/24-AMD-Vi-48[002] 978.554305: io_page_fault: IOMMU:[drvname] :05:00.0 iova=0x91482670 flags=0x irq/24-AMD-Vi-48[002] 978.554310: io_page_fault: IOMMU:[drvname] :05:00.0 iova=0x91482680 flags=0x irq/24-AMD-Vi-48[002] 978.554315: io_page_fault: IOMMU:[drvname] :05:00.0 iova=0x914826a0 flags=0x For determining IOMMU_FAULT_{READ,WRITE}, I followed the AMD IOMMU spec, but I haven't tested that bit of the code, as the page faults I encounter are all to non-present (!EVENT_FLAG_PR) mappings, in which case EVENT_FLAG_RW doesn't make sense. Signed-off-by: Lennert Buytenhek --- Changes since v1 RFC: - Don't call report_iommu_fault() for IRQ remapping faults. (Suggested by Joerg Roedel.) drivers/iommu/amd/amd_iommu_types.h | 4 drivers/iommu/amd/iommu.c | 29 + 2 files changed, 33 insertions(+) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 94c1a7a9876d..2f2c6630c24c 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -138,6 +138,10 @@ #define EVENT_DOMID_MASK_HI 0xf #define EVENT_FLAGS_MASK 0xfff #define EVENT_FLAGS_SHIFT 0x10 +#define EVENT_FLAG_TR 0x100 +#define EVENT_FLAG_RW 0x020 +#define EVENT_FLAG_PR 0x010 +#define EVENT_FLAG_I 0x008 /* feature control bits */ #define CONTROL_IOMMU_EN0x00ULL diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index a7d6d78147b7..d9fb2c22d44a 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c What if we introduce: +/* + * AMD I/O Virtualization Technology (IOMMU) Specification, + * revision 3.00, section 2.5.3 ("IO_PAGE_FAULT Event") says + * that the RW ("read-write") bit is only valid if the I/O + * page fault was caused by a memory transaction request + * referencing a page that was marked present. + */ +#define IO_PAGE_FAULT_MEM_MASK \ + (EVENT_FLAG_TR | EVENT_FLAG_PR | EVENT_FLAG_I) +#define IS_IOMMU_MEM_TRANSACTION(x)\ + ((x & IO_PAGE_FAULT_MEM_MASK) == EVENT_FLAG_PR) Note that this should have already checked w/ EVENT_FLAG_I == 0. @@ -484,6 +484,34 @@ static void amd_iommu_report_page_fault(u16 devid, u16 domain_id, if (pdev) dev_data = dev_iommu_priv_get(>dev); + /* +* If this is a DMA fault (for which the I(nterrupt) bit will +* be unset), allow report_iommu_fault() to prevent logging it. +*/ + if (dev_data && ((flags & EVENT_FLAG_I) == 0)) { + int report_flags; + + /* +* AMD I/O Virtualization Technology (IOMMU) Specification, +* revision 3.00, section 2.5.3 ("IO_PAGE_FAULT Event") says +* that the RW ("read-write") bit is only valid if the I/O +* page fault was caused by a memory transaction request +* referencing a page that was marked present. +*/ + report_flags = 0; + if ((flags & (EVENT_FLAG_TR | EVENT_FLAG_PR)) == + EVENT_FLAG_PR) { + if (flags & EVENT_FLAG_RW) +
[PATCH v2] iommu/amd: Use report_iommu_fault()
This patch makes iommu/amd call report_iommu_fault() when an I/O page fault occurs, which has two effects: 1) It allows device drivers to register a callback to be notified of I/O page faults, via the iommu_set_fault_handler() API. 2) It triggers the io_page_fault tracepoint in report_iommu_fault() when an I/O page fault occurs. I'm mainly interested in (2). We have a daemon with some rasdaemon-like functionality for handling platform errors, and being able to be notified of I/O page faults for initiating corrective action is very useful -- and receiving such events via event tracing is a lot nicer than having to scrape them from kmsg. A number of other IOMMU drivers already use report_iommu_fault(), and I/O page faults on those IOMMUs therefore already seem to trigger this tracepoint -- but this isn't (yet) the case for AMD-Vi and Intel DMAR. I copied the logic from the other callers of report_iommu_fault(), where if that function returns zero, the driver will have handled the fault, in which case we avoid logging information about the fault to the printk buffer from the IOMMU driver. With this patch I see io_page_fault event tracing entries as expected: irq/24-AMD-Vi-48[002] 978.554289: io_page_fault: IOMMU:[drvname] :05:00.0 iova=0x91482640 flags=0x irq/24-AMD-Vi-48[002] 978.554294: io_page_fault: IOMMU:[drvname] :05:00.0 iova=0x91482650 flags=0x irq/24-AMD-Vi-48[002] 978.554299: io_page_fault: IOMMU:[drvname] :05:00.0 iova=0x91482660 flags=0x irq/24-AMD-Vi-48[002] 978.554305: io_page_fault: IOMMU:[drvname] :05:00.0 iova=0x91482670 flags=0x irq/24-AMD-Vi-48[002] 978.554310: io_page_fault: IOMMU:[drvname] :05:00.0 iova=0x91482680 flags=0x irq/24-AMD-Vi-48[002] 978.554315: io_page_fault: IOMMU:[drvname] :05:00.0 iova=0x914826a0 flags=0x For determining IOMMU_FAULT_{READ,WRITE}, I followed the AMD IOMMU spec, but I haven't tested that bit of the code, as the page faults I encounter are all to non-present (!EVENT_FLAG_PR) mappings, in which case EVENT_FLAG_RW doesn't make sense. Signed-off-by: Lennert Buytenhek --- Changes since v1 RFC: - Don't call report_iommu_fault() for IRQ remapping faults. (Suggested by Joerg Roedel.) drivers/iommu/amd/amd_iommu_types.h | 4 drivers/iommu/amd/iommu.c | 29 + 2 files changed, 33 insertions(+) diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h index 94c1a7a9876d..2f2c6630c24c 100644 --- a/drivers/iommu/amd/amd_iommu_types.h +++ b/drivers/iommu/amd/amd_iommu_types.h @@ -138,6 +138,10 @@ #define EVENT_DOMID_MASK_HI0xf #define EVENT_FLAGS_MASK 0xfff #define EVENT_FLAGS_SHIFT 0x10 +#define EVENT_FLAG_TR 0x100 +#define EVENT_FLAG_RW 0x020 +#define EVENT_FLAG_PR 0x010 +#define EVENT_FLAG_I 0x008 /* feature control bits */ #define CONTROL_IOMMU_EN0x00ULL diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index a7d6d78147b7..d9fb2c22d44a 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -484,6 +484,34 @@ static void amd_iommu_report_page_fault(u16 devid, u16 domain_id, if (pdev) dev_data = dev_iommu_priv_get(>dev); + /* +* If this is a DMA fault (for which the I(nterrupt) bit will +* be unset), allow report_iommu_fault() to prevent logging it. +*/ + if (dev_data && ((flags & EVENT_FLAG_I) == 0)) { + int report_flags; + + /* +* AMD I/O Virtualization Technology (IOMMU) Specification, +* revision 3.00, section 2.5.3 ("IO_PAGE_FAULT Event") says +* that the RW ("read-write") bit is only valid if the I/O +* page fault was caused by a memory transaction request +* referencing a page that was marked present. +*/ + report_flags = 0; + if ((flags & (EVENT_FLAG_TR | EVENT_FLAG_PR)) == + EVENT_FLAG_PR) { + if (flags & EVENT_FLAG_RW) + report_flags |= IOMMU_FAULT_WRITE; + else + report_flags |= IOMMU_FAULT_READ; + } + + if (!report_iommu_fault(_data->domain->domain, + >dev, address, report_flags)) + goto out; + } + if (dev_data) { if (__ratelimit(_data->rs)) { pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x]\n", @@ -495,6 +523,7 @@ static void amd_iommu_report_page_fault(u16 devid, u16 domain_id, domain_id, address, flags);