Re: [PATCH v2] iommu/amd: Use report_iommu_fault()

2021-09-25 Thread Lennert Buytenhek
On Sat, Aug 21, 2021 at 06:44:39PM +0300, Lennert Buytenhek wrote:

> > > - EVENT_FLAG_I unset, but the request was a translation request
> > >   (EVENT_FLAG_TR set) or the target page was not present
> > >   (EVENT_FLAG_PR unset): call report_iommu_fault(), but the RW
> > >   bit will be invalid, so don't try to map it to a
> > >   IOMMU_FAULT_{READ,WRITE} code
> > 
> > So, why do we need to call report_iommu_fault() for this case?
> > My understanding is we only have IOMMU_FAULT_[READ|WRITE].
> > So, if we can't identify whether the DMA is read / write,
> > we should not need to call report_iommu_fauilt(), is it?
> 
> I don't think that we should just altogether avoid logging the subset
> of page faults for which we can't determine the read/write direction
> on AMD platforms.
> 
> E.g. "access to an unmapped address" (which will have PR=0, and thus we
> won't know if it was a read or a write access) is just as much of a page
> fault as "write to a read-only page" (which will have PR=1, and thus the
> RW bit will be accurate) is, and for RAS purposes, both events are
> equally interesting, and important to know about.
> 
> It's true that we currently don't have a way of signaling to
> report_iommu_fault() (and by extension, to the io_page_fault
> tracepoint) that we're not sure whether the offending access was a read
> or a write, but I think we can just add a bit to include/linux/iommu.h
> to indicate that, something along the lines of:
> 
>/* iommu fault flags */
>#define IOMMU_FAULT_READ0x0
>#define IOMMU_FAULT_WRITE   0x1
>   +#define IOMMU_FAULT_RW_UNKNOWN  0x2
> 
> (Cc'ing Ohad Ben-Cohen, who originally added this API.)
> 
> I don't think that it would be a good idea to just not signal the page
> faults for which we don't know the read/write direction.

I had another look at this, and from some testing, it seems that,
contrary to what the datasheet suggests, the RW (read/write direction)
bit in logged I/O page faults is actually accurate for both faulting
accesses to present pages and faulting accesses to non-present pages.

I made a few hacks to the ixgbe driver to intentionally cause several
different types of I/O page faults, thereby turning an ixgbe NIC into
a poor man's I/O page fault generator, and these are the resulting
logged I/O page faults, on a Ryzen 3700X system:

Read from non-present page:
ixgbe :25:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 
address=0x1bbc8f040 flags=0x]
=> flags indicate PR(esent)=0, RW=0

Write to non-present page:
ixgbe :25:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 
address=0x1bdcb70c0 flags=0x0020]
=> flags indicate PR(esent)=0, RW=1

Read from write-only page:
ixgbe :25:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 
address=0xbbcc1c40 flags=0x0050]
=> flags indicate PR(esent)=1, RW=0  (and PE(rmission violation)=1)

Write to read-only page:
ixgbe :25:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 
address=0xbdcb70c0 flags=0x0070]
=> flags indicate PR(esent)=1, RW=1  (and PE(rmission violation)=1)

In other words, it seems that the RW bit is reliable even for PR=0 type
faults.

I assume that the restriction mentioned in the docs regarding RW and
PR ("RW is only meaningful when PR=1, TR=0, and I=0" from AMD I/O
Virtualization Technology (IOMMU) Specification, revision 3.00, Table
55: IO_PAGE_FAULT Event Log Buffer Entry Fields) is either confused
or refers to a restriction in older hardware that has since been lifted.

I'll resubmit the patch to unconditionally pass through the RW bit.




ixgbe hacks:

to cause reads from non-present pages (in the TX path):

--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -8232,7 +8232,7 @@ static int ixgbe_tx_map(struct ixgbe_ring *tx_ring,
dma_unmap_len_set(tx_buffer, len, size);
dma_unmap_addr_set(tx_buffer, dma, dma);
 
-   tx_desc->read.buffer_addr = cpu_to_le64(dma);
+   tx_desc->read.buffer_addr = cpu_to_le64(dma + BIT(32));
 
while (unlikely(size > IXGBE_MAX_DATA_PER_TXD)) {
tx_desc->read.cmd_type_len =


to cause writes to non-present pages (in the RX path):

--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1604,7 +1604,7 @@ void ixgbe_alloc_rx_buffers(struct ixgbe_ring *rx_ring, 
u16 cleaned_count)
 * Refresh the desc even if buffer_addrs didn't change
 * because each write-back erases this info.
 */
-   rx_desc->read.pkt_addr = cpu_to_le64(bi->dma + bi->page_offset);
+   rx_desc->read.pkt_addr = cpu_to_le64(bi->dma + bi->page_offset 
+ BIT(32));
 
rx_desc++;
bi++;


to cause reads from write-only pages (in the TX path):

--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ 

Re: [PATCH v2] iommu/amd: Use report_iommu_fault()

2021-08-21 Thread Lennert Buytenhek
On Thu, Aug 05, 2021 at 11:26:25AM -0500, Suthikulpanit, Suravee wrote:

> Lennert,

Hi Suravee,


> > - EVENT_FLAG_I unset, but the request was a translation request
> >   (EVENT_FLAG_TR set) or the target page was not present
> >   (EVENT_FLAG_PR unset): call report_iommu_fault(), but the RW
> >   bit will be invalid, so don't try to map it to a
> >   IOMMU_FAULT_{READ,WRITE} code
> 
> So, why do we need to call report_iommu_fault() for this case?
> My understanding is we only have IOMMU_FAULT_[READ|WRITE].
> So, if we can't identify whether the DMA is read / write,
> we should not need to call report_iommu_fauilt(), is it?

I don't think that we should just altogether avoid logging the subset
of page faults for which we can't determine the read/write direction
on AMD platforms.

E.g. "access to an unmapped address" (which will have PR=0, and thus we
won't know if it was a read or a write access) is just as much of a page
fault as "write to a read-only page" (which will have PR=1, and thus the
RW bit will be accurate) is, and for RAS purposes, both events are
equally interesting, and important to know about.

It's true that we currently don't have a way of signaling to
report_iommu_fault() (and by extension, to the io_page_fault
tracepoint) that we're not sure whether the offending access was a read
or a write, but I think we can just add a bit to include/linux/iommu.h
to indicate that, something along the lines of:

 /* iommu fault flags */
 #define IOMMU_FAULT_READ0x0
 #define IOMMU_FAULT_WRITE   0x1
+#define IOMMU_FAULT_RW_UNKNOWN  0x2

(Cc'ing Ohad Ben-Cohen, who originally added this API.)

I don't think that it would be a good idea to just not signal the page
faults for which we don't know the read/write direction.


Thanks,
Lennert



> > - EVENT_FLAG_I unset, the request is a transaction request (EVENT_FLAG_TR
> >unset) and the target page was present (EVENT_FLAG_PR set): call
> >report_iommu_fault(), and use the RW bit to set IOMMU_FAULT_{READ,WRITE}
> > 
> > So I don't think we can merge the test for EVENT_FLAG_I with the
> > test for EVENT_FLAG_TR/EVENT_FLAG_PR.
> 
> The only condition that we would report_iommu_fault is
> I=0, TR=0, PR=1, isn't it. So we should be able to just check if PR=1.
> 
> 
> > We could do something like this, if you'd prefer:
> > 
> > #define IS_IOMMU_MEM_TRANSACTION(flags) \
> > (((flags) & EVENT_FLAG_I) == 0)
> > 
> > #define IS_RW_FLAG_VALID(flags) \
> > (((flags) & (EVENT_FLAG_TR | EVENT_FLAG_PR)) == EVENT_FLAG_PR)
> > 
> > #define IS_WRITE_REQUEST(flags) \
> > (IS_RW_FLAG_VALID(flags) && (flags & EVENT_FLAG_RW))
> > 
> > And then do something like:
> > 
> > if (dev_data && IS_IOMMU_MEM_TRANSACTION(flags)) {
> > if (!report_iommu_fault(_data->domain->domain, >dev,
> > address,
> > IS_WRITE_REQUEST(flags) ?
> > IOMMU_FAULT_WRITE : IOMMU_FAULT_READ))
> 
> Actually, IS_WRITE_REQUEST() == 0 could mean:
> - I=0, TR=0, PR=1 and RW=0: This is fine.
> - I=0, (TR=1 or PR=0), and we should not be calling report_iommu_fault() here
>   since we cannot specify READ/WRITE here.
> 
> Thanks,
> Suravee
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] iommu/amd: Use report_iommu_fault()

2021-08-05 Thread Suthikulpanit, Suravee via iommu

Lennert,

On 7/29/2021 9:32 PM, Lennert Buytenhek wrote:

We have three cases to handle:

- EVENT_FLAG_I set: IRQ remapping fault, don't call report_iommu_fault()

- EVENT_FLAG_I unset, but the request was a translation request
   (EVENT_FLAG_TR set) or the target page was not present (EVENT_FLAG_PR
   unset): call report_iommu_fault(), but the RW bit will be invalid, so
   don't try to map it to a IOMMU_FAULT_{READ,WRITE} code


So, why do we need to call report_iommu_fault() for this case?
My understanding is we only have IOMMU_FAULT_[READ|WRITE].
So, if we can't identify whether the DMA is read / write,
we should not need to call report_iommu_fauilt(), is it?


- EVENT_FLAG_I unset, the request is a transaction request (EVENT_FLAG_TR
   unset) and the target page was present (EVENT_FLAG_PR set): call
   report_iommu_fault(), and use the RW bit to set IOMMU_FAULT_{READ,WRITE}

So I don't think we can merge the test for EVENT_FLAG_I with the
test for EVENT_FLAG_TR/EVENT_FLAG_PR.


The only condition that we would report_iommu_fault is
I=0, TR=0, PR=1, isn't it. So we should be able to just check if PR=1.



We could do something like this, if you'd prefer:

#define IS_IOMMU_MEM_TRANSACTION(flags) \
(((flags) & EVENT_FLAG_I) == 0)

#define IS_RW_FLAG_VALID(flags) \
(((flags) & (EVENT_FLAG_TR | EVENT_FLAG_PR)) == EVENT_FLAG_PR)

#define IS_WRITE_REQUEST(flags) \
(IS_RW_FLAG_VALID(flags) && (flags & EVENT_FLAG_RW))

And then do something like:

if (dev_data && IS_IOMMU_MEM_TRANSACTION(flags)) {
if (!report_iommu_fault(_data->domain->domain, >dev,
address,
IS_WRITE_REQUEST(flags) ?
IOMMU_FAULT_WRITE : IOMMU_FAULT_READ))


Actually, IS_WRITE_REQUEST() == 0 could mean:
- I=0, TR=0, PR=1 and RW=0: This is fine.
- I=0, (TR=1 or PR=0), and we should not be calling report_iommu_fault() here
  since we cannot specify READ/WRITE here.

Thanks,
Suravee


goto out;
}


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2] iommu/amd: Use report_iommu_fault()

2021-08-03 Thread Lennert Buytenhek
On Fri, Jul 30, 2021 at 05:32:16AM +0300, Lennert Buytenhek wrote:

> > > This patch makes iommu/amd call report_iommu_fault() when an I/O page
> > > fault occurs, which has two effects:
> > > 
> > > 1) It allows device drivers to register a callback to be notified of
> > > I/O page faults, via the iommu_set_fault_handler() API.
> > > 
> > > 2) It triggers the io_page_fault tracepoint in report_iommu_fault()
> > > when an I/O page fault occurs.
> > > 
> > > I'm mainly interested in (2).  We have a daemon with some rasdaemon-like
> > > functionality for handling platform errors, and being able to be notified
> > > of I/O page faults for initiating corrective action is very useful -- and
> > > receiving such events via event tracing is a lot nicer than having to
> > > scrape them from kmsg.
> > > 
> > > A number of other IOMMU drivers already use report_iommu_fault(), and
> > > I/O page faults on those IOMMUs therefore already seem to trigger this
> > > tracepoint -- but this isn't (yet) the case for AMD-Vi and Intel DMAR.
> > > 
> > > I copied the logic from the other callers of report_iommu_fault(), where
> > > if that function returns zero, the driver will have handled the fault,
> > > in which case we avoid logging information about the fault to the printk
> > > buffer from the IOMMU driver.
> > > 
> > > With this patch I see io_page_fault event tracing entries as expected:
> > > 
> > > irq/24-AMD-Vi-48[002]    978.554289: io_page_fault: 
> > > IOMMU:[drvname] :05:00.0 iova=0x91482640 flags=0x
> > > irq/24-AMD-Vi-48[002]    978.554294: io_page_fault: 
> > > IOMMU:[drvname] :05:00.0 iova=0x91482650 flags=0x
> > > irq/24-AMD-Vi-48[002]    978.554299: io_page_fault: 
> > > IOMMU:[drvname] :05:00.0 iova=0x91482660 flags=0x
> > > irq/24-AMD-Vi-48[002]    978.554305: io_page_fault: 
> > > IOMMU:[drvname] :05:00.0 iova=0x91482670 flags=0x
> > > irq/24-AMD-Vi-48[002]    978.554310: io_page_fault: 
> > > IOMMU:[drvname] :05:00.0 iova=0x91482680 flags=0x
> > > irq/24-AMD-Vi-48[002]    978.554315: io_page_fault: 
> > > IOMMU:[drvname] :05:00.0 iova=0x914826a0 flags=0x
> > > 
> > > For determining IOMMU_FAULT_{READ,WRITE}, I followed the AMD IOMMU
> > > spec, but I haven't tested that bit of the code, as the page faults I
> > > encounter are all to non-present (!EVENT_FLAG_PR) mappings, in which
> > > case EVENT_FLAG_RW doesn't make sense.
> > > 
> > > Signed-off-by: Lennert Buytenhek 
> > > ---
> > > Changes since v1 RFC:
> > > 
> > > - Don't call report_iommu_fault() for IRQ remapping faults.
> > >(Suggested by Joerg Roedel.)
> > > 
> > >   drivers/iommu/amd/amd_iommu_types.h |  4 
> > >   drivers/iommu/amd/iommu.c   | 29 +
> > >   2 files changed, 33 insertions(+)
> > > 
> > > diff --git a/drivers/iommu/amd/amd_iommu_types.h 
> > > b/drivers/iommu/amd/amd_iommu_types.h
> > > index 94c1a7a9876d..2f2c6630c24c 100644
> > > --- a/drivers/iommu/amd/amd_iommu_types.h
> > > +++ b/drivers/iommu/amd/amd_iommu_types.h
> > > @@ -138,6 +138,10 @@
> > >   #define EVENT_DOMID_MASK_HI 0xf
> > >   #define EVENT_FLAGS_MASK0xfff
> > >   #define EVENT_FLAGS_SHIFT   0x10
> > > +#define EVENT_FLAG_TR0x100
> > > +#define EVENT_FLAG_RW0x020
> > > +#define EVENT_FLAG_PR0x010
> > > +#define EVENT_FLAG_I 0x008
> > >   /* feature control bits */
> > >   #define CONTROL_IOMMU_EN0x00ULL
> > > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> > > index a7d6d78147b7..d9fb2c22d44a 100644
> > > --- a/drivers/iommu/amd/iommu.c
> > > +++ b/drivers/iommu/amd/iommu.c
> > 
> > What if we introduce:
> > 
> > +/*
> > + * AMD I/O Virtualization Technology (IOMMU) Specification,
> > + * revision 3.00, section 2.5.3 ("IO_PAGE_FAULT Event") says
> > + * that the RW ("read-write") bit is only valid if the I/O
> > + * page fault was caused by a memory transaction request
> > + * referencing a page that was marked present.
> > + */
> > +#define IO_PAGE_FAULT_MEM_MASK \
> > +   (EVENT_FLAG_TR | EVENT_FLAG_PR | EVENT_FLAG_I)
> > +#define IS_IOMMU_MEM_TRANSACTION(x)\
> > +   ((x & IO_PAGE_FAULT_MEM_MASK) == EVENT_FLAG_PR)
> > 
> > Note that this should have already checked w/ EVENT_FLAG_I == 0.
> > 
> > 
> > > @@ -484,6 +484,34 @@ static void amd_iommu_report_page_fault(u16 devid, 
> > > u16 domain_id,
> > >   if (pdev)
> > >   dev_data = dev_iommu_priv_get(>dev);
> > > + /*
> > > +  * If this is a DMA fault (for which the I(nterrupt) bit will
> > > +  * be unset), allow report_iommu_fault() to prevent logging it.
> > > +  */
> > > + if (dev_data && ((flags & EVENT_FLAG_I) == 0)) {
> > > + int report_flags;
> > > +
> > > + /*
> > > +  * AMD I/O Virtualization Technology (IOMMU) Specification,

Re: [PATCH v2] iommu/amd: Use report_iommu_fault()

2021-07-29 Thread Lennert Buytenhek
On Wed, Jul 28, 2021 at 04:51:27PM -0500, Suthikulpanit, Suravee wrote:

> Lennert,

Hi Suravee,


> > This patch makes iommu/amd call report_iommu_fault() when an I/O page
> > fault occurs, which has two effects:
> > 
> > 1) It allows device drivers to register a callback to be notified of
> > I/O page faults, via the iommu_set_fault_handler() API.
> > 
> > 2) It triggers the io_page_fault tracepoint in report_iommu_fault()
> > when an I/O page fault occurs.
> > 
> > I'm mainly interested in (2).  We have a daemon with some rasdaemon-like
> > functionality for handling platform errors, and being able to be notified
> > of I/O page faults for initiating corrective action is very useful -- and
> > receiving such events via event tracing is a lot nicer than having to
> > scrape them from kmsg.
> > 
> > A number of other IOMMU drivers already use report_iommu_fault(), and
> > I/O page faults on those IOMMUs therefore already seem to trigger this
> > tracepoint -- but this isn't (yet) the case for AMD-Vi and Intel DMAR.
> > 
> > I copied the logic from the other callers of report_iommu_fault(), where
> > if that function returns zero, the driver will have handled the fault,
> > in which case we avoid logging information about the fault to the printk
> > buffer from the IOMMU driver.
> > 
> > With this patch I see io_page_fault event tracing entries as expected:
> > 
> > irq/24-AMD-Vi-48[002]    978.554289: io_page_fault: 
> > IOMMU:[drvname] :05:00.0 iova=0x91482640 flags=0x
> > irq/24-AMD-Vi-48[002]    978.554294: io_page_fault: 
> > IOMMU:[drvname] :05:00.0 iova=0x91482650 flags=0x
> > irq/24-AMD-Vi-48[002]    978.554299: io_page_fault: 
> > IOMMU:[drvname] :05:00.0 iova=0x91482660 flags=0x
> > irq/24-AMD-Vi-48[002]    978.554305: io_page_fault: 
> > IOMMU:[drvname] :05:00.0 iova=0x91482670 flags=0x
> > irq/24-AMD-Vi-48[002]    978.554310: io_page_fault: 
> > IOMMU:[drvname] :05:00.0 iova=0x91482680 flags=0x
> > irq/24-AMD-Vi-48[002]    978.554315: io_page_fault: 
> > IOMMU:[drvname] :05:00.0 iova=0x914826a0 flags=0x
> > 
> > For determining IOMMU_FAULT_{READ,WRITE}, I followed the AMD IOMMU
> > spec, but I haven't tested that bit of the code, as the page faults I
> > encounter are all to non-present (!EVENT_FLAG_PR) mappings, in which
> > case EVENT_FLAG_RW doesn't make sense.
> > 
> > Signed-off-by: Lennert Buytenhek 
> > ---
> > Changes since v1 RFC:
> > 
> > - Don't call report_iommu_fault() for IRQ remapping faults.
> >(Suggested by Joerg Roedel.)
> > 
> >   drivers/iommu/amd/amd_iommu_types.h |  4 
> >   drivers/iommu/amd/iommu.c   | 29 +
> >   2 files changed, 33 insertions(+)
> > 
> > diff --git a/drivers/iommu/amd/amd_iommu_types.h 
> > b/drivers/iommu/amd/amd_iommu_types.h
> > index 94c1a7a9876d..2f2c6630c24c 100644
> > --- a/drivers/iommu/amd/amd_iommu_types.h
> > +++ b/drivers/iommu/amd/amd_iommu_types.h
> > @@ -138,6 +138,10 @@
> >   #define EVENT_DOMID_MASK_HI   0xf
> >   #define EVENT_FLAGS_MASK  0xfff
> >   #define EVENT_FLAGS_SHIFT 0x10
> > +#define EVENT_FLAG_TR  0x100
> > +#define EVENT_FLAG_RW  0x020
> > +#define EVENT_FLAG_PR  0x010
> > +#define EVENT_FLAG_I   0x008
> >   /* feature control bits */
> >   #define CONTROL_IOMMU_EN0x00ULL
> > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> > index a7d6d78147b7..d9fb2c22d44a 100644
> > --- a/drivers/iommu/amd/iommu.c
> > +++ b/drivers/iommu/amd/iommu.c
> 
> What if we introduce:
> 
> +/*
> + * AMD I/O Virtualization Technology (IOMMU) Specification,
> + * revision 3.00, section 2.5.3 ("IO_PAGE_FAULT Event") says
> + * that the RW ("read-write") bit is only valid if the I/O
> + * page fault was caused by a memory transaction request
> + * referencing a page that was marked present.
> + */
> +#define IO_PAGE_FAULT_MEM_MASK \
> +   (EVENT_FLAG_TR | EVENT_FLAG_PR | EVENT_FLAG_I)
> +#define IS_IOMMU_MEM_TRANSACTION(x)\
> +   ((x & IO_PAGE_FAULT_MEM_MASK) == EVENT_FLAG_PR)
> 
> Note that this should have already checked w/ EVENT_FLAG_I == 0.
> 
> 
> > @@ -484,6 +484,34 @@ static void amd_iommu_report_page_fault(u16 devid, u16 
> > domain_id,
> > if (pdev)
> > dev_data = dev_iommu_priv_get(>dev);
> > +   /*
> > +* If this is a DMA fault (for which the I(nterrupt) bit will
> > +* be unset), allow report_iommu_fault() to prevent logging it.
> > +*/
> > +   if (dev_data && ((flags & EVENT_FLAG_I) == 0)) {
> > +   int report_flags;
> > +
> > +   /*
> > +* AMD I/O Virtualization Technology (IOMMU) Specification,
> > +* revision 3.00, section 2.5.3 ("IO_PAGE_FAULT Event") says
> > +* that the RW ("read-write") bit is only valid if the I/O
> > +* page 

Re: [PATCH v2] iommu/amd: Use report_iommu_fault()

2021-07-28 Thread Suthikulpanit, Suravee via iommu

Lennert,

On 7/26/2021 11:31 AM, Lennert Buytenhek wrote:

This patch makes iommu/amd call report_iommu_fault() when an I/O page
fault occurs, which has two effects:

1) It allows device drivers to register a callback to be notified of
I/O page faults, via the iommu_set_fault_handler() API.

2) It triggers the io_page_fault tracepoint in report_iommu_fault()
when an I/O page fault occurs.

I'm mainly interested in (2).  We have a daemon with some rasdaemon-like
functionality for handling platform errors, and being able to be notified
of I/O page faults for initiating corrective action is very useful -- and
receiving such events via event tracing is a lot nicer than having to
scrape them from kmsg.

A number of other IOMMU drivers already use report_iommu_fault(), and
I/O page faults on those IOMMUs therefore already seem to trigger this
tracepoint -- but this isn't (yet) the case for AMD-Vi and Intel DMAR.

I copied the logic from the other callers of report_iommu_fault(), where
if that function returns zero, the driver will have handled the fault,
in which case we avoid logging information about the fault to the printk
buffer from the IOMMU driver.

With this patch I see io_page_fault event tracing entries as expected:

irq/24-AMD-Vi-48[002]    978.554289: io_page_fault: IOMMU:[drvname] 
:05:00.0 iova=0x91482640 flags=0x
irq/24-AMD-Vi-48[002]    978.554294: io_page_fault: IOMMU:[drvname] 
:05:00.0 iova=0x91482650 flags=0x
irq/24-AMD-Vi-48[002]    978.554299: io_page_fault: IOMMU:[drvname] 
:05:00.0 iova=0x91482660 flags=0x
irq/24-AMD-Vi-48[002]    978.554305: io_page_fault: IOMMU:[drvname] 
:05:00.0 iova=0x91482670 flags=0x
irq/24-AMD-Vi-48[002]    978.554310: io_page_fault: IOMMU:[drvname] 
:05:00.0 iova=0x91482680 flags=0x
irq/24-AMD-Vi-48[002]    978.554315: io_page_fault: IOMMU:[drvname] 
:05:00.0 iova=0x914826a0 flags=0x

For determining IOMMU_FAULT_{READ,WRITE}, I followed the AMD IOMMU
spec, but I haven't tested that bit of the code, as the page faults I
encounter are all to non-present (!EVENT_FLAG_PR) mappings, in which
case EVENT_FLAG_RW doesn't make sense.

Signed-off-by: Lennert Buytenhek 
---
Changes since v1 RFC:

- Don't call report_iommu_fault() for IRQ remapping faults.
   (Suggested by Joerg Roedel.)

  drivers/iommu/amd/amd_iommu_types.h |  4 
  drivers/iommu/amd/iommu.c   | 29 +
  2 files changed, 33 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 94c1a7a9876d..2f2c6630c24c 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -138,6 +138,10 @@
  #define EVENT_DOMID_MASK_HI   0xf
  #define EVENT_FLAGS_MASK  0xfff
  #define EVENT_FLAGS_SHIFT 0x10
+#define EVENT_FLAG_TR  0x100
+#define EVENT_FLAG_RW  0x020
+#define EVENT_FLAG_PR  0x010
+#define EVENT_FLAG_I   0x008
  
  /* feature control bits */

  #define CONTROL_IOMMU_EN0x00ULL
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a7d6d78147b7..d9fb2c22d44a 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c


What if we introduce:

+/*
+ * AMD I/O Virtualization Technology (IOMMU) Specification,
+ * revision 3.00, section 2.5.3 ("IO_PAGE_FAULT Event") says
+ * that the RW ("read-write") bit is only valid if the I/O
+ * page fault was caused by a memory transaction request
+ * referencing a page that was marked present.
+ */
+#define IO_PAGE_FAULT_MEM_MASK \
+   (EVENT_FLAG_TR | EVENT_FLAG_PR | EVENT_FLAG_I)
+#define IS_IOMMU_MEM_TRANSACTION(x)\
+   ((x & IO_PAGE_FAULT_MEM_MASK) == EVENT_FLAG_PR)

Note that this should have already checked w/ EVENT_FLAG_I == 0.



@@ -484,6 +484,34 @@ static void amd_iommu_report_page_fault(u16 devid, u16 
domain_id,
if (pdev)
dev_data = dev_iommu_priv_get(>dev);
  
+	/*

+* If this is a DMA fault (for which the I(nterrupt) bit will
+* be unset), allow report_iommu_fault() to prevent logging it.
+*/
+   if (dev_data && ((flags & EVENT_FLAG_I) == 0)) {
+   int report_flags;
+
+   /*
+* AMD I/O Virtualization Technology (IOMMU) Specification,
+* revision 3.00, section 2.5.3 ("IO_PAGE_FAULT Event") says
+* that the RW ("read-write") bit is only valid if the I/O
+* page fault was caused by a memory transaction request
+* referencing a page that was marked present.
+*/
+   report_flags = 0;
+   if ((flags & (EVENT_FLAG_TR | EVENT_FLAG_PR)) ==
+   EVENT_FLAG_PR) {
+   if (flags & EVENT_FLAG_RW)
+   

[PATCH v2] iommu/amd: Use report_iommu_fault()

2021-07-26 Thread Lennert Buytenhek
This patch makes iommu/amd call report_iommu_fault() when an I/O page
fault occurs, which has two effects:

1) It allows device drivers to register a callback to be notified of
   I/O page faults, via the iommu_set_fault_handler() API.

2) It triggers the io_page_fault tracepoint in report_iommu_fault()
   when an I/O page fault occurs.

I'm mainly interested in (2).  We have a daemon with some rasdaemon-like
functionality for handling platform errors, and being able to be notified
of I/O page faults for initiating corrective action is very useful -- and
receiving such events via event tracing is a lot nicer than having to
scrape them from kmsg.

A number of other IOMMU drivers already use report_iommu_fault(), and
I/O page faults on those IOMMUs therefore already seem to trigger this
tracepoint -- but this isn't (yet) the case for AMD-Vi and Intel DMAR.

I copied the logic from the other callers of report_iommu_fault(), where
if that function returns zero, the driver will have handled the fault,
in which case we avoid logging information about the fault to the printk
buffer from the IOMMU driver.

With this patch I see io_page_fault event tracing entries as expected:

   irq/24-AMD-Vi-48[002]    978.554289: io_page_fault: IOMMU:[drvname] 
:05:00.0 iova=0x91482640 flags=0x
   irq/24-AMD-Vi-48[002]    978.554294: io_page_fault: IOMMU:[drvname] 
:05:00.0 iova=0x91482650 flags=0x
   irq/24-AMD-Vi-48[002]    978.554299: io_page_fault: IOMMU:[drvname] 
:05:00.0 iova=0x91482660 flags=0x
   irq/24-AMD-Vi-48[002]    978.554305: io_page_fault: IOMMU:[drvname] 
:05:00.0 iova=0x91482670 flags=0x
   irq/24-AMD-Vi-48[002]    978.554310: io_page_fault: IOMMU:[drvname] 
:05:00.0 iova=0x91482680 flags=0x
   irq/24-AMD-Vi-48[002]    978.554315: io_page_fault: IOMMU:[drvname] 
:05:00.0 iova=0x914826a0 flags=0x

For determining IOMMU_FAULT_{READ,WRITE}, I followed the AMD IOMMU
spec, but I haven't tested that bit of the code, as the page faults I
encounter are all to non-present (!EVENT_FLAG_PR) mappings, in which
case EVENT_FLAG_RW doesn't make sense.

Signed-off-by: Lennert Buytenhek 
---
Changes since v1 RFC:

- Don't call report_iommu_fault() for IRQ remapping faults.
  (Suggested by Joerg Roedel.)

 drivers/iommu/amd/amd_iommu_types.h |  4 
 drivers/iommu/amd/iommu.c   | 29 +
 2 files changed, 33 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 94c1a7a9876d..2f2c6630c24c 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -138,6 +138,10 @@
 #define EVENT_DOMID_MASK_HI0xf
 #define EVENT_FLAGS_MASK   0xfff
 #define EVENT_FLAGS_SHIFT  0x10
+#define EVENT_FLAG_TR  0x100
+#define EVENT_FLAG_RW  0x020
+#define EVENT_FLAG_PR  0x010
+#define EVENT_FLAG_I   0x008
 
 /* feature control bits */
 #define CONTROL_IOMMU_EN0x00ULL
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a7d6d78147b7..d9fb2c22d44a 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -484,6 +484,34 @@ static void amd_iommu_report_page_fault(u16 devid, u16 
domain_id,
if (pdev)
dev_data = dev_iommu_priv_get(>dev);
 
+   /*
+* If this is a DMA fault (for which the I(nterrupt) bit will
+* be unset), allow report_iommu_fault() to prevent logging it.
+*/
+   if (dev_data && ((flags & EVENT_FLAG_I) == 0)) {
+   int report_flags;
+
+   /*
+* AMD I/O Virtualization Technology (IOMMU) Specification,
+* revision 3.00, section 2.5.3 ("IO_PAGE_FAULT Event") says
+* that the RW ("read-write") bit is only valid if the I/O
+* page fault was caused by a memory transaction request
+* referencing a page that was marked present.
+*/
+   report_flags = 0;
+   if ((flags & (EVENT_FLAG_TR | EVENT_FLAG_PR)) ==
+   EVENT_FLAG_PR) {
+   if (flags & EVENT_FLAG_RW)
+   report_flags |= IOMMU_FAULT_WRITE;
+   else
+   report_flags |= IOMMU_FAULT_READ;
+   }
+
+   if (!report_iommu_fault(_data->domain->domain,
+   >dev, address, report_flags))
+   goto out;
+   }
+
if (dev_data) {
if (__ratelimit(_data->rs)) {
pci_err(pdev, "Event logged [IO_PAGE_FAULT 
domain=0x%04x address=0x%llx flags=0x%04x]\n",
@@ -495,6 +523,7 @@ static void amd_iommu_report_page_fault(u16 devid, u16 
domain_id,
domain_id, address, flags);