Re: [RFCv2 PATCH 00/36] Process management for IOMMU + SVM for SMMUv3

2017-11-07 Thread Bob Liu
Hi Jean,

On 2017/10/12 20:55, Jean-Philippe Brucker wrote:
> On 12/10/17 13:05, Yisheng Xie wrote:
> [...]
> * An iommu_process can be bound to multiple domains, and a domain can have
>   multiple iommu_process.
 when bind a task to device, can we create a single domain for it? I am 
 thinking
 about process management without shared PT(for some device only support 
 PASID
 without pri ability), it seems hard to expand if a domain have multiple 
 iommu_process?
 Do you have any idea about this?
>>>
>>> A device always has to be in a domain, as far as I know. Not supporting
>>> PRI forces you to pin down all user mappings (or just the ones you use for
>>> DMA) but you should sill be able to share PT. Now if you don't support
>>> shared PT either, but only PASID, then you'll have to use io-pgtable and a
>>> new map/unmap API on an iommu_process. I don't understand your concern
>>> though, how would the link between process and domains prevent this 
>>> use-case?
>>>
>> So you mean that if an iommu_process bind to multiple devices it should 
>> create
>> multiple io-pgtables? or just share the same io-pgtable?
> 
> I don't know to be honest, I haven't thought much about the io-pgtable
> case, I'm all about sharing the mm :)
> 

Sorry to get back to this thread, but traditional DMA_MAP use case may also 
want to
enable Substreamid/PASID.
As a general framework, you may also consider SubStreamid/Pasid support for dma 
map/io-pgtable.

We're considering make io-pgtables per SubStreamid/Pasid, but haven't decide 
put all 
io-pgtables into a single domain or iommu_process.

Thanks,
Liubo

> It really depends on what the user (GPU driver I assume) wants. I think
> that if you're not sharing an mm with the device, then you're trying to
> hide parts of the process to the device, so you'd also want the
> flexibility of having different io-pgtables between devices. Different
> devices accessing isolated parts of the process requires separate io-pgtables.
> 


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH for-next 2/4] RDMA/hns: Add IOMMU enable support in hip08

2017-11-07 Thread Jason Gunthorpe
On Tue, Nov 07, 2017 at 07:58:05AM -0800, Christoph Hellwig wrote:
> On Tue, Nov 07, 2017 at 08:48:38AM -0700, Jason Gunthorpe wrote:
> > Can't you just use vmalloc and dma_map that? Other drivers follow that
> > approach..
> 
> You can't easily due to the flushing requirements.  We used to do that
> in XFS and it led to problems.  You need the page allocator + vmap +
> invalidate_kernel_vmap_range + flush_kernel_vmap_range to get the
> cache flushing right.

Yes, exactly something ugly like that.. :\

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH for-next 2/4] RDMA/hns: Add IOMMU enable support in hip08

2017-11-07 Thread Christoph Hellwig
On Tue, Nov 07, 2017 at 08:48:38AM -0700, Jason Gunthorpe wrote:
> Can't you just use vmalloc and dma_map that? Other drivers follow that
> approach..

You can't easily due to the flushing requirements.  We used to do that
in XFS and it led to problems.  You need the page allocator + vmap +
invalidate_kernel_vmap_range + flush_kernel_vmap_range to get the
cache flushing right.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH for-next 2/4] RDMA/hns: Add IOMMU enable support in hip08

2017-11-07 Thread Jason Gunthorpe
On Tue, Nov 07, 2017 at 10:45:29AM +0800, Wei Hu (Xavier) wrote:

> We reconstruct the code as below:

> It replaces dma_alloc_coherent with __get_free_pages and
> dma_map_single functions. So, we can vmap serveral ptrs returned by
> __get_free_pages, right?

Can't you just use vmalloc and dma_map that? Other drivers follow that
approach..

However, dma_alloc_coherent and dma_map_single are not the same
thing. You can't touch the vmap memory once you call dma_map unless
the driver also includes dma cache flushing calls in all the right
places.

The difference is that alloc_coherent will return non-cachable memory
if necessary, while get_free_pages does not.

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 08/16] iommu: introduce device fault data

2017-11-07 Thread Jean-Philippe Brucker
On 07/11/17 08:40, Liu, Yi L wrote:
[...]
> +
> +/*  Generic fault types, can be expanded IRQ remapping fault */
> +enum iommu_fault_type {
> + IOMMU_FAULT_DMA_UNRECOV = 1,/* unrecoverable fault */
> + IOMMU_FAULT_PAGE_REQ,   /* page request fault */
> +};
> +
> +enum iommu_fault_reason {
> + IOMMU_FAULT_REASON_CTX = 1,

 If I read the VT-d spec right, this is a fault encountered while
 fetching the PASID table pointer?

> + IOMMU_FAULT_REASON_ACCESS,

 And this a pgd or pte access fault?

> + IOMMU_FAULT_REASON_INVALIDATE,

 What would this be?

> + IOMMU_FAULT_REASON_UNKNOWN,
> +};

 I'm currently doing the same exploratory work for virtio-iommu, and
 I'd be tempted to report reasons as detailed as possible to guest or
 device driver, but it's not clear what they need, how they would use
 this information. I'd like to discuss this some more.
>>>
>>> [Liu, Yi L] In fact, it's not necessary to pass the detailed
>>> unrecoverable fault to guest in virtualization case. Unrecoverable
>>> fault happened on native indicates fault during native IOMMU address
>>> translation. If the fault is not due to guest IOMMU page table
>>> setting, then it is not necessary to inject the fault to guest. And 
>>> hypervisor should
>> be able to deduce it by walking the guest IOMMU page table with the fault 
>> address.
>>
>> I'm not sure the hypervisor should go and inspect the guest's page tables.
> 
> [Liu, Yi L] I think hypervisor needs to do it to make sure reporting fault to 
> guest
> correctly. If not, hypervisor may report some fault to guest and make guest
> confused. e.g. pIOMMU walks page table and failed during walking root 
> table(VT-d)
> or device table(SMMU). such fault is due to no valid programming in host, 
> guest
> has no duty on it and neither has knowledge to fix it. it would make guest to
> believe that it has programmed the root table or device table in the wrong way
> while the fact is not.
> 
>> The pIOMMU already did the walk and reported the fault, so the hypervisor 
>> knows
>> that they are invalid. I thought VT-d and other pIOMMUs provide enough
>> information in the fault report to tell if the error was due to invalid page 
>> tables?
> 
> [Liu, Yi L] yes, pIOMMU did walk and get the fault info, but it's not sure 
> who is
> responsible to the fault. With inspecting the guest table, hypervisor may 
> know who
> should be responsible to the fault.

The SMMU adds information about the origin of the fault in its reports,
for example:

* bad streamid, bad ste, ste fetch, stream disabled: the device tables are
invalid, and the SMMU wasn't able to reach the PASID table.
* bad substreamid, CD fetch, bad CD: the PASID tables are invalid, and the
SMMU wasn't able to reach the page directory.
* walk external abort, translation, addr size, access, permission: the
page tables are invalid

If the pIOMMU driver receives this kind of information already, walking
the page table won't tell us anything more. Wouldn't you get a similar
reason granularity on VT-d?

>>> So I think for
>>> virtualization case, pass the fault address is enough. If hypervisor
>>> doesn't see any issue after checking the guest IOMMU translation
>>> hierarchy, no use to let guest know it. Hypervisor can either throw
>>> error log or stop the guest. If hypervisor see any error in the guest
>>> iommu translation hierarchy, then inject the error to guest with a
>>> proper fault type.> But for device driver or other user-space driver,
>>> I'm not sure if they need detailed fault info. In fact, it is enough to 
>>> pass the
>> possible info which would help them to deduce whether the unrecoverable 
>> fault is
>> due to them. This need more inputs from device driver reviewers.
>>
>> Agreed, though I'm not sure how to reach them.
> 
> [Liu, Yi L] I'd like to supplement my words here. Except the fault address, 
> we may also
> need to provide the BDF and PASID if it is there.

I think the IOMMU should pass the struct device associated to the BDF to
the fault handler. The fault handler can then deduce the BDF from struct
device if it needs to. This also allows to support faults from non-PCI
devices, where the BDF or deviceID is specific to the IOMMU and doesn't
mean anything to the device driver.

I agree that we should provide the PASID if present.

>> At the moment, the only users of report_iommu_fault, the existing fault 
>> reporting
>> mechanism, are ARM-based IOMMU drivers and there are only four device drivers
>> that register a handler with iommu_set_fault_handler. Two of them simply 
>> print the
>> fault, one resets the offending device, and the last one (msm GPU) wants to 
>> provide
>> more detailed debugging information about the device state.
> 
> [Liu, Yi L] Well, it looks like device driver may not try to fix the fault, 
> instead, it would
> more likely do kind of clean up after un-recoverable fault. If 

RE: [PATCH v9 2/4] iommu/dma: Add a helper function to reserve HW MSI address regions for IOMMU drivers

2017-11-07 Thread Shameerali Kolothum Thodi


> -Original Message-
> From: Lorenzo Pieralisi [mailto:lorenzo.pieral...@arm.com]
> Sent: Friday, November 03, 2017 11:35 AM
> To: Shameerali Kolothum Thodi 
> Cc: Robin Murphy ; Will Deacon
> ; Gabriele Paoloni ;
> marc.zyng...@arm.com; linux-...@vger.kernel.org; j...@8bytes.org; John
> Garry ; Guohanjun (Hanjun Guo)
> ; Linuxarm ; linux-
> a...@vger.kernel.org; iommu@lists.linux-foundation.org; Wangzhou (B)
> ; sudeep.ho...@arm.com; bhelg...@google.com;
> linux-arm-ker...@lists.infradead.org; de...@acpica.org
> Subject: Re: [PATCH v9 2/4] iommu/dma: Add a helper function to reserve HW
> MSI address regions for IOMMU drivers
> 
> On Thu, Oct 26, 2017 at 10:11:58AM +, Shameerali Kolothum Thodi wrote:
> 
[..]

> >
> > As we still don’t have a clear resolution on how to invoke the
> > iort_iommu_msi_get_resv_regions(), I have gone back and attempted to
> > move the smmu model check inside the iort code. This means the
> > function will selectively apply HW MSI reservation based on the
> > platform and also the function can be invoked from the
> iommu_dma_get_resv_regions() directly.
> >
> > Could you please take a look at the below snippet and let me know your
> feedback.
> > Hope we can make some progress on this series.
> >
> > Thanks,
> > Shameer
> >
> > -->8--
> > diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> > index 876c0e1..a27233d 100644
> > --- a/drivers/acpi/arm64/iort.c
> > +++ b/drivers/acpi/arm64/iort.c
> > @@ -619,6 +619,39 @@ static int __maybe_unused __get_pci_rid(struct
> pci_dev *pdev, u16 alias,
> > return 0;
> >  }
> >
> > +static bool __maybe_unused iort_hw_msi_resv_enable(struct device *dev,
> > +   struct acpi_iort_node *node)
> > +{
> > +   struct acpi_iort_node *iommu = NULL;
> > +   int i;
> > +
> > +   if (dev_is_pci(dev)) {
> > +   u32 rid;
> > +
> > +   pci_for_each_dma_alias(to_pci_dev(dev), __get_pci_rid, );
> > +   iommu = iort_node_map_id(node, rid, NULL,
> IORT_IOMMU_TYPE);
> > +   } else {
> > +   for (i = 0; i < node->mapping_count; i++) {
> > +   iommu = iort_node_map_platform_id(node, NULL,
> > +   IORT_IOMMU_TYPE,
> i);
> > +   if (iommu)
> > +   break;
> > +   }
> > +   }
> 
> You do not need (and I do not want this code) to do the mapping again.
> 
> You have the fwnode (ie dev->iommu_fwspec) corresponding to the IOMMU,
> you can retrieve the SMMU IORT node by a simple look-up and carry out the
> check below.

Ok. Understood. I will rework this part then.

> It would be simpler to set an option in the SMMUv3 driver but then you go back
> to square one with DT/ACPI SMMUv3 driver awareness so, if, with the change
> above this can make the generic approach work (ie Robin is happy with it) I am
> fine with this IORT update as well.

Thanks Lorenzo. I will rebase on top of rc1 and prepare v10 with these changes
and sent it out.

Shameer 

> > +
> > +   if (iommu && (iommu->type == ACPI_IORT_NODE_SMMU_V3)) {
> > +   struct acpi_iort_smmu_v3 *smmu;
> > +
> > +   smmu = (struct acpi_iort_smmu_v3 *)iommu->node_data;
> > +   if (smmu->model ==
> ACPI_IORT_SMMU_V3_HISILICON_HI161X) {
> > +   dev_notice(dev, "Enabling HiSilicon erratum
> 161010801\n");
> > +   return true;
> > +   }
> > +   }
> > +
> > +   return false;
> > +}
> > +
> >  static int arm_smmu_iort_xlate(struct device *dev, u32 streamid,
> >struct fwnode_handle *fwnode,
> >const struct iommu_ops *ops) @@ -682,6 +715,9
> @@ int
> > iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head
> *head)
> > if (!node)
> > return -ENODEV;
> >
> > +   if (!iort_hw_msi_resv_enable(dev, node))
> > +   return 0;
> > +
> > /*
> >  * Current logic to reserve ITS regions relies on HW topologies
> >  * where a given PCI or named component maps its IDs to only one
> > diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> > index 9d1cebe..67c6e30 100644
> > --- a/drivers/iommu/dma-iommu.c
> > +++ b/drivers/iommu/dma-iommu.c
> > @@ -19,6 +19,7 @@
> >   * along with this program.  If not, see .
> >   */
> >
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -174,6 +175,10 @@ void iommu_dma_get_resv_regions(struct device
> *dev, struct list_head *list)
> > struct pci_host_bridge *bridge;
> > struct resource_entry *window;
> >
> > +   if (!is_of_node(dev->iommu_fwspec->iommu_fwnode) &&
> > +   iort_iommu_msi_get_resv_regions(dev, list) < 0)
> > +   return;
> > +
> > if 

RE: [PATCH v2 08/16] iommu: introduce device fault data

2017-11-07 Thread Liu, Yi L
Hi Jean,

Nice to have you "online". This open is really blocking the progress. Pls check 
inline.

> -Original Message-
> From: Jean-Philippe Brucker [mailto:jean-philippe.bruc...@arm.com]
> Sent: Tuesday, November 7, 2017 3:02 AM
> To: Liu, Yi L ; Jacob Pan ;
> iommu@lists.linux-foundation.org; LKML ; Joerg
> Roedel ; David Woodhouse ; Greg
> Kroah-Hartman ; Wysocki, Rafael J
> 
> Cc: Lan, Tianyu ; Tian, Kevin ; 
> Raj,
> Ashok ; Alex Williamson 
> Subject: Re: [PATCH v2 08/16] iommu: introduce device fault data
> 
> Hi Yi,
> 
> Sorry for the late reply, I seem to have missed this.
> 
> On 20/10/17 11:07, Liu, Yi L wrote:
> [...]
> >>> +
> >>> +/*  Generic fault types, can be expanded IRQ remapping fault */
> >>> +enum iommu_fault_type {
> >>> + IOMMU_FAULT_DMA_UNRECOV = 1,/* unrecoverable fault */
> >>> + IOMMU_FAULT_PAGE_REQ,   /* page request fault */
> >>> +};
> >>> +
> >>> +enum iommu_fault_reason {
> >>> + IOMMU_FAULT_REASON_CTX = 1,
> >>
> >> If I read the VT-d spec right, this is a fault encountered while
> >> fetching the PASID table pointer?
> >>
> >>> + IOMMU_FAULT_REASON_ACCESS,
> >>
> >> And this a pgd or pte access fault?
> >>
> >>> + IOMMU_FAULT_REASON_INVALIDATE,
> >>
> >> What would this be?
> >>
> >>> + IOMMU_FAULT_REASON_UNKNOWN,
> >>> +};
> >>
> >> I'm currently doing the same exploratory work for virtio-iommu, and
> >> I'd be tempted to report reasons as detailed as possible to guest or
> >> device driver, but it's not clear what they need, how they would use
> >> this information. I'd like to discuss this some more.
> >
> > [Liu, Yi L] In fact, it's not necessary to pass the detailed
> > unrecoverable fault to guest in virtualization case. Unrecoverable
> > fault happened on native indicates fault during native IOMMU address
> > translation. If the fault is not due to guest IOMMU page table
> > setting, then it is not necessary to inject the fault to guest. And 
> > hypervisor should
> be able to deduce it by walking the guest IOMMU page table with the fault 
> address.
> 
> I'm not sure the hypervisor should go and inspect the guest's page tables.

[Liu, Yi L] I think hypervisor needs to do it to make sure reporting fault to 
guest
correctly. If not, hypervisor may report some fault to guest and make guest
confused. e.g. pIOMMU walks page table and failed during walking root 
table(VT-d)
or device table(SMMU). such fault is due to no valid programming in host, guest
has no duty on it and neither has knowledge to fix it. it would make guest to
believe that it has programmed the root table or device table in the wrong way
while the fact is not.

> The pIOMMU already did the walk and reported the fault, so the hypervisor 
> knows
> that they are invalid. I thought VT-d and other pIOMMUs provide enough
> information in the fault report to tell if the error was due to invalid page 
> tables?

[Liu, Yi L] yes, pIOMMU did walk and get the fault info, but it's not sure who 
is
responsible to the fault. With inspecting the guest table, hypervisor may know 
who
should be responsible to the fault.

> 
> > So I think for
> > virtualization case, pass the fault address is enough. If hypervisor
> > doesn't see any issue after checking the guest IOMMU translation
> > hierarchy, no use to let guest know it. Hypervisor can either throw
> > error log or stop the guest. If hypervisor see any error in the guest
> > iommu translation hierarchy, then inject the error to guest with a
> > proper fault type.> But for device driver or other user-space driver,
> > I'm not sure if they need detailed fault info. In fact, it is enough to 
> > pass the
> possible info which would help them to deduce whether the unrecoverable fault 
> is
> due to them. This need more inputs from device driver reviewers.
> 
> Agreed, though I'm not sure how to reach them.

[Liu, Yi L] I'd like to supplement my words here. Except the fault address, we 
may also
need to provide the BDF and PASID if it is there.

> 
> At the moment, the only users of report_iommu_fault, the existing fault 
> reporting
> mechanism, are ARM-based IOMMU drivers and there are only four device drivers
> that register a handler with iommu_set_fault_handler. Two of them simply 
> print the
> fault, one resets the offending device, and the last one (msm GPU) wants to 
> provide
> more detailed debugging information about the device state.

[Liu, Yi L] Well, it looks like device driver may not try to fix the fault, 
instead, it would
more likely do kind of clean up after un-recoverable fault. If so, it may be 
enough to
have a notification to device driver.
 
> >> For unrecoverable faults I guess CTX means "the host IOMMU driver is
> >> broken", since the device tables are 

Re: [PATCH 2/2] iommu/ipmmu-vmsa: Hook up r8a779(70|95) DT matching code

2017-11-07 Thread Geert Uytterhoeven
On Wed, Nov 1, 2017 at 11:34 AM, Simon Horman
 wrote:
> Support the r8a77970 (R-Car V3M) and r8a77995 (R-Car D3) IPMMUs by sharing
> feature flags with r8a7795 (R-Car H3) and r8a7796 (R-Car M3-W). Also update
> IOMMU_OF_DECLARE to hook up the compat strings.
>
> Based on work for the r8a7796 by Magnus Damm
>
> Signed-off-by: Simon Horman 

Reviewed-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/ipmmu-vmsa: Add r8a779(70|95) DT bindings

2017-11-07 Thread Geert Uytterhoeven
On Wed, Nov 1, 2017 at 11:34 AM, Simon Horman
 wrote:
> Update the IPMMU DT binding documentation to include the
> r8a77970 (R-Car V3M) and r8a77995 (R-Car D3) compat strings.
>
> Based on work for r8a7796 by Magnus Damm.
>
> Signed-off-by: Simon Horman 

Reviewed-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu