Re: [PATCH v2 23/24] iommu/arm-smmu: Allow non-strict in pgtable_quirks interface

2021-08-03 Thread Will Deacon
On Tue, Aug 03, 2021 at 01:13:12PM +0100, Robin Murphy wrote:
> On 2021-08-03 11:36, Will Deacon wrote:
> > Overall, I'm really nervous about the concurrency here and think we'd be
> > better off requiring the unbind as we have for the other domain changes.
> 
> Sure, the dynamic switch is what makes it ultimately work for Doug's
> use-case (where the unbind isn't viable), but I had every expectation that
> we might need to hold back those two patches for much deeper consideration.
> It's no accident that the first 22 patches can still be usefully applied
> without them!

Oh, the rest of the series looks great which is why I jumped on this bit!

> In all honesty I don't really like this particular patch much, mostly
> because I increasingly dislike IO_PGTABLE_QUIRK_NON_STRICT at all, but since
> the interface was there it made it super easy to prove the concept. I have a
> more significant refactoring of the io-pgtable code sketched out in my mind
> already, it's just going to be more work.

Intriguing... Move the smp_wmb() into the IOVA code instead?

> > With your changes, I think quite a few things can go wrong.
> > 
> >* cookie->fq_domain may be observed but iovad->fq could be NULL
> 
> Good point, I guess that already technically applies (if iovad->fq sat in a
> write buffer long enough), but it certainly becomes far easier to provoke.
> However a barrier after assigning fq_domain (as mentioned above) paired with
> the control dependency around the queue_iova() call would also fix that,
> right?
> 
> >* We can miss the smp_wmb() in the pgtable code but end up deferring the
> >  IOVA reclaim
> >* iommu_change_dev_def_domain() only holds the group mutex afaict, so can
> >  possibly run concurrently with itself on the same domain?
> >* iommu_dma_init_fq() can flip the domain type back from
> >  IOMMU_DOMAIN_DMA_FQ to IOMMU_DOMAIN_DMA on the error path
> >* set_pgtable_quirks() isn't atomic, which probably is ok for now, but the
> >  moment we use it anywhere else it's dangerous
> 
> In other words, IO_PGTABLE_QUIRK_NON_STRICT is definitely the problem. I'll
> have a hack on that this afternoon, and if it starts to look rabbit-holey
> I'll split this bit off and post v3 of the rest of the series.
> 
> If all the io-pgtable and fq behaviour for a given call could be consistent
> based on a single READ_ONCE(cookie->fq_domain) in iommu_dma_unmap(), do you
> see any remaining issues other than the point above?

I'd have to see the patches, and I didn't look exhaustively at the current
stuff. But yes, I think the basic flow needs to be that there is an atomic
flag (i.e. cookie->fq_domain) which indicates which mode is being used
and if you flip that concurrently then you need to guarantee that everybody
is either using the old mode or the new mode and not some halfway state in
between.

Will
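
[Editor's sketch] The "cookie->fq_domain may be observed but iovad->fq could be NULL" concern above is a publish-ordering problem. The following is a minimal userspace sketch, assuming C11 release/acquire atomics as a conservative stand-in for the smp_wmb() pairing discussed in the thread; it is not the kernel code. All demo_* names and the struct layout are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's types. */
struct demo_fq {
	int pending;			/* number of IOVAs queued for deferred free */
};

struct demo_cookie {
	struct demo_fq *fq;		/* plain pointer, plays the role of iovad->fq */
	_Atomic(void *) fq_domain;	/* NULL = strict mode, non-NULL = flush queue */
};

/* Writer: set up the queue first, then publish the flag with release order. */
static void demo_init_fq(struct demo_cookie *c, struct demo_fq *fq)
{
	c->fq = fq;
	atomic_store_explicit(&c->fq_domain, (void *)c, memory_order_release);
}

/*
 * Reader: one acquire load of the flag. If the flag is seen as set, the
 * earlier store to c->fq is guaranteed to be visible as well.
 * Returns 1 if the IOVA was queued, 0 if the caller must sync and free now.
 */
static int demo_free_iova(struct demo_cookie *c)
{
	if (atomic_load_explicit(&c->fq_domain, memory_order_acquire) != NULL) {
		assert(c->fq != NULL);
		c->fq->pending++;
		return 1;
	}
	return 0;
}
```

A caller would initialise the cookie with fq = NULL and fq_domain = NULL, call demo_free_iova() on the unmap path, and call demo_init_fq() once when switching to the flush queue. Whether the kernel needs a full acquire on the read side, or can rely on a weaker dependency, is exactly the open question in the thread.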
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 23/24] iommu/arm-smmu: Allow non-strict in pgtable_quirks interface

2021-08-03 Thread Robin Murphy

On 2021-08-03 11:36, Will Deacon wrote:
> On Mon, Aug 02, 2021 at 03:15:50PM +0100, Robin Murphy wrote:
> > On 2021-08-02 14:04, Will Deacon wrote:
> > > On Wed, Jul 28, 2021 at 04:58:44PM +0100, Robin Murphy wrote:
> > > > To make io-pgtable aware of a flush queue being dynamically enabled,
> > > > allow IO_PGTABLE_QUIRK_NON_STRICT to be set even after a domain has been
> > > > attached to, and hook up the final piece of the puzzle in iommu-dma.
> > > > 
> > > > Signed-off-by: Robin Murphy
> > > > ---
> > > >   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++
> > > >   drivers/iommu/arm/arm-smmu/arm-smmu.c   | 11 +++
> > > >   drivers/iommu/dma-iommu.c   |  3 +++
> > > >   3 files changed, 29 insertions(+)
> > > > 
> > > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > > index 19400826eba7..40fa9cb382c3 100644
> > > > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > > @@ -2711,6 +2711,20 @@ static int arm_smmu_enable_nesting(struct iommu_domain *domain)
> > > >   return ret;
> > > >   }
> > > > +static int arm_smmu_set_pgtable_quirks(struct iommu_domain *domain,
> > > > + unsigned long quirks)
> > > > +{
> > > > + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > > > +
> > > > + if (quirks == IO_PGTABLE_QUIRK_NON_STRICT && smmu_domain->pgtbl_ops) {
> > > > + struct io_pgtable *iop = io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
> > > > +
> > > > + iop->cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> > > > + return 0;
> > > > + }
> > > > + return -EINVAL;
> > > > +}
> > > 
> > > I don't see anything serialising this against a concurrent iommu_unmap(), so
> > > the ordering and atomicity looks quite suspicious to me here. I don't think
> > > it's just the page-table quirks either, but also setting cookie->fq_domain.
> > 
> > Heh, I confess to very much taking the cheeky "let's say nothing and see
> > what Will thinks about concurrency" approach here :)
> 
> Damnit, I fell for that didn't I?
> 
> Overall, I'm really nervous about the concurrency here and think we'd be
> better off requiring the unbind as we have for the other domain changes.

Sure, the dynamic switch is what makes it ultimately work for Doug's
use-case (where the unbind isn't viable), but I had every expectation
that we might need to hold back those two patches for much deeper
consideration. It's no accident that the first 22 patches can still be
usefully applied without them!

In all honesty I don't really like this particular patch much, mostly
because I increasingly dislike IO_PGTABLE_QUIRK_NON_STRICT at all, but
since the interface was there it made it super easy to prove the
concept. I have a more significant refactoring of the io-pgtable code
sketched out in my mind already, it's just going to be more work.

> > The beauty of only allowing relaxation in the strict->non-strict direction
> > is that it shouldn't need serialising as such - it doesn't matter if the
> > update to cookie->fq_domain is observed between iommu_unmap() and
> > iommu_dma_free_iova(), since there's no correctness impact to queueing IOVAs
> > which may already have been invalidated and may or may not have been synced.
> > AFAICS the only condition which matters is that the setting of the
> > io-pgtable quirk must observe fq_domain being set. It feels like there must
> > be enough dependencies on the read side, but we might need an smp_wmb()
> > between the two in iommu_dma_init_fq()?
> > 
> > I've also flip-flopped a bit on whether fq_domain needs to be accessed with
> > READ_ONCE/WRITE_ONCE - by the time of posting I'd convinced myself that it
> > was probably OK, but looking again now I suppose this wacky reordering is
> > theoretically possible:
> > 
> > 
> > 	iommu_dma_unmap() {
> > 		bool free_fq = cookie->fq_domain; // == false
> > 
> > 		iommu_unmap();
> > 
> > 		if (!cookie->fq_domain) // observes new non-NULL value
> > 			iommu_tlb_sync(); // skipped
> > 
> > 		iommu_dma_free_iova { // inlined
> > 			if (free_fq) // false
> > 				queue_iova();
> > 			else
> > 				free_iova_fast(); // Uh-oh!
> > 		}
> > 	}
> > 
> > so although I still can't see atomicity being a problem I guess we do need
> > it for the sake of reordering at least.
> 
> With your changes, I think quite a few things can go wrong.
> 
>   * cookie->fq_domain may be observed but iovad->fq could be NULL

Good point, I guess that already technically applies (if iovad->fq sat
in a write buffer long enough), but it certainly becomes far easier to
provoke. However a barrier after assigning fq_domain (as mentioned
above) paired with the control dependency around the queue_iova() call
would also fix that, right?

>   * We can miss the smp_wmb() in the pgtable code but end up deferring the
>     IOVA reclaim
>   * iommu_change_dev_def_domain() only holds the group mutex afaict, so can
>     possibly run concurrently with itself on the same domain?
>   * iommu_dma_init_fq() can flip the domain type back from
>     IOMMU_DOMAIN_DMA_FQ to 

Re: [PATCH v2 23/24] iommu/arm-smmu: Allow non-strict in pgtable_quirks interface

2021-08-03 Thread Will Deacon
On Mon, Aug 02, 2021 at 03:15:50PM +0100, Robin Murphy wrote:
> On 2021-08-02 14:04, Will Deacon wrote:
> > On Wed, Jul 28, 2021 at 04:58:44PM +0100, Robin Murphy wrote:
> > > To make io-pgtable aware of a flush queue being dynamically enabled,
> > > allow IO_PGTABLE_QUIRK_NON_STRICT to be set even after a domain has been
> > > attached to, and hook up the final piece of the puzzle in iommu-dma.
> > > 
> > > Signed-off-by: Robin Murphy 
> > > ---
> > >   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++
> > >   drivers/iommu/arm/arm-smmu/arm-smmu.c   | 11 +++
> > >   drivers/iommu/dma-iommu.c   |  3 +++
> > >   3 files changed, 29 insertions(+)
> > > 
> > > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > index 19400826eba7..40fa9cb382c3 100644
> > > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > @@ -2711,6 +2711,20 @@ static int arm_smmu_enable_nesting(struct 
> > > iommu_domain *domain)
> > >   return ret;
> > >   }
> > > +static int arm_smmu_set_pgtable_quirks(struct iommu_domain *domain,
> > > + unsigned long quirks)
> > > +{
> > > + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > > +
> > > + if (quirks == IO_PGTABLE_QUIRK_NON_STRICT && smmu_domain->pgtbl_ops) {
> > > > + struct io_pgtable *iop = io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
> > > +
> > > + iop->cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> > > + return 0;
> > > + }
> > > + return -EINVAL;
> > > +}
> > 
> > I don't see anything serialising this against a concurrent iommu_unmap(), so
> > the ordering and atomicity looks quite suspicious to me here. I don't think
> > it's just the page-table quirks either, but also setting cookie->fq_domain.
> 
> Heh, I confess to very much taking the cheeky "let's say nothing and see
> what Will thinks about concurrency" approach here :)

Damnit, I fell for that didn't I?

Overall, I'm really nervous about the concurrency here and think we'd be
better off requiring the unbind as we have for the other domain changes.

> The beauty of only allowing relaxation in the strict->non-strict direction
> is that it shouldn't need serialising as such - it doesn't matter if the
> update to cookie->fq_domain is observed between iommu_unmap() and
> iommu_dma_free_iova(), since there's no correctness impact to queueing IOVAs
> which may already have been invalidated and may or may not have been synced.
> AFAICS the only condition which matters is that the setting of the
> io-pgtable quirk must observe fq_domain being set. It feels like there must
> be enough dependencies on the read side, but we might need an smp_wmb()
> between the two in iommu_dma_init_fq()?
> 
> I've also flip-flopped a bit on whether fq_domain needs to be accessed with
> READ_ONCE/WRITE_ONCE - by the time of posting I'd convinced myself that it
> was probably OK, but looking again now I suppose this wacky reordering is
> theoretically possible:
> 
> 
> 	iommu_dma_unmap() {
> 		bool free_fq = cookie->fq_domain; // == false
> 
> 		iommu_unmap();
> 
> 		if (!cookie->fq_domain) // observes new non-NULL value
> 			iommu_tlb_sync(); // skipped
> 
> 		iommu_dma_free_iova { // inlined
> 			if (free_fq) // false
> 				queue_iova();
> 			else
> 				free_iova_fast(); // Uh-oh!
> 		}
> 	}
> 
> so although I still can't see atomicity being a problem I guess we do need
> it for the sake of reordering at least.

With your changes, I think quite a few things can go wrong.

  * cookie->fq_domain may be observed but iovad->fq could be NULL
  * We can miss the smp_wmb() in the pgtable code but end up deferring the
IOVA reclaim
  * iommu_change_dev_def_domain() only holds the group mutex afaict, so can
possibly run concurrently with itself on the same domain?
  * iommu_dma_init_fq() can flip the domain type back from
IOMMU_DOMAIN_DMA_FQ to IOMMU_DOMAIN_DMA on the error path
  * set_pgtable_quirks() isn't atomic, which probably is ok for now, but the
moment we use it anywhere else it's dangerous

and I suspect there are more :/

Will
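
[Editor's sketch] On the last bullet: a plain "|=" on a shared word is a separate load, OR and store, so two concurrent updaters can lose each other's bits. A minimal userspace sketch of an atomic alternative, assuming C11 atomic_fetch_or as a stand-in for whatever primitive the kernel code would actually use. The demo_* names and the bit value are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical bit value, chosen only for illustration. */
#define DEMO_QUIRK_NON_STRICT	(1UL << 4)

/* Hypothetical stand-in for the quirks word in the page-table config. */
struct demo_cfg {
	atomic_ulong quirks;
};

/*
 * Set quirk bits with a single atomic read-modify-write, so a concurrent
 * caller setting a different bit cannot be overwritten.
 * Returns the quirks value from before the update.
 */
static unsigned long demo_set_quirk(struct demo_cfg *cfg, unsigned long quirk)
{
	return atomic_fetch_or_explicit(&cfg->quirks, quirk,
					memory_order_relaxed);
}
```

This only addresses the lost-update hazard on the quirks word itself. It says nothing about ordering against cookie->fq_domain, which the other bullets cover.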


Re: [PATCH v2 23/24] iommu/arm-smmu: Allow non-strict in pgtable_quirks interface

2021-08-02 Thread Robin Murphy

On 2021-08-02 14:04, Will Deacon wrote:
> On Wed, Jul 28, 2021 at 04:58:44PM +0100, Robin Murphy wrote:
> > To make io-pgtable aware of a flush queue being dynamically enabled,
> > allow IO_PGTABLE_QUIRK_NON_STRICT to be set even after a domain has been
> > attached to, and hook up the final piece of the puzzle in iommu-dma.
> > 
> > Signed-off-by: Robin Murphy
> > ---
> >  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++
> >  drivers/iommu/arm/arm-smmu/arm-smmu.c   | 11 +++
> >  drivers/iommu/dma-iommu.c   |  3 +++
> >  3 files changed, 29 insertions(+)
> > 
> > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > index 19400826eba7..40fa9cb382c3 100644
> > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > @@ -2711,6 +2711,20 @@ static int arm_smmu_enable_nesting(struct iommu_domain *domain)
> >   return ret;
> >  }
> >  
> > +static int arm_smmu_set_pgtable_quirks(struct iommu_domain *domain,
> > + unsigned long quirks)
> > +{
> > + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> > +
> > + if (quirks == IO_PGTABLE_QUIRK_NON_STRICT && smmu_domain->pgtbl_ops) {
> > + struct io_pgtable *iop = io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
> > +
> > + iop->cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> > + return 0;
> > + }
> > + return -EINVAL;
> > +}
> 
> I don't see anything serialising this against a concurrent iommu_unmap(), so
> the ordering and atomicity looks quite suspicious to me here. I don't think
> it's just the page-table quirks either, but also setting cookie->fq_domain.

Heh, I confess to very much taking the cheeky "let's say nothing and see
what Will thinks about concurrency" approach here :)

The beauty of only allowing relaxation in the strict->non-strict
direction is that it shouldn't need serialising as such - it doesn't
matter if the update to cookie->fq_domain is observed between
iommu_unmap() and iommu_dma_free_iova(), since there's no correctness
impact to queueing IOVAs which may already have been invalidated and may
or may not have been synced. AFAICS the only condition which matters is
that the setting of the io-pgtable quirk must observe fq_domain being
set. It feels like there must be enough dependencies on the read side,
but we might need an smp_wmb() between the two in iommu_dma_init_fq()?

I've also flip-flopped a bit on whether fq_domain needs to be accessed
with READ_ONCE/WRITE_ONCE - by the time of posting I'd convinced myself
that it was probably OK, but looking again now I suppose this wacky
reordering is theoretically possible:


	iommu_dma_unmap() {
		bool free_fq = cookie->fq_domain; // == false

		iommu_unmap();

		if (!cookie->fq_domain) // observes new non-NULL value
			iommu_tlb_sync(); // skipped

		iommu_dma_free_iova { // inlined
			if (free_fq) // false
				queue_iova();
			else
				free_iova_fast(); // Uh-oh!
		}
	}

so although I still can't see atomicity being a problem I guess we do
need it for the sake of reordering at least.


Cheers,
Robin.
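
[Editor's sketch] The reordering above comes from reading cookie->fq_domain twice within one unmap. A minimal userspace sketch of the difference between two loads and a single snapshot, assuming C11 relaxed atomic loads as a rough analogue of READ_ONCE(). The concurrent writer is simulated deterministically by a callback that runs between the two decision points. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the DMA cookie; not the kernel's type. */
struct demo_cookie {
	_Atomic(void *) fq_domain;	/* NULL = strict, non-NULL = flush queue */
};

enum demo_outcome {
	DEMO_STRICT,			/* TLB synced, IOVA freed immediately */
	DEMO_QUEUED,			/* sync skipped, IOVA queued */
	DEMO_SYNC_AND_QUEUE,		/* redundant but harmless */
	DEMO_FREE_WITHOUT_SYNC,		/* the "Uh-oh!" case */
};

typedef void (*demo_hook)(struct demo_cookie *);

static enum demo_outcome demo_classify(bool skip_sync, bool queue)
{
	if (skip_sync && !queue)
		return DEMO_FREE_WITHOUT_SYNC;
	if (!skip_sync && queue)
		return DEMO_SYNC_AND_QUEUE;
	return queue ? DEMO_QUEUED : DEMO_STRICT;
}

/* Simulated concurrent writer: switch the cookie to flush-queue mode. */
static void demo_enable_fq(struct demo_cookie *c)
{
	atomic_store_explicit(&c->fq_domain, (void *)c, memory_order_relaxed);
}

/* Two independent loads: the sync and free decisions can disagree. */
static enum demo_outcome demo_unmap_two_loads(struct demo_cookie *c, demo_hook mid)
{
	bool free_fq = atomic_load_explicit(&c->fq_domain, memory_order_relaxed) != NULL;

	if (mid)
		mid(c);		/* writer runs between the two loads */

	bool skip_sync = atomic_load_explicit(&c->fq_domain, memory_order_relaxed) != NULL;

	return demo_classify(skip_sync, free_fq);
}

/* One snapshot: both decisions use the same value, so they cannot disagree. */
static enum demo_outcome demo_unmap_snapshot(struct demo_cookie *c, demo_hook mid)
{
	bool fq = atomic_load_explicit(&c->fq_domain, memory_order_relaxed) != NULL;

	if (mid)
		mid(c);		/* a concurrent flip no longer matters for this call */

	return demo_classify(fq, fq);
}
```

Starting from a strict cookie, demo_unmap_two_loads(&c, demo_enable_fq) lands in DEMO_FREE_WITHOUT_SYNC, while demo_unmap_snapshot() with the same hook stays in DEMO_STRICT for that call and only switches to DEMO_QUEUED on the next one.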


Re: [PATCH v2 23/24] iommu/arm-smmu: Allow non-strict in pgtable_quirks interface

2021-08-02 Thread Will Deacon
On Wed, Jul 28, 2021 at 04:58:44PM +0100, Robin Murphy wrote:
> To make io-pgtable aware of a flush queue being dynamically enabled,
> allow IO_PGTABLE_QUIRK_NON_STRICT to be set even after a domain has been
> attached to, and hook up the final piece of the puzzle in iommu-dma.
> 
> Signed-off-by: Robin Murphy 
> ---
>  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++
>  drivers/iommu/arm/arm-smmu/arm-smmu.c   | 11 +++
>  drivers/iommu/dma-iommu.c   |  3 +++
>  3 files changed, 29 insertions(+)
> 
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index 19400826eba7..40fa9cb382c3 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -2711,6 +2711,20 @@ static int arm_smmu_enable_nesting(struct iommu_domain *domain)
>   return ret;
>  }
>  
> +static int arm_smmu_set_pgtable_quirks(struct iommu_domain *domain,
> + unsigned long quirks)
> +{
> + struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> +
> + if (quirks == IO_PGTABLE_QUIRK_NON_STRICT && smmu_domain->pgtbl_ops) {
> + struct io_pgtable *iop = io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
> +
> + iop->cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
> + return 0;
> + }
> + return -EINVAL;
> +}

I don't see anything serialising this against a concurrent iommu_unmap(), so
the ordering and atomicity looks quite suspicious to me here. I don't think
it's just the page-table quirks either, but also setting cookie->fq_domain.

Will


[PATCH v2 23/24] iommu/arm-smmu: Allow non-strict in pgtable_quirks interface

2021-07-28 Thread Robin Murphy
To make io-pgtable aware of a flush queue being dynamically enabled,
allow IO_PGTABLE_QUIRK_NON_STRICT to be set even after a domain has been
attached to, and hook up the final piece of the puzzle in iommu-dma.

Signed-off-by: Robin Murphy 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 15 +++
 drivers/iommu/arm/arm-smmu/arm-smmu.c   | 11 +++
 drivers/iommu/dma-iommu.c   |  3 +++
 3 files changed, 29 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 19400826eba7..40fa9cb382c3 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2711,6 +2711,20 @@ static int arm_smmu_enable_nesting(struct iommu_domain *domain)
 	return ret;
 }
 
+static int arm_smmu_set_pgtable_quirks(struct iommu_domain *domain,
+				       unsigned long quirks)
+{
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+
+	if (quirks == IO_PGTABLE_QUIRK_NON_STRICT && smmu_domain->pgtbl_ops) {
+		struct io_pgtable *iop = io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
+
+		iop->cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
+		return 0;
+	}
+	return -EINVAL;
+}
+
 static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
 {
 	return iommu_fwspec_add_ids(dev, args->args, 1);
@@ -2825,6 +2839,7 @@ static struct iommu_ops arm_smmu_ops = {
 	.release_device		= arm_smmu_release_device,
 	.device_group		= arm_smmu_device_group,
 	.enable_nesting		= arm_smmu_enable_nesting,
+	.set_pgtable_quirks	= arm_smmu_set_pgtable_quirks,
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
 	.put_resv_regions	= generic_iommu_put_resv_regions,
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 109e4723f9f5..f18684f308b9 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1518,6 +1518,17 @@ static int arm_smmu_set_pgtable_quirks(struct iommu_domain *domain,
 	struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
 	int ret = 0;
 
+	if (quirks == IO_PGTABLE_QUIRK_NON_STRICT) {
+		struct io_pgtable *iop;
+
+		if (!smmu_domain->pgtbl_ops)
+			return -EINVAL;
+
+		iop = io_pgtable_ops_to_pgtable(smmu_domain->pgtbl_ops);
+		iop->cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;
+		return 0;
+	}
+
 	mutex_lock(&smmu_domain->init_mutex);
 	if (smmu_domain->smmu)
 		ret = -EPERM;
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 304a3ec71223..6e3eca778267 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -326,6 +327,8 @@ int iommu_dma_init_fq(struct iommu_domain *domain)
 		return -ENODEV;
 	}
 	cookie->fq_domain = domain;
+	if (domain->ops->set_pgtable_quirks)
+		domain->ops->set_pgtable_quirks(domain, IO_PGTABLE_QUIRK_NON_STRICT);
 	return 0;
 }
 
-- 
2.25.1
