Re: [PATCH] iommu/arm-smmu-v3: Handle duplicated Stream IDs from other masters

2021-01-12 Thread Ajay Kumar
On Tue, Jan 12, 2021 at 12:57 AM Robin Murphy  wrote:
>
> On 2021-01-07 13:03, Will Deacon wrote:
> > On Thu, Jan 07, 2021 at 03:03:40PM +0530, Ajay Kumar wrote:
> >> When PCI function drivers (e.g. pci-endpoint-test) are probed for an
> >> already initialized PCIe RC (Root Complex), and the PCIe RC is already
> >> bound to the SMMU, we hit a situation where the function driver tries to
> >> attach itself to the SMMU with the same stream-id as the PCIe RC and
> >> re-initializes an already initialized STE. This triggers the ste_live
> >> BUG_ON() in the driver.
>
> Note that this is actually expected behaviour, since Stream ID aliasing
> has remained officially not supported until a sufficiently compelling
> reason to do so appears. I always thought the most likely scenario would
> be a legacy PCI bridge with multiple devices behind it, but even that
> seems increasingly improbable for a modern SMMUv3-based system to ever see.
Thanks to Will and Robin for reviewing this. I am pretty new to PCI,
sorry about that.
I assumed that support for stream-id aliasing was already handled by
this patch:
https://www.spinics.net/lists/arm-kernel/msg626087.html
which prevents STE re-initialization. But what I do not understand is why
the path taken by the arm-smmu-v3 driver misses the aforementioned check
in my use case.

> > I don't understand why the endpoint is using the same stream ID as the root
> > complex in this case. Why is that? Is the grouping logic not working
> > properly?
>
> It's not so much that it isn't working properly, it's more that it needs
> to be implemented at all ;)
The pci_endpoint_test device picks up the same OF DMA configuration as the
PCI RC because they sit on the same PCI bus [via pci_dma_configure()].
In the arm-smmu-v3 driver, I can see that pci_device_group() hands the
newly added master device (pci_endpoint_test in our case) the same iommu
group as the Root Complex because they share the same stream ID.
Shouldn't they?

> >> There is an already existing check in the driver to manage duplicated
> >> ids added to the same master device, but there can be scenarios like the
> >> above where we need to extend the check to other masters using the same
> >> stream-id.
> >>
> >> Signed-off-by: Ajay Kumar 
> >> ---
> >>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 33 +
> >>   1 file changed, 33 insertions(+)
> >
> > It doesn't feel like the driver is the right place to fix this, as the same
> > issue could surely occur for other IOMMUs too, right? In which case, I think
> > we should avoid getting into the situation where different groups have
> > overlapping stream IDs.
>
> Yes, this patch does not represent the correct thing to do either way.
> The main reason that Stream ID aliasing hasn't been supported so far is
> that the required Stream ID to group lookup is rather awkward, and
> adding all of that complexity just for the sake of a rather unlikely
> possibility seemed dubious. However, PRI support has always had a more
> pressing need to implement almost the same thing (Stream ID to device),
> so once that lands we can finally get round to adding the rest of proper
> group support relatively easily.
I hope the support will be added soon. Also, can you point me to a few
drivers which already handle this type of stream-ID aliasing?

Thanks,
Ajay Kumar
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/arm-smmu-v3: Handle duplicated Stream IDs from other masters

2021-01-07 Thread Ajay Kumar
When PCI function drivers (e.g. pci-endpoint-test) are probed for an
already initialized PCIe RC (Root Complex), and the PCIe RC is already
bound to the SMMU, we hit a situation where the function driver tries to
attach itself to the SMMU with the same stream-id as the PCIe RC and
re-initializes an already initialized STE. This triggers the ste_live
BUG_ON() in the driver.

There is an already existing check in the driver to manage duplicated ids
added to the same master device, but there can be scenarios like the
above where we need to extend the check to other masters using the same
stream-id.

Signed-off-by: Ajay Kumar 
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e634bbe60573..a91c3c0e9ee8 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2022,10 +2022,26 @@ static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
return step;
 }
 
+static bool arm_smmu_check_duplicated_sid(struct arm_smmu_master *master,
+   int sid)
+{
+   int i;
+
+   for (i = 0; i < master->num_sids; ++i)
+   if (master->sids[i] == sid)
+   return true;
+
+   return false;
+}
+
 static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
 {
+   bool sid_in_other_masters;
int i, j;
struct arm_smmu_device *smmu = master->smmu;
+   unsigned long flags;
+   struct arm_smmu_domain *smmu_domain = master->domain;
+   struct arm_smmu_master *other_masters;
 
for (i = 0; i < master->num_sids; ++i) {
u32 sid = master->sids[i];
@@ -2038,6 +2054,23 @@ static void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master)
if (j < i)
continue;
 
+   /* Check for stream-ID duplication in masters in given domain */
+   sid_in_other_masters = false;
+   spin_lock_irqsave(&smmu_domain->devices_lock, flags);
+   list_for_each_entry(other_masters, &smmu_domain->devices,
+   domain_head) {
+   sid_in_other_masters =
+   arm_smmu_check_duplicated_sid(other_masters,
+   sid);
+   if (sid_in_other_masters)
+   break;
+   }
+   spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
+
+   /* Skip STE re-init if stream-id found in other masters */
+   if (sid_in_other_masters)
+   continue;
+
arm_smmu_write_strtab_ent(master, sid, step);
}
 }
-- 
2.17.1
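The cross-master check the patch introduces is, at heart, a linear scan of every other master's Stream ID table. A minimal userspace model of that logic (a hypothetical `struct master` standing in for `struct arm_smmu_master`; this is a sketch, not the kernel code) looks like:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct arm_smmu_master's Stream ID table */
struct master {
    const unsigned int *sids;
    size_t num_sids;
};

/* Mirrors arm_smmu_check_duplicated_sid(): is @sid in @m's table? */
static bool master_has_sid(const struct master *m, unsigned int sid)
{
    for (size_t i = 0; i < m->num_sids; i++)
        if (m->sids[i] == sid)
            return true;
    return false;
}

/* Mirrors the walk over smmu_domain->devices: does any master in
 * @masters[0..n) other than @self already own @sid? */
bool sid_in_other_masters(const struct master *masters, size_t n,
                          const struct master *self, unsigned int sid)
{
    for (size_t i = 0; i < n; i++)
        if (&masters[i] != self && master_has_sid(&masters[i], sid))
            return true;
    return false;
}
```

In the PCIe scenario above, the root complex and pci_endpoint_test would be two entries sharing one Stream ID, so the endpoint's STE write gets skipped.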



Re: IOVA allocation dependency between firmware buffer and remaining buffers

2020-09-23 Thread Ajay kumar
Hello all,

We pretty much tried to solve the same issue here with a new API in DMA-IOMMU:
https://lore.kernel.org/linux-iommu/20200811054912.ga...@infradead.org/T/

Christoph - the user part would be MFC devices on Exynos platforms

Thanks,
Ajay
On 9/23/20, Christoph Hellwig  wrote:
> On Wed, Sep 23, 2020 at 08:48:26AM +0200, Marek Szyprowski wrote:
>> Hi Shaik,
>>
>> I've run into similar problem while adapting S5P-MFC and Exynos4-IS
>> drivers for generic IOMMU-DMA framework. Here is my first solution:
>> https://lore.kernel.org/linux-samsung-soc/20200918144833.14618-1-m.szyprow...@samsung.com/T/
>>
>>
>>
>> It allows to remap given buffer at the specific IOVA address, although
>> it doesn't guarantee that those specific addresses won't be later used
>> by the IOVA allocator. Probably it would make sense to add an API for
>> generic IOMMU-DMA framework to mark the given IOVA range as
>> reserved/unused to protect them.
>
> If you want to use IOVA addresses in a device otherwise managed by
> dma-iommu we need to expose an API through the dma API.  Can you please
> include the iommu list in the discussion of your series?
>
> I don't think using the raw IOMMU API is a very good idea in these drivers,
> we probably want a way to change the allocator algorithm or hint the
> next IOVA and keep using the normal DMA API.  Maybe Robin has a better
> idea.
>




[RFC V2 PATCH] dma-iommu: allow devices to set IOVA range dynamically

2020-08-10 Thread Ajay Kumar
Currently, there is no way to change the lower limit of IOVA for any
device other than calling init_iova_domain(), but that function re-inits
the whole domain and doesn't keep track of the IOVAs that were already
allocated before the re-init.

There are cases where the device might not support a continuous range of
addresses, and can also have dependencies among the buffers allocated
for firmware, image manipulation, etc., with all of the address requests
passing through the IOMMU. In such cases, we can allocate buffers stage
by stage by setting an address limit, and also keep track of the same.

Bit of background can be found here:
IOVA allocation dependency between firmware buffer and remaining buffers
https://www.spinics.net/lists/iommu/msg43586.html

This patch allows devices to limit the IOVA space they want
during allocation at any given point in time. We shall allow
the same only if the device owns the corresponding iommu_domain,
that is, the device is the only master attached to the domain.

The lower limit of IOVA space is marked by start_pfn, and the upper
limit is marked by dma_mask and this patch honors the same.
Since changing dma_mask can extend the addressable region beyond
current cached node, we do a reset of current cached nodes as well.

User drivers can make call sequence like below:

When they want to limit IOVA for allocated buffers in range
0x0 to 0x100:
iommu_set_iova_range(dev, 0x0, 0x100 - 1);

When they want to limit IOVA for allocated buffers in range
0x100 to 0xC000:
iommu_set_iova_range(dev, 0x100, 0xC000 - 0x100);
=

Signed-off-by: Ajay Kumar 
---
 drivers/iommu/dma-iommu.c | 73 +++
 drivers/iommu/iommu.c | 16 +
 include/linux/iommu.h |  6 
 include/linux/iova.h  |  6 
 4 files changed, 101 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4959f5df21bd..2fe3f57ab648 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -167,6 +167,79 @@ void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 }
 EXPORT_SYMBOL(iommu_dma_get_resv_regions);
 
+/**
+ * iommu_set_iova_range - Limit the IOVA region for a specific device
+ * @dev: Device to set IOVA range for
+ * @base: Base address or the lower limit of the IOVA range
+ * @size: Size of address range from lower limit to upper limit
+ *
+ * Allow a master device to dynamically control the range of IOVA addresses
+ * which are allocated iff the master device is the only device attached to
+ * the corresponding iommu_domain.
+ * This function doesn't harm IOVA addresses outside of current range,
+ * which were allocated prior to calling this function.
+ */
+int iommu_set_iova_range(struct device *dev, dma_addr_t base, u64 size)
+{
+   struct iommu_domain *domain;
+   struct iommu_dma_cookie *cookie;
+   struct iova_domain *iovad;
+   unsigned long shift, base_pfn;
+   u64 new_dma_mask;
+
+   /*
+* Check if the IOMMU master device is the sole entry in the group
+* If the group has more than one master device using the same IOMMU
+* we shouldn't be allowing that device to change the IOVA limit
+*/
+   if (iommu_group_device_count_from_dev(dev) != 1)
+   return -EINVAL;
+
+   domain = iommu_get_domain_for_dev(dev);
+   if (!domain)
+   return -ENODEV;
+
+   if (domain->type != IOMMU_DOMAIN_DMA)
+   return -EINVAL;
+
+   cookie = domain->iova_cookie;
+   if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
+   return -ENODEV;
+
+   iovad = &cookie->iovad;
+
+   shift = iova_shift(iovad);
+   base_pfn = base >> shift;
+
+   base_pfn = max_t(unsigned long, 1, base_pfn);
+
+   /* base cannot be outside aperture */
+   if (domain->geometry.force_aperture) {
+   if (base > domain->geometry.aperture_end ||
+   base + size <= domain->geometry.aperture_start) {
+   pr_warn("specified DMA range outside IOMMU capability\n");
+   return -EFAULT;
+   }
+   /* ...then finally give it a kicking to make sure it fits */
+   base_pfn = max_t(unsigned long, base_pfn,
+   domain->geometry.aperture_start >> shift);
+   }
+   /* Set page aligned lower limit of IOVA range to start_pfn */
+   iovad->start_pfn = base_pfn;
+
+   /* Set upper limit of IOVA range to dma_mask */
+   new_dma_mask = (u64)base + size;
+   dma_set_mask_and_coherent(dev, new_dma_mask);
+
+   /* Reset cached nodes to start IOVA search from the anchor node */
+   iovad->cached_node = &iovad->anchor.node;
+   iovad->cached32_node = &iovad->anchor.node;
+  
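The base/aperture arithmetic in iommu_set_iova_range() can be checked in isolation. This userspace sketch (illustrative geometry values and an assumed 4KB granule; not the kernel structs) mirrors the clamping logic:

```c
#include <stdint.h>

/* Illustrative model of struct iommu_domain's geometry fields */
struct domain_geometry {
    uint64_t aperture_start;
    uint64_t aperture_end;
    int force_aperture;
};

/* Returns the new start_pfn, or 0 (never a valid result here, since the
 * minimum pfn is 1) when the requested range misses the aperture. */
uint64_t clamp_base_pfn(uint64_t base, uint64_t size, unsigned int shift,
                        const struct domain_geometry *geo)
{
    uint64_t base_pfn = base >> shift;

    /* pfn 0 is reserved, as in the patch */
    if (base_pfn < 1)
        base_pfn = 1;

    if (geo->force_aperture) {
        /* Reject ranges wholly outside the aperture */
        if (base > geo->aperture_end ||
            base + size <= geo->aperture_start)
            return 0;
        /* ...then clamp the lower end into the aperture */
        if (base_pfn < (geo->aperture_start >> shift))
            base_pfn = geo->aperture_start >> shift;
    }
    return base_pfn;
}
```

The upper end, as the patch notes, is simply the new dma_mask derived from base + size.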

[RFC PATCH] dma-iommu: allow devices to set IOVA range dynamically

2020-08-03 Thread Ajay Kumar
Currently, there is no way to change the lower limit of IOVA for any
device other than calling init_iova_domain(), but that function re-inits
the whole domain and doesn't keep track of the IOVAs that were already
allocated before the re-init.

There are cases where the device might not support a continuous range of
addresses, and can also have dependencies among the buffers allocated
for firmware, image manipulation, etc., with all of the address requests
passing through the IOMMU. In such cases, we can allocate buffers stage
by stage by setting an address limit, and also keep track of the same.

Bit of background can be found here:
IOVA allocation dependency between firmware buffer and remaining buffers
https://www.spinics.net/lists/iommu/msg43586.html

This patch allows devices to limit the IOVA space they want
during allocation at any given point in time. We shall allow
the same only if the device owns the corresponding iommu_domain,
that is, the device is the only master attached to the domain.

The lower limit of IOVA space is marked by start_pfn, and the upper
limit is marked by dma_mask and this patch honors the same.
Since changing dma_mask can extend the addressable region beyond
current cached node, we do a reset of current cached nodes as well.

Signed-off-by: Ajay Kumar 
---
 drivers/iommu/dma-iommu.c | 73 +++
 drivers/iommu/iommu.c | 16 +
 include/linux/iommu.h |  6 
 include/linux/iova.h  |  6 
 4 files changed, 101 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4959f5df21bd..2fe3f57ab648 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -167,6 +167,79 @@ void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 }
 EXPORT_SYMBOL(iommu_dma_get_resv_regions);
 
+/**
+ * iommu_set_iova_range - Limit the IOVA region for a specific device
+ * @dev: Device to set IOVA range for
+ * @base: Base address or the lower limit of the IOVA range
+ * @size: Size of address range from lower limit to upper limit
+ *
+ * Allow a master device to dynamically control the range of IOVA addresses
+ * which are allocated iff the master device is the only device attached to
+ * the corresponding iommu_domain.
+ * This function doesn't harm IOVA addresses outside of current range,
+ * which were allocated prior to calling this function.
+ */
+int iommu_set_iova_range(struct device *dev, dma_addr_t base, u64 size)
+{
+   struct iommu_domain *domain;
+   struct iommu_dma_cookie *cookie;
+   struct iova_domain *iovad;
+   unsigned long shift, base_pfn;
+   u64 new_dma_mask;
+
+   /*
+* Check if the IOMMU master device is the sole entry in the group
+* If the group has more than one master device using the same IOMMU
+* we shouldn't be allowing that device to change the IOVA limit
+*/
+   if (iommu_group_device_count_from_dev(dev) != 1)
+   return -EINVAL;
+
+   domain = iommu_get_domain_for_dev(dev);
+   if (!domain)
+   return -ENODEV;
+
+   if (domain->type != IOMMU_DOMAIN_DMA)
+   return -EINVAL;
+
+   cookie = domain->iova_cookie;
+   if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
+   return -ENODEV;
+
+   iovad = &cookie->iovad;
+
+   shift = iova_shift(iovad);
+   base_pfn = base >> shift;
+
+   base_pfn = max_t(unsigned long, 1, base_pfn);
+
+   /* base cannot be outside aperture */
+   if (domain->geometry.force_aperture) {
+   if (base > domain->geometry.aperture_end ||
+   base + size <= domain->geometry.aperture_start) {
+   pr_warn("specified DMA range outside IOMMU capability\n");
+   return -EFAULT;
+   }
+   /* ...then finally give it a kicking to make sure it fits */
+   base_pfn = max_t(unsigned long, base_pfn,
+   domain->geometry.aperture_start >> shift);
+   }
+   /* Set page aligned lower limit of IOVA range to start_pfn */
+   iovad->start_pfn = base_pfn;
+
+   /* Set upper limit of IOVA range to dma_mask */
+   new_dma_mask = (u64)base + size;
+   dma_set_mask_and_coherent(dev, new_dma_mask);
+
+   /* Reset cached nodes to start IOVA search from the anchor node */
+   iovad->cached_node = &iovad->anchor.node;
+   iovad->cached32_node = &iovad->anchor.node;
+   iovad->max32_alloc_size = iovad->dma_32bit_pfn;
+
+   return 0;
+}
+EXPORT_SYMBOL(iommu_set_iova_range);
+
 static int cookie_init_hw_msi_region(struct iommu_dma_cookie *cookie,
phys_addr_t start, phys_addr_t end)
 {
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 609bd25bf154..30b2d4e5487d 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -919,6 +919,22 @@ static int iommu_gro

Re: [PATCH] iommu/iova: Retry from last rb tree node if iova search fails

2020-05-07 Thread Ajay kumar
On 5/7/20, Robin Murphy  wrote:
> On 2020-05-06 9:01 pm, vji...@codeaurora.org wrote:
>> From: Vijayanand Jitta 
>>
>> When ever a new iova alloc request comes iova is always searched
>> from the cached node and the nodes which are previous to cached
>> node. So, even if there is free iova space available in the nodes
>> which are next to the cached node iova allocation can still fail
>> because of this approach.
>>
>> Consider the following sequence of iova alloc and frees on
>> 1GB of iova space
>>
>> 1) alloc - 500MB
>> 2) alloc - 12MB
>> 3) alloc - 499MB
>> 4) free -  12MB which was allocated in step 2
>> 5) alloc - 13MB
>>
>> After the above sequence we will have 12MB of free iova space and
>> cached node will be pointing to the iova pfn of last alloc of 13MB
>> which will be the lowest iova pfn of that iova space. Now if we get an
>> alloc request of 2MB we just search from cached node and then look
>> for lower iova pfn's for free iova and as they aren't any, iova alloc
>> fails though there is 12MB of free iova space.
>
> Yup, this could definitely do with improving. Unfortunately I think this
> particular implementation is slightly flawed...
>
>> To avoid such iova search failures do a retry from the last rb tree node
>> when iova search fails, this will search the entire tree and get an iova
>> if its available
>>
>> Signed-off-by: Vijayanand Jitta 
>> ---
>>   drivers/iommu/iova.c | 11 +++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>> index 0e6a953..2985222 100644
>> --- a/drivers/iommu/iova.c
>> +++ b/drivers/iommu/iova.c
>> @@ -186,6 +186,7 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>>  unsigned long flags;
>>  unsigned long new_pfn;
>>  unsigned long align_mask = ~0UL;
>> +bool retry = false;
>>
>>  if (size_aligned)
>>  align_mask <<= fls_long(size - 1);
>> @@ -198,6 +199,8 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>>
>>  curr = __get_cached_rbnode(iovad, limit_pfn);
>>  curr_iova = rb_entry(curr, struct iova, node);
>> +
>> +retry_search:
>>  do {
>>  limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
>>  new_pfn = (limit_pfn - size) & align_mask;
>> @@ -207,6 +210,14 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>>  } while (curr && new_pfn <= curr_iova->pfn_hi);
>>
>>  if (limit_pfn < size || new_pfn < iovad->start_pfn) {
>> +if (!retry) {
>> +curr = rb_last(&iovad->rbroot);
>
> Why walk when there's an anchor node there already? However...
+1
>
>> +curr_iova = rb_entry(curr, struct iova, node);
>> +limit_pfn = curr_iova->pfn_lo;
>
> ...this doesn't look right, as by now we've lost the original limit_pfn
> supplied by the caller, so are highly likely to allocate beyond the
> range our caller asked for. In fact AFAICS we'd start allocating from
> directly below the anchor node, beyond the end of the entire
> address space.
+1
>
> The logic I was imagining we want here was something like the rapidly
> hacked up (and untested) diff below.
>
> Thanks,
> Robin.
>
> ->8-
> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
> index 0e6a9536eca6..3574c19272d6 100644
> --- a/drivers/iommu/iova.c
> +++ b/drivers/iommu/iova.c
> @@ -186,6 +186,7 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>  unsigned long flags;
>  unsigned long new_pfn;
>  unsigned long align_mask = ~0UL;
> +   unsigned long alloc_hi, alloc_lo;
>
>  if (size_aligned)
>  align_mask <<= fls_long(size - 1);
> @@ -196,17 +197,27 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>  size >= iovad->max32_alloc_size)
>  goto iova32_full;
>
> +   alloc_hi = IOVA_ANCHOR;
> +   alloc_lo = iovad->start_pfn;
> +retry:
>  curr = __get_cached_rbnode(iovad, limit_pfn);
>  curr_iova = rb_entry(curr, struct iova, node);
> +   if (alloc_hi < curr_iova->pfn_hi) {
> +   alloc_lo = curr_iova->pfn_hi;
> +   alloc_hi = limit_pfn;
> +   }
> +
>  do {
> -   limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
> -   new_pfn = (limit_pfn - size) & align_mask;
> +   alloc_hi = min(alloc_hi, curr_iova->pfn_lo);
In the retry case, curr and curr_iova are not updated. Kindly check it.

Ajay
> +   new_pfn = (alloc_hi - size) & align_mask;
>  prev = curr;
>  curr = rb_prev(curr);
>  curr_iova = rb_entry(curr, struct iova, node);
>  } while (curr && new_pfn <= curr_iova->pfn_hi);
>
> -   if (limit_pfn < size || new_pfn < iovad->start_pfn) {
> +   if (limit_pfn < size || new_pfn < alloc_lo) {
> +   if (alloc_lo == iovad->start_pfn)
> 
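The failure mode in the commit message, and the effect of a retry pass, can be reproduced with a toy top-down allocator. This is a userspace model (an MB-granularity bitmap standing in for the kernel's rbtree, illustrative only):

```c
#include <stdbool.h>

/* Toy model: 1GB of IOVA space at 1MB granularity. A top-down search
 * that starts only at the "cached" position misses free space above it;
 * a retry from the very top recovers it. Not the kernel implementation. */
#define SPACE_MB 1024
static bool used[SPACE_MB];
static int cached = SPACE_MB;   /* search starts just below this */

/* Place @size MB in the highest free run below @limit; -1 on failure */
static int alloc_below(int limit, int size)
{
    for (int hi = limit; hi >= size; hi--) {
        bool free_run = true;
        for (int i = hi - size; i < hi; i++)
            if (used[i]) { free_run = false; break; }
        if (free_run) {
            for (int i = hi - size; i < hi; i++)
                used[i] = true;
            cached = hi - size;
            return hi - size;
        }
    }
    return -1;
}

/* Cached-only search; optionally retry from the top on failure */
int iova_alloc(int size, bool retry)
{
    int pfn = alloc_below(cached, size);
    if (pfn < 0 && retry)
        pfn = alloc_below(SPACE_MB, size);
    return pfn;
}

void iova_free(int start, int size)
{
    for (int i = start; i < start + size; i++)
        used[i] = false;
}
```

Running the 500/12/499 alloc-then-free sequence from the commit message, a subsequent 2MB request fails with the cached-only search but succeeds once the search restarts from the top of the space.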

Re: [RFC PATCH] drivers: iommu: reset cached node if dma_mask is changed

2020-05-07 Thread Ajay kumar
Hi Robin,

On 5/7/20, Robin Murphy  wrote:
> On 2020-05-04 7:37 pm, Ajay Kumar wrote:
>> The current IOVA allocation code stores a cached copy of the
>> first allocated IOVA address node, and all the subsequent allocations
>> have no way to get past(higher than) the first allocated IOVA range.
>
> Strictly they do, after that first allocation gets freed, or if the
> first limit was <=32 bits and the subsequent limit >32 bits ;)
In my case, the first allocated buffer is the firmware buffer,
and it's not released.

>> This causes issue when dma_mask for the master device is changed.
>> Though the DMA window is increased, the allocation code unaware of
>> the change, goes ahead allocating IOVA address lower than the
>> first allocated IOVA address.
>>
>> This patch adds a check for dma_mask change in the IOVA allocation
>> function and resets the cached IOVA node to anchor node everytime
>> the dma_mask change is observed.
>
> This isn't the right approach, since limit_pfn is by design a transient
> per-allocation thing. Devices with different limits may well be
> allocating from the same IOVA domain concurrently, which is the whole
> reason for maintaining two cached nodes to serve the expected PCI case
> of mixing 32-bit and 64-bit limits. Trying to track a per-allocation
> property on a per-domain basis is just going to thrash and massively
> hurt such cases.
Agreed. But if two or more non-PCI devices share the same domain,
and one of the devices has a higher addressable limit,
the current logic forces it to take the lower window.
> A somewhat more appropriate fix to the allocation loop itself has been
> proposed here:
>
> https://lore.kernel.org/linux-iommu/1588795317-20879-1-git-send-email-vji...@codeaurora.org/
This patch addresses my problem indirectly. I shall add my concerns there.

Ajay
>> NOTE:
>>   This patch is needed to address the issue discussed in below thread:
>>   https://www.spinics.net/lists/iommu/msg43586.html
>>
>> Signed-off-by: Ajay Kumar 
>> Signed-off-by: Sathyam Panda 
>> ---
>>   drivers/iommu/iova.c | 17 -
>>   include/linux/iova.h |  1 +
>>   2 files changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>> index 41c605b0058f..0e99975036ae 100644
>> --- a/drivers/iommu/iova.c
>> +++ b/drivers/iommu/iova.c
>> @@ -44,6 +44,7 @@ init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>>  iovad->granule = granule;
>>  iovad->start_pfn = start_pfn;
>>  iovad->dma_32bit_pfn = 1UL << (32 - iova_shift(iovad));
>> +iovad->curr_limit_pfn = iovad->dma_32bit_pfn;
>>  iovad->max32_alloc_size = iovad->dma_32bit_pfn;
>>  iovad->flush_cb = NULL;
>>  iovad->fq = NULL;
>> @@ -116,9 +117,20 @@ EXPORT_SYMBOL_GPL(init_iova_flush_queue);
>>   static struct rb_node *
>>   __get_cached_rbnode(struct iova_domain *iovad, unsigned long limit_pfn)
>>   {
>> -if (limit_pfn <= iovad->dma_32bit_pfn)
>> +if (limit_pfn <= iovad->dma_32bit_pfn) {
>> +/* re-init cached node if DMA limit has changed */
>> +if (limit_pfn != iovad->curr_limit_pfn) {
>> +iovad->cached32_node = &iovad->anchor.node;
>> +iovad->curr_limit_pfn = limit_pfn;
>> +}
>>  return iovad->cached32_node;
>> +}
>>
>> +/* re-init cached node if DMA limit has changed */
>> +if (limit_pfn != iovad->curr_limit_pfn) {
>> +iovad->cached_node = &iovad->anchor.node;
>> +iovad->curr_limit_pfn = limit_pfn;
>> +}
>>  return iovad->cached_node;
>>   }
>>
>> @@ -190,6 +202,9 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>>  if (size_aligned)
>>  align_mask <<= fls_long(size - 1);
>>
>> +if (limit_pfn != iovad->curr_limit_pfn)
>> +iovad->max32_alloc_size = iovad->dma_32bit_pfn;
>> +
>>  /* Walk the tree backwards */
>> +spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
>>  if (limit_pfn <= iovad->dma_32bit_pfn &&
>> diff --git a/include/linux/iova.h b/include/linux/iova.h
>> index a0637abffee8..be2220c096ef 100644
>> --- a/include/linux/iova.h
>> +++ b/include/linux/iova.h
>> @@ -73,6 +73,7 @@ struct iova_domain {
>>  unsigned long   granule;/* pfn granularity for this domain */
>>  unsigned long   start_pfn;  /* Lower limit for this domain */
>>  unsigned long   dma_32bit_pfn;
>> +unsigned long   curr_limit_pfn; /* Current max limit for this domain */
>>  unsigned long   max32_alloc_size; /* Size of last failed allocation */
>>  struct iova_fq __percpu *fq;/* Flush Queue */
>>
>>


[RFC PATCH] drivers: iommu: reset cached node if dma_mask is changed

2020-05-04 Thread Ajay Kumar
The current IOVA allocation code stores a cached copy of the
first allocated IOVA address node, and all the subsequent allocations
have no way to get past (higher than) the first allocated IOVA range.

This causes an issue when the dma_mask for the master device is changed.
Though the DMA window is increased, the allocation code, unaware of
the change, goes ahead allocating IOVA addresses lower than the
first allocated IOVA address.

This patch adds a check for a dma_mask change in the IOVA allocation
function and resets the cached IOVA node to the anchor node every time
a dma_mask change is observed.

NOTE:
 This patch is needed to address the issue discussed in below thread:
 https://www.spinics.net/lists/iommu/msg43586.html

Signed-off-by: Ajay Kumar 
Signed-off-by: Sathyam Panda 
---
 drivers/iommu/iova.c | 17 -
 include/linux/iova.h |  1 +
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 41c605b0058f..0e99975036ae 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -44,6 +44,7 @@ init_iova_domain(struct iova_domain *iovad, unsigned long granule,
iovad->granule = granule;
iovad->start_pfn = start_pfn;
iovad->dma_32bit_pfn = 1UL << (32 - iova_shift(iovad));
+   iovad->curr_limit_pfn = iovad->dma_32bit_pfn;
iovad->max32_alloc_size = iovad->dma_32bit_pfn;
iovad->flush_cb = NULL;
iovad->fq = NULL;
@@ -116,9 +117,20 @@ EXPORT_SYMBOL_GPL(init_iova_flush_queue);
 static struct rb_node *
 __get_cached_rbnode(struct iova_domain *iovad, unsigned long limit_pfn)
 {
-   if (limit_pfn <= iovad->dma_32bit_pfn)
+   if (limit_pfn <= iovad->dma_32bit_pfn) {
+   /* re-init cached node if DMA limit has changed */
+   if (limit_pfn != iovad->curr_limit_pfn) {
+   iovad->cached32_node = &iovad->anchor.node;
+   iovad->curr_limit_pfn = limit_pfn;
+   }
return iovad->cached32_node;
+   }
 
+   /* re-init cached node if DMA limit has changed */
+   if (limit_pfn != iovad->curr_limit_pfn) {
+   iovad->cached_node = &iovad->anchor.node;
+   iovad->curr_limit_pfn = limit_pfn;
+   }
return iovad->cached_node;
 }
 
@@ -190,6 +202,9 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
if (size_aligned)
align_mask <<= fls_long(size - 1);
 
+   if (limit_pfn != iovad->curr_limit_pfn)
+   iovad->max32_alloc_size = iovad->dma_32bit_pfn;
+
/* Walk the tree backwards */
	spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
if (limit_pfn <= iovad->dma_32bit_pfn &&
diff --git a/include/linux/iova.h b/include/linux/iova.h
index a0637abffee8..be2220c096ef 100644
--- a/include/linux/iova.h
+++ b/include/linux/iova.h
@@ -73,6 +73,7 @@ struct iova_domain {
unsigned long   granule;/* pfn granularity for this domain */
unsigned long   start_pfn;  /* Lower limit for this domain */
unsigned long   dma_32bit_pfn;
+   unsigned long   curr_limit_pfn; /* Current max limit for this domain */
unsigned long   max32_alloc_size; /* Size of last failed allocation */
struct iova_fq __percpu *fq;/* Flush Queue */
 
-- 
2.17.1
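The interaction the patch addresses, a cached search position pinned below a previously small DMA limit, can be demonstrated with a toy allocator; the `curr_limit_pfn`-style reset is modelled by a `reset_cache` flag. This is a userspace sketch at MB granularity, not the kernel code:

```c
#include <stdbool.h>

/* Toy model: 4GB of IOVA space at 1MB granularity. Without a cache
 * reset on limit change, allocations keep landing below the cached
 * position even after the device's limit (dma_mask) is raised. */
#define SPACE_MB 4096
static bool used[SPACE_MB];
static int cached = SPACE_MB;
static int last_limit = -1;

/* Top-down allocation bounded by @limit_mb (models the dma_mask).
 * If @reset_cache, restart from @limit_mb whenever the limit changes,
 * as the patch does via curr_limit_pfn. Returns the start MB or -1. */
int alloc_limited(int limit_mb, int size, bool reset_cache)
{
    if (reset_cache && limit_mb != last_limit)
        cached = limit_mb;
    last_limit = limit_mb;

    int start = cached < limit_mb ? cached : limit_mb;
    for (int hi = start; hi >= size; hi--) {
        bool free_run = true;
        for (int i = hi - size; i < hi; i++)
            if (used[i]) { free_run = false; break; }
        if (free_run) {
            for (int i = hi - size; i < hi; i++)
                used[i] = true;
            cached = hi - size;
            return hi - size;
        }
    }
    return -1;
}
```

With a firmware buffer placed under a low limit first, a later allocation under a raised limit still lands below the firmware unless the cache is reset, which is exactly the symptom described in the commit message.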



Re: IOVA allocation dependency between firmware buffer and remaining buffers

2020-04-24 Thread Ajay kumar
Can someone check this?

On Mon, Apr 20, 2020 at 9:24 PM Ajay kumar  wrote:
>
> Hi All,
>
> I have an IOMMU master which has limitations as mentioned below:
> 1) The IOMMU master internally executes firmware, and the firmware memory
> is allocated by the same master driver.
> The firmware buffer address should be the lowest of all the addresses
> allocated for the device, or in other words, all the remaining buffer addresses
> should always be in a higher range than the firmware address.
> 2) None of the buffer addresses should go beyond 0xC000_0000
>
> example:
> If the firmware buffer address is buf_fw = 0x8000_5000;
> All other addresses given to the device should be greater than
> (0x8000_5000 + firmware size) and less than 0xC000_0000
>
> Currently, this is being handled with one of the below hacks:
> 1) By keeping dma_mask in lower range while allocating firmware buffer,
> and then increasing the dma_mask to higher range for other buffers.
> 2) By reserving IOVA for firmware at the lowest range and creating direct 
> mappings for the same.
>
> I want to know if there is a better way this can be handled with current 
> framework,
> or if anybody is facing similar problems with their devices,
> please share how it is taken care.
>
> I also think there should be some way the masters can specify the IOVA
> range they want to limit to for current allocation.
> Something like a new iommu_ops callback like below:
> limit_iova_alloc_range(dev, iova_start, iova_end)
>
> And, in my driver, the sequence will be:
> limit_iova_alloc_range(dev, 0x0000_0000, 0x1000_0000); /* via helpers */
> alloc( ) firmware buffer using DMA API
> limit_iova_alloc_range(dev, 0x1000_0000, 0xC000_0000); /* via helpers */
> alloc( ) other buffers using DMA API
>
> Thanks,
> Ajay Kumar


IOVA allocation dependency between firmware buffer and remaining buffers

2020-04-20 Thread Ajay kumar
Hi All,

I have an IOMMU master which has limitations as mentioned below:
1) The IOMMU master internally executes firmware, and the firmware memory
is allocated by the same master driver.
The firmware buffer address should be the lowest of all the addresses
allocated for the device, or in other words, all the remaining buffer
addresses should always be in a higher range than the firmware address.
2) None of the buffer addresses should go beyond 0xC000_0000

example:
If the firmware buffer address is buf_fw = 0x8000_5000;
All other addresses given to the device should be greater than
(0x8000_5000 + firmware size) and less than 0xC000_0000
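Assuming the example numbers above (and that 0xC000_0000 is the intended hard limit), the constraint can be captured as a small predicate:

```c
#include <stdbool.h>
#include <stdint.h>

/* Check the constraints described above: every non-firmware IOVA must
 * lie above the firmware buffer and below the device's hard limit.
 * Addresses are illustrative, taken from the example in this mail. */
#define IOVA_HARD_LIMIT 0xC0000000u

bool iova_ok(uint32_t fw_base, uint32_t fw_size,
             uint32_t buf_base, uint32_t buf_size)
{
    if (buf_base < fw_base + fw_size)
        return false;                 /* below or overlapping firmware */
    if (buf_base + buf_size > IOVA_HARD_LIMIT)
        return false;                 /* beyond the device's limit */
    return true;
}
```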

Currently, this is being handled with one of the below hacks:
1) By keeping dma_mask in lower range while allocating firmware buffer,
and then increasing the dma_mask to higher range for other buffers.
2) By reserving IOVA for firmware at the lowest range and creating direct
mappings for the same.

I want to know if there is a better way this can be handled within the
current framework, or if anybody is facing similar problems with their
devices, please share how it is taken care of.

I also think there should be some way the masters can specify the IOVA
range they want to limit to for current allocation.
Something like a new iommu_ops callback like below:
limit_iova_alloc_range(dev, iova_start, iova_end)

And, in my driver, the sequence will be:
limit_iova_alloc_range(dev, 0x0000_0000, 0x1000_0000); /* via helpers */
alloc( ) firmware buffer using DMA API
limit_iova_alloc_range(dev, 0x1000_0000, 0xC000_0000); /* via helpers */
alloc( ) other buffers using DMA API

Thanks,
Ajay Kumar