Hi Jörg,

Thanks for your review. I'll fix the issues you pointed out; anything I don't reply to below, I'll simply fix as suggested.
On Mon, Mar 02, 2020 at 04:36:06PM +0100, Joerg Roedel wrote:
> On Thu, Feb 20, 2020 at 07:15:14PM +0100, Maxime Ripard wrote:
> > +struct sun50i_iommu_domain {
> > + struct iommu_domain domain;
> > +
> > + /* Number of devices attached to the domain */
> > + refcount_t refcnt;
> > +
> > + /* Lock to modify the Directory Table */
> > + spinlock_t dt_lock;
>
> I suggest you make page-table updates lock-less. Otherwise this lock
> will become a bottle-neck when using the IOMMU through DMA-API.
As far as I understand it, the page table can be accessed concurrently,
and the framework doesn't seem to provide any serialization or locking
itself, so shouldn't we have some lock to prevent concurrent accesses?
> > +
> > +static int sun50i_iommu_map(struct iommu_domain *domain, unsigned long iova,
> > +			    phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
> > +{
> > + struct sun50i_iommu_domain *sun50i_domain = to_sun50i_domain(domain);
> > + struct sun50i_iommu *iommu = sun50i_domain->iommu;
> > + u32 pte_index;
> > + u32 *page_table, *pte_addr;
> > + unsigned long flags;
> > + int ret = 0;
> > +
> > + spin_lock_irqsave(&sun50i_domain->dt_lock, flags);
> > + page_table = sun50i_dte_get_page_table(sun50i_domain, iova, gfp);
> > + if (IS_ERR(page_table)) {
> > + ret = PTR_ERR(page_table);
> > + goto out;
> > + }
> > +
> > + pte_index = sun50i_iova_get_pte_index(iova);
> > + pte_addr = &page_table[pte_index];
> > + if (sun50i_pte_is_page_valid(*pte_addr)) {
>
> You can use unlikely() here.
>
> > + phys_addr_t page_phys = sun50i_pte_get_page_address(*pte_addr);
> > + dev_err(iommu->dev,
> > + "iova %pad already mapped to %pa cannot remap to %pa
> > prot: %#x\n",
> > + &iova, &page_phys, &paddr, prot);
> > + ret = -EBUSY;
> > + goto out;
> > + }
> > +
> > + *pte_addr = sun50i_mk_pte(paddr, prot);
> > + sun50i_table_flush(sun50i_domain, pte_addr, 1);
>
> This maps only one page, right? But the function needs to map up to
> 'size' as given in the parameter list.
It does, but pgsize_bitmap is set to 4k only (since that's all the
hardware supports), so the core will split the request and call map
multiple times, once per page, judging from:
https://elixir.bootlin.com/linux/latest/source/drivers/iommu/iommu.c#L1948
Right?
> > +
> > + spin_lock_irqsave(&iommu->iommu_lock, flags);
> > + sun50i_iommu_tlb_invalidate(iommu, iova);
> > + spin_unlock_irqrestore(&iommu->iommu_lock, flags);
>
> Why is there a need to flush the TLB here? The IOMMU-API provides
> call-backs so that the user of the API can decide when it wants
> to flush the IO/TLB. Such flushes are usually expensive and doing them
> on every map and unmap will cost significant performance.
The vendor driver was doing something along those lines, and I wanted
to be conservative with the cache management as long as we hadn't run
into performance issues, but I'll convert to the iotlb callbacks then.
Thanks!
Maxime
_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu
