Hi Eric,

On 13/03/17 13:07, Auger Eric wrote:
> Hi Robin,
> 
> On 09/03/2017 20:50, Robin Murphy wrote:
>> Now that it's simple to discover the necessary reservations for a given
>> device/IOMMU combination, let's wire up the appropriate handling. Basic
>> reserved regions and direct-mapped regions are obvious enough to handle;
>> hardware MSI regions we can handle by pre-populating the appropriate
>> msi_pages in the cookie. That way, irqchip drivers which normally assume
>> MSIs to require mapping at the IOMMU can keep working without having
>> to special-case their iommu_dma_map_msi_msg() hook, or indeed be aware
>> at all of integration quirks preventing the IOMMU translating certain
>> addresses.
>>
>> Signed-off-by: Robin Murphy <[email protected]>
>> ---
>>  drivers/iommu/dma-iommu.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 65 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 1e0983488a8d..1082ebf8a415 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -167,6 +167,69 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
>>  }
>>  EXPORT_SYMBOL(iommu_put_dma_cookie);
>>  
>> +static int cookie_init_hw_msi_region(struct iommu_dma_cookie *cookie,
>> +            phys_addr_t start, phys_addr_t end)
>> +{
>> +    struct iova_domain *iovad = &cookie->iovad;
>> +    struct iommu_dma_msi_page *msi_page;
>> +    int i, num_pages;
>> +
>> +    start &= ~iova_mask(iovad);
>> +    end = iova_align(iovad, end);
> Is it always safe if second argument is a phys_addr_t?

Ooh, I think you're right - for the corner case of a 32-bit unsigned long
and a crazy system with a doorbell above 4GB, end would get truncated too
early when passed into iova_align(), whose size_t argument is only 32 bits
there. I'll rework the arithmetic here to be safer.

>> +    num_pages = (end - start) >> iova_shift(iovad);
>> +
>> +    msi_page = kcalloc(num_pages, sizeof(*msi_page), GFP_KERNEL);
>> +    if (!msi_page)
>> +            return -ENOMEM;
>> +
>> +    for (i = 0; i < num_pages; i++) {
>> +            msi_page[i].phys = start;
>> +            msi_page[i].iova = start;
>> +            INIT_LIST_HEAD(&msi_page[i].list);
>> +            list_add(&msi_page[i].list, &cookie->msi_page_list);
>> +            start += iovad->granule;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int iova_reserve_iommu_regions(struct device *dev,
>> +            struct iommu_domain *domain)
>> +{
>> +    struct iommu_dma_cookie *cookie = domain->iova_cookie;
>> +    struct iova_domain *iovad = &cookie->iovad;
>> +    struct iommu_resv_region *region;
>> +    struct list_head resv_regions;
>> +    unsigned long lo, hi;
>> +    int ret = 0;
>> +
>> +    INIT_LIST_HEAD(&resv_regions);
>> +    iommu_get_resv_regions(dev, &resv_regions);
>> +    list_for_each_entry(region, &resv_regions, list) {
>> +            /* We ARE the software that manages these! */
>> +            if (region->type & IOMMU_RESV_SW_MSI)
>> +                    continue;
>> +
>> +            lo = iova_pfn(iovad, region->start);
>> +            hi = iova_pfn(iovad, region->start + region->length);
>> +            reserve_iova(iovad, lo, hi);
>> +
>> +            if (region->type & IOMMU_RESV_DIRECT) {
>> +                    ret = iommu_map(domain, region->start, region->start,
>> +                                    region->length, region->prot);
> 
> in iommu.c, iommu_group_create_direct_mappings also iommu_map() direct
> regions in some cases. Just to make sure cases don't overlap here.

Ah, I had indeed managed to overlook that, thanks for the reminder. We
should only get here long after iommu_group_add_device(), so I think it
should be safe to assume that any direct regions are already mapped and
just reserve the IOVAs here.

>> +            } else if (region->type & IOMMU_RESV_MSI) {
>> +                    ret = cookie_init_hw_msi_region(cookie, region->start,
>> +                                    region->start + region->length);
>> +            }
>> +
>> +            if (ret)
>> +                    break;
>> +    }
>> +    iommu_put_resv_regions(dev, &resv_regions);
>> +
>> +    return ret;
>> +}
>> +
>>  static void iova_reserve_pci_windows(struct pci_dev *dev,
>>              struct iova_domain *iovad)
>>  {
>> @@ -251,6 +314,8 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>>              init_iova_domain(iovad, 1UL << order, base_pfn, end_pfn);
>>              if (pci)
>>                      iova_reserve_pci_windows(to_pci_dev(dev), iovad);
>> +            if (dev)
>> +                    iova_reserve_iommu_regions(dev, domain);
> Don't you want to escalate the returned value?

Yes, I'd posted this before I actually wrote the follow-on patch to make
proper reserved regions of the PCI windows as well (which I'll include
in v2) - that one does rearrange the return value logic here, but that's
no excuse for not doing it in this patch. Will fix.

> Besides
> Reviewed-by: Eric Auger <[email protected]>

Thanks!

Robin.

> 
> Thanks
> 
> Eric
>>      }
>>      return 0;
>>  }
>>

_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu