Hi Eric,
On 13/03/17 13:07, Auger Eric wrote:
> Hi Robin,
>
> On 09/03/2017 20:50, Robin Murphy wrote:
>> Now that it's simple to discover the necessary reservations for a given
>> device/IOMMU combination, let's wire up the appropriate handling. Basic
>> reserved regions and direct-mapped regions are obvious enough to handle;
>> hardware MSI regions we can handle by pre-populating the appropriate
>> msi_pages in the cookie. That way, irqchip drivers which normally assume
>> MSIs to require mapping at the IOMMU can keep working without having
>> to special-case their iommu_dma_map_msi_msg() hook, or indeed be aware
>> at all of integration quirks preventing the IOMMU translating certain
>> addresses.
>>
>> Signed-off-by: Robin Murphy <[email protected]>
>> ---
>> drivers/iommu/dma-iommu.c | 65 +++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 65 insertions(+)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 1e0983488a8d..1082ebf8a415 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -167,6 +167,69 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
>> }
>> EXPORT_SYMBOL(iommu_put_dma_cookie);
>>
>> +static int cookie_init_hw_msi_region(struct iommu_dma_cookie *cookie,
>> + phys_addr_t start, phys_addr_t end)
>> +{
>> + struct iova_domain *iovad = &cookie->iovad;
>> + struct iommu_dma_msi_page *msi_page;
>> + int i, num_pages;
>> +
>> + start &= ~iova_mask(iovad);
>> + end = iova_align(iovad, end);
> Is it always safe if the second argument is a phys_addr_t?
Ooh, I think you're right - for the corner case of 32-bit unsigned long
and a crazy system with a doorbell above 4GB, end would get truncated
too early. I'll rework the arithmetic here to be safer.
>> + num_pages = (end - start) >> iova_shift(iovad);
>> +
>> + msi_page = kcalloc(num_pages, sizeof(*msi_page), GFP_KERNEL);
>> + if (!msi_page)
>> + return -ENOMEM;
>> +
>> + for (i = 0; i < num_pages; i++) {
>> + msi_page[i].phys = start;
>> + msi_page[i].iova = start;
>> + INIT_LIST_HEAD(&msi_page[i].list);
>> + list_add(&msi_page[i].list, &cookie->msi_page_list);
>> + start += iovad->granule;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int iova_reserve_iommu_regions(struct device *dev,
>> + struct iommu_domain *domain)
>> +{
>> + struct iommu_dma_cookie *cookie = domain->iova_cookie;
>> + struct iova_domain *iovad = &cookie->iovad;
>> + struct iommu_resv_region *region;
>> + struct list_head resv_regions;
>> + unsigned long lo, hi;
>> + int ret = 0;
>> +
>> + INIT_LIST_HEAD(&resv_regions);
>> + iommu_get_resv_regions(dev, &resv_regions);
>> + list_for_each_entry(region, &resv_regions, list) {
>> + /* We ARE the software that manages these! */
>> + if (region->type & IOMMU_RESV_SW_MSI)
>> + continue;
>> +
>> + lo = iova_pfn(iovad, region->start);
>> + hi = iova_pfn(iovad, region->start + region->length);
>> + reserve_iova(iovad, lo, hi);
>> +
>> + if (region->type & IOMMU_RESV_DIRECT) {
>> + ret = iommu_map(domain, region->start, region->start,
>> + region->length, region->prot);
>
> in iommu.c, iommu_group_create_direct_mappings also iommu_map() direct
> regions in some cases. Just to make sure cases don't overlap here.
Ah, I had indeed managed to overlook that, thanks for the reminder. We
should only get here long after iommu_group_add_device(), so I think it
should be safe to assume that any direct regions are already mapped and
just reserve the IOVAs here.
>> + } else if (region->type & IOMMU_RESV_MSI) {
>> + ret = cookie_init_hw_msi_region(cookie, region->start,
>> + region->start + region->length);
>> + }
>> +
>> + if (ret)
>> + break;
>> + }
>> + iommu_put_resv_regions(dev, &resv_regions);
>> +
>> + return ret;
>> +}
>> +
>> static void iova_reserve_pci_windows(struct pci_dev *dev,
>> struct iova_domain *iovad)
>> {
>> @@ -251,6 +314,8 @@ int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
>> init_iova_domain(iovad, 1UL << order, base_pfn, end_pfn);
>> if (pci)
>> iova_reserve_pci_windows(to_pci_dev(dev), iovad);
>> + if (dev)
>> + iova_reserve_iommu_regions(dev, domain);
> Don't you want to propagate the return value?
Yes, I'd posted this before I actually wrote the follow-on patch to make
proper reserved regions of the PCI windows as well (which I'll include
in v2) - that one does rearrange the return value logic here, but that's
no excuse for not doing it in this patch. Will fix.
> Besides
> Reviewed-by: Eric Auger <[email protected]>
Thanks!
Robin.
>
> Thanks
>
> Eric
>> }
>> return 0;
>> }
>>
_______________________________________________
iommu mailing list
[email protected]
https://lists.linuxfoundation.org/mailman/listinfo/iommu