Re: [PATCH v2 5/9] drm/amdkfd: use helper pci_dev_id

2019-04-25 Thread Koenig, Christian
Am 24.04.19 um 21:14 schrieb Heiner Kallweit:
> Use new helper pci_dev_id() to simplify the code.
>
> Signed-off-by: Heiner Kallweit 

Acked-by: Christian König 

> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 2cb09e088..769dbc7be 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -1272,8 +1272,7 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
>   
>   dev->node_props.vendor_id = gpu->pdev->vendor;
>   dev->node_props.device_id = gpu->pdev->device;
> - dev->node_props.location_id = PCI_DEVID(gpu->pdev->bus->number,
> - gpu->pdev->devfn);
> + dev->node_props.location_id = pci_dev_id(gpu->pdev);
>   dev->node_props.max_engine_clk_fcompute =
>   amdgpu_amdkfd_get_max_engine_clock_in_mhz(dev->gpu->kgd);
>   dev->node_props.max_engine_clk_ccompute =

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma

2019-01-30 Thread Koenig, Christian
Am 30.01.19 um 09:02 schrieb Christoph Hellwig:
> On Tue, Jan 29, 2019 at 08:58:35PM +, Jason Gunthorpe wrote:
>> On Tue, Jan 29, 2019 at 01:39:49PM -0700, Logan Gunthorpe wrote:
>>
>>> implement the mapping. And I don't think we should have 'special' vma's
>>> for this (though we may need something to ensure we don't get mapping
>>> requests mixed with different types of pages...).
>> I think Jerome explained the point here is to have a 'special vma'
>> rather than a 'special struct page' as, really, we don't need a
>> struct page at all to make this work.
>>
>> If I recall your earlier attempts at adding struct page for BAR
>> memory, it ran aground on issues related to O_DIRECT/sgls, etc, etc.
> Struct page is what makes O_DIRECT work, using sgls or biovecs, etc on
> it work.  Without struct page none of the above can work at all.  That
> is why we use struct page for backing BARs in the existing P2P code.
> Not that I'm a particular fan of creating struct page for this device
> memory, but without major invasive surgery to large parts of the kernel
> it is the only way to make it work.

The problem seems to be that struct page does two things:

1. Memory management for system memory.
2. The object to work with in the I/O layer.

This was done because a good part of that stuff overlaps, like reference 
counting how often a page is used.  The problem now is that this doesn't 
work very well for device memory in some cases.

For example on GPUs you usually have a large amount of memory which is 
not even accessible by the CPU. In other words you can't easily create a 
struct page for it because you can't reference it with a physical CPU 
address.

Maybe struct page should be split up into smaller structures? I mean 
it's really overloaded with data.

Christian.


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH 2/6] drivers core: Add I/O ASID allocator

2018-11-21 Thread Koenig, Christian
Am 21.11.18 um 12:16 schrieb Jean-Philippe Brucker:
> On 12/11/2018 14:40, Joerg Roedel wrote:
>> Hi Jean-Philippe,
>>
>> On Fri, Oct 19, 2018 at 07:11:54PM +0100, Jean-Philippe Brucker wrote:
>>> The allocator doesn't really belong in drivers/iommu because some
>>> drivers would like to allocate PASIDs for devices that aren't managed by
>>> an IOMMU, using the same ID space as IOMMU. It doesn't really belong in
>>> drivers/pci either since platform device also support PASID. Add the
>>> allocator in drivers/base.
>> I don't really buy this argument, in the end it is the IOMMU that
>> enforces the PASID mappings for a device. Why should a device not behind
>> an IOMMU need to allocate a pasid? Do you have examples of such devices
>> which also don't have their own iommu built-in?
> I misunderstood that requirement. Reading through the thread again
> (https://www.mail-archive.com/iommu@lists.linux-foundation.org/msg25640.html)
> it's more about a device using PASIDs as context IDs. Some contexts are
> not bound to processes but they still need their ID in the same PASID
> space as the contexts that are bound to process address spaces. So we
> can keep that code in drivers/iommu

The problem with that approach is that we also need this allocator when 
IOMMU is completely disabled.

In other words PASIDs are used as contexts IDs by the hardware for 
things like signaling which application has caused an interrupt/event 
even when they are not used by IOMMU later on.

Additional to that we have some MMUs which are internal to the devices 
(GPUVM on AMD GPUs for example, similar exists for NVidia) which uses 
PASIDs for address translation internally in the device.

All of that is completely unrelated to IOMMU, but when IOMMU is enabled 
you need to use the same allocator because all use cases use the same ID 
space.

Regards,
Christian.

>
>>> +int ioasid_for_each(struct ioasid_set *set, ioasid_iter_t func, void *data)
>>> +{
>>> +   int ret;
>>> +   struct ioasid_iter_data iter_data = {
>>> +   .set= set,
>>> +   .func   = func,
>>> +   .data   = data,
>>> +   };
>>> +
>>> +   idr_lock(_idr);
>>> +   ret = idr_for_each(_idr, ioasid_iter, _data);
>>> +   idr_unlock(_idr);
>>> +
>>> +   return ret;
>>> +}
>>> +EXPORT_SYMBOL_GPL(ioasid_for_each);
>> What is the intended use-case for this function?
> I'm using it in sva_bind(), to see if there already exists an io_mm
> associated to the mm_struct given as argument
>
>>> +void *ioasid_find(struct ioasid_set *set, ioasid_t ioasid)
>>> +{
>>> +   void *priv = NULL;
>>> +   struct ioasid_data *ioasid_data;
>>> +
>>> +   idr_lock(_idr);
>>> +   ioasid_data = idr_find(_idr, ioasid);
>>> +   if (ioasid_data && ioasid_data->set == set)
>>> +   priv = ioasid_data->private;
>>> +   idr_unlock(_idr);
>>> +
>>> +   return priv;
>>> +}
>> I think using the idr_lock here will become a performance problem, as we
>> need this function in the SVA page-fault path to retrieve the mm_struct
>> belonging to a PASID.
>>
>> The head comment in lib/idr.c states that it should be safe to use
>> rcu_readlock() on read-only iterations. If so, this should be used here
>> so that concurrent lookups are possible.
> Agreed. I have a patch to use the RCU which looks simple enough, but
> since I haven't used RCU before I wasn't comfortable squashing it right
> away.
>
> Thanks,
> Jean
>
> diff --git a/drivers/base/ioasid.c b/drivers/base/ioasid.c
> index 5eb3abbf69ac..fa3b48c9c044 100644
> --- a/drivers/base/ioasid.c
> +++ b/drivers/base/ioasid.c
> @@ -13,6 +13,7 @@ struct ioasid_data {
>   int id;
>   struct ioasid_set *set;
>   void *private;
> + struct rcu_head rcu;
>   };
>
>   static DEFINE_IDR(ioasid_idr);
> @@ -56,7 +57,8 @@ void ioasid_free(ioasid_t ioasid)
>   ioasid_data = idr_remove(_idr, ioasid);
>   idr_unlock(_idr);
>
> - kfree(ioasid_data);
> + if (ioasid_data)
> + kfree_rcu(ioasid_data, rcu);
>   }
>   EXPORT_SYMBOL_GPL(ioasid_free);
>
> @@ -95,9 +97,9 @@ int ioasid_for_each(struct ioasid_set *set,
> ioasid_iter_t func, void *data)
>   .data   = data,
>   };
>
> - idr_lock(_idr);
> + rcu_read_lock();
>   ret = idr_for_each(_idr, ioasid_iter, _data);
> - idr_unlock(_idr);
> + rcu_read_unlock();
>
>   return ret;
>   }
> @@ -111,11 +113,11 @@ void *ioasid_find(struct ioasid_set *set, ioasid_t
> ioasid)
>   void *priv = NULL;
>   struct ioasid_data *ioasid_data;
>
> - idr_lock(_idr);
> + rcu_read_lock();
>   ioasid_data = idr_find(_idr, ioasid);
>   if (ioasid_data && ioasid_data->set == set)
>   priv = ioasid_data->private;
> - idr_unlock(_idr);
> + rcu_read_unlock();
>
>   return priv;
>   }
>

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu