On 2018-02-09 07:34 AM, Christian König wrote:
> I really wonder if sharing the GPUVM instance from a render node file
> descriptor wouldn't be easier.
> You could just use the existing IOCTL for allocating and mapping
> memory on a render node and then give the prepared fd to kfd to use.

In amd-kfd-staging we have an ioctl to import a DMABuf into the KFD VM
for interoperability between graphics and compute. I'm planning to
upstream that later. It still depends on the KFD calls for managing the
GPU mappings. The KFD GPU mapping calls need to interact with the
eviction logic. TLB flushing involves the HIQ/KIQ for figuring out the
VMID/PASID mapping, which is managed asynchronously by the HWS, not
amdgpu_ids.c. User pointers need a different MMU notifier. AQL queue
buffers on some GPUs need a double-mapping workaround for a HW
wraparound bug. I could go on. So this is not easy to retrofit into
amdgpu_gem and amdgpu_cs. Also, KFD VMs are created differently
(AMDGPU_VM_CONTEXT_COMPUTE, amdkfd_vm wrapper structure).

What's more, buffers allocated through amdgpu_gem calls create GEM
objects that we don't need. And exporting and importing DMABufs adds
more overhead and a potential to run into the process file-descriptor
limit (maybe the FD could be discarded after importing).

I honestly thought about whether this would be feasible when we
implemented the CPU mapping through the DRM FD. But it would be nothing
short of a complete redesign of the KFD memory management code. It would
be months of work, more instability, for questionable benefits. I don't
think it would be in the interest of end users and customers.

I just thought of a slightly different approach I would consider more
realistic, without having thought through all the details: Adding
KFD-specific memory management ioctls to the amdgpu device. Basically
call amdgpu_amdkfd_gpuvm functions from amdgpu ioctl functions instead
of KFD ioctl functions. But we'd still have KFD ioctls for other things,
and the new amdgpu ioctls would be KFD-specific and not useful for
graphics applications. It's probably still several weeks of work, but
shouldn't have major teething issues because the internal logic and
functionality would be basically unchanged. It would just move the
ioctls from one device to another.


> Regards,
> Christian.
> Am 07.02.2018 um 02:32 schrieb Felix Kuehling:
>> From: Oak Zeng <oak.z...@amd.com>
>> Populate DRM render device minor in kfd topology
>> Signed-off-by: Oak Zeng <oak.z...@amd.com>
>> Signed-off-by: Felix Kuehling <felix.kuehl...@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 4 ++++
>>   drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 1 +
>>   2 files changed, 5 insertions(+)
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> index 2506155..ac28abc 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>> @@ -441,6 +441,8 @@ static ssize_t node_show(struct kobject *kobj,
>> struct attribute *attr,
>>               dev->node_props.device_id);
>>       sysfs_show_32bit_prop(buffer, "location_id",
>>               dev->node_props.location_id);
>> +    sysfs_show_32bit_prop(buffer, "drm_render_minor",
>> +            dev->node_props.drm_render_minor);
>>         if (dev->gpu) {
>>           log_max_watch_addr =
>> @@ -1214,6 +1216,8 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
>>           dev->gpu->kfd2kgd->get_max_engine_clock_in_mhz(dev->gpu->kgd);
>>       dev->node_props.max_engine_clk_ccompute =
>>           cpufreq_quick_get_max(0) / 1000;
>> +    dev->node_props.drm_render_minor =
>> +        gpu->shared_resources.drm_render_minor;
>>         kfd_fill_mem_clk_max_info(dev);
>>       kfd_fill_iolink_non_crat_info(dev);
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>> index c0be2be..eb54cfc 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>> @@ -71,6 +71,7 @@ struct kfd_node_properties {
>>       uint32_t location_id;
>>       uint32_t max_engine_clk_fcompute;
>>       uint32_t max_engine_clk_ccompute;
>> +    int32_t  drm_render_minor;
>>       uint16_t marketing_name[KFD_TOPOLOGY_PUBLIC_NAME_SIZE];
>>   };

amd-gfx mailing list

Reply via email to