On 18/11/17 03:31 PM, Christian König wrote:
> Am 17.11.2017 um 17:09 schrieb Michel Dänzer:
>> On 17/11/17 11:28 AM, Christian König wrote:
>>> Ping? Michel, Alex can somebody take a look?
>> Patch 2 is
>>
>> Reviewed-by: Michel Dänzer <[email protected]>
>>
>>
>> With patches 1 & 3, it's not 100% clear to me what the idea is behind
>> the handling of the hole on the kernel and userspace side. Maybe you can
>> add some explanation in code comments or the commit logs?
> Yeah, that is actually a bit of a mess because the hardware
> documentation wasn't very clear on how this works.
> 
> How about this as extra code comment on patch 1 to the assignment of
> dev_info.virtual_address_max:
> 
> /*
>  * Old userspace isn't aware of the VA hole on Vega10. So in theory an
> client could get invalid VA addresses assigned.
>  * To fix this and keep backward compatibility we limit the VA space
> reported in this field to the range below the hole.
>  */
> 
> The last patch is then to report the VA space above the hole, cause that
> is actually what libdrm should use.
> 
> The crux is when I put the VA space above the hole directly into the old
> fields older versions of libdrm would break and we can't do that.

That much was actually clear. :) Maybe some questions will expose what
I'm missing:


Patch 1:

> @@ -880,14 +880,14 @@  static int amdgpu_cs_ib_vm_chunk(struct amdgpu_device 
> *adev,
>                       if (chunk->chunk_id != AMDGPU_CHUNK_ID_IB)
>                               continue;
>  
> -                     r = amdgpu_cs_find_mapping(p, chunk_ib->va_start,
> -                                                &aobj, &m);
> +                     va_start = chunk_ib->va_start & AMDGPU_VA_HOLE_MASK;
> +                     r = amdgpu_cs_find_mapping(p, va_start, &aobj, &m);

I don't understand how userspace can make use of the address space above
the hole, since AMDGPU_VA_HOLE_MASK is 0x0000ffffffffffffULL and
AMDGPU_VA_HOLE_END is 0xffff800000000000ULL, so masking with
AMDGPU_VA_HOLE_MASK always results in an address below or inside the hole?


> @@ -566,6 +566,17 @@  int amdgpu_gem_va_ioctl(struct drm_device *dev, void 
> *data,
>               return -EINVAL;
>       }
>  
> +     if (args->va_address >= AMDGPU_VA_HOLE_START &&
> +         args->va_address < AMDGPU_VA_HOLE_END) {
> +             dev_dbg(&dev->pdev->dev,
> +                     "va_address 0x%LX is in VA hole 0x%LX-0x%LX\n",
> +                     args->va_address, AMDGPU_VA_HOLE_START,
> +                     AMDGPU_VA_HOLE_END);
> +             return -EINVAL;
> +     }
> +
> +     args->va_address &= AMDGPU_VA_HOLE_MASK;

Ignoring the above, I think the masking with AMDGPU_VA_HOLE_MASK would
need to be done before checking for the hole, otherwise the masked
address could be inside the hole?


Patch 3:

> @@ -577,10 +578,17 @@  static int amdgpu_info_ioctl(struct drm_device *dev, 
> void *data, struct drm_file
>                       dev_info.ids_flags |= AMDGPU_IDS_FLAGS_FUSION;
>               if (amdgpu_sriov_vf(adev))
>                       dev_info.ids_flags |= AMDGPU_IDS_FLAGS_PREEMPTION;
> +
> +             vm_size = adev->vm_manager.max_pfn * AMDGPU_GPU_PAGE_SIZE;
>               dev_info.virtual_address_offset = AMDGPU_VA_RESERVED_SIZE;
>               dev_info.virtual_address_max =
> -                     min(adev->vm_manager.max_pfn * AMDGPU_GPU_PAGE_SIZE,
> -                         AMDGPU_VA_HOLE_START);
> +                     min(vm_size, AMDGPU_VA_HOLE_START);
> +
> +             vm_size -= AMDGPU_VA_RESERVED_SIZE;
> +             if (vm_size > AMDGPU_VA_HOLE_START) {
> +                     dev_info.high_va_offset = AMDGPU_VA_HOLE_END;
> +                     dev_info.high_va_max = AMDGPU_VA_HOLE_END | vm_size;
> +             }

This mostly makes sense, except for the

                dev_info.high_va_max = AMDGPU_VA_HOLE_END | vm_size;

line. Can you explain how simply or'ing together the address of the end
of the hole and the VM size works correctly in all cases?

As an extreme example, with vm_size == 1, this value would compute as
0xffff800000000001ULL, which seems wrong or at least weird.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Reply via email to