On 2019-05-31 5:32 p.m., Yang, Philip wrote:
>
> On 2019-05-31 3:42 p.m., Kuehling, Felix wrote:
>> On 2019-05-31 1:28 p.m., Yang, Philip wrote:
>>> On 2019-05-30 6:36 p.m., Kuehling, Felix wrote:
>>>>>       
>>>>>       #if IS_ENABLED(CONFIG_DRM_AMDGPU_USERPTR)
>>>>> - if (gtt->ranges &&
>>>>> -     ttm->pages[0] == hmm_pfn_to_page(&gtt->ranges[0],
>>>>> -                                      gtt->ranges[0].pfns[0]))
>>>>> + if (gtt->range &&
>>>>> +     ttm->pages[0] == hmm_device_entry_to_page(gtt->range,
>>>>> +                                               gtt->range->pfns[0]))
>>>> I think just checking gtt->range here is enough. If gtt->range is not
>>>> NULL here, we're leaking memory.
>>>>
>>> If just checking gtt->range, there is a false warning in amdgpu_test
>>> userptr case in amdgpu_cs_list_validate path. If userptr is invalidated,
>>> then ttm->pages[0] is outdated pages, lobj->user_pages is new pages, it
>>> goes to ttm_tt_unbind first to unpin old pages (this causes false
>>> warning) then call amdgpu_ttm_tt_set_user_pages.
>> But doesn't that mean we're leaking the gtt->range somewhere?
>>
> ttm_tt_unbind is called from ttm_tt_destroy and amdgpu_cs_list_validate,
> the later one causes false warning. ttm_ttm_destory path is fine to only
> check gtt->range.
>
> Double checked, amdgpu_ttm_tt_get_user_pages and
> amdgpu_ttm_tt_get_user_pages_done always match in both paths, so no leak
> gtt->range.
>
> 1. amdgpu_gem_userptr_ioctl
>         amdgpu_ttm_tt_get_user_pages
>             ttm->pages for userptr pages
>         amdgpu_ttm_tt_get_user_pages_done
>
> 2. amdgpu_cs_ioctl
>         amdgpu_cs_parser_bos
>             amdgpu_ttm_tt_get_user_pages
>             if (userpage_invalidated)
>                 e->user_pages for new pages
>             amdgpu_cs_list_validate
>                 if (userpage_invalidated)
>                    ttm_tt_unbind ttm->pages // this causes warning
>                    amdgpu_ttm_tt_set_user_pages(ttm->pages, e->user_pages)

Hmm, I think amdgpu_cs is doing something weird there. It does some 
double book-keeping of the user pages in the BO list and the TTM BO. We 
did something similar for KFD and simplified it when moving to HMM. It 
could probably be simplified for amdgpu_cs as well. But not in this 
patch series.

I'll review your updated change.

Thanks,
   Felix


>         amdgpu_cs_submit
>             amdgpu_ttm_tt_get_user_pages_done
>
>> Regards,
>>      Felix
>>
>>
>>> I will submit patch v2, to add retry if hmm_range_fault returns -EAGAIN.
>>> use kzalloc to allocate small size range.
>>>
>>> Thanks,
>>> Philip
>>>
>>>> Regards,
>>>>        Felix
>>>>
>>>>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Reply via email to