Re: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-20 Thread Pan, Xinhui
; amd-...@lists.freedesktop.org Cc: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian; dri-devel@lists.freedesktop.org Subject: Re: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin > The swapout function creates one swap storage which is filled with zeros. And

Re: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-20 Thread Pan, Xinhui
I just sent out the patch below yesterday. Swapping an unpopulated BO is indeed useless. [RFC PATCH 2/2] drm/ttm: skip swapout when ttm has no backend page.

Re: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-20 Thread Christian König
To: Kuehling, Felix; amd-...@lists.freedesktop.org Cc: Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; dan...@ffwll.ch Subject: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin Yes, we really don't swap out SG BOs. The problem is that before we validat

Re: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-20 Thread Christian König
, Felix; amd-...@lists.freedesktop.org Cc: Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; dan...@ffwll.ch Subject: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin Yes, we really don't swap out SG BOs. The problem is that before we validate a userptr BO, we create th

Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-19 Thread Pan, Xinhui
König; Pan, Xinhui; amd-...@lists.freedesktop.org Cc: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian; dri-devel@lists.freedesktop.org Subject: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin Looks like we're creating the userptr BO as ttm_bo_type_device. I

Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-19 Thread Pan, Xinhui
Cc: Deucher, Alexander; dan...@ffwll.ch; Koenig, Christian; dri-devel@lists.freedesktop.org Subject: Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin I'm scratching my head how that is even possible. See, when a BO is created in the system domain it is just an em

Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-19 Thread Felix Kuehling
>> >> >> From: Pan, Xinhui >> Sent: May 19, 2021 12:09 >> To: Kuehling, Felix; amd-...@lists.freedesktop.org >> Cc: Deucher, Alexander; Koenig, Christian; >> dri-devel@lists.freedesktop.org; dan...@ffwll.ch >> Subject: Re: [RFC PATCH 1/2] drm

Re: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-19 Thread Christian König
, Felix; amd-...@lists.freedesktop.org Cc: Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; dan...@ffwll.ch Subject: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin Yes, we really don't swap out SG BOs. The problem is that before we validate a userptr BO, we create th

Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-18 Thread Pan, Xinhui
, Felix; amd-...@lists.freedesktop.org Cc: Deucher, Alexander; Koenig, Christian; dri-devel@lists.freedesktop.org; dan...@ffwll.ch Subject: Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin Yes, we really don't swap out SG BOs. The problem is that before we validate a us

Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-18 Thread Pan, Xinhui
[AMD Official Use Only] Yes, we really don't swap out SG BOs. The problem is that before we validate a userptr BO, we create this BO in the CPU domain by default, so this BO has a chance of being swapped out. We set the flag TTM_PAGE_FLAG_SG on the userptr BO in populate(), which is too late. I have not tried to revert

Re: [RFC PATCH 1/2] drm/amdgpu: Fix memory corruption due to swapout and swapin

2021-05-18 Thread Pan, Xinhui
[AMD Official Use Only] To observe the issue, I made one kfdtest case for debugging. It just allocates userptr memory and detects whether the memory is corrupted. I can hit this failure in 2 minutes. :( diff --git a/tests/kfdtest/src/KFDMemoryTest.cpp b/tests/kfdtest/src/KFDMemoryTest.cpp index