Re: [PATCH RFC 2/2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM
On 03/07/17 08:47 PM, Christian König wrote: > Am 03.07.2017 um 11:49 schrieb Michel Dänzer: >>> Instead of messing with all this I suggest that we just add a jiffies >>> based timeout to the BO when we can clear the flag. For kernel BOs this >>> timeout is just infinity. >>> >>> Then we check in amdgpu_cs_bo_validate() before generating the >>> placements if we could clear the flag and do so based on the timeout. >> The idea for this patch was to save the memory and CPU cycles needed for >> that approach. > But when we clear the flag on the end of the move we already moved the > BO to visible VRAM again. Right. Clearing the flag before the move would make the flag ineffective. We want to put the BO in CPU visible VRAM when the flag is set, because it means the BO has been accessed by the CPU since it was created or since it was last moved to VRAM. > Only on after the next swapout/swapin cycle we see an effect of that > change. Right, clearing the flag cannot have any effect before the next time the BO is moved to VRAM anyway. > Is that the intended approach? So yes, it is. The only significant difference to the timestamp based approach (for which John already posted patches before) is that this patch will remember any CPU access until the next time the BO is moved to VRAM, no matter how much time passes in between. BTW, one issue with the timestamp based approach is that we only get a page fault on the first CPU access after the BO was moved. So the timestamp only says how much time has passed since the first CPU access, not since the last CPU access. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH RFC 2/2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM
Am 03.07.2017 um 11:49 schrieb Michel Dänzer: Instead of messing with all this I suggest that we just add a jiffies based timeout to the BO when we can clear the flag. For kernel BOs this timeout is just infinity. Then we check in amdgpu_cs_bo_validate() before generating the placements if we could clear the flag and do so based on the timeout. The idea for this patch was to save the memory and CPU cycles needed for that approach. But when we clear the flag on the end of the move we already moved the BO to visible VRAM again. Only on after the next swapout/swapin cycle we see an effect of that change. Is that the intended approach? Regards, Christian. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH RFC 2/2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM
On 03/07/17 04:06 PM, Christian König wrote: > Am 03.07.2017 um 03:34 schrieb Michel Dänzer: > >> [...] I suggested clearing the flag here to John on IRC. The >> idea is briefly described in the commit log, let me elaborate a bit on >> that: >> >> When a BO is moved to VRAM which has the AMDGPU_BO_FLAG_CPU_ACCESS flag >> set, it is put in CPU visible VRAM, and the flag is cleared. If the CPU >> doesn't access the BO, the next time it will be moved to VRAM (after it >> was evicted from there, for any reason), the flag will no longer be set, >> and the BO will likely be moved to CPU invisible VRAM. >> >> If the BO is accessed by the CPU again though (no matter where the BO is >> currently located at that time), the flag is set again, and the cycle >> from the previous paragraph starts over. >> >> The end result should be similar as with the timestamp based solution in >> John's earlier series: BOs which are at least occasionally accessed by >> the CPU will tend to be in CPU visible VRAM, those which are never >> accessed by the CPU can be in CPU invisible VRAM. > Yeah, I understand the intention. But the implementation isn't even > remotely correct. > > First of all the flag must be cleared in the CS which wants to move the > BO, not in the move functions when the decision where to put it is > already made. For the purpose of this patch, we should clear the flag when the BO is actually moved to VRAM, regardless of how or why. > Second currently the flag is set on page fault, but never cleared > because the place where the code to clear it was added is just > completely incorrect (see above). My bad, thanks for pointing this out. The following at the end of amdgpu_bo_move should do the trick: if (new_mem->mem_type == TTM_PL_VRAM && old_mem->mem_type != TTM_PL_VRAM) { /* amdgpu_bo_fault_reserve_notify will re-set this if * the CPU accesses the BO after it's moved. */ abo->flags &= ~AMDGPU_BO_FLAG_CPU_ACCESS; } > Instead of messing with all this I suggest that we just add a jiffies > based timeout to the BO when we can clear the flag. For kernel BOs this > timeout is just infinity. > > Then we check in amdgpu_cs_bo_validate() before generating the > placements if we could clear the flag and do so based on the timeout. The idea for this patch was to save the memory and CPU cycles needed for that approach. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH RFC 2/2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM
Am 03.07.2017 um 03:34 schrieb Michel Dänzer: On 02/07/17 09:52 PM, Christian König wrote: Am 30.06.2017 um 17:18 schrieb John Brooks: When a BO is moved to VRAM, clear AMDGPU_BO_FLAG_CPU_ACCESS. This allows it to potentially later move to invisible VRAM if the CPU does not access it again. Setting the CPU_ACCESS flag in amdgpu_fault_reserve_notify() also means that we can remove the loop to restrict lpfn to the end of visible VRAM, because amdgpu_ttm_placement_init() will do it for us. Signed-off-by: John Brooks[...] @@ -446,6 +448,12 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, if (unlikely(r)) { goto out_cleanup; } + +/* The page fault handler will re-set this if the CPU accesses the BO + * after it's moved. + */ Maybe say "amdgpu_bo_fault_reserve_notify" explicitly here instead of "The page fault handler". +abo->flags &= ~AMDGPU_BO_FLAG_CPU_ACCESS; + This is the wrong place for clearing the flag. This code path is only called when we move things back in after suspend/resume (or run out of GTT space). Surely amdgpu_move_ram_vram is called whenever a BO is moved to VRAM, No, that isn't even remotely correct. amdgpu_move_ram_vram() is only called when the BO is moved directly from the system domain to the VRAM domain. Normally BOs are only moved from the GTT domain to the VRAM domain, except after resume and when we ran out of GTT space. for any reason. I suggested clearing the flag here to John on IRC. The idea is briefly described in the commit log, let me elaborate a bit on that: When a BO is moved to VRAM which has the AMDGPU_BO_FLAG_CPU_ACCESS flag set, it is put in CPU visible VRAM, and the flag is cleared. If the CPU doesn't access the BO, the next time it will be moved to VRAM (after it was evicted from there, for any reason), the flag will no longer be set, and the BO will likely be moved to CPU invisible VRAM. If the BO is accessed by the CPU again though (no matter where the BO is currently located at that time), the flag is set again, and the cycle from the previous paragraph starts over. The end result should be similar as with the timestamp based solution in John's earlier series: BOs which are at least occasionally accessed by the CPU will tend to be in CPU visible VRAM, those which are never accessed by the CPU can be in CPU invisible VRAM. Yeah, I understand the intention. But the implementation isn't even remotely correct. First of all the flag must be cleared in the CS which wants to move the BO, not in the move functions when the decision where to put it is already made. Second currently the flag is set on page fault, but never cleared because the place where the code to clear it was added is just completely incorrect (see above). Instead of messing with all this I suggest that we just add a jiffies based timeout to the BO when we can clear the flag. For kernel BOs this timeout is just infinity. Then we check in amdgpu_cs_bo_validate() before generating the placements if we could clear the flag and do so based on the timeout. I can help implementing this when I'm done getting ride of the BO move size limitation (swapped all of this stuff for that task back into my brain anyway). Regards, Christian. ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH RFC 2/2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM
On 02/07/17 09:52 PM, Christian König wrote: > Am 30.06.2017 um 17:18 schrieb John Brooks: >> When a BO is moved to VRAM, clear AMDGPU_BO_FLAG_CPU_ACCESS. This >> allows it >> to potentially later move to invisible VRAM if the CPU does not access it >> again. >> >> Setting the CPU_ACCESS flag in amdgpu_fault_reserve_notify() also means >> that we can remove the loop to restrict lpfn to the end of visible VRAM, >> because amdgpu_ttm_placement_init() will do it for us. >> >> Signed-off-by: John Brooks[...] >> @@ -446,6 +448,12 @@ static int amdgpu_move_ram_vram(struct >> ttm_buffer_object *bo, >> if (unlikely(r)) { >> goto out_cleanup; >> } >> + >> +/* The page fault handler will re-set this if the CPU accesses >> the BO >> + * after it's moved. >> + */ Maybe say "amdgpu_bo_fault_reserve_notify" explicitly here instead of "The page fault handler". >> +abo->flags &= ~AMDGPU_BO_FLAG_CPU_ACCESS; >> + > > This is the wrong place for clearing the flag. This code path is only > called when we move things back in after suspend/resume (or run out of > GTT space). Surely amdgpu_move_ram_vram is called whenever a BO is moved to VRAM, for any reason. I suggested clearing the flag here to John on IRC. The idea is briefly described in the commit log, let me elaborate a bit on that: When a BO is moved to VRAM which has the AMDGPU_BO_FLAG_CPU_ACCESS flag set, it is put in CPU visible VRAM, and the flag is cleared. If the CPU doesn't access the BO, the next time it will be moved to VRAM (after it was evicted from there, for any reason), the flag will no longer be set, and the BO will likely be moved to CPU invisible VRAM. If the BO is accessed by the CPU again though (no matter where the BO is currently located at that time), the flag is set again, and the cycle from the previous paragraph starts over. The end result should be similar as with the timestamp based solution in John's earlier series: BOs which are at least occasionally accessed by the CPU will tend to be in CPU visible VRAM, those which are never accessed by the CPU can be in CPU invisible VRAM. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH RFC 2/2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM
Am 30.06.2017 um 17:18 schrieb John Brooks: When a BO is moved to VRAM, clear AMDGPU_BO_FLAG_CPU_ACCESS. This allows it to potentially later move to invisible VRAM if the CPU does not access it again. Setting the CPU_ACCESS flag in amdgpu_fault_reserve_notify() also means that we can remove the loop to restrict lpfn to the end of visible VRAM, because amdgpu_ttm_placement_init() will do it for us. Signed-off-by: John Brooks--- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 8 2 files changed, 11 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index fa8aeca..19bd2fd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -953,6 +953,9 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo) return 0; abo = container_of(bo, struct amdgpu_bo, tbo); + + abo->flags |= AMDGPU_BO_FLAG_CPU_ACCESS; + if (bo->mem.mem_type != TTM_PL_VRAM) return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index c9b131b..cc65cdd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -417,6 +417,7 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, struct ttm_mem_reg *new_mem) { struct amdgpu_device *adev; + struct amdgpu_bo *abo; struct ttm_mem_reg *old_mem = >mem; struct ttm_mem_reg tmp_mem; struct ttm_placement placement; @@ -424,6 +425,7 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, int r; adev = amdgpu_ttm_adev(bo->bdev); + abo = container_of(bo, struct amdgpu_bo, tbo); tmp_mem = *new_mem; tmp_mem.mm_node = NULL; placement.num_placement = 1; @@ -446,6 +448,12 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, if (unlikely(r)) { goto out_cleanup; } + + /* The page fault handler will re-set this if the CPU accesses the BO +* after it's moved. +*/ + abo->flags &= ~AMDGPU_BO_FLAG_CPU_ACCESS; + This is the wrong place for clearing the flag. This code path is only called when we move things back in after suspend/resume (or run out of GTT space). Regards, Christian. out_cleanup: ttm_bo_mem_put(bo, _mem); return r; ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
[PATCH RFC 2/2] drm/amdgpu: Set/clear CPU_ACCESS flag on page fault and move to VRAM
When a BO is moved to VRAM, clear AMDGPU_BO_FLAG_CPU_ACCESS. This allows it to potentially later move to invisible VRAM if the CPU does not access it again. Setting the CPU_ACCESS flag in amdgpu_fault_reserve_notify() also means that we can remove the loop to restrict lpfn to the end of visible VRAM, because amdgpu_ttm_placement_init() will do it for us. Signed-off-by: John Brooks--- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 8 2 files changed, 11 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index fa8aeca..19bd2fd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -953,6 +953,9 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo) return 0; abo = container_of(bo, struct amdgpu_bo, tbo); + + abo->flags |= AMDGPU_BO_FLAG_CPU_ACCESS; + if (bo->mem.mem_type != TTM_PL_VRAM) return 0; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index c9b131b..cc65cdd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -417,6 +417,7 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, struct ttm_mem_reg *new_mem) { struct amdgpu_device *adev; + struct amdgpu_bo *abo; struct ttm_mem_reg *old_mem = >mem; struct ttm_mem_reg tmp_mem; struct ttm_placement placement; @@ -424,6 +425,7 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, int r; adev = amdgpu_ttm_adev(bo->bdev); + abo = container_of(bo, struct amdgpu_bo, tbo); tmp_mem = *new_mem; tmp_mem.mm_node = NULL; placement.num_placement = 1; @@ -446,6 +448,12 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, if (unlikely(r)) { goto out_cleanup; } + + /* The page fault handler will re-set this if the CPU accesses the BO +* after it's moved. +*/ + abo->flags &= ~AMDGPU_BO_FLAG_CPU_ACCESS; + out_cleanup: ttm_bo_mem_put(bo, _mem); return r; -- 2.7.4 ___ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx