Am 02.03.2018 um 17:38 schrieb Felix Kuehling:
On 2018-03-02 02:43 AM, Christian König wrote:
patch #3 looks good to me in general, but why do you need to cache the
The BO already does that anyway, so you can just use
amdgpu_bo_gpu_addr() for this which also makes some additional checks
on debug builds.
Do you mean amdgpu_bo_gpu_offset? That one requires the reservation to
be locked unless it's pinned.
Correct, well that's why I suggested it.
We are caching the PD physical address here to get around a tricky
circular locking issue we ran into under memory pressure with evictions.
I don't remember all the details, but I do remember that the deadlock
involved fences, which aren't tracked by the lock dependency checker. So
it was particularly tricky to nail down. Avoiding the need to lock the
page directory reservation for finding out its physical address breaks
the lock cycle.
OK, so what you do is you get the pd address after you validated the BOs
and cache it until you need it in the hardware setup?
If yes than that would be valid because we do exactly the same in the
job during command submission.
patch #5 well mixing power management into the VM functions is a clear
This part won't be in my initial upstream push for GPUVM.
That certainly doesn't belong there, but we can have a counter how
many compute VMs we have in the manager. amdgpu_vm_make_compute() can
then return if this was the first VM to became a compute VM or not.
We currently trigger compute power profile switching based on the
existence of compute VMs. It was a convenient hook where amdgpu finds
out about the existence of compute work. If that's not acceptable we can
look into moving it elsewhere, possibly using a new KFD2KGD interface.
As I said the VM code can still count how many compute VM there are, the
issue is it should not make power management decisions based on that.
The caller which triggers the switch of the VM to be a compute VM should
make that decision.
The rest of the patch looks good to me.
Am 01.03.2018 um 23:58 schrieb Felix Kuehling:
I have a working patch series against amd-kfg-staging that lets KFD use
VMs from render node FDs, as we discussed. There are two patches in that
series that touch amdgpu_vm.[ch] that I'd like your feedback on before I
commit the changes to amd-kfd-staging and include them in my upstream
patch series for KFD GPUVM support. See attached.
amd-gfx mailing list