On 13.12.2016 10:48, Christian König wrote:
The attached patch has fixed these crashes for me so far, but it's
very heavy-handed: it collects all page table shadows and the page
directory shadow and adds them all to the reservations for the callers
of amdgpu_vm_update_page_directory.
That is most likely just a timing change, because the shadows should end
up in the duplicates list anyway. So the patch shouldn't have any
effect.
Okay, so the reason for the remaining crash is still unclear, at least
to me.
Yeah, that's a really good question. Can you share the call stack of the
problem once more?
Pretty sure I found the root cause now. amdgpu_vm_validate_pt_bos relies
on the eviction counter to be able to skip the validation of the page
tables.
However, moving the shadow page tables from mem_type TT to SYSTEM
doesn't count as an eviction (it just unbinds the mapping in the GTT).
Clearly, that's a problem.
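
To make the failure mode concrete, here is a toy model in plain C. It is
not the amdgpu code: every name in it (toy_dev, toy_vm, toy_validate, and
so on) is made up for illustration, and it only assumes what is described
above, namely that validation is skipped when the eviction counter is
unchanged and that a TT to SYSTEM move does not bump that counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in amdgpu. */
enum toy_placement { TOY_SYSTEM, TOY_TT };

struct toy_dev {
    uint64_t num_evictions;      /* bumped on real evictions only */
};

struct toy_vm {
    uint64_t seen_evictions;     /* counter value at the last validation */
    enum toy_placement shadow;   /* where the shadow page table lives */
};

/* A real eviction moves the shadow and bumps the counter. */
static void toy_evict_shadow(struct toy_dev *dev, struct toy_vm *vm)
{
    vm->shadow = TOY_SYSTEM;
    dev->num_evictions++;
}

/* TT -> SYSTEM unbind: the placement changes, the counter does not. */
static void toy_unbind_shadow(struct toy_dev *dev, struct toy_vm *vm)
{
    (void)dev;                   /* deliberately untouched */
    vm->shadow = TOY_SYSTEM;
}

/* Returns true if validation ran, false if it was skipped. */
static bool toy_validate(struct toy_dev *dev, struct toy_vm *vm)
{
    assert(dev && vm);
    if (vm->seen_evictions == dev->num_evictions)
        return false;            /* counter unchanged: skip */
    vm->shadow = TOY_TT;         /* "validate": move the shadow back */
    vm->seen_evictions = dev->num_evictions;
    return true;
}
```

In this model, calling toy_unbind_shadow() followed by toy_validate()
leaves the shadow in SYSTEM, because the skip check sees no change in the
counter.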
The quick fix is to skip the num_evictions check in
amdgpu_vm_validate_pt_bos, so that the page tables are always validated.
That has worked for me so far.
The next best thing is to add an unbind counter in addition to the
eviction counter that gets incremented whenever a BO is unbound (so it
counts a superset of what the eviction counter counts), and then check
that instead of the eviction counter.
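
A rough sketch of the unbind counter idea, again as standalone C rather
than a patch: the num_unbinds field and all sketch_* names are
hypothetical, and where the increments would actually live in the driver
is left open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device-wide counters; only the eviction counter exists today. */
struct sketch_dev {
    uint64_t num_evictions;
    uint64_t num_unbinds;        /* assumed new counter, superset of evictions */
};

struct sketch_vm {
    uint64_t seen_unbinds;       /* num_unbinds at the last validation */
};

/* Any unbind (for example TT -> SYSTEM) bumps only the unbind counter. */
static void sketch_note_unbind(struct sketch_dev *dev)
{
    assert(dev);
    dev->num_unbinds++;
}

/* An eviction bumps both counters, so num_unbinds counts a superset. */
static void sketch_note_eviction(struct sketch_dev *dev)
{
    assert(dev);
    dev->num_evictions++;
    dev->num_unbinds++;
}

/* Validation may be skipped only if nothing was unbound since last time. */
static bool sketch_needs_validation(const struct sketch_dev *dev,
                                    const struct sketch_vm *vm)
{
    return vm->seen_unbinds != dev->num_unbinds;
}

/* Record that the page tables were validated against the current state. */
static void sketch_mark_validated(const struct sketch_dev *dev,
                                  struct sketch_vm *vm)
{
    vm->seen_unbinds = dev->num_unbinds;
}
```

The point of keeping both counters is that existing users of the eviction
counter stay unaffected, while the page table validation check switches
to the broader one.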
Cheers,
Nicolai
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx