On 13.12.2016 10:48, Christian König wrote:
The attached patch has fixed these crashes for me so far, but it's
very heavy-handed: it collects all page table shadows and the page
directory shadow and adds them all to the reservations for the callers
of amdgpu_vm_update_page_directory.

That is most likely just a timing change, because the shadows should end
up in the duplicates list anyway. So the patch shouldn't have any
effect.

Okay, so the reason for the remaining crash is still unclear at least
for me.

Yeah, that's a really good question. Can you share the call stack of the
problem once more?

Pretty sure I found the root cause now. amdgpu_vm_validate_pt_bos relies on the eviction counter to be able to skip the validation of the page tables.

However, moving the shadow page tables out from mem_type TT to SYSTEM doesn't count as an eviction (it just unbinds the mapping in the GTT).

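Clearly, that's a problem: after such a move the eviction counter is unchanged, so the validation is skipped even though the shadow page tables are no longer bound in the GTT.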
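
To make the failure mode concrete, here is a minimal toy model of the skip logic. It is only a sketch under my own assumptions: every name in it (toy_dev, toy_vm, toy_validate_pt_bos and so on) is hypothetical and is not the actual amdgpu code or its data structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; all names are hypothetical, not the amdgpu structures. */
enum toy_mem_type { TOY_SYSTEM, TOY_TT, TOY_VRAM };

struct toy_dev {
    uint64_t num_evictions;          /* bumped only on real evictions */
};

struct toy_vm {
    uint64_t last_eviction_counter;  /* value seen at the last validation */
    enum toy_mem_type shadow_placement;
};

/* A real eviction: moves the shadow and bumps the counter. */
static void toy_evict(struct toy_dev *dev, struct toy_vm *vm)
{
    vm->shadow_placement = TOY_SYSTEM;
    dev->num_evictions++;
}

/* TT -> SYSTEM move: the shadow is unbound, but the counter is untouched. */
static void toy_unbind_shadow(struct toy_dev *dev, struct toy_vm *vm)
{
    (void)dev;
    vm->shadow_placement = TOY_SYSTEM;
}

/* Returns true if validation was actually performed. */
static bool toy_validate_pt_bos(struct toy_dev *dev, struct toy_vm *vm)
{
    if (vm->last_eviction_counter == dev->num_evictions)
        return false;                /* skipped: counter says nothing moved */

    vm->shadow_placement = TOY_TT;   /* "validate": bring the shadow back */
    vm->last_eviction_counter = dev->num_evictions;
    return true;
}
```

In this model, calling toy_unbind_shadow() and then toy_validate_pt_bos() leaves the shadow in SYSTEM, because the counter comparison still matches and the validation is skipped.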

The quick fix is to skip the num_evictions check in amdgpu_vm_validate_pt_bos. That has worked for me so far.

The next best thing is to add an unbind counter in addition to the eviction counter that gets incremented whenever a BO is unbound (so it counts a superset of what the eviction counter counts), and then check that instead of the eviction counter.
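
Roughly, the unbind counter idea could look like the sketch below. Again this is a toy model, not a patch: the names (sketch_dev, num_unbinds, sketch_validate and so on) are made up for illustration, and the real driver would need proper locking or atomics around the counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch of the proposed counter; not actual amdgpu code. */
struct sketch_dev {
    uint64_t num_evictions;  /* existing idea: counts evictions only */
    uint64_t num_unbinds;    /* proposed: counts every unbind */
};

struct sketch_vm {
    uint64_t last_unbind_counter;  /* value seen at the last validation */
    bool shadow_bound;
};

/* Any unbind, e.g. a TT -> SYSTEM move of a shadow page table. */
static void sketch_unbind(struct sketch_dev *dev, struct sketch_vm *vm)
{
    vm->shadow_bound = false;
    dev->num_unbinds++;
}

/* An eviction also unbinds, so num_unbinds counts a superset. */
static void sketch_evict(struct sketch_dev *dev, struct sketch_vm *vm)
{
    dev->num_evictions++;
    sketch_unbind(dev, vm);
}

/* Skip validation only if nothing was unbound since the last run. */
static bool sketch_validate(struct sketch_dev *dev, struct sketch_vm *vm)
{
    if (vm->last_unbind_counter == dev->num_unbinds)
        return false;

    vm->shadow_bound = true;
    vm->last_unbind_counter = dev->num_unbinds;
    return true;
}
```

The point of checking num_unbinds instead of num_evictions is that the cheap skip still works when nothing has changed, while a plain unbind now forces the validation to run again.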

Cheers,
Nicolai
