On 12.12.2017 at 15:57, Marek Olšák wrote:
On Tue, Dec 12, 2017 at 10:01 AM, Christian König
<ckoenig.leichtzumer...@gmail.com> wrote:
On 11.12.2017 at 22:29, Marek Olšák wrote:
From: Marek Olšák <marek.ol...@amd.com>

Signed-off-by: Marek Olšák <marek.ol...@amd.com>
---

Is this really correct? I have no easy way to test it.

It's a step in the right direction, but I would rather vote for something
else:

Instead of disabling the timeout by default, we only disable the GPU
reset/recovery.

The idea is to add a new parameter, amdgpu_gpu_recovery, which makes
amdgpu_gpu_recover only print an error and not touch the GPU at all
(on bare metal systems).

Then we can finally set amdgpu_lockup_timeout to a non-zero value by
default.

Andrey, could you take care of this when you have time?
I don't understand this.

Why can't we keep the previous behavior where amdgpu.lockup_timeout=0
disabled GPU reset? Why do we have to add another option for the same
thing?

lockup_timeout=0 never disabled the GPU reset; it just disabled the timeout.

You could still trigger a reset manually, and invalid commands, invalid register writes, and requests from the SRIOV hypervisor could also trigger one.

And as Monk explained, GPU resets are mandatory for SRIOV; you can't disable them at all in that case.

In addition, we probably still want the error message that something timed out, just without touching the hardware in any way.
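
To make the distinction concrete, here is a rough userspace sketch of the two separate knobs. It is not the actual amdgpu code: struct recovery_config, enum recover_action, gpu_recover() and job_timed_out() are hypothetical names used only for illustration, and the rule that SRIOV always resets is taken from the discussion above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the settings discussed in this thread. */
struct recovery_config {
    unsigned lockup_timeout_ms; /* 0 disables timeout detection only */
    bool gpu_recovery;          /* false: report the error, leave the hardware alone */
    bool sriov;                 /* true: running under SRIOV, resets are mandatory */
};

enum recover_action { RECOVER_LOG_ONLY, RECOVER_RESET };

/* Timeout detection depends only on the timeout value. */
bool job_timed_out(const struct recovery_config *cfg, unsigned elapsed_ms)
{
    return cfg->lockup_timeout_ms != 0 && elapsed_ms >= cfg->lockup_timeout_ms;
}

/*
 * Recovery can be requested by a timeout, a manual trigger, an invalid
 * command or register write, or the SRIOV hypervisor. The message is always
 * printed; the reset is skipped only on bare metal with recovery disabled.
 */
enum recover_action gpu_recover(const struct recovery_config *cfg,
                                const char *reason)
{
    fprintf(stderr, "GPU recovery requested: %s\n", reason);

    if (!cfg->gpu_recovery && !cfg->sriov)
        return RECOVER_LOG_ONLY;

    return RECOVER_RESET;
}
```

With a non-zero timeout and gpu_recovery off on bare metal, a hung job is still detected and reported, but gpu_recover() returns RECOVER_LOG_ONLY. With lockup_timeout_ms set to 0, no timeout is ever detected, yet a manual or hypervisor request still reaches gpu_recover().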

Regards,
Christian.


Marek
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
