On Wed, 20 Feb 2019 at 20:39, Grodzovsky, Andrey
wrote:
> No, we only fixed the original deadlock with display driver during GPU
> reset. I still didn't have time to go over your captures for the GPU
> page fault.
>
> The deadlock we see here is another deadlock, different from the one
> already f
On 2/20/19 12:28 AM, Mikhail Gavrilov wrote:
> On Tue, 19 Feb 2019 at 20:24, Grodzovsky, Andrey
> wrote:
>> Just pull in latest drm-next from here -
>> https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next
>>
>> Andrey
> I tested this kernel and the result is not good for me.
> 1) "amdgpu
Just pull in latest drm-next from here -
https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next
Andrey
On 2/14/19 11:18 PM, Mikhail Gavrilov wrote:
> On Thu, 14 Feb 2019 at 20:51, Grodzovsky, Andrey
> wrote:
>> Got it.
>>
>> Andrey
>>
> Cool, please don't forget to give me a patch for
On Thu, 14 Feb 2019 at 20:51, Grodzovsky, Andrey
wrote:
>
> Got it.
>
> Andrey
>
Cool, please don't forget to give me a patch for testing.
--
Best Regards,
Mike Gavrilov.
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/
Got it.
Andrey
On 2/14/19 4:32 AM, Christian König wrote:
Hey Andrey,
this is on Vega10, so the ASIC always stops after it sees the first fault.
I'm actually working on implementing that it should continue without
interruption.
Regards,
Christian.
Am 13.02.19 um 22:47 schrieb Grodzovsky, And
Hey Andrey,
this is on Vega10, so the ASIC always stops after it sees the first fault.
I'm actually working on implementing that it should continue without
interruption.
Regards,
Christian.
Am 13.02.19 um 22:47 schrieb Grodzovsky, Andrey:
Looks like you are still running this without the l
[ Puts on list administrator hat ]
On 2019-02-14 5:16 a.m., Mikhail Gavrilov via amd-gfx wrote:
>
> Just in case, I duplicated all the files on the file sharing service Mega:
> https://mega.nz/#F!pgYCjYrS!NkeTFIja_qwmxqLoSEUyzA
Please only share such large files via an external service, don't
Looks like you are still running this without the latest hang fix since I see
the deadlock again, but actually what I forgot to ask you is to load amdgpu
with vm_fault_stop=2 to freeze the ASIC once VM_FAULT is encountered - sorry
about that. So please retest with amdgpu.vm_fault_stop=2 paramete
OK, just apply the following to your amdgpu_dm_do_flip function and see
if GPU reset does proceed after you experience the hang.
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index d59bafc..586301f 100644
--- a/drivers/gpu/drm/
On Tue, 12 Feb 2019 at 20:23, Grodzovsky, Andrey
wrote:
>
> It should recover you - so this looks like a bug. I noticed in one of
> the call traces this - drm_atomic_helper_suspend which points to system
> going into sleep mode, is it what happened, did it hang when system
> tried to sleep ?
>
It
Sure, that probably would be the solution, one missing detail here
(besides confirming with the debug prints that this is the scenario we
are hitting) is WHY we are even stuck in
reservation_object_wait_timeout_rcu, in amdgpu_device_pre_asic_reset
(during GPU reset) we are first forcing all outstan
Using MAX_SCHEDULE_TIMEOUT is probably not a good idea for the wait in DM.
I wonder if we could just do a shorter wait and skip the FB
update/programming if it fails after some reasonable amount of time.
This would still allow recovery to happen at least even if the display
isn't showing the right b
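
A rough userspace sketch of the bounded-wait idea above, not the actual DM code. Everything named here (`mock_wait_fences`, `do_flip_bounded`, `enum flip_result`, the tick values) is hypothetical and exists only for illustration. The mock assumes the usual kernel timed-wait convention: a positive value when the fences signaled, 0 on timeout, and a negative value on error.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of one flip attempt. */
enum flip_result { FLIP_DONE, FLIP_SKIPPED, FLIP_ERROR };

/*
 * Hypothetical stand-in for a timed fence wait. Assumed convention:
 * > 0 means the fences signaled (remaining ticks), 0 means timeout.
 * This mock never fails, so it never returns a negative value.
 */
long mock_wait_fences(bool will_signal, long timeout_ticks)
{
    assert(timeout_ticks > 0); /* a bounded wait needs a finite, positive timeout */

    if (!will_signal)
        return 0; /* fences never signal, e.g. after a VM fault */

    /* Pretend one tick elapsed, but never report 0 on success. */
    return timeout_ticks > 1 ? timeout_ticks - 1 : 1;
}

/*
 * Wait for the fences for a bounded time only. On timeout, skip the
 * framebuffer update instead of blocking forever, so that a caller
 * such as a reset path is not held up by this wait.
 */
enum flip_result do_flip_bounded(bool fences_will_signal, long timeout_ticks)
{
    long r = mock_wait_fences(fences_will_signal, timeout_ticks);

    if (r < 0)
        return FLIP_ERROR;   /* a real wait could be interrupted or fail */
    if (r == 0)
        return FLIP_SKIPPED; /* timed out: do not program the framebuffer */

    /* The framebuffer update/programming would happen here. */
    return FLIP_DONE;
}
```

With this shape, `do_flip_bounded(false, 100)` reports a skipped flip rather than hanging, which is the property the suggestion is after. What a reasonable timeout is, and what the display shows after a skipped update, are left open here just as they are in the message.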
On 2/12/19 7:34 AM, Mikhail Gavrilov wrote:
> Hi folks. Sorry for the noise.
> But I really don't know whether it is enough to send my logs or not.
> As I understand it, different sequences may cause "ring gfx timeout".
> I also have not heard which version I need to wait for or which patch I need to
> apply before testin
They are useful. I am gonna take a look later.
Andrey
On 2/12/19 10:49 AM, Mikhail Gavrilov wrote:
> On Tue, 12 Feb 2019 at 20:23, Grodzovsky, Andrey
> wrote:
>> It should recover you - so this looks like a bug. I noticed in one of
>> the call traces this - drm_atomic_helper_suspend which points
I suspect the issue is that amdgpu_dm_do_flip is holding the BO reserved
and is then stuck waiting for fences to signal in
reservation_object_wait_timeout_rcu (which won't signal because there
was a VM_FAULT). Then when we try to shut down the display block during reset
recovery from drm_atomic_helper_