21:41
To: Grodzovsky, Andrey ; Christian König
; Koenig, Christian
; Lazar, Lijo ;
dri-devel@lists.freedesktop.org ;
amd-...@lists.freedesktop.org ; Chen, JingWen
Cc: Chen, Horace ; Liu, Monk
Subject: Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs
Hi Andrey,
I don't
Just a gentle ping.
Andrey
From: Grodzovsky, Andrey
Sent: 26 January 2022 10:52
To: Christian König ; Koenig, Christian
; Lazar, Lijo ;
dri-devel@lists.freedesktop.org ;
amd-...@lists.freedesktop.org ; Chen, JingWen
Cc: Chen, Horace ; Liu, Monk
Subject: Re
AFAIK this one is independent.
Christian, can you confirm ?
Andrey
From: amd-gfx on behalf of Alex Deucher
Sent: 14 September 2021 15:33
To: Christian König
Cc: Liu, Monk ; amd-gfx list ;
Maling list - DRI developers
Subject: Re: [PATCH 1/2] drm/sched: fix
What about removing
(kthread_should_park()) ? We decided it's useless as far as I remember.
Andrey
From: amd-gfx on behalf of Liu, Monk
Sent: 31 August 2021 20:24
To: Liu, Monk ; amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Subject: RE:
Is libdrm on gitlab ? I wasn't aware of this. I assumed code reviews still go
through dri-devel.
Andrey
From: Alex Deucher
Sent: 03 June 2021 17:20
To: Grodzovsky, Andrey
Cc: Maling list - DRI developers ; amd-gfx
list ; Deucher, Alexander
; Christian König
Ok then, I guess I will proceed with the dummy pages list implementation.
Andrey
From: Koenig, Christian
Sent: 08 January 2021 09:52
To: Grodzovsky, Andrey ; Daniel Vetter
Cc: amd-...@lists.freedesktop.org ;
dri-devel@lists.freedesktop.org ;
daniel.vet
Hey, just a ping on my comments/question below.
Andrey
From: Grodzovsky, Andrey
Sent: 25 November 2020 12:39
To: Daniel Vetter
Cc: amd-gfx list ; dri-devel
; Christian König
; Rob Herring ; Lucas Stach
; Qiang Yu ; Anholt, Eric
; Pekka Paalanen ; Deucher
Hey Daniel, just a ping on a bunch of questions I posted below.
Andrey
From: Grodzovsky, Andrey
Sent: 25 November 2020 14:34
To: Daniel Vetter ; Koenig, Christian
Cc: r...@kernel.org ; daniel.vet...@ffwll.ch
; dri-devel@lists.freedesktop.org
; e
the issue Emily reported can be
avoided.
Andrey
From: Deng, Emily
Sent: 25 November 2019 16:44:36
To: Grodzovsky, Andrey
Cc: dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org; Koenig,
Christian; steven.pr...@arm.com; Grodzovsky, Andrey
Subject: RE
On 10/29/19 2:03 PM, Dan Carpenter wrote:
> On Tue, Oct 29, 2019 at 11:04:44AM -0400, Andrey Grodzovsky wrote:
>> Fix a static code checker warning.
>>
>> Signed-off-by: Andrey Grodzovsky
>> ---
>> drivers/gpu/drm/scheduler/sched_main.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2
On 10/25/19 11:55 AM, Koenig, Christian wrote:
> Am 25.10.19 um 16:57 schrieb Grodzovsky, Andrey:
>> On 10/25/19 4:44 AM, Christian König wrote:
>>> Am 24.10.19 um 21:57 schrieb Andrey Grodzovsky:
>>>> Problem:
>>>> When run_job fails and HW fence returne
On 10/25/19 4:44 AM, Christian König wrote:
> Am 24.10.19 um 21:57 schrieb Andrey Grodzovsky:
>> Problem:
>> When run_job fails and HW fence returned is NULL we still signal
>> the s_fence to avoid hangs but the user has no way of knowing if
>> the actual HW job was run and finished.
>>
>> Fix:
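The snippet is cut off before the fix itself; below is a hedged sketch of the
direction discussed in this thread (error propagation via dma_fence_set_error();
illustrative, not the verbatim patch):

    /* Still signal the scheduler fence so waiters don't hang, but
     * record an error on it first so userspace can tell that the HW
     * job never actually ran. */
    if (IS_ERR_OR_NULL(fence)) {
            dma_fence_set_error(&s_fence->finished, -ECANCELED);
            drm_sched_fence_finished(s_fence);
    }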
On 10/3/19 4:34 AM, Neil Armstrong wrote:
> Hi Andrey,
>
> Le 02/10/2019 à 16:40, Grodzovsky, Andrey a écrit :
>> On 9/30/19 10:52 AM, Hillf Danton wrote:
>>> On Mon, 30 Sep 2019 11:17:45 +0200 Neil Armstrong wrote:
>>>> Did a new run from 5.3:
On 9/30/19 5:17 AM, Neil Armstrong wrote:
> Hi Andrey,
>
> On 27/09/2019 22:55, Grodzovsky, Andrey wrote:
>> Can you please use addr2line or gdb to pinpoint where in
>> drm_sched_increase_karma you hit the NULL ptr ? It looks like the guilty
>> job, but to be sur
On 9/30/19 10:52 AM, Hillf Danton wrote:
> On Mon, 30 Sep 2019 11:17:45 +0200 Neil Armstrong wrote:
>> Did a new run from 5.3:
>>
>> [ 35.971972] Call trace:
>> [ 35.974391] drm_sched_increase_karma+0x5c/0xf0
>> 10667f3810667F94
>>
Can you please use addr2line or gdb to pinpoint where in
drm_sched_increase_karma you hit the NULL ptr ? It looks like the guilty
job, but to be sure.
Andrey
On 9/27/19 4:12 AM, Neil Armstrong wrote:
> Hi Christian,
>
> In v5.3, running dEQP triggers the following kernel crash :
>
> [
On 9/26/19 11:59 AM, Steven Price wrote:
> On 26/09/2019 16:48, Grodzovsky, Andrey wrote:
>> On 9/26/19 11:23 AM, Steven Price wrote:
>>> On 26/09/2019 16:14, Grodzovsky, Andrey wrote:
>>>> On 9/26/19 10:16 AM, Steven Price wrote:
>>>>> drm_sched_
On 9/26/19 11:23 AM, Steven Price wrote:
> On 26/09/2019 16:14, Grodzovsky, Andrey wrote:
>> On 9/26/19 10:16 AM, Steven Price wrote:
>>> drm_sched_cleanup_jobs() attempts to free finished jobs, however because
>>> it is called as the condition of wait_event_interrupti
On 9/26/19 10:16 AM, Steven Price wrote:
> drm_sched_cleanup_jobs() attempts to free finished jobs, however because
> it is called as the condition of wait_event_interruptible() it must not
> sleep. Unfortunately some free callbacks (notably for Panfrost) do sleep.
>
> Instead let's rename
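For context, here is roughly the drm_sched_main() wait being described (a sketch
of the scheduler code of that era, not the patch itself):

    struct drm_sched_entity *entity = NULL;

    /* The comma operator runs drm_sched_cleanup_jobs() on every
     * evaluation of the condition, i.e. with the task already in
     * TASK_INTERRUPTIBLE state, so any free callback it calls into
     * must not sleep. */
    wait_event_interruptible(sched->wake_up_worker,
                             (drm_sched_cleanup_jobs(sched),
                             (!drm_sched_blocked(sched) &&
                              (entity = drm_sched_select_entity(sched))) ||
                             kthread_should_stop()));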
On 9/26/19 3:07 AM, Koenig, Christian wrote:
> Am 25.09.19 um 17:14 schrieb Steven Price:
>> drm_sched_cleanup_jobs() attempts to free finished jobs, however because
>> it is called as the condition of wait_event_interruptible() it must not
>> sleep. Unfortunately some free callbacks (notably for
On 9/26/19 5:41 AM, Steven Price wrote:
> On 25/09/2019 21:09, Grodzovsky, Andrey wrote:
>> On 9/25/19 12:07 PM, Andrey Grodzovsky wrote:
>>> On 9/25/19 12:00 PM, Steven Price wrote:
>>>
>>>> On 25/09/2019 16:56, Grodzovsky, Andrey wrote:
On 9/25/19 12:07 PM, Andrey Grodzovsky wrote:
> On 9/25/19 12:00 PM, Steven Price wrote:
>
>> On 25/09/2019 16:56, Grodzovsky, Andrey wrote:
>>> On 9/25/19 11:14 AM, Steven Price wrote:
>>>
>>>> drm_sched_cleanup_jobs() attempts to free finished job
On 9/25/19 12:00 PM, Steven Price wrote:
> On 25/09/2019 16:56, Grodzovsky, Andrey wrote:
>> On 9/25/19 11:14 AM, Steven Price wrote:
>>
>>> drm_sched_cleanup_jobs() attempts to free finished jobs, however because
>>> it is called as the condition of w
On 9/25/19 11:14 AM, Steven Price wrote:
> drm_sched_cleanup_jobs() attempts to free finished jobs, however because
> it is called as the condition of wait_event_interruptible() it must not
> sleep. Unfortunately some free callbacks (notably for Panfrost) do sleep.
>
> Instead let's rename
Acked-by: Andrey Grodzovsky
Andrey
On 8/9/19 11:31 AM, Christian König wrote:
> The spsc_queue_peek function is accessing queue->head which belongs to
> the consumer thread and shouldn't be accessed by the producer
>
> This is fixing a rare race condition when destroying entit
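A hedged sketch of the ownership rule at issue (spsc_queue_count() is the real
helper from include/drm/spsc_queue.h; the wrapper below is only illustrative):

    /* In a single-producer/single-consumer queue ->head belongs to the
     * consumer thread, so peeking it from the producer side races with
     * a concurrent pop. The atomic job count is safe from either side. */
    static bool entity_queue_empty(struct drm_sched_entity *entity)
    {
            return spsc_queue_count(&entity->job_queue) == 0;
    }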
On 7/3/19 10:53 AM, Lucas Stach wrote:
> Am Mittwoch, den 03.07.2019, 14:41 + schrieb Grodzovsky, Andrey:
>> On 7/3/19 10:32 AM, Lucas Stach wrote:
>>> Am Mittwoch, den 03.07.2019, 14:23 + schrieb Grodzovsky, Andrey:
>>>> On 7/3
On 7/3/19 10:32 AM, Lucas Stach wrote:
> Am Mittwoch, den 03.07.2019, 14:23 + schrieb Grodzovsky, Andrey:
>> On 7/3/19 6:28 AM, Lucas Stach wrote:
>>> drm_sched_entity_kill_jobs_cb() is called right from the last scheduled
>>> job finished fence signaling. As
On 6/3/19 3:24 AM, Daniel Vetter wrote:
> On Thu, May 30, 2019 at 05:04:20PM +0200, Christian König wrote:
>> Am 29.05.19 um 21:36 schrieb Daniel Vetter:
>>> On Wed, May 29, 2019 at 04:43:45PM +, Grodzovsky, Andrey wrote:
>>>> I don't, sorry.
>>> Shoul
I don't, sorry.
Andrey
On 5/29/19 12:42 PM, Alex Deucher wrote:
> On Wed, May 29, 2019 at 10:29 AM Andrey Grodzovsky
> wrote:
>> Signed-off-by: Andrey Grodzovsky
> Reviewed-by: Alex Deucher
>
> I'll push it to drm-misc in a minute unless you have commit rights.
>
> Alex
>
>> ---
>>
Thanks for letting me know, I will send a fix soon.
Andrey
On 5/22/19 9:07 AM, Dan Carpenter wrote:
>
> Hello Christian König,
>
> The patch 5918045c4ed4: "drm/scheduler: rework job destruction" from
> Apr 18, 2019, leads to the following static checker warning:
>
>
17:42:48
To: Grodzovsky, Andrey
Cc: Deucher, Alexander; Koenig, Christian; Zhou, David(ChunMing); David Airlie;
Daniel Vetter; Lucas Stach; Russell King; Christian Gmeiner; Qiang Yu; Rob
Herring; Tomeu Vizoso; Eric Anholt; Rex Zhu; Huang, Ray; Deng, Emily; Nayan
Deshmukh; Sharat Masetty; amd
On 5/17/19 3:35 PM, Erico Nunes wrote:
>
> Hello,
>
> I have recently observed a memory leak issue with lima using
> drm-misc-next, which I initially reported here:
> https://gitlab.freedesktop.org/lima/linux/issues/24
> It is an easily reproducible memory leak which
cannot fully judge patch #4, #5, #6.
-David
From: amd-gfx
<mailto:amd-gfx-boun...@lists.freedesktop.org>
On Behalf Of Grodzovsky, Andrey
Sent: Friday, April 26, 2019 10:09 PM
To: Koenig, Christian
<mailto:christian.koe...@amd.com>; Zhou,
David(ChunMing) <mailto:david1.z...
that we don't do any processing
any more and then start with our reset procedure including forcing all hw
fences to complete.
Christian.
-David
From: amd-gfx
<mailto:amd-gfx-boun...@lists.freedesktop.org>
On Behalf Of Grodzovsky, Andrey
Sent: Wednesday, April 24, 2019 12:00 AM
To: Zhou, Da
: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already
signaled.
From: "Grodzovsky, Andrey"
To: "Zhou, David(ChunMing)"
,dri-devel@lists.freedesktop.org,amd-...@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,ckoenig.leichtzumer...@gmail.
On 4/22/19 8:59 AM, Zhou, David(ChunMing) wrote:
> +Monk to respond to this patch.
>
>
> 在 2019/4/18 23:00, Andrey Grodzovsky 写道:
>> For later driver's reference to see if the fence is signaled.
>>
>> v2: Move parent fence put to resubmit jobs.
>>
>> Signed-off-by: Andrey Grodzovsky
>>
.
Andrey
Original Message
Subject: Re: [PATCH v5 3/6] drm/scheduler: rework job destruction
From: "Grodzovsky, Andrey"
To: "Zhou, David(ChunMing)"
,dri-devel@lists.freedesktop.org,amd-...@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,cko
On 4/22/19 9:09 AM, Zhou, David(ChunMing) wrote:
> +Monk.
>
> GPU reset is used widely in SRIOV, so we need a virtualization guy to
> take a look.
>
> But out of curiosity, why can the guilty job signal more if the job is
> already set to guilty? Was it set wrongly?
>
>
> -David
It's possible that the job does
On 4/22/19 8:48 AM, Chunming Zhou wrote:
> Hi Andrey,
>
> static void drm_sched_process_job(struct dma_fence *f, struct
> dma_fence_cb *cb)
> {
> ...
> spin_lock_irqsave(&sched->job_list_lock, flags);
> /* remove job from ring_mirror_list */
> list_del_init(&s_job->node);
>
Reviewed-by: Andrey Grodzovsky
Andrey
On 4/20/19 8:50 AM, Jonathan Neuschäfer wrote:
> Since commit 222b5f044159 ("drm/sched: Refactor ring mirror list
> handling."), drm_sched_hw_job_reset is no longer there, so let's adjust
> the doc comment accordingly.
>
> Signed-of
ing the mail and the KASAN dump.
Andrey
>
> And we should probably commit patch #1 and #2.
>
> Christian.
>
> Am 22.04.19 um 13:54 schrieb Grodzovsky, Andrey:
>> Ping for patches 3, new patch 5 and patch 6.
>>
>> Andrey
>>
>> On 4/18/19 11:00 AM,
Koenig, Christian wrote:
>> Well you at least have to give me time till after the holidays to get
>> going again :)
>>
>> Not sure exactly yet why we need patch number 5.
>>
>> And we should probably commit patch #1 and #2.
>>
>> Christian.
>>
This series is on top of drm-misc because of the panfrost and lima drivers
which are missing from amd-staging-drm-next. Once I land it in drm-misc
I will merge and push it into drm-next.
Andrey
On 4/22/19 10:35 PM, Dieter Nützel wrote:
> Hello Andrey,
>
> this series can't apply (breaks on #3) on
Ping for patches 3, new patch 5 and patch 6.
Andrey
On 4/18/19 11:00 AM, Andrey Grodzovsky wrote:
> Also reject TDRs if another one is already running.
>
> v2:
> Stop all schedulers across device and entire XGMI hive before
> force signaling HW fences.
> Avoid passing job_signaled to helper
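A hedged sketch of the ordering the changelog describes (the helper name is
hypothetical and the ring iteration is modeled on the amdgpu of that period;
illustrative, not the actual diff):

    /* Park every scheduler in the device (and across the XGMI hive)
     * first, then force-complete the HW fences, so nothing can be
     * resubmitted while fences are being faked. */
    static void stop_all_scheds(struct amdgpu_device *adev,
                                struct amdgpu_job *bad_job)
    {
            int i;

            for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
                    struct amdgpu_ring *ring = adev->rings[i];

                    if (!ring || !ring->sched.thread)
                            continue;
                    drm_sched_stop(&ring->sched, &bad_job->base);
                    amdgpu_fence_driver_force_completion(ring);
            }
    }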
On 4/16/19 12:00 PM, Koenig, Christian wrote:
> Am 16.04.19 um 17:42 schrieb Grodzovsky, Andrey:
>> On 4/16/19 10:58 AM, Grodzovsky, Andrey wrote:
>>> On 4/16/19 10:43 AM, Koenig, Christian wrote:
>>>> Am 16.04.19 um 16:36 schrieb Grodzovsky, Andrey:
>>>
On 4/17/19 2:01 PM, Koenig, Christian wrote:
> Am 17.04.19 um 19:59 schrieb Christian König:
>> Am 17.04.19 um 19:53 schrieb Grodzovsky, Andrey:
>>> On 4/17/19 1:17 PM, Christian König wrote:
>>>> I can't review this patch, since I'm one of the authors of it, but in
and keep it all in one place which is amdgpu_device_gpu_recover.
Andrey
>
> Regards,
> Christian.
>
> Am 17.04.19 um 16:36 schrieb Grodzovsky, Andrey:
>> Ping on this patch and patch 5. The rest already RBed.
>>
>> Andrey
>>
>> On 4/16/19 2:23 PM, Andrey
Ping on this patch and patch 5. The rest already RBed.
Andrey
On 4/16/19 2:23 PM, Andrey Grodzovsky wrote:
> From: Christian König
>
> We now destroy finished jobs from the worker thread to make sure that
> we never destroy a job currently in timeout processing.
> By this we avoid holding lock
On 4/16/19 10:58 AM, Grodzovsky, Andrey wrote:
> On 4/16/19 10:43 AM, Koenig, Christian wrote:
>> Am 16.04.19 um 16:36 schrieb Grodzovsky, Andrey:
>>> On 4/16/19 5:47 AM, Christian König wrote:
>>>> Am 15.04.19 um 23:17 schrieb Eric Anholt:
>>>>>
On 4/16/19 10:43 AM, Koenig, Christian wrote:
> Am 16.04.19 um 16:36 schrieb Grodzovsky, Andrey:
>> On 4/16/19 5:47 AM, Christian König wrote:
>>> Am 15.04.19 um 23:17 schrieb Eric Anholt:
>>>> Andrey Grodzovsky writes:
>>>>
>>>>> From:
On 4/16/19 5:47 AM, Christian König wrote:
> Am 15.04.19 um 23:17 schrieb Eric Anholt:
>> Andrey Grodzovsky writes:
>>
>>> From: Christian König
>>>
>>> We now destroy finished jobs from the worker thread to make sure that
>>> we never destroy a job currently in timeout processing.
>>> By this
On 4/15/19 2:46 AM, Koenig, Christian wrote:
I agree this would be good in the case of amdgpu_device_pre_asic_reset
because we can totally skip this function if the guilty job already
signaled, but for amdgpu_device_post_asic_reset it creates complications
because drm_sched_start is right in the middle
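A hedged sketch of the check under discussion (field names follow the
drm_sched job/fence structures of that time; the wrapper is hypothetical):

    /* If the guilty job's HW (parent) fence already signaled, the HW
     * likely recovered on its own and the full ASIC reset can be
     * skipped. */
    static bool guilty_job_signaled(struct amdgpu_job *job)
    {
            return job && job->base.s_fence->parent &&
                   dma_fence_is_signaled(job->base.s_fence->parent);
    }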
On 4/12/19 3:39 AM, Christian König wrote:
> Am 11.04.19 um 18:03 schrieb Andrey Grodzovsky:
>> Also reject TDRs if another one is already running.
>>
>> v2:
>> Stop all schedulers across device and entire XGMI hive before
>> force signaling HW fences.
>>
>> Signed-off-by: Andrey Grodzovsky
>> ---
On 4/12/19 3:40 AM, Christian König wrote:
> Am 11.04.19 um 18:03 schrieb Andrey Grodzovsky:
>> Patch '5edb0c9b Fix deadlock with display during hanged ring recovery'
>> was accidentally removed during one of DAL's code merges.
>>
>> Signed-off-by: Andrey Grodzovsky
>> ---
>>
On 4/11/19 12:41 PM, Kazlauskas, Nicholas wrote:
> On 4/11/19 12:03 PM, Andrey Grodzovsky wrote:
>> Patch '5edb0c9b Fix deadlock with display during hanged ring recovery'
>> was accidentally removed during one of DAL's code merges.
>>
>> Signed-off-by: Andrey Grodzovsky
> Reviewed-by: Nicholas
On 4/10/19 10:41 AM, Christian König wrote:
> Am 10.04.19 um 16:28 schrieb Grodzovsky, Andrey:
>> On 4/10/19 10:06 AM, Christian König wrote:
>>> Am 09.04.19 um 18:42 schrieb Grodzovsky, Andrey:
>>>> On 4/9/19 10:50 AM, Christian König wrote:
>>>>> A
On 4/10/19 10:06 AM, Christian König wrote:
> Am 09.04.19 um 18:42 schrieb Grodzovsky, Andrey:
>> On 4/9/19 10:50 AM, Christian König wrote:
>>> Am 08.04.19 um 18:08 schrieb Andrey Grodzovsky:
>>>> Also reject TDRs if another one is already running.
>>>
On 4/9/19 10:50 AM, Christian König wrote:
> Am 08.04.19 um 18:08 schrieb Andrey Grodzovsky:
>> Also reject TDRs if another one is already running.
>>
>> Signed-off-by: Andrey Grodzovsky
>> ---
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 94
>> +-
>> 1 file
np
Andrey
On 3/13/19 1:53 PM, Eric Anholt wrote:
> "Grodzovsky, Andrey" writes:
>
>> On 3/13/19 12:13 PM, Eric Anholt wrote:
>>> "Grodzovsky, Andrey" writes:
>>>
>>>> They are not the same, but the guilty job belongs to only o
On 3/13/19 12:13 PM, Eric Anholt wrote:
> "Grodzovsky, Andrey" writes:
>
>> They are not the same, but the guilty job belongs to only one {entity,
>> scheduler} pair and so we mark as guilty only for that particular
>> entity in the context of that schedule
To: Grodzovsky, Andrey; dri-devel@lists.freedesktop.org;
amd-...@lists.freedesktop.org; to...@tomeuvizoso.net
Cc: Grodzovsky, Andrey
Subject: Re: [PATCH] drm/v3d: Fix calling drm_sched_resubmit_jobs for same
sched.
Andrey Grodzovsky writes:
> Also stop calling drm_sched_increase_karma multiple ti
On 3/12/19 3:43 AM, Tomeu Vizoso wrote:
> On Thu, 27 Dec 2018 at 20:28, Andrey Grodzovsky
> wrote:
>> Decouple sched threads stop and start and ring mirror
>> list handling from the policy of what to do about the
>> guilty jobs.
>> When stopping the sched thread and detaching sched fences
>>
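For reference, a hedged sketch of the decoupled driver-side flow this series
introduces (the scheduler entry points are real; the surrounding flow is
illustrative):

    drm_sched_stop(&ring->sched, bad_job);  /* park thread, detach fence cbs */
    /* ... driver policy: pick guilty job, reset the HW, etc. ... */
    drm_sched_resubmit_jobs(&ring->sched);  /* re-queue the remaining jobs */
    drm_sched_start(&ring->sched, true);    /* re-arm fences, unpark thread */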
On 3/6/19 1:37 AM, Cui, Flora wrote:
> deadlock test for sdma will cause gpu recovery.
> disable the test for now until GPU reset recovery can survive at least
> 1000 test runs.
Can you specify what issues you see and on what ASIC ?
Andrey
>
> v2: add modprobe parameter
>
> Change-Id:
er actually isn't used any more, isn't it?
>
>> +retry_wait:
> Not used any more.
>
> But apart from that at least patch #1 and #2 look like they can have my
> rb now.
>
> Patch #3 looks also like it should work after a bit of polishing.
>
> Thanks,
> Christia
, Christian wrote:
> Am 18.01.19 um 18:34 schrieb Grodzovsky, Andrey:
>> On 01/18/2019 12:10 PM, Koenig, Christian wrote:
>>> Am 18.01.19 um 16:21 schrieb Grodzovsky, Andrey:
>>>> On 01/18/2019 04:25 AM, Koenig, Christian wrote:
>>>>> [SNIP]
>>
On 01/18/2019 12:10 PM, Koenig, Christian wrote:
> Am 18.01.19 um 16:21 schrieb Grodzovsky, Andrey:
>> On 01/18/2019 04:25 AM, Koenig, Christian wrote:
>>> [SNIP]
>>>>>>> Re-arming the timeout should probably have a much reduced value
>>>>&
On 01/18/2019 04:25 AM, Koenig, Christian wrote:
> [SNIP]
> Re-arming the timeout should probably have a much reduced value
> when the job hasn't changed. E.g. something like a few ms.
>> Now I got thinking about a non-hung job in progress (job A), and let's
>> say it's a long job, it
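A hedged sketch of the re-arm suggestion above (work_tdr and timeout are real
drm_gpu_scheduler fields of that era; the short period, the unchanged-job test
and the helper name are illustrative):

    static void sched_rearm_tdr(struct drm_gpu_scheduler *sched,
                                bool head_job_unchanged)
    {
            /* A few ms instead of the full period when the job at the
             * head of the mirror list has not changed. */
            if (head_job_unchanged)
                    schedule_delayed_work(&sched->work_tdr,
                                          msecs_to_jiffies(5));
            else
                    schedule_delayed_work(&sched->work_tdr, sched->timeout);
    }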
On 01/17/2019 10:29 AM, Koenig, Christian wrote:
Am 17.01.19 um 16:22 schrieb Grodzovsky, Andrey:
On 01/17/2019 02:45 AM, Christian König wrote:
Am 16.01.19 um 18:17 schrieb Grodzovsky, Andrey:
On 01/16/2019 11:02 AM, Koenig, Christian wrote:
Am 16.01.19 um 16:45 schrieb Grodzovsky, Andrey
On 01/17/2019 10:29 AM, Koenig, Christian wrote:
Am 17.01.19 um 16:22 schrieb Grodzovsky, Andrey:
On 01/17/2019 02:45 AM, Christian König wrote:
Am 16.01.19 um 18:17 schrieb Grodzovsky, Andrey:
On 01/16/2019 11:02 AM, Koenig, Christian wrote:
Am 16.01.19 um 16:45 schrieb Grodzovsky, Andrey
On 01/17/2019 02:45 AM, Christian König wrote:
Am 16.01.19 um 18:17 schrieb Grodzovsky, Andrey:
On 01/16/2019 11:02 AM, Koenig, Christian wrote:
Am 16.01.19 um 16:45 schrieb Grodzovsky, Andrey:
On 01/16/2019 02:46 AM, Christian König wrote:
Am 15.01.19 um 23:01 schrieb Grodzovsky, Andrey
On 01/16/2019 11:02 AM, Koenig, Christian wrote:
Am 16.01.19 um 16:45 schrieb Grodzovsky, Andrey:
On 01/16/2019 02:46 AM, Christian König wrote:
Am 15.01.19 um 23:01 schrieb Grodzovsky, Andrey:
On 01/11/2019 05:03 PM, Andrey Grodzovsky wrote:
On 01/11/2019 02:11 PM, Koenig, Christian wrote
On 01/16/2019 02:46 AM, Christian König wrote:
Am 15.01.19 um 23:01 schrieb Grodzovsky, Andrey:
On 01/11/2019 05:03 PM, Andrey Grodzovsky wrote:
On 01/11/2019 02:11 PM, Koenig, Christian wrote:
Am 11.01.19 um 16:37 schrieb Grodzovsky, Andrey:
On 01/11/2019 04:42 AM, Koenig, Christian
On 01/11/2019 05:03 PM, Andrey Grodzovsky wrote:
>
>
> On 01/11/2019 02:11 PM, Koenig, Christian wrote:
>> Am 11.01.19 um 16:37 schrieb Grodzovsky, Andrey:
>>> On 01/11/2019 04:42 AM, Koenig, Christian wrote:
>>>> Am 10.01.19 um 16:56 schrieb Grodzovsky,
On 01/11/2019 02:11 PM, Koenig, Christian wrote:
> Am 11.01.19 um 16:37 schrieb Grodzovsky, Andrey:
>> On 01/11/2019 04:42 AM, Koenig, Christian wrote:
>>> Am 10.01.19 um 16:56 schrieb Grodzovsky, Andrey:
>>>> [SNIP]
>>>>>>> But we will not be a
On 01/11/2019 04:42 AM, Koenig, Christian wrote:
> Am 10.01.19 um 16:56 schrieb Grodzovsky, Andrey:
>> [SNIP]
>>>>> But we will not be adding the cb back in drm_sched_stop anymore, now we
>>>>> are only going to add back the cb in drm_sched_
Just a ping.
Andrey
On 01/09/2019 10:18 AM, Andrey Grodzovsky wrote:
>
>
> On 01/09/2019 05:22 AM, Christian König wrote:
>> Am 07.01.19 um 20:47 schrieb Grodzovsky, Andrey:
>>>
>>> On 01/07/2019 09:13 AM, Christian König wrote:
>>>> Am 03.01.19 um
On 01/09/2019 05:22 AM, Christian König wrote:
> Am 07.01.19 um 20:47 schrieb Grodzovsky, Andrey:
>>
>> On 01/07/2019 09:13 AM, Christian König wrote:
>>> Am 03.01.19 um 18:42 schrieb Grodzovsky, Andrey:
>>>> On 01/03/2019 11:20 AM, Grodzovsky, Andrey wrote:
On 01/07/2019 09:13 AM, Christian König wrote:
> Am 03.01.19 um 18:42 schrieb Grodzovsky, Andrey:
>>
>> On 01/03/2019 11:20 AM, Grodzovsky, Andrey wrote:
>>> On 01/03/2019 03:54 AM, Koenig, Christian wrote:
>>>> Am 21.12.18 um 21:36 schrieb Grodzovsky,
On 01/03/2019 11:20 AM, Grodzovsky, Andrey wrote:
>
> On 01/03/2019 03:54 AM, Koenig, Christian wrote:
>> Am 21.12.18 um 21:36 schrieb Grodzovsky, Andrey:
>>> On 12/21/2018 01:37 PM, Christian König wrote:
>>>> Am 20.12.18 um 20:23 schrieb Andrey Grodzovsky:
On 01/03/2019 03:54 AM, Koenig, Christian wrote:
> Am 21.12.18 um 21:36 schrieb Grodzovsky, Andrey:
>> On 12/21/2018 01:37 PM, Christian König wrote:
>>> Am 20.12.18 um 20:23 schrieb Andrey Grodzovsky:
>>>> Decouple sched threads stop and start and ring mirror
>
On 12/21/2018 01:37 PM, Christian König wrote:
> Am 20.12.18 um 20:23 schrieb Andrey Grodzovsky:
>> Decouple sched threads stop and start and ring mirror
>> list handling from the policy of what to do about the
>> guilty jobs.
>> When stopping the sched thread and detaching sched fences
>> from
As far as we discussed this internally, this looks good to me, but obviously
we need to wait for some feedback from non-AMD people.
Acked-by: Andrey Grodzovsky
Andrey
On 12/21/2018 09:33 AM, Nicholas Kazlauskas wrote:
> The behavior of drm_atomic_helper_cleanup_planes differs depend
On 12/19/2018 11:21 AM, Christian König wrote:
> Am 17.12.18 um 20:51 schrieb Andrey Grodzovsky:
>> Decouple sched threads stop and start and ring mirror
>> list handling from the policy of what to do about the
>> guilty jobs.
>> When stopping the sched thread and detaching sched fences
>> from
On 12/17/2018 10:27 AM, Christian König wrote:
> Am 10.12.18 um 22:43 schrieb Andrey Grodzovsky:
>> Decouple sched threads stop and start and ring mirror
>> list handling from the policy of what to do about the
>> guilty jobs.
>> When stopping the sched thread and detaching sched fences
>> from
Just a reminder. Any new comments in light of all the discussion ?
Andrey
On 12/12/2018 08:08 AM, Grodzovsky, Andrey wrote:
> BTW, the problem I pointed out with drm_sched_entity_kill_jobs_cb is not
> an issue with this patch set since it removes the cb from
> s_fence->finished in g
ote:
> Yeah, explained completely correctly.
>
> I was unfortunately really busy today, but going to give that a look
> as soon as I have time.
>
> Christian.
>
> Am 11.12.18 um 17:01 schrieb Grodzovsky, Andrey:
>> As I understand, you say that by the time the fence callback r
np
Andrey
On 12/11/2018 03:18 PM, Alex Deucher wrote:
> On Tue, Dec 11, 2018 at 3:13 PM Andrey Grodzovsky
> wrote:
>> I retested GPU recovery with Bonaire ASIC and it works.
>>
>> Signed-off-by: Andrey Grodzovsky
> Reviewed-by: Alex Deucher
>
> Care to enable it in the kernel as well?
>
>
Tuesday, December 11, 2018 5:44 AM
>> To: dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org;
>> ckoenig.leichtzumer...@gmail.com; e...@anholt.net;
>> etna...@lists.freedesktop.org
>> Cc: Zhou, David(ChunMing) ; Liu, Monk
>> ; Grodzovsky, Andrey
>>
>> Su
On 12/07/2018 03:19 AM, Christian König wrote:
> Am 07.12.18 um 04:18 schrieb Zhou, David(ChunMing):
>>
>>> -Original Message-
>>> From: dri-devel On Behalf Of
>>> Andrey Grodzovsky
>>> Sent: Friday, December 07, 2018 1:41 AM
>>> To: dri-devel@lists.freedesktop.org;
On 12/06/2018 01:33 PM, Christian König wrote:
> Am 06.12.18 um 18:41 schrieb Andrey Grodzovsky:
>> Decouple sched threads stop and start and ring mirror
>> list handling from the policy of what to do about the
>> guilty jobs.
>> When stopping the sched thread and detaching sched fences
>> from
On 12/06/2018 12:41 PM, Andrey Grodzovsky wrote:
> Expedite job deletion from ring mirror list to the HW fence signal
> callback instead of from finish_work; together with waiting for all
> such fences to signal in drm_sched_stop we guarantee that an
> already signaled job will not be processed twice.
>
There is a pplib-messaging-related failure currently during GPU reset. I will
put this issue on my TODO
list for a later time after handling higher-priority stuff, and will disable the
deadlock test suite for all non-dGPU gfx8/9 ASICs until then.
Andrey
On 11/02/2018 02:14 PM, Grodzovsky
On 11/02/2018 02:12 PM, Alex Deucher wrote:
> On Fri, Nov 2, 2018 at 11:59 AM Grodzovsky, Andrey
> wrote:
>>
>>
>> On 11/02/2018 10:24 AM, Michel Dänzer wrote:
>>> On 2018-10-31 7:33 p.m., Andrey Grodzovsky wrote:
>>>> Illegal access will cause CP h
On 11/02/2018 10:24 AM, Michel Dänzer wrote:
> On 2018-10-31 7:33 p.m., Andrey Grodzovsky wrote:
>> Illegal access will cause CP hang followed by job timeout and
>> recovery kicking in.
>> Also, disable the suite for all APU ASICs until GPU
>> reset issues for them will be resolved and GPU reset
On 10/31/2018 03:49 PM, Alex Deucher wrote:
> On Wed, Oct 31, 2018 at 2:33 PM Andrey Grodzovsky
> wrote:
>> Illegal access will cause CP hang followed by job timeout and
>> recovery kicking in.
>> Also, disable the suite for all APU ASICs until GPU
>> reset issues for them will be resolved and
On 10/31/2018 03:49 PM, Alex Deucher wrote:
> On Wed, Oct 31, 2018 at 2:33 PM Andrey Grodzovsky
> wrote:
>> Illegal access will cause CP hang followed by job timeout and
>> recovery kicking in.
>> Also, disable the suite for all APU ASICs until GPU
>> reset issues for them will be resolved and
Acked-by: Andrey Grodzovsky
Andrey
On 10/29/2018 05:32 AM, Sharat Masetty wrote:
> This patch adds a new API to clean up the scheduler job resources. This
> is primarily needed in cases where the job was created but was not queued to
> the scheduler queue. Additionally with this change,
On 10/22/2018 05:33 AM, Koenig, Christian wrote:
> Am 19.10.18 um 22:52 schrieb Andrey Grodzovsky:
>> Problem:
>> A particular scheduler may become unusable (underlying HW) after
>> some event (e.g. GPU reset). If it's later chosen by
>> the get free sched. policy a command will fail to be
>>
On 10/23/2018 05:23 AM, Christian König wrote:
> Am 22.10.18 um 22:46 schrieb Andrey Grodzovsky:
>> Start using drm_gpu_scheduler.ready instead.
>>
>> v3:
>> Add helper function to run ring test and set
>> sched.ready flag status accordingly, clean explicit
>> sched.ready sets from the IP
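A hedged sketch of the helper described in the v3 note (modeled on what became
amdgpu_ring_test_helper(); illustrative, not the verbatim patch):

    /* Run the ring test and derive the scheduler's ready status from
     * the result, instead of setting sched.ready by hand in each IP
     * block. */
    int amdgpu_ring_test_helper(struct amdgpu_ring *ring)
    {
            int r = amdgpu_ring_test_ring(ring);

            if (r)
                    DRM_DEV_ERROR(ring->adev->dev,
                                  "ring %s test failed (%d)\n",
                                  ring->name, r);
            ring->sched.ready = !r;
            return r;
    }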
That's my next step.
Andrey
On 10/19/2018 12:28 PM, Christian König wrote:
From my testing it looks like we can: compute ring 0 is dead but IB tests
pass on other compute rings.
Interesting, but I would rather investigate why compute ring 0 is dead while
the others still work.