Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-29 Thread Christian König
; Zhou, David(ChunMing) ; dri-devel@lists.freedesktop.org; amd-...@lists.freedesktop.org; e...@anholt.net; etna...@lists.freedesktop.org *Cc:* Kazlauskas, Nicholas ; Liu, Monk *Subject:* Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled. Ping (mostly David and Monk

Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-29 Thread Grodzovsky, Andrey
s, Nicholas <mailto:nicholas.kazlaus...@amd.com>; Liu, Monk <mailto:monk....@amd.com> Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled. Ping (mostly David and Monk). Andrey On 4/24/19 3:09 AM, Christian König wrote: Am 24.04.19 um 05:02 schrieb

RE: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-27 Thread Zhou, David(ChunMing)
-...@lists.freedesktop.org; e...@anholt.net; etna...@lists.freedesktop.org Cc: Kazlauskas, Nicholas ; Liu, Monk Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled. Ping (mostly David and Monk). Andrey On 4/24/19 3:09 AM, Christian König wrote: Am 24.04.19 um 05:02 schrieb

Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-26 Thread Grodzovsky, Andrey
<mailto:etna...@lists.freedesktop.org>; ckoenig.leichtzumer...@gmail.com<mailto:ckoenig.leichtzumer...@gmail.com> Cc: Kazlauskas, Nicholas <mailto:nicholas.kazlaus...@amd.com>; Liu, Monk <mailto:monk@amd.com> Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job a

Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-24 Thread Christian König
tzumer...@gmail.com *Cc:* Kazlauskas, Nicholas ; Liu, Monk *Subject:* Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled. No, I mean the actual HW fence which signals when the job finished execution on the HW. Andrey On 4/23/19 11:19 AM, Zhou, David(ChunMing) wrot

RE: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-23 Thread Zhou, David(ChunMing)
stopping sched for the reason of hw reset? ---- Original Message ---- Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled. From: "Grodzovsky, Andrey" To: "Zhou, David(ChunMing)" ,dri-devel@lists.freedesktop.org,amd-...@lists.freedesktop.org,e..

Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-23 Thread Grodzovsky, Andrey
: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled. From: "Grodzovsky, Andrey" To: "Zhou, David(ChunMing)" ,dri-devel@lists.freedesktop.org,amd-...@lists.freedesktop.org,e...@anholt.net,etna...@lists.freedesktop.org,ckoenig.leichtzumer...@gmail.

Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-23 Thread Zhou, David(ChunMing)
Do you mean the fence timer? Why not stop it as well when stopping the scheduler for the HW reset? Original Message Subject: Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled. From: "Grodzovsky, Andrey" To: "Zhou, David(ChunMin

Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-23 Thread Grodzovsky, Andrey
On 4/22/19 9:09 AM, Zhou, David(ChunMing) wrote: > +Monk. > > GPU reset is used widely in SRIOV, so we need a virtualization person to take a look. > > But out of curiosity, why can the guilty job still signal if the job is already > set to guilty? Was it set wrongly? > > > -David It's possible that the job does

Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-23 Thread Christian König
On 23.04.19 at 16:12, Grodzovsky, Andrey wrote: On 4/23/19 8:32 AM, Koenig, Christian wrote: Well you at least have to give me time till after the holidays to get going again :) Not sure exactly yet why we need patch number 5. Probably you missed the mail where I pointed out a bug I found

Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-23 Thread Grodzovsky, Andrey
On 4/23/19 8:32 AM, Koenig, Christian wrote: > Well you at least have to give me time till after the holidays to get > going again :) > > Not sure exactly yet why we need patch number 5. Probably you missed the mail where I pointed out a bug I found during testing - I am reattaching the mail

Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-23 Thread Grodzovsky, Andrey
OK, I will merge them into amd-staging drm-next. Andrey On 4/23/19 9:14 AM, Kazlauskas, Nicholas wrote: > Feel free to merge 1+2 since they don't really depend on any other work > in the series and they were previously reviewed. > > Nicholas Kazlauskas > > On 4/23/19 8:32 AM, Koenig, Christian

Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-23 Thread Kazlauskas, Nicholas
Feel free to merge 1+2 since they don't really depend on any other work in the series and they were previously reviewed. Nicholas Kazlauskas On 4/23/19 8:32 AM, Koenig, Christian wrote: > Well you at least have to give me time till after the holidays to get > going again :) > > Not sure

Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-23 Thread Koenig, Christian
Well you at least have to give me time till after the holidays to get going again :) Not sure exactly yet why we need patch number 5. And we should probably commit patch #1 and #2. Christian. On 22.04.19 at 13:54, Grodzovsky, Andrey wrote: > Ping for patches 3, new patch 5 and patch 6. > >

Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-22 Thread Chunming Zhou
+Monk. GPU reset is used widely in SRIOV, so we need a virtualization person to take a look. But out of curiosity, why can the guilty job still signal if the job is already set to guilty? Was it set wrongly? -David On 2019/4/18 23:00, Andrey Grodzovsky wrote: > Also reject TDRs if another one already running. > > v2:

Re: [PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-22 Thread Grodzovsky, Andrey
Ping for patches 3, new patch 5 and patch 6. Andrey On 4/18/19 11:00 AM, Andrey Grodzovsky wrote: > Also reject TDRs if another one is already running. > > v2: > Stop all schedulers across device and entire XGMI hive before > force signaling HW fences. > Avoid passing job_signaled to helper

[PATCH v5 6/6] drm/amdgpu: Avoid HW reset if guilty job already signaled.

2019-04-18 Thread Andrey Grodzovsky
Also reject TDRs if another one is already running. v2: Stop all schedulers across device and entire XGMI hive before force signaling HW fences. Avoid passing job_signaled to helper functions to keep all the decision making about skipping HW reset in one place. v3: Fix SW sched. hang after non HW