>Can you explain your reasoning behind your current position that the KIQ
>shouldn't be used by baremetal amdgpu?
[ML] I didn't mean KIQ shouldn't be leveraged by bare-metal; how it is
used by bare-metal is not my concern ...
I mean it had better not be used in the SR-IOV case by other
On 2017-05-06 06:57, Felix Kuehling wrote:
We ran into a similar problem when we played with priorities on KFD
queues. You can't change an MQD of a currently mapped queue. To change a
queue priority we need to unmap it, update the MQD, and then map it again.
I wonder if there is similar
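A minimal sketch of the unmap, update, map sequence described above. The toy_* types and helpers are hypothetical stand-ins, not the real KFD or amdgpu structures; the only point is that the MQD is edited while the queue is unmapped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the real KFD or amdgpu structures. */
struct toy_mqd { int priority; };
struct toy_queue { struct toy_mqd mqd; bool mapped; };

static void toy_unmap(struct toy_queue *q) { q->mapped = false; }
static void toy_map(struct toy_queue *q) { q->mapped = true; }

/* Change the priority by unmapping, editing the MQD, and mapping again. */
static void toy_set_priority(struct toy_queue *q, int priority)
{
	bool was_mapped = q->mapped;

	if (was_mapped)
		toy_unmap(q);

	/* The MQD is only touched while the queue is unmapped. */
	assert(!q->mapped);
	q->mqd.priority = priority;

	/* Restore the previous mapping state. */
	if (was_mapped)
		toy_map(q);
}
```

A queue that was mapped before the call is mapped again afterwards, and one that was unmapped stays unmapped.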
On 2017-05-08 03:07 PM, Dave Airlie wrote:
On 9 May 2017 at 04:54, Harry Wentland wrote:
Hi Daniel,
Thanks for taking the time to look at DC.
I had a couple more questions/comments in regard to the patch you posted on
IRC: http://paste.debian.net/plain/930704
My
On Mon, May 8, 2017 at 9:25 AM, Christian König wrote:
> From: Christian König
>
> This kind of reset handling was removed a long time ago.
>
> Signed-off-by: Christian König
Reviewed-by: Alex Deucher
On Fri, May 5, 2017 at 10:27 AM, Alex Deucher wrote:
> Update the scratch reg for when the engine is hung.
>
> Signed-off-by: Alex Deucher
ping on this series.
Alex
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c | 13 +
>
On Wed, Apr 5, 2017 at 9:01 AM, Nath, Arindam wrote:
>
> >-----Original Message-----
> >From: Daniel Drake [mailto:dr...@endlessm.com]
> >Sent: Thursday, March 30, 2017 7:15 PM
> >To: Nath, Arindam
> >Cc: j...@8bytes.org; Deucher, Alexander; Bridgman, John; amd-
>
Local variable use_doorbell is assigned to a constant value and it is never
updated again. Remove this variable and the dead code it guards.
Addresses-Coverity-ID: 1401828
Signed-off-by: Gustavo A. R. Silva
---
drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c | 53
> -----Original Message-----
> From: amd-gfx [mailto:amd-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Alex Xie
> Sent: Monday, May 08, 2017 11:32 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Xie, AlexBin
> Subject: [PATCH] drm/amdgpu: fix errors in comments.
>
> Signed-off-by: Alex Xie
Unfortunately, further testing shows that this doesn't actually fix the
problem. FWIW, that test runs very reliably on SI with the radeon drm,
but with the amdgpu drm it fails. VI is fine on amdgpu, which is why I
was sent down this road.
Anyway, back to trying to figure this out :/
Cheers,
On 2017-05-08 02:08 AM, Liu, Monk wrote:
> Andres
>
> Some previous patches, like moving the KIQ mutex lock from amdgpu_virt to a
> common place, bypassed my NAK, but from a technical perspective that doesn't
> matter anyway. But this patch and the following patches are heading to a dead end:
>
> 1, Don't use KIQ
Signed-off-by: Alex Xie
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 66bb60e..aab3206 100644
---
On 08.05.2017 at 09:01, Liu, Monk wrote:
@Christian
This one is changed to the guilty-job scheme in accordance with your response.
BR Monk
-----Original Message-----
From: Monk Liu [mailto:monk@amd.com]
Sent: Monday, May 08, 2017 3:00 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk
Because we can always rely on TDR and the hypervisor to detect a GPU hang and
resubmit malicious jobs or even kick them out later,
and the GPU reset will eventually be invoked, there is no reason to manually
and voluntarily call GPU reset in the SR-IOV case.
Well there is a rather good reason, we
The VM fault interrupt or illegal instruction will be delivered to the GPU
whether it's the SR-IOV or the bare-metal case,
and I stopped them from invoking GPU reset for the same reason:
don't trigger GPU reset in the SR-IOV case if possible. Always be aware that triggering
GPU reset under SR-IOV is a heavy
Sounds good, but what do we do with the amdgpu_irq_reset_work_func?
Please note that I find that calling amdgpu_gpu_reset() here is a bad
idea in the first place.
Instead we should consider the scheduler as faulting and let the
scheduler handle that in the same way as a job timeout.
But
On 08.05.2017 at 11:28, Monk Liu wrote:
Change-Id: Ie9730852da54ceb8b4c2c44acac2df3556a32d17
Signed-off-by: Monk Liu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git
From: Nicolai Hähnle
Bring the code in line with what the radeon module does.
Without this change, the fence following the IB may be signalled
to the CPU even though some data written by shaders may not have
been written back yet.
This change fixes the OpenGL CTS test
Change-Id: Ie9730852da54ceb8b4c2c44acac2df3556a32d17
Signed-off-by: Monk Liu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 9 +++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
I agree with disabling debugfs for amdgpu_reset when SR-IOV is detected.
-----Original Message-----
From: Christian König [mailto:deathsim...@vodafone.de]
Sent: Monday, May 08, 2017 5:20 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 1/4] drm/amdgpu:don't
You know that a GPU reset under SR-IOV will have a very big impact on all other VFs
...
Mhm, good argument. But in this case we need to give at least some
warning message instead of doing nothing.
Or even better, disable creating the amdgpu_reset debugfs file
altogether. This way nobody will
Yeah, my mistake, thanks for the catch.
-----Original Message-----
From: Christian König [mailto:deathsim...@vodafone.de]
Sent: Monday, May 08, 2017 5:11 PM
To: Liu, Monk ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 3/4] drm/amdgpu:only call flr_work under infinite timeout
On 08.05.2017 at 08:51, Monk Liu wrote:
Change-Id: I541aa5109f4fcab06ece4761a09dc7e053ec6837
Signed-off-by: Monk Liu
---
drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 15 +--
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git
For the SR-IOV use case, we call GPU reset only when we have no choice ...
So places like debugfs shouldn't be a good reason to trigger a GPU reset.
You know that a GPU reset under SR-IOV will have a very big impact on all other VFs
...
BR Monk
-----Original Message-----
From: Christian König
On 08.05.2017 at 08:51, Monk Liu wrote:
That way we can know which job caused the hang and
can do a per-scheduler reset/recovery instead of resetting all
schedulers.
Change-Id: Ifc98cd74b2d93823c489de6a89087ba188957eff
Signed-off-by: Monk Liu
Reviewed-by: Christian König
On 08.05.2017 at 08:51, Monk Liu wrote:
Because we don't want to do the SR-IOV GPU reset in certain
cases, split those two functions and don't invoke
the SR-IOV one from the bare-metal one.
Change-Id: I641126c241e2ee2dfd54e6d16c389b159f99cfe0
Signed-off-by: Monk Liu
---
@Christian
This one is changed to the guilty-job scheme in accordance with your response.
BR Monk
-----Original Message-----
From: Monk Liu [mailto:monk@amd.com]
Sent: Monday, May 08, 2017 3:00 PM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Monk
Subject: [PATCH]
1, TDR will kick out the guilty job if its hang count exceeds the threshold
given by the kernel parameter "job_hang_limit"; that
way a bad command stream will not cause GPU hangs indefinitely.
By default this threshold is 1, so a job will be kicked out
after it hangs.
2, If a job times out, the TDR routine will not
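The threshold check in item 1 can be sketched as follows. This is only an illustration: struct toy_job and toy_job_should_kick() are hypothetical names, and the sketch assumes a job is dropped once its hang count reaches job_hang_limit, so that the default of 1 drops it after the first hang. The actual comparison used by the patch may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a scheduler job; not the amdgpu structure. */
struct toy_job { unsigned int hang_count; };

/*
 * Record one more hang for the job and report whether it should be
 * kicked out. Assumption: the job is dropped once its hang count
 * reaches the limit, so a limit of 1 drops it after the first hang.
 */
static bool toy_job_should_kick(struct toy_job *job, unsigned int job_hang_limit)
{
	assert(job);
	job->hang_count++;
	return job->hang_count >= job_hang_limit;
}
```

With a limit of 2, the first hang lets the job be resubmitted and the second one removes it.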
Because we don't want to do the SR-IOV GPU reset in certain
cases, split those two functions and don't invoke
the SR-IOV one from the bare-metal one.
Change-Id: I641126c241e2ee2dfd54e6d16c389b159f99cfe0
Signed-off-by: Monk Liu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ---
For SR-IOV GPU reset:
this feature allows the driver to judge how long a job may hang,
and kicks the job out of the ring_mirror list during recovery if
the threshold is exceeded.
Monk Liu (4):
drm/amdgpu:don't invoke srio-gpu-reset in gpu-reset
drm/amdgpu:use job* to replace
Andres
Some previous patches, like moving the KIQ mutex lock from amdgpu_virt to a common place,
bypassed my NAK, but from a technical perspective that doesn't matter anyway.
But this patch and the following patches are heading to a dead end:
1, Don't use KIQ to access register inside INTR context
2, Don't