From: Harish Kasiviswanathan
This enables KFD_EVENT_TYPE_HW_EXCEPTION notifications to user mode in
response to bad opcodes in a CP queue.
Signed-off-by: Harish Kasiviswanathan
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c | 3 ++-
1 file changed, 2
it doesn't need to be a separate allocation from the amdgpu_vm_pt.
Acked-by: Felix Kuehling
Regards,
Felix
On 2018-09-12 04:55 AM, Christian König wrote:
> Instead of the double linked list. Gets the size of amdgpu_vm_pt down to
> 64 bytes again.
>
> We could even reduce it down
On 2018-09-12 09:55 PM, Alex Deucher wrote:
> On Wed, Sep 12, 2018 at 9:45 PM Felix Kuehling wrote:
>> From: Emily Deng
>>
>> Correct the format
>>
For vega10 sriov, the sdma doorbell must be fixed as follows to keep the
>> same setting with h
On 2018-09-14 01:52 PM, Christian König wrote:
> On 14.09.2018 at 19:47, Philip Yang wrote:
>> On 2018-09-14 03:51 AM, Christian König wrote:
>>> On 13.09.2018 at 23:51, Felix Kuehling wrote:
>>>> On 2018-09-13 04:52 PM, Philip Yang wrote:
You need ROCm 1.9 to work with the upstream KFD. libhsakmt from ROCm 1.8
is incompatible with the upstream KFD ABI.
Where did you get KFDTest? It's part of the same repository on GitHub as
libhsakmt. It's new on the 1.9 branch. You need libhsakmt from the same
branch. The ROCm 1.9 binaries are
On 2018-09-13 04:52 PM, Philip Yang wrote:
> Replace our MMU notifier with hmm_mirror_ops.sync_cpu_device_pagetables
> callback. Enable CONFIG_HMM and CONFIG_HMM_MIRROR as a dependency in
> DRM_AMDGPU_USERPTR Kconfig.
>
> It supports both KFD userptr and gfx userptr paths.
>
> This depends on
On 2018-09-13 01:50 PM, Christian König wrote:
> On 12.09.2018 at 22:21, Felix Kuehling wrote:
>> On 2018-09-12 04:55 AM, Christian König wrote:
>>> We can get that just by casting tv.bo.
>>>
>>> Signed-off-by: Christian König
>>> ---
>>&
On 2019-04-30 at 1:03 p.m., Koenig, Christian wrote:
>>> The only real solution I can see is to be able to reliable kill shaders
>>> in an OOM situation.
>> Well, we can in fact preempt our compute shaders with low latency.
>> Killing a KFD process will do exactly that.
> I've taken a look at
On 2019-11-01 16:12, Zhao, Yong wrote:
create_cp_queue() could also work with SDMA queues, so we should rename
it.
Change-Id: I76cbaed8fa95dd9062d786cbc1dd037ff041da9d
Signed-off-by: Yong Zhao
The name change makes sense. This patch is
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm
On 2019-11-13 5:09 p.m., Yong Zhao wrote:
After the recent cleanup, the functionalities provided by the previous
kfd_kernel_queue_*.c are actually all packet manager related. So rename
them to reflect that.
Change-Id: I6544ccb38da827c747544c0787aa949df20edbb0
Signed-off-by: Yong Zhao
---
that the Vega
code name for the SOC that's used elsewhere in the code.
Regards,
Felix
Yong
On 2019-11-13 5:19 p.m., Felix Kuehling wrote:
On 2019-11-13 5:09 p.m., Yong Zhao wrote:
After the recent cleanup, the functionalities provided by the previous
kfd_kernel_queue_*.c are actually all p
See one comment inline. With that fixed, the series is
Reviewed-by: Felix Kuehling
I could think of more follow-up cleanup while you're at it:
1. Can you see any reason why the kq->ops need to be function pointers?
Looks to me like they are the same for all kernel queues, so th
On 2019-11-13 5:39 p.m., Yong Zhao wrote:
After the recent cleanup, the functionalities provided by the previous
kfd_kernel_queue_*.c are actually all packet manager related. So rename
them to reflect that.
NAK. Like I mentioned in the other email, AI refers to the SOC
generation by its
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/Makefile | 4 ++--
.../amdkfd/{kfd_kernel_queue_v9.c => kfd_packet_manager_v9.c} | 0
.../amdkfd/{kfd_kernel_queue_vi.c => kfd_packet_manager_vi.c} | 0
3 files changed, 2 insertions(+), 2 deletions(-)
-off-by: Yong Zhao
The series is
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10
On 2019-11-12 4:35 p.m., Yong Zhao wrote:
Hi Felix,
See one thing inline that I am not too sure about.
Yong
On 2019-11-12 4:30 p.m., Felix Kuehling wrote:
On 2019-11-12 4:26 p.m., Yong Zhao wrote:
Adapt the change from 1cd106ecfc1f04
The change is:
drm/amdkfd: Stop using GFP_NOIO explicitly
qd_cntl_stack_size;;
Please fix the double-semicolon. With that fixed this change is
Reviewed-by: Felix Kuehling
if (copy_to_user(ctl_stack, mqd_ctl_stack, m->cp_hqd_cntl_stack_size))
return -EFAULT;
___
amd-gfx
are
Reviewed-by: Felix Kuehling
Patch 3 should arguably not be part of this series, because it does not
affect GFXv10.
Regards,
Felix
---
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 14 +-
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd
On 2019-11-01 16:10, Zhao, Yong wrote:
Change-Id: I75da23bba90231762cf58da3170f5bb77ece45ed
Signed-off-by: Yong Zhao
I agree with the name changes. One suggestion for a comment inline. With
that fixed, this patch is
Reviewed-by: Felix Kuehling
---
.../gpu/drm/amd/amdkfd
separate the queue properties from the
queue driver state. That would probably change some internal interfaces
to use struct queue instead of queue_properties.
Anyway, this patch is
Reviewed-by: Felix Kuehling
Change-Id: I553045ff9fcb3676900c92d10426f2ceb3660005
Signed-off-by: Yong Zhao
On 2019-11-11 15:43, Felix Kuehling wrote:
On 2019-11-01 16:10, Zhao, Yong wrote:
doorbell_off in the queue properties is mainly used for the doorbell dw
offset in the PCI BAR. We should not set it to the doorbell byte offset in
process doorbell pages. This makes the code much easier to read.
I
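The dw-vs-byte distinction above comes down to the fact that doorbells are 32-bit (dword) registers, so a dword offset converts to a byte offset by multiplying by 4. A minimal sketch (the helper name is made up for illustration):

```python
def doorbell_dw_to_byte_offset(dw_off: int) -> int:
    """Convert a doorbell dword offset (as used for the PCI BAR) to a
    byte offset (as used for process doorbell pages)."""
    return dw_off * 4  # one dword = 4 bytes

# Mixing the two units silently points at the wrong doorbell slot,
# which is why keeping doorbell_off consistently in dwords helps.
assert doorbell_dw_to_byte_offset(8) == 32
```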
The subject doesn't match the change. This changes ttm_bo_cleanup_refs,
not ttm_buffer_object_transfer.
On 2019-11-11 9:58 a.m., Christian König wrote:
The function is always called with deleted BOs.
While at it cleanup the indentation as well.
Signed-off-by: Christian König
---
-by: Hulk Robot
Fixes: 1ae99eab34f9 ("drm/amdkfd: Initialize HSA_CAP_ATS_PRESENT capability in
topology codes")
Signed-off-by: zhengbin
The patch is
Reviewed-by: Felix Kuehling
I'm applying it to amd-staging-drm-next.
Thanks,
Felix
---
drivers/gpu/drm/amd/amdkfd/kfd_iommu.c |
, to give a confident R-b.
Acked-by: Felix Kuehling
---
drivers/gpu/drm/ttm/ttm_bo.c | 215 +-
drivers/gpu/drm/ttm/ttm_bo_util.c | 1 -
include/drm/ttm/ttm_bo_api.h | 11 +-
3 files changed, 97 insertions(+), 130 deletions(-)
diff --git a/drivers/gpu
I'm not sure about this one. Looks like the interface is getting
needlessly more complicated. Now the caller has to keep track of the
runlist IB address and size just to pass those to another function. I
could understand this if there was a use case that needs to separate the
allocation of the
On 2019-11-22 3:23 p.m., Oak Zeng wrote:
Config the translation retry behavior according to noretry
kernel parameter
Change-Id: I5b91ea77715137cf8cb84e258ccdfbb19c7a4ed1
Signed-off-by: Oak Zeng
Suggested-by: Jay Cornwall
---
drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 4 +++-
On 2019-11-22 5:55 p.m., Oak Zeng wrote:
Config the translation retry behavior according to noretry
kernel parameter
Change-Id: I5b91ea77715137cf8cb84e258ccdfbb19c7a4ed1
Signed-off-by: Oak Zeng
Suggested-by: Jay Cornwall
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu
On 2019-12-04 10:38 a.m., Christian König wrote:
Allows us to reduce the overhead while syncing to fences a bit.
This allows some further simplification. See two comments inline.
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 18 +++---
can be distinguished in user mode.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 30
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 2 ++
2 files changed, 28 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
[+Alejandro]
On 2019-12-04 10:38 a.m., Christian König wrote:
This way we can do updates even without the resv obj locked.
This could use a bit more explanation. This change depends on the
previous one that adds explicit synchronization with page table updates
during command submission in
ic context any more, but that took me way longer than expected as
well.
I'm currently experimenting with using a trylock driver mutex, that at least
that should work for now until we got something better.
Regards,
Christian.
On 28.11.19 at 21:30, Felix Kuehling wrote:
Hi Christian,
I'm thin
I agree with Christian's comments on patch 1. With those fixed, the
series is
Reviewed-by: Felix Kuehling
Regards,
Felix
On 2019-12-02 20:42, Yong Zhao wrote:
Since Arcturus has its own function pointer, we can move Arcturus
specific logic to there rather than leaving it entangled
I agree. Removing the call to pre-reset probably breaks GPU reset for KFD.
We call the KFD suspend function in pre-reset, which uses the HIQ to
stop any user mode queues still running. If that is not possible because
the HIQ is hanging, it should fail with a timeout. There may be
something we
On 2019-12-17 12:28, Jonathan Kim wrote:
The DF routines to arm xGMI performance will attempt to re-arm both on
performance monitoring start and read on initial failure to arm.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdgpu/df_v3_6.c | 153 ---
1 file
See comment inline. Other than that, the series looks good to me.
On 2019-12-16 2:02, Huang Rui wrote:
The Thunk driver would like to know the num_cp_queues data, but this data
is ASIC-specific. So it's better to get it from the KFD driver.
Signed-off-by: Huang Rui
---
you that the operation is in progress and you should check
its status later."
This call is neither non-blocking nor is the requested page table update
in progress when this error is returned. So I'd think a better error to
return here would be EBUSY.
Other than that, this patch is
Reviewed-b
On 2019-12-05 8:39 a.m., Christian König wrote:
Allows us to reduce the overhead while syncing to fences a bit.
v2: also drop adev parameter from the functions
Signed-off-by: Christian König
Reviewed-by: Felix Kuehling
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 8
be updated because "other VM updates" fences are no longer in the
resv. Something like this: VM updates only sync with moves but not with
user command submissions or KFD evictions fences. With that fixed, this
patch is
Reviewed-by: Feli
On 2019-12-05 8:39 a.m., Christian König wrote:
When a BO is evicted, immediately invalidate the mapped PTEs.
I think you mentioned that this is just a proof of concept. I wouldn't
submit the patch like this because it's overkill for VMs that don't want
to use recoverable page faults and
I don't think this should go into amdgpu_vram_mgr. KFD tries to avoid
running out of VRAM for page tables because we cannot oversubscribe
memory within a process and we want to avoid compute processes evicting
each other because that would mean thrashing. Those limitation don't
apply to
On 2019-12-05 11:10 a.m., Philip Yang wrote:
One comment in line.
With that fixed, this is reviewed by Philip Yang
Philip
On 2019-12-04 11:13 p.m., Felix Kuehling wrote:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 8276601a122f
On 2019-12-16 3:06 p.m., Zhao, Yong wrote:
[AMD Official Use Only - Internal Distribution Only]
The problem happens when we want to reuse the same function for ASICs
which have fewer SDMA engines. Some pointers on which SOC15_REG_OFFSET
depends for some higher index SDMA engines are 0,
On 2019-12-13 8:38, Yong Zhao wrote:
This prevents the NULL pointer access when there are fewer than 8 sdma
engines.
I don't see where you got a NULL pointer in the old code. Also this
change is in an Arcturus-specific source file. AFAIK Arcturus always has
8 SDMA engines.
The new code is
Hi Christian,
Alex started trying to invalidate PTEs in the MMU notifiers and we're
finding that we still need to reserve the VM reservation for
amdgpu_sync_resv in amdgpu_vm_sdma_prepare. Is that sync_resv still
needed now, given that VM fences aren't in that reservation object any more?
On 2019-11-18 17:24, Alex Sierra wrote:
Only for the debugger use case.
[why]
Avoid endless translation retries, after an invalid address access has
been issued to the GPU. Instead, the trap handler is forced to enter by
generating a no-retry-fault.
A s_trap instruction is inserted in the
On 2019-11-15 11:07, Yong Zhao wrote:
It is the same as KFD_MQD_TYPE_CP, so delete it. As a result, we will
have one less mqd manager per device.
Change-Id: Iaa98fc17be06b216de7a826c3577f44bc0536b4c
Signed-off-by: Yong Zhao
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd
: I4180c30e2631dc0401cbd6171f8a6776e4733c9a
Signed-off-by: Alex Sierra
This commit adds some unnecessary empty lines. See inline. With that
fixed, the series is
Reviewed-by: Felix Kuehling
Please also give Christian a chance to review.
Thanks,
Felix
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 -
1 file changed
On 2019-10-11 1:12 p.m., t...@kernel.org wrote:
Hello, Daniel.
On Wed, Oct 09, 2019 at 06:06:52PM +0200, Daniel Vetter wrote:
That's not the point I was making. For cpu cgroups there's a very well
defined connection between the cpu bitmasks/numbers in cgroups and the cpu
bitmasks you use in
Hi Timothy,
Thank you for the patch and for confirming that it works. We did some
experimental work on Power8 a few years ago. I see that Talos II is Power9.
At the time we were working on Power8 we had to add some #ifdef
CONFIG_ACPI guards around some ACPI-specific code in KFD. Do you know
each + 256GB system memory = 512GB
Old page table reservation per GPU: 1GB
New page table reservation per GPU: 32MB
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 15 ++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu
., Felix Kuehling wrote:
Be less pessimistic about estimated page table use for KFD. Most
allocations use 2MB pages and therefore need less VRAM for page
tables. This allows more VRAM to be used for applications especially
on large systems with many GPUs and hundreds of GB of system memory.
Example
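The size difference quoted in the example above follows from simple PTE arithmetic. A rough sketch, assuming 8-byte page-table entries and counting only the leaf level (upper levels are comparatively tiny):

```python
KiB, MiB, GiB = 1 << 10, 1 << 20, 1 << 30

# From the example: 256 GB VRAM per GPU + 256 GB system memory = 512 GB
mapped = 512 * GiB
pte_size = 8  # bytes per page-table entry (an assumption, typical for 64-bit GPUs)

# Leaf page-table size if everything were mapped with 4 KiB pages:
small_pages = mapped // (4 * KiB) * pte_size
# ... versus mapping with 2 MiB huge pages:
huge_pages = mapped // (2 * MiB) * pte_size

assert small_pages == 1 * GiB  # matches the old 1 GB reservation
assert huge_pages == 2 * MiB   # well under the new 32 MB reservation
```

This is only an illustration of the orders of magnitude involved, not the exact estimation formula used in amdgpu_amdkfd_gpuvm.c.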
On 2019-11-25 4:06 p.m., Timothy Pearson wrote:
- Original Message -
From: "Felix Kuehling"
To: "Timothy Pearson" , "amd-gfx"
Sent: Monday, November 25, 2019 11:07:31 AM
Subject: Re: [PATCH 1/1] amdgpu: Enable KFD on POWER systems
Hi Tim
Allow KFD applications to use more unpinned system memory through
HMM.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm
Hi Christian,
I'm thinking about this problem, trying to come up with a solution. The
fundamental problem is that we need low-overhead access to the page
table in the MMU notifier, without much memory management or locking.
There is one "driver lock" that we're supposed to take in the MMU
confusing.
Signed-off-by: Jonathan Kim
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/df_v3_6.c | 151 +++
1 file changed, 129 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
b/drivers/gpu/drm/amd/amdgpu/df_v3_6.c
index
I'm thinking, if we know we're preparing for a GPU reset, maybe we
shouldn't even try to suspend processes and stop the HIQ.
kfd_suspend_all_processes, stop_cpsch and other functions up that call
chain up to kgd2kfd_suspend could have a parameter (bool pre_reset) that
would update the driver
?
Regards
shaoyun.liu
On 2019-12-19 5:44 p.m., Felix Kuehling wrote:
I'm thinking, if we know we're preparing for a GPU reset, maybe we
shouldn't even try to suspend processes and stop the HIQ.
kfd_suspend_all_processes, stop_cpsch and other functions up that
call chain up to kgd2kfd_suspend could
shaoyun.liu
On 2019-12-20 11:33 a.m., Felix Kuehling wrote:
dqm->is_hws_hang is protected by the DQM lock. kq_uninitialize runs
outside that lock protection. Therefore I opted to pass in the
hanging flag as a parameter. It also keeps the logic that decides all
of that inside the device queue manager
On 2019-12-18 3:45 a.m., Huang Rui wrote:
The Thunk driver would like to know the num_cp_queues data, but this data
is ASIC-specific. So it's better to get it from the KFD driver.
v2: don't update name size.
Signed-off-by: Huang Rui
The series is
Reviewed-by: Felix Kuehling
Move HWS hang detection into unmap_queues_cpsch to catch hangs in all
cases. If this happens during a reset, don't schedule another reset
because the reset already in progress is expected to take care of it.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 3
Reading from /sys/kernel/debug/kfd/hang_hws would cause a kernel
oops because we didn't implement a read callback. Set the permission
to write-only to prevent that.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_debugfs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion
dqm->pipeline_mem wasn't used anywhere.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 1 -
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h | 1 -
2 files changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manage
Don't use the HWS if it's known to be hanging. In a reset also
don't try to destroy the HIQ because that may hang on SRIOV if the
KIQ is unresponsive.
Signed-off-by: Felix Kuehling
---
.../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c| 12
drivers/gpu/drm/amd/amdkfd
FLR for SRIOV.
Regards,
Felix
On 2019-12-19 9:09 p.m., Felix Kuehling wrote:
The intention of the pre_reset callback is to update the driver
state to reflect that all user mode queues are preempted and the
HIQ is destroyed. However we should not actually preempt any queues
or otherwise touch the hardw
he reset by avoiding unnecessary timeouts from
a potentially hanging GPU scheduler.
CC: shaoyunl
CC: Liu Monk
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 24 ++---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 27 ---
of actually stopping all queues.
This should prevent KIQ register access hanging on SRIOV function
level reset (FLR). It should also speed up the reset by avoiding
unnecessary timeouts from a potentially hanging GPU scheduler.
CC: shaoyunl
CC: Liu Monk
Signed-off-by: Felix Kuehling
---
drivers
FLR for SRIOV.
Regards,
Felix
On 2019-12-19 9:09 p.m., Felix Kuehling wrote:
> The intention of the pre_reset callback is to update the driver
> state to reflect that all user mode queues are preempted and the
> HIQ is destroyed. However we should not actually preempt any queues
> or
Some nit-picks inline. Looks good otherwise.
On 2019-12-18 2:04 p.m., Jonathan Kim wrote:
The DF routines to arm xGMI performance will attempt to re-arm both on
performance monitoring start and read on initial failure to arm.
v2: Roll back reset_perfmon_cntr to void return since new perfmon
On 2019-12-20 12:22, Zeng, Oak wrote:
[AMD Official Use Only - Internal Distribution Only]
Regards,
Oak
-Original Message-
From: amd-gfx On Behalf Of Felix
Kuehling
Sent: Friday, December 20, 2019 3:30 AM
To: amd-gfx@lists.freedesktop.org
Subject: [PATCH 4/4] drm/amdkfd: Avoid
I think even inside that kq_uninitialize function, we can still get
dqm as kq->dev->dqm.
shaoyun.liu
On 2019-12-20 3:30 a.m., Felix Kuehling wrote:
Don't use the HWS if it's known to be hanging. In a reset also
don't try to destroy the HIQ because that may hang on SRIOV if the
K
or GPU resets in KFD.
I think we created the worker to avoid locking issues, but there may be
better ways to do this.
Regards,
Felix
Regards,
Oak
-Original Message-
From: amd-gfx On Behalf Of Felix
Kuehling
Sent: Friday, December 20, 2019 3:30 AM
To: amd-gfx@lists.freedeskto
On 2019-12-20 1:24, Alex Sierra wrote:
This can be used directly from amdgpu and amdkfd to invalidate
TLB through pasid.
It supports gmc v7, v8, v9 and v10.
Two small corrections inline to make the behaviour between KIQ and
MMIO-based flushing consistent. Looks good otherwise.
Change-Id:
: Ic2c7d4a0d19fe1e884dee1ff10a520d31252afee
Signed-off-by: Alex Sierra
This patch is
Reviewed-by: Felix Kuehling
---
.../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 -
.../drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c| 67 -
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 41
.../gpu/drm
: I5531c9337836e7d4a430df3f16dcc82888e8018c
Signed-off-by: Alex Sierra
This patch is
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 14 ++---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 28 +-
2 files changed, 34 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm
On 2019-12-20 1:24, Alex Sierra wrote:
[Why]
TLB flush method has been deprecated using kfd2kgd interface.
This implementation is now on the amdgpu_amdkfd API.
[How]
TLB flush functions now implemented in amdgpu_amdkfd.
Change-Id: Ic51cccdfe6e71288d78da772b6e1b6ced72f8ef7
Signed-off-by: Alex
I think this patch is just a proof of concept for now. It should not be
submitted because there are still some known locking issues that need to
be solved, and we don't have the code yet that handles the recoverable
page faults resulting from this.
Regards,
Felix
On 2019-12-20 1:24, Alex
adding VM updates fences to the resv obj
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 14 --
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index
-by: Felix Kuehling
On 2020-02-24 17:18, Yong Zhao wrote:
The previous SDMA queue counting was wrong. In addition, after confirming
with the MEC firmware team, we understand that only one unmap queue package,
instead of one unmap queue package for CP and each SDMA engine, is needed,
which results in much
Signed-off-by: Eric Huang
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
b/drivers/gpu/drm/amd/amdkfd
On 2020-03-04 3:21 p.m., Yong Zhao wrote:
ALLOC_MEM_FLAGS_* used are the same as the KFD_IOC_ALLOC_MEM_FLAGS_*,
but they are used interchangeably in the kernel driver, resulting in bad
readability. For example, KFD_IOC_ALLOC_MEM_FLAGS_COHERENT is totally
not referenced in kernel, and it functions in
wrote:
Series is Reviewed-by: xinhui pan
On 2020-03-05 at 05:50, Kuehling, Felix wrote:
Otherwise BOs may wait for the fence indefinitely and never be destroyed.
v2: Signal the fence right after destroying queues to avoid unnecessary
delayed delete in kfd_process_wq_release
Signed-off-by: Felix
On 2020-02-27 9:28, Christian König wrote:
Hi Felix,
so coming back to this after two weeks of distraction.
On 14.02.20 at 22:12, Felix Kuehling wrote:
Now you allow eviction of page tables while you allocate page tables.
Isn't the whole point of the eviction lock to prevent page table
Add a dummy implementation of amdgpu_amdkfd_remove_fence_on_pt_pd_bos
for kernel configs without KFD.
Fixes: be8e48e08499 ("drm/amdgpu: Remove kfd eviction fence before release bo")
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 +
1 file
I've seen hangs on a Raven AM4 system after the Ubuntu upgrade to kernel
5.3. I am able to work around it by disabling stutter mode with the
module parameter amdgpu.ppfeaturemask=0xfffdbfff. If that doesn't help,
you could also try disabling GFXOFF with amdgpu.ppfeaturemask=0xfffd3fff.
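For context on the two workaround values above: the masks differ in exactly one bit, which is the extra feature (GFXOFF) being cleared. A sketch comparing the values from the email (which ppfeaturemask bit maps to which feature is not spelled out here, so treat the bit positions as observations, not documentation):

```python
stutter_off = 0xfffdbfff  # from the email: disables stutter mode
gfxoff_off  = 0xfffd3fff  # from the email: additionally disables GFXOFF

# The second mask clears exactly one extra bit relative to the first:
assert stutter_off ^ gfxoff_off == 1 << 15

# Relative to an all-ones 32-bit mask, the stutter workaround clears
# bits 14 and 17:
cleared = ~stutter_off & 0xffffffff
assert cleared == (1 << 14) | (1 << 17)
```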
On 2020-01-27 20:29, Rajneesh Bhardwaj wrote:
During system suspend the kfd driver acquires a lock that prohibits
further kfd actions unless the gpu is resumed. This adds some info which
can be useful while debugging.
Signed-off-by: Rajneesh Bhardwaj
---
Hi Rajneesh,
See comments inline ...
And a general question: Why do you need to set the autosuspend_delay in
so many places? Amdgpu only has a single call to this function during
initialization.
On 2020-01-27 20:29, Rajneesh Bhardwaj wrote:
So far the kfd driver implemented the same routines
and files underneath
are generated when a queue is created. They are removed when the queue is
destroyed.
Signed-off-by: Amber Lin
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 7 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 90
Bhardwaj
One small comment inline. Other than that patches 1-3 are
Reviewed-by: Felix Kuehling
Also, I believe patch 1 is unchanged from v1 and already got a
Reviewed-by from Alex. Please remember to add that tag before you submit.
The last patch that enabled runtime PM by default, I'd leave
If we're using the BAR, we should probably flush HDP cache/buffers
before reading or after writing.
Regards,
Felix
On 2020-02-05 10:22 a.m., Christian König wrote:
This should speed up debugging VRAM access a lot.
Signed-off-by: Christian König
---
On 2020-02-05 10:22 a.m., Christian König wrote:
Make use of the better performance here as well.
This patch is only compile tested!
Signed-off-by: Christian König
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 38 +++--
1 file changed, 23 insertions(+), 15 deletions(-)
On 2020-02-06 9:30, Christian König wrote:
Make use of the better performance here as well.
This patch is only compile tested!
v2: fix calculation bug pointed out by Felix
Signed-off-by: Christian König
Acked-by: Jonathan Kim
The series is
Reviewed-by: Felix Kuehling
---
drivers
t;conf and hwc->config are in different members of that union. So
hwc->conf aliases some other variable in the structure that hwc->config
is in. If I did the math right, hwc->conf aliases hwc->last_tag.
Anyway, the patch is
Reviewed-by: Felix Kuehling
Signed-off-by: Jonathan K
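The aliasing described above (two names in different members of one union sharing the same storage) can be illustrated with a minimal ctypes sketch. The layout here is hypothetical and only loosely mirrors the email's point about hwc->conf and hwc->last_tag; it is not the real struct hw_perf_event:

```python
import ctypes

class A(ctypes.Structure):
    _fields_ = [("conf", ctypes.c_uint64)]

class B(ctypes.Structure):
    _fields_ = [("last_tag", ctypes.c_uint64)]

class Hwc(ctypes.Union):
    # Hypothetical: two struct members overlaid in one union, so
    # a.conf and b.last_tag occupy the same bytes.
    _fields_ = [("a", A), ("b", B)]

h = Hwc()
h.a.conf = 0x1234
# Writing through one union member is visible through the other:
assert h.b.last_tag == 0x1234
```

This is why writing hwc->conf in the patch silently updates whatever field it happens to alias in the overlapping member.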
On 2020-01-30 7:49, Christian König wrote:
For the root PD mask can be 0x as well which would
overrun to 0 if we don't cast it before we add one.
You're fixing parentheses, not braces.
Parentheses: ()
Brackets: []
Braces: {}
With the title fixed, this patch is
Reviewed-by: Felix
On 2020-01-30 7:49, Christian König wrote:
That we can't find a PD above the root is expected can only happen if
we try to update a larger range than actually managed by the VM.
Signed-off-by: Christian König
Tested-by: Tom St Denis
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd
, this
patch is
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 23 ++-
1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 9705c961405b
-by: Christian König
Tested-by: Tom St Denis
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 35 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 3 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c| 7
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
,
compute_queue_count in pm_calc_rlib_size() is one more than the
actual compute queue number, because the queue_count has been
incremented while sdma_queue_count has not. This patch fixes that.
Change-Id: I20353e657efd505353d0dd9f7eb2fab5085e7202
Signed-off-by: Yong Zhao
Reviewed-by: Felix Kuehling
But I
On 2020-01-30 14:01, Bhardwaj, Rajneesh wrote:
Hello Felix,
Thanks for your time to review and for your feedback.
On 1/29/2020 5:52 PM, Felix Kuehling wrote:
Hi Rajneesh,
See comments inline ...
And a general question: Why do you need to set the autosuspend_delay
in so many places? Amdgpu
On 2020-01-30 7:49, Christian König wrote:
No matter what we always need to sync to moves.
Signed-off-by: Christian König
Tested-by: Tom St Denis
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 15 +++
1 file changed, 11 insertions(+), 4
On 2020-01-30 17:11, Alex Deucher wrote:
On Thu, Jan 30, 2020 at 4:55 PM Felix Kuehling wrote:
On 2020-01-30 14:01, Bhardwaj, Rajneesh wrote:
Hello Felix,
Thanks for your time to review and for your feedback.
On 1/29/2020 5:52 PM, Felix Kuehling wrote:
Hi Rajneesh,
See comments inline