and CU1.
Signed-off-by: Oak Zeng
Acked-by: Alex Deucher
Reviewed-by: Felix Kuehling
Signed-off-by: Alex Deucher
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0
A few small patches found during DKMS branch rebasing that were
missing from amd-staging-drm-next for no good reason.
Felix Kuehling (2):
drm/amdkfd: Fix comment formatting
drm/amdgpu: Add missing parameter description in comments
Oak Zeng (1):
drm/amdgpu: Changed CU reservation golden
t sweeps over its pfns array a couple of times anyhow.
>
> Signed-off-by: Jason Gunthorpe
> Signed-off-by: Christoph Hellwig
Hi Jason,
I pointed out a typo in the documentation inline. Other than that, the
series is
Acked-by: Felix Kuehling
I'll try to build it and run so
ioctl/botching-up-ioctls.html
>>
>> Alex
>>
>> ----
>> *From:* amd-gfx
>> <mailto:amd-gfx-boun...@lists.freedesktop.org> on behalf of Felix
>> Kuehling <mailto:felix.kuehl...@amd.
Am 2020-05-05 um 11:19 a.m. schrieb Christian König:
> Am 05.05.20 um 16:58 schrieb Felix Kuehling:
>> Am 2020-05-05 um 3:47 a.m. schrieb Christian König:
>>> Just to reply here once more, this patch is a clear NAK.
>>
>> Agreed. But see below. I don't think all is w
BO would hold one token reference to the TTM BO, which it can
drop when the GEM BO refcount drops to 0. Finally, the amdgpu BO should
only be freed once the TTM BO refcount also becomes 0.
Regards,
Felix
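
The reference-counting scheme described above can be modelled in a few lines of user-space C. This is a toy sketch only, not amdgpu or TTM code: every name in it is hypothetical, and it assumes the object named at the truncated start of the sentence is the GEM side. The GEM side owns exactly one token reference on the TTM side and drops it when its own count reaches 0. The object is freed only when the TTM count also reaches 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; struct, field and function names are hypothetical. */
struct toy_bo {
    int gem_refcount;   /* references held through the GEM object */
    int ttm_refcount;   /* references held on the TTM object */
    bool freed;         /* set once the whole object may be released */
};

static void toy_bo_init(struct toy_bo *bo)
{
    bo->gem_refcount = 1;
    bo->ttm_refcount = 1;   /* the single token reference owned by the GEM side */
    bo->freed = false;
}

/* Take an extra TTM reference, e.g. for a user that outlives the GEM handle. */
static void toy_ttm_get(struct toy_bo *bo)
{
    assert(bo->ttm_refcount > 0);
    bo->ttm_refcount++;
}

/* Drop a TTM reference; the object is freed only when this count hits 0. */
static void toy_ttm_put(struct toy_bo *bo)
{
    assert(bo->ttm_refcount > 0);
    if (--bo->ttm_refcount == 0)
        bo->freed = true;
}

/* Drop a GEM reference; the last one also drops the token TTM reference. */
static void toy_gem_put(struct toy_bo *bo)
{
    assert(bo->gem_refcount > 0);
    if (--bo->gem_refcount == 0)
        toy_ttm_put(bo);
}
```

With one extra TTM reference outstanding, the last `toy_gem_put()` leaves the object alive, and only the matching `toy_ttm_put()` frees it.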
>
> Regards,
> Christian.
>
> Am 01.05.20 um 16:44 schrieb Felix Kuehling:
-off-by: Felix Kuehling
Tested-by: Alex Sierra
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 1247938b1ec1
Am 2020-05-05 um 1:29 p.m. schrieb Felix Kuehling:
> Am 2020-05-05 um 11:19 a.m. schrieb Christian König:
>> Am 05.05.20 um 16:58 schrieb Felix Kuehling:
>>> Am 2020-05-05 um 3:47 a.m. schrieb Christian König:
>>>> Just to reply here once more, this patch is a clea
ts enablement from ioctl to fd write
>
> Signed-off-by: Amber Lin
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdkfd/Makefile | 1 +
> drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c | 2 +
> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 18 ++
I believe this should be squashed into Patch #8 or applied after patch
#8. Otherwise it creates a broken intermediate state where Arcturus
doesn't have any valid IH support. That said, it's probably less
critical because it only affects the case of direct (backdoor) firmware
loading.
How much overlap is there between arcturus_ih and navi10_ih? Given that
they both use the same register map, could they share the same driver
code with only minor differences?
If they're almost the same, maybe you could rename navi10_ih.[ch] to
osssys_v5_0.[ch] and use it for both navi10 and
That looks like a nice cleanup. Some nit-picks inline ...
On 2020-03-19 9:41, Christian König wrote:
Cleanup amdgpu_ttm_copy_mem_to_mem by using fewer variables
for the same value.
Rename amdgpu_map_buffer to amdgpu_ttm_map_buffer, move it
to avoid the forward declaration, cleanup by moving
is the subject "PATCH 1/6"? It makes me wonder, what are the other 5
patches. Anyway, this patch is
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
b/drivers/g
this description because the patch is no longer limited to
Arcturus.
One more comment inline. With those fixed, the patch is
Reviewed-by: Felix Kuehling
Signed-off-by: Alex Sierra
---
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 21 +
1 file changed, 17 insertions(+), 4 deletions
On 2020-03-20 10:06, Deucher, Alexander wrote:
[AMD Public Use]
This seems kind of complicated and error prone. I didn't realize the
extent of the changes required. I think it would be better to either
add arcturus specific versions of these functions or just go with your
original
On 2020-03-19 20:22, Alex Sierra wrote:
This macro calculates the IH ring register offset based on
the three ring numbers and asic type.
The parameters needed are the register's name without the prefix mmIH
and the ring number taken from RING0, RING1 or RING2 macros.
Signed-off-by: Alex Sierra
On 2020-03-20 10:39, Deucher, Alexander wrote:
[AMD Public Use]
I'm worried we'll miss a register by accident. We went with per IP
sub drivers to avoid handling complexities around IP differences if
possible. Also the scheme seems like kind of a one off compared to
what we do for other
documentation.
No functional change.
v2: add some more cleanup suggested by Felix
Signed-off-by: Christian König
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 269
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 4 +-
2 files changed, 135
ee one comment inline. With that fixed, the patch is
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 26 +++--
> 1 file changed, 19 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> b/driv
This fixes an intermittent bug where a root PD clear operation still in
progress could overwrite a PDE update done by the CPU, resulting in a
VM fault.
Fixes: 108b4d928c03 ("drm/amd/amdgpu: Update VM function pointer")
Reported-by: Jay Cornwall
Tested-by: Jay Cornwall
Signed-off
Am 2020-05-20 um 11:34 p.m. schrieb Evan Quan:
> Since the PCI bus number retrieved by PCI_BUS_NUM(pdev->devfn)
> is wrong.
>
> Change-Id: I882a8531a65cdf91be20e34a034aca1f43f658b4
> Signed-off-by: Evan Quan
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/am
Am 2020-05-21 um 9:50 a.m. schrieb Christian König:
> Am 21.05.20 um 00:51 schrieb Felix Kuehling:
>> This fixes an intermittent bug where a root PD clear operation still in
>> progress could overwrite a PDE update done by the CPU, resulting in a
>> VM fault.
>
Hi Mukul,
This looks pretty good. See some suggestions inline.
Am 2020-05-14 um 4:33 p.m. schrieb Mukul Joshi:
> Track SDMA usage on a per process basis and report it through sysfs.
> The value in the sysfs file indicates the amount of time SDMA has
> been in-use by this process since the
O user pages if MMU interval notifier is gone.
>
> Signed-off-by: Philip Yang
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgp
On 2020-09-01 11:21 a.m., Li, Dennis wrote:
[AMD Official Use Only - Internal Distribution Only]
Hi, Felix,
If the GPU hangs, execute_queues_cpsch will fail to unmap or map queues, and
create_queue_cpsch will then return an error. If pqm_create_queue finds that
create_queue_cpsch failed, it will call
Should there be a corresponding change in mmhub_v2_0.c?
Other than that, the series is
Reviewed-by: Felix Kuehling
On 2020-09-01 5:51 p.m., Alex Deucher wrote:
Print the name of the client rather than the number. This
makes it easier to debug what block is causing the fault.
Signed-off
adev->irq.ih1.doorbell_index = (adev->doorbell_index.ih + 1) << 1;
> + if (adev->asic_type == CHIP_ARCTURUS) {
This may apply to the Arcturus successor as well. I'd use asic_type <
NAVI10 instead, to be future-proof.
With these two issues fixed, the patch is
Reviewed-by: Felix K
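
The difference between the exact match and the suggested range check can be shown with a toy enum. This is a sketch under stated assumptions: the enum below is hypothetical, and its ordering (compute parts listed before the Navi10 entry) is assumed for illustration rather than taken from the driver headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy enum; names and ordering are assumptions made for illustration. */
enum toy_asic_type {
    TOY_CHIP_VEGA20,
    TOY_CHIP_ARCTURUS,
    TOY_CHIP_ARCTURUS_SUCCESSOR,   /* hypothetical future chip */
    TOY_CHIP_NAVI10,
    TOY_CHIP_NAVI14,
};

/* Exact match: has to be edited whenever a successor chip is added. */
static bool matches_exact(enum toy_asic_type t)
{
    return t == TOY_CHIP_ARCTURUS;
}

/* Range check: covers every entry listed before TOY_CHIP_NAVI10. */
static bool matches_range(enum toy_asic_type t)
{
    assert(t <= TOY_CHIP_NAVI14);
    return t < TOY_CHIP_NAVI10;
}
```

The range check trades precision for forward compatibility: in this toy model it also matches `TOY_CHIP_VEGA20`, so it only fits when every earlier entry should take the same path.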
Am 2020-09-02 um 2:13 p.m. schrieb Alex Deucher:
> On Wed, Sep 2, 2020 at 2:08 PM Alex Deucher wrote:
>> On Wed, Sep 2, 2020 at 1:01 PM Alex Sierra wrote:
>>> Enable multi-ring ih1 and ih2 for Arcturus only.
>>> For Navi10 family multi-ring has been disabled.
>>> Apparently, having multi-ring
Am 2020-09-02 um 2:08 p.m. schrieb Alex Deucher:
> On Wed, Sep 2, 2020 at 1:01 PM Alex Sierra wrote:
>> Enable multi-ring ih1 and ih2 for Arcturus only.
>> For Navi10 family multi-ring has been disabled.
>> Apparently, having multi-ring enabled in Navi was causing
>> continuous page fault
"drm/amdkfd: fix set kfd node ras properties value")
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
b/drivers/gpu/drm/amd/amdkfd/kfd_topology.
Am 2020-09-10 um 2:54 p.m. schrieb Philip Cox:
> Reduce the eviction and restore messages from INFO level to DEBUG level.
>
> Signed-off-by: Philip Cox
This patch is
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 8
>
Am 2020-09-10 um 2:54 p.m. schrieb Philip Cox:
> Add per-process eviction counters to sysfs to keep track of
> how many eviction events have happened for each process.
>
> Signed-off-by: Philip Cox
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 15 ++-
>
Am 2020-09-10 um 2:54 p.m. schrieb Philip Cox:
> Extending the module parameter debug_evictions to also print a stack
> trace when the eviction code path is called.
>
> Signed-off-by: Philip Cox
This patch is
Reviewed-by: Felix Kuehling
> ---
> drivers/
Am 2020-09-11 um 12:27 p.m. schrieb Mukul Joshi:
> SDMA utilization calculations are enabled/disabled by
> writing to SDMAx_PUB_DUMMY_REG2 register. Currently,
> enable this only for Arcturus.
>
> Signed-off-by: Mukul Joshi
> ---
> drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 10 ++
> 1 file
ccess. And then kfd_process_evict_queues will
> access a freed memory, which cause a system crash.
>
> v2:
> The failure to execute_queues should probably not be reported to
> the caller of create_queue, because the queue was already created.
... and the failure affects all processes in t
is is probably expected, and imperceptible to the standard user,
> I thought I'd at least mention it in an effort to keep contributing.
>
>
>
> Thanks for the continued open source work. You all make my life.
>
>
>
> Cheers,
>
> Matt
>
> On 9/3/20 2:05 AM, Christian Kö
he suspend
> stage of GPU recovery.
>
> Signed-off-by: Dennis Li
Reviewed-by: Felix Kuehling
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
> index 069ba4be1e8f..20ef048d6a03 10
l Joshi
Two nit-picks inline. With those fixed, the patch is
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 30 +--
> drivers/gpu/drm/amd/amdkfd/kfd_device.c | 3 ++
> .../drm/amd/amdkfd/kfd_device_queue_manager.c | 3 +-
> drive
_amdkfd_gpuvm_alloc_memory_of_gpu+0x9ca/0xb10
> [amdgpu]
> [ 420.990666] kfd_ioctl_alloc_memory_of_gpu+0xef/0x2c0 [amdgpu]
>
> Signed-off-by: Philip Yang
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 1 +
> 1 file changed, 1 insertion(+)
Am 2020-09-14 um 11:42 a.m. schrieb Alex Deucher:
> Copy paste typo.
>
> Reported-by: kernel test robot
> Signed-off-by: Alex Deucher
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a
I'm not sure how the bug you're fixing is caused, but your fix is
clearly in the wrong place.
A queue being disabled is not the same thing as a queue being destroyed.
Queues can be disabled for legitimate reasons, but they still should
exist and be in the qpd->queues_list.
If a destroyed queue
ven_device_info = {
> ^
>
> As Huang Rui suggested, Raven already has the fallback path,
> so it should be out of IOMMU v2 flag.
>
> Suggested-by: Huang Rui
> Signed-off-by: YueHaibing
Reviewed-by: Felix Kuehling
I applied yo
Am 2020-09-11 um 4:10 p.m. schrieb Philip Cox:
> Add per-process eviction counters to sysfs to keep track of
> how many eviction events have happened for each process.
>
> v2: rename the stats dir, and track all evictions per process, per device.
>
> Signed-off-by: Philip Cox
Some more comments
Am 2020-09-11 um 5:33 p.m. schrieb Mukul Joshi:
> SDMA utilization calculations are enabled/disabled by
> writing to SDMAx_PUB_DUMMY_REG2 register. Currently,
> enable this only for Arcturus.
>
> Signed-off-by: Mukul Joshi
Reviewed-by: Felix Kuehling
> ---
> drive
FYI, I applied this patch to amd-staging-drm-next. Sorry for the delay,
I finally caught up with my vacation backlog.
Regards,
Felix
Am 2020-09-22 um 10:28 p.m. schrieb kernel test robot:
> Fixes: 0b54e1e30e9f ("drm/amdgpu: Fix handling of KFD initialization
> failures")
> Signed-off-by:
Do you have more details about those test failures? In theory that test
should pass with noretry=0. If it fails, I'd rather look into the
problem than hiding it with a workaround.
Regards,
Felix
Am 2020-10-13 um 11:13 a.m. schrieb Chengming Gui:
> noretry = 0 cause some dGPU's kfd page fault
Am 2020-10-14 um 11:35 p.m. schrieb Chengming Gui:
> noretry = 0 cause some dGPU's kfd page fault tests fail,
> so set noretry to 1 for these special ASICs:
> vega20/navi10/navi14/ARCTURUS
>
> v2: merge raven and default case due to the same setting
> v3: remove ARCTURUS
>
> Signed-off-by:
Am 2020-10-16 um 11:34 a.m. schrieb Nirmoy Das:
> Compute queues are configurable with module param, num_kcq.
> amdgpu_gfx_is_high_priority_compute_queue was setting 1st 4 queues to
> high priority queue leaving a null drm scheduler in
> adev->gpu_sched[hw_ip]["normal_prio"].sched if num_kcq < 5
consistency.
>
> Signed-off-by: Alex Deucher
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 11 +++
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h| 1 +
> drivers/gpu/dr
Am 2020-10-19 um 10:02 a.m. schrieb Jay Cornwall:
> 0 causes instruction fetch stall at cache line boundary under some
> conditions on Navi10. A non-zero prefetch is the preferred default
> in any case.
>
> Fixes soft hang in Luxmark.
>
> Signed-off-by: Jay Cornwall
Review
Signed-off-by: Christian König
> Reviewed-by: Madhav Chauhan (v1)
I guess the speedup comes from the locking/prepare/commit overhead in
amdgpu_vm_update_mapping that is now getting called less frequently and
does more work in a single call.
Reviewed-by: Felix Kuehling
> ---
> drive
nction if CRAT is broken.
> v5: refine acpi crat good but no iommu support case, and rename the
> title.
> v6: fix the issue of dGPU initialized firstly, just modify the report
> value in the node_show().
>
> Signed-off-by: Huang Rui
Reviewed-by: Felix Kuehling
> ---
>
Am 2020-08-17 um 4:45 p.m. schrieb Mukul Joshi:
> Add __user annotation to fix related sparse warning while reading
> SDMA counters from userland.
>
> Reported-by: kernel test robot
> Signed-off-by: Mukul Joshi
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 8 +---
> 1
Sorry, more bike-shedding.
Am 2020-08-17 um 7:58 p.m. schrieb Mukul Joshi:
> Add __user annotation to fix related sparse warning while reading
> SDMA counters from userland.
>
> Reported-by: kernel test robot
> Signed-off-by: Mukul Joshi
> ---
>
Am 2020-08-18 um 9:09 a.m. schrieb Huang Rui:
> We still have a few iommu issues which need to address, so force raven
> as "dgpu" path for the moment.
>
> This is to add the fallback path to bypass IOMMU if IOMMU v2 is disabled
> or ACPI CRAT table not correct.
>
> v2: use ignore_crat parameter
I'd recommend making this the first change in the series. Make
'drm/amdkfd: force raven as "dgpu" path' the second patch. That way it
only needs to change one place.
A few more comments inline.
Am 2020-08-18 um 9:09 a.m. schrieb Huang Rui:
> It's better to use inline function to wrap the iommu
Interesting. Does this actually work on Carrizo or Kaveri? I'd like to
see any Thunk changes needed to support this before giving my R-b. For
now this patch is
Acked-by: Felix Kuehling
Am 2020-08-18 um 9:09 a.m. schrieb Huang Rui:
> We already support the fallback path, so it doesn't n
d-off-by: Mukul Joshi
Reviewed-by: Felix Kuehling
> ---
> .../drm/amd/amdkfd/kfd_device_queue_manager.c | 28 ++-
> .../drm/amd/amdkfd/kfd_device_queue_manager.h | 8 +-
> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 6 ++--
> 3 files changed, 12 insertion
Am 2020-08-17 um 1:05 p.m. schrieb Mukul Joshi:
> To prevent reporting erroneous SDMA usage, initialize SDMA
> activity counter to 0 before using.
>
> Signed-off-by: Mukul Joshi
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 1 +
> 1 file
Am 2020-08-28 um 1:53 p.m. schrieb Mukul Joshi:
> Add support for reporting GPU reset events through SMI. KFD
> would report both pre and post GPU reset events.
>
> Signed-off-by: Mukul Joshi
Minor coding-style nit-picks inline. With those fixed, this patch is
Reviewed-by: Fe
Am 2020-08-28 um 8:38 p.m. schrieb Mukul Joshi:
> Replace spaces with Tabs to fix indentation in kfd_smi_event
> enum.
>
> Signed-off-by: Mukul Joshi
Reviewed-by: Felix Kuehling
> ---
> include/uapi/linux/kfd_ioctl.h | 6 +++---
> 1 file changed, 3 insertions(+), 3
Just for Carrizo, HDP flushing doesn't make a lot of sense because we
don't use HDP to access the framebuffer.
The code you're changing doesn't look Carrizo-specific, but VI-specific.
So it would affect Fiji and Polaris as well. We already support Fiji and
Polaris dGPUs with KFD, apparently
Am 2020-08-21 um 8:50 a.m. schrieb Huang Rui:
> We still have a few iommu issues which need to address, so force raven
> as "dgpu" path for the moment.
>
> This is to add the fallback path to bypass IOMMU if IOMMU v2 is disabled
> or ACPI CRAT table not correct.
>
> v2: Use ignore_crat parameter
Am 2020-08-21 um 9:42 p.m. schrieb Huang Rui:
> On Sat, Aug 22, 2020 at 12:48:00AM +0800, Kuehling, Felix wrote:
>> Am 2020-08-21 um 8:50 a.m. schrieb Huang Rui:
>>> We still have a few iommu issues which need to address, so force raven
>>> as "dgpu" path for the moment.
>>>
>>> This is to add
Am 2020-08-20 um 11:53 p.m. schrieb Huang Rui:
> On Fri, Aug 21, 2020 at 10:41:17AM +0800, Kuehling, Felix wrote:
>> Am 2020-08-20 um 4:40 a.m. schrieb Huang Rui:
>>> We still have a few iommu issues which need to address, so force raven
>>> as "dgpu" path for the moment.
>>>
>>> This is to add
Am 2020-08-19 um 11:01 a.m. schrieb Huang Rui:
> On Wed, Aug 19, 2020 at 10:36:05PM +0800, Kuehling, Felix wrote:
>> Just for Carrizo, HDP flushing doesn't make a lot of sense because we
>> don't use HDP to access the framebuffer.
> OK, so soc15 and later need use HDP to access the framebuffer
Am 2020-08-19 um 7:06 a.m. schrieb Huang Rui:
> We still have a few iommu issues which need to address, so force raven
> as "dgpu" path for the moment.
>
> This is to add the fallback path to bypass IOMMU if IOMMU v2 is disabled
> or ACPI CRAT table not correct.
>
> v2: Use ignore_crat parameter
On 2020-08-19 7:56 p.m., Huang Rui wrote:
On Wed, Aug 19, 2020 at 11:38:34PM +0800, Kuehling, Felix wrote:
Am 2020-08-19 um 7:06 a.m. schrieb Huang Rui:
We still have a few iommu issues which need to address, so force raven
as "dgpu" path for the moment.
This is to add the fallback path to
[+amd-gfx]
Am 2020-08-19 um 12:15 p.m. schrieb Ba, Gang:
> [AMD Official Use Only - Internal Distribution Only]
>
> Hi Felix,
>
> For write point update, what is the best way to do:
Do you mean read pointer update? The hardware updates the write pointer,
the driver updates the read pointer.
>
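
The split of responsibilities stated above (the producer advances the write pointer, the consumer advances the read pointer) can be sketched as a generic ring buffer in user-space C. This is an illustration only: the struct, the ring size and the function names are hypothetical, and a plain function stands in for the hardware.

```c
#include <assert.h>
#include <stdint.h>

#define RING_ENTRIES 8u   /* hypothetical size; must be a power of two */

struct toy_ring {
    uint32_t entries[RING_ENTRIES];
    uint32_t wptr;   /* advanced only by the producer (the hardware) */
    uint32_t rptr;   /* advanced only by the consumer (the driver) */
};

/* Producer side, standing in for the hardware. Returns 0 if the ring is full. */
static int toy_ring_produce(struct toy_ring *r, uint32_t value)
{
    assert(r);
    if (r->wptr - r->rptr == RING_ENTRIES)
        return 0;
    r->entries[r->wptr & (RING_ENTRIES - 1)] = value;
    r->wptr++;
    return 1;
}

/* Consumer side, standing in for the driver: process entries until the
 * read pointer catches up with the write pointer. Returns the number of
 * entries consumed and adds their values to *sum. */
static unsigned toy_ring_drain(struct toy_ring *r, uint32_t *sum)
{
    unsigned n = 0;

    assert(r && sum);
    while (r->rptr != r->wptr) {
        *sum += r->entries[r->rptr & (RING_ENTRIES - 1)];
        r->rptr++;
        n++;
    }
    return n;
}
```

Each side writes only its own pointer and merely reads the other one, which is what lets a producer and a consumer share the ring without both updating the same field.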
Am 2020-08-19 um 11:09 p.m. schrieb Huang Rui:
> On Thu, Aug 20, 2020 at 08:18:57AM +0800, Kuehling, Felix wrote:
>> On 2020-08-19 7:56 p.m., Huang Rui wrote:
>>> On Wed, Aug 19, 2020 at 11:38:34PM +0800, Kuehling, Felix wrote:
Am 2020-08-19 um 7:06 a.m. schrieb Huang Rui:
> We still
Am 2020-08-20 um 4:40 a.m. schrieb Huang Rui:
> We still have a few iommu issues which need to address, so force raven
> as "dgpu" path for the moment.
>
> This is to add the fallback path to bypass IOMMU if IOMMU v2 is disabled
> or ACPI CRAT table not correct.
>
> v2: Use ignore_crat parameter
MMU_NOTIFY_PROTECTION_VMA is not specific to NUMA auto-balancing. It can
also be the result of an mprotect system call which actually makes the
VMA read-only. I don't think it's OK to ignore that notifier in the
general case.
Regards,
Felix
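
For reference, the kind of user-space call the message refers to looks like the sketch below: a mapping that really is made read-only with `mprotect`, unrelated to NUMA auto-balancing. The helper name is hypothetical, and the sketch shows only the system call, not the kernel notifier path.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map one anonymous page, write to it, then make it
 * read-only with mprotect(). Returns 0 on success, -1 on failure. */
static int make_page_read_only(void)
{
    long page = sysconf(_SC_PAGESIZE);
    char *p;
    int ok;

    if (page <= 0)
        return -1;

    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 42;   /* writable at this point */

    /* Drop write permission; reads remain valid, writes would now fault. */
    ok = (mprotect(p, (size_t)page, PROT_READ) == 0) && (p[0] == 42);

    munmap(p, (size_t)page);
    return ok ? 0 : -1;
}
```

A device that keeps its own writable mapping of such a page after this call would bypass the new protection, which is why the notification cannot be ignored in the general case.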
Am 2020-08-19 um 2:00 p.m. schrieb Philip Yang:
>
Am 2020-08-26 um 4:01 p.m. schrieb Mukul Joshi:
> Add support for reporting GPU reset events through SMI. KFD
> would report both pre and post GPU reset events.
>
> Signed-off-by: Mukul Joshi
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_device.c | 4 +++
> drivers/gpu/drm/amd/amdkfd/kfd_priv.h
No need to use a function pointer because the implementation is not
ASIC-specific.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 1 -
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c | 1 -
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c | 1
No need to use a function pointer because the implementation is not
ASIC-specific. This fixes missing support due to a missing function
pointer on Arcturus.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c | 1 -
drivers/gpu/drm/amd/amdgpu
The KFD part looks good to me, other than the SDMA comment that Guchun
pointed out. With that fixed this patch is
Acked-by: Felix Kuehling
Thanks,
Felix
Am 2020-08-24 um 6:33 a.m. schrieb Stanley.Yang:
> The ctx->features are new RAS implementation which
> is only available f
Am 2020-08-20 um 5:38 a.m. schrieb Huang Rui:
> On Thu, Aug 20, 2020 at 08:31:25AM +0800, Huang Rui wrote:
>> On Thu, Aug 20, 2020 at 08:18:57AM +0800, Kuehling, Felix wrote:
>>> On 2020-08-19 7:56 p.m., Huang Rui wrote:
On Wed, Aug 19, 2020 at 11:38:34PM +0800, Kuehling, Felix wrote:
>
Am 2020-09-29 um 7:31 a.m. schrieb Kent Russell:
> The VG20 DIDs 66a0, 66a1 and 66a4 are used for various SKUs that may or may
> not have the FRU EEPROM on it. Parse the VBIOS to check for server SKU
> variants (D131 or D134) until a more general solution can be determined.
>
> Signed-off-by:
: Remove string-based logic, correct the VBIOS string comment
>
> Signed-off-by: Kent Russell
Reviewed-by: Felix Kuehling
> ---
> .../gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c| 34 +--
> 1 file changed, 23 insertions(+), 11 deletions(-)
>
> dif
The series is
Acked-by: Felix Kuehling
I'm hoping Laurent can give it a more thorough and informed R-b.
Thanks,
Felix
Am 2020-10-01 um 2:24 p.m. schrieb Jay Cornwall:
> ATC and MTYPE fields do not exist in gfx9 or later.
>
> Signed-off-by: Jay Cornwall
> Cc: Lauren
lso need kvfree to free the
memory. For consistency, use kvmalloc for GPU CRAT allocation, too, and
replace the kmemdup in kfd_create_crat_image_acpi with kvmalloc+memcpy.
Let's make that kmalloc->kvmalloc change a second commit.
This patch is
Reviewed-by: Felix Kuehling
Regards
If I had to guess, I'd say something HMM-related. There has been some
back-and-forth between kernel releases. So I won't say anything more
specific without knowing exactly which branch or release you're on.
Regards,
Felix
Am 2020-09-25 um 10:29 a.m. schrieb Thomas Zimmermann:
> Hi,
>
>
[+Alex]
I think this was added for Arcturus, which shares the same IH IP as
Navi10 and needs to support virtualization.
Regards,
Felix
Am 2020-09-25 um 7:30 a.m. schrieb Zhang, Hawking:
> [AMD Public Use]
>
> Hi Likun,
>
> Let's take a step back to check with Alex S why he add the ASIC type
hen
> they can be added to this structure, but I don't think the eviction stats
> should be.
Thanks. Makes sense. I expect that all the stats will need the PDD. They
are all per-process, per-device stats.
With the small nit-picks fixed, the patch is
Reviewed-by: Felix Kuehling
Regards,
| snprintf(buffer, PAGE_SIZE, "%s"fmt, buffer, __VA_ARGS__)
| ^
This patch fixes the warnings and makes the sysfs code more efficient
by remembering the offset in the buffer between append operations.
Signed-off-by: Feli
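
The warning quoted above comes from passing the destination buffer to `snprintf` as one of its own `%s` arguments, which makes source and destination overlap. A minimal user-space sketch of the offset-based alternative follows. It is not the KFD macro: the helper name `buf_append` and its signature are hypothetical, and it only illustrates remembering the offset between append operations.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format into buf starting at offset and return the
 * new offset. The result never exceeds size - 1, so it stays a valid
 * offset for the next call. */
static size_t buf_append(char *buf, size_t size, size_t offset,
                         const char *fmt, ...)
{
    va_list args;
    int n;

    assert(buf && size > 0 && offset < size);

    va_start(args, fmt);
    n = vsnprintf(buf + offset, size - offset, fmt, args);
    va_end(args);

    if (n < 0)
        return offset;            /* formatting error: keep the old offset */
    if ((size_t)n >= size - offset)
        return size - 1;          /* output was truncated to fit */
    return offset + (size_t)n;
}
```

Each call writes only the new text at the remembered position, so earlier output is never re-read or re-copied, and the buffer is never both the source and the destination of one call.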
check")
Fixes: 522b89c63370 ("drm/amdkfd: Track SDMA utilization per process")
Signed-off-by: Colin Ian King
Reviewed-by: Felix Kuehling
I applied the patch to our internal amd-staging-drm-next.
Regards,
Felix
---
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 +++--
ion of past acitivt counter under dqm_lock.
Typo: activity
Other than that, the patch is
Reviewed-by: Felix Kuehling
>
> Signed-off-by: Mukul Joshi
> ---
> .../drm/amd/amdkfd/kfd_device_queue_manager.c | 57
> .../drm/amd/amdkfd/kfd_device_queue_manager.h | 2
Am 2020-05-22 um 3:38 p.m. schrieb Simon Ser:
> drm_gem_object_put_unlocked has been renamed to drm_gem_object_put.
Alex, I guess you'll need to apply this patch when you include
e07ddb0ce7cd in a pull request to Dave Airlie. I don't think it makes
sense to apply this on amd-kfd-staging until the
Hi Fenghua,
The PASID width in KFD is currently limited to 16 bits. I believe this
reflects what our hardware can handle. KFD will never allocate a PASID
bigger than 16 bits. That said, I'm OK with changing this field in the
kfd_process structure to unsigned int. Generally, I find uint16_t in
ven when KFD failed.
Signed-off-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 11 ++-
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 1 +
drivers/gpu/drm/amd/amdkfd/kfd_module.c| 1 +
3 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/
Some nit-picks and one more possible simplification inline. I want to
make adding more stats later as painless as possible.
Looks good otherwise.
Am 2020-09-16 um 2:42 p.m. schrieb Philip Cox:
> Add per-process eviction counters to sysfs to keep track of
> how many eviction events have happened
So rocm-smi reads the decimal and converts it to hex? Then changing KFD
will break rocm-smi. If you want to fix rocminfo, you'll need to fix it
in the rocminfo code to do the conversion to hex.
Regards,
Felix
Am 2020-10-28 um 12:02 p.m. schrieb Russell, Kent:
> [AMD Public Use]
>
> rocminfo
Please also remove the broken code that initializes gpu->unique_id and
remove the unique_id field from the structure.
Regards,
Felix
Am 2020-10-28 um 11:22 a.m. schrieb Kent Russell:
> Since the unique_id is now obtained in amdgpu in smu_late_init,
> topology's device addition is now happening
This is an ABI-breaking change. Is any user mode code using this already?
Regards,
Felix
Am 2020-10-28 um 11:22 a.m. schrieb Kent Russell:
> amdgpu's unique_id prints in hex format, so change topology's printout
> to hex by adding a new sysfs_print macro specifically for hex output,
> and use
Am 2020-07-21 um 5:01 p.m. schrieb Mukul Joshi:
> Add support for reporting thermal throttling events through SMI.
> Also, add a counter to count the number of throttling interrupts
> observed and report the count in the SMI event message.
>
> Signed-off-by: Mukul Joshi
> ---
>
Am 2020-08-04 um 4:42 p.m. schrieb Philip Yang:
> If multiple process share system memory through /dev/shm, KFD allocate
> memory should not fail if it reaches the system memory limit because
> one copy of physical system memory are shared by multiple process.
>
> Add module parameter
> because system memory reaches limit.
>
> Signed-off-by: Philip Yang
Reviewed-by: Felix Kuehling
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++
> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 6 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
Am 2020-08-07 um 2:57 a.m. schrieb Christian König:
[snip]
> That's a really good argument, but I still hesitate to merge this
> patch. How severe is the lockdep splat?
I argued before that any lockdep splat is bad, because it disables
further lockdep checking and can hide other lockdep problems
The commit headline is misleading. An annotation would be something like
replacing mutex_lock with mutex_lock_nested. You're not annotating
anything, you're actually changing the locking.
Am 2020-08-05 um 9:24 p.m. schrieb Dennis Li:
> [ 264.483189]
Am 2020-08-07 um 4:25 a.m. schrieb Huang Rui:
> We still have a few iommu issues which need to address, so force raven
> as "dgpu" path for the moment.
>
> Will enable IOMMUv2 since the issues are fixed.
Do you mean "_when_ the issues are fixed"?
The current iommuv2 troubles aside, I think this