Re: [PATCH v14 2/9] platform/x86/amd: Add support for AMD ACPI based Wifi band RFI mitigation feature

2023-12-04 Thread Ma, Jun
Hi Mario,

On 11/30/2023 1:24 AM, Mario Limonciello wrote:
> On 11/29/2023 03:13, Ma Jun wrote:
>> Due to electrical and mechanical constraints in certain platform designs,
>> there may be interference between relatively high-powered harmonics of
>> the (G-)DDR memory clocks and the local radio module frequency bands used
>> by Wifi 6/6e/7.
>>
>> To mitigate this, AMD has introduced a mechanism that devices can use to
>> notify active use of particular frequencies so that other devices can make
>> relative internal adjustments as necessary to avoid this resonance.
>>
>> Co-developed-by: Evan Quan 
>> Signed-off-by: Evan Quan 
>> Signed-off-by: Ma Jun 
> 
> I have one very minor nit below that if no other feedback is needed for 
> the series can be fixed when committing.
> 
> Reviewed-by: Mario Limonciello 
> 
>>
>> --
>> v11:
>>   - Fix typo (Simon)
>> v12:
>>   - Fix the code logic (Rafael)
>>   - Move amd_wbrf.c to drivers/platform/x86/amd/wbrf.c
>>   - Updated Evan's email because he's no longer at AMD. Thanks for
>> his work in earlier versions.
>> v13:
>>   - Fix the format issue (Ilpo Jarvinen)
>>   - Add comment for some functions
>> v14:
>>   - Use acpi_check_dsm and acpi_evaluate_dsm (Hans de Goede)
>> ---
>>   drivers/platform/x86/amd/Kconfig  |  15 ++
>>   drivers/platform/x86/amd/Makefile |   1 +
>>   drivers/platform/x86/amd/wbrf.c   | 337 ++
>>   include/linux/acpi_amd_wbrf.h |  94 +
>>   4 files changed, 447 insertions(+)
>>   create mode 100644 drivers/platform/x86/amd/wbrf.c
>>   create mode 100644 include/linux/acpi_amd_wbrf.h
>>
>> diff --git a/drivers/platform/x86/amd/Kconfig 
>> b/drivers/platform/x86/amd/Kconfig
>> index 55f3a2fc6aec..ac2b7758a04f 100644
>> --- a/drivers/platform/x86/amd/Kconfig
>> +++ b/drivers/platform/x86/amd/Kconfig
>> @@ -18,3 +18,18 @@ config AMD_HSMP
>>   
>>If you choose to compile this driver as a module the module will be
>>called amd_hsmp.
>> +
>> +config AMD_WBRF
>> +bool "AMD Wifi RF Band mitigations (WBRF)"
>> +depends on ACPI
>> +default n
> 
> I believe the "default n" is unnecessary as it's implied.

Thanks for the review. I'll fix this and the other issues you commented
on in the next version.

Regards,
Ma Jun

> 
>> +help
>> +  WBRF(Wifi Band RFI mitigation) mechanism allows Wifi drivers
>> +  to notify the frequencies they are using so that other hardware
>> +  can be reconfigured to avoid harmonic conflicts.
>> +
>> +  AMD provides an ACPI based mechanism to support WBRF on platforms with
>> +  appropriate underlying support.
>> +
>> +  This mechanism will only be activated on platforms that advertise a
>> +  need for it.
>> diff --git a/drivers/platform/x86/amd/Makefile 
>> b/drivers/platform/x86/amd/Makefile
>> index f04932b7a7d1..dcec0a46f8af 100644
>> --- a/drivers/platform/x86/amd/Makefile
>> +++ b/drivers/platform/x86/amd/Makefile
>> @@ -8,3 +8,4 @@ obj-$(CONFIG_AMD_PMC)+= pmc/
>>   amd_hsmp-y := hsmp.o
>>   obj-$(CONFIG_AMD_HSMP) += amd_hsmp.o
>>   obj-$(CONFIG_AMD_PMF)  += pmf/
>> +obj-$(CONFIG_AMD_WBRF)  += wbrf.o
>> diff --git a/drivers/platform/x86/amd/wbrf.c 
>> b/drivers/platform/x86/amd/wbrf.c
>> new file mode 100644
>> index ..36e90a1159be
>> --- /dev/null
>> +++ b/drivers/platform/x86/amd/wbrf.c
>> @@ -0,0 +1,337 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Wifi Frequency Band Manage Interface
>> + * Copyright (C) 2023 Advanced Micro Devices
>> + */
>> +
>> +#include 
>> +#include 
>> +
>> +#define ACPI_AMD_WBRF_METHOD "\\WBRF"
>> +
>> +/*
>> + * Functions bit vector for WBRF method
>> + *
>> + * Bit 0: WBRF supported.
>> + * Bit 1: Function 1 (Add / Remove frequency) is supported.
>> + * Bit 2: Function 2 (Get frequency list) is supported.
>> + */
>> +#define WBRF_ENABLED0x0
>> +#define WBRF_RECORD 0x1
>> +#define WBRF_RETRIEVE   0x2
>> +
>> +#define WBRF_REVISION   0x1
>> +
>> +/*
>> + * The data structure used for WBRF_RETRIEVE is not naturally aligned.
>> + * And unfortunately the design has been settled down.
>> + */
>> +struct amd_wbrf_ranges_out {
>> +u32 num_of_ranges;
>> +struct freq_band_range  band_list[MAX_NUM_OF_WBRF_RANGES];
>> +} __packed;
>> +
>> +static const guid_t wifi_acpi_dsm_guid =
>> +GUID_INIT(0x7b7656cf, 0xdc3d, 0x4c1c,
>> +  0x83, 0xe9, 0x66, 0xe7, 0x21, 0xde, 0x30, 0x70);
>> +
>> +/*
>> + * Used to notify the consumer (currently the amdgpu driver)
>> + * when the wifi frequency bands change.
>> + */
>> +static BLOCKING_NOTIFIER_HEAD(wbrf_chain_head);
>> +
>> +static int wbrf_record(struct acpi_device *adev, uint8_t action, struct 
>> wbrf_ranges_in_out *in)
>> +{
>> +union acpi_object argv4;
>> +union acpi_object *tmp;
>> +union acpi_object *obj;
>> +u32 num_of_ranges = 0;
>> +u32 num_of_elements;
>> +u32 

Re: [PATCH v14 2/9] platform/x86/amd: Add support for AMD ACPI based Wifi band RFI mitigation feature

2023-12-04 Thread Ma, Jun
Hi Hans,

On 12/4/2023 9:00 PM, Hans de Goede wrote:
> Hi,
> 
> On 11/29/23 10:13, Ma Jun wrote:
>> Due to electrical and mechanical constraints in certain platform designs,
>> there may be interference between relatively high-powered harmonics of
>> the (G-)DDR memory clocks and the local radio module frequency bands used
>> by Wifi 6/6e/7.
>>
>> To mitigate this, AMD has introduced a mechanism that devices can use to
>> notify active use of particular frequencies so that other devices can make
>> relative internal adjustments as necessary to avoid this resonance.
>>
>> Co-developed-by: Evan Quan 
>> Signed-off-by: Evan Quan 
>> Signed-off-by: Ma Jun 
>>
>> --
>> v11:
>>  - Fix typo (Simon)
>> v12:
>>  - Fix the code logic (Rafael)
>>  - Move amd_wbrf.c to drivers/platform/x86/amd/wbrf.c
>>  - Updated Evan's email because he's no longer at AMD. Thanks for
>> his work in earlier versions.
>> v13:
>>  - Fix the format issue (Ilpo Jarvinen)
>>  - Add comment for some functions
>> v14:
>>  - Use acpi_check_dsm and acpi_evaluate_dsm (Hans de Goede)
> 
> Thank you, this is much better.
> 
> I notice that the #define ACPI_AMD_WBRF_METHOD "\\WBRF"
> still exists though and that this is still used in
> static bool acpi_amd_wbrf_supported_system(void).
> 
> I think it might be better to just remove
> these 2 altogether.
> 
> Checking if a DSM with the expected GUID is present
> and if that has the correct bits set in its supported
> mask should be enough.
> 
> And on future systems the implementer may decide to
> not have a WBRF helper function at all and instead
> handle everything in the _DSM method.
> 
> So the "\\WBRF" check seems to be checking for
> what really is an implementation detail.
> 
Yes, you are right. I will fix these issues.
Thanks for your review and suggestions.

Regards,
Ma Jun
> Two other very small remarks:
> 
>> +/**
>> + * acpi_amd_wbrf_supported_producer - determine if the WBRF can be enabled
>> + *for the device as a producer
>> + *
>> + * @dev: device pointer
>> + *
>> + * Check if the platform is equipped with the necessary implementations to
>> + * support WBRF for the device as a producer.
>> + *
>> + * Return:
>> + * true if WBRF is supported, otherwise returns false
>> + */
>> +bool acpi_amd_wbrf_supported_producer(struct device *dev)
>> +{
>> +struct acpi_device *adev;
>> +
>> +adev = ACPI_COMPANION(dev);
>> +if (!adev)
>> +return false;
>> +
>> +if (!acpi_amd_wbrf_supported_system())
>> +return false;
>> +
>> +
>> +return acpi_check_dsm(adev->handle, &wifi_acpi_dsm_guid,
>> +  WBRF_REVISION, BIT(WBRF_RECORD));
>> +}
>> +EXPORT_SYMBOL_GPL(acpi_amd_wbrf_supported_producer);
> 
> Please don't use double empty lines, one empty line to separate things
> is enough.
> 
>> +
>> +/**
>> + * acpi_amd_wbrf_supported_consumer - determine if the WBRF can be enabled
>> + *for the device as a consumer
>> + *
>> + * @dev: device pointer
>> + *
>> + * Determine if the platform is equipped with the necessary implementations
>> + * to support WBRF for the device as a consumer.
>> + *
>> + * Return:
>> + * true if WBRF is supported, otherwise returns false.
>> + */
>> +bool acpi_amd_wbrf_supported_consumer(struct device *dev)
>> +{
>> +struct acpi_device *adev;
>> +
>> +adev = ACPI_COMPANION(dev);
>> +if (!adev)
>> +return false;
>> +
>> +if (!acpi_amd_wbrf_supported_system())
>> +return false;
>> +
>> +return acpi_check_dsm(adev->handle, &wifi_acpi_dsm_guid,
>> +  WBRF_REVISION, BIT(WBRF_RETRIEVE));
>> +}
>> +EXPORT_SYMBOL_GPL(acpi_amd_wbrf_supported_consumer);
>> +
>> +/**
>> + * amd_wbrf_retrieve_freq_band - retrieve current active frequency
>> + * bands
> 
> You may go a bit over the 80 chars limit, please just make this
> a single line:
> 
>  * amd_wbrf_retrieve_freq_band - retrieve current active frequency bands
> 
>> + *
>> + * @dev: device pointer
>> + * @out: output structure containing all the active frequency bands
>> + *
>> + * Retrieve the current active frequency bands which were broadcasted
>> + * by other producers. The consumer who calls this API should take
>> + * proper actions if any of the frequency band may cause RFI with its
>> + * own frequency band used.
>> + *
>> + * Return:
>> + * 0 if the wifi frequency bands were retrieved successfully.
>> + * Returns a negative error code on failure.
>> + */
> 
> Regards,
> 
> Hans
> 


Re: [PATCH 1/6] Revert "drm/prime: Unexport helpers for fd/handle conversion"

2023-12-04 Thread Felix Kuehling



On 2023-12-04 12:49, Deucher, Alexander wrote:

[AMD Official Use Only - General]


-Original Message-
From: Kuehling, Felix 
Sent: Friday, December 1, 2023 6:40 PM
To: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; Deucher,
Alexander 
Cc: Daniel Vetter ; Koenig, Christian
; Thomas Zimmermann

Subject: Re: [PATCH 1/6] Revert "drm/prime: Unexport helpers for fd/handle
conversion"

Hi Alex,

I'm about to push patches 1-3 to the rebased amd-staging-drm-next. It would
be good to get patch 1 into drm-fixes so that Linux 6.6 will be the only kernel
without these prime helpers. That would minimize the hassle for DKMS driver
installations on future distros.

Already done:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0514f63cfff38a0dcb7ba9c5f245827edc0c5107


Thank you, all! I also saw Sasha Levin is backporting it to 6.6.

Cheers,
  Felix




Alex


Thanks,
Felix


On 2023-12-01 18:34, Felix Kuehling wrote:

This reverts commit 71a7974ac7019afeec105a54447ae1dc7216cbb3.

These helper functions are needed for KFD to export and import DMABufs
the right way without duplicating the tracking of DMABufs associated
with GEM objects while ensuring that move notifier callbacks are
working as intended.

Acked-by: Christian König 
Acked-by: Thomas Zimmermann 
Acked-by: Daniel Vetter 
Signed-off-by: Felix Kuehling 
---
   drivers/gpu/drm/drm_prime.c | 33 ++---
   include/drm/drm_prime.h |  7 +++
   2 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 63b709a67471..834a5e28abbe 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -278,7 +278,7 @@ void drm_gem_dmabuf_release(struct dma_buf *dma_buf)
   }
   EXPORT_SYMBOL(drm_gem_dmabuf_release);

-/*
+/**
* drm_gem_prime_fd_to_handle - PRIME import function for GEM drivers
* @dev: drm_device to import into
* @file_priv: drm file-private structure
@@ -292,9 +292,9 @@ EXPORT_SYMBOL(drm_gem_dmabuf_release);
*
* Returns 0 on success or a negative error code on failure.
*/
-static int drm_gem_prime_fd_to_handle(struct drm_device *dev,
- struct drm_file *file_priv, int prime_fd,
- uint32_t *handle)
+int drm_gem_prime_fd_to_handle(struct drm_device *dev,
+  struct drm_file *file_priv, int prime_fd,
+  uint32_t *handle)
   {
 struct dma_buf *dma_buf;
 struct drm_gem_object *obj;
@@ -360,6 +360,7 @@ static int drm_gem_prime_fd_to_handle(struct drm_device *dev,
 dma_buf_put(dma_buf);
 return ret;
   }
+EXPORT_SYMBOL(drm_gem_prime_fd_to_handle);

   int drm_prime_fd_to_handle_ioctl(struct drm_device *dev, void *data,
  struct drm_file *file_priv)
@@ -408,7 +409,7 @@ static struct dma_buf *export_and_register_object(struct drm_device *dev,
 return dmabuf;
   }

-/*
+/**
* drm_gem_prime_handle_to_fd - PRIME export function for GEM drivers
* @dev: dev to export the buffer from
* @file_priv: drm file-private structure
@@ -421,10 +422,10 @@ static struct dma_buf *export_and_register_object(struct drm_device *dev,
* The actual exporting from GEM object to a dma-buf is done through the
* &drm_gem_object_funcs.export callback.
*/
-static int drm_gem_prime_handle_to_fd(struct drm_device *dev,
- struct drm_file *file_priv, uint32_t handle,
- uint32_t flags,
- int *prime_fd)
+int drm_gem_prime_handle_to_fd(struct drm_device *dev,
+  struct drm_file *file_priv, uint32_t handle,
+  uint32_t flags,
+  int *prime_fd)
   {
 struct drm_gem_object *obj;
 int ret = 0;
@@ -506,6 +507,7 @@ static int drm_gem_prime_handle_to_fd(struct drm_device *dev,

 return ret;
   }
+EXPORT_SYMBOL(drm_gem_prime_handle_to_fd);

   int drm_prime_handle_to_fd_ioctl(struct drm_device *dev, void *data,
  struct drm_file *file_priv)
@@ -864,9 +866,9 @@ EXPORT_SYMBOL(drm_prime_get_contiguous_size);
* @obj: GEM object to export
* @flags: flags like DRM_CLOEXEC and DRM_RDWR
*
- * This is the implementation of the &drm_gem_object_funcs.export functions
- * for GEM drivers using the PRIME helpers. It is used as the default for
- * drivers that do not set their own.
+ * This is the implementation of the &drm_gem_object_funcs.export functions
+ * for GEM drivers using the PRIME helpers. It is used as the default in
+ * drm_gem_prime_handle_to_fd().
*/
   struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj,
  int flags)
@@ -962,9 +964,10 @@ EXPORT_SYMBOL(drm_gem_prime_import_dev);
* @dev: drm_device to import into
* @dma_buf: dma-buf object to import
*
- * This 

[PATCH AUTOSEL 5.10 6/7] drm/amdgpu: Fix cat debugfs amdgpu_regs_didt causes kernel null pointer

2023-12-04 Thread Sasha Levin
From: Lu Yao 

[ Upstream commit 2161e09cd05a50d80736fe397145340d2e8f6c05 ]

For 'AMDGPU_FAMILY_SI' family cards, the 'si_common_early_init' function
initializes 'didt_rreg' and 'didt_wreg' to 'NULL', but
'amdgpu_debugfs_regs_didt_read/write' use 'RREG32_DIDT' and 'WREG32_DIDT'
without checking them. No other 'amdgpu_ip_block_version' that sets these
two callbacks is added for 'AMDGPU_FAMILY_SI'.

So, add a null pointer check before calling them.

Reviewed-by: Christian König 
Signed-off-by: Lu Yao 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 48df32dd352ed..3e573077368b3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -459,6 +459,9 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file 
*f, char __user *buf,
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
 
+   if (!adev->didt_rreg)
+   return -EOPNOTSUPP;
+
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
if (r < 0) {
pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
@@ -518,6 +521,9 @@ static ssize_t amdgpu_debugfs_regs_didt_write(struct file 
*f, const char __user
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
 
+   if (!adev->didt_wreg)
+   return -EOPNOTSUPP;
+
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
if (r < 0) {
pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
-- 
2.42.0



[PATCH AUTOSEL 5.15 09/10] drm/amdgpu: Fix cat debugfs amdgpu_regs_didt causes kernel null pointer

2023-12-04 Thread Sasha Levin
From: Lu Yao 

[ Upstream commit 2161e09cd05a50d80736fe397145340d2e8f6c05 ]

For 'AMDGPU_FAMILY_SI' family cards, the 'si_common_early_init' function
initializes 'didt_rreg' and 'didt_wreg' to 'NULL', but
'amdgpu_debugfs_regs_didt_read/write' use 'RREG32_DIDT' and 'WREG32_DIDT'
without checking them. No other 'amdgpu_ip_block_version' that sets these
two callbacks is added for 'AMDGPU_FAMILY_SI'.

So, add a null pointer check before calling them.

Reviewed-by: Christian König 
Signed-off-by: Lu Yao 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index beb199d13451b..632d8df04ef45 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -342,6 +342,9 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file 
*f, char __user *buf,
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
 
+   if (!adev->didt_rreg)
+   return -EOPNOTSUPP;
+
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
if (r < 0) {
pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
@@ -401,6 +404,9 @@ static ssize_t amdgpu_debugfs_regs_didt_write(struct file 
*f, const char __user
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
 
+   if (!adev->didt_wreg)
+   return -EOPNOTSUPP;
+
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
if (r < 0) {
pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
-- 
2.42.0



[PATCH AUTOSEL 6.1 13/17] drm/amd/display: update dcn315 lpddr pstate latency

2023-12-04 Thread Sasha Levin
From: Dmytro Laktyushkin 

[ Upstream commit c92da0403d373c03ea5c65c0260c7db6762013b0 ]

[WHY/HOW]
Increase the pstate latency to improve ac/dc transition

Reviewed-by: Charlene Liu 
Acked-by: Tom Chung 
Signed-off-by: Dmytro Laktyushkin 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c| 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c
index 893991a0eb971..28b83133db910 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c
@@ -324,7 +324,7 @@ static struct wm_table lpddr5_wm_table = {
{
.wm_inst = WM_A,
.wm_type = WM_TYPE_PSTATE_CHG,
-   .pstate_latency_us = 11.65333,
+   .pstate_latency_us = 129.0,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
@@ -332,7 +332,7 @@ static struct wm_table lpddr5_wm_table = {
{
.wm_inst = WM_B,
.wm_type = WM_TYPE_PSTATE_CHG,
-   .pstate_latency_us = 11.65333,
+   .pstate_latency_us = 129.0,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
@@ -340,7 +340,7 @@ static struct wm_table lpddr5_wm_table = {
{
.wm_inst = WM_C,
.wm_type = WM_TYPE_PSTATE_CHG,
-   .pstate_latency_us = 11.65333,
+   .pstate_latency_us = 129.0,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
@@ -348,7 +348,7 @@ static struct wm_table lpddr5_wm_table = {
{
.wm_inst = WM_D,
.wm_type = WM_TYPE_PSTATE_CHG,
-   .pstate_latency_us = 11.65333,
+   .pstate_latency_us = 129.0,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
-- 
2.42.0



[PATCH AUTOSEL 6.1 14/17] drm/amdgpu: Fix cat debugfs amdgpu_regs_didt causes kernel null pointer

2023-12-04 Thread Sasha Levin
From: Lu Yao 

[ Upstream commit 2161e09cd05a50d80736fe397145340d2e8f6c05 ]

For 'AMDGPU_FAMILY_SI' family cards, the 'si_common_early_init' function
initializes 'didt_rreg' and 'didt_wreg' to 'NULL', but
'amdgpu_debugfs_regs_didt_read/write' use 'RREG32_DIDT' and 'WREG32_DIDT'
without checking them. No other 'amdgpu_ip_block_version' that sets these
two callbacks is added for 'AMDGPU_FAMILY_SI'.

So, add a null pointer check before calling them.

Reviewed-by: Christian König 
Signed-off-by: Lu Yao 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index fd796574f87a5..8123feb1a1161 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -479,6 +479,9 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file 
*f, char __user *buf,
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
 
+   if (!adev->didt_rreg)
+   return -EOPNOTSUPP;
+
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
if (r < 0) {
pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
@@ -535,6 +538,9 @@ static ssize_t amdgpu_debugfs_regs_didt_write(struct file 
*f, const char __user
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
 
+   if (!adev->didt_wreg)
+   return -EOPNOTSUPP;
+
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
if (r < 0) {
pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
-- 
2.42.0



[PATCH AUTOSEL 6.6 28/32] drm/amdgpu: Use another offset for GC 9.4.3 remap

2023-12-04 Thread Sasha Levin
From: Lijo Lazar 

[ Upstream commit ed6e4f0a27ebafffbd12bf3878ab004787685d8a ]

The legacy region at 0x7F000 maps to valid registers in GC 9.4.3 SOCs.
Use 0x1A000 offset instead as MMIO register remap region.

Signed-off-by: Lijo Lazar 
Reviewed-by: Hawking Zhang 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/soc15.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c 
b/drivers/gpu/drm/amd/amdgpu/soc15.c
index f5be40d7ba367..b85011106347c 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1159,6 +1159,11 @@ static int soc15_common_early_init(void *handle)
AMD_PG_SUPPORT_VCN_DPG |
AMD_PG_SUPPORT_JPEG;
adev->external_rev_id = adev->rev_id + 0x46;
+   /* GC 9.4.3 uses MMIO register region hole at a different 
offset */
+   if (!amdgpu_sriov_vf(adev)) {
+   adev->rmmio_remap.reg_offset = 0x1A000;
+   adev->rmmio_remap.bus_addr = adev->rmmio_base + 0x1A000;
+   }
break;
default:
/* FIXME: not supported yet */
-- 
2.42.0



[PATCH AUTOSEL 6.6 27/32] drm/amdkfd: Free gang_ctx_bo and wptr_bo in pqm_uninit

2023-12-04 Thread Sasha Levin
From: ZhenGuo Yin 

[ Upstream commit 72838777aa38352e20301e123b97110c456cd38e ]

[Why]
Memory leaks of gang_ctx_bo and wptr_bo.

[How]
Free gang_ctx_bo and wptr_bo in pqm_uninit.

v2: add a common function pqm_clean_queue_resource to
free queue's resources.
v3: reset pdd->pqd.num_gws when destroying GWS queue.

Reviewed-by: Felix Kuehling 
Signed-off-by: ZhenGuo Yin 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../amd/amdkfd/kfd_process_queue_manager.c| 54 +++
 1 file changed, 33 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index adb5e4bdc0b20..7d0f887d99558 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -169,16 +169,43 @@ int pqm_init(struct process_queue_manager *pqm, struct 
kfd_process *p)
return 0;
 }
 
+static void pqm_clean_queue_resource(struct process_queue_manager *pqm,
+struct process_queue_node *pqn)
+{
+   struct kfd_node *dev;
+   struct kfd_process_device *pdd;
+
+   dev = pqn->q->device;
+
+   pdd = kfd_get_process_device_data(dev, pqm->process);
+   if (!pdd) {
+   pr_err("Process device data doesn't exist\n");
+   return;
+   }
+
+   if (pqn->q->gws) {
+   if (KFD_GC_VERSION(pqn->q->device) != IP_VERSION(9, 4, 3) &&
+   !dev->kfd->shared_resources.enable_mes)
+   amdgpu_amdkfd_remove_gws_from_process(
+   pqm->process->kgd_process_info, pqn->q->gws);
+   pdd->qpd.num_gws = 0;
+   }
+
+   if (dev->kfd->shared_resources.enable_mes) {
+   amdgpu_amdkfd_free_gtt_mem(dev->adev, pqn->q->gang_ctx_bo);
+   if (pqn->q->wptr_bo)
+   amdgpu_amdkfd_free_gtt_mem(dev->adev, pqn->q->wptr_bo);
+   }
+}
+
 void pqm_uninit(struct process_queue_manager *pqm)
 {
struct process_queue_node *pqn, *next;
 
list_for_each_entry_safe(pqn, next, &pqm->queues, process_queue_list) {
-   if (pqn->q && pqn->q->gws &&
-   KFD_GC_VERSION(pqn->q->device) != IP_VERSION(9, 4, 3) &&
-   !pqn->q->device->kfd->shared_resources.enable_mes)
-   amdgpu_amdkfd_remove_gws_from_process(pqm->process->kgd_process_info,
-   pqn->q->gws);
+   if (pqn->q)
+   pqm_clean_queue_resource(pqm, pqn);
+
kfd_procfs_del_queue(pqn->q);
uninit_queue(pqn->q);
list_del(>process_queue_list);
@@ -460,22 +487,7 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, 
unsigned int qid)
goto err_destroy_queue;
}
 
-   if (pqn->q->gws) {
-   if (KFD_GC_VERSION(pqn->q->device) != IP_VERSION(9, 4, 
3) &&
-   !dev->kfd->shared_resources.enable_mes)
-   amdgpu_amdkfd_remove_gws_from_process(
-   pqm->process->kgd_process_info,
-   pqn->q->gws);
-   pdd->qpd.num_gws = 0;
-   }
-
-   if (dev->kfd->shared_resources.enable_mes) {
-   amdgpu_amdkfd_free_gtt_mem(dev->adev,
-  pqn->q->gang_ctx_bo);
-   if (pqn->q->wptr_bo)
-   amdgpu_amdkfd_free_gtt_mem(dev->adev, 
pqn->q->wptr_bo);
-
-   }
+   pqm_clean_queue_resource(pqm, pqn);
uninit_queue(pqn->q);
}
 
-- 
2.42.0



[PATCH AUTOSEL 6.6 26/32] drm/amdgpu: Fix cat debugfs amdgpu_regs_didt causes kernel null pointer

2023-12-04 Thread Sasha Levin
From: Lu Yao 

[ Upstream commit 2161e09cd05a50d80736fe397145340d2e8f6c05 ]

For 'AMDGPU_FAMILY_SI' family cards, the 'si_common_early_init' function
initializes 'didt_rreg' and 'didt_wreg' to 'NULL', but
'amdgpu_debugfs_regs_didt_read/write' use 'RREG32_DIDT' and 'WREG32_DIDT'
without checking them. No other 'amdgpu_ip_block_version' that sets these
two callbacks is added for 'AMDGPU_FAMILY_SI'.

So, add a null pointer check before calling them.

Reviewed-by: Christian König 
Signed-off-by: Lu Yao 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index 05405da51e7a2..3f2126f99923e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -638,6 +638,9 @@ static ssize_t amdgpu_debugfs_regs_didt_read(struct file 
*f, char __user *buf,
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
 
+   if (!adev->didt_rreg)
+   return -EOPNOTSUPP;
+
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
if (r < 0) {
pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
@@ -694,6 +697,9 @@ static ssize_t amdgpu_debugfs_regs_didt_write(struct file 
*f, const char __user
if (size & 0x3 || *pos & 0x3)
return -EINVAL;
 
+   if (!adev->didt_wreg)
+   return -EOPNOTSUPP;
+
r = pm_runtime_get_sync(adev_to_drm(adev)->dev);
if (r < 0) {
pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);
-- 
2.42.0



[PATCH AUTOSEL 6.6 24/32] drm/amdkfd: Use common function for IP version check

2023-12-04 Thread Sasha Levin
From: Mukul Joshi 

[ Upstream commit 2f86bf79b63dbe6963ebc647b77a5f576a906b40 ]

KFD_GC_VERSION was recently updated to use a new function
for IP version checks. As a result, use KFD_GC_VERSION as
the common function for all IP version checks in KFD.

Signed-off-by: Mukul Joshi 
Reviewed-by: Harish Kasiviswanathan 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index fa24e1852493d..df7a5cdb8693f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -1128,7 +1128,7 @@ static inline struct kfd_node *kfd_node_by_irq_ids(struct 
amdgpu_device *adev,
struct kfd_dev *dev = adev->kfd.dev;
uint32_t i;
 
-   if (adev->ip_versions[GC_HWIP][0] != IP_VERSION(9, 4, 3))
+   if (KFD_GC_VERSION(dev) != IP_VERSION(9, 4, 3))
return dev->nodes[0];
 
for (i = 0; i < dev->num_nodes; i++)
-- 
2.42.0



[PATCH AUTOSEL 6.6 25/32] drm/amd/display: update dcn315 lpddr pstate latency

2023-12-04 Thread Sasha Levin
From: Dmytro Laktyushkin 

[ Upstream commit c92da0403d373c03ea5c65c0260c7db6762013b0 ]

[WHY/HOW]
Increase the pstate latency to improve ac/dc transition

Reviewed-by: Charlene Liu 
Acked-by: Tom Chung 
Signed-off-by: Dmytro Laktyushkin 
Tested-by: Daniel Wheeler 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 .../drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c| 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c
index b2c4f97afc8b4..8776055bbeaae 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn315/dcn315_clk_mgr.c
@@ -334,7 +334,7 @@ static struct wm_table lpddr5_wm_table = {
{
.wm_inst = WM_A,
.wm_type = WM_TYPE_PSTATE_CHG,
-   .pstate_latency_us = 11.65333,
+   .pstate_latency_us = 129.0,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
@@ -342,7 +342,7 @@ static struct wm_table lpddr5_wm_table = {
{
.wm_inst = WM_B,
.wm_type = WM_TYPE_PSTATE_CHG,
-   .pstate_latency_us = 11.65333,
+   .pstate_latency_us = 129.0,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
@@ -350,7 +350,7 @@ static struct wm_table lpddr5_wm_table = {
{
.wm_inst = WM_C,
.wm_type = WM_TYPE_PSTATE_CHG,
-   .pstate_latency_us = 11.65333,
+   .pstate_latency_us = 129.0,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
@@ -358,7 +358,7 @@ static struct wm_table lpddr5_wm_table = {
{
.wm_inst = WM_D,
.wm_type = WM_TYPE_PSTATE_CHG,
-   .pstate_latency_us = 11.65333,
+   .pstate_latency_us = 129.0,
.sr_exit_time_us = 11.5,
.sr_enter_plus_exit_time_us = 14.5,
.valid = true,
-- 
2.42.0



[PATCH AUTOSEL 6.6 23/32] drm/amdgpu: Do not issue gpu reset from nbio v7_9 bif interrupt

2023-12-04 Thread Sasha Levin
From: Hawking Zhang 

[ Upstream commit 884e9b0827e889a8742e203ccd052101fb0b945d ]

In nbio v7_9, the host driver should not issue a gpu reset.

Signed-off-by: Hawking Zhang 
Reviewed-by: Stanley Yang 
Signed-off-by: Alex Deucher 
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c 
b/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
index f85eec05d2181..ae45656eb8779 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c
@@ -604,11 +604,6 @@ static void 
nbio_v7_9_handle_ras_controller_intr_no_bifring(struct amdgpu_device
 
dev_info(adev->dev, "RAS controller interrupt triggered "
"by NBIF error\n");
-
-   /* ras_controller_int is dedicated for nbif ras error,
-* not the global interrupt for sync flood
-*/
-   amdgpu_ras_reset_gpu(adev);
}
 }
 
-- 
2.42.0



[PATCH v2] drm/amdkfd: Use partial hmm page walk during buffer validation in SVM

2023-12-04 Thread Xiaogang . Chen
From: Xiaogang Chen 

v2:
-no need to calculate the vram page number for a newly registered svm
range, only do it for split vram pages.

SVM uses an hmm page walk to validate buffers before mapping them to the gpu
vm. After a partial migration/mapping, do the validation on the same vm range
that the migration/map used instead of on the whole svm range, which can be
very large. This change is expected to improve svm code performance.

Signed-off-by: Xiaogang Chen
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 149 ---
 1 file changed, 109 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 2834fb351818..2f14cd1a3416 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -158,13 +158,12 @@ svm_is_valid_dma_mapping_addr(struct device *dev, 
dma_addr_t dma_addr)
 static int
 svm_range_dma_map_dev(struct amdgpu_device *adev, struct svm_range *prange,
  unsigned long offset, unsigned long npages,
- unsigned long *hmm_pfns, uint32_t gpuidx, uint64_t 
*vram_pages)
+ unsigned long *hmm_pfns, uint32_t gpuidx)
 {
enum dma_data_direction dir = DMA_BIDIRECTIONAL;
dma_addr_t *addr = prange->dma_addr[gpuidx];
struct device *dev = adev->dev;
struct page *page;
-   uint64_t vram_pages_dev;
int i, r;
 
if (!addr) {
@@ -174,7 +173,6 @@ svm_range_dma_map_dev(struct amdgpu_device *adev, struct 
svm_range *prange,
prange->dma_addr[gpuidx] = addr;
}
 
-   vram_pages_dev = 0;
addr += offset;
for (i = 0; i < npages; i++) {
if (svm_is_valid_dma_mapping_addr(dev, addr[i]))
@@ -184,7 +182,6 @@ svm_range_dma_map_dev(struct amdgpu_device *adev, struct 
svm_range *prange,
if (is_zone_device_page(page)) {
struct amdgpu_device *bo_adev = 
prange->svm_bo->node->adev;
 
-   vram_pages_dev++;
addr[i] = (hmm_pfns[i] << PAGE_SHIFT) +
   bo_adev->vm_manager.vram_base_offset -
   bo_adev->kfd.pgmap.range.start;
@@ -201,14 +198,14 @@ svm_range_dma_map_dev(struct amdgpu_device *adev, struct 
svm_range *prange,
pr_debug_ratelimited("dma mapping 0x%llx for page addr 0x%lx\n",
 addr[i] >> PAGE_SHIFT, page_to_pfn(page));
}
-   *vram_pages = vram_pages_dev;
+
return 0;
 }
 
 static int
 svm_range_dma_map(struct svm_range *prange, unsigned long *bitmap,
  unsigned long offset, unsigned long npages,
- unsigned long *hmm_pfns, uint64_t *vram_pages)
+ unsigned long *hmm_pfns)
 {
struct kfd_process *p;
uint32_t gpuidx;
@@ -227,7 +224,7 @@ svm_range_dma_map(struct svm_range *prange, unsigned long 
*bitmap,
}
 
r = svm_range_dma_map_dev(pdd->dev->adev, prange, offset, 
npages,
- hmm_pfns, gpuidx, vram_pages);
+ hmm_pfns, gpuidx);
if (r)
break;
}
@@ -982,11 +979,6 @@ svm_range_split_nodes(struct svm_range *new, struct 
svm_range *old,
new->svm_bo = svm_range_bo_ref(old->svm_bo);
new->ttm_res = old->ttm_res;
 
-   /* set new's vram_pages as old range's now, the acurate vram_pages
-* will be updated during mapping
-*/
-   new->vram_pages = min(old->vram_pages, new->npages);
-
	spin_lock(&new->svm_bo->list_lock);
	list_add(&new->svm_bo_list, &new->svm_bo->range_list);
	spin_unlock(&new->svm_bo->list_lock);
@@ -1107,9 +1099,9 @@ svm_range_split(struct svm_range *prange, uint64_t start, 
uint64_t last,
 
 static int
 svm_range_split_tail(struct svm_range *prange, uint64_t new_last,
-struct list_head *insert_list, struct list_head 
*remap_list)
+struct list_head *insert_list, struct list_head 
*remap_list,
+struct svm_range *tail)
 {
-   struct svm_range *tail;
	int r = svm_range_split(prange, prange->start, new_last, &tail);
 
if (!r) {
@@ -1122,9 +1114,9 @@ svm_range_split_tail(struct svm_range *prange, uint64_t 
new_last,
 
 static int
 svm_range_split_head(struct svm_range *prange, uint64_t new_start,
-struct list_head *insert_list, struct list_head 
*remap_list)
+struct list_head *insert_list, struct list_head 
*remap_list,
+struct svm_range *head)
 {
-   struct svm_range *head;
	int r = svm_range_split(prange, new_start, prange->last, &head);
 
if (!r) {
@@ -1573,7 +1565,6 @@ static int svm_range_validate_and_map(struct mm_struct 
*mm,
struct svm_validate_context *ctx;
unsigned long start, end, addr;
struct kfd_process *p;
-   uint64_t vram_pages;
void 

Re: [PATCH] [v2] drm/radeon/trinity_dpm: fix a memleak in trinity_parse_power_table

2023-12-04 Thread Alex Deucher
Applied.  Thanks!

On Mon, Dec 4, 2023 at 5:39 AM Zhipeng Lu  wrote:
>
> The rdev->pm.dpm.ps allocated by kcalloc should be freed in every
> following error-handling path. However, in the error-handling of
> rdev->pm.power_state[i].clock_info the rdev->pm.dpm.ps is not freed,
> resulting in a memleak in this function.
>
> Fixes: d70229f70447 ("drm/radeon/kms: add dpm support for trinity asics")
> Signed-off-by: Zhipeng Lu 
> ---
>
> Changelog:
>
> v2: Adding {} to correct the if statement.
> ---
>  drivers/gpu/drm/radeon/trinity_dpm.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/radeon/trinity_dpm.c 
> b/drivers/gpu/drm/radeon/trinity_dpm.c
> index 08ea1c864cb2..ef1cc7bad20a 100644
> --- a/drivers/gpu/drm/radeon/trinity_dpm.c
> +++ b/drivers/gpu/drm/radeon/trinity_dpm.c
> @@ -1726,8 +1726,10 @@ static int trinity_parse_power_table(struct 
> radeon_device *rdev)
> non_clock_array_index = power_state->v2.nonClockInfoIndex;
> non_clock_info = (struct _ATOM_PPLIB_NONCLOCK_INFO *)
> 
> &non_clock_info_array->nonClockInfo[non_clock_array_index];
> -   if (!rdev->pm.power_state[i].clock_info)
> +   if (!rdev->pm.power_state[i].clock_info) {
> +   kfree(rdev->pm.dpm.ps);
> return -EINVAL;
> +   }
> ps = kzalloc(sizeof(struct sumo_ps), GFP_KERNEL);
> if (ps == NULL) {
> kfree(rdev->pm.dpm.ps);
> --
> 2.34.1
>


Re: [PATCH] [v2] drm/radeon/dpm: fix a memleak in sumo_parse_power_table

2023-12-04 Thread Alex Deucher
On Mon, Dec 4, 2023 at 5:39 AM Zhipeng Lu  wrote:
>
> The rdev->pm.dpm.ps allocated by kcalloc should be freed in every
> following error-handling path. However, in the error-handling of
> rdev->pm.power_state[i].clock_info the rdev->pm.dpm.ps is not freed,
> resulting in a memleak in this function.
>
> Fixes: 80ea2c129c76 ("drm/radeon/kms: add dpm support for sumo asics (v2)")
> Signed-off-by: Zhipeng Lu 

Applied.  Thanks!

> ---
>
> Changelog:
>
> v2: Adding {} to make if statement correct.
> ---
>  drivers/gpu/drm/radeon/sumo_dpm.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/radeon/sumo_dpm.c 
> b/drivers/gpu/drm/radeon/sumo_dpm.c
> index f74f381af05f..d49c145db437 100644
> --- a/drivers/gpu/drm/radeon/sumo_dpm.c
> +++ b/drivers/gpu/drm/radeon/sumo_dpm.c
> @@ -1493,8 +1493,10 @@ static int sumo_parse_power_table(struct radeon_device 
> *rdev)
> non_clock_array_index = power_state->v2.nonClockInfoIndex;
> non_clock_info = (struct _ATOM_PPLIB_NONCLOCK_INFO *)
> 
> &non_clock_info_array->nonClockInfo[non_clock_array_index];
> -   if (!rdev->pm.power_state[i].clock_info)
> +   if (!rdev->pm.power_state[i].clock_info) {
> +   kfree(rdev->pm.dpm.ps);
> return -EINVAL;
> +   }
> ps = kzalloc(sizeof(struct sumo_ps), GFP_KERNEL);
> if (ps == NULL) {
> kfree(rdev->pm.dpm.ps);
> --
> 2.34.1
>


RE: [PATCH 1/6] Revert "drm/prime: Unexport helpers for fd/handle conversion"

2023-12-04 Thread Deucher, Alexander
[AMD Official Use Only - General]

> -Original Message-
> From: Kuehling, Felix 
> Sent: Friday, December 1, 2023 6:40 PM
> To: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; Deucher,
> Alexander 
> Cc: Daniel Vetter ; Koenig, Christian
> ; Thomas Zimmermann
> 
> Subject: Re: [PATCH 1/6] Revert "drm/prime: Unexport helpers for fd/handle
> conversion"
>
> Hi Alex,
>
> I'm about to push patches 1-3 to the rebased amd-staging-drm-next. It would
> be good to get patch 1 into drm-fixes so that Linux 6.6 will be the only 
> kernel
> without these prime helpers. That would minimize the hassle for DKMS driver
> installations on future distros.

Already done:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0514f63cfff38a0dcb7ba9c5f245827edc0c5107

Alex

>
> Thanks,
>Felix
>
>
> On 2023-12-01 18:34, Felix Kuehling wrote:
> > This reverts commit 71a7974ac7019afeec105a54447ae1dc7216cbb3.
> >
> > These helper functions are needed for KFD to export and import DMABufs
> > the right way without duplicating the tracking of DMABufs associated
> > with GEM objects while ensuring that move notifier callbacks are
> > working as intended.
> >
> > Acked-by: Christian König 
> > Acked-by: Thomas Zimmermann 
> > Acked-by: Daniel Vetter 
> > Signed-off-by: Felix Kuehling 
> > ---
> >   drivers/gpu/drm/drm_prime.c | 33 ++---
> >   include/drm/drm_prime.h |  7 +++
> >   2 files changed, 25 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> > index 63b709a67471..834a5e28abbe 100644
> > --- a/drivers/gpu/drm/drm_prime.c
> > +++ b/drivers/gpu/drm/drm_prime.c
> > @@ -278,7 +278,7 @@ void drm_gem_dmabuf_release(struct dma_buf
> *dma_buf)
> >   }
> >   EXPORT_SYMBOL(drm_gem_dmabuf_release);
> >
> > -/*
> > +/**
> >* drm_gem_prime_fd_to_handle - PRIME import function for GEM drivers
> >* @dev: drm_device to import into
> >* @file_priv: drm file-private structure @@ -292,9 +292,9 @@
> > EXPORT_SYMBOL(drm_gem_dmabuf_release);
> >*
> >* Returns 0 on success or a negative error code on failure.
> >*/
> > -static int drm_gem_prime_fd_to_handle(struct drm_device *dev,
> > - struct drm_file *file_priv, int prime_fd,
> > - uint32_t *handle)
> > +int drm_gem_prime_fd_to_handle(struct drm_device *dev,
> > +  struct drm_file *file_priv, int prime_fd,
> > +  uint32_t *handle)
> >   {
> > struct dma_buf *dma_buf;
> > struct drm_gem_object *obj;
> > @@ -360,6 +360,7 @@ static int drm_gem_prime_fd_to_handle(struct
> drm_device *dev,
> > dma_buf_put(dma_buf);
> > return ret;
> >   }
> > +EXPORT_SYMBOL(drm_gem_prime_fd_to_handle);
> >
> >   int drm_prime_fd_to_handle_ioctl(struct drm_device *dev, void *data,
> >  struct drm_file *file_priv)
> > @@ -408,7 +409,7 @@ static struct dma_buf
> *export_and_register_object(struct drm_device *dev,
> > return dmabuf;
> >   }
> >
> > -/*
> > +/**
> >* drm_gem_prime_handle_to_fd - PRIME export function for GEM drivers
> >* @dev: dev to export the buffer from
> >* @file_priv: drm file-private structure @@ -421,10 +422,10 @@
> > static struct dma_buf *export_and_register_object(struct drm_device *dev,
> >* The actual exporting from GEM object to a dma-buf is done through the
> >* _gem_object_funcs.export callback.
> >*/
> > -static int drm_gem_prime_handle_to_fd(struct drm_device *dev,
> > - struct drm_file *file_priv, uint32_t 
> > handle,
> > - uint32_t flags,
> > - int *prime_fd)
> > +int drm_gem_prime_handle_to_fd(struct drm_device *dev,
> > +  struct drm_file *file_priv, uint32_t handle,
> > +  uint32_t flags,
> > +  int *prime_fd)
> >   {
> > struct drm_gem_object *obj;
> > int ret = 0;
> > @@ -506,6 +507,7 @@ static int drm_gem_prime_handle_to_fd(struct
> > drm_device *dev,
> >
> > return ret;
> >   }
> > +EXPORT_SYMBOL(drm_gem_prime_handle_to_fd);
> >
> >   int drm_prime_handle_to_fd_ioctl(struct drm_device *dev, void *data,
> >  struct drm_file *file_priv)
> > @@ -864,9 +866,9 @@
> EXPORT_SYMBOL(drm_prime_get_contiguous_size);
> >* @obj: GEM object to export
> >* @flags: flags like DRM_CLOEXEC and DRM_RDWR
> >*
> > - * This is the implementation of the _gem_object_funcs.export
> > functions
> > - * for GEM drivers using the PRIME helpers. It is used as the default
> > for
> > - * drivers that do not set their own.
> > + * This is the implementation of the _gem_object_funcs.export
> > + functions for GEM drivers
> > + * using the PRIME helpers. It is used as the default in
> > + * drm_gem_prime_handle_to_fd().
> >*/
> >   struct 

Re: radeon flooding "flip queue failed" in Xorg and system log

2023-12-04 Thread Christian König
Well, essentially we shouldn't print the error from the kernel in the 
first place and just return the correct error code to the userspace 
application.


The userspace application/driver would then say: Oh, I don't have enough 
memory for triple buffering, maybe I should just fall back to double 
buffering


Retrying the same approach over and over again is not really a good 
implementation from userspace.


But devices with only 64MiB of VRAM are rare today, so

Regards,
Christian.

Am 04.12.23 um 17:38 schrieb smesgr:

Hi Christian,

you are right, after bumping the memory in BIOS the error goes away.
Still, it would be nice if the error pointed to "out of VMem" or
something.


Best Regards
Christian König  schrieb am Montag, 
4. Dezember 2023 um 16:18:



Am 02.12.23 um 20:00 schrieb smesgr:

Hi,

I have installed Debian 12 on my old Fujitsu S920 with AMD GX-424CC.
After the installation completed, my logs are flooded by the 
following error messages:


dmesg:
[ 967.869183] radeon :00:01.0: 3ba3c813 pin failed
[ 967.869212] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* 
failed to pin new rbo buffer before flip

[ 967.886254] radeon :00:01.0: 51be2216 pin failed
[ 967.886282] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* 
failed to pin new rbo buffer before flip

[ 967.907984] radeon :00:01.0: 3ba3c813 pin failed
[ 967.908014] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* 
failed to pin new rbo buffer before flip

[ 967.925551] radeon :00:01.0: 51be2216 pin failed
[ 967.925580] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* 
failed to pin new rbo buffer before flip


Xorg.log
[ 47.777] (WW) RADEON(0): flip queue failed: Invalid argument
[ 47.777] (WW) RADEON(0): Page flip failed: Invalid argument
[ 47.777] (EE) RADEON(0): present flip failed

lspci:
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. 
[AMD/ATI] Mullins [Radeon R4/R5 Graphics] (rev 01) (prog-if 00 [VGA 
controller])

Subsystem: Fujitsu Technology Solutions Mullins [Radeon R4/R5 Graphics]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
SERR- 
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 35
Region 0: Memory at f800 (64-bit, prefetchable) [size=64M]
Region 2: Memory at fc00 (64-bit, prefetchable) [size=8M]
Region 4: I/O ports at f000 [size=256]
Region 5: Memory at feb0 (32-bit, non-prefetchable) [size=256K]
Expansion ROM at 000c [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
PME(D0-,D1+,D2+,D3hot+,D3cold-)

Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] Express (v2) Root Complex Integrated Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0
ExtTag+ RBE+ FLReset-
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr+ FatalErr- UnsupReq+ AuxPwr- TransPend-
DevCap2: Completion Timeout: Not Supported, TimeoutDis- NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix+, 
MaxEETLPPrefixes 1

EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 
10BitTagReq- OBFF Disabled,

AtomicOpsCtl: ReqEn-
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: fee08004 Data: 0023
Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 
Len=010 

Capabilities: [270 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Capabilities: [2b0 v1] Address Translation Service (ATS)
ATSCap: Invalidate Queue Depth: 00
ATSCtl: Enable-, Smallest Translation Unit: 00
Capabilities: [2c0 v1] Page Request Interface (PRI)
PRICtl: Enable- Reset-
PRISta: RF- UPRGI- Stopped+
Page Request Capacity: 0020, Page Request Allocation: 
Capabilities: [2d0 v1] Process Address Space ID (PASID)
PASIDCap: Exec+ Priv+, Max PASID Width: 10
PASIDCtl: Enable- Exec- Priv-
Kernel driver in use: radeon
Kernel modules: radeon, amdgpu

I'm a bit confused. Is this a kernel issue, an Xorg issue or something 
else?


Well, what is connected to the GPU? E.g. what monitor, projector or 
other display device do you use?


What happens is that only 64MiB is assigned to the device and for 
some reason a process (maybe the compositor?) is trying to do double 
or triple buffering and doesn't have enough memory for that.


So it falls back to single buffering, that might work but will 
probably not look as good and you see tons of error messages.


You most likely can go into the BIOS and reserve more memory to the 
GPU (it will just be stolen from system memory).


Regards,
Christian.




Re: radeon flooding "flip queue failed" in Xorg and system log

2023-12-04 Thread smesgr
Hi Christian,

you are right, after bumping the memory in BIOS the error goes away. Still, 
it would be nice if the error pointed to "out of VMem" or something.

Best Regards
Christian König  schrieb am Montag, 4. 
Dezember 2023 um 16:18:

> Am 02.12.23 um 20:00 schrieb smesgr:
>
>> Hi,
>>
>> I have installed Debian 12 on my old Fujitsu S920 with AMD GX-424CC.
>> After the installation completed, my logs are flooded by the following error 
>> messages:
>>
>> dmesg:
>> [ 967.869183] radeon :00:01.0: 3ba3c813 pin failed
>> [ 967.869212] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* failed to 
>> pin new rbo buffer before flip
>> [ 967.886254] radeon :00:01.0: 51be2216 pin failed
>> [ 967.886282] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* failed to 
>> pin new rbo buffer before flip
>> [ 967.907984] radeon :00:01.0: 3ba3c813 pin failed
>> [ 967.908014] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* failed to 
>> pin new rbo buffer before flip
>> [ 967.925551] radeon :00:01.0: 51be2216 pin failed [ 967.925580] 
>> [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* failed to pin new rbo 
>> buffer before flip
>>
>> Xorg.log
>> [ 47.777] (WW) RADEON(0): flip queue failed: Invalid argument
>> [ 47.777] (WW) RADEON(0): Page flip failed: Invalid argument [ 47.777] (EE) 
>> RADEON(0): present flip failed
>>
>> lspci:
>> 00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
>> Mullins [Radeon R4/R5 Graphics] (rev 01) (prog-if 00 [VGA controller])
>> Subsystem: Fujitsu Technology Solutions Mullins [Radeon R4/R5 Graphics]
>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
>> Stepping- SERR- FastB2B- DisINTx+
>> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- > SERR- > Latency: 0, Cache Line Size: 64 bytes
>> Interrupt: pin A routed to IRQ 35
>> Region 0: Memory at f800 (64-bit, prefetchable) [size=64M]
>> Region 2: Memory at fc00 (64-bit, prefetchable) [size=8M]
>> Region 4: I/O ports at f000 [size=256]
>> Region 5: Memory at feb0 (32-bit, non-prefetchable) [size=256K]
>> Expansion ROM at 000c [disabled] [size=128K]
>> Capabilities: [48] Vendor Specific Information: Len=08 
>> Capabilities: [50] Power Management version 3
>> Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
>> Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>> Capabilities: [58] Express (v2) Root Complex Integrated Endpoint, MSI 00
>> DevCap: MaxPayload 256 bytes, PhantFunc 0
>> ExtTag+ RBE+ FLReset-
>> DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
>> RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
>> MaxPayload 256 bytes, MaxReadReq 512 bytes
>> DevSta: CorrErr- NonFatalErr+ FatalErr- UnsupReq+ AuxPwr- TransPend-
>> DevCap2: Completion Timeout: Not Supported, TimeoutDis- NROPrPrP- LTR-
>> 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix+, 
>> MaxEETLPPrefixes 1
>> EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
>> FRS-
>> AtomicOpsCap: 32bit- 64bit- 128bitCAS-
>> DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- 
>> OBFF Disabled,
>> AtomicOpsCtl: ReqEn-
>> Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>> Address: fee08004 Data: 0023
>> Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 
>> Capabilities: [270 v1] Secondary PCI Express
>> LnkCtl3: LnkEquIntrruptEn- PerformEqu-
>> LaneErrStat: 0
>> Capabilities: [2b0 v1] Address Translation Service (ATS)
>> ATSCap: Invalidate Queue Depth: 00
>> ATSCtl: Enable-, Smallest Translation Unit: 00
>> Capabilities: [2c0 v1] Page Request Interface (PRI)
>> PRICtl: Enable- Reset-
>> PRISta: RF- UPRGI- Stopped+
>> Page Request Capacity: 0020, Page Request Allocation: 
>> Capabilities: [2d0 v1] Process Address Space ID (PASID)
>> PASIDCap: Exec+ Priv+, Max PASID Width: 10
>> PASIDCtl: Enable- Exec- Priv-
>> Kernel driver in use: radeon  Kernel modules: radeon, amdgpu
>>
>> I'm a bit confused. Is this a kernel issue, an Xorg issue or something else?
>
> Well, what is connected to the GPU? E.g. what monitor, projector or other 
> display device do you use?
>
> What happens is that only 64MiB is assigned to the device and for some reason 
> a process (maybe the compositor?) is trying to do double or triple buffering 
> and doesn't have enough memory for that.
>
> So it falls back to single buffering, that might work but will probably not 
> look as good and you see tons of error messages.
>
> You most likely can go into the BIOS and reserve more memory to the GPU (it 
> will just be stolen from system memory).
>
> Regards,
> Christian.
>
>> kernel:
>> Linux debian 6.1.0-13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 
>> (2023-09-29) x86_64 GNU/Linux
>>
>> Xorg:
>> [ 33.144]
>> X.Org X Server 1.21.1.7
>> X Protocol Version 11, Revision 0
>> [ 33.144] Current Operating System: Linux debian 6.1.0-13-amd64 #1 SMP 
>> PREEMPT_DYNAMIC 

Re: [PATCH v2 0/8] drm/plane-helpers: Minor clean ups

2023-12-04 Thread Alex Deucher
On Mon, Dec 4, 2023 at 4:09 AM Thomas Zimmermann  wrote:
>
> Move drm_plane_helper_atomic_check() into udl, which is the only
> driver using it. Remove several unnecessary include statements for
> <drm/drm_plane_helper.h>.
>
> v2:
> * fix documentation (Sui)
>
> Thomas Zimmermann (8):
>   drm/plane-helper: Move drm_plane_helper_atomic_check() into udl
>   drm/amdgpu: Do not include <drm/drm_plane_helper.h>
>   drm/loongson: Do not include <drm/drm_plane_helper.h>
>   drm/shmobile: Do not include <drm/drm_plane_helper.h>
>   drm/solomon: Do not include <drm/drm_plane_helper.h>
>   drm/ofdrm: Do not include <drm/drm_plane_helper.h>
>   drm/simpledrm: Do not include <drm/drm_plane_helper.h>
>   drm/xlnx: Do not include <drm/drm_plane_helper.h>

Series is:
Reviewed-by: Alex Deucher 

>
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  1 -
>  drivers/gpu/drm/drm_crtc_helper.c |  7 ++--
>  drivers/gpu/drm/drm_plane_helper.c| 32 ---
>  drivers/gpu/drm/loongson/lsdc_plane.c |  1 -
>  .../drm/renesas/shmobile/shmob_drm_plane.c|  1 -
>  drivers/gpu/drm/solomon/ssd130x.h |  1 -
>  drivers/gpu/drm/tiny/ofdrm.c  |  1 -
>  drivers/gpu/drm/tiny/simpledrm.c  |  1 -
>  drivers/gpu/drm/udl/udl_modeset.c | 19 +--
>  drivers/gpu/drm/xlnx/zynqmp_kms.c |  1 -
>  include/drm/drm_plane_helper.h|  2 --
>  11 files changed, 19 insertions(+), 48 deletions(-)
>
> --
> 2.43.0
>


RE: [PATCH] drm/amd/swsmu: update smu v14_0_0 driver if version and metrics table

2023-12-04 Thread Deucher, Alexander
[Public]

> -Original Message-
> From: Ma, Li 
> Sent: Monday, December 4, 2023 3:52 AM
> To: Deucher, Alexander ; amd-
> g...@lists.freedesktop.org
> Cc: Koenig, Christian ; Zhang, Yifan
> ; Yu, Lang ; Wang,
> Yang(Kevin) 
> Subject: RE: [PATCH] drm/amd/swsmu: update smu v14_0_0 driver if version
> and metrics table
>
> [Public]
>
> Hi Alex,
>
> Sorry for the late reply. Only smu14 uses this gpu_metrics_v3_0 struct, and
> the patch has already gone upstream. As far as I know, umr uses
> gpu_metrics_v3_0 and I will submit a patch to umr.
> Does this struct need to be backward compatible currently? If yes, I will
> revert this patch and add a new gpu_metrics_v3_1.

Ok.  If we don't yet have a released kernel with v3_0 support we should be 
fine.  I'll just include the updates in 6.7.

Alex

>
> Best Regards,
> Li
>
> -Original Message-
> From: Deucher, Alexander 
> Sent: Tuesday, November 28, 2023 4:47 AM
> To: Ma, Li ; amd-gfx@lists.freedesktop.org
> Cc: Koenig, Christian ; Zhang, Yifan
> ; Yu, Lang 
> Subject: RE: [PATCH] drm/amd/swsmu: update smu v14_0_0 driver if version
> and metrics table
>
> [Public]
>
> > -Original Message-
> > From: Ma, Li 
> > Sent: Thursday, November 23, 2023 5:07 AM
> > To: amd-gfx@lists.freedesktop.org
> > Cc: Deucher, Alexander ; Koenig, Christian
> > ; Zhang, Yifan ; Yu,
> > Lang ; Ma, Li 
> > Subject: [PATCH] drm/amd/swsmu: update smu v14_0_0 driver if version
> > and metrics table
> >
> > Increment the driver interface version and add new members to the metrics table.
> >
> > Signed-off-by: Li Ma 
> > ---
> >  .../gpu/drm/amd/include/kgd_pp_interface.h| 17 
> >  drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 10 +++
> >  .../inc/pmfw_if/smu14_driver_if_v14_0_0.h | 77 +++
> >  .../drm/amd/pm/swsmu/smu14/smu_v14_0_0_ppt.c  | 46 ++-
> >  4 files changed, 115 insertions(+), 35 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h
> > b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
> > index 8ebba87f4289..eaea1c65e526 100644
> > --- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h
> > +++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
> > @@ -1086,6 +1086,10 @@ struct gpu_metrics_v3_0 {
> >   uint16_taverage_dram_reads;
> >   /* time filtered DRAM write bandwidth [MB/sec] */
> >   uint16_taverage_dram_writes;
> > + /* time filtered IPU read bandwidth [MB/sec] */
> > + uint16_taverage_ipu_reads;
> > + /* time filtered IPU write bandwidth [MB/sec] */
> > + uint16_taverage_ipu_writes;
> >
> >   /* Driver attached timestamp (in ns) */
> >   uint64_tsystem_clock_counter;
> > @@ -1105,6 +1109,8 @@ struct gpu_metrics_v3_0 {
> >   uint32_taverage_all_core_power;
> >   /* calculated core power [mW] */
> >   uint16_taverage_core_power[16];
> > + /* time filtered total system power [mW] */
> > + uint16_taverage_sys_power;
> >   /* maximum IRM defined STAPM power limit [mW] */
> >   uint16_tstapm_power_limit;
> >   /* time filtered STAPM power limit [mW] */ @@ -1117,6 +1123,8 @@
> > struct gpu_metrics_v3_0 {
> >   uint16_taverage_ipuclk_frequency;
> >   uint16_taverage_fclk_frequency;
> >   uint16_taverage_vclk_frequency;
> > + uint16_taverage_uclk_frequency;
> > + uint16_taverage_mpipu_frequency;
> >
> >   /* Current clocks */
> >   /* target core frequency [MHz] */ @@ -1126,6 +1134,15 @@ struct
> > gpu_metrics_v3_0 {
> >   /* GFXCLK frequency limit enforced on GFX [MHz] */
> >   uint16_tcurrent_gfx_maxfreq;
> >
> > + /* Throttle Residency (ASIC dependent) */
> > + uint32_t throttle_residency_prochot;
> > + uint32_t throttle_residency_spl;
> > + uint32_t throttle_residency_fppt;
> > + uint32_t throttle_residency_sppt;
> > + uint32_t throttle_residency_thm_core;
> > + uint32_t throttle_residency_thm_gfx;
> > + uint32_t throttle_residency_thm_soc;
> > +
> >   /* Metrics table alpha filter time constant [us] */
> >   uint32_ttime_filter_alphavalue;
> >  };
>
> Is anything else besides smu14 using v3 of this struct?  If so, we can't
> change the layout, otherwise it will break existing tools; instead, bump
> the version minor and append the new items to the end.
>
> Alex
>
>
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> > b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> > index c125253df20b..c2265e027ca8 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> > @@ -1418,6 +1418,16 @@ typedef enum {
> >   

Re: radeon flooding "flip queue failed" in Xorg and system log

2023-12-04 Thread Christian König

Am 02.12.23 um 20:00 schrieb smesgr:

Hi,

I have installed Debian 12 on my old Fujitsu S920 with AMD GX-424CC.
After the installation completed, my logs are flooded by the following 
error messages:


dmesg:
[  967.869183] radeon :00:01.0: 3ba3c813 pin failed
[  967.869212] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* 
failed to pin new rbo buffer before flip

[  967.886254] radeon :00:01.0: 51be2216 pin failed
[  967.886282] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* 
failed to pin new rbo buffer before flip

[  967.907984] radeon :00:01.0: 3ba3c813 pin failed
[  967.908014] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* 
failed to pin new rbo buffer before flip

[  967.925551] radeon :00:01.0: 51be2216 pin failed
[  967.925580] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* 
failed to pin new rbo buffer before flip


Xorg.log
[  47.777] (WW) RADEON(0): flip queue failed: Invalid argument
[    47.777] (WW) RADEON(0): Page flip failed: Invalid argument
[    47.777] (EE) RADEON(0): present flip failed

lspci:
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. 
[AMD/ATI] Mullins [Radeon R4/R5 Graphics] (rev 01) (prog-if 00 [VGA 
controller])

Subsystem: Fujitsu Technology Solutions Mullins [Radeon R4/R5 Graphics]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
SERR- 
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 35
Region 0: Memory at f800 (64-bit, prefetchable) [size=64M]
Region 2: Memory at fc00 (64-bit, prefetchable) [size=8M]
Region 4: I/O ports at f000 [size=256]
Region 5: Memory at feb0 (32-bit, non-prefetchable) [size=256K]
Expansion ROM at 000c [disabled] [size=128K]
Capabilities: [48] Vendor Specific Information: Len=08 
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] Express (v2) Root Complex Integrated Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0
ExtTag+ RBE+ FLReset-
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr+ FatalErr- UnsupReq+ AuxPwr- TransPend-
DevCap2: Completion Timeout: Not Supported, TimeoutDis- NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix+, 
MaxEETLPPrefixes 1

EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 
10BitTagReq- OBFF Disabled,

AtomicOpsCtl: ReqEn-
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: fee08004  Data: 0023
Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 
Len=010 

Capabilities: [270 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Capabilities: [2b0 v1] Address Translation Service (ATS)
ATSCap: Invalidate Queue Depth: 00
ATSCtl: Enable-, Smallest Translation Unit: 00
Capabilities: [2c0 v1] Page Request Interface (PRI)
PRICtl: Enable- Reset-
PRISta: RF- UPRGI- Stopped+
Page Request Capacity: 0020, Page Request Allocation: 
Capabilities: [2d0 v1] Process Address Space ID (PASID)
PASIDCap: Exec+ Priv+, Max PASID Width: 10
PASIDCtl: Enable- Exec- Priv-
Kernel driver in use: radeon
Kernel modules: radeon, amdgpu

I'm a bit confused. Is this a kernel issue, an Xorg issue or something else?


Well, what is connected to the GPU? E.g. what monitor, projector or 
other display device do you use?


What happens is that only 64MiB is assigned to the device and for some 
reason a process (maybe the compositor?) is trying to do double or 
triple buffering and doesn't have enough memory for that.


So it falls back to single buffering, that might work but will probably 
not look as good and you see tons of error messages.


You most likely can go into the BIOS and reserve more memory to the GPU 
(it will just be stolen from system memory).


Regards,
Christian.



kernel:
Linux debian 6.1.0-13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 
(2023-09-29) x86_64 GNU/Linux


Xorg:
[  33.144]
X.Org  X Server 1.21.1.7
X Protocol Version 11, Revision 0
[    33.144] Current Operating System: Linux debian 6.1.0-13-amd64 #1 
SMP PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29) x86_64
[    33.144] Kernel command line: 
BOOT_IMAGE=/boot/vmlinuz-6.1.0-13-amd64 
root=UUID=4101c568-e073-42f4-8dfe-08373fceb13c ro quiet

[    33.144] xorg-server 2:21.1.7-3 (https://www.debian.org/support)
[    33.144] Current version of pixman: 0.42.2
[    33.144] Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.

I searched bug tracker and could only find this issue: 

[PATCH] drm/amdkfd: Use partial hmm page walk during buffer validation in SVM

2023-12-04 Thread Xiaogang . Chen
From: Xiaogang Chen 

SVM uses an hmm page walk to validate a buffer before mapping it to the gpu
vm. After a partial migration/mapping, do validation on the same vm range the
migration/map covered instead of on the whole svm range, which can be very
large. This change is expected to improve svm code performance.

Signed-off-by: Xiaogang Chen
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 123 +++
 1 file changed, 89 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 2834fb351818..f670d5f6bcdf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -158,13 +158,12 @@ svm_is_valid_dma_mapping_addr(struct device *dev, 
dma_addr_t dma_addr)
 static int
 svm_range_dma_map_dev(struct amdgpu_device *adev, struct svm_range *prange,
  unsigned long offset, unsigned long npages,
- unsigned long *hmm_pfns, uint32_t gpuidx, uint64_t 
*vram_pages)
+ unsigned long *hmm_pfns, uint32_t gpuidx)
 {
enum dma_data_direction dir = DMA_BIDIRECTIONAL;
dma_addr_t *addr = prange->dma_addr[gpuidx];
struct device *dev = adev->dev;
struct page *page;
-   uint64_t vram_pages_dev;
int i, r;
 
if (!addr) {
@@ -174,7 +173,6 @@ svm_range_dma_map_dev(struct amdgpu_device *adev, struct 
svm_range *prange,
prange->dma_addr[gpuidx] = addr;
}
 
-   vram_pages_dev = 0;
addr += offset;
for (i = 0; i < npages; i++) {
if (svm_is_valid_dma_mapping_addr(dev, addr[i]))
@@ -184,7 +182,6 @@ svm_range_dma_map_dev(struct amdgpu_device *adev, struct 
svm_range *prange,
if (is_zone_device_page(page)) {
struct amdgpu_device *bo_adev = 
prange->svm_bo->node->adev;
 
-   vram_pages_dev++;
addr[i] = (hmm_pfns[i] << PAGE_SHIFT) +
   bo_adev->vm_manager.vram_base_offset -
   bo_adev->kfd.pgmap.range.start;
@@ -201,14 +198,14 @@ svm_range_dma_map_dev(struct amdgpu_device *adev, struct 
svm_range *prange,
pr_debug_ratelimited("dma mapping 0x%llx for page addr 0x%lx\n",
 addr[i] >> PAGE_SHIFT, page_to_pfn(page));
}
-   *vram_pages = vram_pages_dev;
+
return 0;
 }
 
 static int
 svm_range_dma_map(struct svm_range *prange, unsigned long *bitmap,
  unsigned long offset, unsigned long npages,
- unsigned long *hmm_pfns, uint64_t *vram_pages)
+ unsigned long *hmm_pfns)
 {
struct kfd_process *p;
uint32_t gpuidx;
@@ -227,7 +224,7 @@ svm_range_dma_map(struct svm_range *prange, unsigned long 
*bitmap,
}
 
r = svm_range_dma_map_dev(pdd->dev->adev, prange, offset, 
npages,
- hmm_pfns, gpuidx, vram_pages);
+ hmm_pfns, gpuidx);
if (r)
break;
}
@@ -982,11 +979,6 @@ svm_range_split_nodes(struct svm_range *new, struct 
svm_range *old,
new->svm_bo = svm_range_bo_ref(old->svm_bo);
new->ttm_res = old->ttm_res;
 
-   /* set new's vram_pages as old range's now, the acurate vram_pages
-* will be updated during mapping
-*/
-   new->vram_pages = min(old->vram_pages, new->npages);
-
spin_lock(&new->svm_bo->list_lock);
list_add(&new->svm_bo_list, &new->svm_bo->range_list);
spin_unlock(>svm_bo->list_lock);
@@ -1573,7 +1565,6 @@ static int svm_range_validate_and_map(struct mm_struct 
*mm,
struct svm_validate_context *ctx;
unsigned long start, end, addr;
struct kfd_process *p;
-   uint64_t vram_pages;
void *owner;
int32_t idx;
int r = 0;
@@ -1642,15 +1633,13 @@ static int svm_range_validate_and_map(struct mm_struct 
*mm,
}
}
 
-   vram_pages = 0;
-   start = prange->start << PAGE_SHIFT;
-   end = (prange->last + 1) << PAGE_SHIFT;
+   start = map_start << PAGE_SHIFT;
+   end = (map_last + 1) << PAGE_SHIFT;
for (addr = start; !r && addr < end; ) {
struct hmm_range *hmm_range;
unsigned long map_start_vma;
unsigned long map_last_vma;
struct vm_area_struct *vma;
-   uint64_t vram_pages_vma;
unsigned long next = 0;
unsigned long offset;
unsigned long npages;
@@ -1677,13 +1666,11 @@ static int svm_range_validate_and_map(struct mm_struct 
*mm,
}
 
if (!r) {
-   offset = (addr - start) >> PAGE_SHIFT;
+   offset = (addr - (prange->start << PAGE_SHIFT)) >> 
PAGE_SHIFT;
r = svm_range_dma_map(prange, ctx->bitmap, offset, 
npages,
-  

RE: [PATCH 00/47] DC Patches December 1, 2023

2023-12-04 Thread Wheeler, Daniel
[Public]

Hi all,

This week this patchset was tested on the following systems:
* Lenovo ThinkBook T13s Gen4 with AMD Ryzen 5 6600U
* MSI Gaming X Trio RX 6800
* Gigabyte Gaming OC RX 7900 XTX

These systems were tested on the following display/connection types:
* eDP, (1080p 60hz [5650U]) (1920x1200 60hz [6600U]) (2560x1600 
120hz[6600U])
* VGA and DVI (1680x1050 60hz [DP to VGA/DVI, USB-C to VGA/DVI])
* DP/HDMI/USB-C (1440p 170hz, 4k 60hz, 4k 144hz, 4k 240hz [Includes 
USB-C to DP/HDMI adapters])
* Thunderbolt (LG Ultrafine 5k)
* MST (Startech MST14DP123DP [DP to 3x DP] and 2x 4k 60Hz displays)
* DSC (with Cable Matters 101075 [DP to 3x DP] with 3x 4k60 displays, 
and HP Hook G2 with 1 4k60 display)
* USB 4 (Kensington SD5700T and 1x 4k 60Hz display)
* PCON (Club3D CAC-1085 and 1x 4k 144Hz display [at 4k 120HZ, as that 
is the max the adapter supports])

The testing is a mix of automated and manual tests. Manual testing includes 
(but is not limited to):
* Changing display configurations and settings
* Benchmark testing
* Feature testing (Freesync, etc.)

Automated testing includes (but is not limited to):
* Script testing (scripts to automate some of the manual checks)
* IGT testing

The patchset consists of the amd-staging-drm-next branch (Head commit - 
135b1ad12685 drm/amd/display: Promote DAL to 3.2.262) with new patches added on 
top of it.

Tested on Ubuntu 22.04.3, on Wayland and X11, using KDE Plasma and Gnome.


Tested-by: Daniel Wheeler 


Thank you,

Dan Wheeler
Sr. Technologist | AMD
SW Display
--
1 Commerce Valley Dr E, Thornhill, ON L3T 7X6
amd.com

-Original Message-
From: Siqueira, Rodrigo 
Sent: Friday, December 1, 2023 8:25 AM
To: amd-gfx@lists.freedesktop.org
Cc: Wheeler, Daniel ; Wentland, Harry 
; Li, Sun peng (Leo) ; Lakha, 
Bhawanpreet ; Siqueira, Rodrigo 
; Pillai, Aurabindo ; Zhuo, 
Lillian ; Li, Roman ; Lin, Wayne 
; Wang, Chao-kai (Stylon) ; Chiu, 
Solomon ; Gutierrez, Agustin ; 
Zuo, Jerry ; Mahfooz, Hamza 
Subject: [PATCH 00/47] DC Patches December 1, 2023

This DC patchset brings improvements in multiple areas. In summary, we
highlight:

* Enable writeback.
* Add multiple fixes for DML2 and DCN35.
* Introduce small code style adjustments.

Cc: Daniel Wheeler 

Thanks
Siqueira


Alex Hung (12):
  drm/amd/display: Avoid virtual stream encoder if not explicitly
requested
  drm/amd/display: Initialize writeback connector
  drm/amd/display: Check writeback connectors in
create_validate_stream_for_sink
  drm/amd/display: Handle writeback request from userspace
  drm/amd/display: Add writeback enable/disable in dc
  drm/amd/display: Fix writeback_info never got updated
  drm/amd/display: Validate hw_points_num before using it
  drm/amd/display: Fix writeback_info is not removed
  drm/amd/display: Add writeback enable field (wb_enabled)
  drm/amd/display: Setup for mmhubbub3_warmup_mcif with big buffer
  drm/amd/display: Add new set_fc_enable to struct dwbc_funcs
  drm/amd/display: Disable DWB frame capture to emulate oneshot

Alvin Lee (2):
  drm/amd/display: Optimize fast validation cases
  drm/amd/display: Use channel_width = 2 for vram table 3.0

Aric Cyr (1):
  drm/amd/display: 3.2.263

Charlene Liu (4):
  drm/amd/display: initialize all the dpm level's stutter latency
  drm/amd/display: insert drv-pmfw log + rollback to new context
  drm/amd/display: revert removing otg toggle w/a back when no active
display
  drm/amd/display: keep domain24 power on if eDP not exist

Chris Park (1):
  drm/amd/display: Update BIOS FW info table revision

Daniel Miess (1):
  drm/amd/display: Add missing dcn35 RCO registers

Dennis Chan (1):
  drm/amd/display: Fix Replay Desync Error IRQ handler

Dillon Varone (1):
  drm/amd/display: Add dml2 copy functions

George Shen (1):
  drm/amd/display: Skip DPIA-specific DP LL automation flag for non-DPIA
links

Harry Wentland (7):
  drm/amd/display: Skip entire amdgpu_dm build if !CONFIG_DRM_AMD_DC
  drm/amd/display: Create one virtual connector in DC
  drm/amd/display: Skip writeback connector when we get
amdgpu_dm_connector
  drm/amd/display: Return drm_connector from
find_first_crtc_matching_connector
  drm/amd/display: Use drm_connector in create_stream_for_sink
  drm/amd/display: Create amdgpu_dm_wb_connector
  drm/amd/display: Create fake sink and stream for writeback connector

Ilya Bakoulin (1):
  drm/amd/display: Fix MST PBN/X.Y value calculations

Ivan Lipski (1):
  drm/amd/display: Add monitor patch for specific eDP

Johnson Chen (1):
  drm/amd/display: Fix null pointer

Josip Pavic (1):
  drm/amd/display: Increase scratch buffer size

Krunoslav Kovac (1):
  drm/amd/display: Change dither policy for 10bpc to round

Lewis Huang (1):
  drm/amd/display: Pass pwrseq inst for 

Re: Radeon regression in 6.6 kernel

2023-12-04 Thread Alex Deucher
On Sun, Dec 3, 2023 at 3:40 PM Phillip Susi  wrote:
>
> Alex Deucher  writes:
>
> > Phillip,
> >
> > Can you test this patch?  I was not able to repro the issue on the
> > navi2x card I had handy, but I think it should fix it.
>
> I pulled -rc4 and it still had the problem.  I applied this patch, and
> yes, it fixed it!
>

Great!  I'll include it this week.

Alex


[PATCH] drm/amdgpu: fix buffer funcs setting order on suspend

2023-12-04 Thread Alex Deucher
We need to disable this after the last eviction
call, but before we disable the SDMA IP.

Fixes: b70438004a14 ("drm/amdgpu: move buffer funcs setting up a level")
Link: https://lore.kernel.org/r/87edgv4x3i@vps.thesusis.net
Signed-off-by: Alex Deucher 
Cc: Phillip Susi 
Cc: Luben Tuikov 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index f29d0faf956e..b76ec5ec04c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4593,8 +4593,6 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
fbcon)
 
amdgpu_ras_suspend(adev);
 
-   amdgpu_ttm_set_buffer_funcs_status(adev, false);
-
amdgpu_device_ip_suspend_phase1(adev);
 
if (!adev->in_s0ix)
@@ -4604,6 +4602,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
fbcon)
if (r)
return r;
 
+   amdgpu_ttm_set_buffer_funcs_status(adev, false);
+
amdgpu_fence_driver_hw_fini(adev);
 
amdgpu_device_ip_suspend_phase2(adev);
-- 
2.42.0



Re: Re: [PATCH] drm/radeon/dpm: fix a memleak in sumo_parse_power_table

2023-12-04 Thread alexious
> Am 03.12.23 um 18:16 schrieb Zhipeng Lu:
> > The rdev->pm.dpm.ps allocated by kcalloc should be freed in every
> > following error-handling path. However, in the error-handling of
> > rdev->pm.power_state[i].clock_info the rdev->pm.dpm.ps is not freed,
> > resulting in a memleak in this function.
> >
> > Fixes: 80ea2c129c76 ("drm/radeon/kms: add dpm support for sumo asics (v2)")
> > Signed-off-by: Zhipeng Lu 
> > ---
> >   drivers/gpu/drm/radeon/sumo_dpm.c | 1 +
> >   1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/radeon/sumo_dpm.c 
> > b/drivers/gpu/drm/radeon/sumo_dpm.c
> > index f74f381af05f..bde640053708 100644
> > --- a/drivers/gpu/drm/radeon/sumo_dpm.c
> > +++ b/drivers/gpu/drm/radeon/sumo_dpm.c
> > @@ -1494,6 +1494,7 @@ static int sumo_parse_power_table(struct 
> > radeon_device *rdev)
> > non_clock_info = (struct _ATOM_PPLIB_NONCLOCK_INFO *)
> > 	&non_clock_info_array->nonClockInfo[non_clock_array_index];
> > if (!rdev->pm.power_state[i].clock_info)
> > +   kfree(rdev->pm.dpm.ps);
> > return -EINVAL;
> 
> That change is obviously not correct since you now always return -EINVAL.
> 
> You need to at least add {} here.
> 

I'm sorry for my mistake and I'll send a new patch soon.

Regards,
Zhipeng


Re: [RFC PATCH 2/6] mm/gmem: add arch-independent abstraction to track address mapping status

2023-12-04 Thread David Hildenbrand

On 02.12.23 15:50, Pedro Falcato wrote:

On Fri, Dec 1, 2023 at 9:23 AM David Hildenbrand  wrote:


On 28.11.23 13:50, Weixi Zhu wrote:

This patch adds an abstraction layer, struct vm_object, that maintains
per-process virtual-to-physical mapping status stored in struct gm_mapping.
For example, a virtual page may be mapped to a CPU physical page or to a
device physical page. Struct vm_object effectively maintains an
arch-independent page table, which is defined as a "logical page table".
While arch-dependent page table used by a real MMU is named a "physical
page table". The logical page table is useful if Linux core MM is extended
to handle a unified virtual address space with external accelerators using
customized MMUs.


Which raises the question why we are dealing with anonymous memory at
all? Why not go for shmem if you are already only special-casing VMAs
with a MMAP flag right now?

That would maybe avoid having to introduce controversial BSD design
concepts into Linux, that feel like going a step backwards in time to me
and adding *more* MM complexity.



In this patch, struct vm_object utilizes a radix
tree (xarray) to track where a virtual page is mapped to. This adds extra
memory consumption from xarray, but provides a nice abstraction to isolate
mapping status from the machine-dependent layer (PTEs). Besides supporting
accelerators with external MMUs, struct vm_object is planned to further
union with i_pages in struct address_mapping for file-backed memory.


A file already has a tree structure (pagecache) to manage the pages that
are theoretically mapped. It's easy to translate from a VMA to a page
inside that tree structure that is currently not present in page tables.

Why the need for that tree structure if you can just remove anon memory
from the picture?



The idea of struct vm_object is originated from FreeBSD VM design, which
provides a unified abstraction for anonymous memory, file-backed memory,
page cache and etc[1].


:/


Currently, Linux utilizes a set of hierarchical page walk functions to
abstract page table manipulations of different CPU architecture. The
problem happens when a device wants to reuse Linux MM code to manage its
page table -- the device page table may not be accessible to the CPU.
Existing solution like Linux HMM utilizes the MMU notifier mechanisms to
invoke device-specific MMU functions, but relies on encoding the mapping
status on the CPU page table entries. This entangles machine-independent
code with machine-dependent code, and also brings unnecessary restrictions.


Why? we have primitives to walk arch page tables in a non-arch specific
fashion and are using them all over the place.

We even have various mechanisms to map something into the page tables
and get the CPU to fault on it, as if it is inaccessible (PROT_NONE as
used for NUMA balancing, fake swap entries).


The PTE size and format vary arch by arch, which harms the extensibility.


Not really.

We might have some features limited to some architectures because of the
lack of PTE bits. And usually the problem is that people don't care
enough about enabling these features on older architectures.

If we ever *really* need more space for sw-defined data, it would be
possible to allocate auxiliary data for page tables only where required
(where the features apply), instead of crafting a completely new,
auxiliary datastructure with it's own locking.

So far it was not required to enable the feature we need on the
architectures we care about.



[1] https://docs.freebsd.org/en/articles/vm-design/


In the cover letter you have:

"The future plan of logical page table is to provide a generic
abstraction layer that support common anonymous memory (I am looking at
you, transparent huge pages) and file-backed memory."

Which I doubt will happen; there is little interest in making anonymous
memory management slower, more serialized, and wasting more memory on
metadata.


Also worth noting that:

1) Mach VM (which FreeBSD inherited, from the old BSD) vm_objects
aren't quite what's being stated here, rather they are somewhat
replacements for both anon_vma and address_space[1]. Very similarly to
Linux, they take pages from vm_objects and map them in page tables
using pmap (the big difference is anon memory, which has its
bookkeeping in page tables, on Linux)

2) These vm_objects were a horrendous mistake (see CoW chaining) and
FreeBSD has to go to horrendous lengths to make them tolerable. The
UVM paper/dissertation (by Charles Cranor) talks about these issues at
length, and 20 years later it's still true.

3) Despite Linux MM having its warts, it's probably correct to
consider it a solid improvement over FreeBSD MM or NetBSD UVM

And, finally, randomly tacking on core MM concepts from other systems
is at best a *really weird* idea. Particularly when they aren't even
what was stated!


Can you read my mind? :) thanks for noting all that, with which I 100% 
agree.


--
Cheers,

David / dhildenb
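[Editor's note: for readers following the vm_object discussion above, the tracking the patch describes -- an xarray indexed by virtual page number, storing a gm_mapping that records where the page currently lives -- would look roughly like the pseudocode sketch below. All names are assumed from the patch, not working code:]

```c
/* Pseudocode sketch (names assumed from the RFC): one xarray per
 * vm_object, indexed by virtual page number, holding a gm_mapping
 * that records whether the page sits in CPU or device memory.
 */
struct vm_object {
	struct xarray logical_page_table;	/* index: va >> PAGE_SHIFT */
};

static struct gm_mapping *vm_object_lookup(struct vm_object *obj,
					   unsigned long va)
{
	return xa_load(&obj->logical_page_table, va >> PAGE_SHIFT);
}

static int vm_object_track(struct vm_object *obj, unsigned long va,
			   struct gm_mapping *map)
{
	return xa_err(xa_store(&obj->logical_page_table,
			       va >> PAGE_SHIFT, map, GFP_KERNEL));
}
```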



[bug report] drm/amdgpu: Workaround to skip kiq ring test during ras gpu recovery

2023-12-04 Thread Dan Carpenter
Hello Stanley.Yang,

The patch b1338a8e71ac: "drm/amdgpu: Workaround to skip kiq ring test
during ras gpu recovery" from Oct 17, 2023 (linux-next), leads to the
following Smatch static checker warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c:604 amdgpu_get_xgmi_hive()
warn: sleeping in atomic context

drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
591 struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device 
*adev)
592 {
593 struct amdgpu_hive_info *hive = NULL;
594 int ret;
595 
596 if (!adev->gmc.xgmi.hive_id)
597 return NULL;
598 
599 if (adev->hive) {
600 kobject_get(&adev->hive->kobj);
601 return adev->hive;
602 }
603 
--> 604 mutex_lock(&xgmi_mutex);
^^^
Shhh  The mutexes are sleeping.

605 
606 list_for_each_entry(hive, &xgmi_hive_list, node)  {

The caller is amdgpu_gfx_disable_kcq():

drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
   516  spin_lock(&kiq->ring_lock);
^^^
Holding a spin lock.

   517  if (amdgpu_ring_alloc(kiq_ring, kiq->pmf->unmap_queues_size *
   518  adev->gfx.num_compute_rings)) {
   519  spin_unlock(&kiq->ring_lock);
   520  return -ENOMEM;
   521  }
   522  
   523  for (i = 0; i < adev->gfx.num_compute_rings; i++) {
   524  j = i + xcc_id * adev->gfx.num_compute_rings;
   525  kiq->pmf->kiq_unmap_queues(kiq_ring,
   526 &adev->gfx.compute_ring[j],
   527 RESET_QUEUES, 0, 0);
   528  }
   529  
   530  /**
   531   * This is workaround: only skip kiq_ring test
   532   * during ras recovery in suspend stage for gfx9.4.3
   533   */
   534  hive = amdgpu_get_xgmi_hive(adev);
   ^^
Can't call a sleeping function when holding a spin_lock.

   535  if (hive) {
   536  hive_ras_recovery = atomic_read(&hive->ras_recovery);
   537  amdgpu_put_xgmi_hive(hive);
   538  }
   539  
   540  ras = amdgpu_ras_get_context(adev);
   541  if ((amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 
3)) &&
   542  ras && (atomic_read(&ras->in_recovery) || 
hive_ras_recovery)) {
   543  spin_unlock(&kiq->ring_lock);
   544  return 0;
   545  }
   546  
   547  if (kiq_ring->sched.ready && !adev->job_hang)
   548  r = amdgpu_ring_test_helper(kiq_ring);
   549  spin_unlock(&kiq->ring_lock);

regards,
dan carpenter
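[Editor's note: one conventional way to clear this class of warning -- shown only as an untested pseudocode sketch against the quoted code, not a proposed patch -- is to do the sleeping hive lookup before entering the spinlock-protected section:]

```c
/* Untested sketch: hoist the sleeping call out of the critical section. */
	hive = amdgpu_get_xgmi_hive(adev);	/* may sleep (mutex_lock) */
	if (hive) {
		hive_ras_recovery = atomic_read(&hive->ras_recovery);
		amdgpu_put_xgmi_hive(hive);
	}

	spin_lock(&kiq->ring_lock);		/* atomic context from here on */
	/* ... ring allocation, kiq_unmap_queues, ring test ... */
	spin_unlock(&kiq->ring_lock);
```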


Re: [PATCH v14 2/9] platform/x86/amd: Add support for AMD ACPI based Wifi band RFI mitigation feature

2023-12-04 Thread Hans de Goede
Hi,

On 11/29/23 10:13, Ma Jun wrote:
> Due to electrical and mechanical constraints in certain platform designs
> there may be likely interference of relatively high-powered harmonics of
> the (G-)DDR memory clocks with local radio module frequency bands used
> by Wifi 6/6e/7.
> 
> To mitigate this, AMD has introduced a mechanism that devices can use to
> notify active use of particular frequencies so that other devices can make
> relative internal adjustments as necessary to avoid this resonance.
> 
> Co-developed-by: Evan Quan 
> Signed-off-by: Evan Quan 
> Signed-off-by: Ma Jun 
> 
> --
> v11:
>  - fix typo(Simon)
> v12:
>  - Fix the code logic (Rafael)
>  - Move amd_wbrf.c to drivers/platform/x86/amd/wbrf.c
>  - Updated Evan's email because he's no longer at AMD.Thanks
> for his work in earlier versions.
> v13:
>  - Fix the format issue (IIpo Jarvinen)
>  - Add comment for some functions
> v14:
>  - Use the apci_check_dsm and acpi_evaluate_dsm (Hans de Goede)

Thank you this is much better.

I notice that the #define ACPI_AMD_WBRF_METHOD  "\\WBRF"
still exists though and that this is still used in
static bool acpi_amd_wbrf_supported_system(void).

I think it might be better to just remove
these 2 altogether.

Checking if a DSM with the expected GUID is present
and if that has the correct bits set in its supported
mask should be enough.

And on future systems the implementer may decide to
not have a WBRF helper function at all and instead
handle everything in the _DSM method.

So the "\\WBRF" check seems to be checking for
what really is an implementation detail.

2 other very small remarks:

> +/**
> + * acpi_amd_wbrf_supported_producer - determine if the WBRF can be enabled
> + *for the device as a producer
> + *
> + * @dev: device pointer
> + *
> + * Check if the platform equipped with necessary implementations to
> + * support WBRF for the device as a producer.
> + *
> + * Return:
> + * true if WBRF is supported, otherwise returns false
> + */
> +bool acpi_amd_wbrf_supported_producer(struct device *dev)
> +{
> + struct acpi_device *adev;
> +
> + adev = ACPI_COMPANION(dev);
> + if (!adev)
> + return false;
> +
> + if (!acpi_amd_wbrf_supported_system())
> + return false;
> +
> +
> + return acpi_check_dsm(adev->handle, &wifi_acpi_dsm_guid,
> +   WBRF_REVISION, BIT(WBRF_RECORD));
> +}
> +EXPORT_SYMBOL_GPL(acpi_amd_wbrf_supported_producer);

Please don't use double empty lines, one empty line to separate things
is enough.

> +
> +/**
> + * acpi_amd_wbrf_supported_consumer - determine if the WBRF can be enabled
> + *for the device as a consumer
> + *
> + * @dev: device pointer
> + *
> + * Determine if the platform equipped with necessary implementations to
> + * support WBRF for the device as a consumer.
> + *
> + * Return:
> + * true if WBRF is supported, otherwise returns false.
> + */
> +bool acpi_amd_wbrf_supported_consumer(struct device *dev)
> +{
> + struct acpi_device *adev;
> +
> + adev = ACPI_COMPANION(dev);
> + if (!adev)
> + return false;
> +
> + if (!acpi_amd_wbrf_supported_system())
> + return false;
> +
> + return acpi_check_dsm(adev->handle, &wifi_acpi_dsm_guid,
> +   WBRF_REVISION, BIT(WBRF_RETRIEVE));
> +}
> +EXPORT_SYMBOL_GPL(acpi_amd_wbrf_supported_consumer);
> +
> +/**
> + * amd_wbrf_retrieve_freq_band - retrieve current active frequency
> + * bands

You may go a bit over the 80 chars limit, please just make this
a single line:

 * amd_wbrf_retrieve_freq_band - retrieve current active frequency bands

> + *
> + * @dev: device pointer
> + * @out: output structure containing all the active frequency bands
> + *
> + * Retrieve the current active frequency bands which were broadcasted
> + * by other producers. The consumer who calls this API should take
> + * proper actions if any of the frequency band may cause RFI with its
> + * own frequency band used.
> + *
> + * Return:
> + * 0 for getting wifi freq band successfully.
> + * Returns a negative error code for failure.
> + */

Regards,

Hans
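[Editor's note: the check Hans suggests could collapse to just the _DSM probe. A pseudocode sketch, with the GUID and bit names assumed from the patch under review rather than verified against the merged driver:]

```c
/* Sketch of the suggested simplification: rely solely on the _DSM
 * GUID plus its supported-functions mask, with no separate "\\WBRF"
 * method lookup.
 */
bool acpi_amd_wbrf_supported_producer(struct device *dev)
{
	struct acpi_device *adev = ACPI_COMPANION(dev);

	if (!adev)
		return false;

	/* acpi_check_dsm() already confirms both that a _DSM with this
	 * GUID exists and that the WBRF_RECORD function bit is set in
	 * its supported mask.
	 */
	return acpi_check_dsm(adev->handle, &wifi_acpi_dsm_guid,
			      WBRF_REVISION, BIT(WBRF_RECORD));
}
```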



[PATCH] [v2] drm/radeon/trinity_dpm: fix a memleak in trinity_parse_power_table

2023-12-04 Thread Zhipeng Lu
The rdev->pm.dpm.ps allocated by kcalloc should be freed in every
following error-handling path. However, in the error-handling of
rdev->pm.power_state[i].clock_info the rdev->pm.dpm.ps is not freed,
resulting in a memleak in this function.

Fixes: d70229f70447 ("drm/radeon/kms: add dpm support for trinity asics")
Signed-off-by: Zhipeng Lu 
---

Changelog:

v2: Adding {} to correct the if statement.
---
 drivers/gpu/drm/radeon/trinity_dpm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/trinity_dpm.c 
b/drivers/gpu/drm/radeon/trinity_dpm.c
index 08ea1c864cb2..ef1cc7bad20a 100644
--- a/drivers/gpu/drm/radeon/trinity_dpm.c
+++ b/drivers/gpu/drm/radeon/trinity_dpm.c
@@ -1726,8 +1726,10 @@ static int trinity_parse_power_table(struct 
radeon_device *rdev)
non_clock_array_index = power_state->v2.nonClockInfoIndex;
non_clock_info = (struct _ATOM_PPLIB_NONCLOCK_INFO *)

_clock_info_array->nonClockInfo[non_clock_array_index];
-   if (!rdev->pm.power_state[i].clock_info)
+   if (!rdev->pm.power_state[i].clock_info) {
+   kfree(rdev->pm.dpm.ps);
return -EINVAL;
+   }
ps = kzalloc(sizeof(struct sumo_ps), GFP_KERNEL);
if (ps == NULL) {
kfree(rdev->pm.dpm.ps);
-- 
2.34.1



[PATCH] [v2] drm/radeon/dpm: fix a memleak in sumo_parse_power_table

2023-12-04 Thread Zhipeng Lu
The rdev->pm.dpm.ps allocated by kcalloc should be freed in every
following error-handling path. However, in the error-handling of
rdev->pm.power_state[i].clock_info the rdev->pm.dpm.ps is not freed,
resulting in a memleak in this function.

Fixes: 80ea2c129c76 ("drm/radeon/kms: add dpm support for sumo asics (v2)")
Signed-off-by: Zhipeng Lu 
---

Changelog:

v2: Adding {} to make if statement correct.
---
 drivers/gpu/drm/radeon/sumo_dpm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/sumo_dpm.c 
b/drivers/gpu/drm/radeon/sumo_dpm.c
index f74f381af05f..d49c145db437 100644
--- a/drivers/gpu/drm/radeon/sumo_dpm.c
+++ b/drivers/gpu/drm/radeon/sumo_dpm.c
@@ -1493,8 +1493,10 @@ static int sumo_parse_power_table(struct radeon_device 
*rdev)
non_clock_array_index = power_state->v2.nonClockInfoIndex;
non_clock_info = (struct _ATOM_PPLIB_NONCLOCK_INFO *)
	&non_clock_info_array->nonClockInfo[non_clock_array_index];
-   if (!rdev->pm.power_state[i].clock_info)
+   if (!rdev->pm.power_state[i].clock_info) {
+   kfree(rdev->pm.dpm.ps);
return -EINVAL;
+   }
ps = kzalloc(sizeof(struct sumo_ps), GFP_KERNEL);
if (ps == NULL) {
kfree(rdev->pm.dpm.ps);
-- 
2.34.1



Re: [PATCH] drm/radeon/trinity_dpm: fix a memleak in trinity_parse_power_table

2023-12-04 Thread Amos Jianjun Kong
On Mon, Dec 4, 2023 at 5:55 PM Zhipeng Lu  wrote:
>
> The rdev->pm.dpm.ps allocated by kcalloc should be freed in every
> following error-handling path. However, in the error-handling of
> rdev->pm.power_state[i].clock_info the rdev->pm.dpm.ps is not freed,
> resulting in a memleak in this function.
>
> Fixes: d70229f70447 ("drm/radeon/kms: add dpm support for trinity asics")
> Signed-off-by: Zhipeng Lu 
> ---
>  drivers/gpu/drm/radeon/trinity_dpm.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/radeon/trinity_dpm.c 
> b/drivers/gpu/drm/radeon/trinity_dpm.c
> index 08ea1c864cb2..8bf56fb7b933 100644
> --- a/drivers/gpu/drm/radeon/trinity_dpm.c
> +++ b/drivers/gpu/drm/radeon/trinity_dpm.c
> @@ -1727,6 +1727,7 @@ static int trinity_parse_power_table(struct 
> radeon_device *rdev)
> non_clock_info = (struct _ATOM_PPLIB_NONCLOCK_INFO *)
> 	&non_clock_info_array->nonClockInfo[non_clock_array_index];
> if (!rdev->pm.power_state[i].clock_info)
> +   kfree(rdev->pm.dpm.ps);


I just confirmed the memleak problem in the code. But the fix has a
problem: it lacks the braces.

diff --git a/drivers/gpu/drm/radeon/trinity_dpm.c
b/drivers/gpu/drm/radeon/trinity_dpm.c
index 08ea1c864cb2..ef1cc7bad20a 100644
--- a/drivers/gpu/drm/radeon/trinity_dpm.c
+++ b/drivers/gpu/drm/radeon/trinity_dpm.c
@@ -1726,8 +1726,10 @@ static int trinity_parse_power_table(struct
radeon_device *rdev)
non_clock_array_index = power_state->v2.nonClockInfoIndex;
non_clock_info = (struct _ATOM_PPLIB_NONCLOCK_INFO *)
	&non_clock_info_array->nonClockInfo[non_clock_array_index];
-   if (!rdev->pm.power_state[i].clock_info)
+   if (!rdev->pm.power_state[i].clock_info) {
+   kfree(rdev->pm.dpm.ps);
return -EINVAL;
+   }
ps = kzalloc(sizeof(struct sumo_ps), GFP_KERNEL);
if (ps == NULL) {
kfree(rdev->pm.dpm.ps);


> return -EINVAL;
> ps = kzalloc(sizeof(struct sumo_ps), GFP_KERNEL);
> if (ps == NULL) {
> --
> 2.34.1
>


Re: [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-12-04 Thread Alistair Popple


Christian König  writes:

> Am 01.12.23 um 06:48 schrieb Zeng, Oak:
>> [SNIP]

>> Besides memory eviction/oversubscription, there are a few other pain points 
>> when I use hmm:
>>
>> 1) hmm doesn't support file-back memory, so it is hard to share
> memory b/t process in a gpu environment. You mentioned you have a
> plan... How hard is it to support file-backed in your approach?
>
> As hard as it is to support it through HMM. That's what I meant that
> this approach doesn't integrate well, as far as I know the problem
> isn't inside HMM or any other solution but rather in the file system
> layer.

In what way does HMM not support file-backed memory? I was under the
impression that at least hmm_range_fault() does.

 - Alistair
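[Editor's note: for reference, hmm_range_fault() walks the calling process's CPU page tables via struct hmm_range, independent of the VMA's backing. A condensed pseudocode sketch of the retry pattern documented in Documentation/mm/hmm.rst, where take_lock()/release_lock() are that document's placeholders for the driver's own lock:]

```c
/* Condensed pseudocode of the documented hmm_range_fault() pattern;
 * the same loop serves anonymous and file-backed VMAs alike.
 */
struct hmm_range range = {
	.notifier      = &interval_sub,	/* mmu_interval_notifier */
	.start         = start,
	.end           = end,
	.hmm_pfns      = pfns,
	.default_flags = HMM_PFN_REQ_FAULT,
};

again:
	range.notifier_seq = mmu_interval_read_begin(&interval_sub);
	mmap_read_lock(mm);
	ret = hmm_range_fault(&range);
	mmap_read_unlock(mm);
	if (ret) {
		if (ret == -EBUSY)
			goto again;	/* collided with an invalidation */
		return ret;
	}

	take_lock(driver->update);	/* driver's page-table lock */
	if (mmu_interval_read_retry(&interval_sub, range.notifier_seq)) {
		release_lock(driver->update);
		goto again;
	}
	/* Safe to program device page tables from range.hmm_pfns here. */
	release_lock(driver->update);
```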

> Regards,
> Christian.
>
>> 2)virtual address range based memory attribute/hint: with hmadvise,
> where do you save the memory attribute of a virtual address range? Do
> you need to extend vm_area_struct to save it? With hmm, we have to
> maintain such information at driver. This ends up with pretty
> complicated logic to split/merge those address range. I know core mm
> has similar logic to split/merge vma...
>>
>> Oak
>>
>>
>>> -Weixi
>>>
>>> -Original Message-
>>> From: Christian König
>>> Sent: Thursday, November 30, 2023 4:28 PM
>>> To: Zeng, Oak; Christian König
>>> ; zhuweixi; linux-
>>> m...@kvack.org;linux-ker...@vger.kernel.org;a...@linux-foundation.org;
>>> Danilo Krummrich; Dave Airlie; Daniel
>>> Vetter
>>> Cc:intel-gvt-...@lists.freedesktop.org;rcampb...@nvidia.com;
>>> mhairgr...@nvidia.com;j...@nvidia.com;weixi@openeuler.sh;
>>> jhubb...@nvidia.com;intel-...@lists.freedesktop.org;apop...@nvidia.com;
>>> xinhui@amd.com;amd-gfx@lists.freedesktop.org;
>>> tvrtko.ursu...@linux.intel.com;ogab...@kernel.org;jgli...@redhat.com; dri-
>>> de...@lists.freedesktop.org;z...@nvidia.com; Vivi, Rodrigo
>>> ;alexander.deuc...@amd.com;leo...@nvidia.com;
>>> felix.kuehl...@amd.com; Wang, Zhi A;
>>> mgor...@suse.de
>>> Subject: Re: [RFC PATCH 0/6] Supporting GMEM (generalized memory
>>> management) for external memory devices
>>>
>>> Hi Oak,
>>>
>>> yeah, #4 is indeed a really good point and I think Felix will agree to that 
>>> as well.
>>>
>>> HMM is basically still missing a way to advise device attributes for the CPU
>>> address space. Both migration strategy as well as device specific 
>>> information (like
>>> cache preferences) fall into this category.
>>>
>>> Since there is a device specific component in those attributes as well I 
>>> think
>>> device specific IOCTLs still make sense to update them, but HMM should offer
>>> the functionality to manage and store those information.
>>>
>>> Split and merge of VMAs only become a problem if you attach those 
>>> information
>>> to VMAs, if you keep them completely separate than that doesn't become an
>>> issue either. The down side of this approach is that you don't get 
>>> automatically
>>> extending attribute ranges for growing VMAs for example.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 29.11.23 um 23:23 schrieb Zeng, Oak:
 Hi Weixi,

 Even though Christian has listed reasons rejecting this proposal (yes they 
 are
>>> very reasonable to me), I would open my mind and further explore the 
>>> possibility
>>> here. Since the current GPU driver uses a hmm based implementation (AMD and
>>> NV has done this; At Intel we are catching up), I want to explore how much 
>>> we
>>> can benefit from the proposed approach and how your approach can solve some
>>> pain points of our development. So basically what I am questioning here is: 
>>> what
>>> is the advantage of your approach against hmm.
 To implement a UVM (unified virtual address space b/t cpu and gpu device),
>>> with hmm, driver essentially need to implement below functions:
 1. device page table update. Your approach requires the same because
 this is device specific codes

 2. Some migration functions to migrate memory b/t system memory and GPU
>>> local memory. My understanding is, even though you generalized this a bit, 
>>> such
>>> as modified cpu page fault path, provided "general" gm_dev_fault handler... 
>>> but
>>> device driver still need to provide migration functions because migration
>>> functions have to be device specific (i.e., using device dma/copy engine for
>>> performance purpose). Right?
 3. GPU physical memory management, this part is now in drm/buddy, shared
>>> by all drivers. I think with your approach, driver still need to provide 
>>> callback
>>> functions to allocate/free physical pages. Right? Or do you let linux core 
>>> mm
>>> buddy manage device memory directly?
>>>> 4. madvise/hints/virtual address range management. This has been a
>>>> pain point for us. Right now the device driver has to maintain
>>>> certain virtual address range data structures to hold hints and other
>>>> virtual-address-range-based memory attributes. The driver needs to
>>>> sync with the Linux vma. The driver needs to explicitly

radeon flooding "flip queue failed" in Xorg and system log

2023-12-04 Thread smesgr
Hi,

I have updated my old Fujitsu S920 with an AMD GX-424CC from Ubuntu 22.04 LTS
to Debian 12. The system worked fine before; after the installation completed,
my logs are flooded with the following error messages:

dmesg:
[ 967.869183] radeon :00:01.0: 3ba3c813 pin failed
[ 967.869212] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* failed to pin 
new rbo buffer before flip
[ 967.886254] radeon :00:01.0: 51be2216 pin failed
[ 967.886282] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* failed to pin 
new rbo buffer before flip
[ 967.907984] radeon :00:01.0: 3ba3c813 pin failed
[ 967.908014] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* failed to pin 
new rbo buffer before flip
[ 967.925551] radeon :00:01.0: 51be2216 pin failed
[ 967.925580] [drm:radeon_crtc_page_flip_target [radeon]] *ERROR* failed to pin
new rbo buffer before flip

Xorg.log
[ 47.777] (WW) RADEON(0): flip queue failed: Invalid argument
[ 47.777] (WW) RADEON(0): Page flip failed: Invalid argument
[ 47.777] (EE) RADEON(0): present flip failed

lspci:
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] 
Mullins [Radeon R4/R5 Graphics] (rev 01) (prog-if 00 [VGA controller])
Subsystem: Fujitsu Technology Solutions Mullins [Radeon R4/R5 Graphics]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- 
SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1+,D2+,D3hot+,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] Express (v2) Root Complex Integrated Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0
ExtTag+ RBE+ FLReset-
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr+ FatalErr- UnsupReq+ AuxPwr- TransPend-
DevCap2: Completion Timeout: Not Supported, TimeoutDis- NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt+ EETLPPrefix+, 
MaxEETLPPrefixes 1
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF 
Disabled,
AtomicOpsCtl: ReqEn-
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: fee08004 Data: 0023
Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 
Capabilities: [270 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Capabilities: [2b0 v1] Address Translation Service (ATS)
ATSCap: Invalidate Queue Depth: 00
ATSCtl: Enable-, Smallest Translation Unit: 00
Capabilities: [2c0 v1] Page Request Interface (PRI)
PRICtl: Enable- Reset-
PRISta: RF- UPRGI- Stopped+
Page Request Capacity: 0020, Page Request Allocation: 
Capabilities: [2d0 v1] Process Address Space ID (PASID)
PASIDCap: Exec+ Priv+, Max PASID Width: 10
PASIDCtl: Enable- Exec- Priv-
Kernel driver in use: radeon
Kernel modules: radeon, amdgpu

I'm a bit confused. Is this a kernel issue, an Xorg issue, or something else?

kernel:
Linux debian 6.1.0-13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29) 
x86_64 GNU/Linux

Xorg:
[ 33.144]
X.Org X Server 1.21.1.7
X Protocol Version 11, Revision 0
[ 33.144] Current Operating System: Linux debian 6.1.0-13-amd64 #1 SMP 
PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29) x86_64
[ 33.144] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.1.0-13-amd64 
root=UUID=4101c568-e073-42f4-8dfe-08373fceb13c ro quiet
[ 33.144] xorg-server 2:21.1.7-3 (https://www.debian.org/support)
[ 33.144] Current version of pixman: 0.42.2
[ 33.144] Before reporting problems, check http://wiki.x.org to make sure that 
you have the latest version.

I searched the bug tracker and could only find this issue:
https://gitlab.freedesktop.org/xorg/driver/xf86-video-ati/-/issues/180, but I
don't have a black screen. The system is rather sluggish but seems to work.

Any hints on where to go next are much appreciated.

Best Regards

Re: [RFC PATCH 2/6] mm/gmem: add arch-independent abstraction to track address mapping status

2023-12-04 Thread Pedro Falcato
On Fri, Dec 1, 2023 at 9:23 AM David Hildenbrand  wrote:
>
> On 28.11.23 13:50, Weixi Zhu wrote:
> > This patch adds an abstraction layer, struct vm_object, that maintains
> > per-process virtual-to-physical mapping status stored in struct gm_mapping.
> > For example, a virtual page may be mapped to a CPU physical page or to a
> > device physical page. Struct vm_object effectively maintains an
> > arch-independent page table, which is defined as a "logical page table".
> > While arch-dependent page table used by a real MMU is named a "physical
> > page table". The logical page table is useful if Linux core MM is extended
> > to handle a unified virtual address space with external accelerators using
> > customized MMUs.
>
> Which raises the question why we are dealing with anonymous memory at
> all? Why not go for shmem if you are already only special-casing VMAs
> with a MMAP flag right now?
>
> That would maybe avoid having to introduce controversial BSD design
> concepts into Linux, that feel like going a step backwards in time to me
> and adding *more* MM complexity.
>
> >
> > In this patch, struct vm_object utilizes a radix
> > tree (xarray) to track where a virtual page is mapped to. This adds extra
> > memory consumption from xarray, but provides a nice abstraction to isolate
> > mapping status from the machine-dependent layer (PTEs). Besides supporting
> > accelerators with external MMUs, struct vm_object is planned to further
> > union with i_pages in struct address_mapping for file-backed memory.
>
> A file already has a tree structure (pagecache) to manage the pages that
> are theoretically mapped. It's easy to translate from a VMA to a page
> inside that tree structure that is currently not present in page tables.
>
> Why the need for that tree structure if you can just remove anon memory
> from the picture?
>
> >
> > The idea of struct vm_object is originated from FreeBSD VM design, which
> > provides a unified abstraction for anonymous memory, file-backed memory,
> > page cache and etc[1].
>
> :/
>
> > Currently, Linux utilizes a set of hierarchical page walk functions to
> > abstract page table manipulations of different CPU architecture. The
> > problem happens when a device wants to reuse Linux MM code to manage its
> > page table -- the device page table may not be accessible to the CPU.
> > Existing solution like Linux HMM utilizes the MMU notifier mechanisms to
> > invoke device-specific MMU functions, but relies on encoding the mapping
> > status on the CPU page table entries. This entangles machine-independent
> > code with machine-dependent code, and also brings unnecessary restrictions.
>
> Why? we have primitives to walk arch page tables in a non-arch specific
> fashion and are using them all over the place.
>
> We even have various mechanisms to map something into the page tables
> and get the CPU to fault on it, as if it is inaccessible (PROT_NONE as
> used for NUMA balancing, fake swap entries).
>
> > The PTE size and format vary arch by arch, which harms the extensibility.
>
> Not really.
>
> We might have some features limited to some architectures because of the
> lack of PTE bits. And usually the problem is that people don't care
> enough about enabling these features on older architectures.
>
> If we ever *really* need more space for sw-defined data, it would be
> possible to allocate auxiliary data for page tables only where required
> (where the features apply), instead of crafting a completely new,
> auxiliary datastructure with it's own locking.
>
> So far it was not required to enable the feature we need on the
> architectures we care about.
>
> >
> > [1] https://docs.freebsd.org/en/articles/vm-design/
>
> In the cover letter you have:
>
> "The future plan of logical page table is to provide a generic
> abstraction layer that support common anonymous memory (I am looking at
> you, transparent huge pages) and file-backed memory."
>
> Which I doubt will happen; there is little interest in making anonymous
> memory management slower, more serialized, and wasting more memory on
> metadata.

Also worth noting that:

1) Mach VM (which FreeBSD inherited, from the old BSD) vm_objects
aren't quite what's being stated here, rather they are somewhat
replacements for both anon_vma and address_space[1]. Very similarly to
Linux, they take pages from vm_objects and map them in page tables
using pmap (the big difference is anon memory, which has its
bookkeeping in page tables, on Linux)

2) These vm_objects were a horrendous mistake (see CoW chaining) and
FreeBSD has to go to horrendous lengths to make them tolerable. The
UVM paper/dissertation (by Charles Cranor) talks about these issues at
length, and 20 years later it's still true.

3) Despite Linux MM having its warts, it's probably correct to
consider it a solid improvement over FreeBSD MM or NetBSD UVM

And, finally, randomly tacking on core MM concepts from other systems
is at best a *really weird* idea. Particularly when they 

[PATCH] drm/radeon/dpm: fix a memleak in sumo_parse_power_table

2023-12-04 Thread Zhipeng Lu
The rdev->pm.dpm.ps array allocated by kcalloc should be freed in every
subsequent error-handling path. However, in the error handling for a
missing rdev->pm.power_state[i].clock_info, rdev->pm.dpm.ps is not freed,
resulting in a memory leak in this function.

Fixes: 80ea2c129c76 ("drm/radeon/kms: add dpm support for sumo asics (v2)")
Signed-off-by: Zhipeng Lu 
---
 drivers/gpu/drm/radeon/sumo_dpm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/sumo_dpm.c b/drivers/gpu/drm/radeon/sumo_dpm.c
index f74f381af05f..bde640053708 100644
--- a/drivers/gpu/drm/radeon/sumo_dpm.c
+++ b/drivers/gpu/drm/radeon/sumo_dpm.c
@@ -1494,6 +1494,8 @@ static int sumo_parse_power_table(struct radeon_device *rdev)
 		non_clock_info = (struct _ATOM_PPLIB_NONCLOCK_INFO *)
 			&non_clock_info_array->nonClockInfo[non_clock_array_index];
-		if (!rdev->pm.power_state[i].clock_info)
+		if (!rdev->pm.power_state[i].clock_info) {
+			kfree(rdev->pm.dpm.ps);
 			return -EINVAL;
+		}
 		ps = kzalloc(sizeof(struct sumo_ps), GFP_KERNEL);
 		if (ps == NULL) {
-- 
2.34.1
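For readers less familiar with kernel error-handling conventions, the rule both of these fixes enforce — every exit taken after an allocation succeeds must release that allocation — is commonly written as a goto-unwind ladder rather than per-site kfree() calls. Below is a minimal userspace sketch of the pattern; the struct and function names are illustrative, not the radeon code:

```c
#include <stdlib.h>

/*
 * Illustrative sketch of the kernel's goto-unwind error-handling pattern
 * (hypothetical names): any failure after the first allocation jumps to a
 * label that frees everything acquired so far, so nothing leaks.
 */
struct table {
	int *states;
	int *clocks;
};

/* Returns 0 on success; on failure returns -1 and leaks nothing. */
static int parse_table(struct table *t, int nstates, int fail_midway)
{
	t->states = calloc(nstates, sizeof(*t->states));
	if (!t->states)
		return -1;

	if (fail_midway)	/* simulated mid-function error */
		goto err_free_states;

	t->clocks = calloc(nstates, sizeof(*t->clocks));
	if (!t->clocks)
		goto err_free_states;

	return 0;

err_free_states:
	free(t->states);
	t->states = NULL;
	return -1;
}
```

The advantage over freeing at each exit site is that adding a new error return later cannot silently skip a cleanup step.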



[PATCH] drm/radeon/trinity_dpm: fix a memleak in trinity_parse_power_table

2023-12-04 Thread Zhipeng Lu
The rdev->pm.dpm.ps array allocated by kcalloc should be freed in every
subsequent error-handling path. However, in the error handling for a
missing rdev->pm.power_state[i].clock_info, rdev->pm.dpm.ps is not freed,
resulting in a memory leak in this function.

Fixes: d70229f70447 ("drm/radeon/kms: add dpm support for trinity asics")
Signed-off-by: Zhipeng Lu 
---
 drivers/gpu/drm/radeon/trinity_dpm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/radeon/trinity_dpm.c 
b/drivers/gpu/drm/radeon/trinity_dpm.c
index 08ea1c864cb2..8bf56fb7b933 100644
--- a/drivers/gpu/drm/radeon/trinity_dpm.c
+++ b/drivers/gpu/drm/radeon/trinity_dpm.c
@@ -1727,6 +1727,7 @@ static int trinity_parse_power_table(struct radeon_device 
*rdev)
non_clock_info = (struct _ATOM_PPLIB_NONCLOCK_INFO *)

_clock_info_array->nonClockInfo[non_clock_array_index];
if (!rdev->pm.power_state[i].clock_info)
+   kfree(rdev->pm.dpm.ps);
return -EINVAL;
ps = kzalloc(sizeof(struct sumo_ps), GFP_KERNEL);
if (ps == NULL) {
-- 
2.34.1



Re: Radeon regression in 6.6 kernel

2023-12-04 Thread Phillip Susi
Alex Deucher  writes:

> Phillip,
>
> Can you test this patch?  I was not able to repro the issue on the
> navi2x card I had handy, but I think it should fix it.

I pulled -rc4 and it still had the problem.  I applied this patch, and
yes, it fixed it!



Re: [PATCH v2 7/8] drm/simpledrm: Do not include <drm/drm_plane_helper.h>

2023-12-04 Thread Javier Martinez Canillas
Thomas Zimmermann  writes:

> Remove unnecessary include statements for <drm/drm_plane_helper.h>.
> The file contains helpers for non-atomic code and should not be
> required by most drivers. No functional changes.
>
> Signed-off-by: Thomas Zimmermann 
> Reviewed-by: Sui Jingfeng 
> ---

Reviewed-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [RFC PATCH 0/6] Supporting GMEM (generalized memory management) for external memory devices

2023-12-04 Thread Christian König

On 04.12.23 00:32, Alistair Popple wrote:

Christian König  writes:


On 01.12.23 06:48, Zeng, Oak wrote:

[SNIP]
Besides memory eviction/oversubscription, there are a few other pain points 
when I use hmm:

1) hmm doesn't support file-backed memory, so it is hard to share memory
b/t processes in a gpu environment. You mentioned you have a plan... How
hard is it to support file-backed memory in your approach?

As hard as it is to support it through HMM. That's what I meant when I
said this approach doesn't integrate well; as far as I know, the problem
isn't inside HMM or any other solution but rather in the file system
layer.

In what way does HMM not support file-backed memory? I was under the
impression that at least hmm_range_fault() does.


Oh, well file-backed memory is indeed supported by HMM. IIRC KFD 
actually allows this for the SVM implementation.


It's just that the way the file system layer (for example) does 
writeback absolutely doesn't fit well with how GPUs and other 
acceleration devices work.


The general assumption in the kernel seems to be that page faults and
preemption are extremely cheap, so things like copy-on-write are used
quite extensively.


For a CPU this basically means you just need one context switch into the
kernel to get the new address of a page into your PTEs on write, while
for acceleration devices this always requires a complete CPU round trip
for each initial write access to a 4k page. The performance impact is
just horrible.


Regards,
Christian.








  - Alistair


Regards,
Christian.


2) virtual address range based memory attribute/hint: with hmadvise,
where do you save the memory attribute of a virtual address range? Do
you need to extend vm_area_struct to save it? With hmm, we have to
maintain such information in the driver. This ends up with pretty
complicated logic to split/merge those address ranges. I know core mm
has similar logic to split/merge vma...

Oak



-Weixi

-Original Message-
From: Christian König
Sent: Thursday, November 30, 2023 4:28 PM
To: Zeng, Oak; Christian König
; zhuweixi; linux-
m...@kvack.org;linux-ker...@vger.kernel.org;a...@linux-foundation.org;
Danilo Krummrich; Dave Airlie; Daniel
Vetter
Cc:intel-gvt-...@lists.freedesktop.org;rcampb...@nvidia.com;
mhairgr...@nvidia.com;j...@nvidia.com;weixi@openeuler.sh;
jhubb...@nvidia.com;intel-...@lists.freedesktop.org;apop...@nvidia.com;
xinhui@amd.com;amd-gfx@lists.freedesktop.org;
tvrtko.ursu...@linux.intel.com;ogab...@kernel.org;jgli...@redhat.com; dri-
de...@lists.freedesktop.org;z...@nvidia.com; Vivi, Rodrigo
;alexander.deuc...@amd.com;leo...@nvidia.com;
felix.kuehl...@amd.com; Wang, Zhi A;
mgor...@suse.de
Subject: Re: [RFC PATCH 0/6] Supporting GMEM (generalized memory
management) for external memory devices

Hi Oak,

yeah, #4 is indeed a really good point and I think Felix will agree to that as 
well.

HMM is basically still missing a way to advise device attributes for the CPU
address space. Both the migration strategy and device-specific information
(like cache preferences) fall into this category.

Since there is a device-specific component in those attributes as well, I
think device-specific IOCTLs still make sense for updating them, but HMM
should offer the functionality to manage and store that information.

Split and merge of VMAs only become a problem if you attach that information
to VMAs; if you keep them completely separate, then it doesn't become an
issue either. The downside of this approach is that you don't get
automatically extending attribute ranges for growing VMAs, for example.

Regards,
Christian.
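As a concrete illustration of the bookkeeping burden discussed above, here is a toy sketch — illustrative only, not driver code — of one half of that split/merge logic: coalescing adjacent virtual-address ranges that carry equal attributes, the kind of structure a driver must maintain when attribute ranges live outside the VMAs:

```c
/*
 * Toy per-virtual-address-range attribute bookkeeping (hypothetical
 * types). Ranges are half-open [start, end) and kept sorted; touching
 * neighbours with identical attributes are coalesced in place. A real
 * implementation would also split ranges on partial attribute updates.
 */
struct attr_range {
	unsigned long start, end;
	int attr;
};

/* Coalesce touching neighbours with identical attrs; returns new count. */
static int merge_ranges(struct attr_range *r, int n)
{
	int out = 0;

	for (int i = 0; i < n; i++) {
		if (out > 0 && r[out - 1].end == r[i].start &&
		    r[out - 1].attr == r[i].attr)
			r[out - 1].end = r[i].end;	/* extend previous */
		else
			r[out++] = r[i];
	}
	return out;
}
```

Keeping this tree or array consistent with the kernel's own vma split/merge is exactly the pain point Oak describes below.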

On 29.11.23 23:23, Zeng, Oak wrote:

Hi Weixi,

Even though Christian has listed reasons rejecting this proposal (yes, they
are very reasonable to me), I would open my mind and further explore the
possibility here. Since the current GPU driver uses an hmm-based
implementation (AMD and NV have done this; at Intel we are catching up), I
want to explore how much we can benefit from the proposed approach and how
your approach can solve some pain points of our development. So basically
what I am questioning here is: what is the advantage of your approach
against hmm.

To implement a UVM (unified virtual address space b/t cpu and gpu device)
with hmm, the driver essentially needs to implement the functions below:

1. Device page table update. Your approach requires the same because
this is device-specific code.

2. Some migration functions to migrate memory b/t system memory and GPU
local memory. My understanding is, even though you generalized this a bit,
such as the modified cpu page fault path and the provided "general"
gm_dev_fault handler, the device driver still needs to provide migration
functions because migration functions have to be device specific (i.e.,
using the device dma/copy engine for performance purposes). Right?

3. GPU physical memory management; this part is now in drm/buddy, shared
by all drivers. I think with your approach, the driver still needs to
provide

[PATCH v2 7/8] drm/simpledrm: Do not include <drm/drm_plane_helper.h>

2023-12-04 Thread Thomas Zimmermann
Remove unnecessary include statements for <drm/drm_plane_helper.h>.
The file contains helpers for non-atomic code and should not be
required by most drivers. No functional changes.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Sui Jingfeng 
---
 drivers/gpu/drm/tiny/simpledrm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/tiny/simpledrm.c b/drivers/gpu/drm/tiny/simpledrm.c
index 34bbbd7b53dd9..7ce1c46176750 100644
--- a/drivers/gpu/drm/tiny/simpledrm.c
+++ b/drivers/gpu/drm/tiny/simpledrm.c
@@ -25,7 +25,6 @@
 #include 
 #include 
 #include 
-#include <drm/drm_plane_helper.h>
 #include 
 
 #define DRIVER_NAME"simpledrm"
-- 
2.43.0



[PATCH v2 8/8] drm/xlnx: Do not include <drm/drm_plane_helper.h>

2023-12-04 Thread Thomas Zimmermann
Remove unnecessary include statements for <drm/drm_plane_helper.h>.
The file contains helpers for non-atomic code and should not be
required by most drivers. No functional changes.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Laurent Pinchart 
---
 drivers/gpu/drm/xlnx/zynqmp_kms.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/xlnx/zynqmp_kms.c 
b/drivers/gpu/drm/xlnx/zynqmp_kms.c
index a7f8611be6f42..db3bb4afbfc46 100644
--- a/drivers/gpu/drm/xlnx/zynqmp_kms.c
+++ b/drivers/gpu/drm/xlnx/zynqmp_kms.c
@@ -27,7 +27,6 @@
 #include 
 #include 
 #include 
-#include <drm/drm_plane_helper.h>
 #include 
 #include 
 #include 
-- 
2.43.0



[PATCH v2 6/8] drm/ofdrm: Do not include <drm/drm_plane_helper.h>

2023-12-04 Thread Thomas Zimmermann
Remove unnecessary include statements for <drm/drm_plane_helper.h>.
The file contains helpers for non-atomic code and should not be
required by most drivers. No functional changes.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Sui Jingfeng 
---
 drivers/gpu/drm/tiny/ofdrm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/tiny/ofdrm.c b/drivers/gpu/drm/tiny/ofdrm.c
index 05a72473cfc65..ab89b7fc7bf61 100644
--- a/drivers/gpu/drm/tiny/ofdrm.c
+++ b/drivers/gpu/drm/tiny/ofdrm.c
@@ -19,7 +19,6 @@
 #include 
 #include 
 #include 
-#include <drm/drm_plane_helper.h>
 #include 
 #include 
 
-- 
2.43.0



[PATCH v2 5/8] drm/solomon: Do not include <drm/drm_plane_helper.h>

2023-12-04 Thread Thomas Zimmermann
Remove unnecessary include statements for <drm/drm_plane_helper.h>.
The file contains helpers for non-atomic code and should not be
required by most drivers. No functional changes.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Geert Uytterhoeven 
Reviewed-by: Javier Martinez Canillas 
---
 drivers/gpu/drm/solomon/ssd130x.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/solomon/ssd130x.h 
b/drivers/gpu/drm/solomon/ssd130x.h
index acf7cedf0c1ab..075c5c3ee75ac 100644
--- a/drivers/gpu/drm/solomon/ssd130x.h
+++ b/drivers/gpu/drm/solomon/ssd130x.h
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include <drm/drm_plane_helper.h>
 
 #include 
 
-- 
2.43.0



[PATCH v2 0/8] drm/plane-helpers: Minor clean ups

2023-12-04 Thread Thomas Zimmermann
Move drm_plane_helper_atomic_check() into udl, which is the only
driver using it. Remove several unnecessary include statements for
<drm/drm_plane_helper.h>.

v2:
* fix documentation (Sui)

Thomas Zimmermann (8):
  drm/plane-helper: Move drm_plane_helper_atomic_check() into udl
  drm/amdgpu: Do not include <drm/drm_plane_helper.h>
  drm/loongson: Do not include <drm/drm_plane_helper.h>
  drm/shmobile: Do not include <drm/drm_plane_helper.h>
  drm/solomon: Do not include <drm/drm_plane_helper.h>
  drm/ofdrm: Do not include <drm/drm_plane_helper.h>
  drm/simpledrm: Do not include <drm/drm_plane_helper.h>
  drm/xlnx: Do not include <drm/drm_plane_helper.h>

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  1 -
 drivers/gpu/drm/drm_crtc_helper.c |  7 ++--
 drivers/gpu/drm/drm_plane_helper.c| 32 ---
 drivers/gpu/drm/loongson/lsdc_plane.c |  1 -
 .../drm/renesas/shmobile/shmob_drm_plane.c|  1 -
 drivers/gpu/drm/solomon/ssd130x.h |  1 -
 drivers/gpu/drm/tiny/ofdrm.c  |  1 -
 drivers/gpu/drm/tiny/simpledrm.c  |  1 -
 drivers/gpu/drm/udl/udl_modeset.c | 19 +--
 drivers/gpu/drm/xlnx/zynqmp_kms.c |  1 -
 include/drm/drm_plane_helper.h|  2 --
 11 files changed, 19 insertions(+), 48 deletions(-)

-- 
2.43.0



[PATCH v2 2/8] drm/amdgpu: Do not include <drm/drm_plane_helper.h>

2023-12-04 Thread Thomas Zimmermann
Remove unnecessary include statements for <drm/drm_plane_helper.h>.
The file contains helpers for non-atomic code and should not be
required by most drivers. No functional changes.

Signed-off-by: Thomas Zimmermann 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index aa43f1761acd3..b8c3a9b104a41 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -92,7 +92,6 @@
 #include 
 #include 
 #include 
-#include <drm/drm_plane_helper.h>
 
 #include 
 
-- 
2.43.0



[PATCH v2 4/8] drm/shmobile: Do not include <drm/drm_plane_helper.h>

2023-12-04 Thread Thomas Zimmermann
Remove unnecessary include statements for <drm/drm_plane_helper.h>.
The file contains helpers for non-atomic code and should not be
required by most drivers. No functional changes.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Geert Uytterhoeven 
Acked-by: Geert Uytterhoeven 
---
 drivers/gpu/drm/renesas/shmobile/shmob_drm_plane.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/renesas/shmobile/shmob_drm_plane.c 
b/drivers/gpu/drm/renesas/shmobile/shmob_drm_plane.c
index 8f9a728affde8..07ad17d24294d 100644
--- a/drivers/gpu/drm/renesas/shmobile/shmob_drm_plane.c
+++ b/drivers/gpu/drm/renesas/shmobile/shmob_drm_plane.c
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include <drm/drm_plane_helper.h>
 
 #include "shmob_drm_drv.h"
 #include "shmob_drm_kms.h"
-- 
2.43.0



[PATCH v2 1/8] drm/plane-helper: Move drm_plane_helper_atomic_check() into udl

2023-12-04 Thread Thomas Zimmermann
The udl driver is the only caller of drm_plane_helper_atomic_check().
Move the function into the driver. No functional changes.

v2:
* fix documentation (Sui)

Signed-off-by: Thomas Zimmermann 
Acked-by: Sui Jingfeng 
---
 drivers/gpu/drm/drm_crtc_helper.c  |  7 ++-
 drivers/gpu/drm/drm_plane_helper.c | 32 --
 drivers/gpu/drm/udl/udl_modeset.c  | 19 --
 include/drm/drm_plane_helper.h |  2 --
 4 files changed, 19 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/drm_crtc_helper.c 
b/drivers/gpu/drm/drm_crtc_helper.c
index a209659a996c7..2dafc39a27cb9 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -439,11 +439,8 @@ EXPORT_SYMBOL(drm_crtc_helper_set_mode);
  * @state: atomic state object
  *
  * Provides a default CRTC-state check handler for CRTCs that only have
- * one primary plane attached to it.
- *
- * This is often the case for the CRTC of simple framebuffers. See also
- * drm_plane_helper_atomic_check() for the respective plane-state check
- * helper function.
+ * one primary plane attached to it. This is often the case for the CRTC
+ * of simple framebuffers.
  *
  * RETURNS:
  * Zero on success, or an errno code otherwise.
diff --git a/drivers/gpu/drm/drm_plane_helper.c 
b/drivers/gpu/drm/drm_plane_helper.c
index 5e95089676ff8..7982be4b0306d 100644
--- a/drivers/gpu/drm/drm_plane_helper.c
+++ b/drivers/gpu/drm/drm_plane_helper.c
@@ -279,35 +279,3 @@ void drm_plane_helper_destroy(struct drm_plane *plane)
kfree(plane);
 }
 EXPORT_SYMBOL(drm_plane_helper_destroy);
-
-/**
- * drm_plane_helper_atomic_check() - Helper to check plane atomic-state
- * @plane: plane to check
- * @state: atomic state object
- *
- * Provides a default plane-state check handler for planes whose atomic-state
- * scale and positioning are not expected to change since the plane is always
- * a fullscreen scanout buffer.
- *
- * This is often the case for the primary plane of simple framebuffers. See
- * also drm_crtc_helper_atomic_check() for the respective CRTC-state check
- * helper function.
- *
- * RETURNS:
- * Zero on success, or an errno code otherwise.
- */
-int drm_plane_helper_atomic_check(struct drm_plane *plane, struct 
drm_atomic_state *state)
-{
-   struct drm_plane_state *new_plane_state = 
drm_atomic_get_new_plane_state(state, plane);
-   struct drm_crtc *new_crtc = new_plane_state->crtc;
-   struct drm_crtc_state *new_crtc_state = NULL;
-
-   if (new_crtc)
-   new_crtc_state = drm_atomic_get_new_crtc_state(state, new_crtc);
-
-   return drm_atomic_helper_check_plane_state(new_plane_state, 
new_crtc_state,
-  DRM_PLANE_NO_SCALING,
-  DRM_PLANE_NO_SCALING,
-  false, false);
-}
-EXPORT_SYMBOL(drm_plane_helper_atomic_check);
diff --git a/drivers/gpu/drm/udl/udl_modeset.c 
b/drivers/gpu/drm/udl/udl_modeset.c
index 40876bcdd79a4..7702359c90c22 100644
--- a/drivers/gpu/drm/udl/udl_modeset.c
+++ b/drivers/gpu/drm/udl/udl_modeset.c
@@ -21,7 +21,6 @@
 #include 
 #include 
 #include 
-#include <drm/drm_plane_helper.h>
 #include 
 #include 
 
@@ -261,6 +260,22 @@ static const uint64_t udl_primary_plane_fmtmods[] = {
DRM_FORMAT_MOD_INVALID
 };
 
+static int udl_primary_plane_helper_atomic_check(struct drm_plane *plane,
+struct drm_atomic_state *state)
+{
+   struct drm_plane_state *new_plane_state = 
drm_atomic_get_new_plane_state(state, plane);
+   struct drm_crtc *new_crtc = new_plane_state->crtc;
+   struct drm_crtc_state *new_crtc_state = NULL;
+
+   if (new_crtc)
+   new_crtc_state = drm_atomic_get_new_crtc_state(state, new_crtc);
+
+   return drm_atomic_helper_check_plane_state(new_plane_state, 
new_crtc_state,
+  DRM_PLANE_NO_SCALING,
+  DRM_PLANE_NO_SCALING,
+  false, false);
+}
+
 static void udl_primary_plane_helper_atomic_update(struct drm_plane *plane,
   struct drm_atomic_state 
*state)
 {
@@ -296,7 +311,7 @@ static void udl_primary_plane_helper_atomic_update(struct 
drm_plane *plane,
 
 static const struct drm_plane_helper_funcs udl_primary_plane_helper_funcs = {
DRM_GEM_SHADOW_PLANE_HELPER_FUNCS,
-   .atomic_check = drm_plane_helper_atomic_check,
+   .atomic_check = udl_primary_plane_helper_atomic_check,
.atomic_update = udl_primary_plane_helper_atomic_update,
 };
 
diff --git a/include/drm/drm_plane_helper.h b/include/drm/drm_plane_helper.h
index 3a574e8cd22f4..75f9c4830564a 100644
--- a/include/drm/drm_plane_helper.h
+++ b/include/drm/drm_plane_helper.h
@@ -26,7 +26,6 @@
 
 #include 
 
-struct drm_atomic_state;
 struct drm_crtc;
 struct 

[PATCH v2 3/8] drm/loongson: Do not include <drm/drm_plane_helper.h>

2023-12-04 Thread Thomas Zimmermann
Remove unnecessary include statements for <drm/drm_plane_helper.h>.
The file contains helpers for non-atomic code and should not be
required by most drivers. No functional changes.

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Sui Jingfeng 
Tested-by: Sui Jingfeng 
---
 drivers/gpu/drm/loongson/lsdc_plane.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/loongson/lsdc_plane.c 
b/drivers/gpu/drm/loongson/lsdc_plane.c
index 0d50946332229..d227a2c1dcf16 100644
--- a/drivers/gpu/drm/loongson/lsdc_plane.c
+++ b/drivers/gpu/drm/loongson/lsdc_plane.c
@@ -9,7 +9,6 @@
 #include 
 #include 
 #include 
-#include <drm/drm_plane_helper.h>
 
 #include "lsdc_drv.h"
 #include "lsdc_regs.h"
-- 
2.43.0



RE: [PATCH] drm/amd/swsmu: update smu v14_0_0 driver if version and metrics table

2023-12-04 Thread Ma, Li
[Public]

Hi Alex,

Sorry for the late reply. Only smu14 uses this gpu_metrics_v3_0 struct, and
the patch has already been upstreamed. As far as I know, umr uses
gpu_metrics_v3_0, and I will submit a patch to umr.
Does this struct need to be backward compatible currently? If yes, I will
revert this patch and add a new gpu_metrics_v3_1.

Best Regards,
Li

-Original Message-
From: Deucher, Alexander 
Sent: Tuesday, November 28, 2023 4:47 AM
To: Ma, Li ; amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian ; Zhang, Yifan 
; Yu, Lang 
Subject: RE: [PATCH] drm/amd/swsmu: update smu v14_0_0 driver if version and 
metrics table

[Public]

> -Original Message-
> From: Ma, Li 
> Sent: Thursday, November 23, 2023 5:07 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Koenig, Christian
> ; Zhang, Yifan ; Yu,
> Lang ; Ma, Li 
> Subject: [PATCH] drm/amd/swsmu: update smu v14_0_0 driver if version
> and metrics table
>
> Increment the driver interface version and add new members to the metrics table.
>
> Signed-off-by: Li Ma 
> ---
>  .../gpu/drm/amd/include/kgd_pp_interface.h| 17 
>  drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 10 +++
>  .../inc/pmfw_if/smu14_driver_if_v14_0_0.h | 77 +++
>  .../drm/amd/pm/swsmu/smu14/smu_v14_0_0_ppt.c  | 46 ++-
>  4 files changed, 115 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h
> b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
> index 8ebba87f4289..eaea1c65e526 100644
> --- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h
> +++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
> @@ -1086,6 +1086,10 @@ struct gpu_metrics_v3_0 {
>   uint16_taverage_dram_reads;
>   /* time filtered DRAM write bandwidth [MB/sec] */
>   uint16_taverage_dram_writes;
> + /* time filtered IPU read bandwidth [MB/sec] */
> + uint16_taverage_ipu_reads;
> + /* time filtered IPU write bandwidth [MB/sec] */
> + uint16_taverage_ipu_writes;
>
>   /* Driver attached timestamp (in ns) */
>   uint64_tsystem_clock_counter;
> @@ -1105,6 +1109,8 @@ struct gpu_metrics_v3_0 {
>   uint32_taverage_all_core_power;
>   /* calculated core power [mW] */
>   uint16_taverage_core_power[16];
> + /* time filtered total system power [mW] */
> + uint16_taverage_sys_power;
>   /* maximum IRM defined STAPM power limit [mW] */
>   uint16_tstapm_power_limit;
>   /* time filtered STAPM power limit [mW] */ @@ -1117,6 +1123,8 @@
> struct gpu_metrics_v3_0 {
>   uint16_taverage_ipuclk_frequency;
>   uint16_taverage_fclk_frequency;
>   uint16_taverage_vclk_frequency;
> + uint16_taverage_uclk_frequency;
> + uint16_taverage_mpipu_frequency;
>
>   /* Current clocks */
>   /* target core frequency [MHz] */ @@ -1126,6 +1134,15 @@ struct
> gpu_metrics_v3_0 {
>   /* GFXCLK frequency limit enforced on GFX [MHz] */
>   uint16_tcurrent_gfx_maxfreq;
>
> + /* Throttle Residency (ASIC dependent) */
> + uint32_t throttle_residency_prochot;
> + uint32_t throttle_residency_spl;
> + uint32_t throttle_residency_fppt;
> + uint32_t throttle_residency_sppt;
> + uint32_t throttle_residency_thm_core;
> + uint32_t throttle_residency_thm_gfx;
> + uint32_t throttle_residency_thm_soc;
> +
>   /* Metrics table alpha filter time constant [us] */
>   uint32_ttime_filter_alphavalue;
>  };

Is anything else besides smu14 using v3 of this struct?  If so, we can't change
the layout, otherwise it will break existing tools. In that case, bump the
minor version and append the new items to the end.

Alex
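The versioning rule Alex describes can be sketched in plain C: a new content revision keeps the old layout as a byte-for-byte prefix and only appends fields at the end, so existing tools still parse the fields they know. The header and field names below are simplified stand-ins, not the real kgd_pp_interface.h definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the amdgpu metrics tables; the header and
 * field names here are illustrative, not the real kgd_pp_interface.h
 * definitions. */
struct metrics_header {
	uint16_t structure_size;    /* lets tools skip unknown trailing bytes */
	uint8_t  format_revision;   /* bumped when the existing layout changes */
	uint8_t  content_revision;  /* bumped when fields are only appended */
};

struct gpu_metrics_v3_0 {
	struct metrics_header header;
	uint16_t average_dram_reads;
	uint16_t average_dram_writes;
};

/* v3.1 keeps the v3.0 prefix byte-for-byte and only appends new fields,
 * so a tool that only knows v3.0 still parses the fields it expects. */
struct gpu_metrics_v3_1 {
	struct metrics_header header;
	uint16_t average_dram_reads;
	uint16_t average_dram_writes;
	/* new in content_revision 1 */
	uint16_t average_ipu_reads;
	uint16_t average_ipu_writes;
};

/* An old consumer reading a v3.1 table through the v3.0 layout */
static uint16_t read_dram_reads_v3_0(const void *table)
{
	const struct gpu_metrics_v3_0 *m = table;

	return m->average_dram_reads;
}

static void init_v3_1(struct gpu_metrics_v3_1 *m)
{
	m->header.structure_size   = sizeof(*m);
	m->header.format_revision  = 3;
	m->header.content_revision = 1;
	m->average_dram_reads  = 100;
	m->average_dram_writes = 50;
	m->average_ipu_reads   = 7;
	m->average_ipu_writes  = 3;
}
```

Inserting `average_sys_power` or the throttle-residency words in the middle instead would shift every later offset and silently corrupt what old tools read.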


> diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> index c125253df20b..c2265e027ca8 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> @@ -1418,6 +1418,16 @@ typedef enum {
>   METRICS_PCIE_WIDTH,
>   METRICS_CURR_FANPWM,
>   METRICS_CURR_SOCKETPOWER,
> + METRICS_AVERAGE_VPECLK,
> + METRICS_AVERAGE_IPUCLK,
> + METRICS_AVERAGE_MPIPUCLK,
> + METRICS_THROTTLER_RESIDENCY_PROCHOT,
> + METRICS_THROTTLER_RESIDENCY_SPL,
> + METRICS_THROTTLER_RESIDENCY_FPPT,
> + METRICS_THROTTLER_RESIDENCY_SPPT,
> + METRICS_THROTTLER_RESIDENCY_THM_CORE,
> + METRICS_THROTTLER_RESIDENCY_THM_GFX,
> + METRICS_THROTTLER_RESIDENCY_THM_SOC,
>  } MetricsMember_t;
>
>  enum smu_cmn2asic_mapping_type {
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu14_driver_if_v14_0_0.h
> 

Re: [PATCH v5 00/32] drm/amd/display: add AMD driver-specific properties for color mgmt

2023-12-04 Thread Maxime Ripard
On Fri, Dec 01, 2023 at 10:20:40AM -0500, Harry Wentland wrote:
> 
> 
> On 2023-11-30 06:34, Daniel Vetter wrote:
> > On Tue, 28 Nov 2023 at 23:11, Harry Wentland  wrote:
> >>
> >> On 2023-11-16 14:57, Melissa Wen wrote:
> >>> Hello,
> >>>
> >>> This series extends the current KMS color management API with AMD
> >>> driver-specific properties to enhance the color management support on
> >>> AMD Steam Deck. The key additions to the color pipeline include:
> >>>
> >>
> >> snip
> >>
> >>> Melissa Wen (18):
> >>>   drm/drm_mode_object: increase max objects to accommodate new color
> >>> props
> >>>   drm/drm_property: make replace_property_blob_from_id a DRM helper
> >>>   drm/drm_plane: track color mgmt changes per plane
> >>
> >> If all patches are merged through amd-staging-drm-next I worry that
> >> conflicts creep in if any code around replace_property_blob_from_id
> >> changes in DRM.
> >>
> >> My plan is to merge DRM patches through drm-misc-next, as well
> >> as include them in the amd-staging-drm-next merge. They should then
> >> fall out at the next amd-staging-drm-next pull and (hopefully)
> >> ensure that there is no conflict.
> >>
> >> If no objections I'll go ahead with that later this week.
> > 
> > Double-merging tends to be the worst because git doesn't realize the
> > commits match, which actually makes the conflicts worse when they
> > happen (because the 3-way merge diff gets absolutely confused by all the
> > changed context and misplaces everything to the max). So please don't;
> > _only_ ever cherry-pick when a patch in -next is also needed in
> > -fixes, and we didn't put it into the right tree. But even that is a
> > bit tricky and should only be done by maintainers (using dim
> > cherry-pick if it's drm-misc) because the conflicts tend to be bad and
> > need to be sorted out with backmerges sooner than later.
> > 
> > For this case merge everything through one tree with the right acks,
> > pull to drm-next asap and then backmerge into the other tree. Here
> > probably amdgpu-next with drm-misc maintainer acks for the 3 core
> > patches. Or if amdgpu-next pull won't come for a while, put them into
> > drm-misc-next and just wait a week until it's in drm-next, then
> > forward amdgpu-next.
> > 
> 
> Maxime, Maarten, Thomas, could I get an ACK from you for the three
> DRM core patches and an ACK for pulling them through the AMD tree?

Acked-by: Maxime Ripard 

Maxime




Re: [PATCH 2/2] drm/amdgpu: Auto-validate DMABuf imports in compute VMs

2023-12-04 Thread Christian König

On 15.11.23 at 00:32, Felix Kuehling wrote:

DMABuf imports in compute VMs are not wrapped in a kgd_mem object on the
process_info->kfd_bo_list. There is no explicit KFD API call to validate
them or add eviction fences to them.

This patch automatically validates and fences dynamic DMABuf imports when
they are added to a compute VM. Revalidation after evictions is handled
in the VM code.

Signed-off-by: Felix Kuehling 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   3 +
  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  45 ---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c|   6 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c   |   4 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  26 
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 122 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h|  12 +-
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c  |  12 +-
  8 files changed, 196 insertions(+), 34 deletions(-)
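The heart of the refactor in the diff below is threading the caller's ww_acquire_ctx ticket down into the per-VM validation helpers, so buffers already held as part of the caller's reservation context can be validated too. That shape can be sketched in plain C; every type and name here is an illustrative stand-in, not the kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins: 'ticket' models ww_acquire_ctx, 'vm' models a
 * compute VM with possibly-evicted buffers. */
struct ticket { int id; };

struct vm {
	int evicted;               /* 1 if this VM has BOs needing validation */
	int validated_with_ticket; /* records how the last validation ran */
};

/* A NULL ticket means the caller reserved the buffers itself (the old
 * calling convention); a non-NULL ticket lets the helper also validate
 * buffers that are held as part of the caller's locking context. */
static int vm_validate(struct vm *v, struct ticket *t)
{
	if (!v->evicted)
		return 0;
	v->validated_with_ticket = (t != NULL);
	v->evicted = 0;
	return 0;
}

/* Mirrors process_validate_vms(): the ticket is passed straight through
 * to every per-VM validation, and the first failure aborts the loop. */
static int process_validate_vms(struct vm *vms, size_t n, struct ticket *t)
{
	for (size_t i = 0; i < n; i++) {
		int ret = vm_validate(&vms[i], t);

		if (ret)
			return ret;
	}
	return 0;
}
```

Existing call sites that reserve buffers themselves simply pass NULL and keep their old behaviour, which is why the patch can change the signature without touching their semantics.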

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index fcf8a98ad15e..68d534a89942 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -178,6 +178,9 @@ int amdgpu_queue_mask_bit_to_set_resource_bit(struct amdgpu_device *adev,
  struct amdgpu_amdkfd_fence *amdgpu_amdkfd_fence_create(u64 context,
struct mm_struct *mm,
struct svm_range_bo *svm_bo);
+int amdgpu_amdkfd_bo_validate_and_fence(struct amdgpu_bo *bo,
+   uint32_t domain,
+   struct dma_fence *fence);
  #if defined(CONFIG_DEBUG_FS)
  int kfd_debugfs_kfd_mem_limits(struct seq_file *m, void *data);
  #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 2e302956a279..bb83ab7a0256 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -423,9 +423,9 @@ static int amdgpu_amdkfd_bo_validate(struct amdgpu_bo *bo, 
uint32_t domain,
return ret;
  }
  
-static int amdgpu_amdkfd_bo_validate_and_fence(struct amdgpu_bo *bo,
-					       uint32_t domain,
-					       struct dma_fence *fence)
+int amdgpu_amdkfd_bo_validate_and_fence(struct amdgpu_bo *bo,
+   uint32_t domain,
+   struct dma_fence *fence)
  {
int ret = amdgpu_bo_reserve(bo, false);
  
@@ -461,13 +461,16 @@ static int amdgpu_amdkfd_validate_vm_bo(void *_unused, struct amdgpu_bo *bo)

   * again. Page directories are only updated after updating page
   * tables.
   */
-static int vm_validate_pt_pd_bos(struct amdgpu_vm *vm)
+static int vm_validate_pt_pd_bos(struct amdgpu_vm *vm,
+struct ww_acquire_ctx *ticket)
  {
struct amdgpu_bo *pd = vm->root.bo;
struct amdgpu_device *adev = amdgpu_ttm_adev(pd->tbo.bdev);
int ret;
  
-	ret = amdgpu_vm_validate_pt_bos(adev, vm, amdgpu_amdkfd_validate_vm_bo, NULL);
+   ret = amdgpu_vm_validate_evicted_bos(adev, vm, ticket,
+amdgpu_amdkfd_validate_vm_bo,
+NULL);
if (ret) {
pr_err("failed to validate PT BOs\n");
return ret;
@@ -1347,14 +1350,15 @@ static int map_bo_to_gpuvm(struct kgd_mem *mem,
return ret;
  }
  
-static int process_validate_vms(struct amdkfd_process_info *process_info)
+static int process_validate_vms(struct amdkfd_process_info *process_info,
+   struct ww_acquire_ctx *ticket)
  {
struct amdgpu_vm *peer_vm;
int ret;
  
 	list_for_each_entry(peer_vm, &process_info->vm_list_head,
 			    vm_list_node) {
-   ret = vm_validate_pt_pd_bos(peer_vm);
+   ret = vm_validate_pt_pd_bos(peer_vm, ticket);
if (ret)
return ret;
}
@@ -1440,7 +1444,7 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void 
**process_info,
ret = amdgpu_bo_reserve(vm->root.bo, true);
if (ret)
goto reserve_pd_fail;
-   ret = vm_validate_pt_pd_bos(vm);
+   ret = vm_validate_pt_pd_bos(vm, NULL);
if (ret) {
pr_err("validate_pt_pd_bos() failed\n");
goto validate_pd_fail;
@@ -2075,7 +2079,7 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
bo->tbo.resource->mem_type == TTM_PL_SYSTEM)
is_invalid_userptr = true;
  
-	ret = vm_validate_pt_pd_bos(avm);
+   ret = vm_validate_pt_pd_bos(avm, NULL);
if (unlikely(ret))
goto out_unreserve;
  
@@ -2154,7 +2158,7 @@ int amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu(

goto unreserve_out;
}
  
-	ret = 

Re: [PATCH] drm/radeon/dpm: fix a memleak in sumo_parse_power_table

2023-12-04 Thread Christian König

On 03.12.23 at 18:16, Zhipeng Lu wrote:

The rdev->pm.dpm.ps allocated by kcalloc should be freed in every
following error-handling path. However, in the error-handling of
rdev->pm.power_state[i].clock_info the rdev->pm.dpm.ps is not freed,
resulting in a memleak in this function.

Fixes: 80ea2c129c76 ("drm/radeon/kms: add dpm support for sumo asics (v2)")
Signed-off-by: Zhipeng Lu 
---
  drivers/gpu/drm/radeon/sumo_dpm.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/radeon/sumo_dpm.c 
b/drivers/gpu/drm/radeon/sumo_dpm.c
index f74f381af05f..bde640053708 100644
--- a/drivers/gpu/drm/radeon/sumo_dpm.c
+++ b/drivers/gpu/drm/radeon/sumo_dpm.c
@@ -1494,6 +1494,7 @@ static int sumo_parse_power_table(struct radeon_device *rdev)
 	non_clock_info = (struct _ATOM_PPLIB_NONCLOCK_INFO *)
 		&non_clock_info_array->nonClockInfo[non_clock_array_index];
if (!rdev->pm.power_state[i].clock_info)
+   kfree(rdev->pm.dpm.ps);
return -EINVAL;


That change is obviously not correct since you now always return -EINVAL.

You need to at least add {} here.

Regards,
Christian.
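The shape Christian is asking for puts the kfree() and the return inside one braced block, so the error exit both frees the array and returns, and the success path does neither. A standalone sketch of that error-handling pattern, with simplified names rather than the actual radeon driver code:

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified model of sumo_parse_power_table()'s error handling; the
 * struct names and error value are illustrative, not the radeon code. */
#define ERR_INVAL 22

struct power_state { void *clock_info; };

struct dpm_info {
	struct power_state *power_state;
	void *ps;   /* the kcalloc'ed array that must not leak */
};

/* With braces, the free and the return execute together only on the
 * failure path; without them, the return would run unconditionally. */
static int parse_power_table(struct dpm_info *dpm, int nstates)
{
	for (int i = 0; i < nstates; i++) {
		if (!dpm->power_state[i].clock_info) {
			free(dpm->ps);   /* models kfree(rdev->pm.dpm.ps); */
			dpm->ps = NULL;
			return -ERR_INVAL;
		}
	}
	return 0;
}
```

Without the braces, the compiler happily attaches only the first statement to the `if`, which is exactly why the posted patch would make the function return -EINVAL on every call.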


ps = kzalloc(sizeof(struct sumo_ps), GFP_KERNEL);
if (ps == NULL) {