Re: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler

2022-01-21 Thread Lazar, Lijo




On 1/22/2022 2:04 AM, Sharma, Shashank wrote:

 From 899ec6060eb7d8a3d4d56ab439e4e6cdd74190a4 Mon Sep 17 00:00:00 2001
From: Somalapuram Amaranath 
Date: Fri, 21 Jan 2022 14:19:42 +0530
Subject: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler

This patch adds a GPU reset handler for the Navi ASIC family, which
typically dumps some of the registers and sends a trace event.

V2: Accommodated call to work function to send uevent

Signed-off-by: Somalapuram Amaranath 
Signed-off-by: Shashank Sharma 
---
  drivers/gpu/drm/amd/amdgpu/nv.c | 28 
  1 file changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c 
b/drivers/gpu/drm/amd/amdgpu/nv.c

index 01efda4398e5..ada35d4c5245 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -528,10 +528,38 @@ nv_asic_reset_method(struct amdgpu_device *adev)
  }
  }

+static void amdgpu_reset_dumps(struct amdgpu_device *adev)
+{
+    int r = 0, i;
+
+    /* original raven doesn't have full asic reset */
+    if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
+    !(adev->apu_flags & AMD_APU_IS_RAVEN2))
+    return;
+    for (i = 0; i < adev->num_ip_blocks; i++) {
+    if (!adev->ip_blocks[i].status.valid)
+    continue;
+    if (!adev->ip_blocks[i].version->funcs->reset_reg_dumps)
+    continue;
+    r = adev->ip_blocks[i].version->funcs->reset_reg_dumps(adev);
+
+    if (r)
+    DRM_ERROR("reset_reg_dumps of IP block <%s> failed %d\n",
+    adev->ip_blocks[i].version->funcs->name, r);
+    }
+
+    /* Schedule work to send uevent */
+    if (!queue_work(system_unbound_wq, &adev->gpu_reset_work))
+    DRM_ERROR("failed to add GPU reset work\n");
+
+    dump_stack();
+}
+
  static int nv_asic_reset(struct amdgpu_device *adev)
  {
  int ret = 0;

+    amdgpu_reset_dumps(adev);


Had a comment on this before. Now there are different reasons (or even 
no reason, like a precautionary reset) to perform a reset. A user would be 
interested in a trace only if the reason is valid.


To clarify why a work item shouldn't be scheduled on every reset, check here -

https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c#L2188


Thanks,
Lijo


  switch (nv_asic_reset_method(adev)) {
  case AMD_RESET_METHOD_PCI:
  dev_info(adev->dev, "PCI reset\n");


RE: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM

2022-01-21 Thread Lazar, Lijo
[AMD Official Use Only]

Hi James,

Could you provide the pp_dpm_* values in sysfs with and without the patch? 
Also, could you try forcing PCIE to gen3 (through pp_dpm_pcie) if it's not in 
gen3 when the issue happens?

For details on pp_dpm_*, please check 
https://dri.freedesktop.org/docs/drm/gpu/amdgpu.html

Thanks,
Lijo

-Original Message-
From: James Turner  
Sent: Saturday, January 22, 2022 6:21 AM
To: Alex Deucher 
Cc: Thorsten Leemhuis ; Deucher, Alexander 
; Lazar, Lijo ; 
regressi...@lists.linux.dev; k...@vger.kernel.org; Greg KH 
; Pan, Xinhui ; LKML 
; amd-gfx@lists.freedesktop.org; Alex Williamson 
; Koenig, Christian 
Subject: Re: [REGRESSION] Too-low frequency limit for AMD GPU 
PCI-passed-through to Windows VM

> Are you ever loading the amdgpu driver in your tests?

Yes, although I'm binding the `vfio-pci` driver to the AMD GPU's PCI devices 
via the kernel command line. (See my initial email.) My understanding is that 
`vfio-pci` is supposed to keep other drivers, such as `amdgpu`, from 
interacting with the GPU, although that's clearly not what's happening.

I've been testing with `amdgpu` included in the `MODULES` list in 
`/etc/mkinitcpio.conf` (which Arch Linux uses to generate the initramfs). 
However, I ran some more tests today (results below), this time without `i915` 
or `amdgpu` in the `MODULES` list. The `amdgpu` kernel module still gets 
loaded. (I think udev loads it automatically?)

Your comment gave me the idea to blacklist the `amdgpu` kernel module.
That does serve as a workaround on my machine – it fixes the behavior for 
f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)") and for 
the current Arch Linux prebuilt kernel (5.16.2-arch1-1). That's an acceptable 
workaround for my machine only because the separate GPU used by the host is an 
Intel integrated GPU. That workaround wouldn't work well for someone with two 
AMD GPUs.


# New test results

The following tests are set up the same way as in my initial email, with the 
following exceptions:

- I've updated libvirt to 1:8.0.0-1.

- I've removed `i915` and `amdgpu` from the `MODULES` list in
  `/etc/mkinitcpio.conf`.

For all three of these tests, `lspci` said the following:

% lspci -nnk -d 1002:6981
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. 
[AMD/ATI] Lexa XT [Radeon PRO WX 3200] [1002:6981]
Subsystem: Dell Device [1028:0926]
Kernel driver in use: vfio-pci
Kernel modules: amdgpu

% lspci -nnk -d 1002:aae0
01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin 
HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] [1002:aae0]
Subsystem: Dell Device [1028:0926]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel


## Version f1688bd69ec4 ("drm/amd/amdgpu:save psp ring wptr to avoid attack")

This is the commit immediately preceding the one which introduced the issue.

% sudo dmesg | grep -i amdgpu
[   15.840160] [drm] amdgpu kernel modesetting enabled.
[   15.840884] amdgpu: CRAT table not found
[   15.840885] amdgpu: Virtual CRAT table created for CPU
[   15.840893] amdgpu: Topology: Add CPU node

% lsmod | grep amdgpu
amdgpu   7450624  0
gpu_sched  49152  1 amdgpu
drm_ttm_helper 16384  1 amdgpu
ttm77824  2 amdgpu,drm_ttm_helper
i2c_algo_bit   16384  2 amdgpu,i915
drm_kms_helper303104  2 amdgpu,i915
drm   581632  11 
gpu_sched,drm_kms_helper,amdgpu,drm_ttm_helper,i915,ttm

The passed-through GPU worked properly in the VM.


## Version f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global 
(v2)")

This is the commit which introduced the issue.

% sudo dmesg | grep -i amdgpu
[   15.319023] [drm] amdgpu kernel modesetting enabled.
[   15.329468] amdgpu: CRAT table not found
[   15.329470] amdgpu: Virtual CRAT table created for CPU
[   15.329482] amdgpu: Topology: Add CPU node

% lsmod | grep amdgpu
amdgpu   7450624  0
gpu_sched  49152  1 amdgpu
drm_ttm_helper 16384  1 amdgpu
ttm77824  2 amdgpu,drm_ttm_helper
i2c_algo_bit   16384  2 amdgpu,i915
drm_kms_helper303104  2 amdgpu,i915
drm   581632  11 
gpu_sched,drm_kms_helper,amdgpu,drm_ttm_helper,i915,ttm

The passed-through GPU did not run above 501 MHz in the VM.


## Blacklisted `amdgpu`, version f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS 
structures global (v2)")

For this test, I added `module_blacklist=amdgpu` to kernel command line to 
blacklist the `amdgpu` module.

% sudo dmesg | grep -i amdgpu
[   14.591576] Module amdgpu is blacklisted

% lsmod | grep amdgpu

The passed-through GPU worked properly in the VM.


James


[PATCH v2 0/3] Implement parallel sysfs_emit solution for navi10

2022-01-21 Thread Darren Powell
== Description ==
scnprintf use within the kernel is not recommended, but a simple sysfs_emit
replacement has not been successful due to the page alignment requirement of
that function. This patch set implements a new API, "emit_clock_levels", to
facilitate passing both the buffer base and a write offset to the device
rather than just the write pointer.
The new API is currently implemented only for navi10 and only for some clocks,
to seek comment on the implementation before extending it to other clock
values and devices.

This still needs to be verified on a platform that supports overclocking.

   (v2)
  Update to apply on commit 801771de03
  adjust printing of empty carriage return only if size == 0
  change return from amdgpu_dpm_emit_clock_levels if emit_clock_levels not 
found

== Patch Summary ==
   linux: (g...@gitlab.freedesktop.org:agd5f) origin/amd-staging-drm-next @ 
28907fd9e048
+ f83a3144ede4 amdgpu/pm: Implement new API function "emit" that accepts 
buffer base and write offset
+ 82de36426a1f amdgpu/pm: insert attempt to call emit_clock_levels into 
amdgpu_get_pp_od_clk_voltage
+ 32f0fcf45dd8 amdgpu/pm: Add Error Handling to emit_clocks stack

== System Summary ==
 * DESKTOP(AMD FX-8350 + NAVI10(731F/ca), BIOS: F2)
  + ISO(Ubuntu 20.04.3 LTS)
  + Kernel(5.13.0-fdoagd5f-20220112-g28907fd9e0)

== Test ==
LOGFILE=pp_clk.test.log
AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | awk '{print 
$9}'`
HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}

lspci -nn | grep "VGA\|Display"  > $LOGFILE
FILES="pp_od_clk_voltage
pp_dpm_sclk
pp_dpm_mclk
pp_dpm_pcie
pp_dpm_socclk
pp_dpm_fclk
pp_dpm_dcefclk
pp_dpm_vclk
pp_dpm_dclk "

for f in $FILES
do
  echo === $f === >> $LOGFILE
  cat $HWMON_DIR/device/$f >> $LOGFILE
done
cat $LOGFILE

Darren Powell (3):
  amdgpu/pm: Implement new API function "emit" that accepts buffer base
and write offset
  amdgpu/pm: insert attempt to call emit_clock_levels into
amdgpu_get_pp_od_clk_voltage
  amdgpu/pm: Add Error Handling to emit_clocks stack

 .../gpu/drm/amd/include/kgd_pp_interface.h|   1 +
 drivers/gpu/drm/amd/pm/amdgpu_dpm.c   |  21 ++
 drivers/gpu/drm/amd/pm/amdgpu_pm.c|  49 +++--
 drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h   |   4 +
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  44 +++-
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  14 ++
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 188 ++
 7 files changed, 302 insertions(+), 19 deletions(-)


base-commit: 28907fd9e04859fab86a143271d29ff0ff043154
-- 
2.34.1



[PATCH 2/3] amdgpu/pm: insert attempt to call emit_clock_levels into amdgpu_get_pp_od_clk_voltage

2022-01-21 Thread Darren Powell
  (v1)
Use the new API function emit_clock_levels to display the overclocking values,
with a fallback to print_clock_levels if the first call fails.

  (v2)
Update to apply on commit 801771de03
adjust printing of empty carriage return only if size == 0

 == Test ==
 LOGFILE=pp_clk.test.log
 AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
 AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | awk '{print 
$9}'`
 HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}

 lspci -nn | grep "VGA\|Display"  > $LOGFILE
 FILES="pp_od_clk_voltage"

 for f in $FILES
 do
   echo === $f === >> $LOGFILE
   cat $HWMON_DIR/device/$f >> $LOGFILE
 done
 cat $LOGFILE

Signed-off-by: Darren Powell 
---
 drivers/gpu/drm/amd/pm/amdgpu_pm.c | 39 ++
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index ec2729b3930e..97a8dcbe6eaf 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -832,8 +832,17 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device 
*dev,
 {
struct drm_device *ddev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(ddev);
-   ssize_t size;
+   int size = 0;
int ret;
+   enum pp_clock_type od_clocks[6] = {
+   OD_SCLK,
+   OD_MCLK,
+   OD_VDDC_CURVE,
+   OD_RANGE,
+   OD_VDDGFX_OFFSET,
+   OD_CCLK,
+   };
+   uint clk_index;
 
if (amdgpu_in_reset(adev))
return -EPERM;
@@ -846,16 +855,26 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device 
*dev,
return ret;
}
 
-   size = amdgpu_dpm_print_clock_levels(adev, OD_SCLK, buf);
-   if (size > 0) {
-   size += amdgpu_dpm_print_clock_levels(adev, OD_MCLK, buf+size);
-   size += amdgpu_dpm_print_clock_levels(adev, OD_VDDC_CURVE, 
buf+size);
-   size += amdgpu_dpm_print_clock_levels(adev, OD_VDDGFX_OFFSET, 
buf+size);
-   size += amdgpu_dpm_print_clock_levels(adev, OD_RANGE, buf+size);
-   size += amdgpu_dpm_print_clock_levels(adev, OD_CCLK, buf+size);
-   } else {
-   size = sysfs_emit(buf, "\n");
+   ret = amdgpu_dpm_emit_clock_levels(adev, od_clocks[0], buf, &size);
+   if (ret >= 0) {
+   /* Proceed with emit for other od clocks if the first call 
succeeds */
+   for(clk_index = 1 ; clk_index < 6 ; clk_index++)
+   amdgpu_dpm_emit_clock_levels(adev, 
od_clocks[clk_index], buf, &size);
+   }
+   else {
+   size = amdgpu_dpm_print_clock_levels(adev, OD_SCLK, buf);
+   if (size > 0) {
+   size += amdgpu_dpm_print_clock_levels(adev, OD_MCLK, 
buf+size);
+   size += amdgpu_dpm_print_clock_levels(adev, 
OD_VDDC_CURVE, buf+size);
+   size += amdgpu_dpm_print_clock_levels(adev, 
OD_VDDGFX_OFFSET, buf+size);
+   size += amdgpu_dpm_print_clock_levels(adev, OD_RANGE, 
buf+size);
+   size += amdgpu_dpm_print_clock_levels(adev, OD_CCLK, 
buf+size);
+   }
}
+
+   if (size == 0)
+   size = sysfs_emit(buf, "\n");
+
pm_runtime_mark_last_busy(ddev->dev);
pm_runtime_put_autosuspend(ddev->dev);
 
-- 
2.34.1



[PATCH 1/3] amdgpu/pm: Implement new API function "emit" that accepts buffer base and write offset

2022-01-21 Thread Darren Powell
   (v1)
 - new power management function emit_clk_levels, implemented by 
navi10_emit_clk_levels()
   This function currently duplicates the functionality of 
navi10_print_clk_levels, where
   snprintf is used to write to the sysfs buffer. The first implementation to 
use sysfs_emit
   was unsuccessful due to the requirement that buf must be aligned to a 
specific memory
   boundary. This solution passes the buffer base and write offset down the 
stack.
 - new power management function emit_clock_levels implemented by 
smu_emit_ppclk_levels()
   This function combines implementation of smu_print_ppclk_levels and
   smu_print_smuclk_levels and calls emit_clk_levels
 - new helper function smu_convert_to_smuclk called by smu_print_ppclk_levels 
and
   smu_emit_ppclk_levels
 - emit_clock_levels currently returns the length of the string written to the 
buffer,
   maintaining the behaviour of print_clock_levels, but will be upgraded to 
return
   error values in a subsequent patch
 - failure of the emit_clock_levels at the top level (amdgpu_get_pp_dpm_clock) 
triggers a
   fallback to the print_clock_levels, which can be removed after 
emit_clock_levels is
   implemented across all platforms.

   (v2)
  Update to apply on commit 801771de03
  adjust printing of empty carriage return only if size == 0

 == Test ==
 LOGFILE=pp_clk.test.log
 AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
 AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | awk '{print 
$9}'`
 HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}

 lspci -nn | grep "VGA\|Display"  > $LOGFILE
 FILES="pp_od_clk_voltage
 pp_dpm_sclk
 pp_dpm_mclk
 pp_dpm_pcie
 pp_dpm_socclk
 pp_dpm_fclk
 pp_dpm_dcefclk
 pp_dpm_vclk
 pp_dpm_dclk "

 for f in $FILES
 do
   echo === $f === >> $LOGFILE
   cat $HWMON_DIR/device/$f >> $LOGFILE
 done
 cat $LOGFILE

Signed-off-by: Darren Powell 
---
 .../gpu/drm/amd/include/kgd_pp_interface.h|   1 +
 drivers/gpu/drm/amd/pm/amdgpu_dpm.c   |  21 ++
 drivers/gpu/drm/amd/pm/amdgpu_pm.c|  11 +-
 drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h   |   4 +
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  46 -
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  10 +
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 189 ++
 7 files changed, 273 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h 
b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
index a8eec91c0995..8a8eb9411cdf 100644
--- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
@@ -321,6 +321,7 @@ struct amd_pm_funcs {
int (*get_fan_speed_pwm)(void *handle, u32 *speed);
int (*force_clock_level)(void *handle, enum pp_clock_type type, 
uint32_t mask);
int (*print_clock_levels)(void *handle, enum pp_clock_type type, char 
*buf);
+   int (*emit_clock_levels)(void *handle, enum pp_clock_type type, char 
*buf, int *offset);
int (*force_performance_level)(void *handle, enum amd_dpm_forced_level 
level);
int (*get_sclk_od)(void *handle);
int (*set_sclk_od)(void *handle, uint32_t value);
diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
index 728b6e10f302..e45578124928 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
@@ -906,6 +906,27 @@ int amdgpu_dpm_print_clock_levels(struct amdgpu_device 
*adev,
return ret;
 }
 
+int amdgpu_dpm_emit_clock_levels(struct amdgpu_device *adev,
+ enum pp_clock_type type,
+ char *buf,
+ int *offset)
+{
+   const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
+   int ret = 0;
+
+   if (!pp_funcs->emit_clock_levels)
+   return 0;
+
+   mutex_lock(&adev->pm.mutex);
+   ret = pp_funcs->emit_clock_levels(adev->powerplay.pp_handle,
+  type,
+  buf,
+  offset);
+   mutex_unlock(&adev->pm.mutex);
+
+   return ret;
+}
+
 int amdgpu_dpm_set_ppfeature_status(struct amdgpu_device *adev,
uint64_t ppfeature_masks)
 {
diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index d2823aaeca09..ec2729b3930e 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -980,8 +980,8 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device *dev,
 {
struct drm_device *ddev = dev_get_drvdata(dev);
struct amdgpu_device *adev = drm_to_adev(ddev);
-   ssize_t size;
-   int ret;
+   int size = 0;
+   int ret = 0;
 
if (amdgpu_in_reset(adev))
return -EPERM;
@@ -994,8 +994,11 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device *dev,

[PATCH 3/3] amdgpu/pm: Add Error Handling to emit_clocks stack

2022-01-21 Thread Darren Powell
 Previous implementation of print_clocks required return of bytes written
 to calling function via return value. Passing this value in by reference
 allows us now to pass back error codes up the calling stack.

 (v1)
 - Errors from get_current_clk_freq, get_dpm_level_count & get_dpm_freq
  now passed back up the stack
 - Added -EOPNOTSUPP for OD clocks when encountering
   !od_enabled
   missing od_table or od_settings
 - OD_RANGE continues to print any subset of the ODCAP settings it can find
  without reporting error for any missing
 - smu_emit_ppclk returns ENOENT if emit_clk_levels is not supported by the
  device
 - modified calling logic so fallback to print_clock_levels is only attempted
  if emit_clk_levels is not (yet) supported by the device

 (v2)
 - change return from amdgpu_dpm_emit_clock_levels if emit_clock_levels not 
found
 - updated documentation of pptable_func members print_clk_levels, 
emit_clk_levels

 == Test ==
 LOGFILE=pp_clk.test.log
 AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
 AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | awk '{print 
$9}'`
 HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}

 lspci -nn | grep "VGA\|Display"  > $LOGFILE
 FILES="pp_od_clk_voltage
 pp_dpm_sclk
 pp_dpm_mclk
 pp_dpm_pcie
 pp_dpm_socclk
 pp_dpm_fclk
 pp_dpm_dcefclk
 pp_dpm_vclk
 pp_dpm_dclk "

 for f in $FILES
 do
   echo === $f === >> $LOGFILE
   cat $HWMON_DIR/device/$f >> $LOGFILE
 done
 cat $LOGFILE

Signed-off-by: Darren Powell 
---
 drivers/gpu/drm/amd/pm/amdgpu_dpm.c   |  2 +-
 drivers/gpu/drm/amd/pm/amdgpu_pm.c| 13 ++--
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  8 +++
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  6 +-
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 21 +--
 5 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
index e45578124928..03a690a3abb7 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
@@ -915,7 +915,7 @@ int amdgpu_dpm_emit_clock_levels(struct amdgpu_device *adev,
int ret = 0;
 
if (!pp_funcs->emit_clock_levels)
-   return 0;
+   return -ENOENT;
 
 	mutex_lock(&adev->pm.mutex);
ret = pp_funcs->emit_clock_levels(adev->powerplay.pp_handle,
diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index 97a8dcbe6eaf..a11def0ee761 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -855,13 +855,12 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device 
*dev,
return ret;
}
 
-   ret = amdgpu_dpm_emit_clock_levels(adev, od_clocks[0], buf, &size);
-   if (ret >= 0) {
-   /* Proceed with emit for other od clocks if the first call 
succeeds */
-   for(clk_index = 1 ; clk_index < 6 ; clk_index++)
-   amdgpu_dpm_emit_clock_levels(adev, 
od_clocks[clk_index], buf, &size);
+   for(clk_index = 0 ; clk_index < 6 ; clk_index++) {
+   ret = amdgpu_dpm_emit_clock_levels(adev, od_clocks[clk_index], 
buf, &size);
+   if (ret)
+   break;
}
-   else {
+   if (ret == -ENOENT) {
size = amdgpu_dpm_print_clock_levels(adev, OD_SCLK, buf);
if (size > 0) {
size += amdgpu_dpm_print_clock_levels(adev, OD_MCLK, 
buf+size);
@@ -1014,7 +1013,7 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device *dev,
}
 
ret = amdgpu_dpm_emit_clock_levels(adev, type, buf, &size);
-   if (ret < 0)
+   if (ret == -ENOENT)
size = amdgpu_dpm_print_clock_levels(adev, type, buf);
 
if (size == 0)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index d82ea7ee223f..29c101a93dc4 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -2454,12 +2454,10 @@ static int smu_emit_ppclk_levels(void *handle, enum 
pp_clock_type type, char *bu
if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
return -EOPNOTSUPP;
 
-   if (smu->ppt_funcs->emit_clk_levels) {
+   if (smu->ppt_funcs->emit_clk_levels)
ret = smu->ppt_funcs->emit_clk_levels(smu, clk_type, buf, 
offset);
-   }
-   else {
-   ret = -EOPNOTSUPP;
-   }
+   else
+   ret = -ENOENT;
 
return ret;
 
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h 
b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
index 224b869eda70..1429581dca9c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
@@ -612,6 +612,7 @@ struct pptable_funcs {
 *to buffer. Star current level.
 *
 * Used for sysfs interfaces.
+  

Re: [PATCH] drm/amd/display: Fix memory leak

2022-01-21 Thread Deucher, Alexander
[Public]

Applied.  Thanks!

From: Wentland, Harry 
Sent: Friday, January 21, 2022 2:03 PM
To: Yongzhi Liu ; Li, Sun peng (Leo) ; 
Siqueira, Rodrigo ; Deucher, Alexander 
; Koenig, Christian ; Pan, 
Xinhui ; airl...@linux.ie ; 
dan...@ffwll.ch ; Lipski, Mikita ; Lin, 
Wayne ; Kazlauskas, Nicholas ; 
Zuo, Jerry ; anson.ja...@amd.com ; 
eryk.b...@amd.com ; Pillai, Aurabindo 
; Das, Nirmoy 
Cc: amd-gfx@lists.freedesktop.org ; 
dri-de...@lists.freedesktop.org ; 
linux-ker...@vger.kernel.org 
Subject: Re: [PATCH] drm/amd/display: Fix memory leak

On 2022-01-21 06:26, Yongzhi Liu wrote:
> [why]
> Resource release is needed on the error handling path
> to prevent memory leak.
>
> [how]
> Fix this by adding kfree on the error handling path.
>
> Signed-off-by: Yongzhi Liu 

Reviewed-by: Harry Wentland 

Harry

> ---
>  .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c  | 80 
> --
>  1 file changed, 60 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> index ded64d0..e463d46 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> @@ -227,8 +227,10 @@ static ssize_t dp_link_settings_read(struct file *f, 
> char __user *buf,
>break;
>
>r = put_user(*(rd_buf + result), buf);
> - if (r)
> + if (r) {
> + kfree(rd_buf);
>return r; /* r = -EFAULT */
> + }
>
>buf += 1;
>size -= 1;
> @@ -389,8 +391,10 @@ static ssize_t dp_phy_settings_read(struct file *f, char 
> __user *buf,
>break;
>
>r = put_user((*(rd_buf + result)), buf);
> - if (r)
> + if (r) {
> + kfree(rd_buf);
>return r; /* r = -EFAULT */
> + }
>
>buf += 1;
>size -= 1;
> @@ -1359,8 +1363,10 @@ static ssize_t dp_dsc_clock_en_read(struct file *f, 
> char __user *buf,
>break;
>}
>
> - if (!pipe_ctx)
> + if (!pipe_ctx) {
> + kfree(rd_buf);
>return -ENXIO;
> + }
>
>dsc = pipe_ctx->stream_res.dsc;
>if (dsc)
> @@ -1376,8 +1382,10 @@ static ssize_t dp_dsc_clock_en_read(struct file *f, 
> char __user *buf,
>break;
>
>r = put_user(*(rd_buf + result), buf);
> - if (r)
> + if (r) {
> + kfree(rd_buf);
>return r; /* r = -EFAULT */
> + }
>
>buf += 1;
>size -= 1;
> @@ -1546,8 +1554,10 @@ static ssize_t dp_dsc_slice_width_read(struct file *f, 
> char __user *buf,
>break;
>}
>
> - if (!pipe_ctx)
> + if (!pipe_ctx) {
> + kfree(rd_buf);
>return -ENXIO;
> + }
>
>dsc = pipe_ctx->stream_res.dsc;
>if (dsc)
> @@ -1563,8 +1573,10 @@ static ssize_t dp_dsc_slice_width_read(struct file *f, 
> char __user *buf,
>break;
>
>r = put_user(*(rd_buf + result), buf);
> - if (r)
> + if (r) {
> + kfree(rd_buf);
>return r; /* r = -EFAULT */
> + }
>
>buf += 1;
>size -= 1;
> @@ -1731,8 +1743,10 @@ static ssize_t dp_dsc_slice_height_read(struct file 
> *f, char __user *buf,
>break;
>}
>
> - if (!pipe_ctx)
> + if (!pipe_ctx) {
> + kfree(rd_buf);
>return -ENXIO;
> + }
>
>dsc = pipe_ctx->stream_res.dsc;
>if (dsc)
> @@ -1748,8 +1762,10 @@ static ssize_t dp_dsc_slice_height_read(struct file 
> *f, char __user *buf,
>break;
>
>r = put_user(*(rd_buf + result), buf);
> - if (r)
> + if (r) {
> + kfree(rd_buf);
>return r; /* r = -EFAULT */
> + }
>
>buf += 1;
>size -= 1;
> @@ -1912,8 +1928,10 @@ static ssize_t dp_dsc_bits_per_pixel_read(struct file 
> *f, char __user *buf,
>break;
>}
>
> - if (!pipe_ctx)
> + if (!pipe_ctx) {
> + kfree(rd_buf);
>return -ENXIO;
> + }
>
>dsc = pipe_ctx->stream_res.dsc;
>if (dsc)
> @@ -1929,8 +1947,10 @@ static ssize_t dp_dsc_bits_per_pixel_read(struct file 
> *f, char __user *buf,
>break;
>
>r = put_user(*(rd_buf + result), buf);
> - if (r)
> + if (r) {
> + kfree(rd_buf);
>return r; /* r = -EFAULT */
> + 

Re: [PATCH] drm/amd/pm: remove useless if

2022-01-21 Thread Alex Deucher
Applied.  Thanks!

Alex

On Fri, Jan 21, 2022 at 6:48 AM Jiapeng Chong
 wrote:
>
> Clean the following coccicheck warning:
>
> ./drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c:7035:2-4: WARNING: possible
> condition with no effect (if == else).
>
> Reported-by: Abaci Robot 
> Signed-off-by: Jiapeng Chong 
> ---
>  drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c 
> b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> index 23ff0d812e4b..7427c50409d4 100644
> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> @@ -7032,10 +7032,7 @@ static int si_power_control_set_level(struct 
> amdgpu_device *adev)
> ret = si_resume_smc(adev);
> if (ret)
> return ret;
> -   ret = si_set_sw_state(adev);
> -   if (ret)
> -   return ret;
> -   return 0;
> +   return si_set_sw_state(adev);
>  }
>
>  static void si_set_vce_clock(struct amdgpu_device *adev,
> --
> 2.20.1.7.g153144c
>


Re: [PATCH] drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj

2022-01-21 Thread Alex Deucher
On Fri, Jan 21, 2022 at 2:45 AM Christian König
 wrote:
>
> Am 21.01.22 um 06:28 schrieb Xin Xiong:
> > This issue takes place in an error path in
> > amdgpu_cs_fence_to_handle_ioctl(). When `info->in.what` falls into
> > default case, the function simply returns -EINVAL, forgetting to
> > decrement the reference count of a dma_fence obj, which is bumped
> > earlier by amdgpu_cs_get_fence(). This may result in reference count
> > leaks.
> >
> > Fix it by decreasing the refcount of specific object before returning
> > the error code.
> >
> > Signed-off-by: Xin Xiong 
> > Signed-off-by: Xin Tan 
>
> Good catch. Reviewed-by: Christian König 

Applied manually.  Strangely I never got this on any of my emails, and
I don't see it in the archives.

Alex

>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 1 +
> >   1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index 0311d799a..894869789 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -1510,6 +1510,7 @@ int amdgpu_cs_fence_to_handle_ioctl(struct drm_device 
> > *dev, void *data,
> >   return 0;
> >
> >   default:
> > + dma_fence_put(fence);
> >   return -EINVAL;
> >   }
> >   }
>


[PATCH 2/4] drm/amdgpu: add work function for GPU reset

2022-01-21 Thread Sharma, Shashank

From c598dd586dd15fc5ae0a883a2e6f4094ec024085 Mon Sep 17 00:00:00 2001
From: Shashank Sharma 
Date: Fri, 21 Jan 2022 17:33:10 +0100
Subject: [PATCH 2/4] drm/amdgpu: add work function for GPU reset

This patch adds a new work function, which will get scheduled
in event of a GPU reset, and will send a uevent to indicate the same.
The userspace can do some post-processing work like collecting data
from a trace event.

Cc: Alexander Deucher 
Cc: Christian Koenig 
Signed-off-by: Shashank Sharma 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h|  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 ++
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu.h

index 269437b01328..79192f43bb71 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1057,6 +1057,8 @@ struct amdgpu_device {
struct work_struct  xgmi_reset_work;
struct list_headreset_list;

+   struct work_struct  gpu_reset_work;
+
longgfx_timeout;
longsdma_timeout;
longvideo_timeout;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

index af9bdf16eefd..e29e58240869 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -72,6 +72,7 @@
 #include 

 #include 
 #include <drm/drm_sysfs.h>

 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
@@ -3274,6 +3275,18 @@ bool amdgpu_device_has_dc_support(struct 
amdgpu_device *adev)

return amdgpu_device_asic_has_dc_support(adev->asic_type);
 }

+static void amdgpu_device_gpu_reset_func(struct work_struct *__work)
+{
+   struct amdgpu_device *adev =
+   container_of(__work, struct amdgpu_device, gpu_reset_work);
+
+   /*
+* Inform userspace that a GPU reset happened, and it should collect
+* data from the trace event.
+*/
+   drm_sysfs_gpu_reset_event(&adev->ddev);
+}
+
 static void amdgpu_device_xgmi_reset_func(struct work_struct *__work)
 {
struct amdgpu_device *adev =
@@ -3506,6 +3519,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
  amdgpu_device_delay_enable_gfx_off);

 	INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
+   INIT_WORK(&adev->gpu_reset_work, amdgpu_device_gpu_reset_func);

adev->gfx.gfx_off_req_count = 1;
adev->pm.ac_power = power_supply_is_system_supplied() > 0;
--
2.32.0



[PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler

2022-01-21 Thread Sharma, Shashank

From 899ec6060eb7d8a3d4d56ab439e4e6cdd74190a4 Mon Sep 17 00:00:00 2001
From: Somalapuram Amaranath 
Date: Fri, 21 Jan 2022 14:19:42 +0530
Subject: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler

This patch adds a GPU reset handler for the Navi ASIC family, which
typically dumps some of the registers and sends a trace event.

V2: Accommodated call to work function to send uevent

Signed-off-by: Somalapuram Amaranath 
Signed-off-by: Shashank Sharma 
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c 
b/drivers/gpu/drm/amd/amdgpu/nv.c

index 01efda4398e5..ada35d4c5245 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -528,10 +528,38 @@ nv_asic_reset_method(struct amdgpu_device *adev)
}
 }

+static void amdgpu_reset_dumps(struct amdgpu_device *adev)
+{
+   int r = 0, i;
+
+   /* original raven doesn't have full asic reset */
+   if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
+   !(adev->apu_flags & AMD_APU_IS_RAVEN2))
+   return;
+   for (i = 0; i < adev->num_ip_blocks; i++) {
+   if (!adev->ip_blocks[i].status.valid)
+   continue;
+   if (!adev->ip_blocks[i].version->funcs->reset_reg_dumps)
+   continue;
+   r = adev->ip_blocks[i].version->funcs->reset_reg_dumps(adev);
+
+   if (r)
+   DRM_ERROR("reset_reg_dumps of IP block <%s> failed %d\n",
+   adev->ip_blocks[i].version->funcs->name, r);
+   }
+
+   /* Schedule work to send uevent */
+   if (!queue_work(system_unbound_wq, >gpu_reset_work))
+   DRM_ERROR("failed to add GPU reset work\n");
+
+   dump_stack();
+}
+
 static int nv_asic_reset(struct amdgpu_device *adev)
 {
int ret = 0;

+   amdgpu_reset_dumps(adev);
switch (nv_asic_reset_method(adev)) {
case AMD_RESET_METHOD_PCI:
dev_info(adev->dev, "PCI reset\n");
--
2.32.0
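The handler above walks adev->ip_blocks and calls an optional reset_reg_dumps hook per block, skipping blocks that do not provide one and continuing past failures. The dispatch pattern, reduced to a standalone userspace C sketch (the struct layout and names here are illustrative, not the driver's):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative analog of struct amd_ip_funcs: each IP block may or may
 * not provide a reset_reg_dumps hook; NULL hooks are skipped. */
struct ip_funcs {
	const char *name;
	int (*reset_reg_dumps)(void *handle);
};

static int gfx_dump(void *handle)  { (void)handle; return 0; }
static int sdma_dump(void *handle) { (void)handle; return -5; /* -EIO */ }

static struct ip_funcs blocks[] = {
	{ "gfx",  gfx_dump },
	{ "vcn",  NULL },	/* no hook: skipped */
	{ "sdma", sdma_dump },	/* hook fails: reported, loop continues */
};

/* Walk the blocks the way amdgpu_reset_dumps() does; a failing hook is
 * logged but does not stop the walk. Returns how many hooks ran. */
static int run_reset_dumps(void)
{
	int i, r, called = 0;

	for (i = 0; i < (int)(sizeof(blocks) / sizeof(blocks[0])); i++) {
		if (!blocks[i].reset_reg_dumps)
			continue;
		r = blocks[i].reset_reg_dumps(NULL);
		called++;
		if (r)
			fprintf(stderr, "reset_reg_dumps of <%s> failed %d\n",
				blocks[i].name, r);
	}
	return called;
}
```

The point of the pattern is that the dump is best-effort: one broken IP block must not prevent the others from being captured before the reset proceeds.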



[PATCH 3/4] drm/amdgpu: add reset register trace dump function

2022-01-21 Thread Sharma, Shashank

From 1c5c552eeddaffd9fb3e7d45ece1b2b28fccc575 Mon Sep 17 00:00:00 2001
From: Somalapuram Amaranath 
Date: Fri, 21 Jan 2022 14:19:10 +0530
Subject: [PATCH 3/4] drm/amdgpu: add reset register trace dump function for
 gfx_v10_0

Implement a register trace dump function for gfx_v10_0, invoked on AMD GPU resets.

Signed-off-by: Somalapuram Amaranath 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h |  8 
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c| 53 ++-
 drivers/gpu/drm/amd/include/amd_shared.h  |  1 +
 3 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index d855cb53c7e0..c97b53b54333 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -537,6 +537,14 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
  __entry->seqno)
 );

+TRACE_EVENT(gfx_v10_0_reset_reg_dumps,
+   TP_PROTO(char *reg_dumps),
+   TP_ARGS(reg_dumps),
+   TP_STRUCT__entry(__string(dumps, reg_dumps)),
+   TP_fast_assign(__assign_str(dumps, reg_dumps);),
+   TP_printk("amdgpu register dump {%s}", __get_str(dumps))
+);
+
 #undef AMDGPU_JOB_GET_TIMELINE_NAME
 #endif

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 16dbe593cba2..05974ed5416d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -46,7 +46,7 @@
 #include "v10_structs.h"
 #include "gfx_v10_0.h"
 #include "nbio_v2_3.h"
-
+#include "amdgpu_trace.h"
 /*
  * Navi10 has two graphic rings to share each graphic pipe.
  * 1. Primary ring
@@ -188,6 +188,12 @@
 #define RLCG_ERROR_REPORT_ENABLED(adev) \
 	(amdgpu_sriov_reg_indirect_mmhub(adev) || amdgpu_sriov_reg_indirect_gc(adev))

+#define N_REGS (17)
+#define DUMP_REG(addr) do {\
+   (dump)[i][0] = (addr);  \
+   (dump)[i++][1] = RREG32_SOC15_IP(GC, addr);\
+   } while (0)
+
 MODULE_FIRMWARE("amdgpu/navi10_ce.bin");
 MODULE_FIRMWARE("amdgpu/navi10_pfp.bin");
 MODULE_FIRMWARE("amdgpu/navi10_me.bin");
@@ -7488,7 +7494,6 @@ static int gfx_v10_0_hw_init(void *handle)
 {
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
if (!amdgpu_emu_mode)
gfx_v10_0_init_golden_registers(adev);

@@ -7602,6 +7607,49 @@ static int gfx_v10_0_hw_fini(void *handle)
return 0;
 }

+static int gfx_v10_0_reset_reg_dumps(void *handle)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+   uint32_t i = 0;
+   uint32_t (*dump)[2], n_regs = 0;
+   char *reg_dumps;
+
+   dump = kmalloc(N_REGS * 2 * sizeof(uint32_t), GFP_KERNEL);
+   /* zeroed so the strlen()-based append below starts from an empty string */
+   reg_dumps = kzalloc(1024, GFP_KERNEL);
+
+   if (dump == NULL || reg_dumps == NULL) {
+   kfree(dump);
+   kfree(reg_dumps);
+   return -ENOMEM;
+   }
+
+   DUMP_REG(mmGRBM_STATUS2);
+   DUMP_REG(mmGRBM_STATUS_SE0);
+   DUMP_REG(mmGRBM_STATUS_SE1);
+   DUMP_REG(mmGRBM_STATUS_SE2);
+   DUMP_REG(mmGRBM_STATUS_SE3);
+   DUMP_REG(mmSDMA0_STATUS_REG);
+   DUMP_REG(mmSDMA1_STATUS_REG);
+   DUMP_REG(mmCP_STAT);
+   DUMP_REG(mmCP_STALLED_STAT1);
+   DUMP_REG(mmCP_STALLED_STAT2);
+   DUMP_REG(mmCP_STALLED_STAT3);
+   DUMP_REG(mmCP_CPC_STATUS);
+   DUMP_REG(mmCP_CPC_BUSY_STAT);
+   DUMP_REG(mmCP_CPC_STALLED_STAT1);
+   DUMP_REG(mmCP_CPF_STATUS);
+   DUMP_REG(mmCP_CPF_BUSY_STAT);
+   DUMP_REG(mmCP_CPF_STALLED_STAT1);
+
+   n_regs = i;
+
+   for (i = 0; i < n_regs; i++)
+		sprintf(reg_dumps + strlen(reg_dumps), "%08x: %08x, ", dump[i][0], dump[i][1]);
+
+   trace_gfx_v10_0_reset_reg_dumps(reg_dumps);
+   kfree(dump);
+   kfree(reg_dumps);
+
+   return 0;
+}
+
 static int gfx_v10_0_suspend(void *handle)
 {
return gfx_v10_0_hw_fini(handle);
@@ -9367,6 +9415,7 @@ static const struct amd_ip_funcs gfx_v10_0_ip_funcs = {
.set_clockgating_state = gfx_v10_0_set_clockgating_state,
.set_powergating_state = gfx_v10_0_set_powergating_state,
.get_clockgating_state = gfx_v10_0_get_clockgating_state,
+   .reset_reg_dumps = gfx_v10_0_reset_reg_dumps,
 };

 static const struct amdgpu_ring_funcs gfx_v10_0_ring_funcs_gfx = {
diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h
index 257f280d3d53..76d3a55278df 100644
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@ -296,6 +296,7 @@ struct amd_ip_funcs {
 enum amd_powergating_state state);
void (*get_clockgating_state)(void *handle, u32 *flags);
	int (*enable_umd_pstate)(void *handle, enum amd_dpm_forced_level *level);
+   int (*reset_reg_dumps)(void *handle);
 };


--
2.32.0
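On the consumer side, the payload of the trace event added above is a flat "%08x: %08x, " list. A hedged sketch of parsing it back into (offset, value) pairs in userspace C, with the tracefs plumbing omitted; it assumes the trailing ", " the kernel code appends after every pair:

```c
#include <assert.h>
#include <stdio.h>

/* Parse "00002034: 00000000, 00002035: 00800040, ..." (the payload of
 * the gfx_v10_0_reset_reg_dumps event) into offset/value pairs.
 * Returns the number of pairs stored, up to max. */
static int parse_reg_dump(const char *s, unsigned int *off,
			  unsigned int *val, int max)
{
	int n = 0, used;

	while (n < max) {
		used = 0;
		/* %n records how many characters one pair consumed */
		if (sscanf(s, "%x: %x,%n", &off[n], &val[n], &used) != 2 ||
		    used == 0)
			break;
		s += used;
		n++;
	}
	return n;
}
```

A profiler or crash collector could feed each line read from the trace pipe through this and resolve the offsets against the register headers.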



[PATCH 1/4] drm: add a new drm event for GPU reset

2022-01-21 Thread Sharma, Shashank

From 8144e60d9f80941dd9f8a53e4b468582aaa849b4 Mon Sep 17 00:00:00 2001
From: Shashank Sharma 
Date: Fri, 21 Jan 2022 17:23:41 +0100
Subject: [PATCH 1/4] drm: add a new drm event for GPU reset

This patch adds a DRM uevent to indicate a GPU reset event.
Userspace can register for this event and do some postprocessing
if supported by the native driver.

An example usage has been added into AMDGPU driver in this series.

Cc: Alexander Deucher 
Cc: Christian Koenig 
Signed-off-by: Shashank Sharma 
---
 drivers/gpu/drm/drm_sysfs.c | 19 +++
 include/drm/drm_sysfs.h |  1 +
 2 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
index 968a9560b4aa..0d721d536bb1 100644
--- a/drivers/gpu/drm/drm_sysfs.c
+++ b/drivers/gpu/drm/drm_sysfs.c
@@ -343,6 +343,25 @@ void drm_sysfs_hotplug_event(struct drm_device *dev)
 }
 EXPORT_SYMBOL(drm_sysfs_hotplug_event);

+/**
+ * drm_sysfs_gpu_reset_event - generate a GPU reset uevent
+ * @dev: DRM device
+ *
+ * Send a uevent for the DRM device specified by @dev. This event will
+ * inform userspace that a GPU reset happened, and it can do some
+ * postprocessing (e.g., collecting data from a trace event).
+ */
+void drm_sysfs_gpu_reset_event(struct drm_device *dev)
+{
+   char *event_string = "RESET=1";
+   char *envp[] = { event_string, NULL };
+
+   DRM_DEBUG("generating gpu reset event\n");
+
+   kobject_uevent_env(>primary->kdev->kobj, KOBJ_CHANGE, envp);
+}
+EXPORT_SYMBOL(drm_sysfs_gpu_reset_event);
+
 /**
  * drm_sysfs_connector_status_event - generate a DRM uevent for connector
  * property status change
diff --git a/include/drm/drm_sysfs.h b/include/drm/drm_sysfs.h
index d454ef617b2c..312956a0a1bf 100644
--- a/include/drm/drm_sysfs.h
+++ b/include/drm/drm_sysfs.h
@@ -11,6 +11,7 @@ int drm_class_device_register(struct device *dev);
 void drm_class_device_unregister(struct device *dev);

 void drm_sysfs_hotplug_event(struct drm_device *dev);
+void drm_sysfs_gpu_reset_event(struct drm_device *dev);
 void drm_sysfs_connector_status_event(struct drm_connector *connector,
  struct drm_property *property);
 #endif
--
2.32.0
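A consumer receives this uevent as a list of KEY=VALUE properties, matching the envp[] layout built in drm_sysfs_gpu_reset_event() above (e.g. via a netlink uevent socket or a udev monitor). A minimal sketch of the matching logic; any property name other than RESET=1 below is an assumption for illustration:

```c
#include <assert.h>
#include <string.h>

/* Return 1 if a uevent property list (NULL-terminated array of
 * "KEY=VALUE" strings) signals a GPU reset via the RESET=1 property
 * added by drm_sysfs_gpu_reset_event(). */
static int uevent_is_gpu_reset(const char **envp)
{
	int i;

	for (i = 0; envp[i]; i++)
		if (strcmp(envp[i], "RESET=1") == 0)
			return 1;
	return 0;
}
```

A daemon would typically react to this by reading the register dump out of the trace buffer before it is overwritten.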



[PATCH 5/5] drm/amdgpu: convert amdgpu_display_supported_domains() to IP versions

2022-01-21 Thread Alex Deucher
Check IP versions rather than asic types.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 29 -
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 82011e75ed85..6cad39c31c58 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -510,19 +510,24 @@ uint32_t amdgpu_display_supported_domains(struct amdgpu_device *adev,
case CHIP_STONEY:
domain |= AMDGPU_GEM_DOMAIN_GTT;
break;
-   case CHIP_RAVEN:
-   /* enable S/G on PCO and RV2 */
-   if ((adev->apu_flags & AMD_APU_IS_RAVEN2) ||
-   (adev->apu_flags & AMD_APU_IS_PICASSO))
-   domain |= AMDGPU_GEM_DOMAIN_GTT;
-   break;
-   case CHIP_RENOIR:
-   case CHIP_VANGOGH:
-   case CHIP_YELLOW_CARP:
-   domain |= AMDGPU_GEM_DOMAIN_GTT;
-   break;
-
default:
+   switch (adev->ip_versions[DCE_HWIP][0]) {
+   case IP_VERSION(1, 0, 0):
+   case IP_VERSION(1, 0, 1):
+   /* enable S/G on PCO and RV2 */
+   if ((adev->apu_flags & AMD_APU_IS_RAVEN2) ||
+   (adev->apu_flags & AMD_APU_IS_PICASSO))
+   domain |= AMDGPU_GEM_DOMAIN_GTT;
+   break;
+   case IP_VERSION(2, 1, 0):
+   case IP_VERSION(3, 0, 1):
+   case IP_VERSION(3, 1, 2):
+   case IP_VERSION(3, 1, 3):
+   domain |= AMDGPU_GEM_DOMAIN_GTT;
+   break;
+   default:
+   break;
+   }
break;
}
}
-- 
2.34.1



[PATCH 4/5] drm/amdgpu: handle BACO synchronization with secondary funcs

2022-01-21 Thread Alex Deucher
Extend secondary function handling for runtime pm beyond audio
to USB and UCSI.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 30 +++--
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 89c3578bc818..119a5798623e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1968,19 +1968,25 @@ static bool amdgpu_is_fw_framebuffer(resource_size_t base,
return found;
 }
 
-static void amdgpu_get_audio_func(struct amdgpu_device *adev)
+static void amdgpu_get_secondary_funcs(struct amdgpu_device *adev)
 {
struct pci_dev *p = NULL;
+   int i;
 
-   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
-   adev->pdev->bus->number, 1);
-   if (p) {
-   pm_runtime_get_sync(>dev);
-
-   pm_runtime_mark_last_busy(>dev);
-   pm_runtime_put_autosuspend(>dev);
-
-   pci_dev_put(p);
+   /* 0 - GPU
+* 1 - audio
+* 2 - USB
+* 3 - UCSI
+*/
+   for (i = 1; i < 4; i++) {
+   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
+   adev->pdev->bus->number, i);
+   if (p) {
+   pm_runtime_get_sync(>dev);
+   pm_runtime_mark_last_busy(>dev);
+   pm_runtime_put_autosuspend(>dev);
+   pci_dev_put(p);
+   }
}
 }
 
@@ -2148,14 +2154,14 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
 * be no PMFW-aware D-state transition(D0->D3) on runpm
 * suspend. Thus the BACO will be not correctly kicked in.
 *
-* Via amdgpu_get_audio_func(), the audio dev is put
+* Via amdgpu_get_secondary_funcs(), the audio dev is put
 * into D0 state. Then there will be a PMFW-aware D-state
 * transition(D0->D3) on runpm suspend.
 */
if (amdgpu_device_supports_baco(ddev) &&
!(adev->flags & AMD_IS_APU) &&
(adev->asic_type >= CHIP_NAVI10))
-   amdgpu_get_audio_func(adev);
+   amdgpu_get_secondary_funcs(adev);
}
 
return 0;
-- 
2.34.1



[PATCH 3/5] drm/amdgpu: move runtime pm init after drm and fbdev init

2022-01-21 Thread Alex Deucher
Seems more logical to enable runtime pm at the end of
the init sequence so we don't end up entering runtime
suspend before init is finished.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 65 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 69 +
 2 files changed, 66 insertions(+), 68 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 75ceb43392b1..89c3578bc818 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1968,6 +1968,22 @@ static bool amdgpu_is_fw_framebuffer(resource_size_t base,
return found;
 }
 
+static void amdgpu_get_audio_func(struct amdgpu_device *adev)
+{
+   struct pci_dev *p = NULL;
+
+   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
+   adev->pdev->bus->number, 1);
+   if (p) {
+   pm_runtime_get_sync(>dev);
+
+   pm_runtime_mark_last_busy(>dev);
+   pm_runtime_put_autosuspend(>dev);
+
+   pci_dev_put(p);
+   }
+}
+
 static int amdgpu_pci_probe(struct pci_dev *pdev,
const struct pci_device_id *ent)
 {
@@ -2100,6 +2116,48 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
if (ret)
DRM_ERROR("Creating debugfs files failed (%d).\n", ret);
 
+   if (adev->runpm) {
+   /* only need to skip on ATPX */
+   if (amdgpu_device_supports_px(ddev))
+   dev_pm_set_driver_flags(ddev->dev, 
DPM_FLAG_NO_DIRECT_COMPLETE);
+   /* we want direct complete for BOCO */
+   if (amdgpu_device_supports_boco(ddev))
+   dev_pm_set_driver_flags(ddev->dev, 
DPM_FLAG_SMART_PREPARE |
+   DPM_FLAG_SMART_SUSPEND |
+   DPM_FLAG_MAY_SKIP_RESUME);
+   pm_runtime_use_autosuspend(ddev->dev);
+   pm_runtime_set_autosuspend_delay(ddev->dev, 5000);
+
+   pm_runtime_allow(ddev->dev);
+
+   pm_runtime_mark_last_busy(ddev->dev);
+   pm_runtime_put_autosuspend(ddev->dev);
+
+   /*
+* For runpm implemented via BACO, PMFW will handle the
+* timing for BACO in and out:
+*   - put ASIC into BACO state only when both video and
+* audio functions are in D3 state.
+*   - pull ASIC out of BACO state when either video or
+* audio function is in D0 state.
+* Also, at startup, PMFW assumes both functions are in
+* D0 state.
+*
+* So if snd driver was loaded prior to amdgpu driver
+* and audio function was put into D3 state, there will
+* be no PMFW-aware D-state transition(D0->D3) on runpm
+* suspend. Thus the BACO will be not correctly kicked in.
+*
+* Via amdgpu_get_audio_func(), the audio dev is put
+* into D0 state. Then there will be a PMFW-aware D-state
+* transition(D0->D3) on runpm suspend.
+*/
+   if (amdgpu_device_supports_baco(ddev) &&
+   !(adev->flags & AMD_IS_APU) &&
+   (adev->asic_type >= CHIP_NAVI10))
+   amdgpu_get_audio_func(adev);
+   }
+
return 0;
 
 err_pci:
@@ -2111,8 +2169,15 @@ static void
 amdgpu_pci_remove(struct pci_dev *pdev)
 {
struct drm_device *dev = pci_get_drvdata(pdev);
+   struct amdgpu_device *adev = drm_to_adev(dev);
 
drm_dev_unplug(dev);
+
+   if (adev->runpm) {
+   pm_runtime_get_sync(dev->dev);
+   pm_runtime_forbid(dev->dev);
+   }
+
amdgpu_driver_unload_kms(dev);
 
/*
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 16a6da09c924..4c437ad2dd9a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -87,11 +87,6 @@ void amdgpu_driver_unload_kms(struct drm_device *dev)
if (adev->rmmio == NULL)
return;
 
-   if (adev->runpm) {
-   pm_runtime_get_sync(dev->dev);
-   pm_runtime_forbid(dev->dev);
-   }
-
if (amdgpu_acpi_smart_shift_update(dev, AMDGPU_SS_DRV_UNLOAD))
DRM_WARN("smart shift update failed\n");
 
@@ -124,22 +119,6 @@ void amdgpu_register_gpu_instance(struct amdgpu_device *adev)
mutex_unlock(_info.mutex);
 }
 
-static void amdgpu_get_audio_func(struct amdgpu_device *adev)
-{
-   struct pci_dev *p = NULL;
-
-   p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus),
-   adev->pdev->bus->number, 1);
-   if (p) {
-   

[PATCH 1/5] drm/amdgpu: set APU flag based on IP discovery table

2022-01-21 Thread Alex Deucher
Use the IP versions to set the APU flag when necessary.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index e6a26b554254..ddbe13c9e4c7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -1253,6 +1253,19 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev)
return -EINVAL;
}
 
+   switch (adev->ip_versions[GC_HWIP][0]) {
+   case IP_VERSION(9, 1, 0):
+   case IP_VERSION(9, 2, 2):
+   case IP_VERSION(9, 3, 0):
+   case IP_VERSION(10, 1, 3):
+   case IP_VERSION(10, 3, 1):
+   case IP_VERSION(10, 3, 3):
+   adev->flags |= AMD_IS_APU;
+   break;
+   default:
+   break;
+   }
+
if (adev->ip_versions[XGMI_HWIP][0] == IP_VERSION(4, 8, 0))
adev->gmc.xgmi.supported = true;
 
-- 
2.34.1
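For reference, IP_VERSION() packs the discovered (major, minor, revision) triple into a single integer so that versions compare numerically and can be matched in a switch. A standalone sketch; the 16/8/0 shift layout mirrors the macro in amd_shared.h, but treat it as an assumption here, and the helper name is illustrative:

```c
#include <assert.h>

/* Pack an IP version triple into one integer; higher versions compare
 * greater as plain integers, which is what range checks rely on. */
#define IP_VERSION(maj, min, rev) (((maj) << 16) | ((min) << 8) | (rev))

/* Mirror of the GC_HWIP switch in the patch above: these GC versions
 * identify APUs. */
static int is_apu_gc_version(unsigned int ver)
{
	switch (ver) {
	case IP_VERSION(9, 1, 0):
	case IP_VERSION(9, 2, 2):
	case IP_VERSION(9, 3, 0):
	case IP_VERSION(10, 1, 3):
	case IP_VERSION(10, 3, 1):
	case IP_VERSION(10, 3, 3):
		return 1;
	default:
		return 0;
	}
}
```

This is why keying off IP versions scales better than asic_type: new chips with an already-supported IP revision need no new enum entry.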



[PATCH 2/5] drm/amdgpu: move PX checking into amdgpu_device_ip_early_init

2022-01-21 Thread Alex Deucher
We need to set the APU flag from IP discovery before
we evaluate this code.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c| 13 -
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 061ea30650cc..df78adecf157 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -2073,6 +2074,8 @@ static int amdgpu_device_parse_gpu_info_fw(struct amdgpu_device *adev)
  */
 static int amdgpu_device_ip_early_init(struct amdgpu_device *adev)
 {
+   struct drm_device *dev = adev_to_drm(adev);
+   struct pci_dev *parent;
int i, r;
 
amdgpu_device_enable_virtual_display(adev);
@@ -2137,6 +2140,18 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev)
break;
}
 
+   if (amdgpu_has_atpx() &&
+   (amdgpu_is_atpx_hybrid() ||
+amdgpu_has_atpx_dgpu_power_cntl()) &&
+   ((adev->flags & AMD_IS_APU) == 0) &&
+   !pci_is_thunderbolt_attached(to_pci_dev(dev->dev)))
+   adev->flags |= AMD_IS_PX;
+
+   if (!(adev->flags & AMD_IS_APU)) {
+   parent = pci_upstream_bridge(adev->pdev);
+   adev->has_pr3 = parent ? pci_pr3_present(parent) : false;
+   }
+
amdgpu_amdkfd_device_probe(adev);
 
adev->pm.pp_feature = amdgpu_pp_feature_mask;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index 56c2d988694a..16a6da09c924 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -152,23 +152,10 @@ static void amdgpu_get_audio_func(struct amdgpu_device *adev)
 int amdgpu_driver_load_kms(struct amdgpu_device *adev, unsigned long flags)
 {
struct drm_device *dev;
-   struct pci_dev *parent;
int r, acpi_status;
 
dev = adev_to_drm(adev);
 
-   if (amdgpu_has_atpx() &&
-   (amdgpu_is_atpx_hybrid() ||
-amdgpu_has_atpx_dgpu_power_cntl()) &&
-   ((flags & AMD_IS_APU) == 0) &&
-   !pci_is_thunderbolt_attached(to_pci_dev(dev->dev)))
-   flags |= AMD_IS_PX;
-
-   if (!(flags & AMD_IS_APU)) {
-   parent = pci_upstream_bridge(adev->pdev);
-   adev->has_pr3 = parent ? pci_pr3_present(parent) : false;
-   }
-
/* amdgpu_device_init should report only fatal error
 * like memory allocation failure or iomapping failure,
 * or memory manager initialization failure, it must
-- 
2.34.1



[PATCH] drm/amdgpu/pm/smu7: drop message about VI performance levels

2022-01-21 Thread Alex Deucher
Earlier chips only had two performance levels, but newer
ones potentially have more.  The message is harmless.  Drop the
message to avoid spamming the log.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1874
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
index cd99db0dc2be..a1e11037831a 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
@@ -3295,10 +3295,6 @@ static int smu7_apply_state_adjust_rules(struct pp_hwmgr *hwmgr,
request_ps->classification.ui_label);
data->mclk_ignore_signal = false;
 
-   PP_ASSERT_WITH_CODE(smu7_ps->performance_level_count == 2,
-"VI should always have 2 performance levels",
-   );
-
max_limits = adev->pm.ac_power ?
&(hwmgr->dyn_state.max_clock_voltage_on_ac) :
&(hwmgr->dyn_state.max_clock_voltage_on_dc);
-- 
2.34.1



Re: [PATCH] drm/amd/display: Fix memory leak

2022-01-21 Thread Harry Wentland
On 2022-01-21 06:26, Yongzhi Liu wrote:
> [why]
> Resource release is needed on the error handling path
> to prevent memory leak.
> 
> [how]
> Fix this by adding kfree on the error handling path.
> 
> Signed-off-by: Yongzhi Liu 

Reviewed-by: Harry Wentland 

Harry

> ---
>  .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c  | 80 --
>  1 file changed, 60 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> index ded64d0..e463d46 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> @@ -227,8 +227,10 @@ static ssize_t dp_link_settings_read(struct file *f, char __user *buf,
>   break;
>  
>   r = put_user(*(rd_buf + result), buf);
> - if (r)
> + if (r) {
> + kfree(rd_buf);
>   return r; /* r = -EFAULT */
> + }
>  
>   buf += 1;
>   size -= 1;
> @@ -389,8 +391,10 @@ static ssize_t dp_phy_settings_read(struct file *f, char __user *buf,
>   break;
>  
>   r = put_user((*(rd_buf + result)), buf);
> - if (r)
> + if (r) {
> + kfree(rd_buf);
>   return r; /* r = -EFAULT */
> + }
>  
>   buf += 1;
>   size -= 1;
> @@ -1359,8 +1363,10 @@ static ssize_t dp_dsc_clock_en_read(struct file *f, char __user *buf,
>   break;
>   }
>  
> - if (!pipe_ctx)
> + if (!pipe_ctx) {
> + kfree(rd_buf);
>   return -ENXIO;
> + }
>  
>   dsc = pipe_ctx->stream_res.dsc;
>   if (dsc)
> @@ -1376,8 +1382,10 @@ static ssize_t dp_dsc_clock_en_read(struct file *f, char __user *buf,
>   break;
>  
>   r = put_user(*(rd_buf + result), buf);
> - if (r)
> + if (r) {
> + kfree(rd_buf);
>   return r; /* r = -EFAULT */
> + }
>  
>   buf += 1;
>   size -= 1;
> @@ -1546,8 +1554,10 @@ static ssize_t dp_dsc_slice_width_read(struct file *f, char __user *buf,
>   break;
>   }
>  
> - if (!pipe_ctx)
> + if (!pipe_ctx) {
> + kfree(rd_buf);
>   return -ENXIO;
> + }
>  
>   dsc = pipe_ctx->stream_res.dsc;
>   if (dsc)
> @@ -1563,8 +1573,10 @@ static ssize_t dp_dsc_slice_width_read(struct file *f, char __user *buf,
>   break;
>  
>   r = put_user(*(rd_buf + result), buf);
> - if (r)
> + if (r) {
> + kfree(rd_buf);
>   return r; /* r = -EFAULT */
> + }
>  
>   buf += 1;
>   size -= 1;
> @@ -1731,8 +1743,10 @@ static ssize_t dp_dsc_slice_height_read(struct file *f, char __user *buf,
>   break;
>   }
>  
> - if (!pipe_ctx)
> + if (!pipe_ctx) {
> + kfree(rd_buf);
>   return -ENXIO;
> + }
>  
>   dsc = pipe_ctx->stream_res.dsc;
>   if (dsc)
> @@ -1748,8 +1762,10 @@ static ssize_t dp_dsc_slice_height_read(struct file *f, char __user *buf,
>   break;
>  
>   r = put_user(*(rd_buf + result), buf);
> - if (r)
> + if (r) {
> + kfree(rd_buf);
>   return r; /* r = -EFAULT */
> + }
>  
>   buf += 1;
>   size -= 1;
> @@ -1912,8 +1928,10 @@ static ssize_t dp_dsc_bits_per_pixel_read(struct file *f, char __user *buf,
>   break;
>   }
>  
> - if (!pipe_ctx)
> + if (!pipe_ctx) {
> + kfree(rd_buf);
>   return -ENXIO;
> + }
>  
>   dsc = pipe_ctx->stream_res.dsc;
>   if (dsc)
> @@ -1929,8 +1947,10 @@ static ssize_t dp_dsc_bits_per_pixel_read(struct file *f, char __user *buf,
>   break;
>  
>   r = put_user(*(rd_buf + result), buf);
> - if (r)
> + if (r) {
> + kfree(rd_buf);
>   return r; /* r = -EFAULT */
> + }
>  
>   buf += 1;
>   size -= 1;
> @@ -2088,8 +2108,10 @@ static ssize_t dp_dsc_pic_width_read(struct file *f, char __user *buf,
>   break;
>   }
>  
> - if (!pipe_ctx)
> + if (!pipe_ctx) {
> + kfree(rd_buf);
>   return -ENXIO;
> + }
>  
>   dsc = pipe_ctx->stream_res.dsc;
>   if (dsc)
> @@ -2105,8 +2127,10 @@ static ssize_t dp_dsc_pic_width_read(struct file *f, char __user *buf,
>   break;
>  
>   r = put_user(*(rd_buf + result), 

Re: [PATCH v2 0/8] HMM profiler interface

2022-01-21 Thread Deucher, Alexander
[Public]

Please provide a link to the proposed userspace branch that makes use of this.

Alex

From: amd-gfx  on behalf of Philip Yang 

Sent: Thursday, January 20, 2022 6:13 PM
To: amd-gfx@lists.freedesktop.org 
Cc: Yang, Philip ; Kuehling, Felix 
Subject: [PATCH v2 0/8] HMM profiler interface

The ROCm profiler would expose the data from the KFD profiling APIs to
application developers so they can tune applications based on how the
address-range attributes affect behavior and performance.

Per process event log use the existing SMI (system management interface)
event API. Each event log is one line of text with the event specific
information.

v2:
 * Keep existing events behaviour
 * Use ktime_get_boottime_ns() as timestamp to correlate with other APIs
 * Use compact message layout, stick with existing message convention
 * Add unmap from GPU event

Philip Yang (8):
  drm/amdkfd: Correct SMI event read size
  drm/amdkfd: Add KFD SMI event IDs and triggers
  drm/amdkfd: Enable per process SMI event
  drm/amdkfd: Add GPU recoverable fault SMI event
  drm/amdkfd: add migration SMI event
  drm/amdkfd: Add user queue eviction restore SMI event
  drm/amdkfd: Add unmap from GPU SMI event
  drm/amdkfd: Bump KFD API version for SMI profiling event

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|   7 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  11 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device.c   |   4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c  |  67 ---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.h  |   5 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_process.c  |  37 +++-
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c   | 163 +-
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h   |  19 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c  |  63 +--
 include/uapi/linux/kfd_ioctl.h|  30 +++-
 11 files changed, 343 insertions(+), 65 deletions(-)

--
2.17.1



RE: [PATCH v4] drm/amd: Warn users about potential s0ix problems

2022-01-21 Thread Limonciello, Mario
[Public]

> The s2idle suspend/resume process also seems to depend on
> CONFIG_SUSPEND. Moreover, why does this check function still return true even
> when the BIOS/AMD PMC is not configured correctly? We are still looking into
> some s0ix abort issues, and the system will run into such problems when those
> misconfigured cases are also marked as s0ix.

If it returns false, the driver expects to go down the "S3 path", which "may"
work now after the patches that reset the GPU on resume landed.

If you think this is wrong and it should be returning false in this case, then
I think we need another function that checks whether S3 is actually a valid
target.  Then in suspend we test whether we're doing S3, and we test whether
we're (safely) doing s0ix.  If we can do neither, the GPU stays fully powered
up across the suspend cycle and you see these warnings.
Thoughts on that?

> 
> Thanks,
> Prike
> > -Original Message-
> > From: Limonciello, Mario 
> > Sent: Friday, January 21, 2022 3:59 AM
> > To: amd-gfx@lists.freedesktop.org; Lazar, Lijo ; Liang,
> > Prike 
> > Cc: Bjoren Dasse 
> > Subject: RE: [PATCH v4] drm/amd: Warn users about potential s0ix problems
> >
> > [Public]
> >
> > Add back on Lijo and Prike, my mistake they got dropped from CC.
> >
> > > -Original Message-
> > > From: Limonciello, Mario 
> > > Sent: Tuesday, January 18, 2022 21:41
> > > To: amd-gfx@lists.freedesktop.org
> > > Cc: Limonciello, Mario ; Bjoren Dasse
> > > 
> > > Subject: [PATCH v4] drm/amd: Warn users about potential s0ix problems
> > >
> > > On some OEM setups users can configure the BIOS for S3 or S2idle.
> > > When configured to S3 users can still choose 's2idle' in the kernel by
> > > using `/sys/power/mem_sleep`.  Before commit 6dc8265f9803
> > ("drm/amdgpu:
> > > always reset the asic in suspend (v2)"), the GPU would crash.  Now
> > > when configured this way, the system should resume but will use more
> > power.
> > >
> > > As such, adjust the `amdgpu_acpi_is_s0ix_active` function to warn users about
> > > potential power consumption issues during their first attempt at
> > > suspending.
> > >
> > > Reported-by: Bjoren Dasse 
> > > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1824
> > > Signed-off-by: Mario Limonciello 
> > > ---
> > > v3->v4:
> > >  * Add back in CONFIG_SUSPEND check
> > > v2->v3:
> > >  * Better direct users how to recover in the bad cases
> > > v1->v2:
> > >  * Only show messages in s2idle cases
> > >
> > >  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 21 +++--
> > >  1 file changed, 15 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > > index 4811b0faafd9..2531da6cbec3 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > > @@ -1040,11 +1040,20 @@ void amdgpu_acpi_detect(void)
> > >   */
> > >  bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev)
> > >  {
> > > -#if IS_ENABLED(CONFIG_AMD_PMC) && IS_ENABLED(CONFIG_SUSPEND)
> > > - if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0) {
> > > - if (adev->flags & AMD_IS_APU)
> > > - return pm_suspend_target_state == PM_SUSPEND_TO_IDLE;
> > > - }
> > > -#endif
> > > +#if IS_ENABLED(CONFIG_SUSPEND)
> > > + if (!(adev->flags & AMD_IS_APU) ||
> > > + pm_suspend_target_state != PM_SUSPEND_TO_IDLE)
> > > + return false;
> > > +#else
> > >   return false;
> > > +#endif
> > > + if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0))
> > > + dev_warn_once(adev->dev,
> > > +   "Power consumption will be higher as BIOS has not been configured for suspend-to-idle.\n"
> > > +   "To use suspend-to-idle change the sleep mode in BIOS setup.\n");
> > > +#if !IS_ENABLED(CONFIG_AMD_PMC)
> > > + dev_warn_once(adev->dev,
> > > +   "Power consumption will be higher as the kernel has not been compiled with CONFIG_AMD_PMC.\n");
> > > +#endif
> > > + return true;
> > >  }
> > > --
> > > 2.25.1


Re: [PATCH 1/2] drm/amdgpu/display: adjust msleep limit in dp_wait_for_training_aux_rd_interval

2022-01-21 Thread Deucher, Alexander
[Public]

It just changes the limit for when we use msleep vs udelay, not the units.

Alex

From: Chen, Guchun 
Sent: Thursday, January 20, 2022 8:49 PM
To: Deucher, Alexander ; 
amd-gfx@lists.freedesktop.org 
Cc: Deucher, Alexander 
Subject: RE: [PATCH 1/2] drm/amdgpu/display: adjust msleep limit in 
dp_wait_for_training_aux_rd_interval

[Public]

If we change the if condition, what about the division "wait_in_micro_secs/1000",
since the sleep time is shorter now? Shall we adjust it as well?

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Friday, January 21, 2022 2:04 AM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander 
Subject: [PATCH 1/2] drm/amdgpu/display: adjust msleep limit in 
dp_wait_for_training_aux_rd_interval

Some architectures (e.g., ARM) have relatively low udelay limits.
On most architectures, anything longer than 2000us is not recommended.
Change the check to align with other similar checks in DC.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 1f8831156bc4..aa1c67c3c386 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -202,7 +202,7 @@ void dp_wait_for_training_aux_rd_interval(
 uint32_t wait_in_micro_secs)
 {
 #if defined(CONFIG_DRM_AMD_DC_DCN)
-   if (wait_in_micro_secs > 16000)
+   if (wait_in_micro_secs > 1000)
 msleep(wait_in_micro_secs/1000);
 else
 udelay(wait_in_micro_secs);
--
2.34.1
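The patch only moves the threshold at which the wait switches from udelay() busy-waiting to msleep(); the division by 1000 converts microseconds to msleep()'s milliseconds either way. Modeling just the branch as a standalone sketch (the enum and function names are illustrative, not driver code):

```c
#include <assert.h>

enum wait_mech { USE_UDELAY, USE_MSLEEP };

/* Mirror of the branch in dp_wait_for_training_aux_rd_interval(): waits
 * above the threshold sleep via msleep(us / 1000); shorter waits spin in
 * udelay(us). Lowering the threshold from 16000 to 1000 only widens the
 * range handled by msleep(). */
static enum wait_mech pick_wait(unsigned int us, unsigned int threshold_us)
{
	return us > threshold_us ? USE_MSLEEP : USE_UDELAY;
}
```

So a 16 ms training interval, which previously spun in udelay() (problematic on architectures such as ARM with low udelay limits), now sleeps instead, while sub-millisecond waits still use udelay().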


Re: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM

2022-01-21 Thread Alex Deucher
On Fri, Jan 21, 2022 at 3:35 AM Thorsten Leemhuis
 wrote:
>
> Hi, this is your Linux kernel regression tracker speaking.
>
> On 21.01.22 03:13, James Turner wrote:
> >
> > I finished the bisection (log below). The issue was introduced in
> > f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)").
>
> FWIW, that was:
>
> > drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)
> > They are global ACPI methods, so make the structures
> > global in the driver. This simplified a number of things
> > in the handling of these methods.
> >
> > v2: reset the handle if verify interface fails (Lijo)
> > v3: fix compilation when ACPI is not defined.
> >
> > Reviewed-by: Lijo Lazar 
> > Signed-off-by: Alex Deucher 
>
> In that case we need to get those two and the maintainers for the driver
> involved by addressing them with this mail. And to make it easy for them
> here is a link and a quote from the original report:
>
> https://lore.kernel.org/all/87ee57c8fu@turner.link/

Are you ever loading the amdgpu driver in your tests?  If not, I don't
see how this patch would affect anything as the driver code would
never have executed.  It would appear not based on your example.

Alex

>
> ```
> > Hi,
> >
> > With newer kernels, starting with the v5.14 series, when using a MS
> > Windows 10 guest VM with PCI passthrough of an AMD Radeon Pro WX 3200
> > discrete GPU, the passed-through GPU will not run above 501 MHz, even
> > when it is under 100% load and well below the temperature limit. As a
> > result, GPU-intensive software (such as video games) runs unusably
> > slowly in the VM.
> >
> > In contrast, with older kernels, the passed-through GPU runs at up to
> > 1295 MHz (the correct hardware limit), so GPU-intensive software runs at
> > a reasonable speed in the VM.
> >
> > I've confirmed that the issue exists with the following kernel versions:
> >
> > - v5.16
> > - v5.14
> > - v5.14-rc1
> >
> > The issue does not exist with the following kernels:
> >
> > - v5.13
> > - various packaged (non-vanilla) 5.10.* Arch Linux `linux-lts` kernels
> >
> > So, the issue was introduced between v5.13 and v5.14-rc1. I'm willing to
> > bisect the commit history to narrow it down further, if that would be
> > helpful.
> >
> > The configuration details and test results are provided below. In
> > summary, for the kernels with this issue, the GPU core stays at a
> > constant 0.8 V, the GPU core clock ranges from 214 MHz to 501 MHz, and
> > the GPU memory stays at a constant 625 MHz, in the VM. For the correctly
> > working kernels, the GPU core ranges from 0.85 V to 1.0 V, the GPU core
> > clock ranges from 214 MHz to 1295 MHz, and the GPU memory stays at 1500
> > MHz, in the VM.
> >
> > Please let me know if additional information would be helpful.
> >
> > Regards,
> > James Turner
> >
> > # Configuration Details
> >
> > Hardware:
> >
> > - Dell Precision 7540 laptop
> > - CPU: Intel Core i7-9750H (x86-64)
> > - Discrete GPU: AMD Radeon Pro WX 3200
> > - The internal display is connected to the integrated GPU, and external
> >   displays are connected to the discrete GPU.
> >
> > Software:
> >
> > - KVM host: Arch Linux
> >   - self-built vanilla kernel (built using Arch Linux `PKGBUILD`
> > modified to use vanilla kernel sources from git.kernel.org)
> >   - libvirt 1:7.10.0-2
> >   - qemu 6.2.0-2
> >
> > - KVM guest: Windows 10
> >   - GPU driver: Radeon Pro Software Version 21.Q3 (Note that I also
> > experienced this issue with the 20.Q4 driver, using packaged
> > (non-vanilla) Arch Linux kernels on the host, before updating to the
> > 21.Q3 driver.)
> >
> > Kernel config:
> >
> > - For v5.13, v5.14-rc1, and v5.14, I used
> >   
> > https://github.com/archlinux/svntogit-packages/blob/89c24952adbfa645d9e1a6f12c572929f7e4e3c7/trunk/config
> >   (The build script ran `make olddefconfig` on that config file.)
> >
> > - For v5.16, I used
> >   
> > https://github.com/archlinux/svntogit-packages/blob/94f84e1ad8a530e54aa34cadbaa76e8dcc439d10/trunk/config
> >   (The build script ran `make olddefconfig` on that config file.)
> >
> > I set up the VM with PCI passthrough according to the instructions at
> > https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF
> >
> > I'm passing through the following PCI devices to the VM, as listed by
> > `lspci -D -nn`:
> >
> >   :01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, 
> > Inc. [AMD/ATI] Lexa XT [Radeon PRO WX 3200] [1002:6981]
> >   :01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] 
> > Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] [1002:aae0]
> >
> > The host kernel command line includes the following relevant options:
> >
> >   intel_iommu=on vfio-pci.ids=1002:6981,1002:aae0
> >
> > to enable IOMMU and bind the `vfio-pci` driver to the PCI devices.
> >
> > My `/etc/mkinitcpio.conf` includes the following line:
> >
> >   MODULES=(vfio_pci vfio vfio_iommu_type1 vfio_virqfd i915 amdgpu)
> >
> > to load 

[PATCH] drm/amdkfd: enable heavy-weight TLB flush on Vega20

2022-01-21 Thread Eric Huang
It is to meet the requirement for memory allocation
optimization on MI50.

Signed-off-by: Eric Huang 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 5b8ae0795c0a..d708f1a502cf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1582,7 +1582,8 @@ static int kfd_ioctl_free_memory_of_gpu(struct file 
*filep,
 static bool kfd_flush_tlb_after_unmap(struct kfd_dev *dev) {
return KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2) ||
   (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 1) &&
-   dev->adev->sdma.instance[0].fw_version >= 18);
+   dev->adev->sdma.instance[0].fw_version >= 18) ||
+   KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 0);
 }
 
 static int kfd_ioctl_map_memory_to_gpu(struct file *filep,
-- 
2.25.1



[PATCH] drm/amd/pm: remove useless if

2022-01-21 Thread Jiapeng Chong
Clean the following coccicheck warning:

./drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c:7035:2-4: WARNING: possible
condition with no effect (if == else).

Reported-by: Abaci Robot 
Signed-off-by: Jiapeng Chong 
---
 drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c 
b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
index 23ff0d812e4b..7427c50409d4 100644
--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
@@ -7032,10 +7032,7 @@ static int si_power_control_set_level(struct 
amdgpu_device *adev)
ret = si_resume_smc(adev);
if (ret)
return ret;
-   ret = si_set_sw_state(adev);
-   if (ret)
-   return ret;
-   return 0;
+   return si_set_sw_state(adev);
 }
 
 static void si_set_vce_clock(struct amdgpu_device *adev,
-- 
2.20.1.7.g153144c



Re: [PATCH v9 3/6] drm: implement top-down allocation method

2022-01-21 Thread Matthew Auld

On 19/01/2022 11:37, Arunpravin wrote:

Implemented a function which walks through the order list,
compares the offsets and returns the maximum offset block.
This method is unpredictable in obtaining the high range
address blocks, as it depends on allocation and deallocation
history. For instance, if the driver requests an address in
a specific low range, the allocator traverses from the root
block and splits the larger blocks until it reaches the
requested block; in the process of splitting, the lower
orders in the freelist are occupied with low range address
blocks, so a subsequent TOPDOWN memory request may be
returned a low range block. To overcome this issue, we could
go with the approach below.

The other approach: sort the entries of each order list in
ascending order, compare the last entry of each order list
in the freelist and return the max block. This creates
sorting overhead on every drm_buddy_free() request, plus
split-up of larger blocks for a single page request.

v2:
   - Fix alignment issues(Matthew Auld)
   - Remove unnecessary list_empty check(Matthew Auld)
   - merged the below patch to see the feature in action
  - add top-down alloc support to i915 driver

Signed-off-by: Arunpravin 
---
  drivers/gpu/drm/drm_buddy.c   | 36 ---
  drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  3 ++
  include/drm/drm_buddy.h   |  1 +
  3 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 954e31962c74..6aa5c1ce25bf 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -371,6 +371,26 @@ alloc_range_bias(struct drm_buddy *mm,
return ERR_PTR(err);
  }
  
+static struct drm_buddy_block *

+get_maxblock(struct list_head *head)
+{
+   struct drm_buddy_block *max_block = NULL, *node;
+
+   max_block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!max_block)
+   return NULL;
+
+   list_for_each_entry(node, head, link) {
+   if (drm_buddy_block_offset(node) >
+   drm_buddy_block_offset(max_block))
+   max_block = node;
+   }


If we feed in the knowledge of the visible_size (or perhaps implement 
that generically as "zones"), I think this can be done more efficiently. 
It could also be useful to track directly in the allocator how much of 
the visible_size is still available, rather than having to do that in 
the upper levels by scanning the entire list. But hopefully in practice 
this should be good enough for our needs,

Reviewed-by: Matthew Auld 


+
+   return max_block;
+}
+
  static struct drm_buddy_block *
  alloc_from_freelist(struct drm_buddy *mm,
unsigned int order,
@@ -381,11 +401,17 @@ alloc_from_freelist(struct drm_buddy *mm,
int err;
  
  	for (i = order; i <= mm->max_order; ++i) {

-   block = list_first_entry_or_null(>free_list[i],
-struct drm_buddy_block,
-link);
-   if (block)
-   break;
+   if (flags & DRM_BUDDY_TOPDOWN_ALLOCATION) {
+   block = get_maxblock(>free_list[i]);
+   if (block)
+   break;
+   } else {
+   block = list_first_entry_or_null(>free_list[i],
+struct drm_buddy_block,
+link);
+   if (block)
+   break;
+   }
}
  
  	if (!block)

diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c 
b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
index 1411f4cf1f21..3662434b64bb 100644
--- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
+++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c
@@ -53,6 +53,9 @@ static int i915_ttm_buddy_man_alloc(struct 
ttm_resource_manager *man,
INIT_LIST_HEAD(_res->blocks);
bman_res->mm = mm;
  
+	if (place->flags & TTM_PL_FLAG_TOPDOWN)

+   bman_res->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION;
+
if (place->fpfn || lpfn != man->size)
bman_res->flags |= DRM_BUDDY_RANGE_ALLOCATION;
  
diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h

index 865664b90a8a..424fc443115e 100644
--- a/include/drm/drm_buddy.h
+++ b/include/drm/drm_buddy.h
@@ -28,6 +28,7 @@
  })
  
  #define DRM_BUDDY_RANGE_ALLOCATION (1 << 0)

+#define DRM_BUDDY_TOPDOWN_ALLOCATION (1 << 1)
  
  struct drm_buddy_block {

  #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)



Re: [PATCH v9 2/6] drm: improve drm_buddy_alloc function

2022-01-21 Thread Matthew Auld

On 19/01/2022 11:37, Arunpravin wrote:

- Make drm_buddy_alloc a single function to handle
   range allocation and non-range allocation demands

- Implemented a new function alloc_range() which allocates
   the requested power-of-two block complying with range limitations

- Moved order computation and memory alignment logic from
   i915 driver to drm buddy

v2:
   merged below changes to keep the build unbroken
- drm_buddy_alloc_range() becomes obsolete and may be removed
- enable ttm range allocation (fpfn / lpfn) support in i915 driver
- apply enhanced drm_buddy_alloc() function to i915 driver

v3(Matthew Auld):
   - Fix alignment issues and remove unnecessary list_empty check
   - add more validation checks for input arguments
   - make alloc_range() block allocations as bottom-up
   - optimize order computation logic
   - replace uint64_t with u64, which is preferred in the kernel

v4(Matthew Auld):
   - keep drm_buddy_alloc_range() function implementation for generic
 actual range allocations
   - keep alloc_range() implementation for end bias allocations

v5(Matthew Auld):
   - modify drm_buddy_alloc() passing argument place->lpfn to lpfn
 as place->lpfn will currently always be zero for i915

v6(Matthew Auld):
   - fixup potential uaf - If we are unlucky and can't allocate
 enough memory when splitting blocks, where we temporarily
 end up with the given block and its buddy on the respective
 free list, then we need to ensure we delete both blocks,
 and not just the buddy, before potentially freeing them


Hmm, not sure we really want to squash existing bug fixes into this 
patch. Perhaps bring in [1] to the start of your series? i915_buddy is 
gone now. Alternatively I can resend such that it applies on top of 
drm_buddy. Your choice.


[1] https://patchwork.freedesktop.org/patch/469806/?series=98953=1



   - fix warnings reported by kernel test robot 

Signed-off-by: Arunpravin 
---
  drivers/gpu/drm/drm_buddy.c   | 326 +-
  drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  67 ++--
  drivers/gpu/drm/i915/i915_ttm_buddy_manager.h |   2 +
  include/drm/drm_buddy.h   |  22 +-
  4 files changed, 293 insertions(+), 124 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index d60878bc9c20..954e31962c74 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -282,23 +282,99 @@ void drm_buddy_free_list(struct drm_buddy *mm, struct 
list_head *objects)
  }
  EXPORT_SYMBOL(drm_buddy_free_list);
  
-/**

- * drm_buddy_alloc_blocks - allocate power-of-two blocks
- *
- * @mm: DRM buddy manager to allocate from
- * @order: size of the allocation
- *
- * The order value here translates to:
- *
- * 0 = 2^0 * mm->chunk_size
- * 1 = 2^1 * mm->chunk_size
- * 2 = 2^2 * mm->chunk_size
- *
- * Returns:
- * allocated ptr to the _buddy_block on success
- */
-struct drm_buddy_block *
-drm_buddy_alloc_blocks(struct drm_buddy *mm, unsigned int order)
+static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+   return s1 <= e2 && e1 >= s2;
+}
+
+static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2)
+{
+   return s1 <= s2 && e1 >= e2;
+}
+
+static struct drm_buddy_block *
+alloc_range_bias(struct drm_buddy *mm,
+u64 start, u64 end,
+unsigned int order)
+{
+   struct drm_buddy_block *block;
+   struct drm_buddy_block *buddy;
+   LIST_HEAD(dfs);
+   int err;
+   int i;
+
+   end = end - 1;
+
+   for (i = 0; i < mm->n_roots; ++i)
+   list_add_tail(>roots[i]->tmp_link, );
+
+   do {
+   u64 block_start;
+   u64 block_end;
+
+   block = list_first_entry_or_null(,
+struct drm_buddy_block,
+tmp_link);
+   if (!block)
+   break;
+
+   list_del(>tmp_link);
+
+   if (drm_buddy_block_order(block) < order)
+   continue;
+
+   block_start = drm_buddy_block_offset(block);
+   block_end = block_start + drm_buddy_block_size(mm, block) - 1;
+
+   if (!overlaps(start, end, block_start, block_end))
+   continue;
+
+   if (drm_buddy_block_is_allocated(block))
+   continue;
+
+   if (contains(start, end, block_start, block_end) &&
+   order == drm_buddy_block_order(block)) {
+   /*
+* Find the free block within the range.
+*/
+   if (drm_buddy_block_is_free(block))
+   return block;
+
+   continue;
+   }
+
+   if (!drm_buddy_block_is_split(block)) {
+   err = split_block(mm, block);
+   if 

RE: [PATCH v4] drm/amd: Warn users about potential s0ix problems

2022-01-21 Thread Liang, Prike
The S2idle suspend/resume process also seems to depend on CONFIG_SUSPEND. 
Moreover, why does this check function still return true even when the BIOS/AMD 
PMC is not configured correctly? We are still looking into some S0ix abort 
issues, and the system will run into such problems when those misconfigured 
cases are also marked as s0ix.

Thanks,
Prike
> -Original Message-
> From: Limonciello, Mario 
> Sent: Friday, January 21, 2022 3:59 AM
> To: amd-gfx@lists.freedesktop.org; Lazar, Lijo ; Liang,
> Prike 
> Cc: Bjoren Dasse 
> Subject: RE: [PATCH v4] drm/amd: Warn users about potential s0ix problems
> 
> [Public]
> 
> Add back on Lijo and Prike, my mistake they got dropped from CC.
> 
> > -Original Message-
> > From: Limonciello, Mario 
> > Sent: Tuesday, January 18, 2022 21:41
> > To: amd-gfx@lists.freedesktop.org
> > Cc: Limonciello, Mario ; Bjoren Dasse
> > 
> > Subject: [PATCH v4] drm/amd: Warn users about potential s0ix problems
> >
> > On some OEM setups users can configure the BIOS for S3 or S2idle.
> > When configured to S3 users can still choose 's2idle' in the kernel by
> > using `/sys/power/mem_sleep`.  Before commit 6dc8265f9803
> ("drm/amdgpu:
> > always reset the asic in suspend (v2)"), the GPU would crash.  Now
> > when configured this way, the system should resume but will use more
> power.
> >
> > As such, adjust the `amdpu_acpi_is_s0ix function` to warn users about
> > potential power consumption issues during their first attempt at
> > suspending.
> >
> > Reported-by: Bjoren Dasse 
> > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1824
> > Signed-off-by: Mario Limonciello 
> > ---
> > v3->v4:
> >  * Add back in CONFIG_SUSPEND check
> > v2->v3:
> >  * Better direct users how to recover in the bad cases
> > v1->v2:
> >  * Only show messages in s2idle cases
> >
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 21 +++-
> -
> >  1 file changed, 15 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > index 4811b0faafd9..2531da6cbec3 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
> > @@ -1040,11 +1040,20 @@ void amdgpu_acpi_detect(void)
> >   */
> >  bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev)  { -#if
> > IS_ENABLED(CONFIG_AMD_PMC) && IS_ENABLED(CONFIG_SUSPEND)
> > -   if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0) {
> > -   if (adev->flags & AMD_IS_APU)
> > -   return pm_suspend_target_state ==
> > PM_SUSPEND_TO_IDLE;
> > -   }
> > -#endif
> > +#if IS_ENABLED(CONFIG_SUSPEND)
> > +   if (!(adev->flags & AMD_IS_APU) ||
> > +   pm_suspend_target_state != PM_SUSPEND_TO_IDLE)
> > +   return false;
> > +#else
> > return false;
> > +#endif
> > +   if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0))
> > +   dev_warn_once(adev->dev,
> > + "Power consumption will be higher as BIOS has
> not
> > been configured for suspend-to-idle.\n"
> > + "To use suspend-to-idle change the sleep mode in
> > BIOS setup.\n");
> > +#if !IS_ENABLED(CONFIG_AMD_PMC)
> > +   dev_warn_once(adev->dev,
> > + "Power consumption will be higher as the kernel has not
> > been compiled with CONFIG_AMD_PMC.\n");
> > +#endif
> > +   return true;
> >  }
> > --
> > 2.25.1


RE: [PATCH Review 1/1] drm/amdgpu: fix channel index mapping for SIENNA_CICHLID

2022-01-21 Thread Zhou1, Tao
[AMD Official Use Only]



> -Original Message-
> From: Stanley.Yang 
> Sent: Friday, January 21, 2022 5:35 PM
> To: amd-gfx@lists.freedesktop.org; Zhang, Hawking
> ; Ziya, Mohammad zafar
> ; Clements, John
> ; Zhou1, Tao 
> Cc: Yang, Stanley 
> Subject: [PATCH Review 1/1] drm/amdgpu: fix channel index mapping for
> SIENNA_CICHLID
> 
> Pmfw read ecc info registers in the following order,
>  umc0: ch_inst 0, 1, 2 ... 7
>  umc1: ch_inst 0, 1, 2 ... 7
> The position of the register value stored in eccinfo table is calculated 
> according
> to the below formula,
>  channel_index = umc_inst * channel_in_umc + ch_inst Driver directly use 
> the
[Tao]: use -> uses

> index of eccinfo table array as channel index, it's not correct, driver need
[Tao]: need -> needs to

The patch itself is OK for me, with the comments above fixed, it's:

Reviewed-by: Tao Zhou 

> convert eccinfo table array index to channel index according to 
> channel_idx_tbl.
> 
> Change-Id: I26c3a99d161a00a69f7d00bd177942b6cd65a0de
> Signed-off-by: Stanley.Yang 
> ---
>  drivers/gpu/drm/amd/amdgpu/umc_v8_7.c | 29 ---
>  1 file changed, 17 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c
> b/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c
> index cd57f39df7d1..d70417196662 100644
> --- a/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c
> +++ b/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c
> @@ -55,29 +55,36 @@ static inline uint32_t
> get_umc_v8_7_channel_index(struct amdgpu_device *adev,  }
> 
>  static void umc_v8_7_ecc_info_query_correctable_error_count(struct
> amdgpu_device *adev,
> - uint32_t channel_index,
> + uint32_t umc_inst, uint32_t
> ch_inst,
>   unsigned long *error_count)
>  {
>   uint64_t mc_umc_status;
> + uint32_t eccinfo_table_idx;
>   struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
> +
> + eccinfo_table_idx = umc_inst * adev->umc.channel_inst_num + ch_inst;
> +
>   /* check for SRAM correctable error
>* MCUMC_STATUS is a 64 bit register
>*/
> - mc_umc_status = ras->umc_ecc.ecc[channel_index].mca_umc_status;
> + mc_umc_status = ras-
> >umc_ecc.ecc[eccinfo_table_idx].mca_umc_status;
>   if (REG_GET_FIELD(mc_umc_status,
> MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1 &&
>   REG_GET_FIELD(mc_umc_status,
> MCA_UMC_UMC0_MCUMC_STATUST0, CECC) == 1)
>   *error_count += 1;
>  }
> 
>  static void umc_v8_7_ecc_info_querry_uncorrectable_error_count(struct
> amdgpu_device *adev,
> - uint32_t
> channel_index,
> + uint32_t umc_inst,
> uint32_t ch_inst,
>   unsigned long
> *error_count)
>  {
>   uint64_t mc_umc_status;
> + uint32_t eccinfo_table_idx;
>   struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
> 
> + eccinfo_table_idx = umc_inst * adev->umc.channel_inst_num + ch_inst;
> +
>   /* check the MCUMC_STATUS */
> - mc_umc_status = ras->umc_ecc.ecc[channel_index].mca_umc_status;
> + mc_umc_status = ras-
> >umc_ecc.ecc[eccinfo_table_idx].mca_umc_status;
>   if ((REG_GET_FIELD(mc_umc_status,
> MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1) &&
>   (REG_GET_FIELD(mc_umc_status,
> MCA_UMC_UMC0_MCUMC_STATUST0, Deferred) == 1 ||
>   REG_GET_FIELD(mc_umc_status,
> MCA_UMC_UMC0_MCUMC_STATUST0, UECC) == 1 || @@ -94,20 +101,16 @@
> static void umc_v8_7_ecc_info_query_ras_error_count(struct amdgpu_device
> *adev,
> 
>   uint32_t umc_inst= 0;
>   uint32_t ch_inst = 0;
> - uint32_t channel_index   = 0;
> 
>   /* TODO: driver needs to toggle DF Cstate to ensure
>* safe access of UMC registers. Will add the protection
>*/
>   LOOP_UMC_INST_AND_CH(umc_inst, ch_inst) {
> - channel_index = get_umc_v8_7_channel_index(adev,
> - umc_inst,
> - ch_inst);
>   umc_v8_7_ecc_info_query_correctable_error_count(adev,
> - channel_index,
> + umc_inst, ch_inst,
>   &(err_data-
> >ce_count));
>   umc_v8_7_ecc_info_querry_uncorrectable_error_count(adev,
> - channel_index,
> + umc_inst, ch_inst,
>   &(err_data-
> >ue_count));
>   }
>  }
> @@ -120,12 +123,14 @@ static void
> umc_v8_7_ecc_info_query_error_address(struct amdgpu_device *adev,
>   uint64_t mc_umc_status, err_addr, retired_page;
>   struct eeprom_table_record 

Re: [PATCH] drm/amdgpu: drop WARN_ON in amdgpu_gart_bind/unbind

2022-01-21 Thread Christian König

Am 21.01.22 um 09:47 schrieb Guchun Chen:

NULL pointer check has guarded it already.

calltrace:
amdgpu_ttm_gart_bind+0x49/0xa0 [amdgpu]
amdgpu_ttm_alloc_gart+0x13f/0x180 [amdgpu]
amdgpu_bo_create_reserved+0x139/0x2c0 [amdgpu]
? amdgpu_ttm_debugfs_init+0x120/0x120 [amdgpu]
amdgpu_bo_create_kernel+0x17/0x80 [amdgpu]
amdgpu_ttm_init+0x542/0x5e0 [amdgpu]

Fixes: f0239505d6c4("drm/amdgpu: remove gart.ready flag")
Signed-off-by: Guchun Chen 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 53cc844346f0..91d8207336c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -161,7 +161,7 @@ void amdgpu_gart_unbind(struct amdgpu_device *adev, 
uint64_t offset,
uint64_t flags = 0;
int idx;
  
-	if (WARN_ON(!adev->gart.ptr))

+   if (!adev->gart.ptr)
return;
  
  	if (!drm_dev_enter(adev_to_drm(adev), ))

@@ -241,7 +241,7 @@ void amdgpu_gart_bind(struct amdgpu_device *adev, uint64_t 
offset,
 int pages, dma_addr_t *dma_addr,
 uint64_t flags)
  {
-   if (WARN_ON(!adev->gart.ptr))
+   if (!adev->gart.ptr)
return;
  
  	amdgpu_gart_map(adev, offset, pages, dma_addr, flags, adev->gart.ptr);




Re: [PATCH 3/7] drm/amd/pm: drop the redundant 'supported' member of smu_feature structure

2022-01-21 Thread Lazar, Lijo




On 1/21/2022 2:56 PM, Quan, Evan wrote:

[AMD Official Use Only]




-Original Message-
From: Lazar, Lijo 
Sent: Friday, January 21, 2022 4:52 PM
To: Quan, Evan ; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander ; Chen, Guchun
; Huang, Ray 
Subject: Re: [PATCH 3/7] drm/amd/pm: drop the redundant 'supported'
member of smu_feature structure



On 1/21/2022 1:14 PM, Evan Quan wrote:

As it has exactly the same value as the 'enabled' member and also does
the same thing.



I believe the original intention is different. We need to cache the features
which are really supported by PMFW on that device on init. When a request
comes through sysfs node for enable/disable of a feature, we should check
against this and disallow those which are not supported.

[Quan, Evan] Uh, I doubt it was designed with such an intention. If it was, there 
should be checks in ->get_allowed_feature_mask (e.g. 
navi10_get_allowed_feature_mask) on whether the driver tries to enable/disable 
features different from PMFW's.
" When a request comes through sysfs node for enable/disable of a feature" If the sysfs 
node mentioned is "pp_features", I might be able to provide some insight into why no 
such checks were added when I made them. Considering there may be some dependency between those dpm 
features (e.g. gfx ulv depends on gfx dpm), we actually cannot guarantee the user that their settings 
will succeed. E.g. if PMFW supports both GFX ULV and DPM but the user wants ULV only, the check 
against PMFW-supported features seems fine, but actually that cannot work due to the dependency 
between them. So, instead of applying some checks which actually cannot guard anything, we just let 
the user take the risks.



Below is one example

> -  if (!smu_cmn_feature_is_supported(smu,
SMU_FEATURE_FAN_CONTROL_BIT))
> +  if (!smu_cmn_feature_is_enabled(smu,
SMU_FEATURE_FAN_CONTROL_BIT))

Let's say user switched to manual mode, that time we disable fan control 
and move to manual mode. Next time when user requests auto mode, we 
check if fan control is originally supported on that platform and switch 
to auto.


Either way, we should cache the features which are originally supported 
on the platform (through a combination of PPTable features and allowed 
feature masks on ASICs which are applicable) and disallow anything 
outside of that.


Thanks,
Lijo


BR
Evan


Thanks,
Lijo


Signed-off-by: Evan Quan 
Change-Id: I07c9a5ac5290cd0d88a40ce1768d393156419b5a
---
   drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  1 -
   drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  1 -
   .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |  8 
   .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 10 +-
   .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 19 ---



   .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  |  5 +
   .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c|  5 +
   .../drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c  |  3 ---
   drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 17 -
   drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h|  3 ---
   10 files changed, 19 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 5ace30434e60..d3237b89f2c5 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -949,7 +949,6 @@ static int smu_sw_init(void *handle)

smu->pool_size = adev->pm.smu_prv_buffer_size;
smu->smu_feature.feature_num = SMU_FEATURE_MAX;
-   bitmap_zero(smu->smu_feature.supported, SMU_FEATURE_MAX);
bitmap_zero(smu->smu_feature.enabled, SMU_FEATURE_MAX);
bitmap_zero(smu->smu_feature.allowed, SMU_FEATURE_MAX);

diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
index 18f24db7d202..3c0360772822 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
@@ -388,7 +388,6 @@ struct smu_power_context {
   struct smu_feature
   {
uint32_t feature_num;
-   DECLARE_BITMAP(supported, SMU_FEATURE_MAX);
DECLARE_BITMAP(allowed, SMU_FEATURE_MAX);
DECLARE_BITMAP(enabled, SMU_FEATURE_MAX);
   };
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 84834c24a7e9..9fb290f9aaeb 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -1625,8 +1625,8 @@ static int navi10_display_config_changed(struct

smu_context *smu)

int ret = 0;

if ((smu->watermarks_bitmap & WATERMARKS_EXIST) &&
-   smu_cmn_feature_is_supported(smu,

SMU_FEATURE_DPM_DCEFCLK_BIT) &&

-   smu_cmn_feature_is_supported(smu,

SMU_FEATURE_DPM_SOCCLK_BIT)) {

+   smu_cmn_feature_is_enabled(smu,

SMU_FEATURE_DPM_DCEFCLK_BIT) &&

+   smu_cmn_feature_is_enabled(smu,

SMU_FEATURE_DPM_SOCCLK_BIT)) {


[PATCH Review 1/1] drm/amdgpu: fix channel index mapping for SIENNA_CICHLID

2022-01-21 Thread Stanley . Yang
Pmfw read ecc info registers in the following order,
 umc0: ch_inst 0, 1, 2 ... 7
 umc1: ch_inst 0, 1, 2 ... 7
The position of the register value stored in eccinfo
table is calculated according to the below formula,
 channel_index = umc_inst * channel_in_umc + ch_inst
Driver directly use the index of eccinfo table array
as channel index, it's not correct, driver need convert
eccinfo table array index to channel index according to
channel_idx_tbl.

Change-Id: I26c3a99d161a00a69f7d00bd177942b6cd65a0de
Signed-off-by: Stanley.Yang 
---
 drivers/gpu/drm/amd/amdgpu/umc_v8_7.c | 29 ---
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c 
b/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c
index cd57f39df7d1..d70417196662 100644
--- a/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c
+++ b/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c
@@ -55,29 +55,36 @@ static inline uint32_t get_umc_v8_7_channel_index(struct 
amdgpu_device *adev,
 }
 
 static void umc_v8_7_ecc_info_query_correctable_error_count(struct 
amdgpu_device *adev,
-   uint32_t channel_index,
+   uint32_t umc_inst, uint32_t 
ch_inst,
unsigned long *error_count)
 {
uint64_t mc_umc_status;
+   uint32_t eccinfo_table_idx;
struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
+
+   eccinfo_table_idx = umc_inst * adev->umc.channel_inst_num + ch_inst;
+
/* check for SRAM correctable error
 * MCUMC_STATUS is a 64 bit register
 */
-   mc_umc_status = ras->umc_ecc.ecc[channel_index].mca_umc_status;
+   mc_umc_status = ras->umc_ecc.ecc[eccinfo_table_idx].mca_umc_status;
if (REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1 
&&
REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, CECC) == 
1)
*error_count += 1;
 }
 
 static void umc_v8_7_ecc_info_querry_uncorrectable_error_count(struct 
amdgpu_device *adev,
-   uint32_t channel_index,
+   uint32_t umc_inst, 
uint32_t ch_inst,
unsigned long 
*error_count)
 {
uint64_t mc_umc_status;
+   uint32_t eccinfo_table_idx;
struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
 
+   eccinfo_table_idx = umc_inst * adev->umc.channel_inst_num + ch_inst;
+
/* check the MCUMC_STATUS */
-   mc_umc_status = ras->umc_ecc.ecc[channel_index].mca_umc_status;
+   mc_umc_status = ras->umc_ecc.ecc[eccinfo_table_idx].mca_umc_status;
if ((REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 
1) &&
(REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, 
Deferred) == 1 ||
REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, UECC) == 
1 ||
@@ -94,20 +101,16 @@ static void umc_v8_7_ecc_info_query_ras_error_count(struct 
amdgpu_device *adev,
 
uint32_t umc_inst= 0;
uint32_t ch_inst = 0;
-   uint32_t channel_index   = 0;
 
/* TODO: driver needs to toggle DF Cstate to ensure
 * safe access of UMC registers. Will add the protection
 */
LOOP_UMC_INST_AND_CH(umc_inst, ch_inst) {
-   channel_index = get_umc_v8_7_channel_index(adev,
-   umc_inst,
-   ch_inst);
umc_v8_7_ecc_info_query_correctable_error_count(adev,
-   channel_index,
+   umc_inst, ch_inst,
&(err_data->ce_count));
umc_v8_7_ecc_info_querry_uncorrectable_error_count(adev,
-   channel_index,
+   umc_inst, ch_inst,
&(err_data->ue_count));
}
 }
@@ -120,12 +123,14 @@ static void umc_v8_7_ecc_info_query_error_address(struct 
amdgpu_device *adev,
uint64_t mc_umc_status, err_addr, retired_page;
struct eeprom_table_record *err_rec;
uint32_t channel_index;
+   uint32_t eccinfo_table_idx;
struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
 
+   eccinfo_table_idx = umc_inst * adev->umc.channel_inst_num + ch_inst;
channel_index =
adev->umc.channel_idx_tbl[umc_inst * adev->umc.channel_inst_num 
+ ch_inst];
 
-   mc_umc_status = ras->umc_ecc.ecc[channel_index].mca_umc_status;
+   mc_umc_status = ras->umc_ecc.ecc[eccinfo_table_idx].mca_umc_status;
 
if (mc_umc_status == 0)
return;
@@ -140,7 +145,7 @@ static void 

RE: [PATCH 3/7] drm/amd/pm: drop the redundant 'supported' member of smu_feature structure

2022-01-21 Thread Quan, Evan
[AMD Official Use Only]



> -Original Message-
> From: Lazar, Lijo 
> Sent: Friday, January 21, 2022 4:52 PM
> To: Quan, Evan ; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Chen, Guchun
> ; Huang, Ray 
> Subject: Re: [PATCH 3/7] drm/amd/pm: drop the redundant 'supported'
> member of smu_feature structure
> 
> 
> 
> On 1/21/2022 1:14 PM, Evan Quan wrote:
> > As it has exactly the same value as the 'enabled' member and also does
> > the same thing.
> >
> 
> I believe the original intention is different. We need to cache the features
> which are really supported by PMFW on that device on init. When a request
> comes through sysfs node for enable/disable of a feature, we should check
> against this and disallow those which are not supported.
[Quan, Evan] I doubt it was designed with that intention. If it was, there 
should be checks in ->get_allowed_feature_mask (e.g. 
navi10_get_allowed_feature_mask) for whether the driver tries to 
enable/disable features beyond what PMFW supports.
"When a request comes through sysfs node for enable/disable of a feature": if 
the sysfs node you mean is "pp_features", I can offer some insight into why no 
such checks were added when I implemented it. Since there can be dependencies 
between those dpm features (e.g. gfx ulv depends on gfx dpm), we cannot 
guarantee that a user's settings will succeed. For example, if PMFW supports 
both GFX ULV and DPM but the user wants ULV only, a check against the 
PMFW-supported features would pass, yet the request still cannot work because 
of the dependency between them. So, instead of applying checks which cannot 
actually guard anything, we just let the user take the risk.

BR
Evan
> 
> Thanks,
> Lijo
> 
> > Signed-off-by: Evan Quan 
> > Change-Id: I07c9a5ac5290cd0d88a40ce1768d393156419b5a
> > ---
> >   drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  1 -
> >   drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  1 -
> >   .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |  8 
> >   .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 10 +-
> >   .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 19 ---
> 
> >   .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  |  5 +
> >   .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c|  5 +
> >   .../drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c  |  3 ---
> >   drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 17 -
> >   drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h|  3 ---
> >   10 files changed, 19 insertions(+), 53 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> > b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> > index 5ace30434e60..d3237b89f2c5 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> > @@ -949,7 +949,6 @@ static int smu_sw_init(void *handle)
> >
> > smu->pool_size = adev->pm.smu_prv_buffer_size;
> > smu->smu_feature.feature_num = SMU_FEATURE_MAX;
> > -   bitmap_zero(smu->smu_feature.supported, SMU_FEATURE_MAX);
> > bitmap_zero(smu->smu_feature.enabled, SMU_FEATURE_MAX);
> > bitmap_zero(smu->smu_feature.allowed, SMU_FEATURE_MAX);
> >
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> > b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> > index 18f24db7d202..3c0360772822 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> > @@ -388,7 +388,6 @@ struct smu_power_context {
> >   struct smu_feature
> >   {
> > uint32_t feature_num;
> > -   DECLARE_BITMAP(supported, SMU_FEATURE_MAX);
> > DECLARE_BITMAP(allowed, SMU_FEATURE_MAX);
> > DECLARE_BITMAP(enabled, SMU_FEATURE_MAX);
> >   };
> > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> > b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> > index 84834c24a7e9..9fb290f9aaeb 100644
> > --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> > @@ -1625,8 +1625,8 @@ static int navi10_display_config_changed(struct
> smu_context *smu)
> > int ret = 0;
> >
> > if ((smu->watermarks_bitmap & WATERMARKS_EXIST) &&
> > -   smu_cmn_feature_is_supported(smu,
> SMU_FEATURE_DPM_DCEFCLK_BIT) &&
> > -   smu_cmn_feature_is_supported(smu,
> SMU_FEATURE_DPM_SOCCLK_BIT)) {
> > +   smu_cmn_feature_is_enabled(smu,
> SMU_FEATURE_DPM_DCEFCLK_BIT) &&
> > +   smu_cmn_feature_is_enabled(smu,
> SMU_FEATURE_DPM_SOCCLK_BIT)) {
> > ret = smu_cmn_send_smc_msg_with_param(smu,
> SMU_MSG_NumOfDisplays,
> >   smu->display_config-
> >num_display,
> >   NULL);
> > @@ -1864,13 +1864,13 @@ static int
> navi10_notify_smc_display_config(struct smu_context *smu)
> > min_clocks.dcef_clock_in_sr = smu->display_config-
> >min_dcef_deep_sleep_set_clk;
> > min_clocks.memory_clock = smu->display_config-
> >min_mem_set_clock;
> >
> > -   if 

Re: [PATCH 3/7] drm/amd/pm: drop the redundant 'supported' member of smu_feature structure

2022-01-21 Thread Lazar, Lijo




On 1/21/2022 1:14 PM, Evan Quan wrote:

As it has exactly the same value as the 'enabled' member and also does
the same thing.



I believe the original intention is different. We need to cache the 
features which are really supported by PMFW on that device on init. When 
a request comes through sysfs node for enable/disable of a feature, we 
should check against this and disallow those which are not supported.


Thanks,
Lijo


Signed-off-by: Evan Quan 
Change-Id: I07c9a5ac5290cd0d88a40ce1768d393156419b5a
---
  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c |  1 -
  drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  1 -
  .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |  8 
  .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 10 +-
  .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 19 ---
  .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  |  5 +
  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c|  5 +
  .../drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c  |  3 ---
  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 17 -
  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h|  3 ---
  10 files changed, 19 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c 
b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index 5ace30434e60..d3237b89f2c5 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -949,7 +949,6 @@ static int smu_sw_init(void *handle)
  
  	smu->pool_size = adev->pm.smu_prv_buffer_size;

smu->smu_feature.feature_num = SMU_FEATURE_MAX;
-   bitmap_zero(smu->smu_feature.supported, SMU_FEATURE_MAX);
bitmap_zero(smu->smu_feature.enabled, SMU_FEATURE_MAX);
bitmap_zero(smu->smu_feature.allowed, SMU_FEATURE_MAX);
  
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h

index 18f24db7d202..3c0360772822 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
@@ -388,7 +388,6 @@ struct smu_power_context {
  struct smu_feature
  {
uint32_t feature_num;
-   DECLARE_BITMAP(supported, SMU_FEATURE_MAX);
DECLARE_BITMAP(allowed, SMU_FEATURE_MAX);
DECLARE_BITMAP(enabled, SMU_FEATURE_MAX);
  };
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 84834c24a7e9..9fb290f9aaeb 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -1625,8 +1625,8 @@ static int navi10_display_config_changed(struct 
smu_context *smu)
int ret = 0;
  
  	if ((smu->watermarks_bitmap & WATERMARKS_EXIST) &&

-   smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
-   smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
+   smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
+   smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
ret = smu_cmn_send_smc_msg_with_param(smu, 
SMU_MSG_NumOfDisplays,
  
smu->display_config->num_display,
  NULL);
@@ -1864,13 +1864,13 @@ static int navi10_notify_smc_display_config(struct 
smu_context *smu)
min_clocks.dcef_clock_in_sr = 
smu->display_config->min_dcef_deep_sleep_set_clk;
min_clocks.memory_clock = smu->display_config->min_mem_set_clock;
  
-	if (smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT)) {

+   if (smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_DCEFCLK_BIT)) {
clock_req.clock_type = amd_pp_dcef_clock;
clock_req.clock_freq_in_khz = min_clocks.dcef_clock * 10;
  
  		ret = smu_v11_0_display_clock_voltage_request(smu, &clock_req);

if (!ret) {
-   if (smu_cmn_feature_is_supported(smu, 
SMU_FEATURE_DS_DCEFCLK_BIT)) {
+   if (smu_cmn_feature_is_enabled(smu, 
SMU_FEATURE_DS_DCEFCLK_BIT)) {
ret = smu_cmn_send_smc_msg_with_param(smu,
  
SMU_MSG_SetMinDeepSleepDcefclk,
  
min_clocks.dcef_clock_in_sr/100,
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 651fe748e423..d568d6137a00 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -1281,8 +1281,8 @@ static int sienna_cichlid_display_config_changed(struct 
smu_context *smu)
int ret = 0;
  
  	if ((smu->watermarks_bitmap & WATERMARKS_EXIST) &&

-   smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
-   smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
+   

[PATCH 2/2] add register dump function for nv asic reset

2022-01-21 Thread Somalapuram Amaranath
Call the register dump function on NV ASIC reset as part of AMD GPU reset

Signed-off-by: Somalapuram Amaranath 
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 2ec1ffb36b1f..b5701772597f 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -529,10 +529,34 @@ nv_asic_reset_method(struct amdgpu_device *adev)
}
 }
 
+static void amdgpu_reset_dumps(struct amdgpu_device *adev)
+{
+   int r = 0, i;
+
+   /* original raven doesn't have full asic reset */
+   if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
+   !(adev->apu_flags & AMD_APU_IS_RAVEN2))
+   return;
+   for (i = 0; i < adev->num_ip_blocks; i++) {
+   if (!adev->ip_blocks[i].status.valid)
+   continue;
+   if (!adev->ip_blocks[i].version->funcs->reset_reg_dumps)
+   continue;
+   r = adev->ip_blocks[i].version->funcs->reset_reg_dumps(adev);
+
+   if (r)
+   DRM_ERROR("reset_reg_dumps of IP block <%s> failed 
%d\n",
+   
adev->ip_blocks[i].version->funcs->name, r);
+   }
+
+   dump_stack();
+}
+
 static int nv_asic_reset(struct amdgpu_device *adev)
 {
int ret = 0;
 
+   amdgpu_reset_dumps(adev);
switch (nv_asic_reset_method(adev)) {
case AMD_RESET_METHOD_PCI:
dev_info(adev->dev, "PCI reset\n");
-- 
2.25.1



[PATCH 1/2] drm/amdgpu: add reset register trace dump function for gfx_v10_0

2022-01-21 Thread Somalapuram Amaranath
Implement the register trace dump function for gfx_v10_0, called on AMD GPU reset

Signed-off-by: Somalapuram Amaranath 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h |  8 
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c| 53 ++-
 drivers/gpu/drm/amd/include/amd_shared.h  |  1 +
 3 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index d855cb53c7e0..c97b53b54333 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -537,6 +537,14 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
  __entry->seqno)
 );
 
+TRACE_EVENT(gfx_v10_0_reset_reg_dumps,
+   TP_PROTO(char *reg_dumps),
+   TP_ARGS(reg_dumps),
+   TP_STRUCT__entry(__string(dumps, reg_dumps)),
+   TP_fast_assign(__assign_str(dumps, reg_dumps);),
+   TP_printk("amdgpu register dump {%s}", __get_str(dumps))
+);
+
 #undef AMDGPU_JOB_GET_TIMELINE_NAME
 #endif
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index dbe7442fb25c..a63cdd0ad53a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -46,7 +46,7 @@
 #include "v10_structs.h"
 #include "gfx_v10_0.h"
 #include "nbio_v2_3.h"
-
+#include "amdgpu_trace.h"
 /*
  * Navi10 has two graphic rings to share each graphic pipe.
  * 1. Primary ring
@@ -188,6 +188,12 @@
 #define RLCG_ERROR_REPORT_ENABLED(adev) \
(amdgpu_sriov_reg_indirect_mmhub(adev) || 
amdgpu_sriov_reg_indirect_gc(adev))
 
+#define N_REGS (17)
+#define DUMP_REG(addr) do {\
+   (dump)[i][0] = (addr);  \
+   (dump)[i++][1] = RREG32_SOC15_IP(GC, addr);\
+   } while (0)
+
 MODULE_FIRMWARE("amdgpu/navi10_ce.bin");
 MODULE_FIRMWARE("amdgpu/navi10_pfp.bin");
 MODULE_FIRMWARE("amdgpu/navi10_me.bin");
@@ -7466,7 +7472,6 @@ static int gfx_v10_0_hw_init(void *handle)
 {
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
if (!amdgpu_emu_mode)
gfx_v10_0_init_golden_registers(adev);
 
@@ -7580,6 +7585,49 @@ static int gfx_v10_0_hw_fini(void *handle)
return 0;
 }
 
+static int gfx_v10_0_reset_reg_dumps(void *handle)
+{
+   struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+   uint32_t i = 0;
+   uint32_t (*dump)[2], n_regs = 0;
+   char *reg_dumps;
+
+   dump = kmalloc(N_REGS * 2 * sizeof(uint32_t), GFP_KERNEL);
+   reg_dumps = kzalloc(1024, GFP_KERNEL);
+
+   if (dump == NULL || reg_dumps == NULL) {
+   kfree(dump);
+   kfree(reg_dumps);
+   return -ENOMEM;
+   }
+
+   DUMP_REG(mmGRBM_STATUS2);
+   DUMP_REG(mmGRBM_STATUS_SE0);
+   DUMP_REG(mmGRBM_STATUS_SE1);
+   DUMP_REG(mmGRBM_STATUS_SE2);
+   DUMP_REG(mmGRBM_STATUS_SE3);
+   DUMP_REG(mmSDMA0_STATUS_REG);
+   DUMP_REG(mmSDMA1_STATUS_REG);
+   DUMP_REG(mmCP_STAT);
+   DUMP_REG(mmCP_STALLED_STAT1);
+   DUMP_REG(mmCP_STALLED_STAT2);
+   DUMP_REG(mmCP_STALLED_STAT3);
+   DUMP_REG(mmCP_CPC_STATUS);
+   DUMP_REG(mmCP_CPC_BUSY_STAT);
+   DUMP_REG(mmCP_CPC_STALLED_STAT1);
+   DUMP_REG(mmCP_CPF_STATUS);
+   DUMP_REG(mmCP_CPF_BUSY_STAT);
+   DUMP_REG(mmCP_CPF_STALLED_STAT1);
+
+   n_regs = i;
+
+   for (i = 0; i < n_regs; i++)
+   sprintf(reg_dumps + strlen(reg_dumps), "%08x: %08x, ", 
dump[i][0], dump[i][1]);
+
+   trace_gfx_v10_0_reset_reg_dumps(reg_dumps);
+   kfree(dump);
+   kfree(reg_dumps);
+
+   return 0;
+}
+
 static int gfx_v10_0_suspend(void *handle)
 {
return gfx_v10_0_hw_fini(handle);
@@ -9359,6 +9407,7 @@ static const struct amd_ip_funcs gfx_v10_0_ip_funcs = {
.set_clockgating_state = gfx_v10_0_set_clockgating_state,
.set_powergating_state = gfx_v10_0_set_powergating_state,
.get_clockgating_state = gfx_v10_0_get_clockgating_state,
+   .reset_reg_dumps = gfx_v10_0_reset_reg_dumps,
 };
 
 static const struct amdgpu_ring_funcs gfx_v10_0_ring_funcs_gfx = {
diff --git a/drivers/gpu/drm/amd/include/amd_shared.h 
b/drivers/gpu/drm/amd/include/amd_shared.h
index 4b9e68a79f06..a165680d73a2 100644
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@ -298,6 +298,7 @@ struct amd_ip_funcs {
 enum amd_powergating_state state);
void (*get_clockgating_state)(void *handle, u32 *flags);
int (*enable_umd_pstate)(void *handle, enum amd_dpm_forced_level 
*level);
+   int (*reset_reg_dumps)(void *handle);
 };
 
 
-- 
2.25.1



[PATCH] drm/amdgpu: drop WARN_ON in amdgpu_gart_bind/unbind

2022-01-21 Thread Guchun Chen
NULL pointer check has guarded it already.

calltrace:
amdgpu_ttm_gart_bind+0x49/0xa0 [amdgpu]
amdgpu_ttm_alloc_gart+0x13f/0x180 [amdgpu]
amdgpu_bo_create_reserved+0x139/0x2c0 [amdgpu]
? amdgpu_ttm_debugfs_init+0x120/0x120 [amdgpu]
amdgpu_bo_create_kernel+0x17/0x80 [amdgpu]
amdgpu_ttm_init+0x542/0x5e0 [amdgpu]

Fixes: f0239505d6c4 ("drm/amdgpu: remove gart.ready flag")
Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 53cc844346f0..91d8207336c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -161,7 +161,7 @@ void amdgpu_gart_unbind(struct amdgpu_device *adev, 
uint64_t offset,
uint64_t flags = 0;
int idx;
 
-   if (WARN_ON(!adev->gart.ptr))
+   if (!adev->gart.ptr)
return;
 
	if (!drm_dev_enter(adev_to_drm(adev), &idx))
@@ -241,7 +241,7 @@ void amdgpu_gart_bind(struct amdgpu_device *adev, uint64_t 
offset,
 int pages, dma_addr_t *dma_addr,
 uint64_t flags)
 {
-   if (WARN_ON(!adev->gart.ptr))
+   if (!adev->gart.ptr)
return;
 
amdgpu_gart_map(adev, offset, pages, dma_addr, flags, adev->gart.ptr);
-- 
2.17.1



Re: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM

2022-01-21 Thread Thorsten Leemhuis
Hi, this is your Linux kernel regression tracker speaking.

On 21.01.22 03:13, James Turner wrote:
> 
> I finished the bisection (log below). The issue was introduced in
> f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)").

FWIW, that was:

> drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)
> They are global ACPI methods, so maybe the structures
> global in the driver. This simplified a number of things
> in the handling of these methods.
> 
> v2: reset the handle if verify interface fails (Lijo)
> v3: fix compilation when ACPI is not defined.
> 
> Reviewed-by: Lijo Lazar 
> Signed-off-by: Alex Deucher 

In that case we need to get those two and the maintainers for the driver
involved by addressing them with this mail. And to make it easy for them
here is a link and a quote from the original report:

https://lore.kernel.org/all/87ee57c8fu@turner.link/

```
> Hi,
> 
> With newer kernels, starting with the v5.14 series, when using a MS
> Windows 10 guest VM with PCI passthrough of an AMD Radeon Pro WX 3200
> discrete GPU, the passed-through GPU will not run above 501 MHz, even
> when it is under 100% load and well below the temperature limit. As a
> result, GPU-intensive software (such as video games) runs unusably
> slowly in the VM.
> 
> In contrast, with older kernels, the passed-through GPU runs at up to
> 1295 MHz (the correct hardware limit), so GPU-intensive software runs at
> a reasonable speed in the VM.
> 
> I've confirmed that the issue exists with the following kernel versions:
> 
> - v5.16
> - v5.14
> - v5.14-rc1
> 
> The issue does not exist with the following kernels:
> 
> - v5.13
> - various packaged (non-vanilla) 5.10.* Arch Linux `linux-lts` kernels
> 
> So, the issue was introduced between v5.13 and v5.14-rc1. I'm willing to
> bisect the commit history to narrow it down further, if that would be
> helpful.
> 
> The configuration details and test results are provided below. In
> summary, for the kernels with this issue, the GPU core stays at a
> constant 0.8 V, the GPU core clock ranges from 214 MHz to 501 MHz, and
> the GPU memory stays at a constant 625 MHz, in the VM. For the correctly
> working kernels, the GPU core ranges from 0.85 V to 1.0 V, the GPU core
> clock ranges from 214 MHz to 1295 MHz, and the GPU memory stays at 1500
> MHz, in the VM.
> 
> Please let me know if additional information would be helpful.
> 
> Regards,
> James Turner
> 
> # Configuration Details
> 
> Hardware:
> 
> - Dell Precision 7540 laptop
> - CPU: Intel Core i7-9750H (x86-64)
> - Discrete GPU: AMD Radeon Pro WX 3200
> - The internal display is connected to the integrated GPU, and external
>   displays are connected to the discrete GPU.
> 
> Software:
> 
> - KVM host: Arch Linux
>   - self-built vanilla kernel (built using Arch Linux `PKGBUILD`
> modified to use vanilla kernel sources from git.kernel.org)
>   - libvirt 1:7.10.0-2
>   - qemu 6.2.0-2
> 
> - KVM guest: Windows 10
>   - GPU driver: Radeon Pro Software Version 21.Q3 (Note that I also
> experienced this issue with the 20.Q4 driver, using packaged
> (non-vanilla) Arch Linux kernels on the host, before updating to the
> 21.Q3 driver.)
> 
> Kernel config:
> 
> - For v5.13, v5.14-rc1, and v5.14, I used
>   
> https://github.com/archlinux/svntogit-packages/blob/89c24952adbfa645d9e1a6f12c572929f7e4e3c7/trunk/config
>   (The build script ran `make olddefconfig` on that config file.)
> 
> - For v5.16, I used
>   
> https://github.com/archlinux/svntogit-packages/blob/94f84e1ad8a530e54aa34cadbaa76e8dcc439d10/trunk/config
>   (The build script ran `make olddefconfig` on that config file.)
> 
> I set up the VM with PCI passthrough according to the instructions at
> https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF
> 
> I'm passing through the following PCI devices to the VM, as listed by
> `lspci -D -nn`:
> 
>   :01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. 
> [AMD/ATI] Lexa XT [Radeon PRO WX 3200] [1002:6981]
>   :01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] 
> Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] [1002:aae0]
> 
> The host kernel command line includes the following relevant options:
> 
>   intel_iommu=on vfio-pci.ids=1002:6981,1002:aae0
> 
> to enable IOMMU and bind the `vfio-pci` driver to the PCI devices.
> 
> My `/etc/mkinitcpio.conf` includes the following line:
> 
>   MODULES=(vfio_pci vfio vfio_iommu_type1 vfio_virqfd i915 amdgpu)
> 
> to load `vfio-pci` before the graphics drivers. (Note that removing
> `i915 amdgpu` has no effect on this issue.)
> 
> I'm using libvirt to manage the VM. The relevant portions of the XML
> file are:
> 
>   (two libvirt hostdev entries passing through the GPU's VGA and audio
>   functions; the XML tags were stripped by the mail archive)
> 
> # Test Results
> 
> For testing, I used the following procedure:
> 
> 1. Boot the host machine and log in.
> 
> 2. Run the following commands to gather information. For 

Re: [PATCH 1/3] lib/string_helpers: Consolidate yesno() implementation

2022-01-21 Thread Joe Perches
On Wed, 2022-01-19 at 16:00 -0500, Steven Rostedt wrote:
> On Wed, 19 Jan 2022 21:25:08 +0200
> Andy Shevchenko  wrote:
> 
> > > I say keep it one line!
> > > 
> > > Reviewed-by: Steven Rostedt (Google)   
> > 
> > I believe Sakari strongly follows the 80 rule, which means...
> 
> Checkpatch says "100" I think we need to simply update the docs (the
> documentation always lags the code ;-)

checkpatch doesn't say anything normally, it's a stupid script.
It just mindlessly bleats a message when a line exceeds 100 chars...

Just fyi: I think it's nicer on a single line too.