Re: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler
On 1/22/2022 2:04 AM, Sharma, Shashank wrote:

From 899ec6060eb7d8a3d4d56ab439e4e6cdd74190a4 Mon Sep 17 00:00:00 2001
From: Somalapuram Amaranath
Date: Fri, 21 Jan 2022 14:19:42 +0530
Subject: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler

This patch adds a GPU reset handler for the Navi ASIC family, which
typically dumps some of the registers and sends a trace event.

V2: Accommodated call to work function to send uevent

Signed-off-by: Somalapuram Amaranath
Signed-off-by: Shashank Sharma
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 01efda4398e5..ada35d4c5245 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -528,10 +528,38 @@ nv_asic_reset_method(struct amdgpu_device *adev)
 	}
 }
 
+static void amdgpu_reset_dumps(struct amdgpu_device *adev)
+{
+	int r = 0, i;
+
+	/* original raven doesn't have full asic reset */
+	if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
+	    !(adev->apu_flags & AMD_APU_IS_RAVEN2))
+		return;
+	for (i = 0; i < adev->num_ip_blocks; i++) {
+		if (!adev->ip_blocks[i].status.valid)
+			continue;
+		if (!adev->ip_blocks[i].version->funcs->reset_reg_dumps)
+			continue;
+		r = adev->ip_blocks[i].version->funcs->reset_reg_dumps(adev);
+
+		if (r)
+			DRM_ERROR("reset_reg_dumps of IP block <%s> failed %d\n",
+				  adev->ip_blocks[i].version->funcs->name, r);
+	}
+
+	/* Schedule work to send uevent */
+	if (!queue_work(system_unbound_wq, &adev->gpu_reset_work))
+		DRM_ERROR("failed to add GPU reset work\n");
+
+	dump_stack();
+}
+
 static int nv_asic_reset(struct amdgpu_device *adev)
 {
 	int ret = 0;
 
+	amdgpu_reset_dumps(adev);

Had a comment on this before. Now there are different reasons (or even
no reason, like a precautionary reset) to perform a reset. A user would
be interested in a trace only if the reason is valid.
To clarify why a work item shouldn't be scheduled on every reset, check here -
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c#L2188

Thanks,
Lijo

 	switch (nv_asic_reset_method(adev)) {
 	case AMD_RESET_METHOD_PCI:
 		dev_info(adev->dev, "PCI reset\n");
RE: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM
[AMD Official Use Only]

Hi James,

Could you provide the pp_dpm_* values in sysfs with and without the patch?
Also, could you try forcing PCIE to gen3 (through pp_dpm_pcie) if it's not in
gen3 when the issue happens?

For details on pp_dpm_*, please check
https://dri.freedesktop.org/docs/drm/gpu/amdgpu.html

Thanks,
Lijo

-Original Message-
From: James Turner
Sent: Saturday, January 22, 2022 6:21 AM
To: Alex Deucher
Cc: Thorsten Leemhuis ; Deucher, Alexander ; Lazar, Lijo ;
regressi...@lists.linux.dev; k...@vger.kernel.org; Greg KH ; Pan, Xinhui ;
LKML ; amd-gfx@lists.freedesktop.org; Alex Williamson ; Koenig, Christian
Subject: Re: [REGRESSION] Too-low frequency limit for AMD GPU
PCI-passed-through to Windows VM

> Are you ever loading the amdgpu driver in your tests?

Yes, although I'm binding the `vfio-pci` driver to the AMD GPU's PCI devices
via the kernel command line. (See my initial email.) My understanding is that
`vfio-pci` is supposed to keep other drivers, such as `amdgpu`, from
interacting with the GPU, although that's clearly not what's happening.

I've been testing with `amdgpu` included in the `MODULES` list in
`/etc/mkinitcpio.conf` (which Arch Linux uses to generate the initramfs).
However, I ran some more tests today (results below), this time without
`i915` or `amdgpu` in the `MODULES` list. The `amdgpu` kernel module still
gets loaded. (I think udev loads it automatically?)

Your comment gave me the idea to blacklist the `amdgpu` kernel module. That
does serve as a workaround on my machine – it fixes the behavior for
f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)") and
for the current Arch Linux prebuilt kernel (5.16.2-arch1-1). That's an
acceptable workaround for my machine only because the separate GPU used by
the host is an Intel integrated GPU. That workaround wouldn't work well for
someone with two AMD GPUs.
# New test results

The following tests are set up the same way as in my initial email, with the
following exceptions:

- I've updated libvirt to 1:8.0.0-1.
- I've removed `i915` and `amdgpu` from the `MODULES` list in
  `/etc/mkinitcpio.conf`.

For all three of these tests, `lspci` said the following:

% lspci -nnk -d 1002:6981
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Lexa XT [Radeon PRO WX 3200] [1002:6981]
	Subsystem: Dell Device [1028:0926]
	Kernel driver in use: vfio-pci
	Kernel modules: amdgpu

% lspci -nnk -d 1002:aae0
01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] [1002:aae0]
	Subsystem: Dell Device [1028:0926]
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel

## Version f1688bd69ec4 ("drm/amd/amdgpu:save psp ring wptr to avoid attack")

This is the commit immediately preceding the one which introduced the issue.

% sudo dmesg | grep -i amdgpu
[   15.840160] [drm] amdgpu kernel modesetting enabled.
[   15.840884] amdgpu: CRAT table not found
[   15.840885] amdgpu: Virtual CRAT table created for CPU
[   15.840893] amdgpu: Topology: Add CPU node

% lsmod | grep amdgpu
amdgpu               7450624  0
gpu_sched              49152  1 amdgpu
drm_ttm_helper         16384  1 amdgpu
ttm                    77824  2 amdgpu,drm_ttm_helper
i2c_algo_bit           16384  2 amdgpu,i915
drm_kms_helper        303104  2 amdgpu,i915
drm                   581632  11 gpu_sched,drm_kms_helper,amdgpu,drm_ttm_helper,i915,ttm

The passed-through GPU worked properly in the VM.

## Version f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)")

This is the commit which introduced the issue.

% sudo dmesg | grep -i amdgpu
[   15.319023] [drm] amdgpu kernel modesetting enabled.
[   15.329468] amdgpu: CRAT table not found
[   15.329470] amdgpu: Virtual CRAT table created for CPU
[   15.329482] amdgpu: Topology: Add CPU node

% lsmod | grep amdgpu
amdgpu               7450624  0
gpu_sched              49152  1 amdgpu
drm_ttm_helper         16384  1 amdgpu
ttm                    77824  2 amdgpu,drm_ttm_helper
i2c_algo_bit           16384  2 amdgpu,i915
drm_kms_helper        303104  2 amdgpu,i915
drm                   581632  11 gpu_sched,drm_kms_helper,amdgpu,drm_ttm_helper,i915,ttm

The passed-through GPU did not run above 501 MHz in the VM.

## Blacklisted `amdgpu`, version f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)")

For this test, I added `module_blacklist=amdgpu` to the kernel command line to
blacklist the `amdgpu` module.

% sudo dmesg | grep -i amdgpu
[   14.591576] Module amdgpu is blacklisted

% lsmod | grep amdgpu

The passed-through GPU worked properly in the VM.

James
[PATCH v2 0/3] Implement parallel sysfs_emit solution for navi10
== Description ==
Use of scnprintf within the kernel is not recommended, but a simple
sysfs_emit replacement has not been successful due to the page alignment
requirement of that function. This patch set implements a new API,
"emit_clock_levels", to facilitate passing both the buffer base and the
write offset down to the device, rather than just the write pointer.

This patch set is implemented only for navi10, and only for some clocks, to
seek comment on the implementation before extending it to other clock
values and devices. It still needs to be verified on a platform that
supports overclocking.

(v2)
  Update to apply on commit 801771de03
  adjust printing of empty carriage return only if size == 0
  change return from amdgpu_dpm_emit_clock_levels if emit_clock_levels not found

== Patch Summary ==
linux: (g...@gitlab.freedesktop.org:agd5f) origin/amd-staging-drm-next @ 28907fd9e048
 + f83a3144ede4 amdgpu/pm: Implement new API function "emit" that accepts buffer base and write offset
 + 82de36426a1f amdgpu/pm: insert attempt to call emit_clock_levels into amdgpu_get_pp_od_clk_voltage
 + 32f0fcf45dd8 amdgpu/pm: Add Error Handling to emit_clocks stack

== System Summary ==
 * DESKTOP(AMD FX-8350 + NAVI10(731F/ca), BIOS: F2)
 + ISO(Ubuntu 20.04.3 LTS)
 + Kernel(5.13.0-fdoagd5f-20220112-g28907fd9e0)

== Test ==
LOGFILE=pp_clk.test.log
AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | awk '{print $9}'`
HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}

lspci -nn | grep "VGA\|Display" > $LOGFILE
FILES="pp_od_clk_voltage pp_dpm_sclk pp_dpm_mclk pp_dpm_pcie pp_dpm_socclk pp_dpm_fclk pp_dpm_dcefclk pp_dpm_vclk pp_dpm_dclk "

for f in $FILES
do
  echo === $f === >> $LOGFILE
  cat $HWMON_DIR/device/$f >> $LOGFILE
done
cat $LOGFILE

Darren Powell (3):
  amdgpu/pm: Implement new API function "emit" that accepts buffer base
    and write offset
  amdgpu/pm: insert attempt to call emit_clock_levels into
    amdgpu_get_pp_od_clk_voltage
  amdgpu/pm: Add Error Handling to emit_clocks stack

 .../gpu/drm/amd/include/kgd_pp_interface.h    |   1 +
 drivers/gpu/drm/amd/pm/amdgpu_dpm.c           |  21 ++
 drivers/gpu/drm/amd/pm/amdgpu_pm.c            |  49 +++--
 drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h       |   4 +
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c     |  44 +++-
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  14 ++
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 188 ++
 7 files changed, 302 insertions(+), 19 deletions(-)

base-commit: 28907fd9e04859fab86a143271d29ff0ff043154
-- 
2.34.1
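The core idea of the series — hand every printer the buffer *base* plus an in/out write offset instead of a moving `buf + size` pointer, so the page-alignment check that `sysfs_emit_at()` performs can succeed — can be illustrated with a small userspace sketch. This is illustrative only: `emit_at` and `BUF_SIZE` are stand-ins, not the kernel API.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 4096 /* stands in for the PAGE_SIZE sysfs buffer */

/*
 * Sketch of the "emit" pattern: each call receives the buffer base and
 * an in/out offset. Because the base is always passed, the callee can
 * validate it (as sysfs_emit_at() validates page alignment) before
 * appending, which is impossible when it only sees buf + size.
 */
static int emit_at(char *base, int *offset, const char *fmt, ...)
{
	va_list args;
	int len;

	if (*offset < 0 || *offset >= BUF_SIZE)
		return -1; /* would be -EINVAL in the kernel */

	va_start(args, fmt);
	len = vsnprintf(base + *offset, BUF_SIZE - *offset, fmt, args);
	va_end(args);

	*offset += len; /* advance the shared write offset */
	return 0;
}
```

Each clock printer then appends through the same `(base, &offset)` pair, and the caller reads the final offset as the number of bytes emitted.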
[PATCH 2/3] amdgpu/pm: insert attempt to call emit_clock_levels into amdgpu_get_pp_od_clk_voltage
(v1) Use new API function emit_clock_levels to display the overclocking
     values, with a fallback to print_clock_levels if the first call fails.
(v2) Update to apply on commit 801771de03
     adjust printing of empty carriage return only if size == 0

== Test ==
LOGFILE=pp_clk.test.log
AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | awk '{print $9}'`
HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}
lspci -nn | grep "VGA\|Display" > $LOGFILE
FILES="pp_od_clk_voltage"

for f in $FILES
do
  echo === $f === >> $LOGFILE
  cat $HWMON_DIR/device/$f >> $LOGFILE
done
cat $LOGFILE

Signed-off-by: Darren Powell
---
 drivers/gpu/drm/amd/pm/amdgpu_pm.c | 39 ++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index ec2729b3930e..97a8dcbe6eaf 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -832,8 +832,17 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
 {
 	struct drm_device *ddev = dev_get_drvdata(dev);
 	struct amdgpu_device *adev = drm_to_adev(ddev);
-	ssize_t size;
+	int size = 0;
 	int ret;
+	enum pp_clock_type od_clocks[6] = {
+		OD_SCLK,
+		OD_MCLK,
+		OD_VDDC_CURVE,
+		OD_RANGE,
+		OD_VDDGFX_OFFSET,
+		OD_CCLK,
+	};
+	uint clk_index;
 
 	if (amdgpu_in_reset(adev))
 		return -EPERM;
@@ -846,16 +855,26 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
 		return ret;
 	}
 
-	size = amdgpu_dpm_print_clock_levels(adev, OD_SCLK, buf);
-	if (size > 0) {
-		size += amdgpu_dpm_print_clock_levels(adev, OD_MCLK, buf+size);
-		size += amdgpu_dpm_print_clock_levels(adev, OD_VDDC_CURVE, buf+size);
-		size += amdgpu_dpm_print_clock_levels(adev, OD_VDDGFX_OFFSET, buf+size);
-		size += amdgpu_dpm_print_clock_levels(adev, OD_RANGE, buf+size);
-		size += amdgpu_dpm_print_clock_levels(adev, OD_CCLK, buf+size);
-	} else {
-		size = sysfs_emit(buf, "\n");
+	ret = amdgpu_dpm_emit_clock_levels(adev, od_clocks[0], buf, &size);
+	if (ret >= 0) {
+		/* Proceed with emit for other od clocks if the first call succeeds */
+		for (clk_index = 1; clk_index < 6; clk_index++)
+			amdgpu_dpm_emit_clock_levels(adev, od_clocks[clk_index], buf, &size);
+	} else {
+		size = amdgpu_dpm_print_clock_levels(adev, OD_SCLK, buf);
+		if (size > 0) {
+			size += amdgpu_dpm_print_clock_levels(adev, OD_MCLK, buf+size);
+			size += amdgpu_dpm_print_clock_levels(adev, OD_VDDC_CURVE, buf+size);
+			size += amdgpu_dpm_print_clock_levels(adev, OD_VDDGFX_OFFSET, buf+size);
+			size += amdgpu_dpm_print_clock_levels(adev, OD_RANGE, buf+size);
+			size += amdgpu_dpm_print_clock_levels(adev, OD_CCLK, buf+size);
+		}
 	}
+
+	if (size == 0)
+		size = sysfs_emit(buf, "\n");
+
 	pm_runtime_mark_last_busy(ddev->dev);
 	pm_runtime_put_autosuspend(ddev->dev);
-- 
2.34.1
[PATCH 1/3] amdgpu/pm: Implement new API function "emit" that accepts buffer base and write offset
(v1)
 - new power management function emit_clk_levels, implemented by
   navi10_emit_clk_levels()
   This function currently duplicates the functionality of
   navi10_print_clk_levels, where snprintf is used to write to the sysfs
   buffer. The first implementation to use sysfs_emit was unsuccessful due
   to the requirement that buf be aligned to a specific memory boundary.
   This solution passes the buffer base and write offset down the stack.
 - new power management function emit_clock_levels, implemented by
   smu_emit_ppclk_levels()
   This function combines the implementations of smu_print_ppclk_levels
   and smu_print_smuclk_levels, and calls emit_clk_levels
 - new helper function smu_convert_to_smuclk, called by
   smu_print_ppclk_levels and smu_emit_ppclk_levels
 - emit_clock_levels currently returns the length of the string written to
   the buffer, maintaining the behaviour of print_clock_levels, but will
   be upgraded to return error values in a subsequent patch
 - failure of emit_clock_levels at the top level (amdgpu_get_pp_dpm_clock)
   triggers a fallback to print_clock_levels, which can be removed after
   emit_clock_levels is implemented across all platforms.
(v2) Update to apply on commit 801771de03
     adjust printing of empty carriage return only if size == 0

== Test ==
LOGFILE=pp_clk.test.log
AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | awk '{print $9}'`
HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}
lspci -nn | grep "VGA\|Display" > $LOGFILE
FILES="pp_od_clk_voltage pp_dpm_sclk pp_dpm_mclk pp_dpm_pcie pp_dpm_socclk pp_dpm_fclk pp_dpm_dcefclk pp_dpm_vclk pp_dpm_dclk "

for f in $FILES
do
  echo === $f === >> $LOGFILE
  cat $HWMON_DIR/device/$f >> $LOGFILE
done
cat $LOGFILE

Signed-off-by: Darren Powell
---
 .../gpu/drm/amd/include/kgd_pp_interface.h    |   1 +
 drivers/gpu/drm/amd/pm/amdgpu_dpm.c           |  21 ++
 drivers/gpu/drm/amd/pm/amdgpu_pm.c            |  11 +-
 drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h       |   4 +
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c     |  46 -
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  10 +
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 189 ++
 7 files changed, 273 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
index a8eec91c0995..8a8eb9411cdf 100644
--- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
@@ -321,6 +321,7 @@ struct amd_pm_funcs {
 	int (*get_fan_speed_pwm)(void *handle, u32 *speed);
 	int (*force_clock_level)(void *handle, enum pp_clock_type type, uint32_t mask);
 	int (*print_clock_levels)(void *handle, enum pp_clock_type type, char *buf);
+	int (*emit_clock_levels)(void *handle, enum pp_clock_type type, char *buf, int *offset);
 	int (*force_performance_level)(void *handle, enum amd_dpm_forced_level level);
 	int (*get_sclk_od)(void *handle);
 	int (*set_sclk_od)(void *handle, uint32_t value);
diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
index 728b6e10f302..e45578124928 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
@@ -906,6 +906,27 @@ int amdgpu_dpm_print_clock_levels(struct amdgpu_device *adev,
 	return ret;
 }
 
+int amdgpu_dpm_emit_clock_levels(struct amdgpu_device *adev,
+				 enum pp_clock_type type,
+				 char *buf,
+				 int *offset)
+{
+	const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
+	int ret = 0;
+
+	if (!pp_funcs->emit_clock_levels)
+		return 0;
+
+	mutex_lock(&adev->pm.mutex);
+	ret = pp_funcs->emit_clock_levels(adev->powerplay.pp_handle,
+					  type,
+					  buf,
+					  offset);
+	mutex_unlock(&adev->pm.mutex);
+
+	return ret;
+}
+
 int amdgpu_dpm_set_ppfeature_status(struct amdgpu_device *adev,
 				    uint64_t ppfeature_masks)
 {
diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index d2823aaeca09..ec2729b3930e 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -980,8 +980,8 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device *dev,
 {
 	struct drm_device *ddev = dev_get_drvdata(dev);
 	struct amdgpu_device *adev = drm_to_adev(ddev);
-	ssize_t size;
-	int ret;
+	int size = 0;
+	int ret = 0;
 
 	if (amdgpu_in_reset(adev))
 		return -EPERM;
@@ -994,8 +994,11 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device *dev,
[PATCH 3/3] amdgpu/pm: Add Error Handling to emit_clocks stack
Previous implementation of print_clocks required the return of bytes
written to the calling function via the return value. Passing this value
in by reference allows us now to pass error codes back up the calling
stack.

(v1)
 - Errors from get_current_clk_freq, get_dpm_level_count & get_dpm_freq
   are now passed back up the stack
 - Added -EOPNOTSUPP for OD clocks when encountering !od_enabled, a
   missing od_table or missing od_settings
 - OD_RANGE continues to print any subset of the ODCAP settings it can
   find, without reporting an error for any missing
 - smu_emit_ppclk returns -ENOENT if emit_clk_levels is not supported by
   the device
 - modified calling logic so the fallback to print_clock_levels is only
   attempted if emit_clk_levels is not (yet) supported by the device
(v2)
 - change return from amdgpu_dpm_emit_clock_levels if emit_clock_levels
   not found
 - updated documentation of pptable_func members print_clk_levels,
   emit_clk_levels

== Test ==
LOGFILE=pp_clk.test.log
AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | awk '{print $9}'`
HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}
lspci -nn | grep "VGA\|Display" > $LOGFILE
FILES="pp_od_clk_voltage pp_dpm_sclk pp_dpm_mclk pp_dpm_pcie pp_dpm_socclk pp_dpm_fclk pp_dpm_dcefclk pp_dpm_vclk pp_dpm_dclk "

for f in $FILES
do
  echo === $f === >> $LOGFILE
  cat $HWMON_DIR/device/$f >> $LOGFILE
done
cat $LOGFILE

Signed-off-by: Darren Powell
---
 drivers/gpu/drm/amd/pm/amdgpu_dpm.c           |  2 +-
 drivers/gpu/drm/amd/pm/amdgpu_pm.c            | 13 ++--
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c     |  8 +++
 drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  6 +-
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 21 +--
 5 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
index e45578124928..03a690a3abb7 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
@@ -915,7 +915,7 @@ int amdgpu_dpm_emit_clock_levels(struct amdgpu_device *adev,
 	int ret = 0;
 
 	if (!pp_funcs->emit_clock_levels)
-		return 0;
+		return -ENOENT;
 
 	mutex_lock(&adev->pm.mutex);
 	ret = pp_funcs->emit_clock_levels(adev->powerplay.pp_handle,
diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index 97a8dcbe6eaf..a11def0ee761 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -855,13 +855,12 @@ static ssize_t amdgpu_get_pp_od_clk_voltage(struct device *dev,
 		return ret;
 	}
 
-	ret = amdgpu_dpm_emit_clock_levels(adev, od_clocks[0], buf, &size);
-	if (ret >= 0) {
-		/* Proceed with emit for other od clocks if the first call succeeds */
-		for (clk_index = 1; clk_index < 6; clk_index++)
-			amdgpu_dpm_emit_clock_levels(adev, od_clocks[clk_index], buf, &size);
+	for (clk_index = 0; clk_index < 6; clk_index++) {
+		ret = amdgpu_dpm_emit_clock_levels(adev, od_clocks[clk_index], buf, &size);
+		if (ret)
+			break;
 	}
-	else {
+	if (ret == -ENOENT) {
 		size = amdgpu_dpm_print_clock_levels(adev, OD_SCLK, buf);
 		if (size > 0) {
 			size += amdgpu_dpm_print_clock_levels(adev, OD_MCLK, buf+size);
@@ -1014,7 +1013,7 @@ static ssize_t amdgpu_get_pp_dpm_clock(struct device *dev,
 	}
 
 	ret = amdgpu_dpm_emit_clock_levels(adev, type, buf, &size);
-	if (ret < 0)
+	if (ret == -ENOENT)
 		size = amdgpu_dpm_print_clock_levels(adev, type, buf);
 
 	if (size == 0)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
index d82ea7ee223f..29c101a93dc4 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
@@ -2454,12 +2454,10 @@ static int smu_emit_ppclk_levels(void *handle, enum pp_clock_type type, char *bu
 	if (!smu->pm_enabled || !smu->adev->pm.dpm_enabled)
 		return -EOPNOTSUPP;
 
-	if (smu->ppt_funcs->emit_clk_levels) {
+	if (smu->ppt_funcs->emit_clk_levels)
 		ret = smu->ppt_funcs->emit_clk_levels(smu, clk_type, buf, offset);
-	}
-	else {
-		ret = -EOPNOTSUPP;
-	}
+	else
+		ret = -ENOENT;
 
 	return ret;
diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
index 224b869eda70..1429581dca9c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
+++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
@@ -612,6 +612,7 @@ struct pptable_funcs {
 	 *to buffer. Star current level.
 	 *
 	 * Used for sysfs interfaces.
+
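The error-handling convention this patch settles on can be sketched in isolation: `-ENOENT` is reserved for "no emit callback exists here", which is the only case where the caller may legitimately fall back to the legacy print path; any other negative value is a real error and must not be masked by the fallback. The functions below are illustrative userspace stand-ins, not the kernel symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the emit path: -ENOENT only when no callback is wired up. */
static int emit_clock_levels(int have_emit_cb, char *buf, int *offset)
{
	if (!have_emit_cb)
		return -ENOENT; /* not implemented: caller may fall back */
	*offset += sprintf(buf + *offset, "0: 300Mhz *\n");
	return 0;
}

/* Stand-in for the legacy path: returns bytes written, like print_clock_levels. */
static int legacy_print_clock_levels(char *buf)
{
	return sprintf(buf, "0: 300Mhz *\n");
}

/* Caller logic mirroring amdgpu_get_pp_dpm_clock after this patch. */
static int read_clocks(int have_emit_cb, char *buf)
{
	int size = 0;
	int ret = emit_clock_levels(have_emit_cb, buf, &size);

	if (ret == -ENOENT)          /* only this value triggers the fallback */
		size = legacy_print_clock_levels(buf);
	else if (ret < 0)
		return ret;          /* genuine failure: no fallback */
	return size;
}
```

The design point is that the earlier `if (ret < 0)` fallback would have silently retried even on genuine device errors; distinguishing "unsupported" from "failed" keeps real errors visible.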
Re: [PATCH] drm/amd/display: Fix memory leak
[Public]

Applied. Thanks!

From: Wentland, Harry
Sent: Friday, January 21, 2022 2:03 PM
To: Yongzhi Liu ; Li, Sun peng (Leo) ; Siqueira, Rodrigo ; Deucher, Alexander ;
Koenig, Christian ; Pan, Xinhui ; airl...@linux.ie ; dan...@ffwll.ch ;
Lipski, Mikita ; Lin, Wayne ; Kazlauskas, Nicholas ; Zuo, Jerry ;
anson.ja...@amd.com ; eryk.b...@amd.com ; Pillai, Aurabindo ; Das, Nirmoy
Cc: amd-gfx@lists.freedesktop.org ; dri-de...@lists.freedesktop.org ;
linux-ker...@vger.kernel.org
Subject: Re: [PATCH] drm/amd/display: Fix memory leak

On 2022-01-21 06:26, Yongzhi Liu wrote:
> [why]
> Resource release is needed on the error handling path
> to prevent memory leak.
>
> [how]
> Fix this by adding kfree on the error handling path.
>
> Signed-off-by: Yongzhi Liu

Reviewed-by: Harry Wentland

Harry

> ---
>  .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 80 --
>  1 file changed, 60 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> index ded64d0..e463d46 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> @@ -227,8 +227,10 @@ static ssize_t dp_link_settings_read(struct file *f, char __user *buf,
> 		break;
>
> 		r = put_user(*(rd_buf + result), buf);
> -	if (r)
> +	if (r) {
> +		kfree(rd_buf);
> 		return r; /* r = -EFAULT */
> +	}
>
> 		buf += 1;
> 		size -= 1;
> @@ -389,8 +391,10 @@ static ssize_t dp_phy_settings_read(struct file *f, char __user *buf,
> 		break;
>
> 		r = put_user((*(rd_buf + result)), buf);
> -	if (r)
> +	if (r) {
> +		kfree(rd_buf);
> 		return r; /* r = -EFAULT */
> +	}
>
> 		buf += 1;
> 		size -= 1;
> @@ -1359,8 +1363,10 @@ static ssize_t dp_dsc_clock_en_read(struct file *f, char __user *buf,
> 		break;
> 	}
>
> -	if (!pipe_ctx)
> +	if (!pipe_ctx) {
> +		kfree(rd_buf);
> 		return -ENXIO;
> +	}
>
> 	dsc = pipe_ctx->stream_res.dsc;
> 	if (dsc)
> @@ -1376,8 +1382,10 @@ static ssize_t dp_dsc_clock_en_read(struct file *f, char __user *buf,
> 		break;
>
> 		r = put_user(*(rd_buf + result), buf);
> -	if (r)
> +	if (r) {
> +		kfree(rd_buf);
> 		return r; /* r = -EFAULT */
> +	}
>
> 		buf += 1;
> 		size -= 1;
> @@ -1546,8 +1554,10 @@ static ssize_t dp_dsc_slice_width_read(struct file *f, char __user *buf,
> 		break;
> 	}
>
> -	if (!pipe_ctx)
> +	if (!pipe_ctx) {
> +		kfree(rd_buf);
> 		return -ENXIO;
> +	}
>
> 	dsc = pipe_ctx->stream_res.dsc;
> 	if (dsc)
> @@ -1563,8 +1573,10 @@ static ssize_t dp_dsc_slice_width_read(struct file *f, char __user *buf,
> 		break;
>
> 		r = put_user(*(rd_buf + result), buf);
> -	if (r)
> +	if (r) {
> +		kfree(rd_buf);
> 		return r; /* r = -EFAULT */
> +	}
>
> 		buf += 1;
> 		size -= 1;
> @@ -1731,8 +1743,10 @@ static ssize_t dp_dsc_slice_height_read(struct file *f, char __user *buf,
> 		break;
> 	}
>
> -	if (!pipe_ctx)
> +	if (!pipe_ctx) {
> +		kfree(rd_buf);
> 		return -ENXIO;
> +	}
>
> 	dsc = pipe_ctx->stream_res.dsc;
> 	if (dsc)
> @@ -1748,8 +1762,10 @@ static ssize_t dp_dsc_slice_height_read(struct file *f, char __user *buf,
> 		break;
>
> 		r = put_user(*(rd_buf + result), buf);
> -	if (r)
> +	if (r) {
> +		kfree(rd_buf);
> 		return r; /* r = -EFAULT */
> +	}
>
> 		buf += 1;
> 		size -= 1;
> @@ -1912,8 +1928,10 @@ static ssize_t dp_dsc_bits_per_pixel_read(struct file *f, char __user *buf,
> 		break;
> 	}
>
> -	if (!pipe_ctx)
> +	if (!pipe_ctx) {
> +		kfree(rd_buf);
> 		return -ENXIO;
> +	}
>
> 	dsc = pipe_ctx->stream_res.dsc;
> 	if (dsc)
> @@ -1929,8 +1947,10 @@ static ssize_t dp_dsc_bits_per_pixel_read(struct file *f, char __user *buf,
> 		break;
>
> 		r = put_user(*(rd_buf + result), buf);
> -	if (r)
> +	if (r) {
> +		kfree(rd_buf);
> 		return r; /* r = -EFAULT */
> +
Re: [PATCH] drm/amd/pm: remove useless if
Applied. Thanks!

Alex

On Fri, Jan 21, 2022 at 6:48 AM Jiapeng Chong wrote:
>
> Clean the following coccicheck warning:
>
> ./drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c:7035:2-4: WARNING: possible
> condition with no effect (if == else).
>
> Reported-by: Abaci Robot
> Signed-off-by: Jiapeng Chong
> ---
>  drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> index 23ff0d812e4b..7427c50409d4 100644
> --- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> +++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
> @@ -7032,10 +7032,7 @@ static int si_power_control_set_level(struct amdgpu_device *adev)
> 	ret = si_resume_smc(adev);
> 	if (ret)
> 		return ret;
> -	ret = si_set_sw_state(adev);
> -	if (ret)
> -		return ret;
> -	return 0;
> +	return si_set_sw_state(adev);
> }
>
> static void si_set_vce_clock(struct amdgpu_device *adev,
> --
> 2.20.1.7.g153144c
>
Re: [PATCH] drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence obj
On Fri, Jan 21, 2022 at 2:45 AM Christian König wrote:
>
> Am 21.01.22 um 06:28 schrieb Xin Xiong:
> > This issue takes place in an error path in
> > amdgpu_cs_fence_to_handle_ioctl(). When `info->in.what` falls into
> > default case, the function simply returns -EINVAL, forgetting to
> > decrement the reference count of a dma_fence obj, which is bumped
> > earlier by amdgpu_cs_get_fence(). This may result in reference count
> > leaks.
> >
> > Fix it by decreasing the refcount of specific object before returning
> > the error code.
> >
> > Signed-off-by: Xin Xiong
> > Signed-off-by: Xin Tan
>
> Good catch. Reviewed-by: Christian König

Applied manually. Strangely I never got this on any of my emails, and I
don't see it in the archives.

Alex

> > ---
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 1 +
> >   1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index 0311d799a..894869789 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -1510,6 +1510,7 @@ int amdgpu_cs_fence_to_handle_ioctl(struct drm_device *dev, void *data,
> > 		return 0;
> >
> > 	default:
> > +		dma_fence_put(fence);
> > 		return -EINVAL;
> > 	}
> > }
>
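The bug class fixed here is general: a reference acquired at the top of a function must be released on every exit path, including the `default`/error case. A minimal userspace sketch (the `fence` struct and functions are illustrative stand-ins, not the kernel's `dma_fence` API):

```c
#include <assert.h>

/* Minimal stand-in for a refcounted object such as a dma_fence. */
struct fence {
	int refcount;
};

static void fence_get(struct fence *f) { f->refcount++; }
static void fence_put(struct fence *f) { f->refcount--; }

/*
 * Sketch of the ioctl's shape: a reference is taken up front (as
 * amdgpu_cs_get_fence() does), so *every* return path must drop it.
 * The default case below is the path that previously leaked.
 */
static int fence_to_handle(struct fence *f, int what)
{
	fence_get(f);                /* reference taken on entry */

	switch (what) {
	case 0:                      /* supported conversion */
		fence_put(f);
		return 0;
	default:                     /* previously returned without a put */
		fence_put(f);        /* the fix: balance the ref here too */
		return -1;           /* stands in for -EINVAL */
	}
}
```

With the put in place, the refcount is identical before and after the call regardless of which path is taken, which is exactly the invariant the patch restores.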
[PATCH 2/4] drm/amdgpu: add work function for GPU reset
From c598dd586dd15fc5ae0a883a2e6f4094ec024085 Mon Sep 17 00:00:00 2001
From: Shashank Sharma
Date: Fri, 21 Jan 2022 17:33:10 +0100
Subject: [PATCH 2/4] drm/amdgpu: add work function for GPU reset

This patch adds a new work function, which will get scheduled in the
event of a GPU reset, and will send a uevent to indicate the same. The
userspace can do some post-processing work, like collecting data from a
trace event.

Cc: Alexander Deucher
Cc: Christian Koenig
Signed-off-by: Shashank Sharma
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 14 ++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 269437b01328..79192f43bb71 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1057,6 +1057,8 @@ struct amdgpu_device {
 	struct work_struct		xgmi_reset_work;
 	struct list_head		reset_list;
 
+	struct work_struct		gpu_reset_work;
+
 	long				gfx_timeout;
 	long				sdma_timeout;
 	long				video_timeout;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index af9bdf16eefd..e29e58240869 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -72,6 +72,7 @@
 
 #include 
 #include 
+#include 
 
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/vega12_gpu_info.bin");
@@ -3274,6 +3275,18 @@ bool amdgpu_device_has_dc_support(struct amdgpu_device *adev)
 	return amdgpu_device_asic_has_dc_support(adev->asic_type);
 }
 
+static void amdgpu_device_gpu_reset_func(struct work_struct *__work)
+{
+	struct amdgpu_device *adev =
+		container_of(__work, struct amdgpu_device, gpu_reset_work);
+
+	/*
+	 * Inform userspace that a GPU reset happened, and it should collect
+	 * data from the trace event.
+	 */
+	drm_sysfs_gpu_reset_event(&adev->ddev);
+}
+
 static void amdgpu_device_xgmi_reset_func(struct work_struct *__work)
 {
 	struct amdgpu_device *adev =
@@ -3506,6 +3519,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 			  amdgpu_device_delay_enable_gfx_off);
 
 	INIT_WORK(&adev->xgmi_reset_work, amdgpu_device_xgmi_reset_func);
+	INIT_WORK(&adev->gpu_reset_work, amdgpu_device_gpu_reset_func);
 
 	adev->gfx.gfx_off_req_count = 1;
 	adev->pm.ac_power = power_supply_is_system_supplied() > 0;
-- 
2.32.0
[PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler
From 899ec6060eb7d8a3d4d56ab439e4e6cdd74190a4 Mon Sep 17 00:00:00 2001
From: Somalapuram Amaranath
Date: Fri, 21 Jan 2022 14:19:42 +0530
Subject: [PATCH 4/4] drm/amdgpu/nv: add navi GPU reset handler

This patch adds a GPU reset handler for the Navi ASIC family, which
typically dumps some of the registers and sends a trace event.

V2: Accommodated call to work function to send uevent

Signed-off-by: Somalapuram Amaranath
Signed-off-by: Shashank Sharma
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 01efda4398e5..ada35d4c5245 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -528,10 +528,38 @@ nv_asic_reset_method(struct amdgpu_device *adev)
 	}
 }
 
+static void amdgpu_reset_dumps(struct amdgpu_device *adev)
+{
+	int r = 0, i;
+
+	/* original raven doesn't have full asic reset */
+	if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
+	    !(adev->apu_flags & AMD_APU_IS_RAVEN2))
+		return;
+	for (i = 0; i < adev->num_ip_blocks; i++) {
+		if (!adev->ip_blocks[i].status.valid)
+			continue;
+		if (!adev->ip_blocks[i].version->funcs->reset_reg_dumps)
+			continue;
+		r = adev->ip_blocks[i].version->funcs->reset_reg_dumps(adev);
+
+		if (r)
+			DRM_ERROR("reset_reg_dumps of IP block <%s> failed %d\n",
+				  adev->ip_blocks[i].version->funcs->name, r);
+	}
+
+	/* Schedule work to send uevent */
+	if (!queue_work(system_unbound_wq, &adev->gpu_reset_work))
+		DRM_ERROR("failed to add GPU reset work\n");
+
+	dump_stack();
+}
+
 static int nv_asic_reset(struct amdgpu_device *adev)
 {
 	int ret = 0;
 
+	amdgpu_reset_dumps(adev);
 	switch (nv_asic_reset_method(adev)) {
 	case AMD_RESET_METHOD_PCI:
 		dev_info(adev->dev, "PCI reset\n");
-- 
2.32.0
[PATCH 3/4] drm/amdgpu: add reset register trace dump function
From 1c5c552eeddaffd9fb3e7d45ece1b2b28fccc575 Mon Sep 17 00:00:00 2001 From: Somalapuram Amaranath Date: Fri, 21 Jan 2022 14:19:10 +0530 Subject: [PATCH 3/4] drm/amdgpu: add reset register trace dump function for gfx_v10_0 Implementation of register trace dump function on the AMD GPU resets Signed-off-by: Somalapuram Amaranath --- drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 8 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c| 53 ++- drivers/gpu/drm/amd/include/amd_shared.h | 1 + 3 files changed, 60 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h index d855cb53c7e0..c97b53b54333 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h @@ -537,6 +537,14 @@ TRACE_EVENT(amdgpu_ib_pipe_sync, __entry->seqno) ); +TRACE_EVENT(gfx_v10_0_reset_reg_dumps, + TP_PROTO(char *reg_dumps), + TP_ARGS(reg_dumps), + TP_STRUCT__entry(__string(dumps, reg_dumps)), + TP_fast_assign(__assign_str(dumps, reg_dumps);), + TP_printk("amdgpu register dump {%s}", __get_str(dumps)) +); + #undef AMDGPU_JOB_GET_TIMELINE_NAME #endif diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index 16dbe593cba2..05974ed5416d 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -46,7 +46,7 @@ #include "v10_structs.h" #include "gfx_v10_0.h" #include "nbio_v2_3.h" - +#include "amdgpu_trace.h" /* * Navi10 has two graphic rings to share each graphic pipe. * 1. 
Primary ring @@ -188,6 +188,12 @@ #define RLCG_ERROR_REPORT_ENABLED(adev) \ (amdgpu_sriov_reg_indirect_mmhub(adev) || amdgpu_sriov_reg_indirect_gc(adev)) +#define N_REGS (17) +#define DUMP_REG(addr) do {\ + (dump)[i][0] = (addr); \ + (dump)[i++][1] = RREG32_SOC15_IP(GC, addr);\ + } while (0) + MODULE_FIRMWARE("amdgpu/navi10_ce.bin"); MODULE_FIRMWARE("amdgpu/navi10_pfp.bin"); MODULE_FIRMWARE("amdgpu/navi10_me.bin"); @@ -7488,7 +7494,6 @@ static int gfx_v10_0_hw_init(void *handle) { int r; struct amdgpu_device *adev = (struct amdgpu_device *)handle; - if (!amdgpu_emu_mode) gfx_v10_0_init_golden_registers(adev); @@ -7602,6 +7607,49 @@ static int gfx_v10_0_hw_fini(void *handle) return 0; } +static int gfx_v10_0_reset_reg_dumps(void *handle) +{ + struct amdgpu_device *adev = (struct amdgpu_device *)handle; + uint32_t i = 0; + uint32_t (*dump)[2], n_regs = 0; + char *reg_dumps; + + dump = kmalloc(N_REGS*2*sizeof(uint32_t), GFP_KERNEL); + reg_dumps = kzalloc(1024, GFP_KERNEL); + + if (dump == NULL || reg_dumps == NULL) + return -ENOMEM; + + DUMP_REG(mmGRBM_STATUS2); + DUMP_REG(mmGRBM_STATUS_SE0); + DUMP_REG(mmGRBM_STATUS_SE1); + DUMP_REG(mmGRBM_STATUS_SE2); + DUMP_REG(mmGRBM_STATUS_SE3); + DUMP_REG(mmSDMA0_STATUS_REG); + DUMP_REG(mmSDMA1_STATUS_REG); + DUMP_REG(mmCP_STAT); + DUMP_REG(mmCP_STALLED_STAT1); + DUMP_REG(mmCP_STALLED_STAT2); + DUMP_REG(mmCP_STALLED_STAT3); + DUMP_REG(mmCP_CPC_STATUS); + DUMP_REG(mmCP_CPC_BUSY_STAT); + DUMP_REG(mmCP_CPC_STALLED_STAT1); + DUMP_REG(mmCP_CPF_STATUS); + DUMP_REG(mmCP_CPF_BUSY_STAT); + DUMP_REG(mmCP_CPF_STALLED_STAT1); + + n_regs = i; + + for (i = 0; i < n_regs; i++) + sprintf(reg_dumps + strlen(reg_dumps), "%08x: %08x, ", dump[i][0], dump[i][1]); + + trace_gfx_v10_0_reset_reg_dumps(reg_dumps); + kfree(dump); + kfree(reg_dumps); + + return 0; +} + static int gfx_v10_0_suspend(void *handle) { return gfx_v10_0_hw_fini(handle); @@ -9367,6 +9415,7 @@ static const struct amd_ip_funcs gfx_v10_0_ip_funcs = { .set_clockgating_state =
gfx_v10_0_set_clockgating_state, .set_powergating_state = gfx_v10_0_set_powergating_state, .get_clockgating_state = gfx_v10_0_get_clockgating_state, + .reset_reg_dumps = gfx_v10_0_reset_reg_dumps, }; static const struct amdgpu_ring_funcs gfx_v10_0_ring_funcs_gfx = { diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h index 257f280d3d53..76d3a55278df 100644 --- a/drivers/gpu/drm/amd/include/amd_shared.h +++ b/drivers/gpu/drm/amd/include/amd_shared.h @@ -296,6 +296,7 @@ struct amd_ip_funcs { enum amd_powergating_state state); void (*get_clockgating_state)(void *handle, u32 *flags); int (*enable_umd_pstate)(void *handle, enum amd_dpm_forced_level *level); + int (*reset_reg_dumps)(void *handle); }; -- 2.32.0
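The string-building step in gfx_v10_0_reset_reg_dumps() above appends pairs by repeatedly taking strlen() of the destination, which only works if the buffer starts zeroed. A userspace sketch of the same formatting done with an explicit write offset; the register values are invented for illustration:

```c
/*
 * Userspace sketch of the dump-formatting step: append "addr: value, "
 * pairs using an explicit offset instead of strlen() on the destination.
 * snprintf's return value advances the offset; the loop stops once the
 * buffer is full.
 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

size_t format_dump(char *buf, size_t cap, const uint32_t dump[][2], size_t n)
{
	size_t off = 0;
	size_t i;

	buf[0] = '\0';			/* start from a known-empty string */
	for (i = 0; i < n && off < cap; i++)
		off += snprintf(buf + off, cap - off, "%08x: %08x, ",
				dump[i][0], dump[i][1]);
	return off;
}
```

This avoids both the O(n^2) rescans and any dependence on the allocator zeroing the buffer.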
[PATCH 1/4] drm: add a new drm event for GPU reset
From 8144e60d9f80941dd9f8a53e4b468582aaa849b4 Mon Sep 17 00:00:00 2001 From: Shashank Sharma Date: Fri, 21 Jan 2022 17:23:41 +0100 Subject: [PATCH 1/4] drm: add a new drm event for GPU reset This patch adds a DRM uevent to indicate GPU reset event. A userspace can register to this event and do some postprocessing if supported by native driver. An example usage has been added into AMDGPU driver in this series. Cc: Alexander Deucher Cc: Christian Koenig Signed-off-by: Shashank Sharma --- drivers/gpu/drm/drm_sysfs.c | 19 +++ include/drm/drm_sysfs.h | 1 + 2 files changed, 20 insertions(+) diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c index 968a9560b4aa..0d721d536bb1 100644 --- a/drivers/gpu/drm/drm_sysfs.c +++ b/drivers/gpu/drm/drm_sysfs.c @@ -343,6 +343,25 @@ void drm_sysfs_hotplug_event(struct drm_device *dev) } EXPORT_SYMBOL(drm_sysfs_hotplug_event); +/** + * drm_sysfs_gpu_reset_event - generate a GPU reset uevent + * @dev: DRM device + * + * Send a uevent for the DRM device specified by @dev. This event will + * inform userspace that a GPU reset happened, and it can do some + * postprocessing (eg: data collecting from a trace event). 
+ */ +void drm_sysfs_gpu_reset_event(struct drm_device *dev) +{ + char *event_string = "RESET=1"; + char *envp[] = { event_string, NULL }; + + DRM_DEBUG("generating gpu reset event\n"); + + kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp); +} +EXPORT_SYMBOL(drm_sysfs_gpu_reset_event); + /** * drm_sysfs_connector_status_event - generate a DRM uevent for connector * property status change diff --git a/include/drm/drm_sysfs.h b/include/drm/drm_sysfs.h index d454ef617b2c..312956a0a1bf 100644 --- a/include/drm/drm_sysfs.h +++ b/include/drm/drm_sysfs.h @@ -11,6 +11,7 @@ int drm_class_device_register(struct device *dev); void drm_class_device_unregister(struct device *dev); void drm_sysfs_hotplug_event(struct drm_device *dev); +void drm_sysfs_gpu_reset_event(struct drm_device *dev); void drm_sysfs_connector_status_event(struct drm_connector *connector, struct drm_property *property); #endif -- 2.32.0
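On the receiving end, a userspace listener would match the RESET=1 key this patch adds to the uevent environment. A minimal sketch of that matching; in a real client the strings would arrive via a udev/netlink monitor rather than a hard-coded array:

```c
/*
 * Consumer-side sketch: scan a uevent's environment strings for the
 * RESET=1 key added by drm_sysfs_gpu_reset_event(). Here the
 * environment is modeled as a NULL-terminated string array.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

bool is_gpu_reset_event(const char *envp[])
{
	size_t i;

	for (i = 0; envp[i] != NULL; i++)
		if (strcmp(envp[i], "RESET=1") == 0)
			return true;
	return false;
}
```

A matching client would then do its postprocessing, e.g. read the register dump out of the trace buffer.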
[PATCH 5/5] drm/amdgpu: convert amdgpu_display_supported_domains() to IP versions
Check IP versions rather than asic types. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 29 - 1 file changed, 17 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index 82011e75ed85..6cad39c31c58 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -510,19 +510,24 @@ uint32_t amdgpu_display_supported_domains(struct amdgpu_device *adev, case CHIP_STONEY: domain |= AMDGPU_GEM_DOMAIN_GTT; break; - case CHIP_RAVEN: - /* enable S/G on PCO and RV2 */ - if ((adev->apu_flags & AMD_APU_IS_RAVEN2) || - (adev->apu_flags & AMD_APU_IS_PICASSO)) - domain |= AMDGPU_GEM_DOMAIN_GTT; - break; - case CHIP_RENOIR: - case CHIP_VANGOGH: - case CHIP_YELLOW_CARP: - domain |= AMDGPU_GEM_DOMAIN_GTT; - break; - default: + switch (adev->ip_versions[DCE_HWIP][0]) { + case IP_VERSION(1, 0, 0): + case IP_VERSION(1, 0, 1): + /* enable S/G on PCO and RV2 */ + if ((adev->apu_flags & AMD_APU_IS_RAVEN2) || + (adev->apu_flags & AMD_APU_IS_PICASSO)) + domain |= AMDGPU_GEM_DOMAIN_GTT; + break; + case IP_VERSION(2, 1, 0): + case IP_VERSION(3, 0, 1): + case IP_VERSION(3, 1, 2): + case IP_VERSION(3, 1, 3): + domain |= AMDGPU_GEM_DOMAIN_GTT; + break; + default: + break; + } break; } } -- 2.34.1
[PATCH 4/5] drm/amdgpu: handle BACO synchronization with secondary funcs
Extend secondary function handling for runtime pm beyond audio to USB and UCSI. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 30 +++-- 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 89c3578bc818..119a5798623e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1968,19 +1968,25 @@ static bool amdgpu_is_fw_framebuffer(resource_size_t base, return found; } -static void amdgpu_get_audio_func(struct amdgpu_device *adev) +static void amdgpu_get_secondary_funcs(struct amdgpu_device *adev) { struct pci_dev *p = NULL; + int i; - p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus), - adev->pdev->bus->number, 1); - if (p) { - pm_runtime_get_sync(&p->dev); - - pm_runtime_mark_last_busy(&p->dev); - pm_runtime_put_autosuspend(&p->dev); - - pci_dev_put(p); + /* 0 - GPU +* 1 - audio +* 2 - USB +* 3 - UCSI +*/ + for (i = 1; i < 4; i++) { + p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus), + adev->pdev->bus->number, i); + if (p) { + pm_runtime_get_sync(&p->dev); + pm_runtime_mark_last_busy(&p->dev); + pm_runtime_put_autosuspend(&p->dev); + pci_dev_put(p); + } } } @@ -2148,14 +2154,14 @@ static int amdgpu_pci_probe(struct pci_dev *pdev, * be no PMFW-aware D-state transition(D0->D3) on runpm * suspend. Thus the BACO will be not correctly kicked in. * -* Via amdgpu_get_audio_func(), the audio dev is put +* Via amdgpu_get_secondary_funcs(), the audio dev is put * into D0 state. Then there will be a PMFW-aware D-state * transition(D0->D3) on runpm suspend. */ if (amdgpu_device_supports_baco(ddev) && !(adev->flags & AMD_IS_APU) && (adev->asic_type >= CHIP_NAVI10)) - amdgpu_get_audio_func(adev); + amdgpu_get_secondary_funcs(adev); } return 0; -- 2.34.1
[PATCH 3/5] drm/amdgpu: move runtime pm init after drm and fbdev init
Seems more logical to enable runtime pm at the end of the init sequence so we don't end up entering runtime suspend before init is finished. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 65 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 69 + 2 files changed, 66 insertions(+), 68 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 75ceb43392b1..89c3578bc818 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1968,6 +1968,22 @@ static bool amdgpu_is_fw_framebuffer(resource_size_t base, return found; } +static void amdgpu_get_audio_func(struct amdgpu_device *adev) +{ + struct pci_dev *p = NULL; + + p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus), + adev->pdev->bus->number, 1); + if (p) { + pm_runtime_get_sync(&p->dev); + + pm_runtime_mark_last_busy(&p->dev); + pm_runtime_put_autosuspend(&p->dev); + + pci_dev_put(p); + } +} + static int amdgpu_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { @@ -2100,6 +2116,48 @@ static int amdgpu_pci_probe(struct pci_dev *pdev, if (ret) DRM_ERROR("Creating debugfs files failed (%d).\n", ret); + if (adev->runpm) { + /* only need to skip on ATPX */ + if (amdgpu_device_supports_px(ddev)) + dev_pm_set_driver_flags(ddev->dev, DPM_FLAG_NO_DIRECT_COMPLETE); + /* we want direct complete for BOCO */ + if (amdgpu_device_supports_boco(ddev)) + dev_pm_set_driver_flags(ddev->dev, DPM_FLAG_SMART_PREPARE | + DPM_FLAG_SMART_SUSPEND | + DPM_FLAG_MAY_SKIP_RESUME); + pm_runtime_use_autosuspend(ddev->dev); + pm_runtime_set_autosuspend_delay(ddev->dev, 5000); + + pm_runtime_allow(ddev->dev); + + pm_runtime_mark_last_busy(ddev->dev); + pm_runtime_put_autosuspend(ddev->dev); + + /* +* For runpm implemented via BACO, PMFW will handle the +* timing for BACO in and out: +* - put ASIC into BACO state only when both video and +* audio functions are in D3 state.
+* - pull ASIC out of BACO state when either video or +* audio function is in D0 state. +* Also, at startup, PMFW assumes both functions are in +* D0 state. +* +* So if snd driver was loaded prior to amdgpu driver +* and audio function was put into D3 state, there will +* be no PMFW-aware D-state transition(D0->D3) on runpm +* suspend. Thus the BACO will be not correctly kicked in. +* +* Via amdgpu_get_audio_func(), the audio dev is put +* into D0 state. Then there will be a PMFW-aware D-state +* transition(D0->D3) on runpm suspend. +*/ + if (amdgpu_device_supports_baco(ddev) && + !(adev->flags & AMD_IS_APU) && + (adev->asic_type >= CHIP_NAVI10)) + amdgpu_get_audio_func(adev); + } + return 0; err_pci: @@ -2111,8 +2169,15 @@ static void amdgpu_pci_remove(struct pci_dev *pdev) { struct drm_device *dev = pci_get_drvdata(pdev); + struct amdgpu_device *adev = drm_to_adev(dev); drm_dev_unplug(dev); + + if (adev->runpm) { + pm_runtime_get_sync(dev->dev); + pm_runtime_forbid(dev->dev); + } + amdgpu_driver_unload_kms(dev); /* diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c index 16a6da09c924..4c437ad2dd9a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c @@ -87,11 +87,6 @@ void amdgpu_driver_unload_kms(struct drm_device *dev) if (adev->rmmio == NULL) return; - if (adev->runpm) { - pm_runtime_get_sync(dev->dev); - pm_runtime_forbid(dev->dev); - } - if (amdgpu_acpi_smart_shift_update(dev, AMDGPU_SS_DRV_UNLOAD)) DRM_WARN("smart shift update failed\n"); @@ -124,22 +119,6 @@ void amdgpu_register_gpu_instance(struct amdgpu_device *adev) mutex_unlock(&mgpu_info.mutex); } -static void amdgpu_get_audio_func(struct amdgpu_device *adev) -{ - struct pci_dev *p = NULL; - - p = pci_get_domain_bus_and_slot(pci_domain_nr(adev->pdev->bus), - adev->pdev->bus->number, 1); - if (p) { -
[PATCH 1/5] drm/amdgpu: set APU flag based on IP discovery table
Use the IP versions to set the APU flag when necessary. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index e6a26b554254..ddbe13c9e4c7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c @@ -1253,6 +1253,19 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device *adev) return -EINVAL; } + switch (adev->ip_versions[GC_HWIP][0]) { + case IP_VERSION(9, 1, 0): + case IP_VERSION(9, 2, 2): + case IP_VERSION(9, 3, 0): + case IP_VERSION(10, 1, 3): + case IP_VERSION(10, 3, 1): + case IP_VERSION(10, 3, 3): + adev->flags |= AMD_IS_APU; + break; + default: + break; + } + if (adev->ip_versions[XGMI_HWIP][0] == IP_VERSION(4, 8, 0)) adev->gmc.xgmi.supported = true; -- 2.34.1
[PATCH 2/5] drm/amdgpu: move PX checking into amdgpu_device_ip_early_init
We need to set the APU flag from IP discovery before we evaluate this code. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 15 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c| 13 - 2 files changed, 15 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 061ea30650cc..df78adecf157 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -31,6 +31,7 @@ #include #include #include +#include #include #include @@ -2073,6 +2074,8 @@ static int amdgpu_device_parse_gpu_info_fw(struct amdgpu_device *adev) */ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev) { + struct drm_device *dev = adev_to_drm(adev); + struct pci_dev *parent; int i, r; amdgpu_device_enable_virtual_display(adev); @@ -2137,6 +2140,18 @@ static int amdgpu_device_ip_early_init(struct amdgpu_device *adev) break; } + if (amdgpu_has_atpx() && + (amdgpu_is_atpx_hybrid() || +amdgpu_has_atpx_dgpu_power_cntl()) && + ((adev->flags & AMD_IS_APU) == 0) && + !pci_is_thunderbolt_attached(to_pci_dev(dev->dev))) + adev->flags |= AMD_IS_PX; + + if (!(adev->flags & AMD_IS_APU)) { + parent = pci_upstream_bridge(adev->pdev); + adev->has_pr3 = parent ? 
pci_pr3_present(parent) : false; + } + amdgpu_amdkfd_device_probe(adev); adev->pm.pp_feature = amdgpu_pp_feature_mask; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c index 56c2d988694a..16a6da09c924 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c @@ -152,23 +152,10 @@ static void amdgpu_get_audio_func(struct amdgpu_device *adev) int amdgpu_driver_load_kms(struct amdgpu_device *adev, unsigned long flags) { struct drm_device *dev; - struct pci_dev *parent; int r, acpi_status; dev = adev_to_drm(adev); - if (amdgpu_has_atpx() && - (amdgpu_is_atpx_hybrid() || -amdgpu_has_atpx_dgpu_power_cntl()) && - ((flags & AMD_IS_APU) == 0) && - !pci_is_thunderbolt_attached(to_pci_dev(dev->dev))) - flags |= AMD_IS_PX; - - if (!(flags & AMD_IS_APU)) { - parent = pci_upstream_bridge(adev->pdev); - adev->has_pr3 = parent ? pci_pr3_present(parent) : false; - } - /* amdgpu_device_init should report only fatal error * like memory allocation failure or iomapping failure, * or memory manager initialization failure, it must -- 2.34.1
[PATCH] drm/amdgpu/pm/smu7: drop message about VI performance levels
Earlier chips only had two performance levels, but newer ones potentially had more. The message is harmless. Drop the message to avoid spamming the log. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1874 Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 4 1 file changed, 4 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c index cd99db0dc2be..a1e11037831a 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c @@ -3295,10 +3295,6 @@ static int smu7_apply_state_adjust_rules(struct pp_hwmgr *hwmgr, request_ps->classification.ui_label); data->mclk_ignore_signal = false; - PP_ASSERT_WITH_CODE(smu7_ps->performance_level_count == 2, -"VI should always have 2 performance levels", - ); - max_limits = adev->pm.ac_power ? &(hwmgr->dyn_state.max_clock_voltage_on_ac) : &(hwmgr->dyn_state.max_clock_voltage_on_dc); -- 2.34.1
Re: [PATCH] drm/amd/display: Fix memory leak
On 2022-01-21 06:26, Yongzhi Liu wrote: > [why] > Resource release is needed on the error handling path > to prevent memory leak. > > [how] > Fix this by adding kfree on the error handling path. > > Signed-off-by: Yongzhi Liu Reviewed-by: Harry Wentland Harry > --- > .../drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 80 > -- > 1 file changed, 60 insertions(+), 20 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c > index ded64d0..e463d46 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c > @@ -227,8 +227,10 @@ static ssize_t dp_link_settings_read(struct file *f, > char __user *buf, > break; > > r = put_user(*(rd_buf + result), buf); > - if (r) > + if (r) { > + kfree(rd_buf); > return r; /* r = -EFAULT */ > + } > > buf += 1; > size -= 1; > @@ -389,8 +391,10 @@ static ssize_t dp_phy_settings_read(struct file *f, char > __user *buf, > break; > > r = put_user((*(rd_buf + result)), buf); > - if (r) > + if (r) { > + kfree(rd_buf); > return r; /* r = -EFAULT */ > + } > > buf += 1; > size -= 1; > @@ -1359,8 +1363,10 @@ static ssize_t dp_dsc_clock_en_read(struct file *f, > char __user *buf, > break; > } > > - if (!pipe_ctx) > + if (!pipe_ctx) { > + kfree(rd_buf); > return -ENXIO; > + } > > dsc = pipe_ctx->stream_res.dsc; > if (dsc) > @@ -1376,8 +1382,10 @@ static ssize_t dp_dsc_clock_en_read(struct file *f, > char __user *buf, > break; > > r = put_user(*(rd_buf + result), buf); > - if (r) > + if (r) { > + kfree(rd_buf); > return r; /* r = -EFAULT */ > + } > > buf += 1; > size -= 1; > @@ -1546,8 +1554,10 @@ static ssize_t dp_dsc_slice_width_read(struct file *f, > char __user *buf, > break; > } > > - if (!pipe_ctx) > + if (!pipe_ctx) { > + kfree(rd_buf); > return -ENXIO; > + } > > dsc = pipe_ctx->stream_res.dsc; > if (dsc) > @@ -1563,8 +1573,10 @@ static ssize_t 
dp_dsc_slice_width_read(struct file *f, > char __user *buf, > break; > > r = put_user(*(rd_buf + result), buf); > - if (r) > + if (r) { > + kfree(rd_buf); > return r; /* r = -EFAULT */ > + } > > buf += 1; > size -= 1; > @@ -1731,8 +1743,10 @@ static ssize_t dp_dsc_slice_height_read(struct file > *f, char __user *buf, > break; > } > > - if (!pipe_ctx) > + if (!pipe_ctx) { > + kfree(rd_buf); > return -ENXIO; > + } > > dsc = pipe_ctx->stream_res.dsc; > if (dsc) > @@ -1748,8 +1762,10 @@ static ssize_t dp_dsc_slice_height_read(struct file > *f, char __user *buf, > break; > > r = put_user(*(rd_buf + result), buf); > - if (r) > + if (r) { > + kfree(rd_buf); > return r; /* r = -EFAULT */ > + } > > buf += 1; > size -= 1; > @@ -1912,8 +1928,10 @@ static ssize_t dp_dsc_bits_per_pixel_read(struct file > *f, char __user *buf, > break; > } > > - if (!pipe_ctx) > + if (!pipe_ctx) { > + kfree(rd_buf); > return -ENXIO; > + } > > dsc = pipe_ctx->stream_res.dsc; > if (dsc) > @@ -1929,8 +1947,10 @@ static ssize_t dp_dsc_bits_per_pixel_read(struct file > *f, char __user *buf, > break; > > r = put_user(*(rd_buf + result), buf); > - if (r) > + if (r) { > + kfree(rd_buf); > return r; /* r = -EFAULT */ > + } > > buf += 1; > size -= 1; > @@ -2088,8 +2108,10 @@ static ssize_t dp_dsc_pic_width_read(struct file *f, > char __user *buf, > break; > } > > - if (!pipe_ctx) > + if (!pipe_ctx) { > + kfree(rd_buf); > return -ENXIO; > + } > > dsc = pipe_ctx->stream_res.dsc; > if (dsc) > @@ -2105,8 +2127,10 @@ static ssize_t dp_dsc_pic_width_read(struct file *f, > char __user *buf, > break; > > r = put_user(*(rd_buf + result),
Re: [PATCH v2 0/8] HMM profiler interface
[Public] Please provide a link to the proposed userspace branch that makes use of this. Alex From: amd-gfx on behalf of Philip Yang Sent: Thursday, January 20, 2022 6:13 PM To: amd-gfx@lists.freedesktop.org Cc: Yang, Philip ; Kuehling, Felix Subject: [PATCH v2 0/8] HMM profiler interface The ROCm profiler would expose the data from KFD profiling APIs to application developers to tune the applications based on how the address range attributes affect the behavior and performance. Per process event log use the existing SMI (system management interface) event API. Each event log is one line of text with the event specific information. v2: * Keep existing events behaviour * Use ktime_get_boottime_ns() as timestamp to correlate with other APIs * Use compact message layout, stick with existing message convention * Add unmap from GPU event Philip Yang (8): drm/amdkfd: Correct SMI event read size drm/amdkfd: Add KFD SMI event IDs and triggers drm/amdkfd: Enable per process SMI event drm/amdkfd: Add GPU recoverable fault SMI event drm/amdkfd: add migration SMI event drm/amdkfd: Add user queue eviction restore SMI event drm/amdkfd: Add unmap from GPU SMI event drm/amdkfd: Bump KFD API version for SMI profiling event drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 7 +- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 11 +- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 4 +- drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 67 --- drivers/gpu/drm/amd/amdkfd/kfd_migrate.h | 5 +- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_process.c | 37 +++- drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 163 +- drivers/gpu/drm/amd/amdkfd/kfd_smi_events.h | 19 +- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 63 +-- include/uapi/linux/kfd_ioctl.h| 30 +++- 11 files changed, 343 insertions(+), 65 deletions(-) -- 2.17.1
RE: [PATCH v4] drm/amd: Warn users about potential s0ix problems
[Public] > The S2idle suspend/resume process seems also depends on the > CONFIG_SUSPEND. Moreover, why this check function still return true even > when BIOS/AMDPMC not configured correctly? You know we still looking into > some S0ix abort issue and system will run into such problem when mark those > misconfigured case also as s0ix. If return false then the driver expects to go the "S3 path" which "may" work now after the patches that reset GPU on resume landed. If you think this is wrong and it should be returning false in this case - then I think we need another function that checks if s3 is actually a valid target. Then in suspend we test if we're doing s3, and we test if we're (safely) doing s0ix. If we can't do either then the GPU stays fully powered up across the suspend cycle and you see these warnings. Thoughts on that? > > Thanks, > Prike > > -Original Message- > > From: Limonciello, Mario > > Sent: Friday, January 21, 2022 3:59 AM > > To: amd-gfx@lists.freedesktop.org; Lazar, Lijo ; Liang, > > Prike > > Cc: Bjoren Dasse > > Subject: RE: [PATCH v4] drm/amd: Warn users about potential s0ix problems > > > > [Public] > > > > Add back on Lijo and Prike, my mistake they got dropped from CC. > > > > > -Original Message- > > > From: Limonciello, Mario > > > Sent: Tuesday, January 18, 2022 21:41 > > > To: amd-gfx@lists.freedesktop.org > > > Cc: Limonciello, Mario ; Bjoren Dasse > > > > > > Subject: [PATCH v4] drm/amd: Warn users about potential s0ix problems > > > > > > On some OEM setups users can configure the BIOS for S3 or S2idle. > > > When configured to S3 users can still choose 's2idle' in the kernel by > > > using `/sys/power/mem_sleep`. Before commit 6dc8265f9803 > > ("drm/amdgpu: > > > always reset the asic in suspend (v2)"), the GPU would crash. Now > > > when configured this way, the system should resume but will use more > > power. 
> > > > > > As such, adjust the `amdpu_acpi_is_s0ix function` to warn users about > > > potential power consumption issues during their first attempt at > > > suspending. > > > > > > Reported-by: Bjoren Dasse > > > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1824 > > > Signed-off-by: Mario Limonciello > > > --- > > > v3->v4: > > > * Add back in CONFIG_SUSPEND check > > > v2->v3: > > > * Better direct users how to recover in the bad cases > > > v1->v2: > > > * Only show messages in s2idle cases > > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 21 +++- > > - > > > 1 file changed, 15 insertions(+), 6 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > > > index 4811b0faafd9..2531da6cbec3 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > > > @@ -1040,11 +1040,20 @@ void amdgpu_acpi_detect(void) > > > */ > > > bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev) { -#if > > > IS_ENABLED(CONFIG_AMD_PMC) && IS_ENABLED(CONFIG_SUSPEND) > > > - if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0) { > > > - if (adev->flags & AMD_IS_APU) > > > - return pm_suspend_target_state == > > > PM_SUSPEND_TO_IDLE; > > > - } > > > -#endif > > > +#if IS_ENABLED(CONFIG_SUSPEND) > > > + if (!(adev->flags & AMD_IS_APU) || > > > + pm_suspend_target_state != PM_SUSPEND_TO_IDLE) > > > + return false; > > > +#else > > > return false; > > > +#endif > > > + if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)) > > > + dev_warn_once(adev->dev, > > > + "Power consumption will be higher as BIOS has > > not > > > been configured for suspend-to-idle.\n" > > > + "To use suspend-to-idle change the sleep mode in > > > BIOS setup.\n"); > > > +#if !IS_ENABLED(CONFIG_AMD_PMC) > > > + dev_warn_once(adev->dev, > > > + "Power consumption will be higher as the kernel has not > > > been compiled with CONFIG_AMD_PMC.\n"); > > > +#endif > > > + return true; 
> > > } > > > -- > > > 2.25.1
Re: [PATCH 1/2] drm/amdgpu/display: adjust msleep limit in dp_wait_for_training_aux_rd_interval
[Public] It just changes the limit for when we use msleep vs udelay, not the units. Alex From: Chen, Guchun Sent: Thursday, January 20, 2022 8:49 PM To: Deucher, Alexander ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: RE: [PATCH 1/2] drm/amdgpu/display: adjust msleep limit in dp_wait_for_training_aux_rd_interval [Public] If we change if condition, how about the division by "wait_in_micro_secs/1000", as the sleep time is less now. Shall we adjust it as well? Regards, Guchun -Original Message- From: amd-gfx On Behalf Of Alex Deucher Sent: Friday, January 21, 2022 2:04 AM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: [PATCH 1/2] drm/amdgpu/display: adjust msleep limit in dp_wait_for_training_aux_rd_interval Some architectures (e.g., ARM) have relatively low udelay limits. On most architectures, anything longer than 2000us is not recommended. Change the check to align with other similar checks in DC. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c index 1f8831156bc4..aa1c67c3c386 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c @@ -202,7 +202,7 @@ void dp_wait_for_training_aux_rd_interval( uint32_t wait_in_micro_secs) { #if defined(CONFIG_DRM_AMD_DC_DCN) - if (wait_in_micro_secs > 16000) + if (wait_in_micro_secs > 1000) msleep(wait_in_micro_secs/1000); else udelay(wait_in_micro_secs); -- 2.34.1
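As Alex says, the patch only lowers the cutoff between the busy-wait and sleep paths; the interval is still passed in microseconds, and only the msleep branch converts it with wait_in_micro_secs/1000 as before. A toy model of that selection logic (the 1000us threshold matches the patch; the enum names are illustrative, not DC code):

```c
/*
 * Toy model of the wait-strategy choice: busy-wait (udelay) only for
 * short intervals and sleep (msleep) past a threshold, because long
 * udelay()s are unreliable on some architectures such as ARM.
 */
#include <assert.h>
#include <stdint.h>

enum wait_kind { WAIT_UDELAY, WAIT_MSLEEP };

enum wait_kind pick_wait(uint32_t wait_in_micro_secs)
{
	/* the patch lowers this cutoff from 16000 to 1000 */
	return wait_in_micro_secs > 1000 ? WAIT_MSLEEP : WAIT_UDELAY;
}
```

Intervals just over the cutoff now round down when converted to milliseconds, which is the trade-off Guchun's question is probing; the units themselves are unchanged.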
Re: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM
On Fri, Jan 21, 2022 at 3:35 AM Thorsten Leemhuis wrote: > > Hi, this is your Linux kernel regression tracker speaking. > > On 21.01.22 03:13, James Turner wrote: > > > > I finished the bisection (log below). The issue was introduced in > > f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)"). > > FWIW, that was: > > > drm/amdgpu/acpi: make ATPX/ATCS structures global (v2) > > They are global ACPI methods, so maybe the structures > > global in the driver. This simplified a number of things > > in the handling of these methods. > > > > v2: reset the handle if verify interface fails (Lijo) > > v3: fix compilation when ACPI is not defined. > > > > Reviewed-by: Lijo Lazar > > Signed-off-by: Alex Deucher > > In that case we need to get those two and the maintainers for the driver > involved by addressing them with this mail. And to make it easy for them > here is a link and a quote from the original report: > > https://lore.kernel.org/all/87ee57c8fu@turner.link/ Are you ever loading the amdgpu driver in your tests? If not, I don't see how this patch would affect anything as the driver code would never have executed. It would appear not based on your example. Alex > > ``` > > Hi, > > > > With newer kernels, starting with the v5.14 series, when using a MS > > Windows 10 guest VM with PCI passthrough of an AMD Radeon Pro WX 3200 > > discrete GPU, the passed-through GPU will not run above 501 MHz, even > > when it is under 100% load and well below the temperature limit. As a > > result, GPU-intensive software (such as video games) runs unusably > > slowly in the VM. > > > > In contrast, with older kernels, the passed-through GPU runs at up to > > 1295 MHz (the correct hardware limit), so GPU-intensive software runs at > > a reasonable speed in the VM. 
> > > > I've confirmed that the issue exists with the following kernel versions: > > > > - v5.16 > > - v5.14 > > - v5.14-rc1 > > > > The issue does not exist with the following kernels: > > > > - v5.13 > > - various packaged (non-vanilla) 5.10.* Arch Linux `linux-lts` kernels > > > > So, the issue was introduced between v5.13 and v5.14-rc1. I'm willing to > > bisect the commit history to narrow it down further, if that would be > > helpful. > > > > The configuration details and test results are provided below. In > > summary, for the kernels with this issue, the GPU core stays at a > > constant 0.8 V, the GPU core clock ranges from 214 MHz to 501 MHz, and > > the GPU memory stays at a constant 625 MHz, in the VM. For the correctly > > working kernels, the GPU core ranges from 0.85 V to 1.0 V, the GPU core > > clock ranges from 214 MHz to 1295 MHz, and the GPU memory stays at 1500 > > MHz, in the VM. > > > > Please let me know if additional information would be helpful. > > > > Regards, > > James Turner > > > > # Configuration Details > > > > Hardware: > > > > - Dell Precision 7540 laptop > > - CPU: Intel Core i7-9750H (x86-64) > > - Discrete GPU: AMD Radeon Pro WX 3200 > > - The internal display is connected to the integrated GPU, and external > > displays are connected to the discrete GPU. > > > > Software: > > > > - KVM host: Arch Linux > > - self-built vanilla kernel (built using Arch Linux `PKGBUILD` > > modified to use vanilla kernel sources from git.kernel.org) > > - libvirt 1:7.10.0-2 > > - qemu 6.2.0-2 > > > > - KVM guest: Windows 10 > > - GPU driver: Radeon Pro Software Version 21.Q3 (Note that I also > > experienced this issue with the 20.Q4 driver, using packaged > > (non-vanilla) Arch Linux kernels on the host, before updating to the > > 21.Q3 driver.) 
> > > > Kernel config: > > > > - For v5.13, v5.14-rc1, and v5.14, I used > > > > https://github.com/archlinux/svntogit-packages/blob/89c24952adbfa645d9e1a6f12c572929f7e4e3c7/trunk/config > > (The build script ran `make olddefconfig` on that config file.) > > > > - For v5.16, I used > > > > https://github.com/archlinux/svntogit-packages/blob/94f84e1ad8a530e54aa34cadbaa76e8dcc439d10/trunk/config > > (The build script ran `make olddefconfig` on that config file.) > > > > I set up the VM with PCI passthrough according to the instructions at > > https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF > > > > I'm passing through the following PCI devices to the VM, as listed by > > `lspci -D -nn`: > > > > :01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, > > Inc. [AMD/ATI] Lexa XT [Radeon PRO WX 3200] [1002:6981] > > :01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] > > Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] [1002:aae0] > > > > The host kernel command line includes the following relevant options: > > > > intel_iommu=on vfio-pci.ids=1002:6981,1002:aae0 > > > > to enable IOMMU and bind the `vfio-pci` driver to the PCI devices. > > > > My `/etc/mkinitcpio.conf` includes the following line: > > > > MODULES=(vfio_pci vfio vfio_iommu_type1 vfio_virqfd i915 amdgpu) > > > > to load
[PATCH] drm/amdkfd: enable heavy-weight TLB flush on Vega20
It is to meet the requirement for memory allocation optimization on MI50.

Signed-off-by: Eric Huang
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 5b8ae0795c0a..d708f1a502cf 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1582,7 +1582,8 @@ static int kfd_ioctl_free_memory_of_gpu(struct file *filep,
 static bool kfd_flush_tlb_after_unmap(struct kfd_dev *dev)
 {
 	return KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 2) ||
 	       (KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 1) &&
-		dev->adev->sdma.instance[0].fw_version >= 18);
+		dev->adev->sdma.instance[0].fw_version >= 18) ||
+	       KFD_GC_VERSION(dev) == IP_VERSION(9, 4, 0);
 }

 static int kfd_ioctl_map_memory_to_gpu(struct file *filep,
--
2.25.1
[PATCH] drm/amd/pm: remove useless if
Clean the following coccicheck warning:

./drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c:7035:2-4: WARNING: possible condition with no effect (if == else).

Reported-by: Abaci Robot
Signed-off-by: Jiapeng Chong
---
 drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
index 23ff0d812e4b..7427c50409d4 100644
--- a/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
+++ b/drivers/gpu/drm/amd/pm/legacy-dpm/si_dpm.c
@@ -7032,10 +7032,7 @@ static int si_power_control_set_level(struct amdgpu_device *adev)
 	ret = si_resume_smc(adev);
 	if (ret)
 		return ret;
-	ret = si_set_sw_state(adev);
-	if (ret)
-		return ret;
-	return 0;
+	return si_set_sw_state(adev);
 }

 static void si_set_vce_clock(struct amdgpu_device *adev,
--
2.20.1.7.g153144c
Re: [PATCH v9 3/6] drm: implement top-down allocation method
On 19/01/2022 11:37, Arunpravin wrote: Implemented a function which walks through the order list, compares the offsets and returns the maximum offset block; this method is unpredictable in obtaining the high range address blocks, which depends on allocation and deallocation. For instance, if the driver requests an address at a specific low range, the allocator traverses from the root block and splits the larger blocks until it reaches the specific block; in the process of splitting, lower orders in the freelist are occupied with low range address blocks, and for a subsequent TOPDOWN memory request we may return the low range blocks. To overcome this issue, we may go with the below approach. The other approach sorts each order list's entries in ascending order, compares the last entry of each order list in the freelist and returns the max block. This creates sorting overhead on every drm_buddy_free() request and splits up larger blocks for a single page request. v2: - Fix alignment issues (Matthew Auld) - Remove unnecessary list_empty check (Matthew Auld) - merged the below patch to see the feature in action - add top-down alloc support to i915 driver Signed-off-by: Arunpravin --- drivers/gpu/drm/drm_buddy.c | 36 ++++++++++++++++++++++++++++++++++++----- drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 3 +++ include/drm/drm_buddy.h | 1 + 3 files changed, 35 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c index 954e31962c74..6aa5c1ce25bf 100644 --- a/drivers/gpu/drm/drm_buddy.c +++ b/drivers/gpu/drm/drm_buddy.c @@ -371,6 +371,26 @@ alloc_range_bias(struct drm_buddy *mm, return ERR_PTR(err); } +static struct drm_buddy_block * +get_maxblock(struct list_head *head) +{ + struct drm_buddy_block *max_block = NULL, *node; + + max_block = list_first_entry_or_null(head, +struct drm_buddy_block, +link); + if (!max_block) + return NULL; + + list_for_each_entry(node, head, link) { + if (drm_buddy_block_offset(node) > + drm_buddy_block_offset(max_block)) + max_block = node;
+ } If we feed in the knowledge of the visible_size (or perhaps implement that generically as "zones"), I think this can be done more efficiently. It could also be useful to track directly in the allocator how much of the visible_size is still available, rather than having to do that in the upper levels by scanning the entire list. But hopefully in practice this should be good enough for our needs, Reviewed-by: Matthew Auld + + return max_block; +} + static struct drm_buddy_block * alloc_from_freelist(struct drm_buddy *mm, unsigned int order, @@ -381,11 +401,17 @@ alloc_from_freelist(struct drm_buddy *mm, int err; for (i = order; i <= mm->max_order; ++i) { - block = list_first_entry_or_null(&mm->free_list[i], -struct drm_buddy_block, -link); - if (block) - break; + if (flags & DRM_BUDDY_TOPDOWN_ALLOCATION) { + block = get_maxblock(&mm->free_list[i]); + if (block) + break; + } else { + block = list_first_entry_or_null(&mm->free_list[i], +struct drm_buddy_block, +link); + if (block) + break; + } } if (!block) diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c index 1411f4cf1f21..3662434b64bb 100644 --- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c +++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c @@ -53,6 +53,9 @@ static int i915_ttm_buddy_man_alloc(struct ttm_resource_manager *man, INIT_LIST_HEAD(&bman_res->blocks); bman_res->mm = mm; + if (place->flags & TTM_PL_FLAG_TOPDOWN) + bman_res->flags |= DRM_BUDDY_TOPDOWN_ALLOCATION; + if (place->fpfn || lpfn != man->size) bman_res->flags |= DRM_BUDDY_RANGE_ALLOCATION; diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h index 865664b90a8a..424fc443115e 100644 --- a/include/drm/drm_buddy.h +++ b/include/drm/drm_buddy.h @@ -28,6 +28,7 @@ }) #define DRM_BUDDY_RANGE_ALLOCATION (1 << 0) +#define DRM_BUDDY_TOPDOWN_ALLOCATION (1 << 1) struct drm_buddy_block { #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12)
Re: [PATCH v9 2/6] drm: improve drm_buddy_alloc function
On 19/01/2022 11:37, Arunpravin wrote: - Make drm_buddy_alloc a single function to handle range allocation and non-range allocation demands - Implemented a new function alloc_range() which allocates the requested power-of-two block comply with range limitations - Moved order computation and memory alignment logic from i915 driver to drm buddy v2: merged below changes to keep the build unbroken - drm_buddy_alloc_range() becomes obsolete and may be removed - enable ttm range allocation (fpfn / lpfn) support in i915 driver - apply enhanced drm_buddy_alloc() function to i915 driver v3(Matthew Auld): - Fix alignment issues and remove unnecessary list_empty check - add more validation checks for input arguments - make alloc_range() block allocations as bottom-up - optimize order computation logic - replace uint64_t with u64, which is preferred in the kernel v4(Matthew Auld): - keep drm_buddy_alloc_range() function implementation for generic actual range allocations - keep alloc_range() implementation for end bias allocations v5(Matthew Auld): - modify drm_buddy_alloc() passing argument place->lpfn to lpfn as place->lpfn will currently always be zero for i915 v6(Matthew Auld): - fixup potential uaf - If we are unlucky and can't allocate enough memory when splitting blocks, where we temporarily end up with the given block and its buddy on the respective free list, then we need to ensure we delete both blocks, and no just the buddy, before potentially freeing them Hmm, not sure we really want to squash existing bug fixes into this patch. Perhaps bring in [1] to the start of your series? i915_buddy is gone now. Alternatively I can resend such that it applies on top drm_buddy. Your choice. 
[1] https://patchwork.freedesktop.org/patch/469806/?series=98953&rev=1 - fix warnings reported by kernel test robot Signed-off-by: Arunpravin --- drivers/gpu/drm/drm_buddy.c | 326 +- drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 67 ++-- drivers/gpu/drm/i915/i915_ttm_buddy_manager.h | 2 + include/drm/drm_buddy.h | 22 +- 4 files changed, 293 insertions(+), 124 deletions(-) diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c index d60878bc9c20..954e31962c74 100644 --- a/drivers/gpu/drm/drm_buddy.c +++ b/drivers/gpu/drm/drm_buddy.c @@ -282,23 +282,99 @@ void drm_buddy_free_list(struct drm_buddy *mm, struct list_head *objects) } EXPORT_SYMBOL(drm_buddy_free_list); -/** - * drm_buddy_alloc_blocks - allocate power-of-two blocks - * - * @mm: DRM buddy manager to allocate from - * @order: size of the allocation - * - * The order value here translates to: - * - * 0 = 2^0 * mm->chunk_size - * 1 = 2^1 * mm->chunk_size - * 2 = 2^2 * mm->chunk_size - * - * Returns: - * allocated ptr to the &drm_buddy_block on success - */ -struct drm_buddy_block * -drm_buddy_alloc_blocks(struct drm_buddy *mm, unsigned int order) +static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2) +{ + return s1 <= e2 && e1 >= s2; +} + +static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2) +{ + return s1 <= s2 && e1 >= e2; +} + +static struct drm_buddy_block * +alloc_range_bias(struct drm_buddy *mm, +u64 start, u64 end, +unsigned int order) +{ + struct drm_buddy_block *block; + struct drm_buddy_block *buddy; + LIST_HEAD(dfs); + int err; + int i; + + end = end - 1; + + for (i = 0; i < mm->n_roots; ++i) + list_add_tail(&mm->roots[i]->tmp_link, &dfs); + + do { + u64 block_start; + u64 block_end; + + block = list_first_entry_or_null(&dfs, +struct drm_buddy_block, +tmp_link); + if (!block) + break; + + list_del(&block->tmp_link); + + if (drm_buddy_block_order(block) < order) + continue; + + block_start = drm_buddy_block_offset(block); + block_end = block_start + drm_buddy_block_size(mm, block) - 1; + + if
(!overlaps(start, end, block_start, block_end)) + continue; + + if (drm_buddy_block_is_allocated(block)) + continue; + + if (contains(start, end, block_start, block_end) && + order == drm_buddy_block_order(block)) { + /* +* Find the free block within the range. +*/ + if (drm_buddy_block_is_free(block)) + return block; + + continue; + } + + if (!drm_buddy_block_is_split(block)) { + err = split_block(mm, block); + if
RE: [PATCH v4] drm/amd: Warn users about potential s0ix problems
The S2idle suspend/resume process also seems to depend on CONFIG_SUSPEND. Moreover, why does this check function still return true even when BIOS/AMDPMC is not configured correctly? You know we are still looking into some S0ix abort issues, and the system will run into such problems when those misconfigured cases are also marked as s0ix. Thanks, Prike > -----Original Message----- > From: Limonciello, Mario > Sent: Friday, January 21, 2022 3:59 AM > To: amd-gfx@lists.freedesktop.org; Lazar, Lijo ; Liang, > Prike > Cc: Bjoren Dasse > Subject: RE: [PATCH v4] drm/amd: Warn users about potential s0ix problems > > [Public] > > Add back on Lijo and Prike, my mistake they got dropped from CC. > > > -----Original Message----- > > From: Limonciello, Mario > > Sent: Tuesday, January 18, 2022 21:41 > > To: amd-gfx@lists.freedesktop.org > > Cc: Limonciello, Mario ; Bjoren Dasse > > > > Subject: [PATCH v4] drm/amd: Warn users about potential s0ix problems > > > > On some OEM setups users can configure the BIOS for S3 or S2idle. > > When configured to S3 users can still choose 's2idle' in the kernel by > > using `/sys/power/mem_sleep`. Before commit 6dc8265f9803 > ("drm/amdgpu: > > always reset the asic in suspend (v2)"), the GPU would crash. Now > > when configured this way, the system should resume but will use more > power. > > > > As such, adjust the `amdgpu_acpi_is_s0ix_active` function to warn users about > > potential power consumption issues during their first attempt at > > suspending.
> > > > Reported-by: Bjoren Dasse > > Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1824 > > Signed-off-by: Mario Limonciello > > --- > > v3->v4: > > * Add back in CONFIG_SUSPEND check > > v2->v3: > > * Better direct users how to recover in the bad cases > > v1->v2: > > * Only show messages in s2idle cases > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 21 +++- > - > > 1 file changed, 15 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > > index 4811b0faafd9..2531da6cbec3 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > > @@ -1040,11 +1040,20 @@ void amdgpu_acpi_detect(void) > > */ > > bool amdgpu_acpi_is_s0ix_active(struct amdgpu_device *adev) { -#if > > IS_ENABLED(CONFIG_AMD_PMC) && IS_ENABLED(CONFIG_SUSPEND) > > - if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0) { > > - if (adev->flags & AMD_IS_APU) > > - return pm_suspend_target_state == > > PM_SUSPEND_TO_IDLE; > > - } > > -#endif > > +#if IS_ENABLED(CONFIG_SUSPEND) > > + if (!(adev->flags & AMD_IS_APU) || > > + pm_suspend_target_state != PM_SUSPEND_TO_IDLE) > > + return false; > > +#else > > return false; > > +#endif > > + if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0)) > > + dev_warn_once(adev->dev, > > + "Power consumption will be higher as BIOS has > not > > been configured for suspend-to-idle.\n" > > + "To use suspend-to-idle change the sleep mode in > > BIOS setup.\n"); > > +#if !IS_ENABLED(CONFIG_AMD_PMC) > > + dev_warn_once(adev->dev, > > + "Power consumption will be higher as the kernel has not > > been compiled with CONFIG_AMD_PMC.\n"); > > +#endif > > + return true; > > } > > -- > > 2.25.1
RE: [PATCH Review 1/1] drm/amdgpu: fix channel index mapping for SIENNA_CICHLID
[AMD Official Use Only] > -Original Message- > From: Stanley.Yang > Sent: Friday, January 21, 2022 5:35 PM > To: amd-gfx@lists.freedesktop.org; Zhang, Hawking > ; Ziya, Mohammad zafar > ; Clements, John > ; Zhou1, Tao > Cc: Yang, Stanley > Subject: [PATCH Review 1/1] drm/amdgpu: fix channel index mapping for > SIENNA_CICHLID > > Pmfw read ecc info registers in the following order, > umc0: ch_inst 0, 1, 2 ... 7 > umc1: ch_inst 0, 1, 2 ... 7 > The position of the register value stored in eccinfo table is calculated > according > to the below formula, > channel_index = umc_inst * channel_in_umc + ch_inst Driver directly use > the [Tao]: use -> uses > index of eccinfo table array as channel index, it's not correct, driver need [Tao]: need -> needs to The patch itself is OK for me, with the comments above fixed, it's: Reviewed-by: Tao Zhou > convert eccinfo table array index to channel index according to > channel_idx_tbl. > > Change-Id: I26c3a99d161a00a69f7d00bd177942b6cd65a0de > Signed-off-by: Stanley.Yang > --- > drivers/gpu/drm/amd/amdgpu/umc_v8_7.c | 29 --- > 1 file changed, 17 insertions(+), 12 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c > b/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c > index cd57f39df7d1..d70417196662 100644 > --- a/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c > +++ b/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c > @@ -55,29 +55,36 @@ static inline uint32_t > get_umc_v8_7_channel_index(struct amdgpu_device *adev, } > > static void umc_v8_7_ecc_info_query_correctable_error_count(struct > amdgpu_device *adev, > - uint32_t channel_index, > + uint32_t umc_inst, uint32_t > ch_inst, > unsigned long *error_count) > { > uint64_t mc_umc_status; > + uint32_t eccinfo_table_idx; > struct amdgpu_ras *ras = amdgpu_ras_get_context(adev); > + > + eccinfo_table_idx = umc_inst * adev->umc.channel_inst_num + ch_inst; > + > /* check for SRAM correctable error >* MCUMC_STATUS is a 64 bit register >*/ > - mc_umc_status = 
ras->umc_ecc.ecc[channel_index].mca_umc_status; > + mc_umc_status = ras- > >umc_ecc.ecc[eccinfo_table_idx].mca_umc_status; > if (REG_GET_FIELD(mc_umc_status, > MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1 && > REG_GET_FIELD(mc_umc_status, > MCA_UMC_UMC0_MCUMC_STATUST0, CECC) == 1) > *error_count += 1; > } > > static void umc_v8_7_ecc_info_querry_uncorrectable_error_count(struct > amdgpu_device *adev, > - uint32_t > channel_index, > + uint32_t umc_inst, > uint32_t ch_inst, > unsigned long > *error_count) > { > uint64_t mc_umc_status; > + uint32_t eccinfo_table_idx; > struct amdgpu_ras *ras = amdgpu_ras_get_context(adev); > > + eccinfo_table_idx = umc_inst * adev->umc.channel_inst_num + ch_inst; > + > /* check the MCUMC_STATUS */ > - mc_umc_status = ras->umc_ecc.ecc[channel_index].mca_umc_status; > + mc_umc_status = ras- > >umc_ecc.ecc[eccinfo_table_idx].mca_umc_status; > if ((REG_GET_FIELD(mc_umc_status, > MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1) && > (REG_GET_FIELD(mc_umc_status, > MCA_UMC_UMC0_MCUMC_STATUST0, Deferred) == 1 || > REG_GET_FIELD(mc_umc_status, > MCA_UMC_UMC0_MCUMC_STATUST0, UECC) == 1 || @@ -94,20 +101,16 @@ > static void umc_v8_7_ecc_info_query_ras_error_count(struct amdgpu_device > *adev, > > uint32_t umc_inst= 0; > uint32_t ch_inst = 0; > - uint32_t channel_index = 0; > > /* TODO: driver needs to toggle DF Cstate to ensure >* safe access of UMC registers. Will add the protection >*/ > LOOP_UMC_INST_AND_CH(umc_inst, ch_inst) { > - channel_index = get_umc_v8_7_channel_index(adev, > - umc_inst, > - ch_inst); > umc_v8_7_ecc_info_query_correctable_error_count(adev, > - channel_index, > + umc_inst, ch_inst, > &(err_data- > >ce_count)); > umc_v8_7_ecc_info_querry_uncorrectable_error_count(adev, > - channel_index, > + umc_inst, ch_inst, > &(err_data- > >ue_count)); > } > } > @@ -120,12 +123,14 @@ static void > umc_v8_7_ecc_info_query_error_address(struct amdgpu_device *adev, > uint64_t mc_umc_status, err_addr, retired_page; > struct eeprom_table_record
Re: [PATCH] drm/amdgpu: drop WARN_ON in amdgpu_gart_bind/unbind
On 21.01.22 at 09:47, Guchun Chen wrote:
NULL pointer check has guarded it already.

calltrace:
amdgpu_ttm_gart_bind+0x49/0xa0 [amdgpu]
amdgpu_ttm_alloc_gart+0x13f/0x180 [amdgpu]
amdgpu_bo_create_reserved+0x139/0x2c0 [amdgpu]
? amdgpu_ttm_debugfs_init+0x120/0x120 [amdgpu]
amdgpu_bo_create_kernel+0x17/0x80 [amdgpu]
amdgpu_ttm_init+0x542/0x5e0 [amdgpu]

Fixes: f0239505d6c4 ("drm/amdgpu: remove gart.ready flag")
Signed-off-by: Guchun Chen

Reviewed-by: Christian König

---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 53cc844346f0..91d8207336c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -161,7 +161,7 @@ void amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
 	uint64_t flags = 0;
 	int idx;

-	if (WARN_ON(!adev->gart.ptr))
+	if (!adev->gart.ptr)
 		return;

 	if (!drm_dev_enter(adev_to_drm(adev), &idx))
@@ -241,7 +241,7 @@ void amdgpu_gart_bind(struct amdgpu_device *adev, uint64_t offset,
 		     int pages, dma_addr_t *dma_addr, uint64_t flags)
 {
-	if (WARN_ON(!adev->gart.ptr))
+	if (!adev->gart.ptr)
 		return;

 	amdgpu_gart_map(adev, offset, pages, dma_addr, flags, adev->gart.ptr);
Re: [PATCH 3/7] drm/amd/pm: drop the redundant 'supported' member of smu_feature structure
On 1/21/2022 2:56 PM, Quan, Evan wrote: [AMD Official Use Only] -Original Message- From: Lazar, Lijo Sent: Friday, January 21, 2022 4:52 PM To: Quan, Evan ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Chen, Guchun ; Huang, Ray Subject: Re: [PATCH 3/7] drm/amd/pm: drop the redundant 'supported' member of smu_feature structure On 1/21/2022 1:14 PM, Evan Quan wrote: As it has exactly the same value as the 'enabled' member and also do the same thing. I believe the original intention is different. We need to cache the features which are really supported by PMFW on that device on init. When a request comes through sysfs node for enable/disable of a feature, we should check against this and disallow those which are not supported. [Quan, Evan] Uh, I doubt it was designed with such intention. As if it was, there should be checks in ->get_allowed_feature_mask(e.g. navi10_get_allowed_feature_mask) whether driver tries to enable/disable different features from PMFW. " When a request comes through sysfs node for enable/disable of a feature" If the sysfs node mentioned is "pp_features", I might be able to provide some insights why there is no such checks added when I made them. Considering there may be some dependency between those dpm features(e.g. gfx ulv depends on gfx dpm), we actually cannot guard user whether their settings will succeed. E.g. If PMFW supports both GFX ULV and DPM but user wants ULV only, the checks against pmfw supported features seem fine. But actually that cannot work due to the dependency between them. So, instead of applying some checks which actually cannot guard anything, we just let user take the risks. Below is one example > - if (!smu_cmn_feature_is_supported(smu, SMU_FEATURE_FAN_CONTROL_BIT)) > + if (!smu_cmn_feature_is_enabled(smu, SMU_FEATURE_FAN_CONTROL_BIT)) Let's say user switched to manual mode, that time we disable fan control and move to manual mode. 
Next time when user requests auto mode, we check if fan control is originally supported on that platform and switch to auto. Either way, we should cache the features which are originally supported on the platform (through a combination of PPTable features and allowed feature masks on ASICs which are applicable) and disallow anything outside of that. Thanks, Lijo BR Evan Thanks, Lijo Signed-off-by: Evan Quan Change-Id: I07c9a5ac5290cd0d88a40ce1768d393156419b5a --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 1 - drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 1 - .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 8 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 10 +- .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 19 --- .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 5 + .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 5 + .../drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c | 3 --- drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 17 - drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h| 3 --- 10 files changed, 19 insertions(+), 53 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index 5ace30434e60..d3237b89f2c5 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -949,7 +949,6 @@ static int smu_sw_init(void *handle) smu->pool_size = adev->pm.smu_prv_buffer_size; smu->smu_feature.feature_num = SMU_FEATURE_MAX; - bitmap_zero(smu->smu_feature.supported, SMU_FEATURE_MAX); bitmap_zero(smu->smu_feature.enabled, SMU_FEATURE_MAX); bitmap_zero(smu->smu_feature.allowed, SMU_FEATURE_MAX); diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h index 18f24db7d202..3c0360772822 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h @@ -388,7 +388,6 @@ struct smu_power_context { struct smu_feature { uint32_t feature_num; - DECLARE_BITMAP(supported, SMU_FEATURE_MAX); DECLARE_BITMAP(allowed, 
SMU_FEATURE_MAX); DECLARE_BITMAP(enabled, SMU_FEATURE_MAX); }; diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c index 84834c24a7e9..9fb290f9aaeb 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c @@ -1625,8 +1625,8 @@ static int navi10_display_config_changed(struct smu_context *smu) int ret = 0; if ((smu->watermarks_bitmap & WATERMARKS_EXIST) && - smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) && - smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) { + smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) && + smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) { ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_NumOfDisplays, smu->display_config->num_display, NULL); @@ -1864,13 +1864,13 @@ static int navi10_notify_smc_display_config(struct smu_context *smu) min_clocks.dcef_clock_in_sr = smu->display_config->min_dcef_deep_sleep_set_clk; min_clocks.memory_clock = smu->display_config->min_mem_set_clock; - if (smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT)) { + if (smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_DCEFCLK_BIT)) { clock_req.clock_type = amd_pp_dcef_clock; clock_req.clock_freq_in_khz = min_clocks.dcef_clock * 10; ret = smu_v11_0_display_clock_voltage_request(smu, &clock_req); if (!ret) { - if (smu_cmn_feature_is_supported(smu, SMU_FEATURE_DS_DCEFCLK_BIT)) { + if (smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DS_DCEFCLK_BIT)) { ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetMinDeepSleepDcefclk, min_clocks.dcef_clock_in_sr/100, diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c index 651fe748e423..d568d6137a00 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c @@ -1281,8 +1281,8 @@ static int
[PATCH Review 1/1] drm/amdgpu: fix channel index mapping for SIENNA_CICHLID
Pmfw read ecc info registers in the following order, umc0: ch_inst 0, 1, 2 ... 7 umc1: ch_inst 0, 1, 2 ... 7 The position of the register value stored in eccinfo table is calculated according to the below formula, channel_index = umc_inst * channel_in_umc + ch_inst Driver directly use the index of eccinfo table array as channel index, it's not correct, driver need convert eccinfo table array index to channel index according to channel_idx_tbl. Change-Id: I26c3a99d161a00a69f7d00bd177942b6cd65a0de Signed-off-by: Stanley.Yang --- drivers/gpu/drm/amd/amdgpu/umc_v8_7.c | 29 --- 1 file changed, 17 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c b/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c index cd57f39df7d1..d70417196662 100644 --- a/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c +++ b/drivers/gpu/drm/amd/amdgpu/umc_v8_7.c @@ -55,29 +55,36 @@ static inline uint32_t get_umc_v8_7_channel_index(struct amdgpu_device *adev, } static void umc_v8_7_ecc_info_query_correctable_error_count(struct amdgpu_device *adev, - uint32_t channel_index, + uint32_t umc_inst, uint32_t ch_inst, unsigned long *error_count) { uint64_t mc_umc_status; + uint32_t eccinfo_table_idx; struct amdgpu_ras *ras = amdgpu_ras_get_context(adev); + + eccinfo_table_idx = umc_inst * adev->umc.channel_inst_num + ch_inst; + /* check for SRAM correctable error * MCUMC_STATUS is a 64 bit register */ - mc_umc_status = ras->umc_ecc.ecc[channel_index].mca_umc_status; + mc_umc_status = ras->umc_ecc.ecc[eccinfo_table_idx].mca_umc_status; if (REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1 && REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, CECC) == 1) *error_count += 1; } static void umc_v8_7_ecc_info_querry_uncorrectable_error_count(struct amdgpu_device *adev, - uint32_t channel_index, + uint32_t umc_inst, uint32_t ch_inst, unsigned long *error_count) { uint64_t mc_umc_status; + uint32_t eccinfo_table_idx; struct amdgpu_ras *ras = amdgpu_ras_get_context(adev); + 
eccinfo_table_idx = umc_inst * adev->umc.channel_inst_num + ch_inst; + /* check the MCUMC_STATUS */ - mc_umc_status = ras->umc_ecc.ecc[channel_index].mca_umc_status; + mc_umc_status = ras->umc_ecc.ecc[eccinfo_table_idx].mca_umc_status; if ((REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1) && (REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Deferred) == 1 || REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, UECC) == 1 || @@ -94,20 +101,16 @@ static void umc_v8_7_ecc_info_query_ras_error_count(struct amdgpu_device *adev, uint32_t umc_inst= 0; uint32_t ch_inst = 0; - uint32_t channel_index = 0; /* TODO: driver needs to toggle DF Cstate to ensure * safe access of UMC registers. Will add the protection */ LOOP_UMC_INST_AND_CH(umc_inst, ch_inst) { - channel_index = get_umc_v8_7_channel_index(adev, - umc_inst, - ch_inst); umc_v8_7_ecc_info_query_correctable_error_count(adev, - channel_index, + umc_inst, ch_inst, &(err_data->ce_count)); umc_v8_7_ecc_info_querry_uncorrectable_error_count(adev, - channel_index, + umc_inst, ch_inst, &(err_data->ue_count)); } } @@ -120,12 +123,14 @@ static void umc_v8_7_ecc_info_query_error_address(struct amdgpu_device *adev, uint64_t mc_umc_status, err_addr, retired_page; struct eeprom_table_record *err_rec; uint32_t channel_index; + uint32_t eccinfo_table_idx; struct amdgpu_ras *ras = amdgpu_ras_get_context(adev); + eccinfo_table_idx = umc_inst * adev->umc.channel_inst_num + ch_inst; channel_index = adev->umc.channel_idx_tbl[umc_inst * adev->umc.channel_inst_num + ch_inst]; - mc_umc_status = ras->umc_ecc.ecc[channel_index].mca_umc_status; + mc_umc_status = ras->umc_ecc.ecc[eccinfo_table_idx].mca_umc_status; if (mc_umc_status == 0) return; @@ -140,7 +145,7 @@ static void
RE: [PATCH 3/7] drm/amd/pm: drop the redundant 'supported' member of smu_feature structure
[AMD Official Use Only] > -Original Message- > From: Lazar, Lijo > Sent: Friday, January 21, 2022 4:52 PM > To: Quan, Evan ; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander ; Chen, Guchun > ; Huang, Ray > Subject: Re: [PATCH 3/7] drm/amd/pm: drop the redundant 'supported' > member of smu_feature structure > > > > On 1/21/2022 1:14 PM, Evan Quan wrote: > > As it has exactly the same value as the 'enabled' member and also do > > the same thing. > > > > I believe the original intention is different. We need to cache the features > which are really supported by PMFW on that device on init. When a request > comes through sysfs node for enable/disable of a feature, we should check > against this and disallow those which are not supported. [Quan, Evan] Uh, I doubt it was designed with such intention. As if it was, there should be checks in ->get_allowed_feature_mask(e.g. navi10_get_allowed_feature_mask) whether driver tries to enable/disable different features from PMFW. " When a request comes through sysfs node for enable/disable of a feature" If the sysfs node mentioned is "pp_features", I might be able to provide some insights why there is no such checks added when I made them. Considering there may be some dependency between those dpm features(e.g. gfx ulv depends on gfx dpm), we actually cannot guard user whether their settings will succeed. E.g. If PMFW supports both GFX ULV and DPM but user wants ULV only, the checks against pmfw supported features seem fine. But actually that cannot work due to the dependency between them. So, instead of applying some checks which actually cannot guard anything, we just let user take the risks. 
BR Evan > > Thanks, > Lijo > > > Signed-off-by: Evan Quan > > Change-Id: I07c9a5ac5290cd0d88a40ce1768d393156419b5a > > --- > > drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 1 - > > drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 1 - > > .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 8 > > .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 10 +- > > .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c| 19 --- > > > .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c | 5 + > > .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 5 + > > .../drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c | 3 --- > > drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c| 17 - > > drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h| 3 --- > > 10 files changed, 19 insertions(+), 53 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > > b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > > index 5ace30434e60..d3237b89f2c5 100644 > > --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > > +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > > @@ -949,7 +949,6 @@ static int smu_sw_init(void *handle) > > > > smu->pool_size = adev->pm.smu_prv_buffer_size; > > smu->smu_feature.feature_num = SMU_FEATURE_MAX; > > - bitmap_zero(smu->smu_feature.supported, SMU_FEATURE_MAX); > > bitmap_zero(smu->smu_feature.enabled, SMU_FEATURE_MAX); > > bitmap_zero(smu->smu_feature.allowed, SMU_FEATURE_MAX); > > > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h > > b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h > > index 18f24db7d202..3c0360772822 100644 > > --- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h > > +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h > > @@ -388,7 +388,6 @@ struct smu_power_context { > > struct smu_feature > > { > > uint32_t feature_num; > > - DECLARE_BITMAP(supported, SMU_FEATURE_MAX); > > DECLARE_BITMAP(allowed, SMU_FEATURE_MAX); > > DECLARE_BITMAP(enabled, SMU_FEATURE_MAX); > > }; > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c > > b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c > > index 
84834c24a7e9..9fb290f9aaeb 100644 > > --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c > > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c > > @@ -1625,8 +1625,8 @@ static int navi10_display_config_changed(struct > smu_context *smu) > > int ret = 0; > > > > if ((smu->watermarks_bitmap & WATERMARKS_EXIST) && > > - smu_cmn_feature_is_supported(smu, > SMU_FEATURE_DPM_DCEFCLK_BIT) && > > - smu_cmn_feature_is_supported(smu, > SMU_FEATURE_DPM_SOCCLK_BIT)) { > > + smu_cmn_feature_is_enabled(smu, > SMU_FEATURE_DPM_DCEFCLK_BIT) && > > + smu_cmn_feature_is_enabled(smu, > SMU_FEATURE_DPM_SOCCLK_BIT)) { > > ret = smu_cmn_send_smc_msg_with_param(smu, > SMU_MSG_NumOfDisplays, > > smu->display_config- > >num_display, > > NULL); > > @@ -1864,13 +1864,13 @@ static int > navi10_notify_smc_display_config(struct smu_context *smu) > > min_clocks.dcef_clock_in_sr = smu->display_config- > >min_dcef_deep_sleep_set_clk; > > min_clocks.memory_clock = smu->display_config- > >min_mem_set_clock; > > > > - if
Re: [PATCH 3/7] drm/amd/pm: drop the redundant 'supported' member of smu_feature structure
On 1/21/2022 1:14 PM, Evan Quan wrote:
> As it has exactly the same value as the 'enabled' member and also do
> the same thing.

I believe the original intention is different. We need to cache the
features which are really supported by PMFW on that device on init.
When a request comes through sysfs node for enable/disable of a
feature, we should check against this and disallow those which are
not supported.

Thanks,
Lijo

> Signed-off-by: Evan Quan
> Change-Id: I07c9a5ac5290cd0d88a40ce1768d393156419b5a
> ---
>  drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c     |  1 -
>  drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |  1 -
>  .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   |  8
>  .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 10 +-
>  .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c    | 19 ---
>  .../gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.c  |  5 +
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c    |  5 +
>  .../drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c  |  3 ---
>  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c        | 17 -
>  drivers/gpu/drm/amd/pm/swsmu/smu_cmn.h        |  3 ---
>  10 files changed, 19 insertions(+), 53 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> index 5ace30434e60..d3237b89f2c5 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c
> @@ -949,7 +949,6 @@ static int smu_sw_init(void *handle)
>
>  	smu->pool_size = adev->pm.smu_prv_buffer_size;
>  	smu->smu_feature.feature_num = SMU_FEATURE_MAX;
> -	bitmap_zero(smu->smu_feature.supported, SMU_FEATURE_MAX);
>  	bitmap_zero(smu->smu_feature.enabled, SMU_FEATURE_MAX);
>  	bitmap_zero(smu->smu_feature.allowed, SMU_FEATURE_MAX);
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> index 18f24db7d202..3c0360772822 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h
> @@ -388,7 +388,6 @@ struct smu_power_context {
>  struct smu_feature
>  {
>  	uint32_t feature_num;
> -	DECLARE_BITMAP(supported, SMU_FEATURE_MAX);
>  	DECLARE_BITMAP(allowed, SMU_FEATURE_MAX);
>  	DECLARE_BITMAP(enabled, SMU_FEATURE_MAX);
>  };
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 84834c24a7e9..9fb290f9aaeb 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -1625,8 +1625,8 @@ static int navi10_display_config_changed(struct smu_context *smu)
>  	int ret = 0;
>
>  	if ((smu->watermarks_bitmap & WATERMARKS_EXIST) &&
> -	    smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
> -	    smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
> +	    smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
> +	    smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
>  		ret = smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_NumOfDisplays,
>  						      smu->display_config->num_display,
>  						      NULL);
> @@ -1864,13 +1864,13 @@ static int navi10_notify_smc_display_config(struct smu_context *smu)
>  	min_clocks.dcef_clock_in_sr = smu->display_config->min_dcef_deep_sleep_set_clk;
>  	min_clocks.memory_clock = smu->display_config->min_mem_set_clock;
>
> -	if (smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT)) {
> +	if (smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DPM_DCEFCLK_BIT)) {
>  		clock_req.clock_type = amd_pp_dcef_clock;
>  		clock_req.clock_freq_in_khz = min_clocks.dcef_clock * 10;
>
>  		ret = smu_v11_0_display_clock_voltage_request(smu, &clock_req);
>  		if (!ret) {
> -			if (smu_cmn_feature_is_supported(smu, SMU_FEATURE_DS_DCEFCLK_BIT)) {
> +			if (smu_cmn_feature_is_enabled(smu, SMU_FEATURE_DS_DCEFCLK_BIT)) {
>  				ret = smu_cmn_send_smc_msg_with_param(smu,
>  								      SMU_MSG_SetMinDeepSleepDcefclk,
>  								      min_clocks.dcef_clock_in_sr/100,
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> index 651fe748e423..d568d6137a00 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> @@ -1281,8 +1281,8 @@ static int sienna_cichlid_display_config_changed(struct smu_context *smu)
>  	int ret = 0;
>
>  	if ((smu->watermarks_bitmap & WATERMARKS_EXIST) &&
> -	    smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_DCEFCLK_BIT) &&
> -	    smu_cmn_feature_is_supported(smu, SMU_FEATURE_DPM_SOCCLK_BIT)) {
> +
[PATCH 2/2] add register dump function for nv asic reset
Register dump call on NV ASIC reset on AMD GPU reset

Signed-off-by: Somalapuram Amaranath
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 24
 1 file changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 2ec1ffb36b1f..b5701772597f 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -529,10 +529,34 @@ nv_asic_reset_method(struct amdgpu_device *adev)
 	}
 }
 
+static void amdgpu_reset_dumps(struct amdgpu_device *adev)
+{
+	int r = 0, i;
+
+	/* original raven doesn't have full asic reset */
+	if ((adev->apu_flags & AMD_APU_IS_RAVEN) &&
+	    !(adev->apu_flags & AMD_APU_IS_RAVEN2))
+		return;
+	for (i = 0; i < adev->num_ip_blocks; i++) {
+		if (!adev->ip_blocks[i].status.valid)
+			continue;
+		if (!adev->ip_blocks[i].version->funcs->reset_reg_dumps)
+			continue;
+		r = adev->ip_blocks[i].version->funcs->reset_reg_dumps(adev);
+
+		if (r)
+			DRM_ERROR("reset_reg_dumps of IP block <%s> failed %d\n",
+				  adev->ip_blocks[i].version->funcs->name, r);
+	}
+
+	dump_stack();
+}
+
 static int nv_asic_reset(struct amdgpu_device *adev)
 {
 	int ret = 0;
 
+	amdgpu_reset_dumps(adev);
 	switch (nv_asic_reset_method(adev)) {
 	case AMD_RESET_METHOD_PCI:
 		dev_info(adev->dev, "PCI reset\n");
-- 
2.25.1
[PATCH 1/2] drm/amdgpu: add reset register trace dump function for gfx_v10_0
Implementation of register trace dump function on the AMD GPU resets

Signed-off-by: Somalapuram Amaranath
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h |  8
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c    | 53 ++-
 drivers/gpu/drm/amd/include/amd_shared.h  |  1 +
 3 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index d855cb53c7e0..c97b53b54333 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -537,6 +537,14 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
 	      __entry->seqno)
 );
 
+TRACE_EVENT(gfx_v10_0_reset_reg_dumps,
+	    TP_PROTO(char *reg_dumps),
+	    TP_ARGS(reg_dumps),
+	    TP_STRUCT__entry(__string(dumps, reg_dumps)),
+	    TP_fast_assign(__assign_str(dumps, reg_dumps);),
+	    TP_printk("amdgpu register dump {%s}", __get_str(dumps))
+);
+
 #undef AMDGPU_JOB_GET_TIMELINE_NAME
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index dbe7442fb25c..a63cdd0ad53a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -46,7 +46,7 @@
 #include "v10_structs.h"
 #include "gfx_v10_0.h"
 #include "nbio_v2_3.h"
-
+#include "amdgpu_trace.h"
 /*
  * Navi10 has two graphic rings to share each graphic pipe.
  * 1. Primary ring
@@ -188,6 +188,12 @@
 #define RLCG_ERROR_REPORT_ENABLED(adev) \
 	(amdgpu_sriov_reg_indirect_mmhub(adev) || amdgpu_sriov_reg_indirect_gc(adev))
 
+#define N_REGS (17)
+#define DUMP_REG(addr) do {\
+		(dump)[i][0] = (addr); \
+		(dump)[i++][1] = RREG32_SOC15_IP(GC, addr);\
+	} while (0)
+
 MODULE_FIRMWARE("amdgpu/navi10_ce.bin");
 MODULE_FIRMWARE("amdgpu/navi10_pfp.bin");
 MODULE_FIRMWARE("amdgpu/navi10_me.bin");
@@ -7466,7 +7472,6 @@ static int gfx_v10_0_hw_init(void *handle)
 {
 	int r;
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-
 	if (!amdgpu_emu_mode)
 		gfx_v10_0_init_golden_registers(adev);
 
@@ -7580,6 +7585,49 @@ static int gfx_v10_0_hw_fini(void *handle)
 	return 0;
 }
 
+static int gfx_v10_0_reset_reg_dumps(void *handle)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+	uint32_t i = 0;
+	uint32_t (*dump)[2], n_regs = 0;
+	char *reg_dumps;
+
+	dump = kmalloc(N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
+	reg_dumps = kmalloc(1024, GFP_KERNEL);
+
+	if (dump == NULL || reg_dumps == NULL)
+		return -ENOMEM;
+
+	DUMP_REG(mmGRBM_STATUS2);
+	DUMP_REG(mmGRBM_STATUS_SE0);
+	DUMP_REG(mmGRBM_STATUS_SE1);
+	DUMP_REG(mmGRBM_STATUS_SE2);
+	DUMP_REG(mmGRBM_STATUS_SE3);
+	DUMP_REG(mmSDMA0_STATUS_REG);
+	DUMP_REG(mmSDMA1_STATUS_REG);
+	DUMP_REG(mmCP_STAT);
+	DUMP_REG(mmCP_STALLED_STAT1);
+	DUMP_REG(mmCP_STALLED_STAT1);
+	DUMP_REG(mmCP_STALLED_STAT3);
+	DUMP_REG(mmCP_CPC_STATUS);
+	DUMP_REG(mmCP_CPC_BUSY_STAT);
+	DUMP_REG(mmCP_CPC_STALLED_STAT1);
+	DUMP_REG(mmCP_CPF_STATUS);
+	DUMP_REG(mmCP_CPF_BUSY_STAT);
+	DUMP_REG(mmCP_CPF_STALLED_STAT1);
+
+	n_regs = i;
+
+	for (i = 0; i < n_regs; i++)
+		sprintf(reg_dumps + strlen(reg_dumps), "%08x: %08x, ", dump[i][0], dump[i][1]);
+
+	trace_gfx_v10_0_reset_reg_dumps(reg_dumps);
+	kfree(dump);
+	kfree(reg_dumps);
+
+	return 0;
+}
+
 static int gfx_v10_0_suspend(void *handle)
 {
 	return gfx_v10_0_hw_fini(handle);
@@ -9359,6 +9407,7 @@ static const struct amd_ip_funcs gfx_v10_0_ip_funcs = {
 	.set_clockgating_state = gfx_v10_0_set_clockgating_state,
 	.set_powergating_state = gfx_v10_0_set_powergating_state,
 	.get_clockgating_state = gfx_v10_0_get_clockgating_state,
+	.reset_reg_dumps = gfx_v10_0_reset_reg_dumps,
 };
 
 static const struct amdgpu_ring_funcs gfx_v10_0_ring_funcs_gfx = {
diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h
index 4b9e68a79f06..a165680d73a2 100644
--- a/drivers/gpu/drm/amd/include/amd_shared.h
+++ b/drivers/gpu/drm/amd/include/amd_shared.h
@@ -298,6 +298,7 @@ struct amd_ip_funcs {
 				      enum amd_powergating_state state);
 	void (*get_clockgating_state)(void *handle, u32 *flags);
 	int (*enable_umd_pstate)(void *handle, enum amd_dpm_forced_level *level);
+	int (*reset_reg_dumps)(void *handle);
 };
-- 
2.25.1
[PATCH] drm/amdgpu: drop WARN_ON in amdgpu_gart_bind/unbind
NULL pointer check has guarded it already.

calltrace:
 amdgpu_ttm_gart_bind+0x49/0xa0 [amdgpu]
 amdgpu_ttm_alloc_gart+0x13f/0x180 [amdgpu]
 amdgpu_bo_create_reserved+0x139/0x2c0 [amdgpu]
 ? amdgpu_ttm_debugfs_init+0x120/0x120 [amdgpu]
 amdgpu_bo_create_kernel+0x17/0x80 [amdgpu]
 amdgpu_ttm_init+0x542/0x5e0 [amdgpu]

Fixes: f0239505d6c4 ("drm/amdgpu: remove gart.ready flag")
Signed-off-by: Guchun Chen
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index 53cc844346f0..91d8207336c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -161,7 +161,7 @@ void amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
 	uint64_t flags = 0;
 	int idx;
 
-	if (WARN_ON(!adev->gart.ptr))
+	if (!adev->gart.ptr)
 		return;
 
 	if (!drm_dev_enter(adev_to_drm(adev), &idx))
@@ -241,7 +241,7 @@ void amdgpu_gart_bind(struct amdgpu_device *adev, uint64_t offset,
 		      int pages, dma_addr_t *dma_addr, uint64_t flags)
 {
-	if (WARN_ON(!adev->gart.ptr))
+	if (!adev->gart.ptr)
 		return;
 
 	amdgpu_gart_map(adev, offset, pages, dma_addr, flags, adev->gart.ptr);
-- 
2.17.1
Re: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM
Hi, this is your Linux kernel regression tracker speaking.

On 21.01.22 03:13, James Turner wrote:
> I finished the bisection (log below). The issue was introduced in
> f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)").

FWIW, that was:

> drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)
>
> They are global ACPI methods, so it makes sense to make the structures
> global in the driver. This simplifies a number of things in the
> handling of these methods.
>
> v2: reset the handle if verify interface fails (Lijo)
> v3: fix compilation when ACPI is not defined.
>
> Reviewed-by: Lijo Lazar
> Signed-off-by: Alex Deucher

In that case we need to get those two and the maintainers for the driver
involved by addressing them with this mail. And to make it easy for them
here is a link and a quote from the original report:

https://lore.kernel.org/all/87ee57c8fu@turner.link/

```
> Hi,
>
> With newer kernels, starting with the v5.14 series, when using a MS
> Windows 10 guest VM with PCI passthrough of an AMD Radeon Pro WX 3200
> discrete GPU, the passed-through GPU will not run above 501 MHz, even
> when it is under 100% load and well below the temperature limit. As a
> result, GPU-intensive software (such as video games) runs unusably
> slowly in the VM.
>
> In contrast, with older kernels, the passed-through GPU runs at up to
> 1295 MHz (the correct hardware limit), so GPU-intensive software runs
> at a reasonable speed in the VM.
>
> I've confirmed that the issue exists with the following kernel versions:
>
> - v5.16
> - v5.14
> - v5.14-rc1
>
> The issue does not exist with the following kernels:
>
> - v5.13
> - various packaged (non-vanilla) 5.10.* Arch Linux `linux-lts` kernels
>
> So, the issue was introduced between v5.13 and v5.14-rc1. I'm willing
> to bisect the commit history to narrow it down further, if that would
> be helpful.
>
> The configuration details and test results are provided below. In
> summary, for the kernels with this issue, the GPU core stays at a
> constant 0.8 V, the GPU core clock ranges from 214 MHz to 501 MHz, and
> the GPU memory stays at a constant 625 MHz, in the VM. For the
> correctly working kernels, the GPU core ranges from 0.85 V to 1.0 V,
> the GPU core clock ranges from 214 MHz to 1295 MHz, and the GPU memory
> stays at 1500 MHz, in the VM.
>
> Please let me know if additional information would be helpful.
>
> Regards,
> James Turner
>
> # Configuration Details
>
> Hardware:
>
> - Dell Precision 7540 laptop
> - CPU: Intel Core i7-9750H (x86-64)
> - Discrete GPU: AMD Radeon Pro WX 3200
> - The internal display is connected to the integrated GPU, and
>   external displays are connected to the discrete GPU.
>
> Software:
>
> - KVM host: Arch Linux
>   - self-built vanilla kernel (built using Arch Linux `PKGBUILD`
>     modified to use vanilla kernel sources from git.kernel.org)
>   - libvirt 1:7.10.0-2
>   - qemu 6.2.0-2
>
> - KVM guest: Windows 10
>   - GPU driver: Radeon Pro Software Version 21.Q3 (Note that I also
>     experienced this issue with the 20.Q4 driver, using packaged
>     (non-vanilla) Arch Linux kernels on the host, before updating to
>     the 21.Q3 driver.)
>
> Kernel config:
>
> - For v5.13, v5.14-rc1, and v5.14, I used
>   https://github.com/archlinux/svntogit-packages/blob/89c24952adbfa645d9e1a6f12c572929f7e4e3c7/trunk/config
>   (The build script ran `make olddefconfig` on that config file.)
>
> - For v5.16, I used
>   https://github.com/archlinux/svntogit-packages/blob/94f84e1ad8a530e54aa34cadbaa76e8dcc439d10/trunk/config
>   (The build script ran `make olddefconfig` on that config file.)
>
> I set up the VM with PCI passthrough according to the instructions at
> https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF
>
> I'm passing through the following PCI devices to the VM, as listed by
> `lspci -D -nn`:
>
> :01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc.
>   [AMD/ATI] Lexa XT [Radeon PRO WX 3200] [1002:6981]
> :01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI]
>   Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] [1002:aae0]
>
> The host kernel command line includes the following relevant options:
>
> intel_iommu=on vfio-pci.ids=1002:6981,1002:aae0
>
> to enable IOMMU and bind the `vfio-pci` driver to the PCI devices.
>
> My `/etc/mkinitcpio.conf` includes the following line:
>
> MODULES=(vfio_pci vfio vfio_iommu_type1 vfio_virqfd i915 amdgpu)
>
> to load `vfio-pci` before the graphics drivers. (Note that removing
> `i915 amdgpu` has no effect on this issue.)
>
> I'm using libvirt to manage the VM. The relevant portions of the XML
> file are:
>
> function="0x0"/>
> function="0x0"/>
>
> # Test Results
>
> For testing, I used the following procedure:
>
> 1. Boot the host machine and log in.
>
> 2. Run the following commands to gather information. For
Re: [PATCH 1/3] lib/string_helpers: Consolidate yesno() implementation
On Wed, 2022-01-19 at 16:00 -0500, Steven Rostedt wrote:
> On Wed, 19 Jan 2022 21:25:08 +0200
> Andy Shevchenko wrote:
>
> > > I say keep it one line!
> > >
> > > Reviewed-by: Steven Rostedt (Google)
> >
> > I believe Sakari strongly follows the 80 rule, which means...
>
> Checkpatch says "100" I think we need to simply update the docs (the
> documentation always lags the code ;-)

checkpatch doesn't say anything normally, it's a stupid script. It just
mindlessly bleats a message when a line exceeds 100 chars...

Just fyi: I think it's nicer on a single line too.