[PATCH] drm/amdgpu: Enable mode-1 reset for RAS recovery in fatal error mode

2022-11-14 Thread YiPeng Chai
This patch enables mode-1 reset for RAS recovery in fatal error mode. Signed-off-by: YiPeng Chai Reviewed-by: Hawking Zhang Reviewed-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 7 ++- 2 files changed, 10

RE: [PATCH] drm/amd/pm: Fix output of pp_od_clk_voltage

2022-11-14 Thread Quan, Evan
[AMD Official Use Only - General] > -----Original Message----- > From: amd-gfx On Behalf Of > Jonatas Esteves > Sent: Tuesday, November 15, 2022 7:13 AM > To: amd-gfx@lists.freedesktop.org > Cc: Jonatas Esteves > Subject: [PATCH] drm/amd/pm: Fix output of pp_od_clk_voltage > > Printing the

[PATCH] drm/amd/pm: Fix output of pp_od_clk_voltage

2022-11-14 Thread Jonatas Esteves
Printing the other clock types should not be conditioned on being able to print OD_SCLK. Some GPUs currently have limited capability of only printing a subset of these. Since this condition was introduced in v5.18-rc1, reading from `pp_od_clk_voltage` has been returning empty on the Asus ROG

[PATCH] drm/amd/dc/dce120: Fix audio register mapping, stop triggering KASAN

2022-11-14 Thread Lyude Paul
There's been a very long-running bug that seems to have been neglected for a while, where amdgpu consistently triggers a KASAN error at start: BUG: KASAN: global-out-of-bounds in read_indirect_azalia_reg+0x1d4/0x2a0 [amdgpu] Read of size 4 at addr c2274b28 by task modprobe/1889

[PATCH v2 2/4] drm/display/dp_mst: Fix drm_dp_mst_add_affected_dsc_crtcs() return code

2022-11-14 Thread Lyude Paul
Looks like we're accidentally dropping a pretty important return code here. For some reason, we just return -EINVAL if we fail to get the MST topology state. This is wrong: error codes are important and should never be squashed without being handled, which here seems to have the potential to
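The difference between squashing and propagating an error code can be sketched in plain userspace C. This is only an illustration of the pattern, not the actual patch: err_ptr(), is_err() and ptr_err() are hypothetical stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers, and get_state(), add_affected_squashed() and add_affected_propagated() are invented names.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for ERR_PTR()/IS_ERR()/PTR_ERR(). */
#define MAX_ERRNO 4095

static inline void *err_ptr(intptr_t err) { return (void *)err; }
static inline int is_err(const void *p)
{
    /* Error pointers occupy the top MAX_ERRNO values of the address space. */
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}
static inline intptr_t ptr_err(const void *p) { return (intptr_t)p; }

struct topology_state { int unused; };

/* Hypothetical lookup: fails with the given negative errno, or succeeds on 0. */
static struct topology_state *get_state(int fail_with)
{
    static struct topology_state st;
    return fail_with ? err_ptr(fail_with) : &st;
}

/* Squashed: every failure becomes -EINVAL, so the caller loses the real cause. */
static int add_affected_squashed(int fail_with)
{
    struct topology_state *s = get_state(fail_with);
    if (is_err(s))
        return -EINVAL;
    return 0;
}

/* Propagated: the original error code reaches the caller unchanged. */
static int add_affected_propagated(int fail_with)
{
    struct topology_state *s = get_state(fail_with);
    if (is_err(s))
        return (int)ptr_err(s);
    return 0;
}
```

With the propagated variant, a caller that needs to react to one specific code (for instance -EDEADLK) can still tell it apart from a genuine -EINVAL.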

[PATCH v2 4/4] drm/amdgpu/dm/dp_mst: Don't grab mst_mgr->lock when computing DSC state

2022-11-14 Thread Lyude Paul
Now that we've fixed the issue with using the incorrect topology manager, we're actually grabbing the topology manager's lock - and consequently deadlocking. Luckily for us though, there's actually nothing in AMD's DSC state computation code that really should need this lock. The one exception is

[PATCH v2 3/4] drm/amdgpu/dm/mst: Use the correct topology mgr pointer in amdgpu_dm_connector

2022-11-14 Thread Lyude Paul
This bug hurt me. Basically, it appears that we've been grabbing the entirely wrong mutex in the MST DSC computation code for amdgpu! While we've been grabbing: amdgpu_dm_connector->mst_mgr That's zero-initialized memory, because the only connectors we'll ever actually be doing DSC

[PATCH v2 1/4] drm/amdgpu/mst: Stop ignoring error codes and deadlocking

2022-11-14 Thread Lyude Paul
It appears that amdgpu makes the mistake of completely ignoring the return values from the DP MST helpers, and instead just returns a simple true/false. In this case, it seems to have come back to bite us because as a result of simply returning false from compute_mst_dsc_configs_for_state(),

[PATCH v2 0/4] drm/amdgpu: Regression fixes from MST atomic-only conversion

2022-11-14 Thread Lyude Paul
There was a bit of unexpected fallout from the atomic-only conversion patches that I had pushed a while ago, which mostly affected amdgpu. This fixes most of the severe issues, although we're still investigating some lingering issues (which I suspect may just fix themselves following this

Re: [PATCH 1/2] drm/amdgpu/mst: Stop ignoring error codes and deadlocking

2022-11-14 Thread Lyude Paul
On Wed, 2022-11-09 at 09:48 +, Lin, Wayne wrote: > >    } > > - if (!drm_dp_mst_atomic_check(state) && !debugfs_overwrite) { > > + ret = drm_dp_mst_atomic_check(state); > > + if (ret == 0 && !debugfs_overwrite) { > >    set_dsc_configs_from_fairness_vars(params, vars, count, >

[linux-next:master] BUILD REGRESSION 5c92ddca1053df02387e8006d06094e18cc8538a

2022-11-14 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master branch HEAD: 5c92ddca1053df02387e8006d06094e18cc8538a Add linux-next specific files for 20221114 Error/Warning reports: https://lore.kernel.org/oe-kbuild-all/202211041320.coq8eelj-...@intel.com https

Re: [PATCH 1/2] drm/amdgpu/mst: Stop ignoring error codes and deadlocking

2022-11-14 Thread Lyude Paul
On Wed, 2022-11-09 at 09:48 +, Lin, Wayne wrote: > [Public] > > Thanks, Lyude! > Comments inline. > > > -----Original Message----- > > From: Lyude Paul > > Sent: Saturday, November 5, 2022 7:59 AM > > To: amd-gfx@lists.freedesktop.org > > Cc: Wentland, Harry ; sta...@vger.kernel.org; > >

RE: [PATCH] drm/amd/display: Align dcn314_smu logging with other DCNs

2022-11-14 Thread Limonciello, Mario
[Public] Conceptually makes sense to me, but please see below comments: > -----Original Message----- > From: roman...@amd.com > Sent: Monday, November 14, 2022 15:07 > To: amd-gfx@lists.freedesktop.org; Deucher, Alexander > ; Wentland, Harry > ; Limonciello, Mario > ; Rizvi, Saaem > Cc: Li,

[PATCH] drm/amd/display: Align dcn314_smu logging with other DCNs

2022-11-14 Thread Roman.Li
From: Roman Li [Why] Assert on non-OK response from SMU is unnecessary. It was replaced with respective log message on other ASICs in the past with commit: "drm/amd/display: Removing assert statements for Linux" [How] Remove assert and add dbg logging as on other DCNs. CC: Saaem Rizvi

Re: amd/amdkfd: Fix a memory limit issue

2022-11-14 Thread Limonciello, Mario
On 11/14/2022 12:45, Eric Huang wrote: This resolves a regression in which the application fails to allocate VRAM because no free memory is available. The cause is that a check of vram_pin_size was added to the memory limit; the application pins memory for Peerdirect, and KFD should not count it in the memory limit. So

Re: [PATCH] amd/amdkfd: Fix a memory limit issue

2022-11-14 Thread Felix Kuehling
On 2022-11-14 at 13:45, Eric Huang wrote: This resolves a regression in which the application fails to allocate VRAM because no free memory is available. The cause is that a check of vram_pin_size was added to the memory limit; the application pins memory for Peerdirect, and KFD should not count it in the memory

[PATCH] amd/amdkfd: Fix a memory limit issue

2022-11-14 Thread Eric Huang
This resolves a regression in which the application fails to allocate VRAM because no free memory is available. The cause is that a check of vram_pin_size was added to the memory limit; the application pins memory for Peerdirect, and KFD should not count it in the memory limit. So removing vram_pin_size will resolve it.

Re: [PATCH 4/5] drm/amdgpu: MCBP based on DRM scheduler (v8)

2022-11-14 Thread Michel Dänzer
On 2022-11-10 18:00, Michel Dänzer wrote: > On 2022-11-08 09:01, Zhu, Jiadong wrote: >> >> I reproduced the glxgears 400fps scenario locally. The issue is caused by >> the patch5 "drm/amdgpu: Improve the software rings priority scheduler" which >> slows down the low priority scheduler thread if

Re: [6.1][regression] after commit dd80d9c8eecac8c516da5b240d01a35660ba6cb6 some games (Cyberpunk 2077, Forza Horizon 4/5) hang at start

2022-11-14 Thread Thorsten Leemhuis
Hi, this is your Linux kernel regression tracker. Top-posting for once, to make this easily accessible to everyone. Christian, was any progress made to address this? It looks stalled for 10+ days: I looked for posts and commits that referenced this report, but couldn't find anything. Ciao,

Re: [6.1][regression] after commit dd80d9c8eecac8c516da5b240d01a35660ba6cb6 some games (Cyberpunk 2077, Forza Horizon 4/5) hang at start

2022-11-14 Thread Christian König
Hi Mikhail, On 02.11.22 at 14:43, Christian König wrote: On 02.11.22 at 14:36, Mikhail Gavrilov wrote: On Tue, Nov 1, 2022 at 10:52 PM Christian König wrote: [SNIP] But the most interesting thing is that all previous kernels 6.0, 5.19 are affected by the problem. It is not enough to revert

Re: [Intel-gfx] [PATCH 1/7] drm: mark drm.debug-on-dyndbg as BROKEN for now

2022-11-14 Thread Ville Syrjälä
On Fri, Nov 11, 2022 at 03:17:09PM -0700, Jim Cromie wrote: > drm.debug-on-dyndbg has a regression, due to a chicken-egg > initialization problem: > > 1- modprobe i915 >i915 needs drm.ko, which is loaded 1st > > 2- "modprobe drm drm.debug=0x1ff" (virtual/implied) >drm.debug is set

Re: [PATCH] drm/radeon: fix potential racing issue due to mmap_lock

2022-11-14 Thread Christian König
On 13.11.22 at 13:42, Dawei Li wrote: Both find_vma() and get_user_pages() need explicit protection of mmap lock, fix them by mmap_lock and get_user_pages_fast(). NAK, the MM read lock should already be taken when we reach this function. Could be that this is buggy and the function is called

[PATCH] drm/radeon: fix potential racing issue due to mmap_lock

2022-11-14 Thread Dawei Li
Both find_vma() and get_user_pages() need explicit protection of mmap lock, fix them by mmap_lock and get_user_pages_fast(). Fixes: ddd00e33e17a ("drm/radeon: add userptr flag to limit it to anonymous memory v2") Fixes: f72a113a71ab ("drm/radeon: add userptr support v8") Signed-off-by: Dawei Li

Re: [PATCH] drm/amd/display: add parameter backlight_min

2022-11-14 Thread Filip Moc
On Tue, Nov 01, 2022 at 12:06:55PM -0400, Alex Deucher wrote: > On Tue, Nov 1, 2022 at 11:42 AM Filip Moc wrote: > > > > Hello Alex, > > > > thank you for your response. > > > > Yes, I have HP ENVY x360 Convertible 13-ay1xxx, and backlight_min=2 > > seems to work the best in my case. > > > > I

[PATCH] [next] drm/amdgpu: Replace one-elements array with flex-array members

2022-11-14 Thread Paulo Miguel Almeida
One-element arrays are deprecated, and we are replacing them with flexible array members instead. So, replace one-element array with flexible-array member in structs ATOM_I2C_VOLTAGE_OBJECT_V3, ATOM_ASIC_INTERNAL_SS_INFO_V2, ATOM_ASIC_INTERNAL_SS_INFO_V3, and refactor the rest of the code
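A minimal sketch of the one-element-array versus flexible-array-member layouts, assuming a typical ABI where unsigned int is 4-byte aligned. The struct and function names (legacy_table, flex_table, legacy_size, flex_size, flex_alloc) are hypothetical and are not the ATOM_* structures named in the patch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Deprecated style: a trailing array declared with one element. */
struct legacy_table {
    unsigned char count;
    unsigned int entries[1];
};

/* Preferred style: a C99 flexible array member. */
struct flex_table {
    unsigned char count;
    unsigned int entries[];
};

/* With the one-element array, sizeof() already includes one entry,
 * so size computations need an error-prone "n - 1". */
static size_t legacy_size(size_t n)
{
    return sizeof(struct legacy_table) + (n - 1) * sizeof(unsigned int);
}

/* With the flexible array member, sizeof() covers only the header. */
static size_t flex_size(size_t n)
{
    return sizeof(struct flex_table) + n * sizeof(unsigned int);
}

/* Allocate a table with room for n entries; the caller frees it. */
static struct flex_table *flex_alloc(size_t n)
{
    struct flex_table *t = malloc(flex_size(n));
    if (t)
        t->count = (unsigned char)n;
    return t;
}
```

Because sizeof() changes when the array is converted, any code that computed sizes with the "n - 1" form has to be refactored along with the struct definition.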

[PATCH v2] amdgpu_dm: add missing NULL checks in amdgpu_dm_fini()

2022-11-14 Thread Daniil Tatianin
adev->dm.dc is already checked in a few other if branches of the same function so no reason not to check it everywhere else as well. Moreover, amdgpu_dm_fini() can be called from an error branch in amdgpu_dm_init(), at which point it won't contain a valid dm.dc. This might happen, for example,
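The general pattern, a teardown function that tolerates partially initialized state because the init error path calls it, might look like the sketch below. All names here (struct dc, struct display_manager, dc_destroy, dm_init, dm_fini) are hypothetical userspace stand-ins, not the amdgpu code.

```c
#include <stdlib.h>

struct dc { int streams; };

struct display_manager {
    struct dc *dc;      /* NULL until init has created it */
    int torn_down;
};

/* Dereferences dc, so it must never be called with NULL. */
static void dc_destroy(struct dc *dc)
{
    dc->streams = 0;
    free(dc);
}

/* Teardown that is safe to call even if init failed before creating dc. */
static void dm_fini(struct display_manager *dm)
{
    if (dm->dc) {
        dc_destroy(dm->dc);
        dm->dc = NULL;
    }
    dm->torn_down = 1;
}

/* fail_early simulates an error that occurs before dc is allocated. */
static int dm_init(struct display_manager *dm, int fail_early)
{
    dm->dc = NULL;
    dm->torn_down = 0;

    if (fail_early)
        goto error;

    dm->dc = calloc(1, sizeof(*dm->dc));
    if (!dm->dc)
        goto error;

    return 0;

error:
    dm_fini(dm);    /* may run with dm->dc still NULL */
    return -1;
}
```

Without the NULL check in dm_fini(), the early-failure path would dereference a NULL pointer inside dc_destroy().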