Re: [PATCH] drm/amdgpu: fix for suspend/resume sequence under sriov

2022-11-02 Thread Alex Deucher
On Thu, Nov 3, 2022 at 12:06 AM Victor Zhao  wrote:
>
> - clear kiq ring after suspend/resume under sriov to avoid kiq ring
> test failure
> - update irq after resume to fix kiq interrupt loss
>
> Signed-off-by: Victor Zhao 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 ++
>  2 files changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 522820eeaa59..5b9f992e4607 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4197,6 +4197,8 @@ int amdgpu_device_resume(struct drm_device *dev, bool 
> fbcon)
> }
>
> /* Make sure IB tests flushed */
> +   if (amdgpu_sriov_vf(adev))
> +   amdgpu_irq_gpu_reset_resume_helper(adev);
> flush_delayed_work(&adev->delayed_init_work);
>
> if (adev->in_s0ix) {
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index 7853d3ca58cf..49d34c7bbf20 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -6909,6 +6909,8 @@ static int gfx_v10_0_kiq_init_queue(struct amdgpu_ring 
> *ring)
> mutex_unlock(&adev->srbm_mutex);
> } else {
> memset((void *)mqd, 0, sizeof(*mqd));
> +   if (amdgpu_sriov_vf(adev) && adev->in_suspend)
> +   amdgpu_ring_clear_ring(ring);

gfx_v8_0.c, gfx_v9_0.c, and gfx_v11_0.c need a similar fix.  With
those fixed as well, the patch is:
Acked-by: Alex Deucher 

Alex


> mutex_lock(&adev->srbm_mutex);
> nv_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
> amdgpu_ring_init_mqd(ring);
> --
> 2.25.1
>


RE: [PATCH] drm/amd/amdgpu: temporary workaround to skip ras error for gc_v11_0_3

2022-11-02 Thread Zhang, Hawking
[AMD Official Use Only - General]

Reviewed-by: Hawking Zhang 

Regards,
Hawking
-Original Message-
From: amd-gfx  On Behalf Of Kenneth Feng
Sent: Thursday, November 3, 2022 11:39
To: amd-gfx@lists.freedesktop.org
Cc: Feng, Kenneth 
Subject: [PATCH] drm/amd/amdgpu: temporary workaround to skip ras error for 
gc_v11_0_3

Temporary workaround to skip the RAS error for gc_v11_0_3 until a later IFWI release.

Signed-off-by: Kenneth Feng 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 84a76c36d9a7..dac236a6b3b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -4688,10 +4688,10 @@ static int gfx_v11_0_ras_late_init(void *handle)
 
ret = amdgpu_ras_feature_enable(adev, gfx_common_if, true);
if (ret)
-   dev_err(adev->dev, "Failed to enable gfx11 ras feature\n");
+   dev_warn(adev->dev, "Failed to enable gfx11 ras feature\n");
 
kfree(gfx_common_if);
-   return ret;
+   return 0;
 }
 
 static int gfx_v11_0_late_init(void *handle)
-- 
2.25.1


RE: [PATCH] drm/amdgpu: fix for suspend/resume sequence under sriov

2022-11-02 Thread Zhao, Victor
Hi Alex,

This is a patch fixing the sriov suspend/resume sequence. Please help review.


Thanks,
Victor



-Original Message-
From: Victor Zhao  
Sent: Thursday, November 3, 2022 12:06 PM
To: amd-gfx@lists.freedesktop.org; Deucher, Alexander 

Cc: Zhao, Victor 
Subject: [PATCH] drm/amdgpu: fix for suspend/resume sequence under sriov

- clear kiq ring after suspend/resume under sriov to avoid kiq ring test failure
- update irq after resume to fix kiq interrupt loss

Signed-off-by: Victor Zhao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 522820eeaa59..5b9f992e4607 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4197,6 +4197,8 @@ int amdgpu_device_resume(struct drm_device *dev, bool 
fbcon)
}
 
/* Make sure IB tests flushed */
+   if (amdgpu_sriov_vf(adev))
+   amdgpu_irq_gpu_reset_resume_helper(adev);
flush_delayed_work(&adev->delayed_init_work);
 
if (adev->in_s0ix) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 7853d3ca58cf..49d34c7bbf20 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -6909,6 +6909,8 @@ static int gfx_v10_0_kiq_init_queue(struct amdgpu_ring 
*ring)
mutex_unlock(&adev->srbm_mutex);
} else {
memset((void *)mqd, 0, sizeof(*mqd));
+   if (amdgpu_sriov_vf(adev) && adev->in_suspend)
+   amdgpu_ring_clear_ring(ring);
mutex_lock(&adev->srbm_mutex);
nv_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
amdgpu_ring_init_mqd(ring);
--
2.25.1


[PATCH] drm/amdgpu: fix for suspend/resume sequence under sriov

2022-11-02 Thread Victor Zhao
- clear kiq ring after suspend/resume under sriov to avoid kiq ring
test failure
- update irq after resume to fix kiq interrupt loss

Signed-off-by: Victor Zhao 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 ++
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 522820eeaa59..5b9f992e4607 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4197,6 +4197,8 @@ int amdgpu_device_resume(struct drm_device *dev, bool 
fbcon)
}
 
/* Make sure IB tests flushed */
+   if (amdgpu_sriov_vf(adev))
+   amdgpu_irq_gpu_reset_resume_helper(adev);
flush_delayed_work(&adev->delayed_init_work);
 
if (adev->in_s0ix) {
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 7853d3ca58cf..49d34c7bbf20 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -6909,6 +6909,8 @@ static int gfx_v10_0_kiq_init_queue(struct amdgpu_ring 
*ring)
mutex_unlock(&adev->srbm_mutex);
} else {
memset((void *)mqd, 0, sizeof(*mqd));
+   if (amdgpu_sriov_vf(adev) && adev->in_suspend)
+   amdgpu_ring_clear_ring(ring);
mutex_lock(&adev->srbm_mutex);
nv_grbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
amdgpu_ring_init_mqd(ring);
-- 
2.25.1
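
For readers outside the driver, the gfx_v10_0.c hunk above can be pictured with a minimal userspace sketch. Everything named demo_* below is a hypothetical stand-in rather than amdgpu API, and the NOP value is an arbitrary placeholder. The sketch only shows the shape of the condition: the ring contents are overwritten only when the device is an SR-IOV virtual function and a suspend/resume cycle is in progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RING_DWORDS 8
#define DEMO_NOP 0xffff1000u /* hypothetical NOP packet value, not taken from the driver */

/* Hypothetical, heavily reduced stand-in for a ring buffer. */
struct demo_ring {
	uint32_t buf[DEMO_RING_DWORDS];
};

/* Overwrite every slot with the NOP value so no old packet remains. */
static void demo_ring_clear(struct demo_ring *ring)
{
	assert(ring != NULL);
	for (size_t i = 0; i < DEMO_RING_DWORDS; i++)
		ring->buf[i] = DEMO_NOP;
}

/* Mirrors the condition added by the patch: clear only for a VF in suspend/resume. */
static void demo_kiq_init_queue(struct demo_ring *ring, bool is_sriov_vf,
				bool in_suspend)
{
	if (is_sriov_vf && in_suspend)
		demo_ring_clear(ring);
}
```

Outside that one case the ring is left untouched, which matches the patch adding the call behind a condition rather than unconditionally.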



[PATCH] drm/amd/amdgpu: temporary workaround to skip ras error for gc_v11_0_3

2022-11-02 Thread Kenneth Feng
Temporary workaround to skip the RAS error for gc_v11_0_3 until a later IFWI release.

Signed-off-by: Kenneth Feng 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 84a76c36d9a7..dac236a6b3b6 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -4688,10 +4688,10 @@ static int gfx_v11_0_ras_late_init(void *handle)
 
ret = amdgpu_ras_feature_enable(adev, gfx_common_if, true);
if (ret)
-   dev_err(adev->dev, "Failed to enable gfx11 ras feature\n");
+   dev_warn(adev->dev, "Failed to enable gfx11 ras feature\n");
 
kfree(gfx_common_if);
-   return ret;
+   return 0;
 }
 
 static int gfx_v11_0_late_init(void *handle)
-- 
2.25.1
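
The behavioural change in this workaround is small but easy to miss in the diff: the failure is still logged, but it no longer propagates to the caller. A minimal userspace sketch of the two variants follows. The demo_* functions are hypothetical and only imitate the control flow; they do not call any real RAS code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the feature-enable call; fails when firmware is not ready. */
static int demo_ras_feature_enable(bool firmware_ready)
{
	return firmware_ready ? 0 : -EINVAL;
}

/* Before the patch: log an error and hand the failure to the caller. */
static int demo_ras_late_init_strict(bool firmware_ready)
{
	int ret = demo_ras_feature_enable(firmware_ready);

	if (ret)
		fprintf(stderr, "error: failed to enable ras feature\n");
	return ret;
}

/* After the patch: log a warning, then report success regardless. */
static int demo_ras_late_init_tolerant(bool firmware_ready)
{
	int ret = demo_ras_feature_enable(firmware_ready);

	if (ret)
		fprintf(stderr, "warning: failed to enable ras feature\n");
	return 0;
}
```

The tolerant variant hides the failure from every caller, which is presumably why the patch is described as temporary.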



[PATCH] drm/amdgpu: Disable GFX RAS feature for SRIOV case

2022-11-02 Thread YuBiao Wang
The SR-IOV guest side doesn't need to initialize the RAS feature, so skip it.

Signed-off-by: YuBiao Wang 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 84a76c36d9a7..be8ed617e269 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -4707,7 +4707,7 @@ static int gfx_v11_0_late_init(void *handle)
if (r)
return r;
 
-   if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(11, 0, 3)) {
+   if (!amdgpu_sriov_vf(adev) && adev->ip_versions[GC_HWIP][0] == 
IP_VERSION(11, 0, 3)) {
r = gfx_v11_0_ras_late_init(handle);
if (r)
return r;
-- 
2.25.1
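
The new condition combines a virtualization check with an exact IP version match. The sketch below shows how such a predicate evaluates. DEMO_IP_VERSION is a hypothetical macro; it assumes the version triple is packed as major, minor and revision in descending byte positions, which should be verified against the driver's own IP_VERSION definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed packing: major in bits 23..16, minor in bits 15..8, revision in bits 7..0. */
#define DEMO_IP_VERSION(mj, mn, rv) \
	(((uint32_t)(mj) << 16) | ((uint32_t)(mn) << 8) | (uint32_t)(rv))

/* Same shape as the patched condition: bare metal only, and only GC 11.0.3. */
static bool demo_should_init_gfx_ras(bool is_sriov_vf, uint32_t gc_version)
{
	return !is_sriov_vf && gc_version == DEMO_IP_VERSION(11, 0, 3);
}
```

Because the comparison is an exact equality, other GC 11.x revisions skip RAS late init whether or not they run as a guest.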



RE: [PATCH 4/5] drm/amdgpu: MCBP based on DRM scheduler (v8)

2022-11-02 Thread Zhu, Jiadong
>The bad news is that this series still makes some things very slow. The most 
>extreme examples so far are glxgears (runs at ~400 fps now, ~7000 fps before, 
>i.e. almost 20x slowdown) and hexchat (scrolling one page now takes ~1 second, 
>I can see it drawing line by line; before it was almost instantaneous). I 
>suspect this series makes the overhead of running a single GPU job much 
>bigger. On the bright side, I'm not noticing any significant intermittent 
>freezes anymore.

Hi Michel,

Thanks for trying it.
Are there high-priority jobs running while glxgears executes? I am running 
glxgears while submitting high-priority IBs using amdgpu_test, and the fps ranges 
from 6000 to 8000.

Continuous preemption and resubmission may cause the low fps. Could you check 
how fast the trailing fence seqNo grows? On my side, the Last signaled 
trailing fence value increases by fewer than 10 per second.


cat /sys/kernel/debug/dri/0/amdgpu_fence_info
--- ring 0 (gfx) ---
Last signaled fence  0x0001
Last emitted 0x0001
Last signaled trailing fence 0x013c
Last emitted 0x013c
Last preempted   0x

Thanks,
Jiadong
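
A minimal sketch of the check requested above: read the "Last signaled trailing fence" value twice, a known interval apart, and convert the difference to increments per second. The function name is hypothetical, and treating the sequence number as a 32-bit counter that may wrap is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Increments per second between two samples of a fence sequence number.
 * Unsigned subtraction gives the right delta even if the 32-bit counter
 * wrapped between the samples. Returns 0 when no time has elapsed.
 */
static uint32_t demo_fence_rate_per_sec(uint32_t first, uint32_t second,
					uint32_t interval_ms)
{
	uint32_t delta;

	if (interval_ms == 0)
		return 0;
	delta = second - first;
	return (uint32_t)(((uint64_t)delta * 1000u) / interval_ms);
}
```

A rate far above the figure of fewer than 10 per second reported here would point towards continuous preemption and resubmission.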

-Original Message-
From: Michel Dänzer 
Sent: Wednesday, November 2, 2022 7:26 PM
To: Zhu, Jiadong 
Cc: Tuikov, Luben ; Huang, Ray ; 
Koenig, Christian ; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 4/5] drm/amdgpu: MCBP based on DRM scheduler (v8)


[ Dropping Andrey's no longer working address from Cc ]

On 2022-11-01 11:09, Michel Dänzer wrote:
> On 2022-11-01 10:58, Zhu, Jiadong wrote:
>>
>>> Patch 3 assigns preempt_ib in gfx_v9_0_sw_ring_funcs_gfx, but not in 
>>> gfx_v9_0_ring_funcs_gfx. mux->real_ring in amdgpu_mcbp_trigger_preempt 
>>> presumably uses the latter, which would explain why amdgpu_ring_preempt_ib 
>>> ends up dereferencing a NULL pointer.
>>
>> It's weird the assignment should be in gfx_v9_0_ring_funcs_gfx instead of 
>> gfx_v9_0_sw_ring_funcs_gfx.
>>
>> [PATCH 3/5] drm/amdgpu: Modify unmap_queue format for gfx9 (v4):
>> @@ -6925,6 +7047,7 @@ static const struct amdgpu_ring_funcs 
>> gfx_v9_0_ring_funcs_gfx = {
>> .emit_cntxcntl = gfx_v9_ring_emit_cntxcntl,
>> .init_cond_exec = gfx_v9_0_ring_emit_init_cond_exec,
>> .patch_cond_exec = gfx_v9_0_ring_emit_patch_cond_exec,
>> +   .preempt_ib = gfx_v9_0_ring_preempt_ib,
>> .emit_frame_cntl = gfx_v9_0_ring_emit_frame_cntl,
>> .emit_wreg = gfx_v9_0_ring_emit_wreg,
>> .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait, diff --git
>> a/drivers/gpu/drm/amd/amdgpu/soc15d.h
>> b/drivers/gpu/drm/amd/amdgpu/soc15d.h
>
> Ah! Looks like stg applied patch 3 incorrectly for me. :(
>
> I'll try and test with this fixed this week, and report back.

I'm now running with patch 3 applied correctly, and with patch 5 as well.


The good news is that I'm now seeing a positive effect with GpuTest benchmarks 
which are GPU-limited at low frame rates. In particular, with the pixmark piano 
benchmark, the GNOME Wayland session now actually stays more responsive on this 
machine than it does on my work laptop with an Intel iGPU. However, with the 
plot3d benchmark (with /plot3d_vertex_density=1750 on the command line to 
increase GPU load), it still doesn't quite manage to keep the desktop running 
at full frame rate, in contrast to the Intel iGPU.

The bad news is that this series still makes some things very slow. The most 
extreme examples so far are glxgears (runs at ~400 fps now, ~7000 fps before, 
i.e. almost 20x slowdown) and hexchat (scrolling one page now takes ~1 second, 
I can see it drawing line by line; before it was almost instantaneous). I 
suspect this series makes the overhead of running a single GPU job much bigger. 
On the bright side, I'm not noticing any significant intermittent freezes 
anymore.


In summary, while the benefits are promising, the downsides are unacceptable 
for enabling this by default.


--
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



[pull] amdgpu, amdkfd drm-fixes-6.1

2022-11-02 Thread Alex Deucher
Hi Dave, Daniel,

Fixes for 6.1.  The big change here is the hang fix for the GC11 trap handler.

The following changes since commit 30a0b95b1335e12efef89dd78518ed3e4a71a763:

  Linux 6.1-rc3 (2022-10-30 15:19:28 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git 
tags/amd-drm-fixes-6.1-2022-11-02

for you to fetch changes up to 6640f8e5adb69a0550fe1d224d3ac64c10f00eef:

  drm/amdkfd: update GFX11 CWSR trap handler (2022-11-02 17:16:25 -0400)


amd-drm-fixes-6.1-2022-11-02:

amdgpu:
- DCN 3.1.4 fixes
- DCN 3.2.x fixes
- GC 11.x fixes
- Virtual display fix
- Fail suspend if resources can't be evicted
- SR-IOV fix
- Display PSR fix

amdkfd:
- Fix possible NULL pointer deref
- GC 11.x trap handler fix


Alvin Lee (1):
  drm/amd/display: Enable timing sync on DCN32

Dillon Varone (2):
  drm/amd/display: Update latencies on DCN321
  drm/amd/display: Set memclk levels to be at least 1 for dcn32

Fangzhi Zuo (1):
  drm/amd/display: Ignore Cable ID Feature

Gavin Wan (1):
  drm/amdgpu: Disable GPU reset on SRIOV before remove pci.

George Shen (4):
  drm/amd/display: Fix DCN32 DSC delay calculation
  drm/amd/display: Use forced DSC bpp in DML
  drm/amd/display: Round up DST_after_scaler to nearest int
  drm/amd/display: Add DSC delay factor workaround

Graham Sider (2):
  drm/amdgpu: correct MES debugfs versions
  drm/amdgpu: disable GFXOFF during compute for GFX11

Jay Cornwall (1):
  drm/amdkfd: update GFX11 CWSR trap handler

Jun Lei (1):
  drm/amd/display: Limit dcn32 to 1950Mhz display clock

Leo Chen (1):
  drm/amd/display: Update DSC capabilitie for DCN314

Mario Limonciello (1):
  drm/amd: Fail the suspend if resources can't be evicted

Max Tseng (1):
  drm/amd/display: cursor update command incomplete

Nevenko Stupar (1):
  drm/amd/display: Investigate tool reported FCLK P-state deviations

Yang Li (1):
  drm/amdkfd: Fix NULL pointer dereference in svm_migrate_to_ram()

Yifan Zhang (1):
  drm/amdgpu: set fb_modifiers_not_supported in vkms

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c |   7 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  15 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  10 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c   |   2 +
 drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h | 764 +++--
 .../gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx10.asm |   6 +
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c   |   4 +-
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  |   3 +
 .../amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c   |  11 +-
 drivers/gpu/drm/amd/display/dc/dc.h|   1 +
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c  |   4 +
 .../drm/amd/display/dc/dcn314/dcn314_resource.c|   2 +-
 .../gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c   |   1 +
 .../gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c   |   4 +-
 .../amd/display/dc/dml/dcn32/display_mode_vba_32.c |  10 +-
 .../dc/dml/dcn32/display_mode_vba_util_32.c|   7 +-
 .../dc/dml/dcn32/display_mode_vba_util_32.h|   3 +-
 .../display/dc/dml/dcn32/display_rq_dlg_calc_32.c  |   4 +-
 .../gpu/drm/amd/display/dc/dml/dcn321/dcn321_fpu.c |  15 +-
 .../drm/amd/display/dc/dml/display_mode_structs.h  |   3 +
 .../gpu/drm/amd/display/dc/dml/display_mode_vba.c  |   2 +-
 22 files changed, 464 insertions(+), 417 deletions(-)


[PATCH] drm/amdgpu: Add notifier lock for KFD userptrs

2022-11-02 Thread Felix Kuehling
Add a per-process MMU notifier lock for processing notifiers from
userptrs. Use that lock to properly synchronize page table updates with
MMU notifiers.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|  12 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 202 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c|  12 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mn.h|   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  18 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   4 +
 6 files changed, 158 insertions(+), 93 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index f50e3ba4d7a5..1ca18a77818b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "amdgpu_sync.h"
@@ -75,7 +76,7 @@ struct kgd_mem {
 
uint32_t alloc_flags;
 
-   atomic_t invalid;
+   uint32_t invalid;
struct amdkfd_process_info *process_info;
 
struct amdgpu_sync sync;
@@ -131,7 +132,8 @@ struct amdkfd_process_info {
struct amdgpu_amdkfd_fence *eviction_fence;
 
/* MMU-notifier related fields */
-   atomic_t evicted_bos;
+   struct mutex notifier_lock;
+   uint32_t evicted_bos;
struct delayed_work restore_userptr_work;
struct pid *pid;
bool block_mmu_notifications;
@@ -180,7 +182,8 @@ int kfd_debugfs_kfd_mem_limits(struct seq_file *m, void 
*data);
 bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm);
 struct amdgpu_amdkfd_fence *to_amdgpu_amdkfd_fence(struct dma_fence *f);
 int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct amdgpu_bo *bo);
-int amdgpu_amdkfd_evict_userptr(struct kgd_mem *mem, struct mm_struct *mm);
+int amdgpu_amdkfd_evict_userptr(struct mmu_interval_notifier *mni,
+   unsigned long cur_seq, struct kgd_mem *mem);
 #else
 static inline
 bool amdkfd_fence_check_mm(struct dma_fence *f, struct mm_struct *mm)
@@ -201,7 +204,8 @@ int amdgpu_amdkfd_remove_fence_on_pt_pd_bos(struct 
amdgpu_bo *bo)
 }
 
 static inline
-int amdgpu_amdkfd_evict_userptr(struct kgd_mem *mem, struct mm_struct *mm)
+int amdgpu_amdkfd_evict_userptr(struct mmu_interval_notifier *mni,
+   unsigned long cur_seq, struct kgd_mem *mem)
 {
return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 955fa8c8213b..5510b7c42ac7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -964,7 +964,9 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t 
user_addr,
 * later stage when it is scheduled by another ioctl called by
 * CRIU master process for the target pid for restore.
 */
-   atomic_inc(&mem->invalid);
+   mutex_lock(&process_info->notifier_lock);
+   mem->invalid++;
+   mutex_unlock(&process_info->notifier_lock);
mutex_unlock(&process_info->lock);
return 0;
}
@@ -1301,6 +1303,7 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void 
**process_info,
return -ENOMEM;
 
mutex_init(&info->lock);
+   mutex_init(&info->notifier_lock);
INIT_LIST_HEAD(&info->vm_list_head);
INIT_LIST_HEAD(&info->kfd_bo_list);
INIT_LIST_HEAD(&info->userptr_valid_list);
@@ -1317,7 +1320,6 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void 
**process_info,
}
 
info->pid = get_task_pid(current->group_leader, PIDTYPE_PID);
-   atomic_set(&info->evicted_bos, 0);
INIT_DELAYED_WORK(&info->restore_userptr_work,
  amdgpu_amdkfd_restore_userptr_worker);
 
@@ -1372,6 +1374,7 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void 
**process_info,
put_pid(info->pid);
 create_evict_fence_fail:
mutex_destroy(&info->lock);
+   mutex_destroy(&info->notifier_lock);
kfree(info);
}
return ret;
@@ -1496,6 +1499,7 @@ void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device 
*adev,
cancel_delayed_work_sync(&process_info->restore_userptr_work);
put_pid(process_info->pid);
mutex_destroy(&process_info->lock);
+   mutex_destroy(&process_info->notifier_lock);
kfree(process_info);
}
 }
@@ -1548,7 +1552,9 @@ int amdgpu_amdkfd_criu_resume(void *p)
 
mutex_lock(&pinfo->lock);
pr_debug("scheduling work\n");
-   atomic_inc(&pinfo->evicted_bos);
+   mutex_lock(&pinfo->notifier_lock);
+   pinfo->evicted_bos++;
+   mutex_unlock(&pinfo->notifier_lock);
if (!READ_ONCE(pinfo->block_mmu_notifications)) {
ret = -EINVAL;
goto out_unlock;
@@ 
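
The visible part of this patch replaces two atomic counters with plain integers guarded by one mutex. The userspace sketch below, using pthreads, illustrates what that buys: both counters change inside a single critical section, so a reader holding the same lock never sees one updated without the other. All demo_* names are hypothetical, and updating both counters in one function is an illustration by this sketch, not a claim about the truncated remainder of the diff.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

struct demo_process_info {
	pthread_mutex_t notifier_lock;
	uint32_t evicted_bos; /* protected by notifier_lock */
};

struct demo_mem {
	struct demo_process_info *process_info;
	uint32_t invalid; /* protected by process_info->notifier_lock */
};

static void demo_process_info_init(struct demo_process_info *info)
{
	pthread_mutex_init(&info->notifier_lock, NULL);
	info->evicted_bos = 0;
}

/* Notifier side: bump both counters inside one critical section. */
static void demo_evict_userptr(struct demo_mem *mem)
{
	struct demo_process_info *info = mem->process_info;

	pthread_mutex_lock(&info->notifier_lock);
	mem->invalid++;
	info->evicted_bos++;
	pthread_mutex_unlock(&info->notifier_lock);
}

/* Reader side: take a consistent snapshot of both counters. */
static void demo_snapshot(struct demo_mem *mem, uint32_t *invalid,
			  uint32_t *evicted_bos)
{
	struct demo_process_info *info = mem->process_info;

	pthread_mutex_lock(&info->notifier_lock);
	*invalid = mem->invalid;
	*evicted_bos = info->evicted_bos;
	pthread_mutex_unlock(&info->notifier_lock);
}
```

With separate atomics each increment is individually safe, but nothing orders the pair relative to a reader. The shared lock is what makes the pair consistent.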

[PATCH v2] drm/amdkfd: Fix error handling in criu_checkpoint

2022-11-02 Thread Felix Kuehling
Checkpoint BOs last. That way we don't need to close dmabuf FDs if
something else fails later. This avoids problematic access to user mode
memory in the error handling code path.

criu_checkpoint_bos has its own error handling and cleanup that does not
depend on access to user memory.

criu_restore is updated to match the order in which objects are saved to
make sure restored BOs use the correct private data. Since this is a
change in the layout of the checkpoint private data, bump
KFD_CRIU_PRIV_VERSION.

Fixes: be072b06c739 ("drm/amdkfd: CRIU export BOs as prime dmabuf objects")
Reported-by: Jann Horn 
CC: Rajneesh Bhardwaj 
Signed-off-by: Felix Kuehling 

---

v2: Also changed the order on restore and bump KFD_CRIU_PRIV_VERSION
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 31 
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h|  7 --
 2 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 5feaba6a77de..666edcb40354 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1994,38 +1994,27 @@ static int criu_checkpoint(struct file *filep,
if (ret)
goto exit_unlock;
 
-   ret = criu_checkpoint_bos(p, num_bos, (uint8_t __user *)args->bos,
-   (uint8_t __user *)args->priv_data, &priv_offset);
-   if (ret)
-   goto exit_unlock;
-
if (num_objects) {
ret = kfd_criu_checkpoint_queues(p, (uint8_t __user 
*)args->priv_data,
 &priv_offset);
if (ret)
-   goto close_bo_fds;
+   goto exit_unlock;
 
ret = kfd_criu_checkpoint_events(p, (uint8_t __user 
*)args->priv_data,
 _offset);
if (ret)
-   goto close_bo_fds;
+   goto exit_unlock;
 
ret = kfd_criu_checkpoint_svm(p, (uint8_t __user 
*)args->priv_data, &priv_offset);
if (ret)
-   goto close_bo_fds;
+   goto exit_unlock;
}
 
-close_bo_fds:
-   if (ret) {
-   /* If IOCTL returns err, user assumes all FDs opened in 
criu_dump_bos are closed */
-   uint32_t i;
-   struct kfd_criu_bo_bucket *bo_buckets = (struct 
kfd_criu_bo_bucket *) args->bos;
-
-   for (i = 0; i < num_bos; i++) {
-   if (bo_buckets[i].alloc_flags & 
KFD_IOC_ALLOC_MEM_FLAGS_VRAM)
-   close_fd(bo_buckets[i].dmabuf_fd);
-   }
-   }
+   /* This must be the last thing in this function that can fail.
+* Otherwise we leak dmabuf file descriptors.
+*/
+   ret = criu_checkpoint_bos(p, num_bos, (uint8_t __user *)args->bos,
+  (uint8_t __user *)args->priv_data, &priv_offset);
 
 exit_unlock:
mutex_unlock(&p->mutex);
@@ -2477,11 +2466,11 @@ static int criu_restore(struct file *filep,
if (ret)
goto exit_unlock;
 
-   ret = criu_restore_bos(p, args, &priv_offset, args->priv_data_size);
+   ret = criu_restore_objects(filep, p, args, &priv_offset, 
args->priv_data_size);
if (ret)
goto exit_unlock;
 
-   ret = criu_restore_objects(filep, p, args, &priv_offset, 
args->priv_data_size);
+   ret = criu_restore_bos(p, args, &priv_offset, args->priv_data_size);
if (ret)
goto exit_unlock;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 552c3ac85a13..069977d37605 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -1063,9 +1063,12 @@ void kfd_process_set_trap_handler(struct 
qcm_process_device *qpd,
  * kfd_criu_queue_priv_data
  * kfd_criu_event_priv_data
  * kfd_criu_svm_range_priv_data
+ *
+ * Version history:
+ * 1: Initial upstream version
+ * 2: BOs are saved last to fix and simplify error handling
  */
-
-#define KFD_CRIU_PRIV_VERSION 1
+#define KFD_CRIU_PRIV_VERSION 2
 
 struct kfd_criu_process_priv_data {
uint32_t version;
-- 
2.32.0
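
The ordering argument in the commit message can be shown with a small sketch: when the only step that creates resources needing cleanup runs last and cleans up after itself, no earlier failure path has anything to undo. The demo_* names and the count of three descriptors are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_ckpt {
	bool queues_ok; /* whether the queue step succeeds */
	bool bos_ok;    /* whether the BO step succeeds */
	int open_fds;   /* descriptors currently held open by the BO step */
};

/* A step that creates nothing needing cleanup. */
static int demo_checkpoint_queues(const struct demo_ckpt *c)
{
	return c->queues_ok ? 0 : -1;
}

/* The only step that opens descriptors; it releases them itself on failure. */
static int demo_checkpoint_bos(struct demo_ckpt *c)
{
	c->open_fds = 3;
	if (!c->bos_ok) {
		c->open_fds = 0; /* local cleanup, no access to user memory needed */
		return -1;
	}
	return 0;
}

static int demo_checkpoint(struct demo_ckpt *c)
{
	int ret = demo_checkpoint_queues(c);

	if (ret)
		return ret; /* nothing has been opened yet, so nothing to close */

	/* Must stay the last step that can fail, or descriptors could leak. */
	return demo_checkpoint_bos(c);
}
```

Saving in a new order changes the layout of the saved data, so the restore side has to read it back in that same order. That is the reason the patch also bumps KFD_CRIU_PRIV_VERSION.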



[PATCH 1/1] drm/amdgpu: Drop eviction lock when allocating PT BO

2022-11-02 Thread Philip Yang
Re-take the eviction lock immediately again after the allocation is
completed, to fix circular locking warning with drm_buddy allocator.

Move amdgpu_vm_eviction_lock/unlock/trylock to amdgpu_vm.h as they are
called from multiple files.

Signed-off-by: Philip Yang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 26 ---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h| 26 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c |  2 ++
 3 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 2291aa14d888..003aa9e47085 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -143,32 +143,6 @@ int amdgpu_vm_set_pasid(struct amdgpu_device *adev, struct 
amdgpu_vm *vm,
return 0;
 }
 
-/*
- * vm eviction_lock can be taken in MMU notifiers. Make sure no reclaim-FS
- * happens while holding this lock anywhere to prevent deadlocks when
- * an MMU notifier runs in reclaim-FS context.
- */
-static inline void amdgpu_vm_eviction_lock(struct amdgpu_vm *vm)
-{
-   mutex_lock(&vm->eviction_lock);
-   vm->saved_flags = memalloc_noreclaim_save();
-}
-
-static inline int amdgpu_vm_eviction_trylock(struct amdgpu_vm *vm)
-{
-   if (mutex_trylock(&vm->eviction_lock)) {
-   vm->saved_flags = memalloc_noreclaim_save();
-   return 1;
-   }
-   return 0;
-}
-
-static inline void amdgpu_vm_eviction_unlock(struct amdgpu_vm *vm)
-{
-   memalloc_noreclaim_restore(vm->saved_flags);
-   mutex_unlock(&vm->eviction_lock);
-}
-
 /**
  * amdgpu_vm_bo_evicted - vm_bo is evicted
  *
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 83acb7bd80fe..02240dc2f425 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -495,4 +495,30 @@ static inline uint64_t amdgpu_vm_tlb_seq(struct amdgpu_vm 
*vm)
return atomic64_read(&vm->tlb_seq);
 }
 
+/*
+ * vm eviction_lock can be taken in MMU notifiers. Make sure no reclaim-FS
+ * happens while holding this lock anywhere to prevent deadlocks when
+ * an MMU notifier runs in reclaim-FS context.
+ */
+static inline void amdgpu_vm_eviction_lock(struct amdgpu_vm *vm)
+{
+   mutex_lock(&vm->eviction_lock);
+   vm->saved_flags = memalloc_noreclaim_save();
+}
+
+static inline int amdgpu_vm_eviction_trylock(struct amdgpu_vm *vm)
+{
+   if (mutex_trylock(&vm->eviction_lock)) {
+   vm->saved_flags = memalloc_noreclaim_save();
+   return 1;
+   }
+   return 0;
+}
+
+static inline void amdgpu_vm_eviction_unlock(struct amdgpu_vm *vm)
+{
+   memalloc_noreclaim_restore(vm->saved_flags);
+   mutex_unlock(&vm->eviction_lock);
+}
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
index 358b91243e37..b5f3bba851db 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
@@ -597,7 +597,9 @@ static int amdgpu_vm_pt_alloc(struct amdgpu_device *adev,
if (entry->bo)
return 0;
 
+   amdgpu_vm_eviction_unlock(vm);
r = amdgpu_vm_pt_create(adev, vm, cursor->level, immediate, &pt);
+   amdgpu_vm_eviction_lock(vm);
if (r)
return r;
 
-- 
2.35.1
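
The pattern in the amdgpu_vm_pt.c hunk, dropping a lock around an allocation and re-taking it immediately afterwards, can be sketched in userspace with pthreads. The demo_* names are hypothetical. The re-check after re-locking is an addition of this sketch to keep it safe in a multi-threaded setting; the patch as posted does not show such a check.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

struct demo_vm {
	pthread_mutex_t eviction_lock;
	void *pt; /* protected by eviction_lock */
};

/* The caller holds vm->eviction_lock on entry and still holds it on return. */
static int demo_pt_alloc(struct demo_vm *vm, size_t size)
{
	void *pt;

	if (vm->pt)
		return 0;

	/* Drop the lock so the allocator never runs with it held. */
	pthread_mutex_unlock(&vm->eviction_lock);
	pt = calloc(1, size);
	pthread_mutex_lock(&vm->eviction_lock);

	if (!pt)
		return -ENOMEM;

	/* Sketch-only re-check: another thread may have installed a table meanwhile. */
	if (vm->pt) {
		free(pt);
		return 0;
	}
	vm->pt = pt;
	return 0;
}
```

Any state read before the unlock has to be treated as stale once the lock is re-taken, which is the main cost of this approach.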



Re: [23/33] drm/amd/display: cursor update command incomplete

2022-11-02 Thread Alex Deucher
On Tue, Nov 1, 2022 at 3:27 PM Limonciello, Mario
 wrote:
>
> On 10/20/2022 10:46, Rodrigo Siqueira wrote:
> > From: Max Tseng 
> >
> > The cursor_rect width and height were not being sent to DMUB, although
> > PSR-SU uses this information. The assignments were lost in the last refactor commit.
> >
> > Reviewed-by: Anthony Koo 
> > Acked-by: Rodrigo Siqueira 
> > Signed-off-by: Max Tseng 
> > ---
>
> This was reported to help fix a PSR-SU hang found in 6.1-rc1 and later.
>
> Reported-by: Timur Kristóf 
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2227
> Fixes: b73353f7f3d4 ("drm/amd/display: Use the same cursor info across
> features")
>
> Alex,
>
> Can you please queue this for a future fixes PR for 6.1?

Yes, queued up.

Alex

>
> Thanks,
>
> >   drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c | 4 
> >   1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c 
> > b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
> > index 4996d2810edb..938dba5249d4 100644
> > --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
> > +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c
> > @@ -623,6 +623,10 @@ void hubp2_cursor_set_attributes(
> >   hubp->att.size.bits.width= attr->width;
> >   hubp->att.size.bits.height   = attr->height;
> >   hubp->att.cur_ctl.bits.mode  = attr->color_format;
> > +
> > + hubp->cur_rect.w = attr->width;
> > + hubp->cur_rect.h = attr->height;
> > +
> >   hubp->att.cur_ctl.bits.pitch = hw_pitch;
> >   hubp->att.cur_ctl.bits.line_per_chunk = lpc;
> >   hubp->att.cur_ctl.bits.cur_2x_magnify = 
> > attr->attribute_flags.bits.ENABLE_MAGNIFICATION;
>


[linux-next:master] BUILD REGRESSION 61c3426aca2c71052ddcd06c32e29d92304990fd

2022-11-02 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: 61c3426aca2c71052ddcd06c32e29d92304990fd  Add linux-next specific 
files for 20221102

Error/Warning reports:

https://lore.kernel.org/oe-kbuild-all/202210271517.snuenhd0-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202211021422.8upycnnp-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:4878: warning: This comment 
starts with '/**', but isn't a kernel-doc comment. Refer 
Documentation/doc-guide/kernel-doc.rst
drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_dp.c:5044:24: warning: 
implicit conversion from 'enum <anonymous>' to 'enum dc_status' 
[-Wenum-conversion]
include/asm-generic/div64.h:222:35: warning: comparison of distinct pointer 
types lacks a cast
include/asm-generic/div64.h:234:32: warning: right shift count >= width of type 
[-Wshift-count-overflow]
lib/test_maple_tree.c:453:12: warning: result of comparison of constant 
4398046511104 with expression of type 'unsigned long' is always false 
[-Wtautological-constant-out-of-range-compare]
vmlinux.o: warning: objtool: select_reloc_root+0x3a3: unreachable instruction

Unverified Error/Warning (likely false positive, please contact us if 
interested):

lib/zstd/compress/huf_compress.c:460 HUF_getIndex() warn: the 
'RANK_POSITION_LOG_BUCKETS_BEGIN' macro might need parens
lib/zstd/decompress/zstd_decompress_block.c:1009 ZSTD_execSequence() warn: 
inconsistent indenting
lib/zstd/decompress/zstd_decompress_block.c:894 ZSTD_execSequenceEnd() warn: 
inconsistent indenting
lib/zstd/decompress/zstd_decompress_block.c:942 
ZSTD_execSequenceEndSplitLitBuffer() warn: inconsistent indenting
lib/zstd/decompress/zstd_decompress_internal.h:206 ZSTD_DCtx_get_bmi2() warn: 
inconsistent indenting

Error/Warning ids grouped by kconfigs:

gcc_recent_errors
|-- alpha-allyesconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link_dp.c:warning:implicit-conversion-from-enum-anonymous-to-enum-dc_status
|-- arc-allyesconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link_dp.c:warning:implicit-conversion-from-enum-anonymous-to-enum-dc_status
|   |-- 
include-asm-generic-div64.h:warning:comparison-of-distinct-pointer-types-lacks-a-cast
|   `-- include-asm-generic-div64.h:warning:right-shift-count-width-of-type
|-- arc-randconfig-r024-20221031
|   |-- 
include-asm-generic-div64.h:warning:comparison-of-distinct-pointer-types-lacks-a-cast
|   `-- include-asm-generic-div64.h:warning:right-shift-count-width-of-type
|-- arc-randconfig-r043-20221101
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link_dp.c:warning:implicit-conversion-from-enum-anonymous-to-enum-dc_status
|-- arm-allyesconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link_dp.c:warning:implicit-conversion-from-enum-anonymous-to-enum-dc_status
|   |-- 
include-asm-generic-div64.h:warning:comparison-of-distinct-pointer-types-lacks-a-cast
|   `-- include-asm-generic-div64.h:warning:right-shift-count-width-of-type
|-- arm64-allyesconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link_dp.c:warning:implicit-conversion-from-enum-anonymous-to-enum-dc_status
|-- i386-allyesconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link_dp.c:warning:implicit-conversion-from-enum-anonymous-to-enum-dc_status
|-- ia64-allmodconfig
|   |-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc.c:warning:This-comment-starts-with-but-isn-t-a-kernel-doc-comment.-Refer-Documentation-doc-guide-kernel-doc.rst
|   `-- 
drivers-gpu-drm-amd-amdgpu-..-display-dc-core-dc_link_dp.c:warning:implicit-conversion-from-enum-anonymous-to-enum-dc_status
|-- m68k-randconfig-m041-20221102
|   |-- 
lib-zstd-compress-huf_compress.c-HUF_getIndex()-warn:the-RANK_POSITION_LOG_BUCKETS_BEGIN-macro-might-need-parens
|   |-- 
lib-zstd-decompress-zstd_decompress_block.c-ZSTD_execSequence()-warn:inconsist

Re: [PATCH v2 2/3] drm/amd/display: change GPU match with IP version for Vangogh

2022-11-02 Thread Alex Deucher
On Wed, Nov 2, 2022 at 1:00 PM Perry Yuan  wrote:
>
> Use ip versions (10,3,1) to match the GPU after Vangogh switched to use IP
> discovery path.
>
> Signed-off-by: Perry Yuan 
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 1efe7fa5bc58..90636b88d6bf 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -10202,8 +10202,8 @@ int amdgpu_dm_process_dmub_aux_transfer_sync(bool 
> is_cmd_aux, struct dc_context
>   */
>  bool check_seamless_boot_capability(struct amdgpu_device *adev)
>  {
> -   switch (adev->asic_type) {
> -   case CHIP_VANGOGH:
> +   switch (adev->ip_versions[GC_HWIP][0]) {
> +   case IP_VERSION(10, 3, 1):

How about:
switch (adev->ip_versions[DCE_HWIP][0]) {
case IP_VERSION(3, 0, 1):

Since this code is more relevant to the DC IP than the GC IP.  With
that fixed, the series is:
Reviewed-by: Alex Deucher 

> if (!adev->mman.keep_stolen_vga_memory)
> return true;
> break;
> --
> 2.34.1
>
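
For illustration, the suggestion above (match on the display IP version rather than the graphics IP version) can be sketched outside the driver. This is a minimal standalone sketch, not driver code: the IP_VERSION() packing, struct mock_device, the flattened keep_stolen_vga_memory field, and mock_seamless_boot_capable() are assumptions made for this example only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: major/minor/revision packed into one integer. */
#define IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/* Hypothetical, cut-down stand-ins for the driver structures. */
enum mock_hw_ip { GC_HWIP, DCE_HWIP, MAX_HWIP };

struct mock_device {
	uint32_t ip_versions[MAX_HWIP][1];
	bool keep_stolen_vga_memory;
};

/* Key the decision on the display (DCE) IP version, as suggested in review. */
static bool mock_seamless_boot_capable(const struct mock_device *adev)
{
	switch (adev->ip_versions[DCE_HWIP][0]) {
	case IP_VERSION(3, 0, 1):
		if (!adev->keep_stolen_vga_memory)
			return true;
		break;
	default:
		break;
	}

	return false;
}
```

Because the case label is built from the same macro that fills the version table, the match does not depend on the legacy asic_type value being set.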


[PATCH v2 3/3] drm/amdgpu: remove the DID of Vangogh from pciidlist

2022-11-02 Thread Perry Yuan
Change the Vangogh family to use the IP discovery path to initialize the IP
list. This requires removing the DID from the PCI ID list so that the IP
discovery path can set all the IP versions correctly.

Signed-off-by: Perry Yuan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 429fcdf28836..9c323405e3bb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1909,9 +1909,6 @@ static const struct pci_device_id pciidlist[] = {
{0x1002, 0x73AF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
{0x1002, 0x73BF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_SIENNA_CICHLID},
 
-   /* Van Gogh */
-   {0x1002, 0x163F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VANGOGH|AMD_IS_APU},
-
/* Yellow Carp */
{0x1002, 0x164D, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 
CHIP_YELLOW_CARP|AMD_IS_APU},
{0x1002, 0x1681, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 
CHIP_YELLOW_CARP|AMD_IS_APU},
-- 
2.34.1



[PATCH v2 2/3] drm/amd/display: change GPU match with IP version for Vangogh

2022-11-02 Thread Perry Yuan
Use ip versions (10,3,1) to match the GPU after Vangogh switched to use IP
discovery path.

Signed-off-by: Perry Yuan 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 1efe7fa5bc58..90636b88d6bf 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10202,8 +10202,8 @@ int amdgpu_dm_process_dmub_aux_transfer_sync(bool 
is_cmd_aux, struct dc_context
  */
 bool check_seamless_boot_capability(struct amdgpu_device *adev)
 {
-   switch (adev->asic_type) {
-   case CHIP_VANGOGH:
+   switch (adev->ip_versions[GC_HWIP][0]) {
+   case IP_VERSION(10, 3, 1):
if (!adev->mman.keep_stolen_vga_memory)
return true;
break;
-- 
2.34.1



[PATCH v2 1/3] drm/amdgpu: add Vangogh APU flag to IP discovery path

2022-11-02 Thread Perry Yuan
Add the missing apu flag for Vangogh when using IP discovery code path
to initialize IPs

Signed-off-by: Perry Yuan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
index 95d34590cad1..c1b1f223f3d0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c
@@ -2153,6 +2153,7 @@ int amdgpu_discovery_set_ip_blocks(struct amdgpu_device 
*adev)
break;
case IP_VERSION(10, 3, 1):
adev->family = AMDGPU_FAMILY_VGH;
+   adev->apu_flags |= AMD_APU_IS_VANGOGH;
break;
case IP_VERSION(10, 3, 3):
adev->family = AMDGPU_FAMILY_YC;
-- 
2.34.1



Re: [PATCH 2/2] drm/amdgpu: Fix type of second parameter in odn_edit_dpm_table() callback

2022-11-02 Thread Alex Deucher
Applied the series.  Thanks!

Alex

On Wed, Nov 2, 2022 at 11:43 AM Kees Cook  wrote:
>
> On Wed, Nov 02, 2022 at 08:25:40AM -0700, Nathan Chancellor wrote:
> > With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
> > indirect call targets are validated against the expected function
> > pointer prototype to make sure the call target is valid to help mitigate
> > ROP attacks. If they are not identical, there is a failure at run time,
> > which manifests as either a kernel panic or thread getting killed. A
> > proposed warning in clang aims to catch these at compile time, which
> > reveals:
> >
> >   drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:3008:29: error: 
> > incompatible function pointer types initializing 'int (*)(void *, uint32_t, 
> > long *, uint32_t)' (aka 'int (*)(void *, unsigned int, long *, unsigned 
> > int)') with an expression of type 'int (void *, enum 
> > PP_OD_DPM_TABLE_COMMAND, long *, uint32_t)' (aka 'int (void *, enum 
> > PP_OD_DPM_TABLE_COMMAND, long *, unsigned int)') 
> > [-Werror,-Wincompatible-function-pointer-types-strict]
> >   .odn_edit_dpm_table  = smu_od_edit_dpm_table,
> >  ^
> >   1 error generated.
> >
> > There are only two implementations of ->odn_edit_dpm_table() in 'struct
> > amd_pm_funcs': smu_od_edit_dpm_table() and pp_odn_edit_dpm_table(). One
> > has a second parameter type of 'enum PP_OD_DPM_TABLE_COMMAND' and the
> > other uses 'u32'. Ultimately, smu_od_edit_dpm_table() calls
> > ->od_edit_dpm_table() from 'struct pptable_funcs' and
> > pp_odn_edit_dpm_table() calls ->odn_edit_dpm_table() from 'struct
> > pp_hwmgr_func', which both have a second parameter type of 'enum
> > PP_OD_DPM_TABLE_COMMAND'.
> >
> > Update the type parameter in both the prototype in 'struct amd_pm_funcs'
> > and pp_odn_edit_dpm_table() to 'enum PP_OD_DPM_TABLE_COMMAND', which
> > cleans up the warning.
> >
> > Link: https://github.com/ClangBuiltLinux/linux/issues/1750
> > Reported-by: Sami Tolvanen 
> > Signed-off-by: Nathan Chancellor 
>
> Reviewed-by: Kees Cook 
>
> --
> Kees Cook


Re: [PATCH v2] [next] drm/amdgpu: Replace one-element array with flexible-array member

2022-11-02 Thread Alex Deucher
Applied.  Thanks!

Alex

On Fri, Oct 28, 2022 at 9:31 PM Paulo Miguel Almeida
 wrote:
>
> One-element arrays are deprecated, and we are replacing them with
> flexible array members instead. So, replace one-element array with
> flexible-array member in struct _ATOM_FAKE_EDID_PATCH_RECORD and
> refactor the rest of the code accordingly.
>
> Important to mention is that doing a build before/after this patch
> results in no binary output differences.
>
> This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
> routines on memcpy() and help us make progress towards globally
> enabling -fstrict-flex-arrays=3 [1].
>
> Link: https://github.com/KSPP/linux/issues/79
> Link: https://github.com/KSPP/linux/issues/238
> Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836 [1]
>
> Signed-off-by: Paulo Miguel Almeida 
> ---
> Changelog:
>
> v2: no binary output differences patch; report binary changes findings
> on commit log. Res: Kees Cook
> v1: https://lore.kernel.org/lkml/y1tkwdwpup+ud...@mail.google.com/
> ---
>  drivers/gpu/drm/amd/amdgpu/atombios_encoders.c | 7 +--
>  drivers/gpu/drm/amd/include/atombios.h | 2 +-
>  2 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c 
> b/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
> index 6be9ac2b9c5b..18ae9433e463 100644
> --- a/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
> +++ b/drivers/gpu/drm/amd/amdgpu/atombios_encoders.c
> @@ -2081,8 +2081,11 @@ amdgpu_atombios_encoder_get_lcd_info(struct 
> amdgpu_encoder *encoder)
> }
> }
> record += 
> fake_edid_record->ucFakeEDIDLength ?
> -   
> fake_edid_record->ucFakeEDIDLength + 2 :
> -   
> sizeof(ATOM_FAKE_EDID_PATCH_RECORD);
> + 
> struct_size(fake_edid_record,
> + 
> ucFakeEDIDString,
> + 
> fake_edid_record->ucFakeEDIDLength) :
> + /* empty fake edid record 
> must be 3 bytes long */
> + 
> sizeof(ATOM_FAKE_EDID_PATCH_RECORD) + 1;
> break;
> case LCD_PANEL_RESOLUTION_RECORD_TYPE:
> panel_res_record = 
> (ATOM_PANEL_RESOLUTION_PATCH_RECORD *)record;
> diff --git a/drivers/gpu/drm/amd/include/atombios.h 
> b/drivers/gpu/drm/amd/include/atombios.h
> index 15943bc21bc5..b5b1d073f8e2 100644
> --- a/drivers/gpu/drm/amd/include/atombios.h
> +++ b/drivers/gpu/drm/amd/include/atombios.h
> @@ -4107,7 +4107,7 @@ typedef struct _ATOM_FAKE_EDID_PATCH_RECORD
>  {
>UCHAR ucRecordType;
>UCHAR ucFakeEDIDLength;   // = 128 means EDID length is 128 bytes, 
> otherwise the EDID length = ucFakeEDIDLength*128
> -  UCHAR ucFakeEDIDString[1];// This actually has ucFakeEdidLength 
> elements.
> +  UCHAR ucFakeEDIDString[]; // This actually has ucFakeEdidLength 
> elements.
>  } ATOM_FAKE_EDID_PATCH_RECORD;
>
>  typedef struct  _ATOM_PANEL_RESOLUTION_PATCH_RECORD
> --
> 2.37.3
>
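
The "no binary output differences" claim in the quoted commit log can be checked by hand with a small userspace sketch. Everything below is an assumption made for illustration: the struct names are hypothetical mirrors of the record before and after the patch, and record_size() is a simplified stand-in for the kernel's struct_size() without its overflow handling.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the record after the patch (flexible array). */
struct fake_edid_record {
	uint8_t type;
	uint8_t length;
	uint8_t edid[];		/* actually holds `length` bytes */
};

/* Hypothetical mirror of the old layout with a one-element array. */
struct fake_edid_record_old {
	uint8_t type;
	uint8_t length;
	uint8_t edid[1];
};

/* Simplified stand-in for struct_size(): header plus n trailing bytes. */
static size_t record_size(size_t n)
{
	return sizeof(struct fake_edid_record) + n * sizeof(uint8_t);
}

/* Record advance as computed before the patch. */
static size_t advance_old(uint8_t length)
{
	return length ? (size_t)length + 2 : sizeof(struct fake_edid_record_old);
}

/* Record advance as computed after the patch. */
static size_t advance_new(uint8_t length)
{
	/* An empty fake EDID record must still be 3 bytes long. */
	return length ? record_size(length) : sizeof(struct fake_edid_record) + 1;
}
```

With a two-byte header, both variants advance by 3 bytes for an empty record and by length + 2 otherwise, which is why the explicit "+ 1" is needed once the array member no longer contributes to sizeof.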


Re: [PATCH v2] [next] drm/radeon: Replace one-element array with flexible-array member

2022-11-02 Thread Alex Deucher
On Tue, Nov 1, 2022 at 6:41 PM Kees Cook  wrote:
>
> On Tue, Nov 01, 2022 at 06:09:16PM -0400, Alex Deucher wrote:
> > On Tue, Nov 1, 2022 at 5:54 PM Kees Cook  wrote:
> > > Does the ROM always only have a single byte there? This seems unlikely
> > > given the member "ucFakeEDIDLength" (and the code below).
> >
> > I'm not sure.  I'm mostly concerned about this:
> >
> > record += fake_edid_record->ucFakeEDIDLength ?
> >   fake_edid_record->ucFakeEDIDLength + 2 :
> >   sizeof(ATOM_FAKE_EDID_PATCH_RECORD);
>
> But this is exactly what the code currently does, as noted in the commit
> log: "It's worth mentioning that doing a build before/after this patch
> results in no binary output differences."
>
> > Presumably the record should only exist if ucFakeEDIDLength is non 0,
> > but I don't know if there are some OEMs out there that just included
> > an empty record for some reason.  Maybe the code is wrong today and
> > there are some OEMs that include it and the array is already size 0.
> > In that case, Paulo's original patches are probably more correct.
>
> Right, but if true, that seems to be a distinctly separate bug fix?

You've convinced me.  Applied.

Thanks,

Alex

>
> --
> Kees Cook


Re: [PATCH] drm/amdgpu: workaround for TLB seq race

2022-11-02 Thread Alex Deucher
On Wed, Nov 2, 2022 at 10:58 AM Christian König
 wrote:
>
> It can happen that we query the sequence value before the callback
> had a chance to run.
>
> Work around that by grabbing the fence lock and releasing it again.

workaround

> Should be replaced by hw handling soon.
>
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 15 +++
>  1 file changed, 15 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 9ecb7f663e19..e51a46c9582b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -485,6 +485,21 @@ void amdgpu_debugfs_vm_bo_info(struct amdgpu_vm *vm, 
> struct seq_file *m);
>   */
>  static inline uint64_t amdgpu_vm_tlb_seq(struct amdgpu_vm *vm)
>  {
> +   unsigned long flags;
> +   spinlock_t *lock;
> +
> +   /*
> +* Work around to stop racing between the fence signaling and handling

Workaround

With that fixed up, the patch is:
Acked-by: Alex Deucher 


> +* the cb. The lock is static after initially setting it up, just make
> +* sure that the dma_fence structure isn't freed up.
> +*/
> +   rcu_read_lock();
> +   lock = vm->last_tlb_flush->lock;
> +   rcu_read_unlock();
> +
> +   spin_lock_irqsave(lock, flags);
> +   spin_unlock_irqrestore(lock, flags);
> +
> return atomic64_read(&vm->tlb_seq);
>  }
>
> --
> 2.34.1
>


Re: [PATCH 2/2] drm/amdgpu: Fix type of second parameter in odn_edit_dpm_table() callback

2022-11-02 Thread Kees Cook
On Wed, Nov 02, 2022 at 08:25:40AM -0700, Nathan Chancellor wrote:
> With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
> indirect call targets are validated against the expected function
> pointer prototype to make sure the call target is valid to help mitigate
> ROP attacks. If they are not identical, there is a failure at run time,
> which manifests as either a kernel panic or thread getting killed. A
> proposed warning in clang aims to catch these at compile time, which
> reveals:
> 
>   drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:3008:29: error: 
> incompatible function pointer types initializing 'int (*)(void *, uint32_t, 
> long *, uint32_t)' (aka 'int (*)(void *, unsigned int, long *, unsigned 
> int)') with an expression of type 'int (void *, enum PP_OD_DPM_TABLE_COMMAND, 
> long *, uint32_t)' (aka 'int (void *, enum PP_OD_DPM_TABLE_COMMAND, long *, 
> unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict]
>   .odn_edit_dpm_table  = smu_od_edit_dpm_table,
>  ^
>   1 error generated.
> 
> There are only two implementations of ->odn_edit_dpm_table() in 'struct
> amd_pm_funcs': smu_od_edit_dpm_table() and pp_odn_edit_dpm_table(). One
> has a second parameter type of 'enum PP_OD_DPM_TABLE_COMMAND' and the
> other uses 'u32'. Ultimately, smu_od_edit_dpm_table() calls
> ->od_edit_dpm_table() from 'struct pptable_funcs' and
> pp_odn_edit_dpm_table() calls ->odn_edit_dpm_table() from 'struct
> pp_hwmgr_func', which both have a second parameter type of 'enum
> PP_OD_DPM_TABLE_COMMAND'.
> 
> Update the type parameter in both the prototype in 'struct amd_pm_funcs'
> and pp_odn_edit_dpm_table() to 'enum PP_OD_DPM_TABLE_COMMAND', which
> cleans up the warning.
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/1750
> Reported-by: Sami Tolvanen 
> Signed-off-by: Nathan Chancellor 

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH 1/2] drm/amdgpu: Fix type of second parameter in trans_msg() callback

2022-11-02 Thread Kees Cook
On Wed, Nov 02, 2022 at 08:25:39AM -0700, Nathan Chancellor wrote:
> With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
> indirect call targets are validated against the expected function
> pointer prototype to make sure the call target is valid to help mitigate
> ROP attacks. If they are not identical, there is a failure at run time,
> which manifests as either a kernel panic or thread getting killed. A
> proposed warning in clang aims to catch these at compile time, which
> reveals:
> 
>   drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:412:15: error: incompatible function 
> pointer types initializing 'void (*)(struct amdgpu_device *, u32, u32, u32, 
> u32)' (aka 'void (*)(struct amdgpu_device *, unsigned int, unsigned int, 
> unsigned int, unsigned int)') with an expression of type 'void (struct 
> amdgpu_device *, enum idh_request, u32, u32, u32)' (aka 'void (struct 
> amdgpu_device *, enum idh_request, unsigned int, unsigned int, unsigned 
> int)') [-Werror,-Wincompatible-function-pointer-types-strict]
>   .trans_msg = xgpu_ai_mailbox_trans_msg,
>   ^
>   1 error generated.
> 
>   drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c:435:15: error: incompatible function 
> pointer types initializing 'void (*)(struct amdgpu_device *, u32, u32, u32, 
> u32)' (aka 'void (*)(struct amdgpu_device *, unsigned int, unsigned int, 
> unsigned int, unsigned int)') with an expression of type 'void (struct 
> amdgpu_device *, enum idh_request, u32, u32, u32)' (aka 'void (struct 
> amdgpu_device *, enum idh_request, unsigned int, unsigned int, unsigned 
> int)') [-Werror,-Wincompatible-function-pointer-types-strict]
>   .trans_msg = xgpu_nv_mailbox_trans_msg,
>   ^
>   1 error generated.
> 
> The type of the second parameter in the prototype should be 'enum
> idh_request' instead of 'u32'. Update it to clear up the warnings.
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/1750
> Reported-by: Sami Tolvanen 
> Signed-off-by: Nathan Chancellor 

Reviewed-by: Kees Cook 

-- 
Kees Cook
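
The shape of the fix described above, where the callback prototype and its implementation agree on the enum type of the second parameter, can be sketched in isolation. All names below (enum mock_request, struct mock_device, struct mock_virt_ops, mock_mailbox_trans_msg) are hypothetical stand-ins, not the driver's definitions.

```c
#include <stdint.h>

/* Hypothetical request codes standing in for 'enum idh_request'. */
enum mock_request { MOCK_REQ_INIT = 1, MOCK_REQ_FINI = 2 };

struct mock_device {
	uint32_t last_msg;
};

/* Callback table: the prototype takes the enum, matching the implementation. */
struct mock_virt_ops {
	void (*trans_msg)(struct mock_device *dev, enum mock_request req,
			  uint32_t data1, uint32_t data2, uint32_t data3);
};

static void mock_mailbox_trans_msg(struct mock_device *dev,
				   enum mock_request req,
				   uint32_t data1, uint32_t data2,
				   uint32_t data3)
{
	(void)data1;
	(void)data2;
	(void)data3;
	/* Record the request so a caller can observe the indirect call. */
	dev->last_msg = (uint32_t)req;
}

/* Initializer types are identical, so a strict pointer-type check passes. */
static const struct mock_virt_ops mock_ops = {
	.trans_msg = mock_mailbox_trans_msg,
};
```

If the table instead declared the second parameter as uint32_t, the initializer would still compile with default warnings, but the pointer type and the function type would no longer be identical, which is the mismatch the strict warning reports.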


[PATCH 20/22] drm/amd/display: Add margin for max vblank time for SubVP + DRR

2022-11-02 Thread Alan Liu
From: Alvin Lee 

[Description]
- Incorporate FW delays as part of max VTOTAL calculated for
  SubVP + DRR cases (since it is part of the microschedule).
- Also add margin for the max VTOTAL possible for SubVP + DRR cases.
- Due to rounding errors in FW (integer arithmetic), the microschedule
  calculation can get pushed to the next frame (incorrectly) in cases
  where we use the max VTOTAL possible to complete the MCLK switch.
- When the rounding error occurs, we are only off by 1-2 lines, so use a
  40us margin, which works consistently.

Reviewed-by: Jun Lei 
Reviewed-by: Aric Cyr 
Acked-by: Alan Liu 
Signed-off-by: Alvin Lee 
---
 drivers/gpu/drm/amd/display/dc/dc.h  |  1 +
 drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c | 12 ++--
 .../gpu/drm/amd/display/dc/dcn32/dcn32_resource.c|  1 +
 .../gpu/drm/amd/display/dc/dcn321/dcn321_resource.c  |  1 +
 4 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index d69121809524..1ec1b441d5cb 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -261,6 +261,7 @@ struct dc_caps {
uint32_t cache_line_size;
uint32_t cache_num_ways;
uint16_t subvp_fw_processing_delay_us;
+   uint8_t subvp_drr_max_vblank_margin_us;
uint16_t subvp_prefetch_end_to_mall_start_us;
uint8_t subvp_swath_height_margin_lines; // subvp start line must be 
aligned to 2 x swath height
uint16_t subvp_pstate_allow_width_us;
diff --git a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c 
b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c
index 4cb912bf400b..097556f7b32c 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c
+++ b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c
@@ -477,12 +477,20 @@ static void populate_subvp_cmd_drr_info(struct dc *dc,
(((uint64_t)main_timing->pix_clk_100hz * 100)));
drr_active_us = div64_u64(((uint64_t)drr_timing->v_addressable * 
drr_timing->h_total * 100),
(((uint64_t)drr_timing->pix_clk_100hz * 100)));
-   max_drr_vblank_us = div64_u64((subvp_active_us - prefetch_us - 
drr_active_us), 2) + drr_active_us;
-   max_drr_mallregion_us = subvp_active_us - prefetch_us - mall_region_us;
+   max_drr_vblank_us = div64_u64((subvp_active_us - prefetch_us -
+   dc->caps.subvp_fw_processing_delay_us - drr_active_us), 
2) + drr_active_us;
+   max_drr_mallregion_us = subvp_active_us - prefetch_us - mall_region_us 
- dc->caps.subvp_fw_processing_delay_us;
max_drr_supported_us = max_drr_vblank_us > max_drr_mallregion_us ? 
max_drr_vblank_us : max_drr_mallregion_us;
max_vtotal_supported = div64_u64(((uint64_t)drr_timing->pix_clk_100hz * 
100 * max_drr_supported_us),
(((uint64_t)drr_timing->h_total * 100)));
 
+   /* When calculating the max vtotal supported for SubVP + DRR cases, add
+* margin due to possible rounding errors (being off by 1 line in the
+* FW calculation can incorrectly push the P-State switch to wait 1 
frame
+* longer).
+*/
+   max_vtotal_supported = max_vtotal_supported - 
dc->caps.subvp_drr_max_vblank_margin_us;
+
pipe_data->pipe_config.vblank_data.drr_info.min_vtotal_supported = 
min_vtotal_supported;
pipe_data->pipe_config.vblank_data.drr_info.max_vtotal_supported = 
max_vtotal_supported;
 }
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
index 4bd861427b3c..77e40ee488bd 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
@@ -2117,6 +2117,7 @@ static bool dcn32_resource_construct(
dc->caps.cache_num_ways = 16;
dc->caps.max_cab_allocation_bytes = 67108864; // 64MB = 1024 * 1024 * 64
dc->caps.subvp_fw_processing_delay_us = 15;
+   dc->caps.subvp_drr_max_vblank_margin_us = 40;
dc->caps.subvp_prefetch_end_to_mall_start_us = 15;
dc->caps.subvp_swath_height_margin_lines = 16;
dc->caps.subvp_pstate_allow_width_us = 20;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn321/dcn321_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn321/dcn321_resource.c
index 6292ac515d1a..e5861225f1df 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn321/dcn321_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn321/dcn321_resource.c
@@ -1704,6 +1704,7 @@ static bool dcn321_resource_construct(
dc->caps.cache_num_ways = 16;
dc->caps.max_cab_allocation_bytes = 33554432; // 32MB = 1024 * 1024 * 32
dc->caps.subvp_fw_processing_delay_us = 15;
+   dc->caps.subvp_drr_max_vblank_margin_us = 40;
dc->caps.subvp_prefetch_end_to_mall_start_us = 15;
dc->caps.subvp_swath_height_margin_lines = 16;
dc->caps.subvp_pstate_allow_width_us = 20;
-- 
2.25.1
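
The idea of reserving FW delay and a safety margin before turning a time budget into a line count can be sketched with plain integer arithmetic. This is a simplified illustration under stated assumptions, not the patch's calculation: struct mock_timing and the helper names are hypothetical, and here the margin is taken off the microsecond budget before converting to scanlines, which is a simplification chosen for this example.

```c
#include <stdint.h>

/* Hypothetical timing description; field names loosely follow the patch. */
struct mock_timing {
	uint32_t h_total;	/* pixels per line, including blanking */
	uint32_t pix_clk_100hz;	/* pixel clock in units of 100 Hz */
};

/* Duration of `lines` scanlines in microseconds (rounds down). */
static uint64_t lines_to_us(const struct mock_timing *t, uint64_t lines)
{
	return (lines * t->h_total * 1000000ULL) /
	       ((uint64_t)t->pix_clk_100hz * 100);
}

/* Number of whole scanlines that fit into `us` microseconds. */
static uint64_t us_to_lines(const struct mock_timing *t, uint64_t us)
{
	return ((uint64_t)t->pix_clk_100hz * 100 * us) /
	       ((uint64_t)t->h_total * 1000000ULL);
}

/* Largest line count that fits the budget after reserving delay and margin. */
static uint64_t max_vtotal(const struct mock_timing *t, uint64_t budget_us,
			   uint64_t fw_delay_us, uint64_t margin_us)
{
	uint64_t reserve = fw_delay_us + margin_us;

	if (budget_us <= reserve)
		return 0;

	return us_to_lines(t, budget_us - reserve);
}
```

Both conversions round down, so a result that lands exactly on a frame boundary can be off by a line or two. That is the kind of error a fixed margin is meant to absorb.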



[PATCH 22/22] drm/amd/display: 3.2.211

2022-11-02 Thread Alan Liu
From: Aric Cyr 

DC version 3.2.211 brings along the following fixes:

- Wait for VBLANK during pipe programming
- Adding HDMI SCDC DEVICE_ID define
- Cursor update refactor: PSR-SU support condition
- Update 709 gamma to 2.222 as stated in the standard
- Consider dp cable id only when data is non zero
- Waiting for 1 frame to fix the flash issue on PSR1
- Update SR watermarks for DCN314
- Allow tuning DCN314 bounding box
- Zeromem mypipe heap struct before using it
- Use min transition for SubVP into MPO
- Disable phantom OTG after enable for plane disable
- Disable DRR actions during state commit
- Fix fallback issues for DP LL 1.4a tests
- Fix FCLK deviation and tool compile issues
- Fix reg timeout in enc314_enable_fifo
- Fix gpio port mapping issue
- Only update link settings after successful MST link train
- Enforce minimum prefetch time for low memclk on DCN32
- Set correct EOTF and Gamut flag in VRR info
- Add margin for max vblank time for SubVP + DRR
- Populate DP2.0 output type for DML pipe

Acked-by: Alan Liu 
Signed-off-by: Aric Cyr 
---
 drivers/gpu/drm/amd/display/dc/dc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 1ec1b441d5cb..caed5597d1dc 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -47,7 +47,7 @@ struct aux_payload;
 struct set_config_cmd_payload;
 struct dmub_notification;
 
-#define DC_VER "3.2.210"
+#define DC_VER "3.2.211"
 
 #define MAX_SURFACES 3
 #define MAX_PLANES 6
-- 
2.25.1



[PATCH 21/22] drm/amd/display: Populate DP2.0 output type for DML pipe

2022-11-02 Thread Alan Liu
From: George Shen 

[Why]
DCN3.2 DML logic uses a new output type for DP2.0,
which will enable validation to pass for higher BW
timings that require DP2.0 link rates.

[How]
Populate the DML pipe with DP2.0 output type if
the signal type of the pipe_ctx is 128b/132b.

Reviewed-by: Alvin Lee 
Acked-by: Jasdeep Dhillon 
Signed-off-by: George Shen 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
index 602e885ed52c..75dbb7ee193b 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/dcn20_fpu.c
@@ -1296,6 +1296,8 @@ int dcn20_populate_dml_pipes_from_context(
case SIGNAL_TYPE_DISPLAY_PORT_MST:
case SIGNAL_TYPE_DISPLAY_PORT:
pipes[pipe_cnt].dout.output_type = dm_dp;
+   if (is_dp_128b_132b_signal(_ctx->pipe_ctx[i]))
+   pipes[pipe_cnt].dout.output_type = dm_dp2p0;
break;
case SIGNAL_TYPE_EDP:
pipes[pipe_cnt].dout.output_type = dm_edp;
-- 
2.25.1



[PATCH 18/22] drm/amd/display: Enforce minimum prefetch time for low memclk on DCN32

2022-11-02 Thread Alan Liu
From: Dillon Varone 

[WHY?]
Data return times when using lowest memclk can be <= 60us, which can cause
underflow on high bandwidth displays with a workload.

[HOW?]
Enforce a minimum prefetch time during validation for low memclk modes.

Reviewed-by: Jun Lei 
Acked-by: Alan Liu 
Signed-off-by: Dillon Varone 
---
 drivers/gpu/drm/amd/display/dc/dc.h  |  1 +
 .../gpu/drm/amd/display/dc/dcn32/dcn32_resource.c|  1 +
 .../gpu/drm/amd/display/dc/dcn321/dcn321_resource.c  |  1 +
 drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c |  2 ++
 .../amd/display/dc/dml/dcn32/display_mode_vba_32.c   |  4 
 .../amd/display/dc/dml/dcn32/display_mode_vba_32.h   |  3 +++
 .../display/dc/dml/dcn32/display_mode_vba_util_32.c  | 12 ++--
 .../display/dc/dml/dcn32/display_mode_vba_util_32.h  |  1 +
 .../gpu/drm/amd/display/dc/dml/dcn321/dcn321_fpu.c   |  2 ++
 .../drm/amd/display/dc/dml/display_mode_structs.h|  1 +
 10 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc.h 
b/drivers/gpu/drm/amd/display/dc/dc.h
index 84c82d3a6761..d69121809524 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -864,6 +864,7 @@ struct dc_debug_options {
bool enable_dp_dig_pixel_rate_div_policy;
enum lttpr_mode lttpr_mode_override;
unsigned int dsc_delay_factor_wa_x1000;
+   unsigned int min_prefetch_in_strobe_ns;
 };
 
 struct gpu_info_soc_bounding_box_v1_0;
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
index 4ba9a8662185..4bd861427b3c 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c
@@ -724,6 +724,7 @@ static const struct dc_debug_options debug_defaults_drv = {
.enable_dp_dig_pixel_rate_div_policy = 1,
.allow_sw_cursor_fallback = false,
.alloc_extra_way_for_cursor = true,
+   .min_prefetch_in_strobe_ns = 60000, // 60us
 };
 
 static const struct dc_debug_options debug_defaults_diags = {
diff --git a/drivers/gpu/drm/amd/display/dc/dcn321/dcn321_resource.c 
b/drivers/gpu/drm/amd/display/dc/dcn321/dcn321_resource.c
index 61087f2385a9..6292ac515d1a 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn321/dcn321_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn321/dcn321_resource.c
@@ -722,6 +722,7 @@ static const struct dc_debug_options debug_defaults_drv = {
.enable_dp_dig_pixel_rate_div_policy = 1,
.allow_sw_cursor_fallback = false,
.alloc_extra_way_for_cursor = true,
+   .min_prefetch_in_strobe_ns = 60000, // 60us
 };
 
 static const struct dc_debug_options debug_defaults_diags = {
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
index 0d704e302d03..853ffb704985 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
@@ -2351,6 +2351,8 @@ void dcn32_update_bw_bounding_box_fpu(struct dc *dc, 
struct clk_bw_params *bw_pa
/* DML DSC delay factor workaround */
dcn3_2_ip.dsc_delay_factor_wa = dc->debug.dsc_delay_factor_wa_x1000 / 
1000.0;
 
+   dcn3_2_ip.min_prefetch_in_strobe_us = 
dc->debug.min_prefetch_in_strobe_ns / 1000.0;
+
/* Override dispclk_dppclk_vco_speed_mhz from Clk Mgr */
dcn3_2_soc.dispclk_dppclk_vco_speed_mhz = 
dc->clk_mgr->dentist_vco_freq_khz / 1000.0;
dc->dml.soc.dispclk_dppclk_vco_speed_mhz = 
dc->clk_mgr->dentist_vco_freq_khz / 1000.0;
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
index ae6e6abc620b..244fd15d24b4 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
@@ -786,6 +786,8 @@ static void 
DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
v->SwathHeightY[k],
v->SwathHeightC[k],
TWait,
+   
v->DRAMSpeedPerState[mode_lib->vba.VoltageLevel] <= MEM_STROBE_FREQ_MHZ ?
+   
mode_lib->vba.ip.min_prefetch_in_strobe_us : 0,
/* Output */
&v->DSTXAfterScaler[k],
&v->DSTYAfterScaler[k],
@@ -3245,6 +3247,8 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l

v->swath_width_chroma_ub_this_state[k],

v->SwathHeightYThisState[k],

v->SwathHeightCThisState[k], v->TWait,
+  

[PATCH 19/22] drm/amd/display: Set correct EOTF and Gamut flag in VRR info

2022-11-02 Thread Alan Liu
From: Mike Hsieh 

[Why] FreeSync always uses G2.2 EOTF and Native gamut
[How] Set EOTF and Gamut flags accordingly

Reviewed-by: Krunoslav Kovac 
Acked-by: Alan Liu 
Signed-off-by: Mike Hsieh 
---
 drivers/gpu/drm/amd/display/modules/freesync/freesync.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c 
b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
index 0f39ab9dc5b4..c2e00f7b8381 100644
--- a/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
+++ b/drivers/gpu/drm/amd/display/modules/freesync/freesync.c
@@ -688,10 +688,10 @@ static void build_vrr_infopacket_fs2_data(enum 
color_transfer_func app_tf,
if (app_tf != TRANSFER_FUNC_UNKNOWN) {
infopacket->valid = true;
 
-   infopacket->sb[6] |= 0x08;  // PB6 = [Bit 3 = Native Color 
Active]
-
-   if (app_tf == TRANSFER_FUNC_GAMMA_22) {
-   infopacket->sb[9] |= 0x04;  // PB6 = [Bit 2 = Gamma 2.2 
EOTF Active]
+   if (app_tf != TRANSFER_FUNC_PQ2084) {
+   infopacket->sb[6] |= 0x08;  // PB6 = [Bit 3 = Native 
Color Active]
+   if (app_tf == TRANSFER_FUNC_GAMMA_22)
+   infopacket->sb[9] |= 0x04;  // PB6 = [Bit 2 = 
Gamma 2.2 EOTF Active]
}
}
 }
-- 
2.25.1
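
The flag logic after this change can be exercised in a small standalone sketch. The enum, the struct, and mock_build_fs2_data() below are hypothetical stand-ins that mirror only the branch structure of the diff; the byte offsets and bit values are copied from it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical transfer-function codes for this example. */
enum mock_tf {
	MOCK_TF_UNKNOWN,
	MOCK_TF_GAMMA_22,
	MOCK_TF_PQ2084,
	MOCK_TF_SRGB,
};

struct mock_infopacket {
	bool valid;
	uint8_t sb[32];
};

static void mock_build_fs2_data(enum mock_tf app_tf,
				struct mock_infopacket *infopacket)
{
	if (app_tf == MOCK_TF_UNKNOWN)
		return;

	infopacket->valid = true;

	/* PQ content gets neither flag; everything else is native color. */
	if (app_tf != MOCK_TF_PQ2084) {
		infopacket->sb[6] |= 0x08;	/* native color active */
		if (app_tf == MOCK_TF_GAMMA_22)
			infopacket->sb[9] |= 0x04;	/* gamma 2.2 EOTF active */
	}
}
```

Compared with the old code, the only behavioral difference in this sketch is the PQ case: the packet is still marked valid, but the native color bit is no longer set.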



[PATCH 17/22] drm/amd/display: Only update link settings after successful MST link train

2022-11-02 Thread Alan Liu
From: Michael Strauss 

[WHY]
Currently the driver reduces verified link caps on DPIA devices if a link is
trained at a link rate below the max rate verified during link detection.
This blocks high bandwidth modes after setting a low bandwidth mode.

[HOW]
Only update link rate after a successful link train if link is MST.

Reviewed-by: Mustapha Ghaddar 
Acked-by: Alan Liu 
Signed-off-by: Michael Strauss 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 8 
 drivers/gpu/drm/amd/display/dc/core/dc_link.c | 4 
 drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c  | 7 +--
 drivers/gpu/drm/amd/display/dc/dm_helpers.h   | 5 +
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
index a21e2ba77ddb..b433fab57670 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
@@ -1009,3 +1009,11 @@ void dm_helpers_enable_periodic_detection(struct 
dc_context *ctx, bool enable)
 {
/* TODO: add periodic detection implementation */
 }
+
+void dm_helpers_dp_mst_update_branch_bandwidth(
+   struct dc_context *ctx,
+   struct dc_link *link)
+{
+   // TODO
+}
+
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 6990b64c0211..945e9ae4e630 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -4663,6 +4663,10 @@ void dc_link_set_preferred_training_settings(struct dc 
*dc,
link->preferred_link_setting.link_rate = LINK_RATE_UNKNOWN;
}
 
+   if (link->connector_signal == SIGNAL_TYPE_DISPLAY_PORT &&
+   link->type == dc_connection_mst_branch)
+   dm_helpers_dp_mst_update_branch_bandwidth(dc->ctx, link);
+
/* Retrain now, or wait until next stream update to apply */
if (skip_immediate_retrain == false)
dc_link_set_preferred_link_settings(dc, 
&link->preferred_link_setting, link);
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index cf9191053365..24e1164b1bee 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -2771,8 +2771,11 @@ bool perform_link_training_with_retries(
/* Update verified link settings to 
current one
 * Because DPIA LT might fallback to 
lower link setting.
 */
-   link->verified_link_cap.link_rate = 
link->cur_link_settings.link_rate;
-   link->verified_link_cap.lane_count = 
link->cur_link_settings.lane_count;
+   if (stream->signal == 
SIGNAL_TYPE_DISPLAY_PORT_MST) {
+   
link->verified_link_cap.link_rate = link->cur_link_settings.link_rate;
+   
link->verified_link_cap.lane_count = link->cur_link_settings.lane_count;
+   
dm_helpers_dp_mst_update_branch_bandwidth(link->ctx, link);
+   }
}
} else {
status = dc_link_dp_perform_link_training(link,
diff --git a/drivers/gpu/drm/amd/display/dc/dm_helpers.h 
b/drivers/gpu/drm/amd/display/dc/dm_helpers.h
index 6abbed22bb20..59ab4f392fc9 100644
--- a/drivers/gpu/drm/amd/display/dc/dm_helpers.h
+++ b/drivers/gpu/drm/amd/display/dc/dm_helpers.h
@@ -116,6 +116,11 @@ bool dm_helpers_dp_mst_start_top_mgr(
 bool dm_helpers_dp_mst_stop_top_mgr(
struct dc_context *ctx,
struct dc_link *link);
+
+void dm_helpers_dp_mst_update_branch_bandwidth(
+   struct dc_context *ctx,
+   struct dc_link *link);
+
 /**
  * OS specific aux read callback.
  */
-- 
2.25.1
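
The rule described in [HOW] above can be sketched in isolation as follows. The types and the function name update_verified_cap_after_train() are hypothetical and heavily trimmed; the real code works on struct dc_link and also calls the (currently empty) branch bandwidth helper.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down types for illustration only. */
struct link_settings {
	int link_rate;
	int lane_count;
};

struct link_state {
	struct link_settings verified_cap;	/* best settings proven at detection */
	struct link_settings cur;		/* settings the last training ended with */
};

/* Returns true if the verified caps were overwritten. */
static bool update_verified_cap_after_train(struct link_state *link, bool is_mst)
{
	if (!is_mst)
		return false;	/* non-MST: keep caps so higher modes stay reachable */

	link->verified_cap = link->cur;
	return true;
}
```

Under this sketch, a non-MST link that trained at a lower rate keeps its original verified caps, which is the behaviour the commit message asks for.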



[PATCH 16/22] drm/amd/display: Fix gpio port mapping issue

2022-11-02 Thread Alan Liu
From: Steve Su 

[Why]
1. The GPIO port has a different mapping.

[How]
1. Add a dummy entry in mapping table.
2. Fix incorrect mask bit field access.

Reviewed-by: Alvin Lee 
Acked-by: Alan Liu 
Signed-off-by: Steve Su 
---
 .../amd/display/dc/gpio/dcn32/hw_factory_dcn32.c   | 14 ++
 drivers/gpu/drm/amd/display/dc/gpio/hw_ddc.c   |  9 ++---
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/gpio/dcn32/hw_factory_dcn32.c 
b/drivers/gpu/drm/amd/display/dc/gpio/dcn32/hw_factory_dcn32.c
index d635b73af46f..0ea52ba5ac82 100644
--- a/drivers/gpu/drm/amd/display/dc/gpio/dcn32/hw_factory_dcn32.c
+++ b/drivers/gpu/drm/amd/display/dc/gpio/dcn32/hw_factory_dcn32.c
@@ -107,6 +107,13 @@ static const struct ddc_registers ddc_data_regs_dcn[] = {
ddc_data_regs_dcn2(3),
ddc_data_regs_dcn2(4),
ddc_data_regs_dcn2(5),
+   {
+   // add a dummy entry for cases no such port
+   {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
+   .ddc_setup = 0,
+   .phy_aux_cntl = 0,
+   .dc_gpio_aux_ctrl_5 = 0
+   },
{
DDC_GPIO_VGA_REG_LIST(DATA),
.ddc_setup = 0,
@@ -121,6 +128,13 @@ static const struct ddc_registers ddc_clk_regs_dcn[] = {
ddc_clk_regs_dcn2(3),
ddc_clk_regs_dcn2(4),
ddc_clk_regs_dcn2(5),
+   {
+   // add a dummy entry for cases no such port
+   {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,},
+   .ddc_setup = 0,
+   .phy_aux_cntl = 0,
+   .dc_gpio_aux_ctrl_5 = 0
+   },
{
DDC_GPIO_VGA_REG_LIST(CLK),
.ddc_setup = 0,
diff --git a/drivers/gpu/drm/amd/display/dc/gpio/hw_ddc.c 
b/drivers/gpu/drm/amd/display/dc/gpio/hw_ddc.c
index 6fd38cdd68c0..525bc8881950 100644
--- a/drivers/gpu/drm/amd/display/dc/gpio/hw_ddc.c
+++ b/drivers/gpu/drm/amd/display/dc/gpio/hw_ddc.c
@@ -94,11 +94,14 @@ static enum gpio_result set_config(
 * is required for detection of AUX mode */
if (hw_gpio->base.en != GPIO_DDC_LINE_VIP_PAD) {
if (!ddc_data_pd_en || !ddc_clk_pd_en) {
-
-   REG_SET_2(gpio.MASK_reg, regval,
+   if (hw_gpio->base.en == GPIO_DDC_LINE_DDC_VGA) {
+   // bit 4 of mask has different usage in 
some cases
+   REG_SET(gpio.MASK_reg, regval, 
DC_GPIO_DDC1DATA_PD_EN, 1);
+   } else {
+   REG_SET_2(gpio.MASK_reg, regval,
DC_GPIO_DDC1DATA_PD_EN, 1,
DC_GPIO_DDC1CLK_PD_EN, 1);
-
+   }
if (config_data->type ==

GPIO_CONFIG_TYPE_I2C_AUX_DUAL_MODE)
msleep(3);
-- 
2.25.1
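
Why a dummy table entry helps can be shown with a small, self-contained sketch. It assumes (this is an inference, not stated in the patch) that the register tables are indexed by a line ID and that one ID has no port on this hardware. All names and register values below are hypothetical.

```c
#include <stddef.h>

/* Hypothetical line IDs: this sketch assumes there is no LINE_DDC6 port. */
enum ddc_line {
	LINE_DDC1, LINE_DDC2, LINE_DDC3, LINE_DDC4, LINE_DDC5,
	LINE_DDC6, LINE_VGA, LINE_COUNT
};

struct ddc_regs {
	unsigned int setup;	/* made-up register value */
	const char *name;	/* NULL marks a placeholder entry */
};

static const struct ddc_regs ddc_table[] = {
	{ 0x10, "ddc1" },
	{ 0x20, "ddc2" },
	{ 0x30, "ddc3" },
	{ 0x40, "ddc4" },
	{ 0x50, "ddc5" },
	{ 0,    NULL   },	/* placeholder keeps LINE_VGA at its own index */
	{ 0x60, "vga"  },
};

/* Index the table by line ID; out-of-range IDs yield NULL. */
static const struct ddc_regs *lookup_ddc_regs(enum ddc_line line)
{
	if ((size_t)line >= sizeof(ddc_table) / sizeof(ddc_table[0]))
		return NULL;
	return &ddc_table[line];
}
```

Without the placeholder row, LINE_VGA would index one past the last entry in this sketch, which is the kind of mapping mismatch the patch title refers to.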



[PATCH 13/22] drm/amd/display: Fix fallback issues for DP LL 1.4a tests

2022-11-02 Thread Alan Liu
From: Mustapha Ghaddar 

[WHY]
Unlike DP or USBC, the USB4 link does not get its own encoder and
has to share one; therefore verify_caps is skipped.

[HOW]
Fix the fallback logic for automated tests and take that
into consideration for LT and LS.

Reviewed-by: Jun Lei 
Acked-by: Alan Liu 
Signed-off-by: Mustapha Ghaddar 
---
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 22 ---
 .../drm/amd/display/dc/core/dc_link_dpia.c| 15 -
 drivers/gpu/drm/amd/display/dc/dc_link.h  |  1 +
 3 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 601f78b0b08b..cf9191053365 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -4554,9 +4554,19 @@ void dc_link_dp_handle_link_loss(struct dc_link *link)
 
for (i = 0; i < MAX_PIPES; i++) {
pipe_ctx = &link->dc->current_state->res_ctx.pipe_ctx[i];
-   if (pipe_ctx && pipe_ctx->stream && !pipe_ctx->stream->dpms_off 
&&
-   pipe_ctx->stream->link == link && 
!pipe_ctx->prev_odm_pipe)
+   if (pipe_ctx && pipe_ctx->stream && !pipe_ctx->stream->dpms_off
+   && pipe_ctx->stream->link == link && 
!pipe_ctx->prev_odm_pipe) {
+   // Always use max settings here for DP 1.4a LL 
Compliance CTS
+   if (link->is_automated) {
+   
pipe_ctx->link_config.dp_link_settings.lane_count =
+   
link->verified_link_cap.lane_count;
+   
pipe_ctx->link_config.dp_link_settings.link_rate =
+   
link->verified_link_cap.link_rate;
+   
pipe_ctx->link_config.dp_link_settings.link_spread =
+   
link->verified_link_cap.link_spread;
+   }
core_link_enable_stream(link->dc->current_state, 
pipe_ctx);
+   }
}
 }
 
@@ -4597,6 +4607,8 @@ bool dc_link_handle_hpd_rx_irq(struct dc_link *link, 
union hpd_irq_data *out_hpd
}
 
if (hpd_irq_dpcd_data.bytes.device_service_irq.bits.AUTOMATED_TEST) {
+   // Workaround for DP 1.4a LL Compliance CTS as USB4 has to 
share encoders unlike DP and USBC
+   link->is_automated = true;
device_service_clear.bits.AUTOMATED_TEST = 1;
core_link_write_dpcd(
link,
@@ -7240,6 +7252,7 @@ void dp_retrain_link_dp_test(struct dc_link *link,
struct pipe_ctx *pipes =
&link->dc->current_state->res_ctx.pipe_ctx[0];
unsigned int i;
+   bool do_fallback = false;
 
 
for (i = 0; i < MAX_PIPES; i++) {
@@ -7272,13 +7285,16 @@ void dp_retrain_link_dp_test(struct dc_link *link,
memset(&link->cur_link_settings, 0,
sizeof(link->cur_link_settings));
 
+   if (link->ep_type == DISPLAY_ENDPOINT_USB4_DPIA)
+   do_fallback = true;
+
perform_link_training_with_retries(
link_setting,
skip_video_pattern,
LINK_TRAINING_ATTEMPTS,
&pipes[i],
SIGNAL_TYPE_DISPLAY_PORT,
-   false);
+   do_fallback);
 
link->dc->hwss.enable_stream(&pipes[i]);
 
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dpia.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dpia.c
index 74e36b34d3f7..d130d58ac08e 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dpia.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dpia.c
@@ -791,10 +791,14 @@ static enum link_training_result 
dpia_training_eq_transparent(
}
 
if (dp_is_ch_eq_done(lane_count, dpcd_lane_status) &&
-   dp_is_symbol_locked(link->cur_link_settings.lane_count, 
dpcd_lane_status) &&
-   dp_is_interlane_aligned(dpcd_lane_status_updated)) {
-   result =  LINK_TRAINING_SUCCESS;
-   break;
+   
dp_is_symbol_locked(link->cur_link_settings.lane_count, dpcd_lane_status)) {
+   /* Take into consideration corner case for DP 1.4a LL 
Compliance CTS as USB4
+* has to share encoders unlike DP and USBC
+*/
+   if (dp_is_interlane_aligned(dpcd_lane_status_updated) 
|| (link->is_automated && retries_eq)) {
+   result =  LINK_TRAINING_SUCCESS;
+   break;
+   

[PATCH 12/22] drm/amd/display: Disable DRR actions during state commit

2022-11-02 Thread Alan Liu
From: Wesley Chalmers 

[WHY]
Committing a state while performing DRR actions can cause underflow.

[HOW]
Disable features performing DRR actions during state commit.
Need to follow up on why DRR actions affect state commit.

Reviewed-by: Jun Lei 
Acked-by: Alan Liu 
Signed-off-by: Wesley Chalmers 
---
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
index 8c5045711264..c20e9f76f021 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c
@@ -992,8 +992,5 @@ void dcn30_prepare_bandwidth(struct dc *dc,
dc->clk_mgr->funcs->set_max_memclk(dc->clk_mgr, 
dc->clk_mgr->bw_params->clk_table.entries[dc->clk_mgr->bw_params->clk_table.num_entries
 - 1].memclk_mhz);
 
dcn20_prepare_bandwidth(dc, context);
-
-   dc_dmub_srv_p_state_delegate(dc,
-   context->bw_ctx.bw.dcn.clk.fw_based_mclk_switching, context);
 }
 
-- 
2.25.1



[PATCH 15/22] drm/amd/display: Fix reg timeout in enc314_enable_fifo

2022-11-02 Thread Alan Liu
From: Nicholas Kazlauskas 

[Why]
The link enablement sequence can end up resetting the encoder while
the PHY symclk isn't yet on.

This means that waiting for symclk on will timeout, along with the reset
bit never asserting high.

This causes unnecessary delay when enabling the link and produces a
warning affecting multiple IGT tests.

[How]
Don't wait for the symclk to be on here because firmware already does.

Don't wait for reset if we know the symclk isn't on.

Split the reset into a helper function that checks the bit and decides
whether or not a delay is sufficient.

Reviewed-by: Roman Li 
Acked-by: Alan Liu 
Signed-off-by: Nicholas Kazlauskas 
---
 .../dc/dcn314/dcn314_dio_stream_encoder.c | 24 ++-
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c 
b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
index 7e773bf7b895..38842f938bed 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn314/dcn314_dio_stream_encoder.c
@@ -49,18 +49,30 @@
 #define CTX \
enc1->base.ctx
 
+static void enc314_reset_fifo(struct stream_encoder *enc, bool reset)
+{
+   struct dcn10_stream_encoder *enc1 = DCN10STRENC_FROM_STRENC(enc);
+   uint32_t reset_val = reset ? 1 : 0;
+   uint32_t is_symclk_on;
+
+   REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_RESET, reset_val);
+   REG_GET(DIG_FE_CNTL, DIG_SYMCLK_FE_ON, &is_symclk_on);
+
+   if (is_symclk_on)
+   REG_WAIT(DIG_FIFO_CTRL0, DIG_FIFO_RESET_DONE, reset_val, 10, 
5000);
+   else
+   udelay(10);
+}
 
 static void enc314_enable_fifo(struct stream_encoder *enc)
 {
struct dcn10_stream_encoder *enc1 = DCN10STRENC_FROM_STRENC(enc);
 
-   /* TODO: Confirm if we need to wait for DIG_SYMCLK_FE_ON */
-   REG_WAIT(DIG_FE_CNTL, DIG_SYMCLK_FE_ON, 1, 10, 5000);
REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_READ_START_LEVEL, 0x7);
-   REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_RESET, 1);
-   REG_WAIT(DIG_FIFO_CTRL0, DIG_FIFO_RESET_DONE, 1, 10, 5000);
-   REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_RESET, 0);
-   REG_WAIT(DIG_FIFO_CTRL0, DIG_FIFO_RESET_DONE, 0, 10, 5000);
+
+   enc314_reset_fifo(enc, true);
+   enc314_reset_fifo(enc, false);
+
REG_UPDATE(DIG_FIFO_CTRL0, DIG_FIFO_ENABLE, 1);
 }
 
-- 
2.25.1
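
The decision made by the new helper (poll for reset-done only when the symclk is running, otherwise fall back to a short fixed delay) can be modelled without hardware. The sketch below uses a hypothetical struct fake_encoder in place of the real register macros; the counters exist only so the behaviour can be observed.

```c
#include <stdbool.h>

/* Hypothetical register model; the driver itself uses REG_UPDATE/REG_GET/REG_WAIT. */
struct fake_encoder {
	bool symclk_on;		/* stands in for DIG_SYMCLK_FE_ON */
	bool fifo_reset;	/* stands in for DIG_FIFO_RESET */
	bool fifo_reset_done;	/* stands in for DIG_FIFO_RESET_DONE */
	int polls;		/* number of polling iterations performed */
	int fixed_delays;	/* number of times the fixed delay path ran */
};

/* Modelled hardware: RESET_DONE only follows RESET while symclk runs. */
static void fake_hw_tick(struct fake_encoder *enc)
{
	if (enc->symclk_on)
		enc->fifo_reset_done = enc->fifo_reset;
}

static void reset_fifo(struct fake_encoder *enc, bool reset)
{
	int tries;

	enc->fifo_reset = reset;

	if (!enc->symclk_on) {
		/* Polling could never succeed here; stands in for udelay(10). */
		enc->fixed_delays++;
		return;
	}

	/* Bounded poll, analogous to a REG_WAIT with a retry limit. */
	for (tries = 0; tries < 500; tries++) {
		enc->polls++;
		fake_hw_tick(enc);
		if (enc->fifo_reset_done == reset)
			break;
	}
}
```

In this model, calling reset_fifo() with the clock off never enters the polling loop, which corresponds to the timeout and warning the commit message wants to avoid.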



[PATCH 11/22] drm/amd/display: Disable phantom OTG after enable for plane disable

2022-11-02 Thread Alan Liu
From: Alvin Lee 

[Description]
- Need to disable phantom OTG after it's enabled
  in order to restore it to its original state.
- If it's enabled and then an MCLK switch comes in
  we may not prefetch the correct data since the phantom
  OTG could already be in the middle of the frame.

Reviewed-by: Jun Lei 
Acked-by: Alan Liu 
Signed-off-by: Alvin Lee 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c   | 14 +-
 drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c  |  8 
 .../drm/amd/display/dc/inc/hw/timing_generator.h   |  1 +
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index da808996e21d..9c3704c4d7e4 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1055,6 +1055,7 @@ static void disable_dangling_plane(struct dc *dc, struct 
dc_state *context)
struct dc_state *dangling_context = dc_create_state(dc);
struct dc_state *current_ctx;
struct pipe_ctx *pipe;
+   struct timing_generator *tg;
 
if (dangling_context == NULL)
return;
@@ -1098,6 +1099,7 @@ static void disable_dangling_plane(struct dc *dc, struct 
dc_state *context)
 
if (should_disable && old_stream) {
pipe = &dc->current_state->res_ctx.pipe_ctx[i];
+   tg = pipe->stream_res.tg;
/* When disabling plane for a phantom pipe, we must 
turn on the
 * phantom OTG so the disable programming gets the 
double buffer
 * update. Otherwise the pipe will be left in a 
partially disabled
@@ -1105,7 +1107,8 @@ static void disable_dangling_plane(struct dc *dc, struct 
dc_state *context)
 * again for different use.
 */
if (old_stream->mall_stream_config.type == 
SUBVP_PHANTOM) {
-   
pipe->stream_res.tg->funcs->enable_crtc(pipe->stream_res.tg);
+   if (tg->funcs->enable_crtc)
+   tg->funcs->enable_crtc(tg);
}
dc_rem_all_planes_for_stream(dc, old_stream, 
dangling_context);
disable_all_writeback_pipes_for_stream(dc, old_stream, 
dangling_context);
@@ -1122,6 +1125,15 @@ static void disable_dangling_plane(struct dc *dc, struct 
dc_state *context)
dc->hwss.interdependent_update_lock(dc, 
dc->current_state, false);
dc->hwss.post_unlock_program_front_end(dc, 
dangling_context);
}
+   /* We need to put the phantom OTG back into it's 
default (disabled) state or we
+* can get corruption when transition from one SubVP 
config to a different one.
+* The OTG is set to disable on falling edge of VUPDATE 
so the plane disable
+* will still get it's double buffer update.
+*/
+   if (old_stream->mall_stream_config.type == 
SUBVP_PHANTOM) {
+   if (tg->funcs->disable_phantom_crtc)
+   tg->funcs->disable_phantom_crtc(tg);
+   }
}
}
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c 
b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c
index 2b33eeb213e2..2ee798965bc2 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c
@@ -167,6 +167,13 @@ static void optc32_phantom_crtc_post_enable(struct 
timing_generator *optc)
REG_WAIT(OTG_CLOCK_CONTROL, OTG_BUSY, 0, 1, 10);
 }
 
+static void optc32_disable_phantom_otg(struct timing_generator *optc)
+{
+   struct optc *optc1 = DCN10TG_FROM_TG(optc);
+
+   REG_UPDATE(OTG_CONTROL, OTG_MASTER_EN, 0);
+}
+
 static void optc32_set_odm_bypass(struct timing_generator *optc,
const struct dc_crtc_timing *dc_crtc_timing)
 {
@@ -260,6 +267,7 @@ static struct timing_generator_funcs dcn32_tg_funcs = {
.enable_crtc = optc32_enable_crtc,
.disable_crtc = optc32_disable_crtc,
.phantom_crtc_post_enable = optc32_phantom_crtc_post_enable,
+   .disable_phantom_crtc = optc32_disable_phantom_otg,
/* used by enable_timing_synchronization. Not need for FPGA */
.is_counter_moving = optc1_is_counter_moving,
.get_position = optc1_get_position,
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h 
b/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h
index 65f18f9dad34..43eb61961e0f 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/timing_generator.h
@@ 

[PATCH 14/22] drm/amd/display: Fix FCLK deviation and tool compile issues

2022-11-02 Thread Alan Liu
From: Chaitanya Dhere 

[Why]
Recent backports from open source do not have a header inclusion pattern
that is consistent with the inclusion style in the rest of the file. This
breaks the internal tool builds as well. A recent commit erroneously
modified the original DML formula for calculating
ActiveClockChangeLatencyHidingY. This resulted in an FCLK deviation
from the golden values.

[How]
Change the way in which display_mode_vba.h is included so that it is
consistent with the inclusion style in the rest of the file, which also
fixes the tool build. Restore the DML formula to its original state to fix
the FCLK deviation.

Reviewed-by: Aurabindo Pillai 
Reviewed-by: Jun Lei 
Acked-by: Alan Liu 
Signed-off-by: Chaitanya Dhere 
---
 .../gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c | 2 +-
 .../gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git 
a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
index 968924c491c1..ab9217732a17 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
@@ -4397,7 +4397,7 @@ void 
dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
 
if (v->NumberOfActiveSurfaces > 1) {
ActiveClockChangeLatencyHidingY = 
ActiveClockChangeLatencyHidingY
-   - (1 - 1 / v->NumberOfActiveSurfaces) * 
SwathHeightY[k] * v->HTotal[k]
+   - (1.0 - 1.0 / 
v->NumberOfActiveSurfaces) * SwathHeightY[k] * v->HTotal[k]
/ v->PixelClock[k] / 
v->VRatio[k];
}
 
diff --git 
a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.h 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.h
index 2c3827546ac7..fdccaa93eb2e 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.h
@@ -30,7 +30,7 @@
 #include "os_types.h"
 #include "../dc_features.h"
 #include "../display_mode_structs.h"
-#include "dml/display_mode_vba.h"
+#include "../display_mode_vba.h"
 
 unsigned int dml32_dscceComputeDelay(
unsigned int bpc,
-- 
2.25.1
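
The formula change in the .c hunk above comes down to integer versus floating-point division. A minimal sketch, assuming NumberOfActiveSurfaces is an integer type (which the fix suggests); the function names are made up for illustration:

```c
/*
 * Integer form: for n > 1, 1 / n truncates to 0, so the whole factor
 * collapses to 1 regardless of n. Callers must pass n >= 1.
 */
static double scale_factor_truncating(unsigned int n)
{
	return 1 - 1 / n;
}

/* Floating-point form: yields the intended fraction (n - 1) / n. */
static double scale_factor_fractional(unsigned int n)
{
	return 1.0 - 1.0 / n;
}
```

With two active surfaces the truncating form gives a factor of 1 while the fractional form gives 0.5, so the subtracted term is twice as large in the truncating version.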



[PATCH 10/22] drm/amd/display: Use min transition for SubVP into MPO

2022-11-02 Thread Alan Liu
From: Alvin Lee 

[Description]
- For SubVP transitioning into MPO, we want to
  use a minimal transition to prevent transient
  underflow
- Transitioning a phantom pipe directly into a
  "real" pipe can result in underflow due to the
  HUBP still having its "phantom" programming
  when HUBP is unblanked (have to wait for next
  VUPDATE of the new OTG)
- Also ensure subvp pipe lock is acquired early
  enough for programming in dc_commit_state_no_check
- When disabling phantom planes, enable phantom OTG
  first so the disable gets the double buffer update

Reviewed-by: Aric Cyr 
Acked-by: Alan Liu 
Signed-off-by: Alvin Lee 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 43 +++-
 1 file changed, 20 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc.c
index d446e6098948..da808996e21d 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -1054,6 +1054,7 @@ static void disable_dangling_plane(struct dc *dc, struct 
dc_state *context)
int i, j;
struct dc_state *dangling_context = dc_create_state(dc);
struct dc_state *current_ctx;
+   struct pipe_ctx *pipe;
 
if (dangling_context == NULL)
return;
@@ -1096,6 +1097,16 @@ static void disable_dangling_plane(struct dc *dc, struct 
dc_state *context)
}
 
if (should_disable && old_stream) {
+   pipe = &dc->current_state->res_ctx.pipe_ctx[i];
+   /* When disabling plane for a phantom pipe, we must 
turn on the
+* phantom OTG so the disable programming gets the 
double buffer
+* update. Otherwise the pipe will be left in a 
partially disabled
+* state that can result in underflow or hang when 
enabling it
+* again for different use.
+*/
+   if (old_stream->mall_stream_config.type == 
SUBVP_PHANTOM) {
+   
pipe->stream_res.tg->funcs->enable_crtc(pipe->stream_res.tg);
+   }
dc_rem_all_planes_for_stream(dc, old_stream, 
dangling_context);
disable_all_writeback_pipes_for_stream(dc, old_stream, 
dangling_context);
 
@@ -1749,6 +1760,12 @@ static enum dc_status dc_commit_state_no_check(struct dc 
*dc, struct dc_state *c
context->stream_count == 0)
dc->hwss.prepare_bandwidth(dc, context);
 
+   /* When SubVP is active, all HW programming must be done while
+* SubVP lock is acquired
+*/
+   if (dc->hwss.subvp_pipe_control_lock)
+   dc->hwss.subvp_pipe_control_lock(dc, context, true, true, NULL, 
subvp_prev_use);
+
if (dc->debug.enable_double_buffered_dsc_pg_support)
dc->hwss.update_dsc_pg(dc, context, false);
 
@@ -1776,9 +1793,6 @@ static enum dc_status dc_commit_state_no_check(struct dc 
*dc, struct dc_state *c
dc->hwss.wait_for_mpcc_disconnect(dc, dc->res_pool, pipe);
}
 
-   if (dc->hwss.subvp_pipe_control_lock)
-   dc->hwss.subvp_pipe_control_lock(dc, context, true, true, NULL, 
subvp_prev_use);
-
result = dc->hwss.apply_ctx_to_hw(dc, context);
 
if (result != DC_OK) {
@@ -3675,7 +3689,6 @@ static bool 
could_mpcc_tree_change_for_active_pipes(struct dc *dc,
 
struct dc_stream_status *cur_stream_status = 
stream_get_status(dc->current_state, stream);
bool force_minimal_pipe_splitting = false;
-   uint32_t i;
 
*is_plane_addition = false;
 
@@ -3707,27 +3720,11 @@ static bool 
could_mpcc_tree_change_for_active_pipes(struct dc *dc,
}
}
 
-   /* For SubVP pipe split case when adding MPO video
-* we need to add a minimal transition. In this case
-* there will be 2 streams (1 main stream, 1 phantom
-* stream).
+   /* For SubVP when adding MPO video we need to add a minimal transition.
 */
-   if (cur_stream_status &&
-   dc->current_state->stream_count == 2 &&
-   stream->mall_stream_config.type == SUBVP_MAIN) {
-   bool is_pipe_split = false;
-
-   for (i = 0; i < dc->res_pool->pipe_count; i++) {
-   if (dc->current_state->res_ctx.pipe_ctx[i].stream == 
stream &&
-   
(dc->current_state->res_ctx.pipe_ctx[i].bottom_pipe ||
-   
dc->current_state->res_ctx.pipe_ctx[i].next_odm_pipe)) {
-   is_pipe_split = true;
-   break;
-   }
-   }
-
+   if (cur_stream_status && stream->mall_stream_config.type == SUBVP_MAIN) 
{
/* determine if minimal transition is required due to SubVP*/
-   

[PATCH 09/22] drm/amd/display: Zeromem mypipe heap struct before using it

2022-11-02 Thread Alan Liu
From: Aurabindo Pillai 

[Why]
A bug was introduced when the variable was moved from the stack to the heap:
the heap copy is reused, so garbage from a previous use was left over. Zero
the memory before using it.

Reviewed-by: Martin Leung 
Acked-by: Alan Liu 
Signed-off-by: Aurabindo Pillai 
Signed-off-by: Martin Leung 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
index 3d184679f129..ae6e6abc620b 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
@@ -3192,6 +3192,7 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct 
display_mode_lib *mode_l

mode_lib->vba.FCLKChangeLatency, mode_lib->vba.UrgLatency[i],

mode_lib->vba.SREnterPlusExitTime);
 
+   
memset(&v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull, 0, 
sizeof(DmlPipe));

v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.myPipe.Dppclk = 
mode_lib->vba.RequiredDPPCLK[i][j][k];

v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.myPipe.Dispclk = 
mode_lib->vba.RequiredDISPCLK[i][j];

v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.myPipe.PixelClock = 
mode_lib->vba.PixelClock[k];
-- 
2.25.1



[PATCH 05/22] drm/amd/display: Consider dp cable id only when data is non zero

2022-11-02 Thread Alan Liu
From: Wenjing Liu 

Cable ID is a DP2 feature to identify max certified link rate that
a cable can carry. The cable identification method requires both
cable and display hardware support. Since the spec arrived late, it is
anticipated that the first round of DP2 cables and displays may not
be fully compatible to reliably return cable ID data. Therefore the
decision of our cable id policy is that if the cable can return non
zero cable id data, we will take cable's link rate capability into
account. However if we get zero data, the cable link rate capability
is considered inconclusive. In this case, we will not take cable's
capability into account to avoid over-limiting hardware capability
from users. The max overall link rate capability is still determined
after actual dp pre-training. Cable id is considered as an auxiliary
method of determining max link bandwidth capability.

Reviewed-by: George Shen 
Acked-by: Alan Liu 
Signed-off-by: Wenjing Liu 
---
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 22 +++
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
index 4ea3c825f228..601f78b0b08b 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c
@@ -3020,7 +3020,7 @@ static enum dc_link_rate get_lttpr_max_link_rate(struct 
dc_link *link)
 
 static enum dc_link_rate get_cable_max_link_rate(struct dc_link *link)
 {
-   enum dc_link_rate cable_max_link_rate = LINK_RATE_HIGH3;
+   enum dc_link_rate cable_max_link_rate = LINK_RATE_UNKNOWN;
 
if (link->dpcd_caps.cable_id.bits.UHBR10_20_CAPABILITY & DP_UHBR20)
cable_max_link_rate = LINK_RATE_UHBR20;
@@ -3083,15 +3083,29 @@ struct dc_link_settings dp_get_max_link_cap(struct 
dc_link *link)
max_link_cap.link_spread =
link->reported_link_cap.link_spread;
 
-   /* Lower link settings based on cable attributes */
+   /* Lower link settings based on cable attributes
+* Cable ID is a DP2 feature to identify max certified link rate that
+* a cable can carry. The cable identification method requires both
+* cable and display hardware support. Since the specs comes late, it is
+* anticipated that the first round of DP2 cables and displays may not
+* be fully compatible to reliably return cable ID data. Therefore the
+* decision of our cable id policy is that if the cable can return non
+* zero cable id data, we will take cable's link rate capability into
+* account. However if we get zero data, the cable link rate capability
+* is considered inconclusive. In this case, we will not take cable's
+* capability into account to avoid of over limiting hardware capability
+* from users. The max overall link rate capability is still determined
+* after actual dp pre-training. Cable id is considered as an auxiliary
+* method of determining max link bandwidth capability.
+*/
cable_max_link_rate = get_cable_max_link_rate(link);
 
if (!link->dc->debug.ignore_cable_id &&
+   cable_max_link_rate != LINK_RATE_UNKNOWN &&
cable_max_link_rate < max_link_cap.link_rate)
max_link_cap.link_rate = cable_max_link_rate;
 
-   /*
-* account for lttpr repeaters cap
+   /* account for lttpr repeaters cap
 * notes: repeaters do not snoop in the DPRX Capabilities addresses 
(3.6.3).
 */
if (dp_is_lttpr_present(link)) {
-- 
2.25.1
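
The cable ID policy described above reduces to one extra condition. The sketch below is illustrative only: the enum values are made up (only their ordering and the fact that the "unknown" value is the smallest matter), and apply_cable_limit() is a hypothetical name.

```c
#include <stdbool.h>

/* Hypothetical rate values; RATE_UNKNOWN must compare lowest. */
enum link_rate {
	RATE_UNKNOWN = 0,
	RATE_HBR3 = 30,
	RATE_UHBR10 = 1000,
	RATE_UHBR13_5 = 1350,
	RATE_UHBR20 = 2000,
};

/*
 * Lower max_rate to the cable's rate only when the cable actually
 * reported something. Zero cable ID data (RATE_UNKNOWN) is treated as
 * inconclusive and leaves max_rate untouched.
 */
static enum link_rate apply_cable_limit(enum link_rate max_rate,
					enum link_rate cable_rate,
					bool ignore_cable_id)
{
	if (!ignore_cable_id &&
	    cable_rate != RATE_UNKNOWN &&
	    cable_rate < max_rate)
		return cable_rate;

	return max_rate;
}
```

Because the unknown value compares lower than every real rate, omitting the explicit check would clamp the link to "unknown" whenever the cable returned no data.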



[PATCH 04/22] drm/amd/display: Update 709 gamma to 2.222 as stated in the standard

2022-11-02 Thread Alan Liu
From: Nawwar Ali 

[WHY]
Previously the driver used gamma 2.2 for the 709 color space,
but the standard specifies a gamma of 2.222.

[HOW]
Change the gamma to 2.222.

Reviewed-by: Krunoslav Kovac 
Acked-by: Alan Liu 
Signed-off-by: Nawwar Ali 
---
 drivers/gpu/drm/amd/display/modules/color/color_gamma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c 
b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
index 447a0ec9cbe2..f6034213c700 100644
--- a/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
+++ b/drivers/gpu/drm/amd/display/modules/color/color_gamma.c
@@ -61,7 +61,7 @@ static const int32_t numerator01[] = { 31308,   18, 0,  
0,  0};
 static const int32_t numerator02[] = { 12920,   4500,   0,  0,  0};
 static const int32_t numerator03[] = { 55,  99, 0,  0,  0};
 static const int32_t numerator04[] = { 55,  99, 0,  0,  0};
-static const int32_t numerator05[] = { 2400,2200,   2200, 2400, 2600};
+static const int32_t numerator05[] = { 2400,2222,   2200, 2400, 2600};
 
 /* one-time setup of X points */
 void setup_x_points_distribution(void)
-- 
2.25.1



[PATCH 08/22] drm/amd/display: Allow tuning DCN314 bounding box

2022-11-02 Thread Alan Liu
From: Nicholas Kazlauskas 

[Why]
We're missing the helpers from dcn20 that would allow
overriding these with DC debug options.

[How]
Use dcn20_patch_bounding_box to support overriding all the
relevant values.

Reviewed-by: Jun Lei 
Acked-by: Alan Liu 
Signed-off-by: Nicholas Kazlauskas 
---
 drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
index 34b6c763a455..796c9d19e671 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
@@ -264,11 +264,8 @@ void dcn314_update_bw_bounding_box_fpu(struct dc *dc, 
struct clk_bw_params *bw_p
dc->dml.soc.dispclk_dppclk_vco_speed_mhz = max_dispclk_mhz * 2;
}
 
-   if ((int)(dcn3_14_soc.dram_clock_change_latency_us * 1000)
-   != dc->debug.dram_clock_change_latency_ns
-   && dc->debug.dram_clock_change_latency_ns) {
-   dcn3_14_soc.dram_clock_change_latency_us = 
dc->debug.dram_clock_change_latency_ns / 1000;
-   }
+   dcn20_patch_bounding_box(dc, &dcn3_14_soc);
+
if (!IS_FPGA_MAXIMUS_DC(dc->ctx->dce_environment))
dml_init_instance(&dc->dml, &dcn3_14_soc, &dcn3_14_ip, 
DML_PROJECT_DCN314);
else
-- 
2.25.1



[PATCH 03/22] drm/amd/display: Cursor update refactor: PSR-SU support condition

2022-11-02 Thread Alan Liu
From: Max Tseng 

[Why]
PSR-SU requires extra conditions during a cursor update.

Reviewed-by: Robin Chen 
Acked-by: Alan Liu 
Signed-off-by: Max Tseng 
---
 drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c | 48 
 1 file changed, 48 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c 
b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c
index 67eef5beab95..4cb912bf400b 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c
+++ b/drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c
@@ -859,11 +859,59 @@ void dc_dmub_srv_log_diagnostic_data(struct dc_dmub_srv 
*dc_dmub_srv)
diag_data.is_cw6_enabled);
 }
 
+static bool dc_can_pipe_disable_cursor(struct pipe_ctx *pipe_ctx)
+{
+   struct pipe_ctx *test_pipe, *split_pipe;
+   const struct scaler_data *scl_data = &pipe_ctx->plane_res.scl_data;
+   struct rect r1 = scl_data->recout, r2, r2_half;
+   int r1_r = r1.x + r1.width, r1_b = r1.y + r1.height, r2_r, r2_b;
+   int cur_layer = pipe_ctx->plane_state->layer_index;
+
+   /**
+* Disable the cursor if there's another pipe above this with a
+* plane that contains this pipe's viewport to prevent double cursor
+* and incorrect scaling artifacts.
+*/
+   for (test_pipe = pipe_ctx->top_pipe; test_pipe;
+test_pipe = test_pipe->top_pipe) {
+   // Skip invisible layer and pipe-split plane on same layer
+   if (!test_pipe->plane_state->visible || 
test_pipe->plane_state->layer_index == cur_layer)
+   continue;
+
+   r2 = test_pipe->plane_res.scl_data.recout;
+   r2_r = r2.x + r2.width;
+   r2_b = r2.y + r2.height;
+   split_pipe = test_pipe;
+
+   /**
+* There is another half plane on same layer because of
+* pipe-split, merge together per same height.
+*/
+   for (split_pipe = pipe_ctx->top_pipe; split_pipe;
+split_pipe = split_pipe->top_pipe)
+   if (split_pipe->plane_state->layer_index == 
test_pipe->plane_state->layer_index) {
+   r2_half = split_pipe->plane_res.scl_data.recout;
+   r2.x = (r2_half.x < r2.x) ? r2_half.x : r2.x;
+   r2.width = r2.width + r2_half.width;
+   r2_r = r2.x + r2.width;
+   break;
+   }
+
+   if (r1.x >= r2.x && r1.y >= r2.y && r1_r <= r2_r && r1_b <= 
r2_b)
+   return true;
+   }
+
+   return false;
+}
+
 static bool dc_dmub_should_update_cursor_data(struct pipe_ctx *pipe_ctx)
 {
if (pipe_ctx->plane_state != NULL) {
if (pipe_ctx->plane_state->address.type == 
PLN_ADDR_TYPE_VIDEO_PROGRESSIVE)
return false;
+
+   if (dc_can_pipe_disable_cursor(pipe_ctx))
+   return false;
}
 
if ((pipe_ctx->stream->link->psr_settings.psr_version == 
DC_PSR_VERSION_SU_1 ||
-- 
2.25.1
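
The geometry in dc_can_pipe_disable_cursor() boils down to a rectangle containment test, with two pipe-split halves merged before the test. A minimal standalone sketch follows, assuming a struct rect with x/y/width/height fields as used in the patch; rect_contains() and merge_halves() are hypothetical helper names, not functions in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the driver's rectangle type (fields as used in the patch). */
struct rect {
	int x, y, width, height;
};

/* True when 'inner' lies entirely within 'outer' (edges may touch). */
static bool rect_contains(const struct rect *outer, const struct rect *inner)
{
	assert(outer && inner);

	int outer_r = outer->x + outer->width;
	int outer_b = outer->y + outer->height;
	int inner_r = inner->x + inner->width;
	int inner_b = inner->y + inner->height;

	return inner->x >= outer->x && inner->y >= outer->y &&
	       inner_r <= outer_r && inner_b <= outer_b;
}

/*
 * Merge two side-by-side halves of the same height, as the patch does for
 * pipe-split planes: take the smaller x and add the widths.
 */
static struct rect merge_halves(struct rect a, struct rect b)
{
	struct rect m = a;

	m.x = (b.x < a.x) ? b.x : a.x;
	m.width = a.width + b.width;
	return m;
}
```

The merge step matters because a viewport that straddles the split would otherwise be contained in neither half and the cursor would not be treated as covered.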



[PATCH 07/22] drm/amd/display: Update SR watermarks for DCN314

2022-11-02 Thread Alan Liu
From: Nicholas Kazlauskas 

[Why & How]
New values requested by hardware after fine-tuning.
Update for all memory types.

Reviewed-by: Jun Lei 
Acked-by: Alan Liu 
Signed-off-by: Nicholas Kazlauskas 
---
 .../dc/clk_mgr/dcn314/dcn314_clk_mgr.c| 32 +--
 .../amd/display/dc/dml/dcn314/dcn314_fpu.c|  4 +--
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c 
b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c
index 1131c6d73f6c..20a06c04e4a1 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn314/dcn314_clk_mgr.c
@@ -363,32 +363,32 @@ static struct wm_table ddr5_wm_table = {
.wm_inst = WM_A,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.72,
-   .sr_exit_time_us = 9,
-   .sr_enter_plus_exit_time_us = 11,
+   .sr_exit_time_us = 12.5,
+   .sr_enter_plus_exit_time_us = 14.5,
.valid = true,
},
{
.wm_inst = WM_B,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.72,
-   .sr_exit_time_us = 9,
-   .sr_enter_plus_exit_time_us = 11,
+   .sr_exit_time_us = 12.5,
+   .sr_enter_plus_exit_time_us = 14.5,
.valid = true,
},
{
.wm_inst = WM_C,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.72,
-   .sr_exit_time_us = 9,
-   .sr_enter_plus_exit_time_us = 11,
+   .sr_exit_time_us = 12.5,
+   .sr_enter_plus_exit_time_us = 14.5,
.valid = true,
},
{
.wm_inst = WM_D,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.72,
-   .sr_exit_time_us = 9,
-   .sr_enter_plus_exit_time_us = 11,
+   .sr_exit_time_us = 12.5,
+   .sr_enter_plus_exit_time_us = 14.5,
.valid = true,
},
}
@@ -400,32 +400,32 @@ static struct wm_table lpddr5_wm_table = {
.wm_inst = WM_A,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
-   .sr_exit_time_us = 11.5,
-   .sr_enter_plus_exit_time_us = 14.5,
+   .sr_exit_time_us = 16.5,
+   .sr_enter_plus_exit_time_us = 18.5,
.valid = true,
},
{
.wm_inst = WM_B,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
-   .sr_exit_time_us = 11.5,
-   .sr_enter_plus_exit_time_us = 14.5,
+   .sr_exit_time_us = 16.5,
+   .sr_enter_plus_exit_time_us = 18.5,
.valid = true,
},
{
.wm_inst = WM_C,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
-   .sr_exit_time_us = 11.5,
-   .sr_enter_plus_exit_time_us = 14.5,
+   .sr_exit_time_us = 16.5,
+   .sr_enter_plus_exit_time_us = 18.5,
.valid = true,
},
{
.wm_inst = WM_D,
.wm_type = WM_TYPE_PSTATE_CHG,
.pstate_latency_us = 11.65333,
-   .sr_exit_time_us = 11.5,
-   .sr_enter_plus_exit_time_us = 14.5,
+   .sr_exit_time_us = 16.5,
+   .sr_enter_plus_exit_time_us = 18.5,
.valid = true,
},
}
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c 
b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
index cf420ad2b8dc..34b6c763a455 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c
@@ -146,8 +146,8 @@ struct _vcs_dpi_soc_bounding_box_st dcn3_14_soc = {
},
},
.num_states = 5,
-   .sr_exit_time_us = 9.0,
-   .sr_enter_plus_exit_time_us = 11.0,
+   .sr_exit_time_us = 16.5,
+   .sr_enter_plus_exit_time_us = 18.5,
.sr_exit_z8_time_us = 442.0,
.sr_enter_plus_exit_z8_time_us = 560.0,
 

[PATCH 06/22] drm/amd/display: Waiting for 1 frame to fix the flash issue on PSR1

2022-11-02 Thread Alan Liu
From: Ryan Lin 

[Why]
A specific TCON needs the driver to wait for more frames before PSR_Exit
is sent.

[How]
Add relock_delay_frame_cnt to control how many frames to wait before
sending PSR_Exit. The default value is 0. The driver side can set this
variable for specific TCONs.

Reviewed-by: Robin Chen 
Acked-by: Alan Liu 
Signed-off-by: Ryan Lin 
---
 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c   | 5 +
 drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h | 6 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c 
b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
index cda1592c3a5b..2d3201b77d6a 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c
@@ -413,6 +413,11 @@ static bool dmub_psr_copy_settings(struct dmub_psr *dmub,
else
copy_settings_data->debug.bitfields.force_wakeup_by_tps3 = 0;
 
+   //WA for PSR1 on specific TCON, require frame delay for frame re-lock
+   copy_settings_data->relock_delay_frame_cnt = 0;
+   if (link->dpcd_caps.sink_dev_id == DP_BRANCH_DEVICE_ID_001CF8)
+   copy_settings_data->relock_delay_frame_cnt = 2;
+
dc_dmub_srv_cmd_queue(dc->dmub_srv, &cmd);
dc_dmub_srv_cmd_execute(dc->dmub_srv);
dc_dmub_srv_wait_idle(dc->dmub_srv);
diff --git a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h 
b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
index 9df330c86a55..34b03bc7f838 100644
--- a/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
+++ b/drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h
@@ -1876,10 +1876,14 @@ struct dmub_cmd_psr_copy_settings_data {
 * Use FSM state for PSR power up/down
 */
uint8_t use_phy_fsm;
+   /**
+* frame delay for frame re-lock
+*/
+   uint8_t relock_delay_frame_cnt;
/**
 * Explicit padding to 2 byte boundary.
 */
-   uint8_t pad3[2];
+   uint8_t pad3;
 };
 
 /**
-- 
2.25.1
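
The dmub_cmd.h hunk takes one byte from the explicit padding for the new field, so the command layout shared with firmware keeps its size. A small sketch of that idea, using only the tail of the structure; struct tail_before and struct tail_after are hypothetical names and cover just the three bytes touched by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tail of the settings structure before the patch (illustrative only). */
struct tail_before {
	uint8_t use_phy_fsm;
	uint8_t pad3[2];
};

/* Tail after the patch: one padding byte becomes the new field. */
struct tail_after {
	uint8_t use_phy_fsm;
	uint8_t relock_delay_frame_cnt;
	uint8_t pad3;
};

/* The size is unchanged and the new field sits where padding used to be. */
static_assert(sizeof(struct tail_before) == sizeof(struct tail_after),
	      "adding the field must not change the structure size");
static_assert(offsetof(struct tail_after, relock_delay_frame_cnt) ==
	      offsetof(struct tail_before, pad3),
	      "the new field reuses the first padding byte");
```

Compile-time checks like these are one way to catch an accidental layout change when a field is carved out of padding.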



[PATCH 02/22] drm/amd/display: Adding HDMI SCDC DEVICE_ID define

2022-11-02 Thread Alan Liu
From: Leo Ma 

[Why && How]
We will need to differentiate vendor behavior in the future.

Reviewed-by: Chris Park 
Acked-by: Alan Liu 
Signed-off-by: Leo Ma 
---
 drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
index 651231387043..ce8d6a54ca54 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c
@@ -82,6 +82,7 @@ struct dp_hdmi_dongle_signature_data {
 #define HDMI_SCDC_STATUS_FLAGS 0x40
 #define HDMI_SCDC_ERR_DETECT 0x50
 #define HDMI_SCDC_TEST_CONFIG 0xC0
+#define HDMI_SCDC_DEVICE_ID 0xD3
 
 union hdmi_scdc_update_read_data {
uint8_t byte[2];
-- 
2.25.1



[PATCH 01/22] drm/amd/display: Wait for VBLANK during pipe programming

2022-11-02 Thread Alan Liu
From: Alvin Lee 

[Description]
- Wait for vblank during front end programming
  for global sync to ensure all double buffer
  updates take effect.
- This prevents underflow in some cases.

Reviewed-by: Martin Leung 
Acked-by: Alan Liu 
Signed-off-by: Alvin Lee 
---
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c 
b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index f3334f513eb4..b465a83bde6f 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -1663,6 +1663,7 @@ static void dcn20_program_pipe(
pipe_ctx->pipe_dlg_param.vupdate_width);
 
if (pipe_ctx->stream->mall_stream_config.type != SUBVP_PHANTOM) 
{
+   
pipe_ctx->stream_res.tg->funcs->wait_for_state(pipe_ctx->stream_res.tg, 
CRTC_STATE_VBLANK);

pipe_ctx->stream_res.tg->funcs->wait_for_state(pipe_ctx->stream_res.tg, 
CRTC_STATE_VACTIVE);
}
 
-- 
2.25.1



[PATCH 00/22] DC Patches Nov 2, 2022

2022-11-02 Thread Alan Liu
This DC patchset brings improvements in multiple areas. In summary, we have:

- Wait for VBLANK during pipe programming
- Adding HDMI SCDC DEVICE_ID define
- Cursor update refactor: PSR-SU support condition
- Update 709 gamma to 2.222 as stated in the standerd
- Consider dp cable id only when data is non zero
- Waiting for 1 frame to fix the flash issue on PSR1
- Update SR watermarks for DCN314
- Allow tuning DCN314 bounding box
- Zeromem mypipe heap struct before using it
- Use min transition for SubVP into MPO
- Disable phantom OTG after enable for plane disable
- Disable DRR actions during state commit
- Fix fallback issues for DP LL 1.4a tests
- Fix FCLK deviation and tool compile issues
- Fix reg timeout in enc314_enable_fifo
- Fix gpio port mapping issue
- Only update link settings after successful MST link train
- Enforce minimum prefetch time for low memclk on DCN32
- Set correct EOTF and Gamut flag in VRR info
- Add margin for max vblank time for SubVP + DRR
- Populate DP2.0 output type for DML pipe

Below are the authors of each patch:

Alvin Lee (4):
  drm/amd/display: Wait for VBLANK during pipe programming
  drm/amd/display: Use min transition for SubVP into MPO
  drm/amd/display: Disable phantom OTG after enable for plane disable
  drm/amd/display: Add margin for max vblank time for SubVP + DRR

Aric Cyr (1):
  drm/amd/display: 3.2.211

Aurabindo Pillai (1):
  drm/amd/display: Zeromem mypipe heap struct before using it

Chaitanya Dhere (1):
  drm/amd/display: Fix FCLK deviation and tool compile issues

Dillon Varone (1):
  drm/amd/display: Enforce minimum prefetch time for low memclk on DCN32

George Shen (1):
  drm/amd/display: Populate DP2.0 output type for DML pipe

Leo Ma (1):
  drm/amd/display: Adding HDMI SCDC DEVICE_ID define

Max Tseng (1):
  drm/amd/display: Cursor update refactor: PSR-SU support condition

Michael Strauss (1):
  drm/amd/display: Only update link settings after successful MST link
train

Mike Hsieh (1):
  drm/amd/display: Set correct EOTF and Gamut flag in VRR info

Mustapha Ghaddar (1):
  drm/amd/display: Fix fallback issues for DP LL 1.4a tests

Nawwar Ali (1):
  drm/amd/display: Update 709 gamma to 2.222 as stated in the standerd

Nicholas Kazlauskas (3):
  drm/amd/display: Update SR watermarks for DCN314
  drm/amd/display: Allow tuning DCN314 bounding box
  drm/amd/display: Fix reg timeout in enc314_enable_fifo

Ryan Lin (1):
  drm/amd/display: Waiting for 1 frame to fix the flash issue on PSR1

Steve Su (1):
  drm/amd/display: Fix gpio port mapping issue

Wenjing Liu (1):
  drm/amd/display: Consider dp cable id only when data is non zero

Wesley Chalmers (1):
  drm/amd/display: Disable DRR actions during state commit

 .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |  8 +++
 .../dc/clk_mgr/dcn314/dcn314_clk_mgr.c| 32 +-
 drivers/gpu/drm/amd/display/dc/core/dc.c  | 55 ++---
 drivers/gpu/drm/amd/display/dc/core/dc_link.c |  4 ++
 .../gpu/drm/amd/display/dc/core/dc_link_ddc.c |  1 +
 .../gpu/drm/amd/display/dc/core/dc_link_dp.c  | 51 +---
 .../drm/amd/display/dc/core/dc_link_dpia.c| 15 +++--
 drivers/gpu/drm/amd/display/dc/dc.h   |  4 +-
 drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c  | 60 ++-
 drivers/gpu/drm/amd/display/dc/dc_link.h  |  1 +
 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c |  5 ++
 .../drm/amd/display/dc/dcn20/dcn20_hwseq.c|  1 +
 .../drm/amd/display/dc/dcn30/dcn30_hwseq.c|  3 -
 .../dc/dcn314/dcn314_dio_stream_encoder.c | 24 ++--
 .../gpu/drm/amd/display/dc/dcn32/dcn32_optc.c |  8 +++
 .../drm/amd/display/dc/dcn32/dcn32_resource.c |  2 +
 .../amd/display/dc/dcn321/dcn321_resource.c   |  2 +
 drivers/gpu/drm/amd/display/dc/dm_helpers.h   |  5 ++
 .../drm/amd/display/dc/dml/dcn20/dcn20_fpu.c  |  2 +
 .../amd/display/dc/dml/dcn314/dcn314_fpu.c| 11 ++--
 .../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c  |  2 +
 .../dc/dml/dcn32/display_mode_vba_32.c|  5 ++
 .../dc/dml/dcn32/display_mode_vba_32.h|  3 +
 .../dc/dml/dcn32/display_mode_vba_util_32.c   | 14 -
 .../dc/dml/dcn32/display_mode_vba_util_32.h   |  3 +-
 .../amd/display/dc/dml/dcn321/dcn321_fpu.c|  2 +
 .../amd/display/dc/dml/display_mode_structs.h |  1 +
 .../display/dc/gpio/dcn32/hw_factory_dcn32.c  | 14 +
 drivers/gpu/drm/amd/display/dc/gpio/hw_ddc.c  |  9 ++-
 .../amd/display/dc/inc/hw/timing_generator.h  |  1 +
 .../gpu/drm/amd/display/dmub/inc/dmub_cmd.h   |  6 +-
 .../amd/display/modules/color/color_gamma.c   |  2 +-
 .../amd/display/modules/freesync/freesync.c   |  8 +--
 33 files changed, 279 insertions(+), 85 deletions(-)

-- 
2.25.1



Re: [PATCH 1/3] drm/amdgpu/gfx9: set gfx.funcs in early init

2022-11-02 Thread Christian König

On 02.11.22 at 16:23, Alex Deucher wrote:

So the callbacks are set before we use them.

Fixes: 0c9646e1a043 ("drm/amdgpu: switch to select_se_sh wrapper for gfx v9_0")
Signed-off-by: Alex Deucher 


Reviewed-by: Christian König  for the series.


---
  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 877521230529..5d23a0f03615 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1921,8 +1921,6 @@ static int gfx_v9_0_gpu_early_init(struct amdgpu_device 
*adev)
u32 gb_addr_config;
int err;
  
-	adev->gfx.funcs = &gfx_v9_0_gfx_funcs;

-
switch (adev->ip_versions[GC_HWIP][0]) {
case IP_VERSION(9, 0, 1):
adev->gfx.config.max_hw_contexts = 8;
@@ -4541,6 +4539,8 @@ static int gfx_v9_0_early_init(void *handle)
  {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
  
+	adev->gfx.funcs = &gfx_v9_0_gfx_funcs;

+
if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 1) ||
adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 2))
adev->gfx.num_gfx_rings = 0;




[PATCH 1/2] drm/amdgpu: Fix type of second parameter in trans_msg() callback

2022-11-02 Thread Nathan Chancellor
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:

  drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:412:15: error: incompatible function 
pointer types initializing 'void (*)(struct amdgpu_device *, u32, u32, u32, 
u32)' (aka 'void (*)(struct amdgpu_device *, unsigned int, unsigned int, 
unsigned int, unsigned int)') with an expression of type 'void (struct 
amdgpu_device *, enum idh_request, u32, u32, u32)' (aka 'void (struct 
amdgpu_device *, enum idh_request, unsigned int, unsigned int, unsigned int)') 
[-Werror,-Wincompatible-function-pointer-types-strict]
  .trans_msg = xgpu_ai_mailbox_trans_msg,
  ^
  1 error generated.

  drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c:435:15: error: incompatible function 
pointer types initializing 'void (*)(struct amdgpu_device *, u32, u32, u32, 
u32)' (aka 'void (*)(struct amdgpu_device *, unsigned int, unsigned int, 
unsigned int, unsigned int)') with an expression of type 'void (struct 
amdgpu_device *, enum idh_request, u32, u32, u32)' (aka 'void (struct 
amdgpu_device *, enum idh_request, unsigned int, unsigned int, unsigned int)') 
[-Werror,-Wincompatible-function-pointer-types-strict]
  .trans_msg = xgpu_nv_mailbox_trans_msg,
  ^
  1 error generated.

The type of the second parameter in the prototype should be 'enum
idh_request' instead of 'u32'. Update it to clear up the warnings.

Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reported-by: Sami Tolvanen 
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index d94c31e68a14..bc4f079fd48c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -74,6 +74,8 @@ struct amdgpu_vf_error_buffer {
uint64_t data[AMDGPU_VF_ERROR_ENTRY_SIZE];
 };
 
+enum idh_request;
+
 /**
  * struct amdgpu_virt_ops - amdgpu device virt operations
  */
@@ -83,7 +85,8 @@ struct amdgpu_virt_ops {
int (*req_init_data)(struct amdgpu_device *adev);
int (*reset_gpu)(struct amdgpu_device *adev);
int (*wait_reset)(struct amdgpu_device *adev);
-   void (*trans_msg)(struct amdgpu_device *adev, u32 req, u32 data1, u32 
data2, u32 data3);
+   void (*trans_msg)(struct amdgpu_device *adev, enum idh_request req,
+ u32 data1, u32 data2, u32 data3);
 };
 
 /*

base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780
-- 
2.38.1
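
To illustrate the class of mismatch described above, here is a minimal sketch in which the callback prototype and the implementation agree on an enum parameter type. All names (enum req_kind, struct virt_ops_sketch, mailbox_trans_msg_sketch, send_req) are hypothetical and are not the amdgpu definitions. The sketch defines the enum in full instead of forward-declaring it, which keeps it within ISO C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical request type standing in for the driver's enum. */
enum req_kind {
	REQ_INIT = 1,
	REQ_RESET = 2,
};

struct dev_sketch {
	uint32_t last_req;
	uint32_t last_data;
};

/* The prototype names the enum type, matching the implementation below. */
struct virt_ops_sketch {
	void (*trans_msg)(struct dev_sketch *d, enum req_kind req,
			  uint32_t data1);
};

static void mailbox_trans_msg_sketch(struct dev_sketch *d, enum req_kind req,
				     uint32_t data1)
{
	d->last_req = (uint32_t)req;
	d->last_data = data1;
}

/* Types are identical, so a strict function-pointer check has nothing to flag. */
static const struct virt_ops_sketch ops = {
	.trans_msg = mailbox_trans_msg_sketch,
};

/* Indirect call through the ops table. */
static void send_req(struct dev_sketch *d, enum req_kind req, uint32_t data1)
{
	assert(d && ops.trans_msg);
	ops.trans_msg(d, req, data1);
}
```

Had the prototype used uint32_t for the second parameter, the initializer would still compile on most compilers, but the pointer type and the function type would differ, which is the situation the commit message describes.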



[PATCH 2/2] drm/amdgpu: Fix type of second parameter in odn_edit_dpm_table() callback

2022-11-02 Thread Nathan Chancellor
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:

  drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:3008:29: error: 
incompatible function pointer types initializing 'int (*)(void *, uint32_t, 
long *, uint32_t)' (aka 'int (*)(void *, unsigned int, long *, unsigned int)') 
with an expression of type 'int (void *, enum PP_OD_DPM_TABLE_COMMAND, long *, 
uint32_t)' (aka 'int (void *, enum PP_OD_DPM_TABLE_COMMAND, long *, unsigned 
int)') [-Werror,-Wincompatible-function-pointer-types-strict]
  .odn_edit_dpm_table  = smu_od_edit_dpm_table,
 ^
  1 error generated.

There are only two implementations of ->odn_edit_dpm_table() in 'struct
amd_pm_funcs': smu_od_edit_dpm_table() and pp_odn_edit_dpm_table(). One
has a second parameter type of 'enum PP_OD_DPM_TABLE_COMMAND' and the
other uses 'u32'. Ultimately, smu_od_edit_dpm_table() calls
->od_edit_dpm_table() from 'struct pptable_funcs' and
pp_odn_edit_dpm_table() calls ->odn_edit_dpm_table() from 'struct
pp_hwmgr_func', which both have a second parameter type of 'enum
PP_OD_DPM_TABLE_COMMAND'.

Update the type parameter in both the prototype in 'struct amd_pm_funcs'
and pp_odn_edit_dpm_table() to 'enum PP_OD_DPM_TABLE_COMMAND', which
cleans up the warning.

Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reported-by: Sami Tolvanen 
Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/include/kgd_pp_interface.h   | 3 ++-
 drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/include/kgd_pp_interface.h 
b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
index a40ead44778a..d18162e9ed1d 100644
--- a/drivers/gpu/drm/amd/include/kgd_pp_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_pp_interface.h
@@ -354,7 +354,8 @@ struct amd_pm_funcs {
int (*get_power_profile_mode)(void *handle, char *buf);
int (*set_power_profile_mode)(void *handle, long *input, uint32_t size);
int (*set_fine_grain_clk_vol)(void *handle, uint32_t type, long *input, 
uint32_t size);
-   int (*odn_edit_dpm_table)(void *handle, uint32_t type, long *input, 
uint32_t size);
+   int (*odn_edit_dpm_table)(void *handle, enum PP_OD_DPM_TABLE_COMMAND 
type,
+ long *input, uint32_t size);
int (*set_mp1_state)(void *handle, enum pp_mp1_state mp1_state);
int (*smu_i2c_bus_access)(void *handle, bool acquire);
int (*gfx_state_change_set)(void *handle, uint32_t state);
diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c 
b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
index ec055858eb95..1159ae114dd0 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
@@ -838,7 +838,8 @@ static int pp_set_fine_grain_clk_vol(void *handle, uint32_t 
type, long *input, u
return hwmgr->hwmgr_func->set_fine_grain_clk_vol(hwmgr, type, input, 
size);
 }
 
-static int pp_odn_edit_dpm_table(void *handle, uint32_t type, long *input, 
uint32_t size)
+static int pp_odn_edit_dpm_table(void *handle, enum PP_OD_DPM_TABLE_COMMAND 
type,
+long *input, uint32_t size)
 {
struct pp_hwmgr *hwmgr = handle;
 
-- 
2.38.1



[PATCH 2/3] drm/amdgpu/gfx10: set gfx.funcs in early init

2022-11-02 Thread Alex Deucher
So the callbacks are set early in case we need them.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index af94ac580d3e..f69d6289347d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -4453,8 +4453,6 @@ static void gfx_v10_0_gpu_early_init(struct amdgpu_device 
*adev)
 {
u32 gb_addr_config;
 
-   adev->gfx.funcs = &gfx_v10_0_gfx_funcs;
-
switch (adev->ip_versions[GC_HWIP][0]) {
case IP_VERSION(10, 1, 10):
case IP_VERSION(10, 1, 1):
@@ -7593,6 +7591,8 @@ static int gfx_v10_0_early_init(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   adev->gfx.funcs = &gfx_v10_0_gfx_funcs;
+
switch (adev->ip_versions[GC_HWIP][0]) {
case IP_VERSION(10, 1, 10):
case IP_VERSION(10, 1, 1):
-- 
2.37.3



[PATCH 3/3] drm/amdgpu/gfx11: set gfx.funcs in early init

2022-11-02 Thread Alex Deucher
So the callbacks are set early in case we need them.

Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index f68e13b6282c..4ebef1b33f3c 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -843,7 +843,6 @@ static const struct amdgpu_gfx_funcs gfx_v11_0_gfx_funcs = {
 
 static int gfx_v11_0_gpu_early_init(struct amdgpu_device *adev)
 {
-   adev->gfx.funcs = &gfx_v11_0_gfx_funcs;
 
switch (adev->ip_versions[GC_HWIP][0]) {
case IP_VERSION(11, 0, 0):
@@ -4657,6 +4656,8 @@ static int gfx_v11_0_early_init(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   adev->gfx.funcs = &gfx_v11_0_gfx_funcs;
+
adev->gfx.num_gfx_rings = GFX11_NUM_GFX_RINGS;
adev->gfx.num_compute_rings = min(amdgpu_gfx_get_num_kcq(adev),
  AMDGPU_MAX_COMPUTE_RINGS);
-- 
2.37.3



[PATCH 1/3] drm/amdgpu/gfx9: set gfx.funcs in early init

2022-11-02 Thread Alex Deucher
So the callbacks are set before we use them.

Fixes: 0c9646e1a043 ("drm/amdgpu: switch to select_se_sh wrapper for gfx v9_0")
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 877521230529..5d23a0f03615 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1921,8 +1921,6 @@ static int gfx_v9_0_gpu_early_init(struct amdgpu_device 
*adev)
u32 gb_addr_config;
int err;
 
-   adev->gfx.funcs = &gfx_v9_0_gfx_funcs;
-
switch (adev->ip_versions[GC_HWIP][0]) {
case IP_VERSION(9, 0, 1):
adev->gfx.config.max_hw_contexts = 8;
@@ -4541,6 +4539,8 @@ static int gfx_v9_0_early_init(void *handle)
 {
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+   adev->gfx.funcs = &gfx_v9_0_gfx_funcs;
+
if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 1) ||
adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 4, 2))
adev->gfx.num_gfx_rings = 0;
-- 
2.37.3
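
The ordering problem behind this series can be sketched with a toy two-stage init: a later stage dereferences a callback table that an earlier stage is expected to install. The names below (struct gfx_funcs_sketch, early_init_sketch, gpu_early_init_sketch) are hypothetical, and the sketch assumes the early init hook runs before anything that uses the callbacks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback table and device, not the amdgpu structures. */
struct gfx_funcs_sketch {
	int (*get_version)(void);
};

struct gfx_dev_sketch {
	const struct gfx_funcs_sketch *funcs;
	int version;
};

static int v9_get_version(void)
{
	return 9;
}

static const struct gfx_funcs_sketch v9_funcs = {
	.get_version = v9_get_version,
};

/* First stage: install the callback table as early as possible. */
static void early_init_sketch(struct gfx_dev_sketch *dev)
{
	assert(dev);
	dev->funcs = &v9_funcs;
}

/* Later stage: relies on the table; fails cleanly if it is missing. */
static int gpu_early_init_sketch(struct gfx_dev_sketch *dev)
{
	assert(dev);
	if (!dev->funcs || !dev->funcs->get_version)
		return -1;
	dev->version = dev->funcs->get_version();
	return 0;
}
```

In the real driver there is no such NULL check on every use, so setting the table in the earliest hook is the simpler guarantee.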



Re: [PATCH] drm/amd/amdgpu: skip ras late init if it is not supported

2022-11-02 Thread Deucher, Alexander
[Public]

Acked-by: Alex Deucher 

From: amd-gfx  on behalf of Kenneth Feng 

Sent: Wednesday, November 2, 2022 6:14 AM
To: amd-gfx@lists.freedesktop.org 
Cc: Feng, Kenneth 
Subject: [PATCH] drm/amd/amdgpu: skip ras late init if it is not supported

skip ras late init on gc_11_0_3 if it is not supported,
in order to prevent the hardware init exception.

Signed-off-by: Kenneth Feng 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 84a76c36d9a7..afe1fadc1e9d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -4707,7 +4707,7 @@ static int gfx_v11_0_late_init(void *handle)
 if (r)
 return r;

-   if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(11, 0, 3)) {
+   if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(11, 0, 3) && 
adev->ras_enabled) {
 r = gfx_v11_0_ras_late_init(handle);
 if (r)
 return r;
--
2.25.1



[PATCH] drm/amdgpu: workaround for TLB seq race

2022-11-02 Thread Christian König
It can happen that we query the sequence value before the callback
had a chance to run.

Work around that by grabbing the fence lock and releasing it again.
Should be replaced by hw handling soon.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 9ecb7f663e19..e51a46c9582b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -485,6 +485,21 @@ void amdgpu_debugfs_vm_bo_info(struct amdgpu_vm *vm, 
struct seq_file *m);
  */
 static inline uint64_t amdgpu_vm_tlb_seq(struct amdgpu_vm *vm)
 {
+   unsigned long flags;
+   spinlock_t *lock;
+
+   /*
+* Work around to stop racing between the fence signaling and handling
+* the cb. The lock is static after initially setting it up, just make
+* sure that the dma_fence structure isn't freed up.
+*/
+   rcu_read_lock();
+   lock = vm->last_tlb_flush->lock;
+   rcu_read_unlock();
+
+   spin_lock_irqsave(lock, flags);
+   spin_unlock_irqrestore(lock, flags);
+
return atomic64_read(&vm->tlb_seq);
 }
 
-- 
2.34.1
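
The "grab the lock and release it again" trick can be shown in userspace terms. The sketch below assumes the signaling side updates the sequence while holding the lock (in the patch, that would be the callback running under the fence lock); struct fake_fence and its helpers are hypothetical and use a C11 atomic_flag spinlock in place of the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a fence with a lock and a sequence counter. */
struct fake_fence {
	atomic_flag lock;
	_Atomic uint64_t seq;
};

static void fence_init(struct fake_fence *f)
{
	assert(f);
	atomic_flag_clear(&f->lock);
	atomic_init(&f->seq, 0);
}

static void fence_lock(struct fake_fence *f)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&f->lock,
						 memory_order_acquire))
		;
}

static void fence_unlock(struct fake_fence *f)
{
	atomic_flag_clear_explicit(&f->lock, memory_order_release);
}

/* Signaling side: the sequence is bumped while the lock is held. */
static void fence_signal(struct fake_fence *f)
{
	fence_lock(f);
	atomic_fetch_add(&f->seq, 1);
	fence_unlock(f);
}

/*
 * Reader: taking and dropping the lock waits for any signaler that is
 * currently inside its critical section, so the read below cannot land
 * in the middle of a signal that has already started.
 */
static uint64_t fence_read_seq(struct fake_fence *f)
{
	fence_lock(f);
	fence_unlock(f);
	return atomic_load(&f->seq);
}
```

The empty critical section protects no data of its own; it only orders the reader after an in-flight signaler, which is why the commit message calls it a workaround.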



Re: [PATCH] drm/amd/display: add parameter backlight_min

2022-11-02 Thread Harry Wentland



On 2022-11-01 11:33, Filip Moc wrote:
> Hello Harry,
> 
> thank you for your response.
> 
>> amdgpu.backlight_min=2:-1
> 
> almost :-)
> 
> Array elements in module parameters are separated by commas not colons.
> So for cmdline it should look like this:
> amdgpu.backlight_min=2,-1
> 
> Though you can just drop the ,-1 relying on kernel leaving the rest of array
> untouched. Which I would recommend as there is no point for user to
> follow AMDGPU_DM_MAX_NUM_EDP.
> Only when you need to override some other display than display 0 then you may
> need to use -1. E.g. backlight_min=-1,2 overrides display 1 to min backlight=2
> while keeping display 0 with no override.
> 
> When amdgpu is loaded as a kernel module, backlight_min can also be passed as a
> parameter to modprobe, e.g.:
> modprobe amdgpu backlight_min=2
> So in that case you probably want to add something like
> options amdgpu backlight_min=2 to /etc/modprobe.d/amdgpu.conf
> (and also run update-initramfs if amdgpu is loaded by initramfs).
> 
> I'll add some examples to commit message in v2.
> 

Awesome. Thanks.

Harry
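
As a rough illustration of the semantics Filip describes (comma-separated elements, trailing elements left untouched when omitted), here is a userspace sketch. It is not the kernel's module parameter parser; parse_int_array() is a hypothetical helper and MAX_NUM_EDP is a stand-in value for AMDGPU_DM_MAX_NUM_EDP.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_NUM_EDP 2	/* hypothetical stand-in for AMDGPU_DM_MAX_NUM_EDP */

/*
 * Parse "a,b,..." into out[]. Elements not present in the string are left
 * as they were. Returns the number of elements written, or -1 on a
 * malformed string or too many elements.
 */
static int parse_int_array(const char *arg, int *out, int max)
{
	const char *p = arg;
	int n = 0;

	assert(arg && out);

	while (*p && n < max) {
		char *end;
		long v = strtol(p, &end, 10);

		if (end == p)
			return -1;	/* no digits where a number was expected */
		out[n++] = (int)v;

		if (*end == ',')
			p = end + 1;
		else if (*end == '\0')
			p = end;
		else
			return -1;	/* e.g. ':' used as a separator */
	}

	return *p ? -1 : n;	/* leftover input means too many elements */
}
```

With an array preset to -1 (no override), "2" changes only display 0, "-1,2" changes only display 1, and the colon form "2:-1" is rejected.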

> Filip
> 
> 
> On Mon, Oct 31, 2022 at 10:24:25AM -0400, Harry Wentland wrote:
>> On 2022-10-29 15:13, Filip Moc wrote:
>>> There are some devices on which amdgpu won't allow user to set brightness
>>> to sufficiently low values even though the hardware would support it just
>>> fine.
>>>
>>> This usually happens in one of two cases: either configuration of brightness
>>> levels via ACPI/ATIF is not available and amdgpu falls back to defaults
>>> (currently 12 for the minimum level), which may be too high for some devices,
>>> or the configuration via ATIF is available but the minimum brightness
>>> level provided by the manufacturer is set to an unreasonably high value.
>>>
>>> In either case user can use this new module parameter to adjust the
>>> minimum allowed backlight brightness level.
>>>
>>
>> Thanks for this patch and covering all the bases.
>>
>> It might be useful to have an example in the commit description on
>> how to set the array property. I assume it looks like this if I
>> wanted to set the first device to a minimum of 2 and leave the default
>> for the 2nd one:
>>
>> amdgpu.backlight_min=2:-1
>>
>> Either way, this patch is
>> Reviewed-by: Harry Wentland 
>>
>> Harry
>>
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203439 
>>> Signed-off-by: Filip Moc 
>>> ---
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu.h   |  3 +++
>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   | 15 +++
>>>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +++
>>>  3 files changed, 33 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 0e6ddf05c23c..c5445402c49d 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -200,6 +200,9 @@ extern uint amdgpu_dc_debug_mask;
>>>  extern uint amdgpu_dc_visual_confirm;
>>>  extern uint amdgpu_dm_abm_level;
>>>  extern int amdgpu_backlight;
>>> +#ifdef CONFIG_DRM_AMD_DC
>>> +extern int amdgpu_backlight_override_min[];
>>> +#endif
>>>  extern struct amdgpu_mgpu_info mgpu_info;
>>>  extern int amdgpu_ras_enable;
>>>  extern uint amdgpu_ras_mask;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> index 16f6a313335e..f2fb549ac52f 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> @@ -43,6 +43,7 @@
>>>  #include "amdgpu_irq.h"
>>>  #include "amdgpu_dma_buf.h"
>>>  #include "amdgpu_sched.h"
>>> +#include "amdgpu_dm.h"
>>>  #include "amdgpu_fdinfo.h"
>>>  #include "amdgpu_amdkfd.h"
>>>  
>>> @@ -853,6 +854,20 @@ int amdgpu_backlight = -1;
>>>  MODULE_PARM_DESC(backlight, "Backlight control (0 = pwm, 1 = aux, -1 auto 
>>> (default))");
>>>  module_param_named(backlight, amdgpu_backlight, bint, 0444);
>>>  
>>> +/**
>>> + * DOC: backlight_min (array of int)
>>> + * Override minimum allowed backlight brightness signal (per display).
>>> + * Must be less than the maximum brightness signal.
>>> + * Negative value means no override.
>>> + *
>>> + * Defaults to all -1 (no override on any display).
>>> + */
>>> +#ifdef CONFIG_DRM_AMD_DC
>>> +int amdgpu_backlight_override_min[AMDGPU_DM_MAX_NUM_EDP] = {[0 ... 
>>> (AMDGPU_DM_MAX_NUM_EDP-1)] = -1};
>>> +MODULE_PARM_DESC(backlight_min, "Override minimum backlight brightness 
>>> signal (0..max-1, -1 = no override (default))");
>>> +module_param_array_named(backlight_min, amdgpu_backlight_override_min, 
>>> int, NULL, 0444);
>>> +#endif
>>> +
>>>  /**
>>>   * DOC: tmz (int)
>>>   * Trusted Memory Zone (TMZ) is a method to protect data being written
>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
>>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>> index eb4ce7216104..e2c36ba93d05 100644
>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
>>> +++ 

Re: [PATCH] drm/amdkfd: Fix error handling in criu_checkpoint

2022-11-02 Thread Felix Kuehling

On 2022-11-01 at 22:19, Bhardwaj, Rajneesh wrote:


On 11/1/2022 3:15 PM, Felix Kuehling wrote:

Checkpoint BOs last. That way we don't need to close dmabuf FDs if
something else fails later. This avoids problematic access to user mode
memory in the error handling code path.

criu_checkpoint_bos has its own error handling and cleanup that does not
depend on access to user memory.



This seems to be breaking the restore operation. I did a quick pytorch 
based test and I can confirm that restore operation fails with this 
change applied.


Ah yes, we need to restore things from the private data area in the same 
order that they were saved. I'll send an updated patch.
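
To illustrate why the order matters, here is a minimal user-space sketch. It is not the kfd code: the blob layout and the names priv_blob, priv_put and priv_get are hypothetical. It assumes sections are stored back to back without per-section headers, so the reader can only tell them apart by position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a checkpoint's private data area:
 * sections are appended back to back with no per-section header. */
struct priv_blob {
	uint8_t data[64];
	size_t len;	/* bytes written so far */
	size_t pos;	/* read cursor used while restoring */
};

/* Checkpoint side: append one section at the end of the blob. */
static int priv_put(struct priv_blob *b, const void *src, size_t n)
{
	assert(b && src);
	if (n > sizeof(b->data) - b->len)
		return -1;	/* blob full */
	memcpy(b->data + b->len, src, n);
	b->len += n;
	return 0;
}

/* Restore side: consume the next n bytes, whatever they are. */
static int priv_get(struct priv_blob *b, void *dst, size_t n)
{
	assert(b && dst);
	if (n > b->len - b->pos)
		return -1;	/* ran past the saved data */
	memcpy(dst, b->data + b->pos, n);
	b->pos += n;
	return 0;
}
```

With this layout, a restore path that calls priv_get() for the BO section before the queue section would silently receive the wrong bytes, which is why reordering only the save side can break restore.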


What's the cause for the call trace below? Is this a kernel oops or a 
warning? If it's an oops, it's concerning because it could be caused by 
a corrupted checkpoint as well.


Thanks,
  Felix




[  +0.03] CR2: 55b6726e0020 CR3: 0001283fe000 CR4: 
003 50ee0

[  +0.02] Call Trace:
[  +0.02]  
[  +0.03]  kfd_ioctl_criu+0xd4c/0x1930 [amdgpu]
[  +0.000185]  ? __might_fault+0x32/0x80
[  +0.04]  ? lock_release+0x1fd/0x2b0
[  +0.10]  kfd_ioctl+0x29b/0x600 [amdgpu]
[  +0.000153]  ? kfd_ioctl_get_tile_config+0x130/0x130 [amdgpu]
[  +0.000158]  __x64_sys_ioctl+0x8b/0xd0
[  +0.03]  ? lockdep_hardirqs_on+0x79/0x100
[  +0.07]  do_syscall_64+0x34/0x80
[  +0.04]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  +0.05] RIP: 0033:0x7f1c87e7f317
[  +0.02] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 
48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 
0f 05 <4 8> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 
01 48
[  +0.03] RSP: 002b:7fff630af518 EFLAGS: 0202 ORIG_RAX: 
00 10
[  +0.03] RAX: ffda RBX: 7f1c89351620 RCX: 
7f1c87e 7f317
[  +0.02] RDX: 7fff630af5c0 RSI: c0384b22 RDI: 
000 5
[  +0.02] RBP: 7fff630af550 R08:  R09: 
7f1c87e d7c10
[  +0.02] R10: 705f757067646d61 R11: 0202 R12: 
55a4a05 14c60
[  +0.02] R13: 55a49eb81540 R14: 55a49e90eea9 R15: 
7fff630 b069c

[  +0.10]  
[  +0.02] irq event stamp: 50181
[  +0.02] hardirqs last  enabled at (50187): [] __up_console_sem+0x52/0x60
[  +0.03] hardirqs last disabled at (50192): [] __up_console_sem+0x37/0x60
[  +0.03] softirqs last  enabled at (45940): [] sock_setsockopt+0x223/0xfa0
[  +0.03] softirqs last disabled at (45938): [] release_sock+0x19/0xa0

[  +0.04] ---[ end trace  ]---
[  +0.02] amdgpu: Could not allocate idr
[  +0.000245] amdgpu: Failed to restore CRIU ret:-12
[Nov 1 22:11] loop0: detected capacity change from 0 to 8

https://github.com/checkpoint-restore/criu/blob/criu-dev/plugins/amdgpu/amdgpu_plugin.c 
:



(00.093977) 11: Added GPU mapping [0xC093 -> 0xC093]
(00.093982) 11: ===Maps===
(00.093987) 11: GPU: 0xC093 -> 0xC093
(00.093992) 11: CPU: 00 -> 00
(00.093997) 11: ==
(00.094002) 11: Matched destination node 0xC093
(00.094007) 11: All nodes mapped successfully
(00.094012) 11: Matched nodes 0xC093 and after
(00.094017) 11: Maps after all nodes matched
(00.094022) 11: ===Maps===
(00.094027) 11: GPU: 0xC093 -> 0xC093
(00.094032) 11: CPU: 00 -> 00
(00.094037) 11: ==
(00.094041) 11: amdgpu_plugin: Restoring 1 devices
(00.094319) 11: amdgpu_plugin: amdgpu_plugin: passing drm render 
fd = 10 to driver

(00.094326) 11: amdgpu_plugin: Restore devices Ok (ret:0)
(00.094331) 11: amdgpu_plugin: Restoring 184 BOs
(00.094349) 11: amdgpu_plugin: Restore BOs Ok
(00.095791) 11: Error (amdgpu_plugin.c:1830): amdgpu_plugin: 
Restore ioctl failed: Cannot allocate memory
(00.095916) 11: Error (amdgpu_plugin.c:1850): amdgpu_plugin: 
amdgpu_plugin: Failed to restore (ret:-1)

(00.095951) 11: Error (criu/files-ext.c:53): Unable to restore 0x143
(00.095961) 11: Error (criu/files.c:1213): Unable to open fd=4 
id=0x143

(00.096078) Unlink remap /dev/shm/fvKoKz.cr.1.ghost
(00.096152) Error (criu/cr-restore.c:2531): Restoring FAILED.
(00.096181) amdgpu_plugin: amdgpu_plugin: finished  amdgpu_plugin 
(AMDGPU/KFD)

"restore.log" 4194L, 201090C



Fixes: be072b06c739 ("drm/amdkfd: CRIU export BOs as prime dmabuf 
objects")

Reported-by: Jann Horn 
CC: Rajneesh Bhardwaj 
Signed-off-by: Felix Kuehling 
---
  drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 27 +++-
  1 file changed, 8 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c

index 5feaba6a77de..aabab9010812 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1994,38 +1994,27 @@ static int criu_checkpoint(struct file *filep,
  if (ret)
  goto exit_unlock;
  -    ret = 

Re: [6.1][regression] after commit dd80d9c8eecac8c516da5b240d01a35660ba6cb6 some games (Cyberpunk 2077, Forza Horizon 4/5) hang at start

2022-11-02 Thread Christian König

On 02.11.22 at 14:36, Mikhail Gavrilov wrote:

On Tue, Nov 1, 2022 at 10:52 PM Christian König
 wrote:

Let's focus on one problem at a time.

The issue here is that somehow userptr handling became racy after we
removed the lock, but I don't see why.

We need to fix this ASAP since it is probably a much wider problem and
the additional lock just hides it somehow.

Going to provide you with an updated patch tomorrow.

Thanks,
Christian.

Recently sackboy has been updated and now the kernel log contains a
trace very similar to the one in the first post, even with the patch
applied.

[  155.948044] [ cut here ]
[  155.948164] WARNING: CPU: 3 PID: 4850 at
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:678
amdgpu_ttm_tt_get_user_pages+0x14c/0x190 [amdgpu]
[  155.948342] Modules linked in: uinput rfcomm snd_seq_dummy
snd_hrtimer nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast
nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink
qrtr bnep intel_rapl_msr intel_rapl_common snd_hda_codec_realtek
snd_sof_amd_renoir snd_sof_amd_acp snd_hda_codec_generic
snd_hda_codec_hdmi snd_sof_pci sunrpc binfmt_misc snd_sof
snd_hda_intel snd_sof_utils snd_intel_dspcfg mt7921e
snd_intel_sdw_acpi snd_hda_codec mt7921_common snd_soc_core
edac_mce_amd mt76_connac_lib btusb snd_hda_core snd_compress snd_hwdep
mt76 btrtl ac97_bus kvm_amd snd_pcm_dmaengine btbcm snd_rpl_pci_acp6x
snd_pci_acp6x btintel mac80211 btmtk snd_seq snd_seq_device kvm
snd_pcm snd_pci_acp5x libarc4 bluetooth irqbypass vfat snd_timer
snd_rn_pci_acp3x fat rapl snd_acp_config asus_nb_wmi snd cfg80211
snd_soc_acpi wmi_bmof k10temp pcspkr
[  155.948436]  snd_pci_acp3x i2c_piix4 soundcore asus_wireless
amd_pmc joydev zram amdgpu drm_ttm_helper ttm crct10dif_pclmul
hid_asus crc32_pclmul asus_wmi crc32c_intel iommu_v2 ledtrig_audio
polyval_clmulni gpu_sched sparse_keymap polyval_generic
platform_profile drm_buddy drm_display_helper nvme rfkill
ghash_clmulni_intel hid_multitouch ucsi_acpi sha512_ssse3 nvme_core
typec_ucsi serio_raw sp5100_tco r8169 ccp cec nvme_common typec
i2c_hid_acpi i2c_hid video wmi ip6_tables ip_tables fuse
[  155.948540] CPU: 3 PID: 4850 Comm: Sackboy-Win64-T Tainted: G
  WL---  ---
6.1.0-0.rc3.20221101git5aaef24b5c6d.29.fc38.x86_64 #1
[  155.948544] Hardware name: ASUSTeK COMPUTER INC. ROG Strix
G513QY_G513QY/G513QY, BIOS G513QY.318 03/29/2022
[  155.948547] RIP: 0010:amdgpu_ttm_tt_get_user_pages+0x14c/0x190 [amdgpu]
[  155.948748] Code: 9e f1 e9 32 ff ff ff 4c 89 e9 89 ea 48 c7 c6 a8
a3 fd c0 48 c7 c7 88 81 1e c1 e8 af 97 ea f1 eb 8e 66 90 bd f2 ff ff
ff eb 8d <0f> 0b eb f5 bd fd ff ff ff eb 82 bd f2 ff ff ff e9 62 ff ff
ff 48
[  155.948751] RSP: 0018:960b544d3a50 EFLAGS: 00010282
[  155.948756] RAX: 8a4e40d44e00 RBX: 8a4f0e564140 RCX: 0001
[  155.948759] RDX:  RSI: 8a4e40d44e00 RDI: 8a4f4b52b400
[  155.948761] RBP: 8a4e8c979000 R08: 0dc0 R09: 
[  155.948764] R10: 0001 R11:  R12: 8a4e8aaad558
[  155.948767] R13: 3b91 R14: 8a4f0e667180 R15: 8a4f4b52b458
[  155.948770] FS:  7fa13fe006c0() GS:8a5d16e0()
knlGS:36f8
[  155.948772] CS:  0010 DS:  ES:  CR0: 80050033
[  155.948775] CR2: 25c9e1d0 CR3: 00036199 CR4: 00750ee0
[  155.948778] PKRU: 5554
[  155.948780] Call Trace:
[  155.948783]  
[  155.948790]  amdgpu_cs_ioctl+0x9fd/0x2030 [amdgpu]
[  155.948992]  ? amdgpu_cs_find_mapping+0xe0/0xe0 [amdgpu]
[  155.949155]  drm_ioctl_kernel+0xac/0x160
[  155.949165]  drm_ioctl+0x1e7/0x450
[  155.949172]  ? amdgpu_cs_find_mapping+0xe0/0xe0 [amdgpu]
[  155.949344]  amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
[  155.949528]  __x64_sys_ioctl+0x90/0xd0
[  155.949537]  do_syscall_64+0x5b/0x80
[  155.949547]  ? lock_is_held_type+0xe8/0x140
[  155.949559]  ? do_syscall_64+0x67/0x80
[  155.949565]  ? lockdep_hardirqs_on+0x7d/0x100
[  155.949573]  ? do_syscall_64+0x67/0x80
[  155.949579]  ? do_syscall_64+0x67/0x80
[  155.949586]  ? do_syscall_64+0x67/0x80
[  155.949592]  ? lockdep_hardirqs_on+0x7d/0x100
[  155.949597]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  155.949603] RIP: 0033:0x7fa1b7fd912f
[  155.949610] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24
10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00
00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28
00 00
[  155.949615] RSP: 002b:7fa13fdfe920 EFLAGS: 0246 ORIG_RAX:
0010
[  155.949621] RAX: ffda RBX: 7fa13fdfebe8 RCX: 7fa1b7fd912f
[  155.949625] RDX: 7fa13fdfea10 RSI: c0186444 RDI: 0165
[  155.949629] RBP: 7fa13fdfea10 R08: 7f9ff80018e0 R09: 7fa13fdfe9c0
[  155.949633] R10: 7eb11590 R11: 

Re: [6.1][regression] after commit dd80d9c8eecac8c516da5b240d01a35660ba6cb6 some games (Cyberpunk 2077, Forza Horizon 4/5) hang at start

2022-11-02 Thread Mikhail Gavrilov
On Tue, Nov 1, 2022 at 10:52 PM Christian König
 wrote:
>
> Let's focus on one problem at a time.
>
> The issue here is that somehow userptr handling became racy after we
> removed the lock, but I don't see why.
>
> We need to fix this ASAP since it is probably a much wider problem and
> the additional lock just hides it somehow.
>
> Going to provide you with an updated patch tomorrow.
>
> Thanks,
> Christian.

Recently sackboy has been updated and now the kernel log contains a
trace very similar to the one in the first post, even with the patch
applied.

[  155.948044] [ cut here ]
[  155.948164] WARNING: CPU: 3 PID: 4850 at
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:678
amdgpu_ttm_tt_get_user_pages+0x14c/0x190 [amdgpu]
[  155.948342] Modules linked in: uinput rfcomm snd_seq_dummy
snd_hrtimer nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast
nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink
qrtr bnep intel_rapl_msr intel_rapl_common snd_hda_codec_realtek
snd_sof_amd_renoir snd_sof_amd_acp snd_hda_codec_generic
snd_hda_codec_hdmi snd_sof_pci sunrpc binfmt_misc snd_sof
snd_hda_intel snd_sof_utils snd_intel_dspcfg mt7921e
snd_intel_sdw_acpi snd_hda_codec mt7921_common snd_soc_core
edac_mce_amd mt76_connac_lib btusb snd_hda_core snd_compress snd_hwdep
mt76 btrtl ac97_bus kvm_amd snd_pcm_dmaengine btbcm snd_rpl_pci_acp6x
snd_pci_acp6x btintel mac80211 btmtk snd_seq snd_seq_device kvm
snd_pcm snd_pci_acp5x libarc4 bluetooth irqbypass vfat snd_timer
snd_rn_pci_acp3x fat rapl snd_acp_config asus_nb_wmi snd cfg80211
snd_soc_acpi wmi_bmof k10temp pcspkr
[  155.948436]  snd_pci_acp3x i2c_piix4 soundcore asus_wireless
amd_pmc joydev zram amdgpu drm_ttm_helper ttm crct10dif_pclmul
hid_asus crc32_pclmul asus_wmi crc32c_intel iommu_v2 ledtrig_audio
polyval_clmulni gpu_sched sparse_keymap polyval_generic
platform_profile drm_buddy drm_display_helper nvme rfkill
ghash_clmulni_intel hid_multitouch ucsi_acpi sha512_ssse3 nvme_core
typec_ucsi serio_raw sp5100_tco r8169 ccp cec nvme_common typec
i2c_hid_acpi i2c_hid video wmi ip6_tables ip_tables fuse
[  155.948540] CPU: 3 PID: 4850 Comm: Sackboy-Win64-T Tainted: G
 WL---  ---
6.1.0-0.rc3.20221101git5aaef24b5c6d.29.fc38.x86_64 #1
[  155.948544] Hardware name: ASUSTeK COMPUTER INC. ROG Strix
G513QY_G513QY/G513QY, BIOS G513QY.318 03/29/2022
[  155.948547] RIP: 0010:amdgpu_ttm_tt_get_user_pages+0x14c/0x190 [amdgpu]
[  155.948748] Code: 9e f1 e9 32 ff ff ff 4c 89 e9 89 ea 48 c7 c6 a8
a3 fd c0 48 c7 c7 88 81 1e c1 e8 af 97 ea f1 eb 8e 66 90 bd f2 ff ff
ff eb 8d <0f> 0b eb f5 bd fd ff ff ff eb 82 bd f2 ff ff ff e9 62 ff ff
ff 48
[  155.948751] RSP: 0018:960b544d3a50 EFLAGS: 00010282
[  155.948756] RAX: 8a4e40d44e00 RBX: 8a4f0e564140 RCX: 0001
[  155.948759] RDX:  RSI: 8a4e40d44e00 RDI: 8a4f4b52b400
[  155.948761] RBP: 8a4e8c979000 R08: 0dc0 R09: 
[  155.948764] R10: 0001 R11:  R12: 8a4e8aaad558
[  155.948767] R13: 3b91 R14: 8a4f0e667180 R15: 8a4f4b52b458
[  155.948770] FS:  7fa13fe006c0() GS:8a5d16e0()
knlGS:36f8
[  155.948772] CS:  0010 DS:  ES:  CR0: 80050033
[  155.948775] CR2: 25c9e1d0 CR3: 00036199 CR4: 00750ee0
[  155.948778] PKRU: 5554
[  155.948780] Call Trace:
[  155.948783]  
[  155.948790]  amdgpu_cs_ioctl+0x9fd/0x2030 [amdgpu]
[  155.948992]  ? amdgpu_cs_find_mapping+0xe0/0xe0 [amdgpu]
[  155.949155]  drm_ioctl_kernel+0xac/0x160
[  155.949165]  drm_ioctl+0x1e7/0x450
[  155.949172]  ? amdgpu_cs_find_mapping+0xe0/0xe0 [amdgpu]
[  155.949344]  amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
[  155.949528]  __x64_sys_ioctl+0x90/0xd0
[  155.949537]  do_syscall_64+0x5b/0x80
[  155.949547]  ? lock_is_held_type+0xe8/0x140
[  155.949559]  ? do_syscall_64+0x67/0x80
[  155.949565]  ? lockdep_hardirqs_on+0x7d/0x100
[  155.949573]  ? do_syscall_64+0x67/0x80
[  155.949579]  ? do_syscall_64+0x67/0x80
[  155.949586]  ? do_syscall_64+0x67/0x80
[  155.949592]  ? lockdep_hardirqs_on+0x7d/0x100
[  155.949597]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  155.949603] RIP: 0033:0x7fa1b7fd912f
[  155.949610] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24
10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00
00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28
00 00
[  155.949615] RSP: 002b:7fa13fdfe920 EFLAGS: 0246 ORIG_RAX:
0010
[  155.949621] RAX: ffda RBX: 7fa13fdfebe8 RCX: 7fa1b7fd912f
[  155.949625] RDX: 7fa13fdfea10 RSI: c0186444 RDI: 0165
[  155.949629] RBP: 7fa13fdfea10 R08: 7f9ff80018e0 R09: 7fa13fdfe9c0
[  155.949633] R10: 7eb11590 R11: 0246 R12: c0186444
[  

Re: [RESEND PATCH] drm/amd/amdgpu: Replace kmap() with kmap_local_page()

2022-11-02 Thread Alex Deucher
On Tue, Nov 1, 2022 at 7:21 PM Fabio M. De Francesco
 wrote:
>
> On Monday, 17 October 2022 18:53:24 CET Alex Deucher wrote:
> > Applied.  Thanks!
> >
>
> The same report about which I just wrote in my previous email to you is also
> referring to this patch which later changed status to "Not Applicable".
>
> It points to https://patchwork.linuxtv.org/project/linux-media/patch/
> 20220812175753.22926-1-fmdefrance...@gmail.com/
>
> Can you please let me understand why?

I'm not sure I understand what you are asking.  The patch is applied:
https://gitlab.freedesktop.org/agd5f/linux/-/commit/a2c554262d39f81be7422fd8bee2f2fe3779f7f5

Alex

>
> Thanks,
>
> Fabio
>
>
>


Re: [PATCH 4/5] drm/amdgpu: MCBP based on DRM scheduler (v8)

2022-11-02 Thread Michel Dänzer


[ Dropping Andrey's no longer working address from Cc ]

On 2022-11-01 11:09, Michel Dänzer wrote:
> On 2022-11-01 10:58, Zhu, Jiadong wrote:
>>
>>> Patch 3 assigns preempt_ib in gfx_v9_0_sw_ring_funcs_gfx, but not in 
>>> gfx_v9_0_ring_funcs_gfx. mux->real_ring in amdgpu_mcbp_trigger_preempt 
>>> presumably uses the latter, which would explain why amdgpu_ring_preempt_ib 
>>> ends up dereferencing a NULL pointer.
>>
>> It's weird the assignment should be in gfx_v9_0_ring_funcs_gfx instead of 
>> gfx_v9_0_sw_ring_funcs_gfx.
>>
>> [PATCH 3/5] drm/amdgpu: Modify unmap_queue format for gfx9 (v4):
>> @@ -6925,6 +7047,7 @@ static const struct amdgpu_ring_funcs 
>> gfx_v9_0_ring_funcs_gfx = {
>> .emit_cntxcntl = gfx_v9_ring_emit_cntxcntl,
>> .init_cond_exec = gfx_v9_0_ring_emit_init_cond_exec,
>> .patch_cond_exec = gfx_v9_0_ring_emit_patch_cond_exec,
>> +   .preempt_ib = gfx_v9_0_ring_preempt_ib,
>> .emit_frame_cntl = gfx_v9_0_ring_emit_frame_cntl,
>> .emit_wreg = gfx_v9_0_ring_emit_wreg,
>> .emit_reg_wait = gfx_v9_0_ring_emit_reg_wait,
>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15d.h 
>> b/drivers/gpu/drm/amd/amdgpu/soc15d.h
> 
> Ah! Looks like stg applied patch 3 incorrectly for me. :(
> 
> I'll try and test with this fixed this week, and report back.

I'm now running with patch 3 applied correctly, and with patch 5 as well.


The good news is that I'm now seeing a positive effect with GpuTest benchmarks 
which are GPU-limited at low frame rates. In particular, with the pixmark piano 
benchmark, the GNOME Wayland session now actually stays more responsive on this 
machine than it does on my work laptop with an Intel iGPU. However, with the 
plot3d benchmark (with /plot3d_vertex_density=1750 on the command line to 
increase GPU load), it still doesn't quite manage to keep the desktop running 
at full frame rate, in contrast to the Intel iGPU.

The bad news is that this series still makes some things very slow. The most 
extreme examples so far are glxgears (runs at ~400 fps now, ~7000 fps before, 
i.e. almost 20x slowdown) and hexchat (scrolling one page now takes ~1 second, 
I can see it drawing line by line; before it was almost instantaneous). I 
suspect this series makes the overhead of running a single GPU job much bigger. 
On the bright side, I'm not noticing any significant intermittent freezes 
anymore.


In summary, while the benefits are promising, the downsides are unacceptable 
for enabling this by default.


-- 
Earthling Michel Dänzer|  https://redhat.com
Libre software enthusiast  | Mesa and Xwayland developer



Re: [PATCH v2 17/21] drm/fb-helper: Perform all fbdev I/O with the same implementation

2022-11-02 Thread Javier Martinez Canillas
On 11/2/22 11:33, Thomas Zimmermann wrote:

[...]

>>
>>> +static ssize_t __drm_fb_helper_write(struct fb_info *info, const char 
>>> __user *buf, size_t count,
>>> +loff_t *ppos, drm_fb_helper_write_screen 
>>> write_screen)
>>> +{
>>
>> [...]
>>
>>> +   /*
>>> +* Copy to framebuffer even if we already logged an error. Emulates
>>> +* the behavior of the original fbdev implementation.
>>> +*/
>>> +   ret = write_screen(info, buf, count, pos);
>>> +   if (ret < 0)
>>> +   return ret; /* return last error, if any */
>>> +   else if (!ret)
>>> +   return err; /* return previous error, if any */
>>> +
>>> +   *ppos += ret;
>>> +
>>
>> Should *ppos be incremented even if the previous error is returned?
> 
> Yes. It emulates the original fbdev code at [1]. Further down in that 
> function, the position is being updated even if an error occurred. We 
> only return the initial error if no bytes got written.
> 
> It could happen that some userspace program hits an error, but still 
> relies on the output and position being updated. IIRC I even added 
> validation of this behavior to the IGT fbdev tests.  I agree that this 
> is somewhat bogus behavior, but changing it would change long-standing 
> userspace semantics.
>

Thanks for the explanation, feel free then to also add to this patch:

Reviewed-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH v2 17/21] drm/fb-helper: Perform all fbdev I/O with the same implementation

2022-11-02 Thread Thomas Zimmermann

Hi

On 02.11.22 at 10:32, Javier Martinez Canillas wrote:

On 10/24/22 13:19, Thomas Zimmermann wrote:

Implement the fbdev's read/write helpers with the same functions. Use
the generic fbdev's code as template. Convert all drivers.

DRM's fb helpers must implement regular I/O functionality in struct
fb_ops and possibly perform a damage update. Handle all this in the
same functions and convert drivers. The functionality has been used
as part of the generic fbdev code for some time. The drivers don't
set struct drm_fb_helper.fb_dirty, so they will not be affected by
damage handling.

For I/O memory, fb helpers now provide drm_fb_helper_cfb_read() and
drm_fb_helper_cfb_write(). Several drivers require these. Until now
tegra used I/O read and write, although the memory buffer appears to
be in system memory. So use _sys_ helpers now.

Signed-off-by: Thomas Zimmermann 
---


[...]


+static ssize_t __drm_fb_helper_write(struct fb_info *info, const char __user 
*buf, size_t count,
+loff_t *ppos, drm_fb_helper_write_screen 
write_screen)
+{


[...]


+   /*
+* Copy to framebuffer even if we already logged an error. Emulates
+* the behavior of the original fbdev implementation.
+*/
+   ret = write_screen(info, buf, count, pos);
+   if (ret < 0)
+   return ret; /* return last error, if any */
+   else if (!ret)
+   return err; /* return previous error, if any */
+
+   *ppos += ret;
+


Should *ppos be incremented even if the previous error is returned?


Yes. It emulates the original fbdev code at [1]. Further down in that 
function, the position is being updated even if an error occurred. We 
only return the initial error if no bytes got written.


It could happen that some userspace program hits an error, but still 
relies on the output and position being updated. IIRC I even added 
validation of this behavior to the IGT fbdev tests.  I agree that this 
is somewhat bogus behavior, but changing it would change long-standing 
userspace semantics.


[1] 
https://elixir.bootlin.com/linux/v6.0.6/source/drivers/video/fbdev/core/fbmem.c#L825
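
The return-value rules described above can be sketched in user space as follows. This is only an illustration under stated assumptions, not the helper from the patch: finish_write, write_screen_fn and the three stub backends are hypothetical names, and err stands for whatever error was noted before the copy started.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical callback type: returns bytes written, 0 if nothing was
 * written, or a negative errno. */
typedef ssize_t (*write_screen_fn)(const char *buf, size_t count, long pos);

/* Three hypothetical backends used only to exercise the logic. */
static ssize_t stub_write_some(const char *buf, size_t count, long pos)
{
	(void)buf; (void)pos;
	return count < 4 ? (ssize_t)count : 4;	/* short write, at most 4 bytes */
}

static ssize_t stub_write_none(const char *buf, size_t count, long pos)
{
	(void)buf; (void)count; (void)pos;
	return 0;				/* nothing written */
}

static ssize_t stub_write_fail(const char *buf, size_t count, long pos)
{
	(void)buf; (void)count; (void)pos;
	return -EIO;				/* the copy itself failed */
}

/* err is an error noticed before the copy (or 0 if there was none). */
static ssize_t finish_write(write_screen_fn write_screen, const char *buf,
			    size_t count, long *ppos, int err)
{
	ssize_t ret;

	assert(write_screen && ppos);

	ret = write_screen(buf, count, *ppos);
	if (ret < 0)
		return ret;	/* return last error, if any */
	if (ret == 0)
		return err;	/* nothing written: return previous error */

	*ppos += ret;		/* a partial write still advances the offset */
	return ret;
}
```

Note that in the partial-write case the earlier error is dropped and the caller only sees the byte count, which matches the semantics described in the paragraph above.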




The write_screen() succeeded anyways, even when the count written was
smaller than what the caller asked for.


  /**
- * drm_fb_helper_sys_read - wrapper around fb_sys_read
+ * drm_fb_helper_sys_read - Implements struct &fb_ops.fb_read for system memory
   * @info: fb_info struct pointer
   * @buf: userspace buffer to read from framebuffer memory
   * @count: number of bytes to read from framebuffer memory
   * @ppos: read offset within framebuffer memory
   *
- * A wrapper around fb_sys_read implemented by fbdev core
+ * Returns:
+ * The number of read bytes on success, or an error code otherwise.
   */


This sentence sounds a little bit off to me. Shouldn't it be "number of bytes read"
instead? I'm not a native English speaker though, so feel free to just ignore 
me.


You're right.



[...]

  
+static ssize_t fb_read_screen_base(struct fb_info *info, char __user *buf, size_t count,

+  loff_t pos)
+{
+   const char __iomem *src = info->screen_base + pos;
+   size_t alloc_size = min_t(size_t, count, PAGE_SIZE);
+   ssize_t ret = 0;
+   int err = 0;


Do you really need these two? AFAIK ssize_t is a signed type


I think so. We'll go through the while loop multiple times. If we fail 
on the initial iteration, we return the error in err. If we fail on any 
later iteration, we return the number of processed bytes.  Having this 
in two variables simplifies the logic AFAICT.
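
A minimal user-space sketch of that two-variable pattern, assuming a loop that copies in fixed-size chunks. The names copy_in_chunks and copy_chunk, the CHUNK size and the fail_at parameter are hypothetical and exist only to make the behavior testable; they are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define CHUNK 4	/* stand-in for PAGE_SIZE */

/* Hypothetical copy step that fails once fail_at bytes have been copied
 * (fail_at == (size_t)-1 means it never fails). */
static int copy_chunk(char *dst, const char *src, size_t n,
		      size_t done, size_t fail_at)
{
	if (done >= fail_at)
		return -EFAULT;
	memcpy(dst, src, n);
	return 0;
}

static ssize_t copy_in_chunks(char *dst, const char *src, size_t count,
			      size_t fail_at)
{
	ssize_t ret = 0;	/* bytes processed so far */
	int err = 0;		/* error from the failing iteration, if any */

	assert(dst && src);

	while (count) {
		size_t c = count < CHUNK ? count : CHUNK;

		err = copy_chunk(dst, src, c, (size_t)ret, fail_at);
		if (err)
			break;
		dst += c;
		src += c;
		ret += c;
		count -= c;
	}

	/* Report the error only if nothing was processed at all. */
	return ret ? ret : err;
}
```

Folding both into ret would require remembering, at the point of failure, whether any bytes had already been counted; keeping err separate leaves the final decision to a single expression.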


Best regards
Thomas


so you can just use the ret variable to store and return the
errno value.

[...]


+static ssize_t fb_write_screen_base(struct fb_info *info, const char __user 
*buf, size_t count,
+   loff_t pos)
+{
+   char __iomem *dst = info->screen_base + pos;
+   size_t alloc_size = min_t(size_t, count, PAGE_SIZE);
+   ssize_t ret = 0;
+   int err = 0;


Same here.



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev




[PATCH] drm/amd/amdgpu: skip ras late init if it is not supported

2022-11-02 Thread Kenneth Feng
skip ras late init on gc_11_0_3 if it is not supported,
in order to prevent the hardware init exception.

Signed-off-by: Kenneth Feng 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index 84a76c36d9a7..afe1fadc1e9d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -4707,7 +4707,7 @@ static int gfx_v11_0_late_init(void *handle)
if (r)
return r;
 
-   if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(11, 0, 3)) {
+   if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(11, 0, 3) && 
adev->ras_enabled) {
r = gfx_v11_0_ras_late_init(handle);
if (r)
return r;
-- 
2.25.1



Re: [PATCH v2 21/21] drm/fb-helper: Remove unnecessary include statements

2022-11-02 Thread Javier Martinez Canillas
On 10/24/22 13:19, Thomas Zimmermann wrote:
> Remove include statements for  where it is not
> required (i.e., most of them). In a few places include other header
> files that are required by the source code.
> 
> Signed-off-by: Thomas Zimmermann 
> ---

Reviewed-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH v2 20/21] drm/fb-helper: Move generic fbdev emulation into separate source file

2022-11-02 Thread Javier Martinez Canillas
On 10/24/22 13:19, Thomas Zimmermann wrote:
> Move the generic fbdev implementation into its own source and header
> file. Adapt drivers. No functional changes, but some of the internal
> helpers have been renamed to fit into the drm_fbdev_ naming scheme.
> 
> Signed-off-by: Thomas Zimmermann 
> ---
Reviewed-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH v2 19/21] drm/fb-helper: Always initialize generic fbdev emulation

2022-11-02 Thread Javier Martinez Canillas
On 10/24/22 13:19, Thomas Zimmermann wrote:
> Initialize the generic fbdev emulation even if it has been disabled
> on the kernel command line. The hotplug and mode initialization will
> fail accordingly.
> 
> The kernel parameter can still be changed at runtime and the emulation
> will initialize after hotplugging the connector.
> 
> Signed-off-by: Thomas Zimmermann 
> ---

Reviewed-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH v2 18/21] drm/fb_helper: Minimize damage-helper overhead

2022-11-02 Thread Javier Martinez Canillas
On 10/24/22 13:19, Thomas Zimmermann wrote:
> Pull the test for fb_dirty into the caller to avoid extra work
> if no callback has been set. In this case no damage handling is
> required and no damage area needs to be computed. Print a warning
> if the damage worker runs without getting an fb_dirty callback.
> 
> Signed-off-by: Thomas Zimmermann 
> ---

Reviewed-by: Javier Martinez Canillas 

But I've a trivial comment below:

>  drivers/gpu/drm/drm_fb_helper.c | 90 ++---
>  1 file changed, 60 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
> index 836523aef6a27..fbc5c5445fdb0 100644
> --- a/drivers/gpu/drm/drm_fb_helper.c
> +++ b/drivers/gpu/drm/drm_fb_helper.c
> @@ -449,12 +449,13 @@ static int drm_fb_helper_damage_blit(struct 
> drm_fb_helper *fb_helper,
>  static void drm_fb_helper_damage_work(struct work_struct *work)
>  {
>   struct drm_fb_helper *helper = container_of(work, struct drm_fb_helper, 
> damage_work);
> + struct drm_device *dev = helper->dev;

You removed this in patch #15, maybe just leaving it in that patch if you
plan to use it again here?

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH v2 17/21] drm/fb-helper: Perform all fbdev I/O with the same implementation

2022-11-02 Thread Javier Martinez Canillas
On 10/24/22 13:19, Thomas Zimmermann wrote:
> Implement the fbdev's read/write helpers with the same functions. Use
> the generic fbdev's code as template. Convert all drivers.
> 
> DRM's fb helpers must implement regular I/O functionality in struct
> fb_ops and possibly perform a damage update. Handle all this in the
> same functions and convert drivers. The functionality has been used
> as part of the generic fbdev code for some time. The drivers don't
> set struct drm_fb_helper.fb_dirty, so they will not be affected by
> damage handling.
> 
> For I/O memory, fb helpers now provide drm_fb_helper_cfb_read() and
> drm_fb_helper_cfb_write(). Several drivers require these. Until now
> tegra used I/O read and write, although the memory buffer appears to
> be in system memory. So use _sys_ helpers now.
> 
> Signed-off-by: Thomas Zimmermann 
> ---

[...]

> +static ssize_t __drm_fb_helper_write(struct fb_info *info, const char __user 
> *buf, size_t count,
> +  loff_t *ppos, drm_fb_helper_write_screen 
> write_screen)
> +{

[...]

> + /*
> +  * Copy to framebuffer even if we already logged an error. Emulates
> +  * the behavior of the original fbdev implementation.
> +  */
> + ret = write_screen(info, buf, count, pos);
> + if (ret < 0)
> + return ret; /* return last error, if any */
> + else if (!ret)
> + return err; /* return previous error, if any */
> +
> + *ppos += ret;
> +

Should *ppos be incremented even if the previous error is returned?

The write_screen() succeeded anyways, even when the count written was
smaller than what the caller asked for.

>  /**
> - * drm_fb_helper_sys_read - wrapper around fb_sys_read
> + * drm_fb_helper_sys_read - Implements struct &fb_ops.fb_read for system 
> memory
>   * @info: fb_info struct pointer
>   * @buf: userspace buffer to read from framebuffer memory
>   * @count: number of bytes to read from framebuffer memory
>   * @ppos: read offset within framebuffer memory
>   *
> - * A wrapper around fb_sys_read implemented by fbdev core
> + * Returns:
> + * The number of read bytes on success, or an error code otherwise.
>   */

This sentence sounds a little bit off to me. Shouldn't it be "number of bytes read"
instead? I'm not a native English speaker though, so feel free to just ignore 
me.

[...]

>  
> +static ssize_t fb_read_screen_base(struct fb_info *info, char __user *buf, 
> size_t count,
> +loff_t pos)
> +{
> + const char __iomem *src = info->screen_base + pos;
> + size_t alloc_size = min_t(size_t, count, PAGE_SIZE);
> + ssize_t ret = 0;
> + int err = 0;

Do you really need these two? AFAIK ssize_t is a signed type
so you can just use the ret variable to store and return the
errno value.

[...]

> +static ssize_t fb_write_screen_base(struct fb_info *info, const char __user 
> *buf, size_t count,
> + loff_t pos)
> +{
> + char __iomem *dst = info->screen_base + pos;
> + size_t alloc_size = min_t(size_t, count, PAGE_SIZE);
> + ssize_t ret = 0;
> + int err = 0;

Same here.

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH v2 16/21] drm/fb-helper: Call fb_sync in I/O functions

2022-11-02 Thread Javier Martinez Canillas
On 10/24/22 13:19, Thomas Zimmermann wrote:
> Call struct fb_ops.fb_sync in drm_fbdev_{read,write}() to mimic the
> behavior of fbdev. Fbdev implementations of fb_read and fb_write in
> struct fb_ops invoke fb_sync to synchronize with outstanding operations
> before I/O. Doing the same in DRM implementations will allow us to use
> them throughout DRM drivers.
> 
> Signed-off-by: Thomas Zimmermann 
> ---

Reviewed-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH v2 15/21] drm/fb-helper: Disconnect damage worker from update logic

2022-11-02 Thread Javier Martinez Canillas
On 10/24/22 13:19, Thomas Zimmermann wrote:
> The fbdev helpers implement a damage worker that forwards fbdev
> updates to the DRM driver. The worker's update logic depends on
> the generic fbdev emulation. Separate the two via function pointer.
> 
> The generic fbdev emulation sets struct drm_fb_helper_funcs.fb_dirty,
> a new callback that hides the update logic from the damage worker.
> It's not possible to use the generic logic with other fbdev emulation,
> because it contains additional code for the shadow buffering that
> the generic emulation employs.
> 
> DRM drivers with internal fbdev emulation can set fb_dirty to their
> own implementation if they require damage handling; although no such
> drivers currently exist.
> 
> Signed-off-by: Thomas Zimmermann 
> ---

[...]

>  static void drm_fb_helper_damage_work(struct work_struct *work)
>  {
> - struct drm_fb_helper *helper = container_of(work, struct drm_fb_helper,
> - damage_work);
> - struct drm_device *dev = helper->dev;
> + struct drm_fb_helper *helper = container_of(work, struct drm_fb_helper, damage_work);

This line is an unrelated code style change. But I guess it's OK.

Reviewed-by: Javier Martinez Canillas 
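
As an illustration of the split described in the commit message, here is a small user-space sketch in which the worker only collects and resets the damage rectangle, while the update itself goes through an optional callback. All demo_* names are hypothetical; the callback merely echoes the role of drm_fb_helper_funcs.fb_dirty and is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct demo_helper;

/* Hypothetical damage rectangle; empty when x1 >= x2 or y1 >= y2. */
struct demo_clip { int x1, y1, x2, y2; };

struct demo_helper_funcs {
	/* Optional update logic supplied by the fbdev emulation. */
	int (*fb_dirty)(struct demo_helper *helper, const struct demo_clip *clip);
};

struct demo_helper {
	const struct demo_helper_funcs *funcs;
	struct demo_clip damage;	/* accumulated damage */
	int flushed;			/* pixels flushed so far (demo only) */
};

/* Worker: snapshot and reset the damage, then delegate the update. */
int demo_damage_work(struct demo_helper *helper)
{
	struct demo_clip clip = helper->damage;

	/* Reset to an empty rectangle before flushing. */
	helper->damage = (struct demo_clip){ .x1 = 1, .y1 = 1, .x2 = 0, .y2 = 0 };

	if (clip.x1 >= clip.x2 || clip.y1 >= clip.y2)
		return 0;	/* nothing to update */
	if (!helper->funcs || !helper->funcs->fb_dirty)
		return 0;	/* no update logic registered */

	return helper->funcs->fb_dirty(helper, &clip);
}

/* Example callback standing in for the generic emulation's update logic. */
int demo_generic_dirty(struct demo_helper *helper, const struct demo_clip *clip)
{
	helper->flushed += (clip->x2 - clip->x1) * (clip->y2 - clip->y1);
	return 0;
}
```

The worker never needs to know about shadow buffering or any other emulation detail; a driver with its own emulation would simply point fb_dirty at a different function.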

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [PATCH v2] [next] drm/radeon: Replace one-element array with flexible-array member

2022-11-02 Thread Kees Cook
On Tue, Nov 01, 2022 at 06:09:16PM -0400, Alex Deucher wrote:
> On Tue, Nov 1, 2022 at 5:54 PM Kees Cook  wrote:
> > Does the ROM always only have a single byte there? This seems unlikely
> > given the member "ucFakeEDIDLength" (and the code below).
> 
> I'm not sure.  I'm mostly concerned about this:
>
> record += fake_edid_record->ucFakeEDIDLength ?
>   fake_edid_record->ucFakeEDIDLength + 2 :
>   sizeof(ATOM_FAKE_EDID_PATCH_RECORD);

But this is exactly what the code currently does, as noted in the commit
log: "It's worth mentioning that doing a build before/after this patch
results in no binary output differences."

> Presumably the record should only exist if ucFakeEDIDLength is non 0,
> but I don't know if there are some OEMs out there that just included
> an empty record for some reason.  Maybe the code is wrong today and
> there are some OEMs that include it and the array is already size 0.
> In that case, Paulo's original patches are probably more correct.

Right, but if true, that seems to be a distinctly separate bug fix?
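
To make the size arithmetic concrete, a small sketch with hypothetical structs (rec_one_elem and rec_flex are stand-ins, not the ATOM definitions). It assumes a two-byte header followed by the data, and shows one way the "empty record" step can stay at three bytes once the one-element array becomes a flexible-array member; whether the actual patch does it this way is not something this sketch claims.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: trailing one-element array. */
struct rec_one_elem {
	uint8_t type;
	uint8_t len;
	uint8_t data[1];
};

/* Hypothetical new layout: flexible-array member. */
struct rec_flex {
	uint8_t type;
	uint8_t len;
	uint8_t data[];
};

/* Bytes to skip to reach the next record, old layout. */
size_t advance_one_elem(const struct rec_one_elem *r)
{
	return r->len ? (size_t)r->len + 2 : sizeof(struct rec_one_elem);
}

/*
 * Same step for the flexible layout. sizeof() no longer counts the
 * trailing byte, so the empty-record case adds 1 to keep the old stride.
 */
size_t advance_flex(const struct rec_flex *r)
{
	return r->len ? offsetof(struct rec_flex, data) + r->len
		      : sizeof(struct rec_flex) + 1;
}
```

Here sizeof(struct rec_one_elem) is 3 and sizeof(struct rec_flex) is 2, so both helpers step by 3 for an empty record and by len + 2 otherwise. If empty records really should advance by only 2 bytes, that would be a behaviour change and, as said above, a separate fix.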

-- 
Kees Cook


Re: [PATCH] drm/radeon: Replace kmap() with kmap_local_page()

2022-11-02 Thread Kees Cook
On Wed, Nov 02, 2022 at 12:11:53AM +0100, Fabio M. De Francesco wrote:
> On Monday, 17 October 2022 18:52:10 CET Alex Deucher wrote:
> > Applied.  Thanks!
> 
> Many thanks to you!
> 
> However, about a week ago, I received a report saying that this patch is
> "Not Applicable".
> 
> That email also referred to another patch, which I'll address in its
> own thread.
> 
> That report links to
> https://patchwork.linuxtv.org/project/linux-media/patch/20221013210714.16320-1-fmdefrance...@gmail.com/
> 
> Can you please help me understand why, despite having been applied, this
> patch's "State" later changed to "Not Applicable"?

The kernel has multiple patchwork instances, so you got an "N/A" from
linux-media, but it was applied to the drm tree. (Yes, confusing. :P)

-- 
Kees Cook


Re: [RESEND PATCH] drm/amd/amdgpu: Replace kmap() with kmap_local_page()

2022-11-02 Thread Fabio M. De Francesco
On Monday, 17 October 2022 18:53:24 CET Alex Deucher wrote:
> Applied.  Thanks!
> 

The report I mentioned in my previous email to you also refers to this
patch, whose status later changed to "Not Applicable".

It points to
https://patchwork.linuxtv.org/project/linux-media/patch/20220812175753.22926-1-fmdefrance...@gmail.com/

Can you please help me understand why?

Thanks,

Fabio





Re: [PATCH] drm/radeon: Replace kmap() with kmap_local_page()

2022-11-02 Thread Fabio M. De Francesco
On Monday, 17 October 2022 18:52:10 CET Alex Deucher wrote:
> Applied.  Thanks!

Many thanks to you!

However, about a week ago, I received a report saying that this patch is
"Not Applicable".

That email also referred to another patch, which I'll address in its
own thread.

That report links to
https://patchwork.linuxtv.org/project/linux-media/patch/20221013210714.16320-1-fmdefrance...@gmail.com/

Can you please help me understand why, despite having been applied, this
patch's "State" later changed to "Not Applicable"?

Thanks,

Fabio