[PATCH 2/2] drm/probe_helper: warning on poll_enabled for issue catching

2023-03-08 Thread Guchun Chen
Add a warning so that issues in other drivers are caught and the proper
call sequence of the polling functions is ensured.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2411
Fixes: a4e771729a51 ("drm/probe_helper: sort out poll_running vs poll_enabled")
Reported-by: Bert Karwatzki 
Suggested-by: Dmitry Baryshkov 
Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/drm_probe_helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/drm_probe_helper.c 
b/drivers/gpu/drm/drm_probe_helper.c
index 8127be134c39..85e0e80d4a52 100644
--- a/drivers/gpu/drm/drm_probe_helper.c
+++ b/drivers/gpu/drm/drm_probe_helper.c
@@ -852,6 +852,8 @@ EXPORT_SYMBOL(drm_kms_helper_is_poll_worker);
  */
 void drm_kms_helper_poll_disable(struct drm_device *dev)
 {
+   WARN_ON(!dev->mode_config.poll_enabled);
+
if (dev->mode_config.poll_running)
drm_kms_helper_disable_hpd(dev);
 
-- 
2.25.1
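
The check added above can be illustrated outside the kernel. The following is a minimal user-space sketch, not the DRM helper code: every mock_* name is hypothetical, and mock_warn_on only approximates WARN_ON by counting hits and printing to stderr. It shows how a disable call made without a prior init gets flagged.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the two polling flags in the mode config. */
struct mock_mode_config {
	bool poll_enabled;  /* set once the (mock) poll init has run */
	bool poll_running;  /* set while polling is active */
};

static int mock_warn_count; /* number of warnings raised so far */

/* Simplified WARN_ON: count and report the hit, then return the condition. */
static bool mock_warn_on(bool cond, const char *expr)
{
	if (cond) {
		mock_warn_count++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}

static void mock_poll_init(struct mock_mode_config *cfg)
{
	cfg->poll_enabled = true;
	cfg->poll_running = true;
}

static void mock_poll_disable(struct mock_mode_config *cfg)
{
	/* Flags callers that disable polling without ever initializing it. */
	mock_warn_on(!cfg->poll_enabled, "!poll_enabled");
	cfg->poll_running = false;
}
```

Calling mock_poll_disable() on a zero-initialized config raises one warning; calling it after mock_poll_init() raises none.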



[PATCH 1/2] drm/amdgpu: move poll enabled/disable into non DC path

2023-03-08 Thread Guchun Chen
Some AMD ASICs with reliable hotplug support don't call
drm_kms_helper_poll_init in the driver init sequence. However,
the suspend/resume path is unified for all ASICs, and because
output_poll_work->func is not set for these ASICs, a warning
is triggered when suspending.

[   90.656049]  
[   90.656050]  ? console_unlock+0x4d/0x100
[   90.656053]  ? __irq_work_queue_local+0x27/0x60
[   90.656056]  ? irq_work_queue+0x2b/0x50
[   90.656057]  ? __wake_up_klogd+0x40/0x60
[   90.656059]  __cancel_work_timer+0xed/0x180
[   90.656061]  drm_kms_helper_poll_disable.cold+0x1f/0x2c [drm_kms_helper]
[   90.656072]  amdgpu_device_suspend+0x81/0x170 [amdgpu]
[   90.656180]  amdgpu_pmops_runtime_suspend+0xb5/0x1b0 [amdgpu]
[   90.656269]  pci_pm_runtime_suspend+0x61/0x1b0

drm_kms_helper_poll_enable/disable is only valid when poll_init has been
called, which in amdgpu code happens only in the non-DC path. So move these
calls into the non-DC path code to get rid of the warning.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2411
Fixes: a4e771729a51 ("drm/probe_helper: sort out poll_running vs poll_enabled")
Reported-by: Bert Karwatzki 
Suggested-by: Dmitry Baryshkov 
Suggested-by: Alex Deucher 
Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 4 
 drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 4 
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c4a4e2fe6681..da5b0258a237 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4145,8 +4145,6 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
fbcon)
if (amdgpu_acpi_smart_shift_update(dev, AMDGPU_SS_DEV_D3))
DRM_WARN("smart shift update failed\n");
 
-   drm_kms_helper_poll_disable(dev);
-
if (fbcon)

drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, true);
 
@@ -4243,8 +4241,6 @@ int amdgpu_device_resume(struct drm_device *dev, bool 
fbcon)
if (fbcon)

drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, false);
 
-   drm_kms_helper_poll_enable(dev);
-
amdgpu_ras_resume(adev);
 
if (adev->mode_info.num_crtc) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 503f89a766c3..d60fe7eb5579 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1618,6 +1618,8 @@ int amdgpu_display_suspend_helper(struct amdgpu_device 
*adev)
struct drm_connector_list_iter iter;
int r;
 
+   drm_kms_helper_poll_disable(dev);
+
/* turn off display hw */
drm_modeset_lock_all(dev);
drm_connector_list_iter_begin(dev, &iter);
@@ -1694,6 +1696,8 @@ int amdgpu_display_resume_helper(struct amdgpu_device 
*adev)
 
drm_modeset_unlock_all(dev);
 
+   drm_kms_helper_poll_enable(dev);
+
return 0;
 }
 
-- 
2.25.1
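
The effect of moving the calls can be sketched in user space. This is an illustration only: all mock_* names are hypothetical, and it assumes (as the commit message states) that poll init and the display suspend helper are used only on the non-DC path.

```c
#include <stdbool.h>

/* Hypothetical device state; not the amdgpu structures. */
struct mock_adev {
	bool dc_enabled;    /* true: DC display path, poll init never runs */
	bool poll_enabled;  /* true once the (mock) poll init has run */
	int  bad_calls;     /* poll_disable calls made without init */
};

static void mock_device_init(struct mock_adev *adev, bool dc_enabled)
{
	adev->dc_enabled = dc_enabled;
	/* Assumption: only the non-DC path initializes polling. */
	adev->poll_enabled = !dc_enabled;
	adev->bad_calls = 0;
}

static void mock_poll_disable(struct mock_adev *adev)
{
	if (!adev->poll_enabled)
		adev->bad_calls++; /* this is where the kernel would warn */
}

/* Display helper, assumed to be reached only on the non-DC path. */
static void mock_display_suspend_helper(struct mock_adev *adev)
{
	mock_poll_disable(adev);
}

/* Before the patch: the unified suspend path disables polling for all ASICs. */
static void mock_suspend_before(struct mock_adev *adev)
{
	mock_poll_disable(adev);
}

/* After the patch: polling is only touched through the non-DC helper. */
static void mock_suspend_after(struct mock_adev *adev)
{
	if (!adev->dc_enabled)
		mock_display_suspend_helper(adev);
}
```

With a DC device, mock_suspend_before() records one bad call while mock_suspend_after() records none; a non-DC device is clean in both cases.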



Re: [PATCH 9/9] drm: move ttm_execbuf_util into vmwgfx

2023-03-08 Thread Zack Rusin
On Wed, 2023-03-08 at 10:10 +0100, Christian König wrote:
> 
> Am 08.03.23 um 06:14 schrieb Zack Rusin:
> > On Tue, 2023-02-28 at 09:34 +0100, Christian König wrote:
> > > VMWGFX is the only remaining user of this and should probably be moved over
> > > to drm_exec when it starts using GEM as well.
> > Is this because vmwgfx piggybacks buffer-id relocations on top of ttm
> > validations or did you just find it too hard to port it over? I'd prefer
> > to avoid ttm moves to vmwgfx and at least have a clear idea of what we
> > need to do to port.
> 
> I've just found it too hard to port it over because vmwgfx does some
> strange things with the validation code here.
> 
> If you want we can take a deeper look at this together, but I need to
> find some time.
> 
> Alternatively just tell me how to do it and I will add that to the patch
> set :)

I don't want to hold up the set (it looks good btw), because I had to look at
something else today and tomorrow. 

We overload the validation lists to do quite a bit more than just reservations
though. 

There are, I think, four separate things that need to be refactored there
(Christian, feel free to skip this section, this is mainly for VMware folks on
the team):
1) Relocations - userspace uses the IDs of the BOs in the command stream, but
on the kernel side those IDs are different (or in vmwgfx terminology gem id !=
mob id), so the buffer IDs in the command stream need to be replaced.
2) Resource validation. vmwgfx splits the userspace objects into buffers and
resources (shaders, surfaces, contexts). The resources are not buffers but are
backed by them. A single buffer can back multiple different resources, and
sometimes the kernel has to actually allocate a buffer to back a resource and
attach it to it (i.e. in common terminology the buffer is the memory and
resources are placed in it). Now this shouldn't be in the kernel at all; the
resources shouldn't have been kernel objects and instead we should have left
them completely to userspace.
3) Coherency tracking. We use validation lists as a central place for tracking
which BOs/resources are used in a command buffer, and we use them to keep track
of which buffers/resources will end up dirty to implement coherency.
4) Central place to allocate memory for relocation/validation nodes.

Where we want to end up is with 2 completely gone from the kernel side and 1, 3
and 4 refactored and cleaned up. I think there are at least 4 separate patches
to this port, so it's not a trivial thing. We will take a look at this on
Friday in more detail to see what we can do.

z
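
To make the relocation step in point 1 concrete, here is a minimal user-space sketch. It is not vmwgfx code: the mock_* names, the flat id table, and the list of word offsets are assumptions made for illustration. It validates every referenced id first and only then rewrites the command stream, so a failed lookup leaves the stream untouched.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping entry: user-visible buffer id -> kernel-side id. */
struct mock_reloc_entry {
	uint32_t user_id;
	uint32_t kernel_id;
};

/* Look up the kernel id for a user id; returns 0 on success, -1 if unknown. */
static int mock_lookup(const struct mock_reloc_entry *map, size_t n_map,
		       uint32_t user_id, uint32_t *kernel_id)
{
	size_t i;

	for (i = 0; i < n_map; i++) {
		if (map[i].user_id == user_id) {
			*kernel_id = map[i].kernel_id;
			return 0;
		}
	}
	return -1;
}

/*
 * Replace the buffer ids stored at the given word offsets of the command
 * stream. Offsets are assumed to be unique. Returns the number of patched
 * slots, or -1 (stream unmodified) if an offset or id is invalid.
 */
static int mock_apply_relocs(uint32_t *cmds, size_t n_cmds,
			     const size_t *offsets, size_t n_offsets,
			     const struct mock_reloc_entry *map, size_t n_map)
{
	uint32_t kernel_id;
	size_t i;

	/* First pass: validate everything before touching the stream. */
	for (i = 0; i < n_offsets; i++) {
		if (offsets[i] >= n_cmds ||
		    mock_lookup(map, n_map, cmds[offsets[i]], &kernel_id))
			return -1;
	}

	/* Second pass: patch each slot in place. */
	for (i = 0; i < n_offsets; i++) {
		mock_lookup(map, n_map, cmds[offsets[i]], &kernel_id);
		cmds[offsets[i]] = kernel_id;
	}
	return (int)n_offsets;
}
```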


[pull] amdgpu, amdkfd drm-fixes-6.3

2023-03-08 Thread Alex Deucher
Hi Dave, Daniel,

Fixes for 6.3.

The following changes since commit 66305069eb6d17d9190cbcd196f3f7487df47ae8:

  Merge tag 'drm-misc-fixes-2023-02-23' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes (2023-03-07 05:42:34 +1000)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-fixes-6.3-2023-03-08

for you to fetch changes up to 41f95a0e40903fcf70463fcc060b7faf761b23f6:

  drm/amdgpu/soc21: Add video cap query support for VCN_4_0_4 (2023-03-08 14:39:09 -0500)


amd-drm-fixes-6.3-2023-03-08:

amdgpu:
- Misc display fixes
- UMC 8.10 fixes
- Driver unload fixes
- NBIO 7.3.0 fix
- Error checking fixes for soc15, nv, soc21 read register interface
- Fix video cap query for VCN 4.0.4

amdkfd:
- Fix BO offset for multi-VMA page migration
- Fix return check in doorbell handling


Alex Deucher (3):
  drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc15
  drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc21
  drm/amdgpu: fix error checking in amdgpu_read_mm_registers for nv

Candice Li (2):
  drm/amdgpu: Support umc node harvest config on umc v8_10
  drm/amd/pm: Enable ecc_info table support for smu v13_0_10

Harry Wentland (2):
  drm/display: Don't block HDR_OUTPUT_METADATA on unknown EOTF
  drm/connector: print max_requested_bpc in state debugfs

Mario Limonciello (1):
  drm/amd: Fix initialization mistake for NBIO 7.3.0

Shashank Sharma (1):
  drm/amdgpu: fix return value check in kfd

Swapnil Patel (1):
  drm/amd/display: Update clock table to include highest clock setting

Veerabadhran Gopalakrishnan (1):
  drm/amdgpu/soc21: Add video cap query support for VCN_4_0_4

Xiaogang Chen (1):
  drm/amdkfd: Fix BO offset for multi-VMA page migration

lyndonli (2):
  drm/amdgpu: Fix call trace warning and hang when removing amdgpu device
  drm/amdgpu: Fix the warning info when removing amdgpu device

 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c  | 10 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c| 17 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h|  7 +-
 drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c |  1 -
 drivers/gpu/drm/amd/amdgpu/nbio_v7_2.c | 14 ++--
 drivers/gpu/drm/amd/amdgpu/nv.c|  7 +-
 drivers/gpu/drm/amd/amdgpu/soc15.c |  5 +-
 drivers/gpu/drm/amd/amdgpu/soc21.c |  8 ++-
 drivers/gpu/drm/amd/amdgpu/umc_v8_10.h |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c  |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c   | 17 +++--
 .../drm/amd/display/dc/clk_mgr/dcn301/vg_clk_mgr.c | 19 +-
 .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c   | 75 ++
 drivers/gpu/drm/display/drm_hdmi_helper.c  |  6 +-
 drivers/gpu/drm/drm_atomic.c   |  1 +
 16 files changed, 146 insertions(+), 49 deletions(-)


Re: [PATCH] drm/amdgpu: Remove useless else if

2023-03-08 Thread Alex Deucher
On Wed, Mar 8, 2023 at 10:37 PM Jiapeng Chong
 wrote:
>
> The if and else branches make the same call, so the if/else here is
> redundant; remove it.
>
> ./drivers/gpu/drm/amd/amdgpu/nv.c:1048:2-4: WARNING: possible condition with no effect (if == else).
>
> Reported-by: Abaci Robot 
> Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4454
> Signed-off-by: Jiapeng Chong 
> ---
>  drivers/gpu/drm/amd/amdgpu/nv.c | 18 +-
>  1 file changed, 5 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
> index 855d390c41de..84803929f7d9 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/nv.c
> @@ -1045,19 +1045,11 @@ static int nv_common_late_init(void *handle)
>
> if (amdgpu_sriov_vf(adev)) {
> xgpu_nv_mailbox_get_irq(adev);
> -   if (adev->vcn.harvest_config & AMDGPU_VCN_HARVEST_VCN0) {
> -   amdgpu_virt_update_sriov_video_codec(adev,
> -
> sriov_sc_video_codecs_encode_array,
> -
> ARRAY_SIZE(sriov_sc_video_codecs_encode_array),
> -
> sriov_sc_video_codecs_decode_array_vcn1,
> -
> ARRAY_SIZE(sriov_sc_video_codecs_decode_array_vcn1));
> -   } else {
> -   amdgpu_virt_update_sriov_video_codec(adev,
> -
> sriov_sc_video_codecs_encode_array,
> -
> ARRAY_SIZE(sriov_sc_video_codecs_encode_array),
> -
> sriov_sc_video_codecs_decode_array_vcn1,
> -
> ARRAY_SIZE(sriov_sc_video_codecs_decode_array_vcn1));

This should be vcn0.  I'll send out a patch.  Thanks!

Alex


> -   }
> +   amdgpu_virt_update_sriov_video_codec(adev,
> +
> sriov_sc_video_codecs_encode_array,
> +
> ARRAY_SIZE(sriov_sc_video_codecs_encode_array),
> +
> sriov_sc_video_codecs_decode_array_vcn1,
> +
> ARRAY_SIZE(sriov_sc_video_codecs_decode_array_vcn1));
> }
>
> return 0;
> --
> 2.20.1.7.g153144c
>


[PATCH] drm/amdgpu/nv: fix codec array for SR_IOV

2023-03-08 Thread Alex Deucher
Fix a copy-paste error: the else branch should use the vcn0 decode array.

Fixes: 384334120b66 ("drm/amdgpu/nv: don't expose AV1 if VCN0 is harvested")
Reported-by: Abaci Robot 
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4454
Cc: Jiapeng Chong 
Signed-off-by: Alex Deucher 
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 855d390c41de..22e25ca285f8 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -1055,8 +1055,8 @@ static int nv_common_late_init(void *handle)
amdgpu_virt_update_sriov_video_codec(adev,
 
sriov_sc_video_codecs_encode_array,
 
ARRAY_SIZE(sriov_sc_video_codecs_encode_array),
-
sriov_sc_video_codecs_decode_array_vcn1,
-
ARRAY_SIZE(sriov_sc_video_codecs_decode_array_vcn1));
+
sriov_sc_video_codecs_decode_array_vcn0,
+
ARRAY_SIZE(sriov_sc_video_codecs_decode_array_vcn0));
}
}
 
-- 
2.39.2
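
The intent behind the two branches can be sketched in isolation. This is a user-space illustration with hypothetical mock_* names; the codec lists are placeholders chosen only to reflect the Fixes commit title (AV1 is not exposed when VCN0 is harvested) and are not the driver's actual arrays.

```c
#include <stddef.h>

#define MOCK_VCN_HARVEST_VCN0 (1u << 0) /* hypothetical harvest flag */
#define MOCK_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct mock_codec_list {
	const char *const *names;
	size_t count;
};

/* Placeholder lists: VCN0 present exposes AV1, VCN0 harvested does not. */
static const char *const mock_decode_vcn0[] = { "h264", "hevc", "vp9", "av1" };
static const char *const mock_decode_vcn1[] = { "h264", "hevc", "vp9" };

/* Pick the decode list that matches the harvest configuration. */
static struct mock_codec_list mock_pick_decode(unsigned int harvest_config)
{
	struct mock_codec_list list;

	if (harvest_config & MOCK_VCN_HARVEST_VCN0) {
		list.names = mock_decode_vcn1;
		list.count = MOCK_ARRAY_SIZE(mock_decode_vcn1);
	} else {
		/* The copy-paste bug used the vcn1 list here as well. */
		list.names = mock_decode_vcn0;
		list.count = MOCK_ARRAY_SIZE(mock_decode_vcn0);
	}
	return list;
}
```

With both branches pointing at the vcn1 list, the two configurations would be indistinguishable, which is what the static checker reported.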



[PATCH] drm/amdgpu: Remove useless else if

2023-03-08 Thread Jiapeng Chong
The if and else branches make the same call, so the if/else here is
redundant; remove it.

./drivers/gpu/drm/amd/amdgpu/nv.c:1048:2-4: WARNING: possible condition with no effect (if == else).

Reported-by: Abaci Robot 
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4454
Signed-off-by: Jiapeng Chong 
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 18 +-
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 855d390c41de..84803929f7d9 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -1045,19 +1045,11 @@ static int nv_common_late_init(void *handle)
 
if (amdgpu_sriov_vf(adev)) {
xgpu_nv_mailbox_get_irq(adev);
-   if (adev->vcn.harvest_config & AMDGPU_VCN_HARVEST_VCN0) {
-   amdgpu_virt_update_sriov_video_codec(adev,
-
sriov_sc_video_codecs_encode_array,
-
ARRAY_SIZE(sriov_sc_video_codecs_encode_array),
-
sriov_sc_video_codecs_decode_array_vcn1,
-
ARRAY_SIZE(sriov_sc_video_codecs_decode_array_vcn1));
-   } else {
-   amdgpu_virt_update_sriov_video_codec(adev,
-
sriov_sc_video_codecs_encode_array,
-
ARRAY_SIZE(sriov_sc_video_codecs_encode_array),
-
sriov_sc_video_codecs_decode_array_vcn1,
-
ARRAY_SIZE(sriov_sc_video_codecs_decode_array_vcn1));
-   }
+   amdgpu_virt_update_sriov_video_codec(adev,
+
sriov_sc_video_codecs_encode_array,
+
ARRAY_SIZE(sriov_sc_video_codecs_encode_array),
+
sriov_sc_video_codecs_decode_array_vcn1,
+
ARRAY_SIZE(sriov_sc_video_codecs_decode_array_vcn1));
}
 
return 0;
-- 
2.20.1.7.g153144c



[PATCH] drm/amd/display: Use swap() instead of open coding it

2023-03-08 Thread Jiapeng Chong
swap() is a kernel helper macro that exchanges two values. Use it instead
of open-coding the exchange to avoid code duplication.

./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:359:57-58: WARNING 
opportunity for swap().

Reported-by: Abaci Robot 
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4448
Signed-off-by: Jiapeng Chong 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index ae994c6c65ac..f6d9bbce15b2 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -352,13 +352,9 @@ static inline void reverse_planes_order(struct 
dc_surface_update *array_of_surfa
int planes_count)
 {
int i, j;
-   struct dc_surface_update surface_updates_temp;
 
-   for (i = 0, j = planes_count - 1; i < j; i++, j--) {
-   surface_updates_temp = array_of_surface_update[i];
-   array_of_surface_update[i] = array_of_surface_update[j];
-   array_of_surface_update[j] = surface_updates_temp;
-   }
+   for (i = 0, j = planes_count - 1; i < j; i++, j--)
+   swap(array_of_surface_update[i], array_of_surface_update[j]);
 }
 
 /**
-- 
2.20.1.7.g153144c
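
The same in-place reversal can be tried in user space. The sketch below is an assumption-laden stand-in: mock_swap imitates the kernel's swap() using the GNU __typeof__ extension (gcc/clang), and struct mock_update is a hypothetical placeholder for the surface update structure.

```c
/* User-space stand-in for swap(); relies on the GNU __typeof__ extension. */
#define mock_swap(a, b) \
	do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Hypothetical placeholder for a per-plane update record. */
struct mock_update {
	int plane_id;
};

/* Reverse the array in place by swapping from both ends toward the middle. */
static void mock_reverse(struct mock_update *arr, int count)
{
	int i, j;

	for (i = 0, j = count - 1; i < j; i++, j--)
		mock_swap(arr[i], arr[j]);
}
```

Because the loop condition is i < j, arrays of length 0 or 1 are left untouched and the middle element of an odd-length array stays in place.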



RE: [PATCH 2/2] drm/amd/pm: Fix navi10 incorrect OD volage after resume

2023-03-08 Thread Quan, Evan
[AMD Official Use Only - General]



> -Original Message-
> From: Deucher, Alexander 
> Sent: Wednesday, March 8, 2023 11:20 PM
> To: amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander ; Błażej Szczygieł
> ; Quan, Evan 
> Subject: [PATCH 2/2] drm/amd/pm: Fix navi10 incorrect OD volage after
> resume
> 
> Always setup overdrive tables after resume. Preserve only some
> user-defined settings in user_overdrive_table if they're set.
> 
> Copy restored user_overdrive_table into od_table to get correct
> values.
> 
> On cold boot, BTC was triggered and GfxVfCurve was calibrated. We
> got VfCurve settings (a). On resume, BTC will be triggered
> again and GfxVfCurve will be recalibrated. The VfCurve settings (b)
> obtained then may differ from those of the cold boot. So if we reuse
> the cold-boot VfCurve settings (a) after resume, we can
> run into discrepancies.
> 
> Based on the sienna cichlid patch from Błażej Szczygieł
> 
> 
> Cc: Błażej Szczygieł 
> Cc: Evan Quan 
> Signed-off-by: Alex Deucher 
> ---
>  .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 47
> +++
>  1 file changed, 37 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 95da6dd1cc65..68201d8e1c72 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -2510,16 +2510,9 @@ static int navi10_set_default_od_settings(struct
> smu_context *smu)
>   (OverDriveTable_t *)smu->smu_table.boot_overdrive_table;
>   OverDriveTable_t *user_od_table =
>   (OverDriveTable_t *)smu->smu_table.user_overdrive_table;
> + OverDriveTable_t user_od_table_bak;
>   int ret = 0;
> 
> - /*
> -  * For S3/S4/Runpm resume, no need to setup those overdrive
> tables again as
> -  *   - either they already have the default OD settings got during cold
> bootup
> -  *   - or they have some user customized OD settings which cannot be
> overwritten
> -  */
> - if (smu->adev->in_suspend)
> - return 0;
> -
>   ret = smu_cmn_update_table(smu, SMU_TABLE_OVERDRIVE, 0,
> (void *)boot_od_table, false);
>   if (ret) {
>   dev_err(smu->adev->dev, "Failed to get overdrive table!\n");
> @@ -2553,7 +2546,27 @@ static int navi10_set_default_od_settings(struct
> smu_context *smu)
>   navi10_dump_od_table(smu, boot_od_table);
> 
>   memcpy(od_table, boot_od_table, sizeof(OverDriveTable_t));
> - memcpy(user_od_table, boot_od_table, sizeof(OverDriveTable_t));
> +
> + /*
> +  * For S3/S4/Runpm resume, we need to setup those overdrive
> tables again,
> +  * but we have to preserve user defined values in "user_od_table".
> +  */
> + if (!smu->adev->in_suspend) {
> + memcpy(user_od_table, boot_od_table,
> sizeof(OverDriveTable_t));
> + smu->user_dpm_profile.user_od = false;
> + } else if (smu->user_dpm_profile.user_od) {
> + memcpy(&user_od_table_bak, user_od_table,
> sizeof(OverDriveTable_t));
> + memcpy(user_od_table, boot_od_table,
> sizeof(OverDriveTable_t));
> + user_od_table->GfxclkFmin =
> user_od_table_bak.GfxclkFmin;
> + user_od_table->GfxclkFmax =
> user_od_table_bak.GfxclkFmax;
> + user_od_table->UclkFmax = user_od_table_bak.UclkFmax;
> + user_od_table->GfxclkFreq1 =
> user_od_table_bak.GfxclkFreq1;
> + user_od_table->GfxclkVolt1 =
> user_od_table_bak.GfxclkVolt1;
> + user_od_table->GfxclkFreq2 =
> user_od_table_bak.GfxclkFreq2;
> + user_od_table->GfxclkVolt2 =
> user_od_table_bak.GfxclkVolt2;
> + user_od_table->GfxclkFreq3 =
> user_od_table_bak.GfxclkFreq3;
> + user_od_table->GfxclkVolt3 =
> user_od_table_bak.GfxclkVolt3;
> + }
Things are a little tricky for navi10...
For navi2x, the vfcurve settings (GfxVfCurve.a, GfxVfCurve.b, GfxVfCurve.c) are
not configurable by the user. We do not expose them to the user.
So, we can just load the new vfcurve settings on resume without worrying
about overriding the user's settings.

Unlike navi2x, the user can customize the vfcurve settings (by setting
GfxclkFreq/GfxVolt pairs) on navi10. More specifically:
- On cold boot, btc was triggered and the vfcurve line was calibrated.
- The driver calculated the target voltage (via
  navi10_overdrive_get_gfx_clk_base_voltage) for the point frequencies
  (GfxclkFreq1, GfxclkFreq2, GfxclkFreq3) and exposed them to the user.
   - e.g. point1 frequency/voltage: 500MHz / 0.75V
- Then the user customized the vfcurve line by setting a new target voltage for
  the point frequency.
   - e.g. 500MHz / 0.76V --> 10mV added
- On resume, the vfcurve line was recalibrated. The target voltage for the
  point1 frequency may change to, for example, 0.745V (for 500MHz). In that
  scenario, if we just restore the user's settings (0.76V, which will add 15mV),
  that might not fit user's

[PATCH v2] drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs

2023-03-08 Thread YuBiao Wang
v2: Add comments to clarify in the code.

[Why]
For engines that do not support soft reset, i.e. VCN, an ib test fails
before the mode 1 reset during asic reset. The fences in this case
are never signaled, and the next time we try to free the sa_bo, the
kernel hangs.

[How]
During pre_asic_reset, the driver clears job fences, after which the
fences' refcount drops to 1. For drm_sched_jobs the fence is
released in job_free_cb; for non-sched jobs like ib_test, it is meant
to be released in sa_bo_free, but only once the fence is signaled. So
we have to force-signal the fence of a non-sched bad job during
pre_asic_reset, or the clear is not complete.

Signed-off-by: YuBiao Wang 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index faff4a3f96e6..ad7c5b70c35a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -673,6 +673,7 @@ void amdgpu_fence_driver_clear_job_fences(struct 
amdgpu_ring *ring)
 {
int i;
struct dma_fence *old, **ptr;
+   struct amdgpu_job *job;
 
for (i = 0; i <= ring->fence_drv.num_fences_mask; i++) {
ptr = &ring->fence_drv.fences[i];
@@ -680,6 +681,13 @@ void amdgpu_fence_driver_clear_job_fences(struct amdgpu_ring *ring)
if (old && old->ops == &amdgpu_job_fence_ops) {
RCU_INIT_POINTER(*ptr, NULL);
dma_fence_put(old);
+   /* For non-sched bad job, i.e. failed ib test, we need to force
+* signal it right here or we won't be able to track them in fence drv
+* and they will remain unsignaled during sa_bo free.
+*/
+   job = container_of(old, struct amdgpu_job, hw_fence);
+   if (!job->base.s_fence && !dma_fence_is_signaled(old))
+   dma_fence_signal(old);
}
}
 }
-- 
2.25.1
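
The added logic can be followed in a small user-space model. This is a sketch under stated assumptions, not the amdgpu implementation: the mock_* types are hypothetical, the refcount is a plain integer, and "signaling" is just a flag. It shows how the embedded fence is mapped back to its job and force-signaled only when the job has no scheduler fence.

```c
#include <stdbool.h>
#include <stddef.h>

/* Map a pointer to an embedded member back to its containing struct. */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct mock_fence {
	bool signaled;
	int refcount;
};

struct mock_job {
	void *s_fence;              /* non-NULL only for scheduler-backed jobs */
	struct mock_fence hw_fence; /* embedded hardware fence */
};

/* Clear one fence slot and force-signal fences of non-sched jobs. */
static void mock_clear_job_fence(struct mock_fence **slot)
{
	struct mock_fence *old = *slot;
	struct mock_job *job;

	if (!old)
		return;

	*slot = NULL;
	old->refcount--; /* drop the fence driver's reference */

	/* Nothing else will signal a non-sched job's fence, so do it here. */
	job = mock_container_of(old, struct mock_job, hw_fence);
	if (!job->s_fence && !old->signaled)
		old->signaled = true;
}
```

A job with a NULL s_fence (such as a failed ib test in this model) ends up signaled after the clear, while a scheduler-backed job is left for its normal completion path.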



Re: [PATCH v3 14/17] drm/amd/display: Add debugfs for testing output colorspace

2023-03-08 Thread Sebastian Wick
On Tue, Mar 7, 2023 at 4:12 PM Harry Wentland  wrote:
>
> In order to IGT test colorspace we'll want to print
> the currently enabled colorspace on a stream. We add
> a new debugfs to do so, using the same scheme as
> current bpc reporting.
>
> This might also come in handy when debugging display
> issues.
>
> Signed-off-by: Harry Wentland 
> Cc: Pekka Paalanen 
> Cc: Sebastian Wick 
> Cc: vitaly.pros...@amd.com
> Cc: Joshua Ashton 
> Cc: dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> Reviewed-By: Joshua Ashton 
> ---
>  .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 57 +++
>  1 file changed, 57 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> index 4a5dae578d97..f0022c16b708 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> @@ -906,6 +906,61 @@ static int amdgpu_current_bpc_show(struct seq_file *m, 
> void *data)
>  }
>  DEFINE_SHOW_ATTRIBUTE(amdgpu_current_bpc);
>
> +/*
> + * Returns the current bpc for the crtc.
> + * Example usage: cat /sys/kernel/debug/dri/0/crtc-0/amdgpu_current_colorspace
> + */
> +static int amdgpu_current_colorspace_show(struct seq_file *m, void *data)
> +{
> +   struct drm_crtc *crtc = m->private;
> +   struct drm_device *dev = crtc->dev;
> +   struct dm_crtc_state *dm_crtc_state = NULL;
> +   int res = -ENODEV;
> +
> +   mutex_lock(&dev->mode_config.mutex);
> +   drm_modeset_lock(&crtc->mutex, NULL);
> +   if (crtc->state == NULL)
> +   goto unlock;
> +
> +   dm_crtc_state = to_dm_crtc_state(crtc->state);
> +   if (dm_crtc_state->stream == NULL)
> +   goto unlock;
> +
> +   switch (dm_crtc_state->stream->output_color_space) {
> +   case COLOR_SPACE_SRGB:
> +   seq_printf(m, "RGB");
> +   break;

Why does it print "RGB" when it says the color space is sRGB? Looking
at the value when I didn't specify any color space it says RGB. Why is
your default color space sRGB?


> +   case COLOR_SPACE_YCBCR601:
> +   case COLOR_SPACE_YCBCR601_LIMITED:
> +   seq_printf(m, "BT601_YCC");
> +   break;
> +   case COLOR_SPACE_YCBCR709:
> +   case COLOR_SPACE_YCBCR709_LIMITED:
> +   seq_printf(m, "BT709_YCC");
> +   break;
> +   case COLOR_SPACE_ADOBERGB:
> +   seq_printf(m, "opRGB");
> +   break;
> +   case COLOR_SPACE_2020_RGB_FULLRANGE:
> +   seq_printf(m, "BT2020_RGB");
> +   break;
> +   case COLOR_SPACE_2020_YCBCR:
> +   seq_printf(m, "BT2020_YCC");
> +   break;
> +   default:
> +   goto unlock;
> +   }
> +   res = 0;
> +
> +unlock:
> +   drm_modeset_unlock(&crtc->mutex);
> +   mutex_unlock(&dev->mode_config.mutex);
> +
> +   return res;
> +}
> +DEFINE_SHOW_ATTRIBUTE(amdgpu_current_colorspace);
> +
> +
>  /*
>   * Example usage:
>   * Disable dsc passthrough, i.e.,: have dsc decoding at converver, not 
> external RX
> @@ -3235,6 +3290,8 @@ void crtc_debugfs_init(struct drm_crtc *crtc)
>  #endif
> debugfs_create_file("amdgpu_current_bpc", 0644, crtc->debugfs_entry,
> crtc, &amdgpu_current_bpc_fops);
> +   debugfs_create_file("amdgpu_current_colorspace", 0644, crtc->debugfs_entry,
> +   crtc, &amdgpu_current_colorspace_fops);
>  }
>
>  /*
> --
> 2.39.2
>



Re: [PATCH v3 11/17] drm/amd/display: Send correct DP colorspace infopacket

2023-03-08 Thread Sebastian Wick
On Tue, Mar 7, 2023 at 4:12 PM Harry Wentland  wrote:
>
> Look at connector->colorimetry to determine output colorspace.
>
> We don't want to impact current SDR behavior, so
> DRM_MODE_COLORIMETRY_DEFAULT preserves current behavior.
>
> Signed-off-by: Harry Wentland 
> Cc: Pekka Paalanen 
> Cc: Sebastian Wick 
> Cc: vitaly.pros...@amd.com
> Cc: Joshua Ashton 
> Cc: dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> Reviewed-By: Joshua Ashton 
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 38 +++
>  1 file changed, 22 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 58fc719bec8d..cdfd09d50ee6 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -5302,21 +5302,21 @@ get_aspect_ratio(const struct drm_display_mode 
> *mode_in)
>  }
>
>  static enum dc_color_space
> -get_output_color_space(const struct dc_crtc_timing *dc_crtc_timing)
> +get_output_color_space(const struct dc_crtc_timing *dc_crtc_timing,
> +  const struct drm_connector_state *connector_state)
>  {
> enum dc_color_space color_space = COLOR_SPACE_SRGB;
>
> -   switch (dc_crtc_timing->pixel_encoding) {
> -   case PIXEL_ENCODING_YCBCR422:
> -   case PIXEL_ENCODING_YCBCR444:
> -   case PIXEL_ENCODING_YCBCR420:
> -   {
> +   switch (connector_state->colorspace) {
> +   case DRM_MODE_COLORIMETRY_DEFAULT: // ITU601

So, I do get random behavior with DRM_MODE_COLORIMETRY_DEFAULT instead
of the colorimetry that the EDID specifies? That doesn't sound good at
all.

> +   if (dc_crtc_timing->pixel_encoding == PIXEL_ENCODING_RGB) {
> +   color_space = COLOR_SPACE_SRGB;
> /*
>  * 27030khz is the separation point between HDTV and SDTV
>  * according to HDMI spec, we use YCbCr709 and YCbCr601
>  * respectively
>  */
> -   if (dc_crtc_timing->pix_clk_100hz > 270300) {
> +   } else if (dc_crtc_timing->pix_clk_100hz > 270300) {
> if (dc_crtc_timing->flags.Y_ONLY)
> color_space =
> COLOR_SPACE_YCBCR709_LIMITED;
> @@ -5329,15 +5329,21 @@ get_output_color_space(const struct dc_crtc_timing 
> *dc_crtc_timing)
> else
> color_space = COLOR_SPACE_YCBCR601;
> }
> -
> -   }
> -   break;
> -   case PIXEL_ENCODING_RGB:
> -   color_space = COLOR_SPACE_SRGB;
> break;
> -
> -   default:
> -   WARN_ON(1);
> +   case DRM_MODE_COLORIMETRY_BT709_YCC:
> +   if (dc_crtc_timing->flags.Y_ONLY)
> +   color_space = COLOR_SPACE_YCBCR709_LIMITED;
> +   else
> +   color_space = COLOR_SPACE_YCBCR709;
> +   break;
> +   case DRM_MODE_COLORIMETRY_OPRGB:
> +   color_space = COLOR_SPACE_ADOBERGB;
> +   break;
> +   case DRM_MODE_COLORIMETRY_BT2020:
> +   color_space = COLOR_SPACE_2020_RGB_FULLRANGE;
> +   break;
> +   case DRM_MODE_COLORIMETRY_BT2020_DEPRECATED:
> +   color_space = COLOR_SPACE_2020_YCBCR;
> break;
> }
>
> @@ -5476,7 +5482,7 @@ static void 
> fill_stream_properties_from_drm_display_mode(
> }
> }
>
> -   stream->output_color_space = get_output_color_space(timing_out);
> +   stream->output_color_space = get_output_color_space(timing_out, 
> connector_state);
>  }
>
>  static void fill_audio_info(struct audio_info *audio_info,
> --
> 2.39.2
>
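
The selection logic in the quoted patch can be exercised on its own. The sketch below is a user-space approximation with hypothetical mock_* enums and fields; it follows the structure visible in the diff (an explicit colorimetry wins, the default case falls back to the pixel encoding and the 27.03 MHz pixel clock split) and is not the driver code.

```c
#include <stdbool.h>

enum mock_colorimetry {
	MOCK_COLORIMETRY_DEFAULT,
	MOCK_COLORIMETRY_BT709_YCC,
	MOCK_COLORIMETRY_OPRGB,
	MOCK_COLORIMETRY_BT2020
};

enum mock_encoding { MOCK_ENCODING_RGB, MOCK_ENCODING_YCBCR444 };

enum mock_color_space {
	MOCK_CS_SRGB,
	MOCK_CS_YCBCR601,
	MOCK_CS_YCBCR601_LIMITED,
	MOCK_CS_YCBCR709,
	MOCK_CS_YCBCR709_LIMITED,
	MOCK_CS_ADOBERGB,
	MOCK_CS_2020_RGB
};

struct mock_timing {
	enum mock_encoding encoding;
	unsigned int pix_clk_100hz; /* pixel clock in units of 100 Hz */
	bool y_only;
};

static enum mock_color_space
mock_output_color_space(const struct mock_timing *t, enum mock_colorimetry c)
{
	switch (c) {
	case MOCK_COLORIMETRY_DEFAULT:
		/* No explicit request: keep the encoding/clock heuristic. */
		if (t->encoding == MOCK_ENCODING_RGB)
			return MOCK_CS_SRGB;
		if (t->pix_clk_100hz > 270300) /* above 27.03 MHz: HDTV */
			return t->y_only ? MOCK_CS_YCBCR709_LIMITED
					 : MOCK_CS_YCBCR709;
		return t->y_only ? MOCK_CS_YCBCR601_LIMITED : MOCK_CS_YCBCR601;
	case MOCK_COLORIMETRY_BT709_YCC:
		return t->y_only ? MOCK_CS_YCBCR709_LIMITED : MOCK_CS_YCBCR709;
	case MOCK_COLORIMETRY_OPRGB:
		return MOCK_CS_ADOBERGB;
	case MOCK_COLORIMETRY_BT2020:
		return MOCK_CS_2020_RGB;
	}
	return MOCK_CS_SRGB; /* unhandled values keep the sRGB default */
}
```

In this model the default colorimetry never consults sink capabilities; the result depends only on the timing, which is the behavior the review comment above questions.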



RE: [PATCH 1/2] drm/amdgpu: add flag to enable/disable poll in suspend/resume path

2023-03-08 Thread Chen, Guchun
Relying on dc_enabled will be simpler; thanks for your suggestion. I will
send a v2 to address this.

Regards,
Guchun

-Original Message-
From: amd-gfx  On Behalf Of Alex Deucher
Sent: Thursday, March 9, 2023 12:29 AM
To: Chen, Guchun 
Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; 
dmitry.barysh...@linaro.org; spassw...@web.de; Deucher, Alexander 
; Zhang, Hawking 
Subject: Re: [PATCH 1/2] drm/amdgpu: add flag to enable/disable poll in 
suspend/resume path

On Wed, Mar 8, 2023 at 7:17 AM Guchun Chen  wrote:
>
> Some AMD ASICs with reliable hotplug support don't call
> drm_kms_helper_poll_init in the driver init sequence. However, the
> suspend/resume path is unified for all ASICs, and because
> output_poll_work->func is not set for these ASICs, a warning is
> triggered when suspending.
>
> [   90.656049]  
> [   90.656050]  ? console_unlock+0x4d/0x100
> [   90.656053]  ? __irq_work_queue_local+0x27/0x60
> [   90.656056]  ? irq_work_queue+0x2b/0x50
> [   90.656057]  ? __wake_up_klogd+0x40/0x60
> [   90.656059]  __cancel_work_timer+0xed/0x180
> [   90.656061]  drm_kms_helper_poll_disable.cold+0x1f/0x2c [drm_kms_helper]
> [   90.656072]  amdgpu_device_suspend+0x81/0x170 [amdgpu]
> [   90.656180]  amdgpu_pmops_runtime_suspend+0xb5/0x1b0 [amdgpu]
> [   90.656269]  pci_pm_runtime_suspend+0x61/0x1b0
>
> So add use_kms_poll flag as the initialization check in amdgpu code 
> before calling drm_kms_helper_poll_disable/drm_kms_helper_poll_enable 
> in suspend/resume path.
>
> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2411
> Fixes: a4e771729a51 ("drm/probe_helper: sort out poll_running vs poll_enabled")
> Reported-by: Bert Karwatzki 
> Suggested-by: Dmitry Baryshkov 
> Signed-off-by: Guchun Chen 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h   | 1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c   | 1 +
>  drivers/gpu/drm/amd/amdgpu/dce_v10_0.c | 1 +
>  drivers/gpu/drm/amd/amdgpu/dce_v11_0.c | 1 +
>  drivers/gpu/drm/amd/amdgpu/dce_v6_0.c  | 1 +
>  drivers/gpu/drm/amd/amdgpu/dce_v8_0.c  | 1 +
>  7 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index c4a4e2fe6681..74af0b8c0d08 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4145,7 +4145,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
> fbcon)
> if (amdgpu_acpi_smart_shift_update(dev, AMDGPU_SS_DEV_D3))
> DRM_WARN("smart shift update failed\n");
>
> -   drm_kms_helper_poll_disable(dev);
> +   if (adev->mode_info.use_kms_poll)
> +   drm_kms_helper_poll_disable(dev);
>
> if (fbcon)
> 
> drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, true);
> @@ -4243,7 +4244,8 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
> if (fbcon)
> 
> drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, 
> false);
>
> -   drm_kms_helper_poll_enable(dev);
> +   if (adev->mode_info.use_kms_poll)
> +   drm_kms_helper_poll_enable(dev);
>

Since polling is only enabled for analog outputs and DC doesn't support any 
analog outputs, I think we can simplify this to

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c4a4e2fe6681..74af0b8c0d08 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4145,7 +4145,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
fbcon)
  if (amdgpu_acpi_smart_shift_update(dev, AMDGPU_SS_DEV_D3))
  DRM_WARN("smart shift update failed\n");

- drm_kms_helper_poll_disable(dev);
+ if (!adev->dc_enabled)
+ drm_kms_helper_poll_disable(dev);

  if (fbcon)
  drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, true);
@@ -4243,7 +4244,8 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
  if (fbcon)
  drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, false);

- drm_kms_helper_poll_enable(dev);
+ if (!adev->dc_enabled)
+ drm_kms_helper_poll_enable(dev);

  amdgpu_ras_resume(adev);

Alternatively, we could just move drm_kms_helper_poll_disable() into 
amdgpu_display_suspend_helper() and drm_kms_helper_poll_enable() into 
amdgpu_display_resume_helper(), but offhand I'm not sure whether the 
ordering here is important.
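
For illustration only, here is a toy userspace model of the guard proposed 
above. Nothing in it is amdgpu or DRM code: struct toy_dev, toy_init(), 
toy_poll_disable() and the toy_suspend_*() functions are made-up names, and 
the "warning" is just a counter standing in for the splat seen when the 
poll work was never set up. It assumes, as the patch description says, that 
only the non-DC path initializes polling.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state that matters here. */
struct toy_dev {
	bool dc_enabled;        /* DC path: poll init is never called */
	bool poll_initialized;  /* set by the non-DC init path only */
	int warnings;           /* counts "disable without init" events */
};

/* Models driver init: only the non-DC path sets up polling. */
static void toy_init(struct toy_dev *d, bool dc_enabled)
{
	d->dc_enabled = dc_enabled;
	d->poll_initialized = !dc_enabled;
	d->warnings = 0;
}

/* Models a poll-disable helper that complains if polling was never set up. */
static void toy_poll_disable(struct toy_dev *d)
{
	if (!d->poll_initialized)
		d->warnings++;
}

/* Unguarded suspend: the unified path calls poll-disable for every device. */
static void toy_suspend_unguarded(struct toy_dev *d)
{
	toy_poll_disable(d);
}

/* Guarded suspend: only touch polling on the non-DC path. */
static void toy_suspend_guarded(struct toy_dev *d)
{
	if (!d->dc_enabled)
		toy_poll_disable(d);
}

/* Returns the number of warnings raised by one suspend. */
int toy_warnings_after_suspend(bool dc_enabled, bool guarded)
{
	struct toy_dev d;

	toy_init(&d, dc_enabled);
	if (guarded)
		toy_suspend_guarded(&d);
	else
		toy_suspend_unguarded(&d);
	return d.warnings;
}
```

In this model only the combination "DC enabled, unguarded suspend" raises a 
warning; the guarded variant stays quiet for both kinds of device.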

Alex



> amdgpu_ras_resume(adev);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> index 32fe05c810c6..d383ea3e8e94 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> @@ -343,6 +343,7 @@ struct amdgpu_mode_info {
> int disp_priority;
> const struct amdgpu_display_funcs *funcs;
> 

Re: [PATCH v3 05/17] drm/connector: Use common colorspace_names array

2023-03-08 Thread Sebastian Wick
On Tue, Mar 7, 2023 at 4:12 PM Harry Wentland  wrote:
>
> We can use bitfields to track the supported ones for HDMI
> and DP. This allows us to print colorspaces in a consistent
> manner without needing to know whether we're dealing with
> DP or HDMI.
>
> Signed-off-by: Harry Wentland 
> Cc: Pekka Paalanen 
> Cc: Sebastian Wick 
> Cc: vitaly.pros...@amd.com
> Cc: Uma Shankar 
> Cc: Ville Syrjälä 
> Cc: Joshua Ashton 
> Cc: Jani Nikula 
> Cc: dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> ---
>  drivers/gpu/drm/drm_connector.c | 131 +++-
>  include/drm/drm_connector.h |   1 +
>  2 files changed, 78 insertions(+), 54 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> index ff4af48c029a..7649f0ac454f 100644
> --- a/drivers/gpu/drm/drm_connector.c
> +++ b/drivers/gpu/drm/drm_connector.c
> @@ -1012,64 +1012,70 @@ static const struct drm_prop_enum_list 
> drm_dp_subconnector_enum_list[] = {
>  DRM_ENUM_NAME_FN(drm_get_dp_subconnector_name,
>  drm_dp_subconnector_enum_list)
>
> -static const struct drm_prop_enum_list hdmi_colorspaces[] = {
> +
> +static const char * const colorspace_names[] = {
> /* For Default case, driver will set the colorspace */
> -   { DRM_MODE_COLORIMETRY_DEFAULT, "Default" },
> +   [DRM_MODE_COLORIMETRY_DEFAULT] = "Default",
> /* Standard Definition Colorimetry based on CEA 861 */
> -   { DRM_MODE_COLORIMETRY_SMPTE_170M_YCC, "SMPTE_170M_YCC" },
> -   { DRM_MODE_COLORIMETRY_BT709_YCC, "BT709_YCC" },
> +   [DRM_MODE_COLORIMETRY_SMPTE_170M_YCC] = "SMPTE_170M_YCC",
> +   [DRM_MODE_COLORIMETRY_BT709_YCC] = "BT709_YCC",
> /* Standard Definition Colorimetry based on IEC 61966-2-4 */
> -   { DRM_MODE_COLORIMETRY_XVYCC_601, "XVYCC_601" },
> +   [DRM_MODE_COLORIMETRY_XVYCC_601] = "XVYCC_601",
> /* High Definition Colorimetry based on IEC 61966-2-4 */
> -   { DRM_MODE_COLORIMETRY_XVYCC_709, "XVYCC_709" },
> +   [DRM_MODE_COLORIMETRY_XVYCC_709] = "XVYCC_709",
> /* Colorimetry based on IEC 61966-2-1/Amendment 1 */
> -   { DRM_MODE_COLORIMETRY_SYCC_601, "SYCC_601" },
> +   [DRM_MODE_COLORIMETRY_SYCC_601] = "SYCC_601",
> /* Colorimetry based on IEC 61966-2-5 [33] */
> -   { DRM_MODE_COLORIMETRY_OPYCC_601, "opYCC_601" },
> +   [DRM_MODE_COLORIMETRY_OPYCC_601] = "opYCC_601",
> /* Colorimetry based on IEC 61966-2-5 */
> -   { DRM_MODE_COLORIMETRY_OPRGB, "opRGB" },
> +   [DRM_MODE_COLORIMETRY_OPRGB] = "opRGB",
> /* Colorimetry based on ITU-R BT.2020 */
> -   { DRM_MODE_COLORIMETRY_BT2020_CYCC, "BT2020_CYCC" },
> +   [DRM_MODE_COLORIMETRY_BT2020_CYCC] = "BT2020_CYCC",
> /* Colorimetry based on ITU-R BT.2020 */
> -   { DRM_MODE_COLORIMETRY_BT2020, "BT2020" },
> +   [DRM_MODE_COLORIMETRY_BT2020] = "BT2020",
> /* Colorimetry based on ITU-R BT.2020 */
> -   { DRM_MODE_COLORIMETRY_BT2020_DEPRECATED, "BT2020_DEPRECATED" },
> -   /* Added as part of Additional Colorimetry Extension in 861.G */
> -   { DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65, "DCI-P3_RGB_D65" },
> -   { DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER, "DCI-P3_RGB_Theater" },
> +   [DRM_MODE_COLORIMETRY_BT2020_DEPRECATED] = "BT2020_DEPRECATED",
> +   /* Colorimetry based on SMPTE RP 431-2 */
> +   [DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65] = "P3_RGB_D65",
> +   [DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER] = "P3_RGB_Theater",
> +   [DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED] = "RGB_WIDE_FIXED",
> +   /* Colorimetry based on scRGB (IEC 61966-2-2) */
> +   [DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT] = "RGB_WIDE_FLOAT",
> +   [DRM_MODE_COLORIMETRY_BT601_YCC] = "BT601_YCC",
>  };
>
> +static const u32 hdmi_colorspaces =
> +   BIT(DRM_MODE_COLORIMETRY_SMPTE_170M_YCC) |
> +   BIT(DRM_MODE_COLORIMETRY_BT709_YCC) |
> +   BIT(DRM_MODE_COLORIMETRY_XVYCC_601) |
> +   BIT(DRM_MODE_COLORIMETRY_XVYCC_709) |
> +   BIT(DRM_MODE_COLORIMETRY_SYCC_601) |
> +   BIT(DRM_MODE_COLORIMETRY_OPYCC_601) |
> +   BIT(DRM_MODE_COLORIMETRY_OPRGB) |
> +   BIT(DRM_MODE_COLORIMETRY_BT2020_CYCC) |
> +   BIT(DRM_MODE_COLORIMETRY_BT2020) |
> +   BIT(DRM_MODE_COLORIMETRY_BT2020_DEPRECATED) |
> +   BIT(DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65) |
> +   BIT(DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER);
> +
>  /*
>   * As per DP 1.4a spec, 2.2.5.7.5 VSC SDP Payload for Pixel 
> Encoding/Colorimetry
>   * Format Table 2-120
>   */
> -static const struct drm_prop_enum_list dp_colorspaces[] = {
> -   /* For Default case, driver will set the colorspace */
> -   { DRM_MODE_COLORIMETRY_DEFAULT, "Default" },
> -   { DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED, "RGB_Wide_Gamut_Fixed_Point" },
> -   /* Colorimetry based on scRGB (IEC 61966-2-2) */
> -   { DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT, 
> "RGB_Wide_Gamut_Floating_Point" },
> -   /* Colorimetry based on 

Re: [PATCH v3 03/17] drm/connector: Deprecate split for BT.2020 in drm_colorspace enum

2023-03-08 Thread Ville Syrjälä
On Thu, Mar 09, 2023 at 02:05:55AM +0100, Sebastian Wick wrote:
> On Wed, Mar 8, 2023 at 10:09 AM Pekka Paalanen  wrote:
> >
> > On Tue, 7 Mar 2023 10:10:53 -0500
> > Harry Wentland  wrote:
> >
> > > From: Joshua Ashton 
> > >
> > > Userspace has no way of controlling or knowing the pixel encoding
> > > currently, so there is no way for it to ever get the right values here.
> > >
> > > > When we do add pixel_encoding control from userspace, we can pick the
> > > right value for the colorimetry packet based on the
> > > pixel_encoding + the colorspace.
> > >
> > > Let's deprecate these values, and have one BT.2020 colorspace entry
> > > that userspace can use.
> > >
> > > v2:
> > >  - leave CYCC alone for now; it serves a purpose
> > >  - leave BT2020_RGB the new default BT2020
> > >
> > > Signed-off-by: Joshua Ashton 
> > > Signed-off-by: Harry Wentland 
> > > Reviewed-by: Harry Wentland 
> > >
> > > Cc: Pekka Paalanen 
> > > Cc: Sebastian Wick 
> > > Cc: vitaly.pros...@amd.com
> > > Cc: Uma Shankar 
> > > Cc: Ville Syrjälä 
> > > Cc: Joshua Ashton 
> > > Cc: dri-de...@lists.freedesktop.org
> > > Cc: amd-gfx@lists.freedesktop.org
> > > ---
> > >  drivers/gpu/drm/display/drm_hdmi_helper.c |  7 +++
> > >  drivers/gpu/drm/drm_connector.c   |  8 
> > >  drivers/gpu/drm/i915/display/intel_dp.c   | 14 +++---
> > >  include/drm/drm_connector.h   | 15 +--
> > >  4 files changed, 23 insertions(+), 21 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/display/drm_hdmi_helper.c 
> > > b/drivers/gpu/drm/display/drm_hdmi_helper.c
> > > index faf5e9efa7d3..05a0d03ffcda 100644
> > > --- a/drivers/gpu/drm/display/drm_hdmi_helper.c
> > > +++ b/drivers/gpu/drm/display/drm_hdmi_helper.c
> > > @@ -97,8 +97,7 @@ EXPORT_SYMBOL(drm_hdmi_infoframe_set_hdr_metadata);
> > >  #define HDMI_COLORIMETRY_OPYCC_601   (C(3) | EC(3) | ACE(0))
> > >  #define HDMI_COLORIMETRY_OPRGB   (C(3) | EC(4) | 
> > > ACE(0))
> > >  #define HDMI_COLORIMETRY_BT2020_CYCC (C(3) | EC(5) | ACE(0))
> > > -#define HDMI_COLORIMETRY_BT2020_RGB  (C(3) | EC(6) | ACE(0))
> > > -#define HDMI_COLORIMETRY_BT2020_YCC  (C(3) | EC(6) | ACE(0))
> > > +#define HDMI_COLORIMETRY_BT2020  (C(3) | EC(6) | 
> > > ACE(0))
> > >  #define HDMI_COLORIMETRY_DCI_P3_RGB_D65  (C(3) | EC(7) | 
> > > ACE(0))
> > >  #define HDMI_COLORIMETRY_DCI_P3_RGB_THEATER  (C(3) | EC(7) | ACE(1))
> > >
> > > @@ -112,8 +111,8 @@ static const u32 hdmi_colorimetry_val[] = {
> > >   [DRM_MODE_COLORIMETRY_OPYCC_601] = HDMI_COLORIMETRY_OPYCC_601,
> > >   [DRM_MODE_COLORIMETRY_OPRGB] = HDMI_COLORIMETRY_OPRGB,
> > >   [DRM_MODE_COLORIMETRY_BT2020_CYCC] = HDMI_COLORIMETRY_BT2020_CYCC,
> > > - [DRM_MODE_COLORIMETRY_BT2020_RGB] = HDMI_COLORIMETRY_BT2020_RGB,
> > > - [DRM_MODE_COLORIMETRY_BT2020_YCC] = HDMI_COLORIMETRY_BT2020_YCC,
> > > + [DRM_MODE_COLORIMETRY_BT2020_DEPRECATED] = HDMI_COLORIMETRY_BT2020,
> > > + [DRM_MODE_COLORIMETRY_BT2020] = HDMI_COLORIMETRY_BT2020,
> > >  };
> > >
> > >  #undef C
> > > diff --git a/drivers/gpu/drm/drm_connector.c 
> > > b/drivers/gpu/drm/drm_connector.c
> > > index 61c29ce74b03..fe7eab15f727 100644
> > > --- a/drivers/gpu/drm/drm_connector.c
> > > +++ b/drivers/gpu/drm/drm_connector.c
> > > @@ -1031,9 +1031,9 @@ static const struct drm_prop_enum_list 
> > > hdmi_colorspaces[] = {
> > >   /* Colorimetry based on ITU-R BT.2020 */
> > >   { DRM_MODE_COLORIMETRY_BT2020_CYCC, "BT2020_CYCC" },
> > >   /* Colorimetry based on ITU-R BT.2020 */
> > > - { DRM_MODE_COLORIMETRY_BT2020_RGB, "BT2020_RGB" },
> > > + { DRM_MODE_COLORIMETRY_BT2020, "BT2020" },
> > >   /* Colorimetry based on ITU-R BT.2020 */
> > > - { DRM_MODE_COLORIMETRY_BT2020_YCC, "BT2020_YCC" },
> > > + { DRM_MODE_COLORIMETRY_BT2020_DEPRECATED, "BT2020_DEPRECATED" },
> > >   /* Added as part of Additional Colorimetry Extension in 861.G */
> > >   { DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65, "DCI-P3_RGB_D65" },
> > >   { DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER, "DCI-P3_RGB_Theater" },
> > > @@ -1054,7 +1054,7 @@ static const struct drm_prop_enum_list 
> > > dp_colorspaces[] = {
> > >   /* Colorimetry based on SMPTE RP 431-2 */
> > >   { DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65, "DCI-P3_RGB_D65" },
> > >   /* Colorimetry based on ITU-R BT.2020 */
> > > - { DRM_MODE_COLORIMETRY_BT2020_RGB, "BT2020_RGB" },
> > > + { DRM_MODE_COLORIMETRY_BT2020, "BT2020" },
> > >   { DRM_MODE_COLORIMETRY_BT601_YCC, "BT601_YCC" },
> > >   { DRM_MODE_COLORIMETRY_BT709_YCC, "BT709_YCC" },
> > >   /* Standard Definition Colorimetry based on IEC 61966-2-4 */
> > > @@ -1068,7 +1068,7 @@ static const struct drm_prop_enum_list 
> > > dp_colorspaces[] = {
> > >   /* Colorimetry based on ITU-R BT.2020 */
> > >   { DRM_MODE_COLORIMETRY_BT2020_CYCC, "BT2020_CYCC" },
> > >   /* Colorimetry based on ITU-R BT.2020 */
> > 

Re: [PATCH v3 03/17] drm/connector: Deprecate split for BT.2020 in drm_colorspace enum

2023-03-08 Thread Sebastian Wick
On Wed, Mar 8, 2023 at 10:09 AM Pekka Paalanen  wrote:
>
> On Tue, 7 Mar 2023 10:10:53 -0500
> Harry Wentland  wrote:
>
> > From: Joshua Ashton 
> >
> > Userspace has no way of controlling or knowing the pixel encoding
> > currently, so there is no way for it to ever get the right values here.
> >
> > > When we do add pixel_encoding control from userspace, we can pick the
> > right value for the colorimetry packet based on the
> > pixel_encoding + the colorspace.
> >
> > Let's deprecate these values, and have one BT.2020 colorspace entry
> > that userspace can use.
> >
> > v2:
> >  - leave CYCC alone for now; it serves a purpose
> >  - leave BT2020_RGB the new default BT2020
> >
> > Signed-off-by: Joshua Ashton 
> > Signed-off-by: Harry Wentland 
> > Reviewed-by: Harry Wentland 
> >
> > Cc: Pekka Paalanen 
> > Cc: Sebastian Wick 
> > Cc: vitaly.pros...@amd.com
> > Cc: Uma Shankar 
> > Cc: Ville Syrjälä 
> > Cc: Joshua Ashton 
> > Cc: dri-de...@lists.freedesktop.org
> > Cc: amd-gfx@lists.freedesktop.org
> > ---
> >  drivers/gpu/drm/display/drm_hdmi_helper.c |  7 +++
> >  drivers/gpu/drm/drm_connector.c   |  8 
> >  drivers/gpu/drm/i915/display/intel_dp.c   | 14 +++---
> >  include/drm/drm_connector.h   | 15 +--
> >  4 files changed, 23 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/display/drm_hdmi_helper.c 
> > b/drivers/gpu/drm/display/drm_hdmi_helper.c
> > index faf5e9efa7d3..05a0d03ffcda 100644
> > --- a/drivers/gpu/drm/display/drm_hdmi_helper.c
> > +++ b/drivers/gpu/drm/display/drm_hdmi_helper.c
> > @@ -97,8 +97,7 @@ EXPORT_SYMBOL(drm_hdmi_infoframe_set_hdr_metadata);
> >  #define HDMI_COLORIMETRY_OPYCC_601   (C(3) | EC(3) | ACE(0))
> >  #define HDMI_COLORIMETRY_OPRGB   (C(3) | EC(4) | 
> > ACE(0))
> >  #define HDMI_COLORIMETRY_BT2020_CYCC (C(3) | EC(5) | ACE(0))
> > -#define HDMI_COLORIMETRY_BT2020_RGB  (C(3) | EC(6) | ACE(0))
> > -#define HDMI_COLORIMETRY_BT2020_YCC  (C(3) | EC(6) | ACE(0))
> > +#define HDMI_COLORIMETRY_BT2020  (C(3) | EC(6) | 
> > ACE(0))
> >  #define HDMI_COLORIMETRY_DCI_P3_RGB_D65  (C(3) | EC(7) | 
> > ACE(0))
> >  #define HDMI_COLORIMETRY_DCI_P3_RGB_THEATER  (C(3) | EC(7) | ACE(1))
> >
> > @@ -112,8 +111,8 @@ static const u32 hdmi_colorimetry_val[] = {
> >   [DRM_MODE_COLORIMETRY_OPYCC_601] = HDMI_COLORIMETRY_OPYCC_601,
> >   [DRM_MODE_COLORIMETRY_OPRGB] = HDMI_COLORIMETRY_OPRGB,
> >   [DRM_MODE_COLORIMETRY_BT2020_CYCC] = HDMI_COLORIMETRY_BT2020_CYCC,
> > - [DRM_MODE_COLORIMETRY_BT2020_RGB] = HDMI_COLORIMETRY_BT2020_RGB,
> > - [DRM_MODE_COLORIMETRY_BT2020_YCC] = HDMI_COLORIMETRY_BT2020_YCC,
> > + [DRM_MODE_COLORIMETRY_BT2020_DEPRECATED] = HDMI_COLORIMETRY_BT2020,
> > + [DRM_MODE_COLORIMETRY_BT2020] = HDMI_COLORIMETRY_BT2020,
> >  };
> >
> >  #undef C
> > diff --git a/drivers/gpu/drm/drm_connector.c 
> > b/drivers/gpu/drm/drm_connector.c
> > index 61c29ce74b03..fe7eab15f727 100644
> > --- a/drivers/gpu/drm/drm_connector.c
> > +++ b/drivers/gpu/drm/drm_connector.c
> > @@ -1031,9 +1031,9 @@ static const struct drm_prop_enum_list 
> > hdmi_colorspaces[] = {
> >   /* Colorimetry based on ITU-R BT.2020 */
> >   { DRM_MODE_COLORIMETRY_BT2020_CYCC, "BT2020_CYCC" },
> >   /* Colorimetry based on ITU-R BT.2020 */
> > - { DRM_MODE_COLORIMETRY_BT2020_RGB, "BT2020_RGB" },
> > + { DRM_MODE_COLORIMETRY_BT2020, "BT2020" },
> >   /* Colorimetry based on ITU-R BT.2020 */
> > - { DRM_MODE_COLORIMETRY_BT2020_YCC, "BT2020_YCC" },
> > + { DRM_MODE_COLORIMETRY_BT2020_DEPRECATED, "BT2020_DEPRECATED" },
> >   /* Added as part of Additional Colorimetry Extension in 861.G */
> >   { DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65, "DCI-P3_RGB_D65" },
> >   { DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER, "DCI-P3_RGB_Theater" },
> > @@ -1054,7 +1054,7 @@ static const struct drm_prop_enum_list 
> > dp_colorspaces[] = {
> >   /* Colorimetry based on SMPTE RP 431-2 */
> >   { DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65, "DCI-P3_RGB_D65" },
> >   /* Colorimetry based on ITU-R BT.2020 */
> > - { DRM_MODE_COLORIMETRY_BT2020_RGB, "BT2020_RGB" },
> > + { DRM_MODE_COLORIMETRY_BT2020, "BT2020" },
> >   { DRM_MODE_COLORIMETRY_BT601_YCC, "BT601_YCC" },
> >   { DRM_MODE_COLORIMETRY_BT709_YCC, "BT709_YCC" },
> >   /* Standard Definition Colorimetry based on IEC 61966-2-4 */
> > @@ -1068,7 +1068,7 @@ static const struct drm_prop_enum_list 
> > dp_colorspaces[] = {
> >   /* Colorimetry based on ITU-R BT.2020 */
> >   { DRM_MODE_COLORIMETRY_BT2020_CYCC, "BT2020_CYCC" },
> >   /* Colorimetry based on ITU-R BT.2020 */
> > - { DRM_MODE_COLORIMETRY_BT2020_YCC, "BT2020_YCC" },
> > + { DRM_MODE_COLORIMETRY_BT2020_DEPRECATED, "BT2020_DEPRECATED" },
>
> Let's hope no-one complains about missing the old string names in UABI. :-)
>
> Actually, you should write in the commit message why 

Re: [PATCH v3 02/17] drm/connector: Add enum documentation to drm_colorspace

2023-03-08 Thread Sebastian Wick
On Wed, Mar 8, 2023 at 9:59 AM Pekka Paalanen  wrote:
>
> On Tue, 7 Mar 2023 10:10:52 -0500
> Harry Wentland  wrote:
>
> > From: Joshua Ashton 
> >
> > To match the other enums, and add more information about these values.
> >
> > v2:
> >  - Specify where an enum entry comes from
> >  - Clarify DEFAULT and NO_DATA behavior
> >  - BT.2020 CYCC is "constant luminance"
> >  - correct type for BT.601
> >
> > Signed-off-by: Joshua Ashton 
> > Signed-off-by: Harry Wentland 
> > Reviewed-by: Harry Wentland 
>
> Hi,
>
> this effort is really good, but of course I still find things to
> nitpick about. If there is no answer to my questions, then I would
> prefer the documentation to spell out the unknowns and ambiguities.
>
> > Cc: Pekka Paalanen 
> > Cc: Sebastian Wick 
> > Cc: vitaly.pros...@amd.com
> > Cc: Uma Shankar 
> > Cc: Ville Syrjälä 
> > Cc: Joshua Ashton 
> > Cc: dri-de...@lists.freedesktop.org
> > Cc: amd-gfx@lists.freedesktop.org
> > ---
> >  include/drm/drm_connector.h | 67 +++--
> >  1 file changed, 65 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
> > index 6d6a53a6b010..bb078666dc34 100644
> > --- a/include/drm/drm_connector.h
> > +++ b/include/drm/drm_connector.h
> > @@ -363,13 +363,76 @@ enum drm_privacy_screen_status {
> >   PRIVACY_SCREEN_ENABLED_LOCKED,
> >  };
> >
> > -/*
> > - * This is a consolidated colorimetry list supported by HDMI and
> > +/**
> > + * enum drm_colorspace - color space
> > + *
> > + * This enum is a consolidated colorimetry list supported by HDMI and
> >   * DP protocol standard. The respective connectors will register
> >   * a property with the subset of this list (supported by that
> >   * respective protocol). Userspace will set the colorspace through
> >   * a colorspace property which will be created and exposed to
> >   * userspace.
> > + *
> > + * DP definitions come from the DP v2.0 spec
> > + * HDMI definitions come from the CTA-861-H spec
> > + *
> > + * @DRM_MODE_COLORIMETRY_DEFAULT:
> > + *   Driver specific behavior.
> > + *   For DP:
> > + *   RGB encoded: sRGB (IEC 61966-2-1)
> > + *   YCbCr encoded: ITU-R BT.601 colorimetry format
>
> Does this mean that HDMI behavior is driver-specific while DP behavior
> is as defined?
>
> Is it intentional that YCbCr encoding also uses different RGB-primaries
> than RGB-encoded signal? (BT.601 vs. BT.709/sRGB)
>
> Or do you need to be more explicit on which parts of each spec apply
> (ColourPrimaries vs. TransferCharacteristics vs. MatrixCoefficients in
> CICP parlance)?
>
> E.g. BT.709/sRGB ColourPrimaries with BT.601 MatrixCoefficients.

Yeah, just adding to this: The Default Colorspace is something well
defined. CTA-861 says:

"If bits C0 and C1 are zero, the colorimetry shall correspond to the
default colorimetry defined in Section 5.1"

and in Section 5.1

"In all cases described above, the RGB color space used should be the
RGB color space the Sink declares in the Basic Display Parameters and
Feature Block of its EDID."

If I set DRM_MODE_COLORIMETRY_DEFAULT, I expect the Colorimetry the
EDID reports to be in effect and not some driver specific nonsense.

> > + * @DRM_MODE_COLORIMETRY_NO_DATA:
> > + *   Driver specific behavior.
> > + *   For HDMI:
> > + *   Sets "No Data" in infoframe
>
> Does DEFAULT mean that something else than "No Data" may be set in the
> HDMI infoframe?
>
> If so, since these two have the same value, where is the difference? Is
> DEFAULT purely an UAPI token, and NO_DATA used internally? Or NO_DATA
> used only when crafting actual infoframe packets?
>
> Should NO_DATA be documented to be a strictly driver-internal value,
> and not documented with UAPI?
>
> I am unclear if userspace is using these enum values directly, or do
> they use the string names only.
>
> > + * @DRM_MODE_COLORIMETRY_SMPTE_170M_YCC:
> > + *   (HDMI)
> > + *   SMPTE ST 170M colorimetry format
>
> Does "colorimetry format" mean that the spec is used in full, for all
> of ColourPrimaries, TransferCharacteristics and MatrixCoefficients?
>
> If yes, good. If not, the wording misleads me.
>
> > + * @DRM_MODE_COLORIMETRY_BT709_YCC:
> > + *   (HDMI, DP)
> > + *   ITU-R BT.709 colorimetry format
> > + * @DRM_MODE_COLORIMETRY_XVYCC_601:
> > + *   (HDMI, DP)
> > + *   xvYCC601 colorimetry format
> > + * @DRM_MODE_COLORIMETRY_XVYCC_709:
> > + *   (HDMI, DP)
> > + *   xvYCC709 colorimetry format
>
> Btw. xvYCC are funny because they require limited quantization range
> encoding, but use the foot- and headroom to encode out-of-nominal-range
> values in order to expand the color gamut with negative and greater
> than unity values.
>
> Just for curiosity, is it in any way possible today to make use of that
> extended color gamut through KMS? Has it ever been possible?
>
> I mean, the KMS color pipeline assumes full-range RGB, so I don't see
> any way to make use of xvYCC.
>
> > + * @DRM_MODE_COLORIMETRY_SYCC_601:
> > + 

Re: [PATCH v2] drm/amdkfd: Fixed kfd_process cleanup on module exit.

2023-03-08 Thread Felix Kuehling

On 2023-03-08 17:03, David Belanger wrote:

Handle case when module is unloaded (kfd_exit) before a process space
(mm_struct) is released.

v2: Fixed potential race conditions by removing all kfd_process from
the process table first, then working on releasing the resources.

Signed-off-by: David Belanger 
---
  drivers/gpu/drm/amd/amdkfd/kfd_module.c  |  4 ++
  drivers/gpu/drm/amd/amdkfd/kfd_process.c | 80 +---
  2 files changed, 77 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 09b966dc3768..8ef4bd9e4f7d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -26,6 +26,9 @@
  #include "kfd_priv.h"
  #include "amdgpu_amdkfd.h"
  
+void kfd_cleanup_processes(void);


This should be declared in a header file.



+
+
  static int kfd_init(void)
  {
int err;
@@ -77,6 +80,7 @@ static int kfd_init(void)
  
  static void kfd_exit(void)

  {
+   kfd_cleanup_processes();
kfd_debugfs_fini();
kfd_process_destroy_wq();
kfd_procfs_shutdown();
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index ebabe92f7edb..dd396a93a68d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1167,6 +1167,19 @@ static void kfd_process_free_notifier(struct 
mmu_notifier *mn)
kfd_unref_process(container_of(mn, struct kfd_process, mmu_notifier));
  }
  
+

+static void kfd_process_notifier_release_internal(struct kfd_process *p)
+{
+   cancel_delayed_work_sync(&p->eviction_work);
+   cancel_delayed_work_sync(&p->restore_work);
+
+   /* Indicate to other users that MM is no longer valid */
+   p->mm = NULL;
+
+   mmu_notifier_put(&p->mmu_notifier);
+}
+
+


You seem to like double empty newlines, as you're adding them before and 
after every function in this patch. It doesn't make sense here at least, 
because kfd_process_notifier_release_internal is so closely related to 
kfd_process_notifier_release.




  static void kfd_process_notifier_release(struct mmu_notifier *mn,
struct mm_struct *mm)
  {
@@ -1181,25 +1194,78 @@ static void kfd_process_notifier_release(struct 
mmu_notifier *mn,
return;
  
  	mutex_lock(&kfd_processes_mutex);

+   /*
+* Do early return if p is not in the table.
+*
+* This could potentially happen if this function is called concurrently
+* by mmu_notifier and by kfd_cleanup_pocesses.
+*
+*/
+   if (!hash_hashed(&p->kfd_processes)) {
+   mutex_unlock(&kfd_processes_mutex);


This won't give you the expected result when the process is still in the 
local cleanup_list in kfd_cleanup_processes, because it just tells you 
whether the process is on any list. However, if you get here holding the 
kfd_processes_mutex, kfd_cleanup_processes has either not entered its 
critical section yet, or it has completed it and the kfd_processes_table 
is empty. So you can check hash_empty(kfd_processes_table) here and exit 
early if it is empty.
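
To make the point above concrete, here is a minimal userspace sketch. It is 
not kernel code: the toy_* types and helpers are simplified stand-ins, 
written on the assumption that "hashed" means "the node's pprev pointer is 
set", which is how the kernel's hlist helpers define it. It shows that a 
node moved from the table to a local cleanup list still looks hashed, while 
an emptiness check on the table gives the intended answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's hlist layout. */
struct toy_hnode { struct toy_hnode *next, **pprev; };
struct toy_hhead { struct toy_hnode *first; };

/* A node counts as "hashed" whenever it is linked into any list. */
static bool toy_hashed(const struct toy_hnode *n)
{
	return n->pprev != NULL;
}

static bool toy_head_empty(const struct toy_hhead *h)
{
	return h->first == NULL;
}

/* Link n at the front of list h. */
static void toy_add_head(struct toy_hnode *n, struct toy_hhead *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink n and reset it, in the spirit of hlist_del_init(). */
static void toy_del_init(struct toy_hnode *n)
{
	if (!n->pprev)
		return;
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	n->next = NULL;
	n->pprev = NULL;
}

/* Move one node from the "table" to a "cleanup list" and report both checks. */
void toy_demo(bool *still_hashed, bool *table_empty)
{
	struct toy_hhead table = { NULL };    /* one bucket of the process table */
	struct toy_hhead cleanup = { NULL };  /* the local cleanup list */
	struct toy_hnode proc = { NULL, NULL };

	toy_add_head(&proc, &table);
	toy_del_init(&proc);
	toy_add_head(&proc, &cleanup);

	*still_hashed = toy_hashed(&proc);
	*table_empty = toy_head_empty(&table);
}
```

Both flags come back true: the per-node check cannot tell the table apart 
from the cleanup list, whereas the table-level check can.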




+   return;
+   }
hash_del_rcu(&p->kfd_processes);
mutex_unlock(&kfd_processes_mutex);
synchronize_srcu(&kfd_processes_srcu);
  
-	cancel_delayed_work_sync(&p->eviction_work);

-   cancel_delayed_work_sync(&p->restore_work);
-
-   /* Indicate to other users that MM is no longer valid */
-   p->mm = NULL;
-
-   mmu_notifier_put(&p->mmu_notifier);
+   kfd_process_notifier_release_internal(p);
  }
  
+


Extra newline.



  static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = {
.release = kfd_process_notifier_release,
.alloc_notifier = kfd_process_alloc_notifier,
.free_notifier = kfd_process_free_notifier,
  };
  
+

+void kfd_cleanup_processes(void)
+{
+   /*
+* This code handles the case when driver is being unloaded before all
+* mm_struct are released.  We need to safely free the kfd_process and
+* avoid race conditions with mmu_notifier that might try to free them.
+*
+*/
+
+   struct kfd_process *p;
+   struct hlist_node *p_temp;
+   unsigned int temp;
+   HLIST_HEAD(cleanup_list);
+
+   /*
+* Move all remaining kfd_process from the process table to a
+* temp list for processing.   Once done, callback from mmu_notifier
+* release will not see the kfd_process in the table and do early 
return,
+* avoiding double free issues.
+*/
+   mutex_lock(&kfd_processes_mutex);
+   hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {


This needs to use hash_for_each_safe to allow safe removal of elements 
in the loop. You can't use hash_for_each_rcu because you're not in an 
SRCU read-side critical section.
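
A small userspace sketch of why the iterator has to cache the next pointer 
when the loop body relinks the current element. It uses a plain singly 
linked list with made-up names (struct toy_node, toy_push(), 
toy_move_all_unsafe(), toy_move_all_safe()), not the kernel's hashtable 
macros; hash_for_each_safe() serves the same purpose by carrying a temporary 
hlist_node cursor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list element; "id" only exists to tell nodes apart. */
struct toy_node {
	struct toy_node *next;
	int id;
};

/* Push n onto the front of the list at *head. */
static void toy_push(struct toy_node **head, struct toy_node *n)
{
	n->next = *head;
	*head = n;
}

/* Naive walk: reads n->next after toy_push() has already rewritten it. */
static int toy_move_all_unsafe(struct toy_node **src, struct toy_node **dst)
{
	struct toy_node *n;
	int moved = 0;

	for (n = *src; n; n = n->next) {
		toy_push(dst, n);
		moved++;
	}
	return moved;
}

/* Safe walk: tmp caches the successor before the node is relinked. */
static int toy_move_all_safe(struct toy_node **src, struct toy_node **dst)
{
	struct toy_node *n, *tmp;
	int moved = 0;

	for (n = *src; n; n = tmp) {
		tmp = n->next;
		toy_push(dst, n);
		moved++;
	}
	*src = NULL;
	return moved;
}

/* Build a three-node list, move it, and report how many nodes were moved. */
int toy_demo_moved(int use_safe)
{
	struct toy_node a = { NULL, 1 }, b = { NULL, 2 }, c = { NULL, 3 };
	struct toy_node *table = NULL, *cleanup = NULL;

	toy_push(&table, &a);
	toy_push(&table, &b);
	toy_push(&table, &c);

	return use_safe ? toy_move_all_safe(&table, &cleanup)
			: toy_move_all_unsafe(&table, &cleanup);
}
```

With an empty destination list the naive walk stops after the first node, 
because that node's next pointer now leads into the destination list; the 
safe walk moves all three.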




+   hash_del_rcu(&p->kfd_processes);
+   

Re: [RFC v2 0/6] drm/amd/display: Pass proper parent for DM backlight device v2

2023-03-08 Thread Hans de Goede
Hi,

On 3/8/23 22:58, Hans de Goede wrote:
> Hi All,
> 
> Here is version 2 of my patch series to pass the proper parent device
> to backlight_device_register().
> 
> New in version 2 is delaying the registering of the backlight_dev till
> after the drm_connector is registered by doing it from
> drm_connector_funcs.late_register.
> 
> This involves first reworking the code a bit to allow delaying
> the registering, so this has turned from a single patch into
> a 6 patch set.
> 
> Regards,
> 
> Hans

p.s.

Like last time this series is marked as RFC because I don't have hw
to test the fix myself. The previous version was tested by 2 reporters
of: https://gitlab.gnome.org/GNOME/gnome-settings-daemon/-/issues/730

I hope to get test results from them for this new version soon.


> 
> 
> Hans de Goede (6):
>   drm/amd/display/amdgpu_dm: Fix backlight_device_register() error
> handling
>   drm/amd/display/amdgpu_dm: Refactor register_backlight_device()
>   drm/amd/display/amdgpu_dm: Add a bl_idx to amdgpu_dm_connector
>   drm/amd/display/amdgpu_dm: Move most backlight setup into
> setup_backlight_device()
>   drm/amd/display/amdgpu_dm: Make amdgpu_dm_register_backlight_device()
> take an amdgpu_dm_connector
>   drm/amd/display: Pass proper parent for DM backlight device
> registration v2
> 
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 99 ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  1 +
>  2 files changed, 44 insertions(+), 56 deletions(-)
> 



[PATCH v2] drm/amdkfd: Fixed kfd_process cleanup on module exit.

2023-03-08 Thread David Belanger
Handle case when module is unloaded (kfd_exit) before a process space
(mm_struct) is released.

v2: Fixed potential race conditions by removing all kfd_process from
the process table first, then working on releasing the resources.

Signed-off-by: David Belanger 
---
 drivers/gpu/drm/amd/amdkfd/kfd_module.c  |  4 ++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 80 +---
 2 files changed, 77 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 09b966dc3768..8ef4bd9e4f7d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -26,6 +26,9 @@
 #include "kfd_priv.h"
 #include "amdgpu_amdkfd.h"
 
+void kfd_cleanup_processes(void);
+
+
 static int kfd_init(void)
 {
int err;
@@ -77,6 +80,7 @@ static int kfd_init(void)
 
 static void kfd_exit(void)
 {
+   kfd_cleanup_processes();
kfd_debugfs_fini();
kfd_process_destroy_wq();
kfd_procfs_shutdown();
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index ebabe92f7edb..dd396a93a68d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1167,6 +1167,19 @@ static void kfd_process_free_notifier(struct 
mmu_notifier *mn)
kfd_unref_process(container_of(mn, struct kfd_process, mmu_notifier));
 }
 
+
+static void kfd_process_notifier_release_internal(struct kfd_process *p)
+{
+   cancel_delayed_work_sync(&p->eviction_work);
+   cancel_delayed_work_sync(&p->restore_work);
+
+   /* Indicate to other users that MM is no longer valid */
+   p->mm = NULL;
+
+   mmu_notifier_put(&p->mmu_notifier);
+}
+
+
 static void kfd_process_notifier_release(struct mmu_notifier *mn,
struct mm_struct *mm)
 {
@@ -1181,25 +1194,78 @@ static void kfd_process_notifier_release(struct 
mmu_notifier *mn,
return;
 
mutex_lock(&kfd_processes_mutex);
+   /*
+* Do early return if p is not in the table.
+*
+* This could potentially happen if this function is called concurrently
+* by mmu_notifier and by kfd_cleanup_pocesses.
+*
+*/
+   if (!hash_hashed(&p->kfd_processes)) {
+   mutex_unlock(&kfd_processes_mutex);
+   return;
+   }
hash_del_rcu(&p->kfd_processes);
mutex_unlock(&kfd_processes_mutex);
synchronize_srcu(&kfd_processes_srcu);
 
-   cancel_delayed_work_sync(&p->eviction_work);
-   cancel_delayed_work_sync(&p->restore_work);
-
-   /* Indicate to other users that MM is no longer valid */
-   p->mm = NULL;
-
-   mmu_notifier_put(&p->mmu_notifier);
+   kfd_process_notifier_release_internal(p);
 }
 
+
 static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = {
.release = kfd_process_notifier_release,
.alloc_notifier = kfd_process_alloc_notifier,
.free_notifier = kfd_process_free_notifier,
 };
 
+
+void kfd_cleanup_processes(void)
+{
+   /*
+* This code handles the case when driver is being unloaded before all
+* mm_struct are released.  We need to safely free the kfd_process and
+* avoid race conditions with mmu_notifier that might try to free them.
+*
+*/
+
+   struct kfd_process *p;
+   struct hlist_node *p_temp;
+   unsigned int temp;
+   HLIST_HEAD(cleanup_list);
+
+   /*
+* Move all remaining kfd_process from the process table to a
+* temp list for processing.   Once done, callback from mmu_notifier
+* release will not see the kfd_process in the table and do early 
return,
+* avoiding double free issues.
+*/
+   mutex_lock(&kfd_processes_mutex);
+   hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
+   hash_del_rcu(&p->kfd_processes);
+   hlist_add_head(&p->kfd_processes, &cleanup_list);
+   }
+   mutex_unlock(&kfd_processes_mutex);
+   synchronize_srcu(&kfd_processes_srcu);
+
+   /*
+* Release resources for all outstanding kfd_process collected.
+*/
+   hlist_for_each_entry_safe(p, p_temp, &cleanup_list, kfd_processes) {
+   kfd_process_notifier_release_internal(p);
+   }
+
+   /*
+* Must be called after all mmu_notifier_put are done and before
+* kfd_process_wq is released.
+*
+* Ensures that all outstanding free_notifier get called, triggering
+* the release of the kfd_process struct.
+*/
+   mmu_notifier_synchronize();
+}
+
+
 static int kfd_process_init_cwsr_apu(struct kfd_process *p, struct file *filep)
 {
unsigned long  offset;
-- 
2.38.1



[RFC v2 2/6] drm/amd/display/amdgpu_dm: Refactor register_backlight_device()

2023-03-08 Thread Hans de Goede
Refactor register_backlight_device():

1) Turn the connector-type + signal check into an early exit
condition to reduce the indentation level of the rest of the code

2) Add an array bounds check for the arrays indexed by dm->num_of_edps

3) register_backlight_device() always increases dm->num_of_edps if
amdgpu_dm_register_backlight_device() has assigned a backlight_dev to
the current dm->backlight_link[dm->num_of_edps] slot.

So on its next call dm->backlight_dev[dm->num_of_edps] always points to
the next empty slot and the "if (!dm->backlight_dev[dm->num_of_edps])"
check will thus always succeed and can be removed.

4) Add a bl_idx local variable to use as array index, rather than
using dm->num_of_edps, to improve the code readability.

Signed-off-by: Hans de Goede 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 28 ++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 42b88ab5552d..1b5efa56ec15 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4231,21 +4231,23 @@ static int initialize_plane(struct amdgpu_display_manager *dm,
 static void register_backlight_device(struct amdgpu_display_manager *dm,
  struct dc_link *link)
 {
-   if ((link->connector_signal & (SIGNAL_TYPE_EDP | SIGNAL_TYPE_LVDS)) &&
-   link->type != dc_connection_none) {
-   /*
-* Event if registration failed, we should continue with
-* DM initialization because not having a backlight control
-* is better then a black screen.
-*/
-   if (!dm->backlight_dev[dm->num_of_edps])
-   amdgpu_dm_register_backlight_device(dm);
+   int bl_idx = dm->num_of_edps;
 
-   if (dm->backlight_dev[dm->num_of_edps]) {
-   dm->backlight_link[dm->num_of_edps] = link;
-   dm->num_of_edps++;
-   }
+   if (!(link->connector_signal & (SIGNAL_TYPE_EDP | SIGNAL_TYPE_LVDS)) ||
+   link->type == dc_connection_none)
+   return;
+
+   if (dm->num_of_edps >= AMDGPU_DM_MAX_NUM_EDP) {
+   drm_warn(adev_to_drm(dm->adev), "Too much eDP connections, skipping backlight setup for additional eDPs\n");
+   return;
}
+
+   amdgpu_dm_register_backlight_device(dm);
+   if (!dm->backlight_dev[bl_idx])
+   return;
+
+   dm->backlight_link[bl_idx] = link;
+   dm->num_of_edps++;
 }
 
 static void amdgpu_set_panel_orientation(struct drm_connector *connector);
-- 
2.39.1
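
For readers following the refactor above, here is a minimal userspace sketch
of the resulting control flow: early exit on the signal/connection check, a
bounds check before touching the arrays, and a bl_idx local that is fixed
before registration. All names here (struct manager, struct link,
register_device(), register_backlight(), MAX_NUM_EDP, the SIGNAL_* bits and
the fail_registration knob) are hypothetical stand-ins, not the amdgpu
definitions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUM_EDP 2           /* hypothetical stand-in for AMDGPU_DM_MAX_NUM_EDP */
#define SIGNAL_EDP  (1u << 0)   /* hypothetical signal bits */
#define SIGNAL_LVDS (1u << 1)
#define SIGNAL_HDMI (1u << 2)

struct link {
	unsigned int signal;
	int connected;
};

struct manager {
	int num_of_edps;
	int backlight_dev[MAX_NUM_EDP];                 /* non-zero: registered */
	const struct link *backlight_link[MAX_NUM_EDP];
	int fail_registration;                          /* knob for testing */
};

/* Pretend registration: fills the current slot unless told to fail. */
static void register_device(struct manager *dm)
{
	dm->backlight_dev[dm->num_of_edps] = dm->fail_registration ? 0 : 1;
}

/* Returns 1 if a backlight slot was claimed for this link, 0 otherwise. */
static int register_backlight(struct manager *dm, const struct link *link)
{
	int bl_idx = dm->num_of_edps;

	/* Early exit: not an internal panel, or nothing connected. */
	if (!(link->signal & (SIGNAL_EDP | SIGNAL_LVDS)) || !link->connected)
		return 0;

	/* Bounds check before indexing the per-panel arrays. */
	if (dm->num_of_edps >= MAX_NUM_EDP)
		return 0;

	register_device(dm);
	if (!dm->backlight_dev[bl_idx])
		return 0;

	/* Only a successful registration consumes the slot. */
	dm->backlight_link[bl_idx] = link;
	dm->num_of_edps++;
	return 1;
}
```

Because the slot index only advances on success, the next call always sees an
empty slot, which is why the old "is this slot empty" check could be dropped.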



[RFC v2 1/6] drm/amd/display/amdgpu_dm: Fix backlight_device_register() error handling

2023-03-08 Thread Hans de Goede
backlight_device_register() returns an ERR_PTR on error, but other code
such as amdgpu_dm_connector_destroy() assumes dm->backlight_dev[i] is NULL
if no backlight is registered.

Clear dm->backlight_dev[i] on registration failure, to avoid other code
trying to deref an ERR_PTR pointer.

Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 009ef917dad4..42b88ab5552d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4180,9 +4180,10 @@ amdgpu_dm_register_backlight_device(struct amdgpu_display_manager *dm)
   &amdgpu_dm_backlight_ops,
   &props);
 
-   if (IS_ERR(dm->backlight_dev[dm->num_of_edps]))
+   if (IS_ERR(dm->backlight_dev[dm->num_of_edps])) {
DRM_ERROR("DM: Backlight registration failed!\n");
-   else
+   dm->backlight_dev[dm->num_of_edps] = NULL;
+   } else
DRM_DEBUG_DRIVER("DM: Registered Backlight device: %s\n", bl_name);
 }
 
-- 
2.39.1
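
To illustrate why the slot has to be cleared, here is a small userspace
sketch of the ERR_PTR convention and the normalisation this patch adds. The
helpers err_ptr() and is_err() are simplified re-creations of the kernel
macros, and device_register(), register_into_slot() and dummy_device are
hypothetical names used only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace re-creation of ERR_PTR(): encode -errno as a pointer. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Simplified IS_ERR(): true for the top MAX_ERRNO values of the address space. */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_device; /* hypothetical registered object */

/* Hypothetical registration call: returns an error pointer on failure. */
static void *device_register(int should_fail)
{
	return should_fail ? err_ptr(-ENOMEM) : &dummy_device;
}

/* Store NULL in the slot on failure, so later code only needs a NULL check. */
static void register_into_slot(void **slot, int should_fail)
{
	*slot = device_register(should_fail);
	if (is_err(*slot))
		*slot = NULL;
}
```

An error pointer is non-NULL, so a plain "if (slot)" test elsewhere would
treat it as a valid device; resetting the slot to NULL keeps those checks
correct.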



[RFC v2 5/6] drm/amd/display/amdgpu_dm: Make amdgpu_dm_register_backlight_device() take an amdgpu_dm_connector

2023-03-08 Thread Hans de Goede
Make amdgpu_dm_register_backlight_device() take an amdgpu_dm_connector
pointer to the connector for which it should register the backlight
as its only argument.

This is a preparation patch for moving the actual backlight class device
registering to drm_connector_funcs.late_register.

Signed-off-by: Hans de Goede 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 24 +--
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 757202af2eec..038bf897cc28 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4140,13 +4140,15 @@ static const struct backlight_ops amdgpu_dm_backlight_ops = {
 };
 
 static void
-amdgpu_dm_register_backlight_device(struct amdgpu_display_manager *dm)
+amdgpu_dm_register_backlight_device(struct amdgpu_dm_connector *aconnector)
 {
-   char bl_name[16];
+   struct drm_device *drm = aconnector->base.dev;
+   struct amdgpu_display_manager *dm = &drm_to_adev(drm)->dm;
struct backlight_properties props = { 0 };
+   char bl_name[16];
 
if (!acpi_video_backlight_use_native()) {
-   drm_info(adev_to_drm(dm->adev), "Skipping amdgpu DM backlight registration\n");
+   drm_info(drm, "Skipping amdgpu DM backlight registration\n");
/* Try registering an ACPI video backlight device instead. */
acpi_video_register_backlight();
return;
@@ -4157,17 +4159,15 @@ amdgpu_dm_register_backlight_device(struct amdgpu_display_manager *dm)
props.type = BACKLIGHT_RAW;
 
snprintf(bl_name, sizeof(bl_name), "amdgpu_bl%d",
-adev_to_drm(dm->adev)->primary->index + dm->num_of_edps);
+drm->primary->index + aconnector->bl_idx);
 
-   dm->backlight_dev[dm->num_of_edps] = backlight_device_register(bl_name,
-  adev_to_drm(dm->adev)->dev,
-  dm,
-  &amdgpu_dm_backlight_ops,
-  &props);
+   dm->backlight_dev[aconnector->bl_idx] =
+   backlight_device_register(bl_name, drm->dev, dm,
+ &amdgpu_dm_backlight_ops, &props);
 
-   if (IS_ERR(dm->backlight_dev[dm->num_of_edps])) {
+   if (IS_ERR(dm->backlight_dev[aconnector->bl_idx])) {
DRM_ERROR("DM: Backlight registration failed!\n");
-   dm->backlight_dev[dm->num_of_edps] = NULL;
+   dm->backlight_dev[aconnector->bl_idx] = NULL;
} else
DRM_DEBUG_DRIVER("DM: Registered Backlight device: %s\n", bl_name);
 }
@@ -4233,7 +4233,7 @@ static void setup_backlight_device(struct amdgpu_display_manager *dm,
amdgpu_dm_update_backlight_caps(dm, bl_idx);
dm->brightness[bl_idx] = AMDGPU_MAX_BL_LEVEL;
 
-   amdgpu_dm_register_backlight_device(dm);
+   amdgpu_dm_register_backlight_device(aconnector);
if (!dm->backlight_dev[bl_idx]) {
aconnector->bl_idx = -1;
return;
-- 
2.39.1



[RFC v2 4/6] drm/amd/display/amdgpu_dm: Move most backlight setup into setup_backlight_device()

2023-03-08 Thread Hans de Goede
Rename register_backlight_device() to setup_backlight_device()
and move all backlight setup related calls from
amdgpu_dm_register_backlight_device() and from
amdgpu_dm_initialize_drm_device() there.

This leaves amdgpu_dm_register_backlight_device() dealing purely
with registering the actual backlight class device.

This is a preparation patch for moving the actual backlight class device
registering to drm_connector_funcs.late_register.

Signed-off-by: Hans de Goede 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c   | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index eb1f2073b0cf..757202af2eec 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4145,9 +4145,6 @@ amdgpu_dm_register_backlight_device(struct 
amdgpu_display_manager *dm)
char bl_name[16];
struct backlight_properties props = { 0 };
 
-   amdgpu_dm_update_backlight_caps(dm, dm->num_of_edps);
-   dm->brightness[dm->num_of_edps] = AMDGPU_MAX_BL_LEVEL;
-
if (!acpi_video_backlight_use_native()) {
drm_info(adev_to_drm(dm->adev), "Skipping amdgpu DM backlight registration\n");
/* Try registering an ACPI video backlight device instead. */
@@ -4216,8 +4213,8 @@ static int initialize_plane(struct amdgpu_display_manager 
*dm,
 }
 
 
-static void register_backlight_device(struct amdgpu_display_manager *dm,
- struct amdgpu_dm_connector *aconnector)
+static void setup_backlight_device(struct amdgpu_display_manager *dm,
+  struct amdgpu_dm_connector *aconnector)
 {
struct dc_link *link = aconnector->dc_link;
int bl_idx = dm->num_of_edps;
@@ -4233,6 +4230,9 @@ static void register_backlight_device(struct 
amdgpu_display_manager *dm,
 
aconnector->bl_idx = bl_idx;
 
+   amdgpu_dm_update_backlight_caps(dm, bl_idx);
+   dm->brightness[bl_idx] = AMDGPU_MAX_BL_LEVEL;
+
amdgpu_dm_register_backlight_device(dm);
if (!dm->backlight_dev[bl_idx]) {
aconnector->bl_idx = -1;
@@ -4241,6 +4241,8 @@ static void register_backlight_device(struct 
amdgpu_display_manager *dm,
 
dm->backlight_link[bl_idx] = link;
dm->num_of_edps++;
+
+   update_connector_ext_caps(aconnector);
 }
 
 static void amdgpu_set_panel_orientation(struct drm_connector *connector);
@@ -4423,10 +4425,7 @@ static int amdgpu_dm_initialize_drm_device(struct 
amdgpu_device *adev)
 
if (ret) {

amdgpu_dm_update_connector_after_detect(aconnector);
-   register_backlight_device(dm, aconnector);
-
-   if (dm->num_of_edps)
-   update_connector_ext_caps(aconnector);
+   setup_backlight_device(dm, aconnector);
 
if (psr_feature_enabled)
amdgpu_dm_set_psr_caps(link);
-- 
2.39.1



[RFC v2 0/6] drm/amd/display: Pass proper parent for DM backlight device v2

2023-03-08 Thread Hans de Goede
Hi All,

Here is version 2 of my patch series to pass the proper parent device
to backlight_device_register().

New in version 2 is delaying the registering of the backlight_dev till
after the drm_connector is registered by doing it from
drm_connector_funcs.late_register.

This involves first reworking the code a bit to allow delaying
the registering, so this has turned from a single patch into
a 6 patch set.

Regards,

Hans


Hans de Goede (6):
  drm/amd/display/amdgpu_dm: Fix backlight_device_register() error
handling
  drm/amd/display/amdgpu_dm: Refactor register_backlight_device()
  drm/amd/display/amdgpu_dm: Add a bl_idx to amdgpu_dm_connector
  drm/amd/display/amdgpu_dm: Move most backlight setup into
setup_backlight_device()
  drm/amd/display/amdgpu_dm: Make amdgpu_dm_register_backlight_device()
take an amdgpu_dm_connector
  drm/amd/display: Pass proper parent for DM backlight device
registration v2

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 99 ---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  1 +
 2 files changed, 44 insertions(+), 56 deletions(-)

-- 
2.39.1



[RFC v2 3/6] drm/amd/display/amdgpu_dm: Add a bl_idx to amdgpu_dm_connector

2023-03-08 Thread Hans de Goede
Currently functions like update_connector_ext_caps() and
amdgpu_dm_connector_destroy() are iterating over dm->backlight_link[i]
to find the index of the (optional) backlight_dev associated with
the connector.

Instead make register_backlight_device() store the dm->backlight_dev[]
index used for the connector inside the amdgpu_dm_connector struct.

This removes the need to iterate over the dm->backlight_link[]
array and this is necessary as a preparation patch for moving
the actual backlight_device_register()
call to drm_connector_funcs.late_register.

While reworking update_connector_ext_caps() also remove the aconnector
and aconnector->dc_link NULL checks in this function. These are both
never NULL and are unconditionally derefed in its callers.

Signed-off-by: Hans de Goede 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 42 +++
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  1 +
 2 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 1b5efa56ec15..eb1f2073b0cf 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2936,30 +2936,18 @@ static struct drm_mode_config_helper_funcs 
amdgpu_dm_mode_config_helperfuncs = {
 static void update_connector_ext_caps(struct amdgpu_dm_connector *aconnector)
 {
struct amdgpu_dm_backlight_caps *caps;
-   struct amdgpu_display_manager *dm;
struct drm_connector *conn_base;
struct amdgpu_device *adev;
-   struct dc_link *link = NULL;
struct drm_luminance_range_info *luminance_range;
-   int i;
-
-   if (!aconnector || !aconnector->dc_link)
-   return;
 
-   link = aconnector->dc_link;
-   if (link->connector_signal != SIGNAL_TYPE_EDP)
+   if (aconnector->bl_idx == -1 ||
+   aconnector->dc_link->connector_signal != SIGNAL_TYPE_EDP)
return;
 
conn_base = &aconnector->base;
adev = drm_to_adev(conn_base->dev);
-   dm = &adev->dm;
-   for (i = 0; i < dm->num_of_edps; i++) {
-   if (link == dm->backlight_link[i])
-   break;
-   }
-   if (i >= dm->num_of_edps)
-   return;
-   caps = &dm->backlight_caps[i];
+
+   caps = &adev->dm.backlight_caps[aconnector->bl_idx];
caps->ext_caps = &aconnector->dc_link->dpcd_sink_ext_caps;
caps->aux_support = false;
 
@@ -4229,8 +4217,9 @@ static int initialize_plane(struct amdgpu_display_manager 
*dm,
 
 
 static void register_backlight_device(struct amdgpu_display_manager *dm,
- struct dc_link *link)
+ struct amdgpu_dm_connector *aconnector)
 {
+   struct dc_link *link = aconnector->dc_link;
int bl_idx = dm->num_of_edps;
 
if (!(link->connector_signal & (SIGNAL_TYPE_EDP | SIGNAL_TYPE_LVDS)) ||
@@ -4242,9 +4231,13 @@ static void register_backlight_device(struct 
amdgpu_display_manager *dm,
return;
}
 
+   aconnector->bl_idx = bl_idx;
+
amdgpu_dm_register_backlight_device(dm);
-   if (!dm->backlight_dev[bl_idx])
+   if (!dm->backlight_dev[bl_idx]) {
+   aconnector->bl_idx = -1;
return;
+   }
 
dm->backlight_link[bl_idx] = link;
dm->num_of_edps++;
@@ -4430,7 +4423,7 @@ static int amdgpu_dm_initialize_drm_device(struct 
amdgpu_device *adev)
 
if (ret) {

amdgpu_dm_update_connector_after_detect(aconnector);
-   register_backlight_device(dm, link);
+   register_backlight_device(dm, aconnector);
 
if (dm->num_of_edps)
update_connector_ext_caps(aconnector);
@@ -6211,10 +6204,8 @@ static void amdgpu_dm_connector_unregister(struct 
drm_connector *connector)
 static void amdgpu_dm_connector_destroy(struct drm_connector *connector)
 {
struct amdgpu_dm_connector *aconnector = 
to_amdgpu_dm_connector(connector);
-   const struct dc_link *link = aconnector->dc_link;
struct amdgpu_device *adev = drm_to_adev(connector->dev);
struct amdgpu_display_manager *dm = &adev->dm;
-   int i;
 
/*
 * Call only if mst_mgr was initialized before since it's not done
@@ -6223,11 +6214,9 @@ static void amdgpu_dm_connector_destroy(struct 
drm_connector *connector)
if (aconnector->mst_mgr.dev)
drm_dp_mst_topology_mgr_destroy(&aconnector->mst_mgr);
 
-   for (i = 0; i < dm->num_of_edps; i++) {
-   if ((link == dm->backlight_link[i]) && dm->backlight_dev[i]) {
-   backlight_device_unregister(dm->backlight_dev[i]);
-   dm->backlight_dev[i] = NULL;
-   }
+   if (aconnector->bl_idx != -1) {
+   

[RFC v2 6/6] drm/amd/display: Pass proper parent for DM backlight device registration v2

2023-03-08 Thread Hans de Goede
The parent for the backlight device should be the drm-connector object,
not the PCI device.

Userspace relies on this to be able to detect which backlight class device
to use on hybrid gfx devices where there may be multiple native (raw)
backlight devices registered.

Specifically gnome-settings-daemon expects the parent device to have
an "enabled" sysfs attribute (as drm_connector devices do) and tests
that this returns "enabled" when read.

This aligns the parent of the backlight device with i915, nouveau, radeon.
Note that drivers/gpu/drm/amd/amdgpu/atombios_encoders.c also already
uses the drm_connector as parent, only amdgpu_dm.c used the PCI device
as parent before this change.

Changes in v2:
Together with changing the parent, also move the registration to
drm_connector_funcs.late_register(). This is necessary because the parent
device (which now is the drm_connector) must be registered before
the backlight class device is, otherwise the backlight class device ends
up without any parent set at all.

This brings the backlight class device registration timing in line with
nouveau and i915 which also use drm_connector_funcs.late_register()
for this.

Note this slightly changes backlight_device_register() error handling,
instead of not increasing dm->num_of_edps and re-using the current
bl_idx for a potential other backlight device, dm->backlight_dev[bl_idx]
is now simply left NULL on failure. This is ok because all code
looking at dm->backlight_dev[i] also checks it is not NULL.

Link: https://gitlab.gnome.org/GNOME/gnome-settings-daemon/-/issues/730
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 038bf897cc28..051074d5812f 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4162,7 +4162,7 @@ amdgpu_dm_register_backlight_device(struct 
amdgpu_dm_connector *aconnector)
 drm->primary->index + aconnector->bl_idx);
 
dm->backlight_dev[aconnector->bl_idx] =
-   backlight_device_register(bl_name, drm->dev, dm,
+   backlight_device_register(bl_name, aconnector->base.kdev, dm,
  &amdgpu_dm_backlight_ops, &props);
 
if (IS_ERR(dm->backlight_dev[aconnector->bl_idx])) {
@@ -4232,13 +4232,6 @@ static void setup_backlight_device(struct 
amdgpu_display_manager *dm,
 
amdgpu_dm_update_backlight_caps(dm, bl_idx);
dm->brightness[bl_idx] = AMDGPU_MAX_BL_LEVEL;
-
-   amdgpu_dm_register_backlight_device(aconnector);
-   if (!dm->backlight_dev[bl_idx]) {
-   aconnector->bl_idx = -1;
-   return;
-   }
-
dm->backlight_link[bl_idx] = link;
dm->num_of_edps++;
 
@@ -6297,6 +6290,8 @@ amdgpu_dm_connector_late_register(struct drm_connector 
*connector)
to_amdgpu_dm_connector(connector);
int r;
 
+   amdgpu_dm_register_backlight_device(amdgpu_dm_connector);
+
if ((connector->connector_type == DRM_MODE_CONNECTOR_DisplayPort) ||
(connector->connector_type == DRM_MODE_CONNECTOR_eDP)) {
amdgpu_dm_connector->dm_dp_aux.aux.dev = connector->kdev;
-- 
2.39.1



[PATCH] drm/amdkfd: fix potential kgd_mem UAFs

2023-03-08 Thread Chia-I Wu
kgd_mem should be accessed with p->mutex locked, or it could have been
freed by kfd_ioctl_free_memory_of_gpu.

Signed-off-by: Chia-I Wu 
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 6d291aa6386bd..3c630114210d6 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -1293,14 +1293,14 @@ static int kfd_ioctl_map_memory_to_gpu(struct file 
*filep,
args->n_success = i+1;
}
 
-   mutex_unlock(&p->mutex);
-
err = amdgpu_amdkfd_gpuvm_sync_memory(dev->adev, (struct kgd_mem *) mem, true);
if (err) {
pr_debug("Sync memory failed, wait interrupted by user signal\n");
goto sync_memory_failed;
}
 
+   mutex_unlock(&p->mutex);
+
/* Flush TLBs after waiting for the page table updates to complete */
for (i = 0; i < args->n_devices; i++) {
peer_pdd = kfd_process_device_data_by_id(p, devices_arr[i]);
@@ -1316,9 +1316,9 @@ static int kfd_ioctl_map_memory_to_gpu(struct file *filep,
 bind_process_to_device_failed:
 get_mem_obj_from_handle_failed:
 map_memory_to_gpu_failed:
+sync_memory_failed:
mutex_unlock(&p->mutex);
 copy_from_user_failed:
-sync_memory_failed:
kfree(devices_arr);
 
return err;
@@ -1332,6 +1332,7 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file 
*filep,
void *mem;
long err = 0;
uint32_t *devices_arr = NULL, i;
+   bool flush_tlb;
 
if (!args->n_devices) {
pr_debug("Device IDs array empty\n");
@@ -1384,16 +1385,19 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file 
*filep,
}
args->n_success = i+1;
}
-   mutex_unlock(&p->mutex);
 
-   if (kfd_flush_tlb_after_unmap(pdd->dev)) {
+   flush_tlb = kfd_flush_tlb_after_unmap(pdd->dev);
+   if (flush_tlb) {
err = amdgpu_amdkfd_gpuvm_sync_memory(pdd->dev->adev,
(struct kgd_mem *) mem, true);
if (err) {
pr_debug("Sync memory failed, wait interrupted by user signal\n");
goto sync_memory_failed;
}
+   }
+   mutex_unlock(&p->mutex);
 
+   if (flush_tlb) {
/* Flush TLBs after waiting for the page table updates to 
complete */
for (i = 0; i < args->n_devices; i++) {
peer_pdd = kfd_process_device_data_by_id(p, 
devices_arr[i]);
@@ -1409,9 +1413,9 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file 
*filep,
 bind_process_to_device_failed:
 get_mem_obj_from_handle_failed:
 unmap_memory_from_gpu_failed:
+sync_memory_failed:
mutex_unlock(&p->mutex);
 copy_from_user_failed:
-sync_memory_failed:
kfree(devices_arr);
return err;
 }
-- 
2.40.0.rc1.284.g88254d51c5-goog



Re: [PATCH 1/2] drm/amd/pm: Fix sienna cichlid incorrect OD volage after resume

2023-03-08 Thread Alex Deucher
On Wed, Mar 8, 2023 at 10:20 AM Alex Deucher  wrote:
>
> From: Błażej Szczygieł 
>
> Always setup overdrive tables after resume. Preserve only some
> user-defined settings in user_overdrive_table if they're set.
>
> Copy restored user_overdrive_table into od_table to get correct
> values.
>
> On cold boot, BTC was triggered and GfxVfCurve was calibrated. We
> got VfCurve settings (a). On resuming back, BTC will be triggered
> again and GfxVfCurve will be recalibrated. VfCurve settings (b)
> got may be different from those of cold boot.  So if we reuse
> those VfCurve settings (a) got on cold boot on suspend, we can
> run into discrepancies.
>
> Reviewed-by: Evan Quan 
> Signed-off-by: Błażej Szczygieł 
> Signed-off-by: Alex Deucher 

Will add the bug references as well when I commit this:
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1897
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2276

Thanks for the patch.

Alex

> ---
>  .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 43 ++-
>  1 file changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> index 697e98a0a20a..75f18681e984 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> @@ -2143,16 +2143,9 @@ static int 
> sienna_cichlid_set_default_od_settings(struct smu_context *smu)
> (OverDriveTable_t *)smu->smu_table.boot_overdrive_table;
> OverDriveTable_t *user_od_table =
> (OverDriveTable_t *)smu->smu_table.user_overdrive_table;
> +   OverDriveTable_t user_od_table_bak;
> int ret = 0;
>
> -   /*
> -* For S3/S4/Runpm resume, no need to setup those overdrive tables 
> again as
> -*   - either they already have the default OD settings got during 
> cold bootup
> -*   - or they have some user customized OD settings which cannot be 
> overwritten
> -*/
> -   if (smu->adev->in_suspend)
> -   return 0;
> -
> ret = smu_cmn_update_table(smu, SMU_TABLE_OVERDRIVE,
>0, (void *)boot_od_table, false);
> if (ret) {
> @@ -2163,7 +2156,23 @@ static int 
> sienna_cichlid_set_default_od_settings(struct smu_context *smu)
> sienna_cichlid_dump_od_table(smu, boot_od_table);
>
> memcpy(od_table, boot_od_table, sizeof(OverDriveTable_t));
> -   memcpy(user_od_table, boot_od_table, sizeof(OverDriveTable_t));
> +
> +   /*
> +* For S3/S4/Runpm resume, we need to setup those overdrive tables 
> again,
> +* but we have to preserve user defined values in "user_od_table".
> +*/
> +   if (!smu->adev->in_suspend) {
> +   memcpy(user_od_table, boot_od_table, 
> sizeof(OverDriveTable_t));
> +   smu->user_dpm_profile.user_od = false;
> +   } else if (smu->user_dpm_profile.user_od) {
> +   memcpy(&user_od_table_bak, user_od_table, sizeof(OverDriveTable_t));
> +   memcpy(user_od_table, boot_od_table, 
> sizeof(OverDriveTable_t));
> +   user_od_table->GfxclkFmin = user_od_table_bak.GfxclkFmin;
> +   user_od_table->GfxclkFmax = user_od_table_bak.GfxclkFmax;
> +   user_od_table->UclkFmin = user_od_table_bak.UclkFmin;
> +   user_od_table->UclkFmax = user_od_table_bak.UclkFmax;
> +   user_od_table->VddGfxOffset = user_od_table_bak.VddGfxOffset;
> +   }
>
> return 0;
>  }
> @@ -2373,6 +2382,20 @@ static int sienna_cichlid_od_edit_dpm_table(struct 
> smu_context *smu,
> return ret;
>  }
>
> +static int sienna_cichlid_restore_user_od_settings(struct smu_context *smu)
> +{
> +   struct smu_table_context *table_context = &smu->smu_table;
> +   OverDriveTable_t *od_table = table_context->overdrive_table;
> +   OverDriveTable_t *user_od_table = table_context->user_overdrive_table;
> +   int res;
> +
> +   res = smu_v11_0_restore_user_od_settings(smu);
> +   if (res == 0)
> +   memcpy(od_table, user_od_table, sizeof(OverDriveTable_t));
> +
> +   return res;
> +}
> +
>  static int sienna_cichlid_run_btc(struct smu_context *smu)
>  {
> int res;
> @@ -4400,7 +4423,7 @@ static const struct pptable_funcs 
> sienna_cichlid_ppt_funcs = {
> .set_soft_freq_limited_range = smu_v11_0_set_soft_freq_limited_range,
> .set_default_od_settings = sienna_cichlid_set_default_od_settings,
> .od_edit_dpm_table = sienna_cichlid_od_edit_dpm_table,
> -   .restore_user_od_settings = smu_v11_0_restore_user_od_settings,
> +   .restore_user_od_settings = sienna_cichlid_restore_user_od_settings,
> .run_btc = sienna_cichlid_run_btc,
> .set_power_source = smu_v11_0_set_power_source,
> .get_pp_feature_mask = smu_cmn_get_pp_feature_mask,
> --
> 2.39.2
>
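
The save/refresh/restore step in the quoted patch can be sketched outside the
driver as follows. This is a simplified model under stated assumptions:
struct od_table is a hypothetical, trimmed-down table whose vf_curve field
stands for the settings that are recalibrated on resume, and set_default_od()
is a hypothetical name, not the driver function.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, trimmed-down overdrive table. */
struct od_table {
	int gfxclk_fmin;
	int gfxclk_fmax;
	int uclk_fmin;
	int uclk_fmax;
	int vdd_gfx_offset;
	int vf_curve; /* stands for values recalibrated on every boot/resume */
};

static void set_default_od(struct od_table *user, const struct od_table *boot,
			   bool in_suspend, bool *user_od)
{
	if (!in_suspend) {
		/* Cold boot: the user table starts from the boot defaults. */
		memcpy(user, boot, sizeof(*user));
		*user_od = false;
	} else if (*user_od) {
		/* Resume: take the fresh defaults, then put user fields back. */
		struct od_table bak;

		memcpy(&bak, user, sizeof(bak));
		memcpy(user, boot, sizeof(*user));
		user->gfxclk_fmin = bak.gfxclk_fmin;
		user->gfxclk_fmax = bak.gfxclk_fmax;
		user->uclk_fmin = bak.uclk_fmin;
		user->uclk_fmax = bak.uclk_fmax;
		user->vdd_gfx_offset = bak.vdd_gfx_offset;
	}
}
```

The point of copying through a backup is that only the explicitly listed user
fields survive the resume; everything else, including the recalibrated curve,
comes from the freshly read boot table.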


Re: [PATCH] drm/amdkfd: Get prange->offset after svm_range_vram_node_new

2023-03-08 Thread Chen, Xiaogang



On 3/8/2023 11:11 AM, Felix Kuehling wrote:

On 2023-03-08 02:45, Xiaogang.Chen wrote:

From: Xiaogang Chen 

During migration to vram, prange->offset is valid only after the vram
buffer is located, either by reusing the old one or by allocating a new
one. Move svm_range_vram_node_new before the per-vma migration so that
each vma gets a valid prange->offset.

Signed-off-by: Xiaogang Chen 


I'd  prefer to keep svm_range_vram_node_new in 
svm_migrate_copy_to_vram. Logically the memory allocation should be 
after migrate_vma_setup. If migrate_vma_setup finds that there is 
nothing to migrate, we should not allocate any memory.


Does this fix a real issue, or is this a theoretical fix? I think it 
should probably work correctly without this patch. 
svm_range_vram_node_new sets prange->offset to 0. If no VRAM was 
previously allocated, it should already be 0, so nothing changes. 
Maybe we just need a fix to set prange->offset = 0 in 
svm_range_vram_node_free.


The real issue is the same prange migrating vram->cpu, then cpu->vram.
During vram->cpu the prange got split, so prange->offset changed; then the
vram node got freed by svm_range_vram_node_free, which did not update
prange->offset. This is the case in KFDSVMRangeTest.MigrateTest. I will
check by setting prange->offset = 0 in svm_range_vram_node_free.


In theory, doesn't getting prange->offset after svm_range_vram_node_new
make the code logically clearer? svm_range_vram_node_new handles different
cases, so we are not sure what prange->offset would be before calling it.


If migrate_vma_setup fails for a vma, we can use svm_range_vram_node_free
to release the vram buffer obtained from svm_range_vram_node_new.




Regards,
  Felix



---
  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c

index fd54a00e7229..15791490c23e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -310,12 +310,6 @@ svm_migrate_copy_to_vram(struct amdgpu_device 
*adev, struct svm_range *prange,

  src = scratch;
  dst = (uint64_t *)(scratch + npages);
  -    r = svm_range_vram_node_new(adev, prange, true);
-    if (r) {
-    dev_dbg(adev->dev, "fail %d to alloc vram\n", r);
-    goto out;
-    }
-
  amdgpu_res_first(prange->ttm_res, ttm_res_offset,
   npages << PAGE_SHIFT, );
  for (i = j = 0; i < npages; i++) {
@@ -525,6 +519,12 @@ svm_migrate_ram_to_vram(struct svm_range 
*prange, uint32_t best_loc,

    start = prange->start << PAGE_SHIFT;
  end = (prange->last + 1) << PAGE_SHIFT;
+
+    r = svm_range_vram_node_new(adev, prange, true);
+    if (r) {
+    dev_dbg(adev->dev, "fail %d to alloc vram\n", r);
+    return r;
+    }
  ttm_res_offset = prange->offset << PAGE_SHIFT;
    for (addr = start; addr < end;) {


Re: [PATCH] amd/display/debugfs: add sysfs entry to read PSR residency from firmware

2023-03-08 Thread Hamza Mahfooz



On 3/8/23 02:10, Shirish S wrote:

[Why]
Currently there aren't any methods to determine PSR state residency.

[How]
create a sysfs entry for reading residency and internally hook it up
to existing functionality of reading PSR residency from firmware.

Signed-off-by: Shirish S 
---
  .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 19 +++
  1 file changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
index abf7895d1608..8ff2802db5b5 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
@@ -27,6 +27,7 @@
  #include 
  
  #include "dc.h"

+#include "dc_link.h"


Please drop this include, the relevant function should already be
accessible from dc.h.


  #include "amdgpu.h"
  #include "amdgpu_dm.h"
  #include "amdgpu_dm_debugfs.h"
@@ -2793,6 +2794,22 @@ static int psr_get(void *data, u64 *val)
return 0;
  }
  
+/*

+ *  Read PSR state residency
+ */
+static int psr_read_residency(void *data, u64 *val)
+{
+   struct amdgpu_dm_connector *connector = data;
+   struct dc_link *link = connector->dc_link;
+   u32 residency;
+
+   dc_link_get_psr_residency(link, &residency);


Did you mean to use link_get_psr_residency() here?


+
+   *val = (u64)residency;
+
+   return 0;
+}
+
  /*
   * Set dmcub trace event IRQ enable or disable.
   * Usage to enable dmcub trace event IRQ: echo 1 > 
/sys/kernel/debug/dri/0/amdgpu_dm_dmcub_trace_event_en
@@ -2828,6 +2845,7 @@ DEFINE_DEBUGFS_ATTRIBUTE(dmcub_trace_event_state_fops, 
dmcub_trace_event_state_g
 dmcub_trace_event_state_set, "%llu\n");
  
  DEFINE_DEBUGFS_ATTRIBUTE(psr_fops, psr_get, NULL, "%llu\n");

+DEFINE_DEBUGFS_ATTRIBUTE(psr_residency_fops, psr_read_residency, NULL, "%llu\n");
  
  DEFINE_SHOW_ATTRIBUTE(current_backlight);

  DEFINE_SHOW_ATTRIBUTE(target_backlight);
@@ -2991,6 +3009,7 @@ void connector_debugfs_init(struct amdgpu_dm_connector 
*connector)
if (connector->base.connector_type == DRM_MODE_CONNECTOR_eDP) {
debugfs_create_file_unsafe("psr_capability", 0444, dir, connector, &psr_capability_fops);
debugfs_create_file_unsafe("psr_state", 0444, dir, connector, &psr_fops);
+   debugfs_create_file_unsafe("psr_residency", 0444, dir, connector, &psr_residency_fops);
debugfs_create_file("amdgpu_current_backlight_pwm", 0444, dir, connector,
&current_backlight_fops);
debugfs_create_file("amdgpu_target_backlight_pwm", 0444, dir, connector,


--
Hamza



Re: [PATCH] drm/amdkfd: fix a potential double free in pqm_create_queue

2023-03-08 Thread Felix Kuehling

On 2023-03-07 19:19, Chia-I Wu wrote:

Set *q to NULL on errors, otherwise pqm_create_queue would free it
again.

Signed-off-by: Chia-I Wu 


Thank you! I'm applying this patch to amd-staging-drm-next.

Reviewed-by: Felix Kuehling 



---
  drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 5137476ec18e6..4236539d9f932 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -218,8 +218,8 @@ static int init_user_queue(struct process_queue_manager 
*pqm,
return 0;
  
  cleanup:

-   if (dev->shared_resources.enable_mes)
-   uninit_queue(*q);
+   uninit_queue(*q);
+   *q = NULL;
return retval;
  }
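
The ownership rule behind this fix can be sketched in userspace C. This is a
simplified model, not the kfd code: struct queue, free_calls, the fail
parameter and create_queue() are hypothetical, and uninit_queue() is assumed
here to tolerate a NULL argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct queue { int id; };

static int free_calls; /* counts releases so the behaviour can be checked */

/* Hypothetical release helper; assumed to accept NULL. */
static void uninit_queue(struct queue *q)
{
	if (!q)
		return;
	free(q);
	free_calls++;
}

/* On failure, releases the queue itself and clears the caller's pointer. */
static int init_user_queue(struct queue **q, int fail)
{
	*q = malloc(sizeof(**q));
	if (!*q)
		return -1;
	if (fail) {
		uninit_queue(*q);
		*q = NULL; /* the caller's cleanup now sees NULL */
		return -1;
	}
	return 0;
}

static int create_queue(int fail)
{
	struct queue *q = NULL;
	int ret = init_user_queue(&q, fail);

	/* Shared cleanup path: safe on both outcomes because q is NULL on error. */
	uninit_queue(q);
	return ret;
}
```

Without the "*q = NULL" line, the caller's cleanup would pass the already
freed pointer to uninit_queue() a second time.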
  


Re: [PATCH] drm/amdkfd: Get prange->offset after svm_range_vram_node_new

2023-03-08 Thread Felix Kuehling

On 2023-03-08 02:45, Xiaogang.Chen wrote:

From: Xiaogang Chen 

During migration to VRAM, prange->offset is valid only after the VRAM buffer is
located, either by reusing the old one or by allocating a new one. Move
svm_range_vram_node_new before the per-VMA migration to get a valid
prange->offset.

Signed-off-by: Xiaogang Chen 


I'd prefer to keep svm_range_vram_node_new in svm_migrate_copy_to_vram. 
Logically the memory allocation should be after migrate_vma_setup. If 
migrate_vma_setup finds that there is nothing to migrate, we should not 
allocate any memory.


Does this fix a real issue, or is this a theoretical fix? I think it 
should probably work correctly without this patch. 
svm_range_vram_node_new sets prange->offset to 0. If no VRAM was 
previously allocated, it should already be 0, so nothing changes. Maybe 
we just need a fix to set prange->offset = 0 in svm_range_vram_node_free.


Regards,
  Felix



---
  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index fd54a00e7229..15791490c23e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -310,12 +310,6 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, 
struct svm_range *prange,
src = scratch;
dst = (uint64_t *)(scratch + npages);
  
-	r = svm_range_vram_node_new(adev, prange, true);

-   if (r) {
-   dev_dbg(adev->dev, "fail %d to alloc vram\n", r);
-   goto out;
-   }
-
amdgpu_res_first(prange->ttm_res, ttm_res_offset,
 npages << PAGE_SHIFT, );
for (i = j = 0; i < npages; i++) {
@@ -525,6 +519,12 @@ svm_migrate_ram_to_vram(struct svm_range *prange, uint32_t 
best_loc,
  
  	start = prange->start << PAGE_SHIFT;

end = (prange->last + 1) << PAGE_SHIFT;
+
+   r = svm_range_vram_node_new(adev, prange, true);
+   if (r) {
+   dev_dbg(adev->dev, "fail %d to alloc vram\n", r);
+   return r;
+   }
ttm_res_offset = prange->offset << PAGE_SHIFT;
  
  	for (addr = start; addr < end;) {


RE: [PATCH 1/2] drm/vc4: Fix build error with undefined label

2023-03-08 Thread Zhuo, Qingqing (Lillian)
[AMD Official Use Only - General]

On Wed, Mar 8, 2023 at 11:32 AM Maxime Ripard  wrote:
>
> On Wed, Mar 08, 2023 at 04:27:01PM +, Zhuo, Qingqing (Lillian) wrote:
> > [AMD Official Use Only - General]
> >
> > > Hi,
> >
> > On Wed, Mar 08, 2023 at 11:11:22AM -0500, Hamza Mahfooz wrote:
> > > + vc4 maintainers
> > >
> > > On 3/8/23 04:34, Qingqing Zhuo wrote:
> > > > [Why]
> > > > drivers/gpu/drm/vc4/vc4_hdmi.c: In function ‘vc4_hdmi_bind’:
> > > > drivers/gpu/drm/vc4/vc4_hdmi.c:3448:17: error: label 
> > > > ‘err_disable_runtime_pm’ used but not defined
> > > >
> > > > [How]
> > > > update err_disable_runtime_pm to err_put_runtime_pm.
> > > >
> > > > Signed-off-by: Qingqing Zhuo 
> > > > ---
> > > >   drivers/gpu/drm/vc4/vc4_hdmi.c | 2 +-
> > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c 
> > > > b/drivers/gpu/drm/vc4/vc4_hdmi.c index 
> > > > 9e145690c480..edf882360d24
> > > > 100644
> > > > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> > > > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > > > @@ -3445,7 +3445,7 @@ static int vc4_hdmi_bind(struct device *dev, 
> > > > struct device *master, void *data)
> > > >*/
> > > >   ret = pm_runtime_resume_and_get(dev);
> > > >   if (ret)
> > > > - goto err_disable_runtime_pm;
> > > > + goto err_put_runtime_pm;
> > > >   if ((of_device_is_compatible(dev->of_node, 
> > > > "brcm,bcm2711-hdmi0") ||
> > > >of_device_is_compatible(dev->of_node, 
> > > > "brcm,bcm2711-hdmi1")) &&
> >
> > > The current drm-misc-next branch doesn't have that context at all. What 
> > > tree is this based on?
> >
> > This is for amd-staging-drm-next.
>
> I don't get it, why is there a vc4 patch in an AMD tree?

> There isn't. It just happens to have a vc4 driver with this issue when we 
> branched it.  Lillian, please double check drm-next or linux-next for non-AMD 
> drivers.

Thanks for letting me know, and apologies for the confusion! I will be sure to 
do so in the future.

Thanks,
Lillian


Re: [PATCH 1/2] drm/vc4: Fix build error with undefined label

2023-03-08 Thread Hamza Mahfooz



On 3/8/23 11:39, Alex Deucher wrote:

On Wed, Mar 8, 2023 at 11:32 AM Maxime Ripard  wrote:


On Wed, Mar 08, 2023 at 04:27:01PM +, Zhuo, Qingqing (Lillian) wrote:

[AMD Official Use Only - General]


Hi,


On Wed, Mar 08, 2023 at 11:11:22AM -0500, Hamza Mahfooz wrote:

+ vc4 maintainers

On 3/8/23 04:34, Qingqing Zhuo wrote:

[Why]
drivers/gpu/drm/vc4/vc4_hdmi.c: In function ‘vc4_hdmi_bind’:
drivers/gpu/drm/vc4/vc4_hdmi.c:3448:17: error: label
‘err_disable_runtime_pm’ used but not defined

[How]
update err_disable_runtime_pm to err_put_runtime_pm.

Signed-off-by: Qingqing Zhuo 
---
   drivers/gpu/drm/vc4/vc4_hdmi.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c
b/drivers/gpu/drm/vc4/vc4_hdmi.c index 9e145690c480..edf882360d24
100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -3445,7 +3445,7 @@ static int vc4_hdmi_bind(struct device *dev, struct 
device *master, void *data)
*/
   ret = pm_runtime_resume_and_get(dev);
   if (ret)
- goto err_disable_runtime_pm;
+ goto err_put_runtime_pm;
   if ((of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi0") ||
of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi1"))
&&



The current drm-misc-next branch doesn't have that context at all. What tree is 
this based on?


This is for amd-staging-drm-next.


I don't get it, why is there a vc4 patch in an AMD tree?


There isn't. It just happens to have a vc4 driver with this issue
when we branched it.  Lillian, please double check drm-next or
linux-next for non-AMD drivers.


I think we can cherry-pick commit 932d860f4672 ("drm/vc4: hdmi: Switch
to devm_pm_runtime_enable") to resolve the compile issue that Lillian
is observing.



Alex


--
Hamza



Re: [PATCH 1/2] drm/vc4: Fix build error with undefined label

2023-03-08 Thread Maxime Ripard
On Wed, Mar 08, 2023 at 04:27:01PM +, Zhuo, Qingqing (Lillian) wrote:
> [AMD Official Use Only - General]
> 
> > Hi,
> 
> On Wed, Mar 08, 2023 at 11:11:22AM -0500, Hamza Mahfooz wrote:
> > + vc4 maintainers
> > 
> > On 3/8/23 04:34, Qingqing Zhuo wrote:
> > > [Why]
> > > drivers/gpu/drm/vc4/vc4_hdmi.c: In function ‘vc4_hdmi_bind’:
> > > drivers/gpu/drm/vc4/vc4_hdmi.c:3448:17: error: label 
> > > ‘err_disable_runtime_pm’ used but not defined
> > > 
> > > [How]
> > > update err_disable_runtime_pm to err_put_runtime_pm.
> > > 
> > > Signed-off-by: Qingqing Zhuo 
> > > ---
> > >   drivers/gpu/drm/vc4/vc4_hdmi.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c 
> > > b/drivers/gpu/drm/vc4/vc4_hdmi.c index 9e145690c480..edf882360d24 
> > > 100644
> > > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> > > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > > @@ -3445,7 +3445,7 @@ static int vc4_hdmi_bind(struct device *dev, struct 
> > > device *master, void *data)
> > >*/
> > >   ret = pm_runtime_resume_and_get(dev);
> > >   if (ret)
> > > - goto err_disable_runtime_pm;
> > > + goto err_put_runtime_pm;
> > >   if ((of_device_is_compatible(dev->of_node, 
> > > "brcm,bcm2711-hdmi0") ||
> > >of_device_is_compatible(dev->of_node, 
> > > "brcm,bcm2711-hdmi1")) 
> > > &&
> 
> > The current drm-misc-next branch doesn't have that context at all. What 
> > tree is this based on?
>
> This is for amd-staging-drm-next.

I don't get it, why is there a vc4 patch in an AMD tree?

Maxime




Re: [PATCH 1/2] drm/vc4: Fix build error with undefined label

2023-03-08 Thread Alex Deucher
On Wed, Mar 8, 2023 at 11:32 AM Maxime Ripard  wrote:
>
> On Wed, Mar 08, 2023 at 04:27:01PM +, Zhuo, Qingqing (Lillian) wrote:
> > [AMD Official Use Only - General]
> >
> > > Hi,
> >
> > On Wed, Mar 08, 2023 at 11:11:22AM -0500, Hamza Mahfooz wrote:
> > > + vc4 maintainers
> > >
> > > On 3/8/23 04:34, Qingqing Zhuo wrote:
> > > > [Why]
> > > > drivers/gpu/drm/vc4/vc4_hdmi.c: In function ‘vc4_hdmi_bind’:
> > > > drivers/gpu/drm/vc4/vc4_hdmi.c:3448:17: error: label
> > > > ‘err_disable_runtime_pm’ used but not defined
> > > >
> > > > [How]
> > > > update err_disable_runtime_pm to err_put_runtime_pm.
> > > >
> > > > Signed-off-by: Qingqing Zhuo 
> > > > ---
> > > >   drivers/gpu/drm/vc4/vc4_hdmi.c | 2 +-
> > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c
> > > > b/drivers/gpu/drm/vc4/vc4_hdmi.c index 9e145690c480..edf882360d24
> > > > 100644
> > > > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> > > > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > > > @@ -3445,7 +3445,7 @@ static int vc4_hdmi_bind(struct device *dev, 
> > > > struct device *master, void *data)
> > > >*/
> > > >   ret = pm_runtime_resume_and_get(dev);
> > > >   if (ret)
> > > > - goto err_disable_runtime_pm;
> > > > + goto err_put_runtime_pm;
> > > >   if ((of_device_is_compatible(dev->of_node, 
> > > > "brcm,bcm2711-hdmi0") ||
> > > >of_device_is_compatible(dev->of_node, 
> > > > "brcm,bcm2711-hdmi1"))
> > > > &&
> >
> > > The current drm-misc-next branch doesn't have that context at all. What 
> > > tree is this based on?
> >
> > This is for amd-staging-drm-next.
>
> I don't get it, why is there a vc4 patch in an AMD tree?

There isn't. It just happens to have a vc4 driver with this issue
when we branched it.  Lillian, please double check drm-next or
linux-next for non-AMD drivers.

Alex


Re: [PATCH] drm/amdkfd: Fixed kfd_process cleanup on module exit.

2023-03-08 Thread Felix Kuehling

On 2023-03-08 11:20, Christian König wrote:

On 08.03.23 at 17:17, Felix Kuehling wrote:

On 2023-03-08 04:07, Christian König wrote:

On 07.03.23 at 16:28, Belanger, David wrote:

[AMD Official Use Only - General]


The test case is a python program that will load the driver, do 
some operations, then unload the driver.


What do you mean by unloading the driver? Removing the module? Or 
destroying the device?


When the driver exits, the python process space is still 
around, holding on to the address space.
When the python process space exits, the mmu_notifier gets called 
but the driver has already been unloaded.


The goal of the fix is to address the case where there could be 
outstanding address space / worker threads for process
cleanup that need to be cleared/completed at exit time.


Yeah and when the module is unloaded this is a completely futile 
effort.


The general upstream approach is to take references on the struct 
device and module and prevent unloading as long as those references 
exist.


That's not how it always works. In case of RCU callbacks, the 
documented strategy is to use rcu_barrier in the module exit function 
to ensure the grace period and all callbacks have completed 
(https://www.kernel.org/doc/html/latest/RCU/rcubarrier.html). 
mmu_notifier_synchronize is meant to do something similar for pending 
mmu_notifier_put work 
(https://elixir.bootlin.com/linux/v6.2.2/source/mm/mmu_notifier.c#L1116).


But this implies that we need to call mmu_notifier_put for all the 
MMU notifiers registered by the module first. I think closing 
/dev/kfd drops the module reference count, but the MMU notifiers we 
register for process cleanup persist until the address space is 
destroyed. We need to trigger that cleanup for any processes that 
still exist in that state when the module is unloaded. Or we need to 
find a way to increment the module refcount for every process that 
registers a KFD cleanup MMU notifier.


The latter is what I meant. Cleaning up when the module unloads is 
certainly possible as well, but harder to get right.


I think we can get the cleanup right. I suggested a strategy to David in 
my code review.





And I don't really see a use case for which we should go the cleanup route.


I'm not sure it's a question of use cases. I see it more as a risk 
trade-off. If we manually add module refcounts for our cleanup notifiers 
(try_module_get(THIS_MODULE)/module_put(THIS_MODULE)), there is a risk 
of leaks that could prevent module unloading, or underflows that could 
allow the module to be unloaded too early.


I guess this particular test (app trying to unload the module after 
using KFD) would just fail if we add module refcounts. But I agree that 
this is not a valid use case.


Regards,
  Felix
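
The risk trade-off described above (leaked references block unloading, unbalanced puts allow unloading too early) can be modelled with a small userspace sketch. Everything here is hypothetical: struct fake_module and the fake_module_*() helpers only imitate the behaviour of try_module_get()/module_put() and are not kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a module reference count. */
struct fake_module {
    int refcount;
    bool unloading;
};

/* Models try_module_get(): fails once unloading has started. */
static bool fake_module_get(struct fake_module *m)
{
    if (m->unloading)
        return false;
    m->refcount++;
    return true;
}

/* Models module_put(): reports an underflow instead of going negative. */
static bool fake_module_put(struct fake_module *m)
{
    if (m->refcount <= 0)
        return false;
    m->refcount--;
    return true;
}

/* Unload is only permitted when nothing holds a reference. */
static bool fake_module_try_unload(struct fake_module *m)
{
    if (m->refcount != 0)
        return false;
    m->unloading = true;
    return true;
}
```

In this model, a notifier that takes a reference and never drops it keeps fake_module_try_unload() failing forever (the leak case), while a second put for one get is the underflow case that a real refcount would turn into a premature unload.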




Regards,
Christian.



Regards,
  Felix





The device might not be functional any more (for example because of 
hot plug), but the driver should never be unloaded before the python 
program exits.


Regards,
Christian.



Regards,
David B.


-----Original Message-----
From: Koenig, Christian 
Sent: Tuesday, March 7, 2023 2:05 AM
To: Belanger, David ; amd-
g...@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdkfd: Fixed kfd_process cleanup on module
exit.

On 06.03.23 at 22:58, David Belanger wrote:
Handle case when module is unloaded (kfd_exit) before a process space
(mm_struct) is released.

Well that should never ever happen in the first place. It sounds like we are
missing grabbing module references.

Regards,
Christian.


Signed-off-by: David Belanger 
---
   drivers/gpu/drm/amd/amdkfd/kfd_module.c  |  4 ++
   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 57



   2 files changed, 61 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 09b966dc3768..8ef4bd9e4f7d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -26,6 +26,9 @@
   #include "kfd_priv.h"
   #include "amdgpu_amdkfd.h"

+void kfd_cleanup_processes(void);
+
+
   static int kfd_init(void)
   {
   int err;
@@ -77,6 +80,7 @@ static int kfd_init(void)

   static void kfd_exit(void)
   {
+    kfd_cleanup_processes();
   kfd_debugfs_fini();
   kfd_process_destroy_wq();
   kfd_procfs_shutdown();
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index ebabe92f7edb..b5b28a32639d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1181,6 +1181,17 @@ static void 
kfd_process_notifier_release(struct

mmu_notifier *mn,

   return;

   mutex_lock(_processes_mutex);
+    /*
+ * Do early return if p is not in the table.
+ *
+ * This could potentially happen if this function is called concurrently
+ * by mmu_notifier and by kfd_cleanup_processes.
+ *
+ */
+    if (!hash_hashed(>kfd_processes)) {
+    

Re: [PATCH 1/2] drm/vc4: Fix build error with undefined label

2023-03-08 Thread Maxime Ripard
Hi,

On Wed, Mar 08, 2023 at 11:11:22AM -0500, Hamza Mahfooz wrote:
> + vc4 maintainers
> 
> On 3/8/23 04:34, Qingqing Zhuo wrote:
> > [Why]
> > drivers/gpu/drm/vc4/vc4_hdmi.c: In function ‘vc4_hdmi_bind’:
> > drivers/gpu/drm/vc4/vc4_hdmi.c:3448:17: error: label 
> > ‘err_disable_runtime_pm’ used but not defined
> > 
> > [How]
> > update err_disable_runtime_pm to err_put_runtime_pm.
> > 
> > Signed-off-by: Qingqing Zhuo 
> > ---
> >   drivers/gpu/drm/vc4/vc4_hdmi.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > index 9e145690c480..edf882360d24 100644
> > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > @@ -3445,7 +3445,7 @@ static int vc4_hdmi_bind(struct device *dev, struct 
> > device *master, void *data)
> >  */
> > ret = pm_runtime_resume_and_get(dev);
> > if (ret)
> > -   goto err_disable_runtime_pm;
> > +   goto err_put_runtime_pm;
> > if ((of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi0") ||
> >  of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi1")) &&

The current drm-misc-next branch doesn't have that context at all. What
tree is this based on?

Maxime
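
The build error in the quoted patch comes from a goto whose target label does not exist in the tree being built. As a reminder of the pattern involved, here is a minimal userspace sketch of goto-based unwinding in which every label that is jumped to is defined, and labels appear in reverse order of acquisition. struct ctx, ctx_bind() and ctx_unbind() are hypothetical names and are not taken from vc4_hdmi.c.

```c
#include <stdlib.h>

/* Hypothetical context holding two resources acquired in order. */
struct ctx {
    void *a;
    void *b;
};

/*
 * Acquire a, then b. If fail_late is nonzero, simulate a failure after
 * both were acquired. Each error label undoes exactly the steps that
 * completed before the jump.
 */
static int ctx_bind(struct ctx *c, int fail_late)
{
    int ret;

    c->a = malloc(16);
    if (!c->a)
        return -1;

    c->b = malloc(16);
    if (!c->b) {
        ret = -1;
        goto err_free_a;
    }

    if (fail_late) {
        ret = -3;
        goto err_free_b;
    }

    return 0;

err_free_b:
    free(c->b);
    c->b = NULL;
err_free_a:
    free(c->a);
    c->a = NULL;
    return ret;
}

/* Release everything acquired by a successful ctx_bind(). */
static void ctx_unbind(struct ctx *c)
{
    free(c->b);
    free(c->a);
    c->a = NULL;
    c->b = NULL;
}
```

Renaming or removing a label in one tree while a goto in another branch still uses the old name produces exactly the "label used but not defined" error, which is why the base tree matters for a fix like this.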


RE: [PATCH 2/2] drm/amd/amdkfd: Fix build error with unmatched argument type

2023-03-08 Thread Zhuo, Qingqing (Lillian)
[AMD Official Use Only - General]

On 3/8/23 04:34, Qingqing Zhuo wrote:
> [Why]
> drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_migrate.c: In function 
> ‘svm_migrate_copy_to_vram’:
> ./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu.h:35:21:
> error: format ‘%lx’ expects argument of type ‘long unsigned int’, but 
> argument 6 has type ‘uint64_t’ {aka ‘long long unsigned int’} 
> [-Werror=format=]
> 35 | #define pr_fmt(fmt) "amdgpu: " fmt
>| ^~
> 
> [How]
> use %llx instead of %lx for ttm_res_offset.
> 
> Fixes: d5db9d377c021 ("drm/amdkfd: Fix BO offset for multi-VMA page 
> migration")
> Signed-off-by: Qingqing Zhuo 
> 
> Cc: Xiaogang Chen 
> Cc: Felix Kuehling 
> 
> ---

> I believe this has already been fixed as of commit 271acc541327
> ("drm/amdkfd: fix warning in SVM debug statement"), in amd-staging-drm-next.

Thanks for sharing it. Please ignore this patch then.

>   drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> index 373cd7b0e1ca..fd54a00e7229 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
> @@ -304,7 +304,7 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, 
> struct svm_range *prange,
>   uint64_t i, j;
>   int r;
>   
> - pr_debug("svms 0x%p [0x%lx 0x%lx 0x%lx]\n", prange->svms, prange->start,
> + pr_debug("svms 0x%p [0x%lx 0x%lx 0x%llx]\n", prange->svms, 
> +prange->start,
>prange->last, ttm_res_offset);
>   
>   src = scratch;

--
Lillian
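
For reference, the class of warning discussed in this thread (a 64-bit value passed to a %lx conversion) can be reproduced and avoided in userspace as sketched below. This is an illustration only: format_offset() and format_offset_cast() are hypothetical helpers, and PRIx64 is the userspace <inttypes.h> macro, not something used in the kernel patch, which switches the conversion to %llx.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit offset portably. PRIx64 expands to the correct
 * length modifier for uint64_t on the target platform.
 */
static int format_offset(char *buf, size_t len, uint64_t offset)
{
    return snprintf(buf, len, "0x%" PRIx64, offset);
}

/*
 * Alternative: keep %llx and cast explicitly, so the argument type
 * matches the conversion regardless of how uint64_t is defined.
 */
static int format_offset_cast(char *buf, size_t len, uint64_t offset)
{
    return snprintf(buf, len, "0x%llx", (unsigned long long)offset);
}
```

Both helpers produce the same text; the difference is only in how the argument type is made to agree with the format string so that -Werror=format stays quiet.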


Re: [PATCH 1/2] drm/amdgpu: add flag to enable/disable poll in suspend/resume path

2023-03-08 Thread Alex Deucher
On Wed, Mar 8, 2023 at 7:17 AM Guchun Chen  wrote:
>
> Some amd asics having reliable hotplug support don't call
> drm_kms_helper_poll_init in driver init sequence. However,
> due to the unified suspend/resume path for all asics, because
> the output_poll_work->func is not set for these asics, a warning
> arrives when suspending.
>
> [   90.656049]  
> [   90.656050]  ? console_unlock+0x4d/0x100
> [   90.656053]  ? __irq_work_queue_local+0x27/0x60
> [   90.656056]  ? irq_work_queue+0x2b/0x50
> [   90.656057]  ? __wake_up_klogd+0x40/0x60
> [   90.656059]  __cancel_work_timer+0xed/0x180
> [   90.656061]  drm_kms_helper_poll_disable.cold+0x1f/0x2c [drm_kms_helper]
> [   90.656072]  amdgpu_device_suspend+0x81/0x170 [amdgpu]
> [   90.656180]  amdgpu_pmops_runtime_suspend+0xb5/0x1b0 [amdgpu]
> [   90.656269]  pci_pm_runtime_suspend+0x61/0x1b0
>
> So add use_kms_poll flag as the initialization check in amdgpu code before
> calling drm_kms_helper_poll_disable/drm_kms_helper_poll_enable in 
> suspend/resume
> path.
>
> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2411
> Fixes: a4e771729a51("drm/probe_helper: sort out poll_running vs poll_enabled")
> Reported-by: Bert Karwatzki 
> Suggested-by: Dmitry Baryshkov 
> Signed-off-by: Guchun Chen 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 --
>  drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h   | 1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c   | 1 +
>  drivers/gpu/drm/amd/amdgpu/dce_v10_0.c | 1 +
>  drivers/gpu/drm/amd/amdgpu/dce_v11_0.c | 1 +
>  drivers/gpu/drm/amd/amdgpu/dce_v6_0.c  | 1 +
>  drivers/gpu/drm/amd/amdgpu/dce_v8_0.c  | 1 +
>  7 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index c4a4e2fe6681..74af0b8c0d08 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -4145,7 +4145,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
> fbcon)
> if (amdgpu_acpi_smart_shift_update(dev, AMDGPU_SS_DEV_D3))
> DRM_WARN("smart shift update failed\n");
>
> -   drm_kms_helper_poll_disable(dev);
> +   if (adev->mode_info.use_kms_poll)
> +   drm_kms_helper_poll_disable(dev);
>
> if (fbcon)
> 
> drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, true);
> @@ -4243,7 +4244,8 @@ int amdgpu_device_resume(struct drm_device *dev, bool 
> fbcon)
> if (fbcon)
> 
> drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, false);
>
> -   drm_kms_helper_poll_enable(dev);
> +   if (adev->mode_info.use_kms_poll)
> +   drm_kms_helper_poll_enable(dev);
>

Since polling is only enabled for analog outputs and DC doesn't
support any analog outputs, I think we can simplify this to

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c4a4e2fe6681..74af0b8c0d08 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4145,7 +4145,8 @@ int amdgpu_device_suspend(struct drm_device
*dev, bool fbcon)
  if (amdgpu_acpi_smart_shift_update(dev, AMDGPU_SS_DEV_D3))
  DRM_WARN("smart shift update failed\n");

- drm_kms_helper_poll_disable(dev);
+ if (!adev->dc_enabled)
+ drm_kms_helper_poll_disable(dev);

  if (fbcon)
  drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, true);
@@ -4243,7 +4244,8 @@ int amdgpu_device_resume(struct drm_device *dev,
bool fbcon)
  if (fbcon)
  drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, false);

- drm_kms_helper_poll_enable(dev);
+ if (!adev->dc_enabled)
+ drm_kms_helper_poll_enable(dev);

  amdgpu_ras_resume(adev);

Alternatively, we could also just move drm_kms_helper_poll_disable()
into amdgpu_display_suspend_helper() and drm_kms_helper_poll_enable()
into amdgpu_display_resume_helper(), but I'm not sure if the ordering
here is important or not off hand.

Alex



> amdgpu_ras_resume(adev);
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> index 32fe05c810c6..d383ea3e8e94 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
> @@ -343,6 +343,7 @@ struct amdgpu_mode_info {
> int disp_priority;
> const struct amdgpu_display_funcs *funcs;
> const enum drm_plane_type *plane_type;
> +   bool use_kms_poll;
>  };
>
>  #define AMDGPU_MAX_BL_LEVEL 0xFF
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> index 53ff91fc6cf6..3277799a80bb 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
> @@ -518,6 +518,7 @@ static int amdgpu_vkms_sw_init(void *handle)
> return r;
>
> drm_kms_helper_poll_init(adev_to_drm(adev));
> +

RE: [PATCH 1/2] drm/vc4: Fix build error with undefined label

2023-03-08 Thread Zhuo, Qingqing (Lillian)
[AMD Official Use Only - General]

> Hi,

On Wed, Mar 08, 2023 at 11:11:22AM -0500, Hamza Mahfooz wrote:
> + vc4 maintainers
> 
> On 3/8/23 04:34, Qingqing Zhuo wrote:
> > [Why]
> > drivers/gpu/drm/vc4/vc4_hdmi.c: In function ‘vc4_hdmi_bind’:
> > drivers/gpu/drm/vc4/vc4_hdmi.c:3448:17: error: label 
> > ‘err_disable_runtime_pm’ used but not defined
> > 
> > [How]
> > update err_disable_runtime_pm to err_put_runtime_pm.
> > 
> > Signed-off-by: Qingqing Zhuo 
> > ---
> >   drivers/gpu/drm/vc4/vc4_hdmi.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c 
> > b/drivers/gpu/drm/vc4/vc4_hdmi.c index 9e145690c480..edf882360d24 
> > 100644
> > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > @@ -3445,7 +3445,7 @@ static int vc4_hdmi_bind(struct device *dev, struct 
> > device *master, void *data)
> >  */
> > ret = pm_runtime_resume_and_get(dev);
> > if (ret)
> > -   goto err_disable_runtime_pm;
> > +   goto err_put_runtime_pm;
> > if ((of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi0") ||
> >  of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi1")) 
> > &&

> The current drm-misc-next branch doesn't have that context at all. What tree 
> is this based on?

> Maxime

Hi Maxime,

This is for amd-staging-drm-next.

Thanks,
Lillian


Re: [PATCH] drm/amdkfd: Fixed kfd_process cleanup on module exit.

2023-03-08 Thread Christian König

On 08.03.23 at 17:17, Felix Kuehling wrote:

On 2023-03-08 04:07, Christian König wrote:

On 07.03.23 at 16:28, Belanger, David wrote:

[AMD Official Use Only - General]


The test case is a python program that will load the driver, do some 
operations, then unload the driver.


What do you mean by unloading the driver? Removing the module? Or 
destroying the device?


When the driver exits, the python process space is still 
around, holding on to the address space.
When the python process space exits, the mmu_notifier gets called 
but the driver has already been unloaded.


The goal of the fix is to address the case where there could be 
outstanding address space / worker threads for process
cleanup that need to be cleared/completed at exit time.


Yeah and when the module is unloaded this is a completely futile effort.

The general upstream approach is to take references on the struct 
device and module and prevent unloading as long as those references 
exist.


That's not how it always works. In case of RCU callbacks, the 
documented strategy is to use rcu_barrier in the module exit function 
to ensure the grace period and all callbacks have completed 
(https://www.kernel.org/doc/html/latest/RCU/rcubarrier.html). 
mmu_notifier_synchronize is meant to do something similar for pending 
mmu_notifier_put work 
(https://elixir.bootlin.com/linux/v6.2.2/source/mm/mmu_notifier.c#L1116).


But this implies that we need to call mmu_notifier_put for all the MMU 
notifiers registered by the module first. I think closing /dev/kfd 
drops the module reference count, but the MMU notifiers we register 
for process cleanup persist until the address space is destroyed. We 
need to trigger that cleanup for any processes that still exist in 
that state when the module is unloaded. Or we need to find a way to 
increment the module refcount for every process that registers a KFD 
cleanup MMU notifier.


The latter is what I meant. Cleaning up when the module unloads is 
certainly possible as well, but harder to get right.


And I don't really see a use case for which we should go the cleanup route.

Regards,
Christian.



Regards,
  Felix





The device might not be functional any more (for example because of 
hot plug), but the driver should never be unloaded before the python 
program exits.


Regards,
Christian.



Regards,
David B.


-----Original Message-----
From: Koenig, Christian 
Sent: Tuesday, March 7, 2023 2:05 AM
To: Belanger, David ; amd-
g...@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdkfd: Fixed kfd_process cleanup on module
exit.

On 06.03.23 at 22:58, David Belanger wrote:

Handle case when module is unloaded (kfd_exit) before a process space
(mm_struct) is released.

Well that should never ever happen in the first place. It sounds like we are
missing grabbing module references.

Regards,
Christian.


Signed-off-by: David Belanger 
---
   drivers/gpu/drm/amd/amdkfd/kfd_module.c  |  4 ++
   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 57



   2 files changed, 61 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 09b966dc3768..8ef4bd9e4f7d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -26,6 +26,9 @@
   #include "kfd_priv.h"
   #include "amdgpu_amdkfd.h"

+void kfd_cleanup_processes(void);
+
+
   static int kfd_init(void)
   {
   int err;
@@ -77,6 +80,7 @@ static int kfd_init(void)

   static void kfd_exit(void)
   {
+    kfd_cleanup_processes();
   kfd_debugfs_fini();
   kfd_process_destroy_wq();
   kfd_procfs_shutdown();
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index ebabe92f7edb..b5b28a32639d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1181,6 +1181,17 @@ static void 
kfd_process_notifier_release(struct

mmu_notifier *mn,

   return;

   mutex_lock(_processes_mutex);
+    /*
+ * Do early return if p is not in the table.
+ *
+ * This could potentially happen if this function is called concurrently
+ * by mmu_notifier and by kfd_cleanup_processes.
+ *
+ */
+    if (!hash_hashed(>kfd_processes)) {
+    mutex_unlock(_processes_mutex);
+    return;
+    }
   hash_del_rcu(>kfd_processes);
   mutex_unlock(_processes_mutex);
   synchronize_srcu(_processes_srcu);
@@ -1200,6 +1211,52 @@ static const struct mmu_notifier_ops

kfd_process_mmu_notifier_ops = {

   .free_notifier = kfd_process_free_notifier,
   };

+
+void kfd_cleanup_processes(void)
+{
+    struct kfd_process *p;
+    unsigned int temp;
+
+    /*
+ * Iterate over remaining processes in table, calling 
notifier release

+ * to free up notifier and process resources.
+ *
+ * This code handles the case when driver is unloaded before all

mm_struct

+ * are released.
+ */
+    int idx = 

Re: [PATCH] drm/amdkfd: Fixed kfd_process cleanup on module exit.

2023-03-08 Thread Felix Kuehling

On 2023-03-08 04:07, Christian König wrote:

On 07.03.23 at 16:28, Belanger, David wrote:

[AMD Official Use Only - General]


The test case is a python program that will load the driver, do some 
operations, then unload the driver.


What do you mean by unloading the driver? Removing the module? Or 
destroying the device?


When the driver exits, the python process space is still 
around, holding on to the address space.
When the python process space exits, the mmu_notifier gets called but 
the driver has already been unloaded.


The goal of the fix is to address the case where there could be 
outstanding address space / worker threads for process
cleanup that need to be cleared/completed at exit time.


Yeah and when the module is unloaded this is a completely futile effort.

The general upstream approach is to take references on the struct 
device and module and prevent unloading as long as those references 
exist.


That's not how it always works. In case of RCU callbacks, the documented 
strategy is to use rcu_barrier in the module exit function to ensure the 
grace period and all callbacks have completed 
(https://www.kernel.org/doc/html/latest/RCU/rcubarrier.html). 
mmu_notifier_synchronize is meant to do something similar for pending 
mmu_notifier_put work 
(https://elixir.bootlin.com/linux/v6.2.2/source/mm/mmu_notifier.c#L1116).


But this implies that we need to call mmu_notifier_put for all the MMU 
notifiers registered by the module first. I think closing /dev/kfd drops 
the module reference count, but the MMU notifiers we register for 
process cleanup persist until the address space is destroyed. We need to 
trigger that cleanup for any processes that still exist in that state 
when the module is unloaded. Or we need to find a way to increment the 
module refcount for every process that registers a KFD cleanup MMU notifier.


Regards,
  Felix





The device might not be functional any more (for example because of 
hot plug), but the driver should never be unloaded before the python 
program exits.


Regards,
Christian.



Regards,
David B.


-----Original Message-----
From: Koenig, Christian 
Sent: Tuesday, March 7, 2023 2:05 AM
To: Belanger, David ; amd-
g...@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdkfd: Fixed kfd_process cleanup on module
exit.

On 06.03.23 at 22:58, David Belanger wrote:

Handle case when module is unloaded (kfd_exit) before a process space
(mm_struct) is released.

Well that should never ever happen in the first place. It sounds like we are
missing grabbing module references.

Regards,
Christian.


Signed-off-by: David Belanger 
---
   drivers/gpu/drm/amd/amdkfd/kfd_module.c  |  4 ++
   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 57



   2 files changed, 61 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 09b966dc3768..8ef4bd9e4f7d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -26,6 +26,9 @@
   #include "kfd_priv.h"
   #include "amdgpu_amdkfd.h"

+void kfd_cleanup_processes(void);
+
+
   static int kfd_init(void)
   {
   int err;
@@ -77,6 +80,7 @@ static int kfd_init(void)

   static void kfd_exit(void)
   {
+    kfd_cleanup_processes();
   kfd_debugfs_fini();
   kfd_process_destroy_wq();
   kfd_procfs_shutdown();
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index ebabe92f7edb..b5b28a32639d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1181,6 +1181,17 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
   return;

   mutex_lock(&kfd_processes_mutex);
+    /*
+ * Do early return if p is not in the table.
+ *
+ * This could potentially happen if this function is called concurrently
+ * by mmu_notifier and by kfd_cleanup_processes.
+ *
+ */
+    if (!hash_hashed(&p->kfd_processes)) {
+    mutex_unlock(&kfd_processes_mutex);
+    return;
+    }
   hash_del_rcu(&p->kfd_processes);
   mutex_unlock(&kfd_processes_mutex);
   synchronize_srcu(&kfd_processes_srcu);
@@ -1200,6 +1211,52 @@ static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = {
   .free_notifier = kfd_process_free_notifier,
   };

+
+void kfd_cleanup_processes(void)
+{
+    struct kfd_process *p;
+    unsigned int temp;
+
+    /*
+ * Iterate over remaining processes in table, calling notifier release
+ * to free up notifier and process resources.
+ *
+ * This code handles the case when driver is unloaded before all mm_struct
+ * are released.
+ */
+    int idx = srcu_read_lock(&kfd_processes_srcu);
+
+    hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
+    if (p) {
+    /*
+ * Obtain a reference on p to avoid a late mmu_notifier release
+ * call triggering freeing the process.
+  

Re: [PATCH 1/2] drm/vc4: Fix build error with undefined label

2023-03-08 Thread Hamza Mahfooz

+ vc4 maintainers

On 3/8/23 04:34, Qingqing Zhuo wrote:

[Why]
drivers/gpu/drm/vc4/vc4_hdmi.c: In function ‘vc4_hdmi_bind’:
drivers/gpu/drm/vc4/vc4_hdmi.c:3448:17: error: label ‘err_disable_runtime_pm’ used but not defined

[How]
update err_disable_runtime_pm to err_put_runtime_pm.

Signed-off-by: Qingqing Zhuo 
---
  drivers/gpu/drm/vc4/vc4_hdmi.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 9e145690c480..edf882360d24 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -3445,7 +3445,7 @@ static int vc4_hdmi_bind(struct device *dev, struct device *master, void *data)
 */
ret = pm_runtime_resume_and_get(dev);
if (ret)
-   goto err_disable_runtime_pm;
+   goto err_put_runtime_pm;
  
  	if ((of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi0") ||

 of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi1")) &&


--
Hamza



Re: [PATCH 2/2] drm/amd/amdkfd: Fix build error with unmatched argument type

2023-03-08 Thread Hamza Mahfooz



On 3/8/23 04:34, Qingqing Zhuo wrote:

[Why]
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_migrate.c: In function ‘svm_migrate_copy_to_vram’:
./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu.h:35:21:
error: format ‘%lx’ expects argument of type ‘long unsigned int’,
but argument 6 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Werror=format=]
35 | #define pr_fmt(fmt) "amdgpu: " fmt
   | ^~

[How]
use %llx instead of %lx for ttm_res_offset.

Fixes: d5db9d377c021 ("drm/amdkfd: Fix BO offset for multi-VMA page migration")
Signed-off-by: Qingqing Zhuo 

Cc: Xiaogang Chen 
Cc: Felix Kuehling 

---


I believe this has already been fixed as of commit 271acc541327
("drm/amdkfd: fix warning in SVM debug statement"), in amd-staging-drm-next.
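For readers outside the kernel tree, the type mismatch behind this warning can be reproduced in plain C. The sketch below is a userspace example, not the driver code; `format_offset` and `format_offset_prix64` are hypothetical helpers made up for illustration. In userspace, `uint64_t` may be `unsigned long` or `unsigned long long` depending on the ABI, so the first helper casts explicitly before using `%llx`, and the second uses the `PRIx64` macro from `<inttypes.h>`.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a 64-bit offset as hex into buf.
 * The explicit cast makes the argument type match %llx on every ABI.
 */
static int format_offset(char *buf, size_t len, uint64_t offset)
{
	assert(buf != NULL);
	return snprintf(buf, len, "0x%llx", (unsigned long long)offset);
}

/* Same output using the <inttypes.h> macro, with no cast needed. */
static int format_offset_prix64(char *buf, size_t len, uint64_t offset)
{
	assert(buf != NULL);
	return snprintf(buf, len, "0x%" PRIx64, offset);
}
```

Either form avoids passing a 64-bit value to a bare `%lx`, which is only correct where `long` happens to be 64 bits wide.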


  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 373cd7b0e1ca..fd54a00e7229 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -304,7 +304,7 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, struct svm_range *prange,
uint64_t i, j;
int r;
  
-	pr_debug("svms 0x%p [0x%lx 0x%lx 0x%lx]\n", prange->svms, prange->start,
+	pr_debug("svms 0x%p [0x%lx 0x%lx 0x%llx]\n", prange->svms, prange->start,
 prange->last, ttm_res_offset);
  
  	src = scratch;


--
Hamza



Re: [PATCH] drm/amd/display: remove unused variable available

2023-03-08 Thread Alex Deucher
Applied.  Thanks!

On Wed, Mar 8, 2023 at 9:11 AM Tom Rix  wrote:
>
> With gcc and W=1, there is this error
> drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_dpia_bw.c:297:13:
>  error:
>   variable ‘available’ set but not used [-Werror=unused-but-set-variable]
>   297 | int available = 0;
>   | ^
>
> Since available is unused, remove it.
>
> Signed-off-by: Tom Rix 
> ---
>  .../drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c   | 8 
>  1 file changed, 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c 
> b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c
> index f14217cc16fd..2f0311c42f90 100644
> --- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c
> +++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c
> @@ -294,7 +294,6 @@ bool link_dp_dpia_set_dptx_usb4_bw_alloc_support(struct 
> dc_link *link)
>  void dpia_handle_bw_alloc_response(struct dc_link *link, uint8_t bw, uint8_t 
> result)
>  {
> int bw_needed = 0;
> -   int available = 0;
> int estimated = 0;
> int host_router_total_estimated_bw = 0;
>
> @@ -373,20 +372,13 @@ void dpia_handle_bw_alloc_response(struct dc_link 
> *link, uint8_t bw, uint8_t res
>
> // 1. If due to unplug of other sink
> if (estimated == host_router_total_estimated_bw) {
> -
> // First update the estimated & max_bw fields
> if (link->dpia_bw_alloc_config.estimated_bw < 
> estimated) {
> -   available = estimated - 
> link->dpia_bw_alloc_config.estimated_bw;
> link->dpia_bw_alloc_config.estimated_bw = 
> estimated;
> }
> }
> // 2. If due to realloc bw btw 2 dpia due to plug OR realloc 
> unused Bw
> else {
> -
> -   // We took from another unplugged/problematic sink to 
> give to us
> -   if (link->dpia_bw_alloc_config.estimated_bw < 
> estimated)
> -   available = estimated - 
> link->dpia_bw_alloc_config.estimated_bw;
> -
> // We lost estimated bw usually due to plug event of 
> other dpia
> link->dpia_bw_alloc_config.estimated_bw = estimated;
> }
> --
> 2.27.0
>


Re: [PATCH] drm/amd/display: remove unused variable res_pool

2023-03-08 Thread Alex Deucher
Applied.  Thanks!

On Wed, Mar 8, 2023 at 9:10 AM Tom Rix  wrote:
>
> With gcc and W=1, there is this error
> drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:1214:31:
>   error: variable ‘res_pool’ set but not used 
> [-Werror=unused-but-set-variable]
>  1214 | struct resource_pool *res_pool;
>   |   ^~~~
>
> Since res_pool is unused, remove it.
>
> Signed-off-by: Tom Rix 
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> index 2739bef9b90c..4b9b5e4050fc 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
> @@ -1211,7 +1211,6 @@ static int pre_compute_mst_dsc_configs_for_state(struct 
> drm_atomic_state *state,
> bool computed_streams[MAX_PIPES];
> struct amdgpu_dm_connector *aconnector;
> struct drm_dp_mst_topology_mgr *mst_mgr;
> -   struct resource_pool *res_pool;
> int link_vars_start_index = 0;
> int ret = 0;
>
> @@ -1220,7 +1219,6 @@ static int pre_compute_mst_dsc_configs_for_state(struct 
> drm_atomic_state *state,
>
> for (i = 0; i < dc_state->stream_count; i++) {
> stream = dc_state->streams[i];
> -   res_pool = stream->ctx->dc->res_pool;
>
> if (stream->signal != SIGNAL_TYPE_DISPLAY_PORT_MST)
> continue;
> --
> 2.27.0
>


[linux-next:master] BUILD SUCCESS WITH WARNING fc31900c948610e7b5c2f15fb7795832c8325327

2023-03-08 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: fc31900c948610e7b5c2f15fb7795832c8325327  Add linux-next specific 
files for 20230308

Warning reports:

https://lore.kernel.org/oe-kbuild-all/202302100744.d1zzxxfn-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202302111601.jty4lkra-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202303081345.oamwqah7-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202303081432.d9jwidy9-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202303081657.6ble80uy-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202303081807.lblwkmpx-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202303082135.njdx1bij-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202303082325.ywfmfbaj-...@intel.com

Warning: (recently discovered and may have been fixed)

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:1214:31: 
warning: variable 'res_pool' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_optc.c:294:6: warning: no 
previous prototype for 'optc3_wait_drr_doublebuffer_pending_clear' 
[-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_capability.c:2182:
 warning: expecting prototype for Check if there is a native DP or passive 
DP(). Prototype was for dp_is_sink_present() instead
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_capability.c:2184:
 warning: expecting prototype for Check if there is a native DP or passive 
DP(). Prototype was for dp_is_sink_present() instead
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_dpia_bw.c:297:13:
 warning: variable 'available' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_6_ppt.c:1146:3: warning: 
variable 'hotspotlimit' is uninitialized when used here [-Wuninitialized]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_6_ppt.c:1149:24: 
warning: variable 'memlimit' is uninitialized when used here [-Wuninitialized]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_6_ppt.c:1152:34: 
warning: variable 'software_shutdown_temp' is uninitialized when used here 
[-Wuninitialized]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_6_ppt.c:315:17: sparse:  
  int
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_6_ppt.c:315:17: sparse:  
  void
drivers/soc/renesas/pwc-rzv2m.c:124:34: warning: unused variable 
'rzv2m_pwc_of_match' [-Wunused-const-variable]
security/security.c:3647: warning: expecting prototype for 
security_socket_create(). Prototype was for security_socket_post_create() 
instead
security/security.c:4110: warning: expecting prototype for 
security_socket_create(). Prototype was for security_socket_post_create() 
instead

Unverified Warning (likely false positive, please contact us if interested):

drivers/usb/gadget/composite.c:2082:33: sparse: sparse: restricted __le16 
degrades to integer
drivers/watchdog/imx2_wdt.c:442:22: sparse: sparse: symbol 'imx_wdt' was not 
declared. Should it be static?
drivers/watchdog/imx2_wdt.c:446:22: sparse: sparse: symbol 'imx_wdt_legacy' was 
not declared. Should it be static?

Warning ids grouped by kconfigs:

gcc_recent_errors
|-- alpha-randconfig-s033-20230305
|   |-- drivers-gpu-drm-amd-amdgpu-..-display-dc-link-protocols-link_dp_dpia_bw.c:warning:variable-available-set-but-not-used
|   |-- drivers-gpu-drm-amd-amdgpu-..-pm-swsmu-smu13-smu_v13_0_6_ppt.c:sparse:int
|   |-- drivers-gpu-drm-amd-amdgpu-..-pm-swsmu-smu13-smu_v13_0_6_ppt.c:sparse:sparse:incompatible-types-in-conditional-expression-(different-base-types):
|   `-- drivers-gpu-drm-amd-amdgpu-..-pm-swsmu-smu13-smu_v13_0_6_ppt.c:sparse:void
|-- arm64-allyesconfig
|   `-- drivers-gpu-drm-amd-amdgpu-..-display-dc-dcn30-dcn30_optc.c:warning:no-previous-prototype-for-optc3_wait_drr_doublebuffer_pending_clear
|-- csky-randconfig-s031-20230305
|   |-- drivers-usb-gadget-composite.c:sparse:sparse:restricted-__le16-degrades-to-integer
|   |-- drivers-watchdog-imx2_wdt.c:sparse:sparse:symbol-imx_wdt-was-not-declared.-Should-it-be-static
|   `-- drivers-watchdog-imx2_wdt.c:sparse:sparse:symbol-imx_wdt_legacy-was-not-declared.-Should-it-be-static
|-- i386-allyesconfig
|   `-- drivers-gpu-drm-amd-amdgpu-..-display-dc-dcn30-dcn30_optc.c:warning:no-previous-prototype-for-optc3_wait_drr_doublebuffer_pending_clear
|-- i386-buildonly-randconfig-r006-20230306
|   `-- drivers-gpu-drm-amd-amdgpu-..-display-dc-dcn30-dcn30_optc.c:warning:no-previous-prototype-for-optc3_wait_drr_doublebuffer_pending_clear
|-- microblaze-randconfig-c033-20230305
|   `-- drivers-gpu-drm-amd-amdgpu-..-display-dc-link-protocols-link_dp_dpia_bw.c:warning:variable-available-set-but-not-used
|-- openrisc-randconfig-s052-20230305
|   `-- drivers-usb-gadget-composite.c:sparse:sparse:restricted-__le16-degrades-to-integer
|-- powerpc-allmodconfig
|   |-- drivers-gpu-drm-amd-amdgpu-..-display-dc-link-protocols

Re: [PATCH] drm/amd/pm: bump SMU 13.0.4 driver_if header version

2023-03-08 Thread Alex Deucher
Reviewed-by: Alex Deucher 

On Tue, Mar 7, 2023 at 10:44 PM Tim Huang  wrote:
>
> Align the SMU driver interface version with PMFW to
> suppress the version mismatch message on driver loading.
>
> Signed-off-by: Tim Huang 
> ---
>  .../drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_4.h| 4 ++--
>  drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h  | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git 
> a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_4.h 
> b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_4.h
> index f77401709d83..2162ecd1057d 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_4.h
> +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/pmfw_if/smu13_driver_if_v13_0_4.h
> @@ -27,7 +27,7 @@
>  // *** IMPORTANT ***
>  // SMU TEAM: Always increment the interface version if
>  // any structure is changed in this file
> -#define PMFW_DRIVER_IF_VERSION 7
> +#define PMFW_DRIVER_IF_VERSION 8
>
>  typedef struct {
>int32_t value;
> @@ -198,7 +198,7 @@ typedef struct {
>uint16_t SkinTemp;
>uint16_t DeviceState;
>uint16_t CurTemp; //[centi-Celsius]
> -  uint16_t spare2;
> +  uint16_t FilterAlphaValue;
>
>uint16_t AverageGfxclkFrequency;
>uint16_t AverageFclkFrequency;
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h 
> b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h
> index e7d8b4eb4b56..0ef37837b164 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h
> +++ b/drivers/gpu/drm/amd/pm/swsmu/inc/smu_v13_0.h
> @@ -29,7 +29,7 @@
>  #define SMU13_DRIVER_IF_VERSION_YELLOW_CARP 0x04
>  #define SMU13_DRIVER_IF_VERSION_ALDE 0x08
>  #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_0_0 0x37
> -#define SMU13_DRIVER_IF_VERSION_SMU_V13_0_4 0x07
> +#define SMU13_DRIVER_IF_VERSION_SMU_V13_0_4 0x08
>  #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_5 0x04
>  #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_0_10 0x32
>  #define SMU13_DRIVER_IF_VERSION_SMU_V13_0_7 0x37
> --
> 2.25.1
>


[PATCH 2/2] drm/amd/pm: Fix navi10 incorrect OD volage after resume

2023-03-08 Thread Alex Deucher
Always setup overdrive tables after resume. Preserve only some
user-defined settings in user_overdrive_table if they're set.

Copy restored user_overdrive_table into od_table to get correct
values.

On cold boot, BTC is triggered and GfxVfCurve is calibrated, giving
VfCurve settings (a). On resume, BTC is triggered again and
GfxVfCurve is recalibrated, so the resulting VfCurve settings (b)
may differ from those of cold boot. If we reuse the cold-boot
settings (a) after resume, we can run into discrepancies.
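The restore logic described above can be modelled in a few lines of standalone C. This is a simplified sketch, not the driver code: `od_table_model` is a hypothetical three-field stand-in for OverDriveTable_t, and `refresh_user_table` is an invented helper that mirrors the branch structure of the patch. A cold boot copies the boot table wholesale, while a resume with user overrides re-bases on the newly read boot table and re-applies only the user-tunable fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for OverDriveTable_t. */
struct od_table_model {
	int gfxclk_fmax;	/* user-tunable */
	int uclk_fmax;		/* user-tunable */
	int vf_curve;		/* recalibrated on every boot/resume */
};

/*
 * Rebuild the user table from a freshly read boot table.
 * Returns the new value of the "user has overrides" flag.
 */
static bool refresh_user_table(struct od_table_model *user,
			       const struct od_table_model *boot,
			       bool in_suspend, bool user_od)
{
	assert(user && boot);

	if (!in_suspend) {
		/* Cold boot: start from defaults, forget any overrides. */
		memcpy(user, boot, sizeof(*user));
		return false;
	}
	if (user_od) {
		/* Resume with overrides: keep only the user-tunable fields. */
		struct od_table_model bak;

		memcpy(&bak, user, sizeof(bak));
		memcpy(user, boot, sizeof(*user));
		user->gfxclk_fmax = bak.gfxclk_fmax;
		user->uclk_fmax = bak.uclk_fmax;
	}
	return user_od;
}
```

The point of the backup-then-overwrite order is that calibration-dependent fields (here `vf_curve`) always come from the current boot table, never from a stale copy taken before suspend.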

Based on the sienna cichlid patch from Błażej Szczygieł 

Cc: Błażej Szczygieł 
Cc: Evan Quan 
Signed-off-by: Alex Deucher 
---
 .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c   | 47 +++
 1 file changed, 37 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
index 95da6dd1cc65..68201d8e1c72 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
@@ -2510,16 +2510,9 @@ static int navi10_set_default_od_settings(struct smu_context *smu)
(OverDriveTable_t *)smu->smu_table.boot_overdrive_table;
OverDriveTable_t *user_od_table =
(OverDriveTable_t *)smu->smu_table.user_overdrive_table;
+   OverDriveTable_t user_od_table_bak;
int ret = 0;
 
-   /*
-* For S3/S4/Runpm resume, no need to setup those overdrive tables again as
-*   - either they already have the default OD settings got during cold bootup
-*   - or they have some user customized OD settings which cannot be overwritten
-*/
-   if (smu->adev->in_suspend)
-   return 0;
-
ret = smu_cmn_update_table(smu, SMU_TABLE_OVERDRIVE, 0, (void *)boot_od_table, false);
if (ret) {
dev_err(smu->adev->dev, "Failed to get overdrive table!\n");
@@ -2553,7 +2546,27 @@ static int navi10_set_default_od_settings(struct smu_context *smu)
navi10_dump_od_table(smu, boot_od_table);
 
memcpy(od_table, boot_od_table, sizeof(OverDriveTable_t));
-   memcpy(user_od_table, boot_od_table, sizeof(OverDriveTable_t));
+
+   /*
+* For S3/S4/Runpm resume, we need to setup those overdrive tables again,
+* but we have to preserve user defined values in "user_od_table".
+*/
+   if (!smu->adev->in_suspend) {
+   memcpy(user_od_table, boot_od_table, sizeof(OverDriveTable_t));
+   smu->user_dpm_profile.user_od = false;
+   } else if (smu->user_dpm_profile.user_od) {
+   memcpy(&user_od_table_bak, user_od_table, sizeof(OverDriveTable_t));
+   memcpy(user_od_table, boot_od_table, sizeof(OverDriveTable_t));
+   user_od_table->GfxclkFmin = user_od_table_bak.GfxclkFmin;
+   user_od_table->GfxclkFmax = user_od_table_bak.GfxclkFmax;
+   user_od_table->UclkFmax = user_od_table_bak.UclkFmax;
+   user_od_table->GfxclkFreq1 = user_od_table_bak.GfxclkFreq1;
+   user_od_table->GfxclkVolt1 = user_od_table_bak.GfxclkVolt1;
+   user_od_table->GfxclkFreq2 = user_od_table_bak.GfxclkFreq2;
+   user_od_table->GfxclkVolt2 = user_od_table_bak.GfxclkVolt2;
+   user_od_table->GfxclkFreq3 = user_od_table_bak.GfxclkFreq3;
+   user_od_table->GfxclkVolt3 = user_od_table_bak.GfxclkVolt3;
+   }
 
return 0;
 }
@@ -2733,6 +2746,20 @@ static int navi10_od_edit_dpm_table(struct smu_context *smu, enum PP_OD_DPM_TABL
return ret;
 }
 
+static int navi10_restore_user_od_settings(struct smu_context *smu)
+{
+   struct smu_table_context *table_context = &smu->smu_table;
+   OverDriveTable_t *od_table = table_context->overdrive_table;
+   OverDriveTable_t *user_od_table = table_context->user_overdrive_table;
+   int res;
+
+   res = smu_v11_0_restore_user_od_settings(smu);
+   if (res == 0)
+   memcpy(od_table, user_od_table, sizeof(OverDriveTable_t));
+
+   return res;
+}
+
 static int navi10_run_btc(struct smu_context *smu)
 {
int ret = 0;
@@ -3560,7 +3587,7 @@ static const struct pptable_funcs navi10_ppt_funcs = {
.set_soft_freq_limited_range = smu_v11_0_set_soft_freq_limited_range,
.set_default_od_settings = navi10_set_default_od_settings,
.od_edit_dpm_table = navi10_od_edit_dpm_table,
-   .restore_user_od_settings = smu_v11_0_restore_user_od_settings,
+   .restore_user_od_settings = navi10_restore_user_od_settings,
.run_btc = navi10_run_btc,
.set_power_source = smu_v11_0_set_power_source,
.get_pp_feature_mask = smu_cmn_get_pp_feature_mask,
-- 
2.39.2



[PATCH 1/2] drm/amd/pm: Fix sienna cichlid incorrect OD volage after resume

2023-03-08 Thread Alex Deucher
From: Błażej Szczygieł 

Always setup overdrive tables after resume. Preserve only some
user-defined settings in user_overdrive_table if they're set.

Copy restored user_overdrive_table into od_table to get correct
values.

On cold boot, BTC is triggered and GfxVfCurve is calibrated, giving
VfCurve settings (a). On resume, BTC is triggered again and
GfxVfCurve is recalibrated, so the resulting VfCurve settings (b)
may differ from those of cold boot. If we reuse the cold-boot
settings (a) after resume, we can run into discrepancies.

Reviewed-by: Evan Quan 
Signed-off-by: Błażej Szczygieł 
Signed-off-by: Alex Deucher 
---
 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 43 ++-
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
index 697e98a0a20a..75f18681e984 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
@@ -2143,16 +2143,9 @@ static int sienna_cichlid_set_default_od_settings(struct smu_context *smu)
(OverDriveTable_t *)smu->smu_table.boot_overdrive_table;
OverDriveTable_t *user_od_table =
(OverDriveTable_t *)smu->smu_table.user_overdrive_table;
+   OverDriveTable_t user_od_table_bak;
int ret = 0;
 
-   /*
-* For S3/S4/Runpm resume, no need to setup those overdrive tables again as
-*   - either they already have the default OD settings got during cold bootup
-*   - or they have some user customized OD settings which cannot be overwritten
-*/
-   if (smu->adev->in_suspend)
-   return 0;
-
ret = smu_cmn_update_table(smu, SMU_TABLE_OVERDRIVE,
   0, (void *)boot_od_table, false);
if (ret) {
@@ -2163,7 +2156,23 @@ static int sienna_cichlid_set_default_od_settings(struct smu_context *smu)
sienna_cichlid_dump_od_table(smu, boot_od_table);
 
memcpy(od_table, boot_od_table, sizeof(OverDriveTable_t));
-   memcpy(user_od_table, boot_od_table, sizeof(OverDriveTable_t));
+
+   /*
+* For S3/S4/Runpm resume, we need to setup those overdrive tables again,
+* but we have to preserve user defined values in "user_od_table".
+*/
+   if (!smu->adev->in_suspend) {
+   memcpy(user_od_table, boot_od_table, sizeof(OverDriveTable_t));
+   smu->user_dpm_profile.user_od = false;
+   } else if (smu->user_dpm_profile.user_od) {
+   memcpy(&user_od_table_bak, user_od_table, sizeof(OverDriveTable_t));
+   memcpy(user_od_table, boot_od_table, sizeof(OverDriveTable_t));
+   user_od_table->GfxclkFmin = user_od_table_bak.GfxclkFmin;
+   user_od_table->GfxclkFmax = user_od_table_bak.GfxclkFmax;
+   user_od_table->UclkFmin = user_od_table_bak.UclkFmin;
+   user_od_table->UclkFmax = user_od_table_bak.UclkFmax;
+   user_od_table->VddGfxOffset = user_od_table_bak.VddGfxOffset;
+   }
 
return 0;
 }
@@ -2373,6 +2382,20 @@ static int sienna_cichlid_od_edit_dpm_table(struct smu_context *smu,
return ret;
 }
 
+static int sienna_cichlid_restore_user_od_settings(struct smu_context *smu)
+{
+   struct smu_table_context *table_context = &smu->smu_table;
+   OverDriveTable_t *od_table = table_context->overdrive_table;
+   OverDriveTable_t *user_od_table = table_context->user_overdrive_table;
+   int res;
+
+   res = smu_v11_0_restore_user_od_settings(smu);
+   if (res == 0)
+   memcpy(od_table, user_od_table, sizeof(OverDriveTable_t));
+
+   return res;
+}
+
 static int sienna_cichlid_run_btc(struct smu_context *smu)
 {
int res;
@@ -4400,7 +4423,7 @@ static const struct pptable_funcs sienna_cichlid_ppt_funcs = {
.set_soft_freq_limited_range = smu_v11_0_set_soft_freq_limited_range,
.set_default_od_settings = sienna_cichlid_set_default_od_settings,
.od_edit_dpm_table = sienna_cichlid_od_edit_dpm_table,
-   .restore_user_od_settings = smu_v11_0_restore_user_od_settings,
+   .restore_user_od_settings = sienna_cichlid_restore_user_od_settings,
.run_btc = sienna_cichlid_run_btc,
.set_power_source = smu_v11_0_set_power_source,
.get_pp_feature_mask = smu_cmn_get_pp_feature_mask,
-- 
2.39.2



Re: [PATCH] drm/amdgpu/soc21: Add video cap query support for VCN_4_0_4

2023-03-08 Thread Alex Deucher
On Wed, Mar 8, 2023 at 9:58 AM  wrote:
>
> From: Veerabadhran Gopalakrishnan 
>
> Added the video capability query support for VCN version 4_0_4
>
> Signed-off-by: Veerabadhran Gopalakrishnan 
> 
> Reviewed-by: Leo Liu 

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/amd/amdgpu/soc21.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c 
> b/drivers/gpu/drm/amd/amdgpu/soc21.c
> index 9df223600..061793d39 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc21.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
> @@ -111,6 +111,7 @@ static int soc21_query_video_codecs(struct amdgpu_device 
> *adev, bool encode,
> switch (adev->ip_versions[UVD_HWIP][0]) {
> case IP_VERSION(4, 0, 0):
> case IP_VERSION(4, 0, 2):
> +   case IP_VERSION(4, 0, 4):
> if (adev->vcn.harvest_config & AMDGPU_VCN_HARVEST_VCN0) {
> if (encode)
> *codecs = &vcn_4_0_0_video_codecs_encode_vcn1;
> --
> 2.34.1
>


RE: [PATCH] drm/amdkfd: Fixed kfd_process cleanup on module exit.

2023-03-08 Thread Belanger, David
[AMD Official Use Only - General]



> -Original Message-
> From: Christian König 
> Sent: Wednesday, March 8, 2023 4:08 AM
> To: Belanger, David ; Koenig, Christian
> ; amd-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/amdkfd: Fixed kfd_process cleanup on module
> exit.
> 
> Caution: This message originated from an External Source. Use proper
> caution when opening attachments, clicking links, or responding.
> 
> 
> Am 07.03.23 um 16:28 schrieb Belanger, David:
> > [AMD Official Use Only - General]
> >
> >
> > The test case is a python program that will load the driver, do some
> operations, then unload the driver.
> 
> What do you mean with unloading the driver? Removing the module? Or
> destroying the device?
> 

The python program calls a shell script that does "modprobe amdgpu".
It then calls some SMI operations to get some events.
Then it calls a shell script that does "modprobe -r amdgpu".
Then it exits.

A reference on the kfd_process remains, and it is released only when
mmu_notifier ops->release is called.
That does not happen until the python process ends.

The test program is definitely not typical of how a general user would
use the driver.

> > When the driver exists, there is still the python process space around
> holding on the address space.
> > When the python process space exits, the mmu_notifier gets called but the
> driver has already been unloaded.
> >
> > The goal of the fix is to address case where there could be
> > outstanding address space / worker threads for process cleanup that needs
> to be cleared/completed at exit time.
> 
> Yeah and when the module is unloaded this is a completely futile effort.
> 
> The general upstream approach is to take references on the struct device and
> module and prevent unloading as long as those references exists.
> 
> The device might be non-functional any more (because for example of hot
> plug), but the driver should never be unloaded before the python program
> exits.

Thank you for your feedback, I will investigate that approach.

> 
> Regards,
> Christian.
> 
> >
> > Regards,
> > David B.
> >
> >> -Original Message-
> >> From: Koenig, Christian 
> >> Sent: Tuesday, March 7, 2023 2:05 AM
> >> To: Belanger, David ; amd-
> >> g...@lists.freedesktop.org
> >> Subject: Re: [PATCH] drm/amdkfd: Fixed kfd_process cleanup on module
> >> exit.
> >>
> >> Am 06.03.23 um 22:58 schrieb David Belanger:
> >>> Handle case when module is unloaded (kfd_exit) before a process
> >>> space
> >>> (mm_struct) is released.
> >> Well that should never ever happen in the first place. It sounds like
> >> we are missing grabbing module references.
> >>
> >> Regards,
> >> Christian.
> >>
> >>> Signed-off-by: David Belanger 
> >>> ---
> >>>drivers/gpu/drm/amd/amdkfd/kfd_module.c  |  4 ++
> >>>drivers/gpu/drm/amd/amdkfd/kfd_process.c | 57
> >> 
> >>>2 files changed, 61 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> >>> b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> >>> index 09b966dc3768..8ef4bd9e4f7d 100644
> >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
> >>> @@ -26,6 +26,9 @@
> >>>#include "kfd_priv.h"
> >>>#include "amdgpu_amdkfd.h"
> >>>
> >>> +void kfd_cleanup_processes(void);
> >>> +
> >>> +
> >>>static int kfd_init(void)
> >>>{
> >>> int err;
> >>> @@ -77,6 +80,7 @@ static int kfd_init(void)
> >>>
> >>>static void kfd_exit(void)
> >>>{
> >>> +   kfd_cleanup_processes();
> >>> kfd_debugfs_fini();
> >>> kfd_process_destroy_wq();
> >>> kfd_procfs_shutdown();
> >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> >>> b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> >>> index ebabe92f7edb..b5b28a32639d 100644
> >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> >>> @@ -1181,6 +1181,17 @@ static void
> >>> kfd_process_notifier_release(struct
> >> mmu_notifier *mn,
> >>> return;
> >>>
> >>> mutex_lock(&kfd_processes_mutex);
> >>> +   /*
> >>> +* Do early return if p is not in the table.
> >>> +*
> >>> +* This could potentially happen if this function is called concurrently
> >>> +* by mmu_notifier and by kfd_cleanup_processes.
> >>> +*
> >>> +*/
> >>> +   if (!hash_hashed(&p->kfd_processes)) {
> >>> +   mutex_unlock(&kfd_processes_mutex);
> >>> +   return;
> >>> +   }
> >>> hash_del_rcu(&p->kfd_processes);
> >>> mutex_unlock(&kfd_processes_mutex);
> >>> synchronize_srcu(&kfd_processes_srcu);
> >>> @@ -1200,6 +1211,52 @@ static const struct mmu_notifier_ops
> >> kfd_process_mmu_notifier_ops = {
> >>> .free_notifier = kfd_process_free_notifier,
> >>>};
> >>>
> >>> +
> >>> +void kfd_cleanup_processes(void)
> >>> +{
> >>> +   struct kfd_process *p;
> >>> +   unsigned int temp;
> >>> +
> >>> +   /*
> >>> +* Iterate over remaining processes in table, calling 

Re: [PATCH v2] drm/amd/pm: Fix sienna cichlid incorrect OD volage after resume

2023-03-08 Thread Alex Deucher
On Tue, Mar 7, 2023 at 9:34 PM Quan, Evan  wrote:
>
> [AMD Official Use Only - General]
>
> Thanks Alex. I probably get the root cause of the issue. It should be related 
> with the BTC feature.
> - On cold boot, BTC was triggered and GfxVfCurve was calibrated.
>We got VfCurve settings (a).
> - On resuming back, BTC will be triggered again and GfxVfCurve will be 
> recalibrated.
>VfCurve settings (b) got may be different from those of cold boot.
>So if to reuse those VfCurve settings (a) got on cold boot, we might got 
> some V/f issues.
>
> These can be confirmed by comparing the CustomGfxVfCurve settings got on cold 
> boot and resuming.
> +   dev_err(smu->adev->dev, "OD: GfxVfCurve: (%d, %d, %d)\n",
> +   
> od_table->CustomGfxVfCurve.a,
> +   
> od_table->CustomGfxVfCurve.b,
> +   
> od_table->CustomGfxVfCurve.c);
> Below are some data collected on my nv21 platform and we can see the 
> GfxVfCurve settings are different.
> - On cold boot: OD: GfxVfCurve: (1046469987, -1089068751, 1066221898)
> - On resuming back: OD: GfxVfCurve: (1046393849, -1089130480, 1066199153)
>
> Hi @Błażej Szczygieł,
> If you can add some descriptions about the BTC and GfxVfCurve related in the 
> patch header/description part, it will be better.
> Anyway, the patch is Reviewed-by: Evan Quan 

Presumably this should be done for navi10 as well?

Alex

>
> BR
> Evan
> > -Original Message-
> > From: amd-gfx  On Behalf Of Alex
> > Deucher
> > Sent: Tuesday, March 7, 2023 11:23 PM
> > To: Quan, Evan 
> > Cc: Błażej Szczygieł ; amd-
> > g...@lists.freedesktop.org
> > Subject: Re: [PATCH v2] drm/amd/pm: Fix sienna cichlid incorrect OD volage
> > after resume
> >
> > On Tue, Mar 7, 2023 at 3:23 AM Quan, Evan  wrote:
> > >
> > > [AMD Official Use Only - General]
> > >
> > > Can you share more background about this? I cannot see how this can
> > address incorrect OD voltage issue.
> >
> > See https://gitlab.freedesktop.org/drm/amd/-/issues/1897
> > The OD settings don't seem to be restored properly on resume.
> >
> > Alex
> >
> > >
> > > BR
> > > Evan
> > > > -Original Message-
> > > > From: amd-gfx  On Behalf Of
> > > > Blazej Szczygiel
> > > > Sent: Sunday, March 5, 2023 7:45 AM
> > > > To: amd-gfx@lists.freedesktop.org
> > > > Cc: Błażej Szczygieł 
> > > > Subject: [PATCH v2] drm/amd/pm: Fix sienna cichlid incorrect OD
> > > > volage after resume
> > > >
> > > > Always setup overdrive tables after resume. Preserve only some
> > > > user-defined settings in user_overdrive_table if they're set.
> > > >
> > > > Copy restored user_overdrive_table into od_table to get correct
> > > > values.
> > > >
> > > > Signed-off-by: Błażej Szczygieł 
> > > > ---
> > > >  .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c   | 43
> > ++
> > > > -
> > > >  1 file changed, 33 insertions(+), 10 deletions(-)
> > > >
> > > > diff --git
> > a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> > > > b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> > > > index 697e98a0a20a..75f18681e984 100644
> > > > --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> > > > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c
> > > > @@ -2143,16 +2143,9 @@ static int
> > > > sienna_cichlid_set_default_od_settings(struct smu_context *smu)
> > > >   (OverDriveTable_t *)smu->smu_table.boot_overdrive_table;
> > > >   OverDriveTable_t *user_od_table =
> > > >   (OverDriveTable_t
> > > > *)smu->smu_table.user_overdrive_table;
> > > > + OverDriveTable_t user_od_table_bak;
> > > >   int ret = 0;
> > > >
> > > > - /*
> > > > -  * For S3/S4/Runpm resume, no need to setup those overdrive
> > > > tables again as
> > > > -  *   - either they already have the default OD settings got 
> > > > during cold
> > > > bootup
> > > > -  *   - or they have some user customized OD settings which cannot 
> > > > be
> > > > overwritten
> > > > -  */
> > > > - if (smu->adev->in_suspend)
> > > > - return 0;
> > > > -
> > > >   ret = smu_cmn_update_table(smu, SMU_TABLE_OVERDRIVE,
> > > >  0, (void *)boot_od_table, false);
> > > >   if (ret) {
> > > > @@ -2163,7 +2156,23 @@ static int
> > > > sienna_cichlid_set_default_od_settings(struct smu_context *smu)
> > > >   sienna_cichlid_dump_od_table(smu, boot_od_table);
> > > >
> > > >   memcpy(od_table, boot_od_table, sizeof(OverDriveTable_t));
> > > > - memcpy(user_od_table, boot_od_table, sizeof(OverDriveTable_t));
> > > > +
> > > > + /*
> > > > +  * For S3/S4/Runpm resume, we need to setup those overdrive
> > > > tables again,
> > > > +  * but we have to preserve user defined values in "user_od_table".
> > > > +  */
> > > > + if (!smu->adev->in_suspend) {
> > > > + 

[PATCH] drm/amdgpu/soc21: Add video cap query support for VCN_4_0_4

2023-03-08 Thread veerabadhran.gopalakrishnan
From: Veerabadhran Gopalakrishnan 

Add video capability query support for VCN version 4_0_4.

Signed-off-by: Veerabadhran Gopalakrishnan 
Reviewed-by: Leo Liu 
---
 drivers/gpu/drm/amd/amdgpu/soc21.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c 
b/drivers/gpu/drm/amd/amdgpu/soc21.c
index 9df223600..061793d39 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc21.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc21.c
@@ -111,6 +111,7 @@ static int soc21_query_video_codecs(struct amdgpu_device 
*adev, bool encode,
switch (adev->ip_versions[UVD_HWIP][0]) {
case IP_VERSION(4, 0, 0):
case IP_VERSION(4, 0, 2):
+   case IP_VERSION(4, 0, 4):
if (adev->vcn.harvest_config & AMDGPU_VCN_HARVEST_VCN0) {
if (encode)
*codecs = _4_0_0_video_codecs_encode_vcn1;
-- 
2.34.1



Re: [PATCH] drm/amdgpu: Drop redundant pci_enable_pcie_error_reporting()

2023-03-08 Thread Alex Deucher
Applied.  Thanks!

Alex

On Tue, Mar 7, 2023 at 3:22 PM Bjorn Helgaas  wrote:
>
> From: Bjorn Helgaas 
>
> pci_enable_pcie_error_reporting() enables the device to send ERR_*
> Messages.  Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is
> native"), the PCI core does this for all devices during enumeration, so the
> driver doesn't need to do it itself.
>
> Remove the redundant pci_enable_pcie_error_reporting() call from the
> driver.
>
> Note that this only controls ERR_* Messages from the device.  An ERR_*
> Message may cause the Root Port to generate an interrupt, depending on the
> AER Root Error Command register managed by the AER service driver.
>
> Signed-off-by: Bjorn Helgaas 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 --
>  2 files changed, 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 164141bc8b4a..208cebb40232 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -50,7 +50,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>
>  #include 
>  #include 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index c4a4e2fe6681..a5151e83a3f7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3773,8 +3773,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> }
> }
>
> -   pci_enable_pcie_error_reporting(adev->pdev);
> -
> /* Post card if necessary */
> if (amdgpu_device_need_post(adev)) {
> if (!adev->bios) {
> --
> 2.25.1
>


Re: [PATCH] drm/amd/display: add prefix to amdgpu_dm_crtc.h functions

2023-03-08 Thread Alex Deucher
Applied.  Thanks!

On Tue, Mar 7, 2023 at 2:34 PM David Tadokoro  wrote:
>
> Some amdgpu_dm_crtc.h functions didn't have names that indicated where
> they were declared.
>
> To better filter results in debug tools like ftrace, prefix these
> functions with 'amdgpu_dm_crtc_'.
>
> Signed-off-by: David Tadokoro 
> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 32 +--
>  .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c| 26 +++
>  .../amd/display/amdgpu_dm/amdgpu_dm_crtc.h| 14 
>  3 files changed, 36 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index b472931cb7ca..b3e874589617 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -342,7 +342,7 @@ static inline bool is_dc_timing_adjust_needed(struct 
> dm_crtc_state *old_state,
>  {
> if (new_state->freesync_config.state ==  VRR_STATE_ACTIVE_FIXED)
> return true;
> -   else if (amdgpu_dm_vrr_active(old_state) != 
> amdgpu_dm_vrr_active(new_state))
> +   else if (amdgpu_dm_crtc_vrr_active(old_state) != 
> amdgpu_dm_crtc_vrr_active(new_state))
> return true;
> else
> return false;
> @@ -436,7 +436,7 @@ static void dm_pflip_high_irq(void *interrupt_params)
>
> WARN_ON(!e);
>
> -   vrr_active = amdgpu_dm_vrr_active_irq(amdgpu_crtc);
> +   vrr_active = amdgpu_dm_crtc_vrr_active_irq(amdgpu_crtc);
>
> /* Fixed refresh rate, or VRR scanout position outside front-porch? */
> if (!vrr_active ||
> @@ -510,7 +510,7 @@ static void dm_vupdate_high_irq(void *interrupt_params)
> acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - 
> IRQ_TYPE_VUPDATE);
>
> if (acrtc) {
> -   vrr_active = amdgpu_dm_vrr_active_irq(acrtc);
> +   vrr_active = amdgpu_dm_crtc_vrr_active_irq(acrtc);
> drm_dev = acrtc->base.dev;
> vblank = &drm_dev->vblank[acrtc->base.index];
> previous_timestamp = atomic64_read(&irq_params->previous_timestamp);
> @@ -534,7 +534,7 @@ static void dm_vupdate_high_irq(void *interrupt_params)
>  * if a pageflip happened inside front-porch.
>  */
> if (vrr_active) {
> -   dm_crtc_handle_vblank(acrtc);
> +   amdgpu_dm_crtc_handle_vblank(acrtc);
>
> /* BTR processing for pre-DCE12 ASICs */
> if (acrtc->dm_irq_params.stream &&
> @@ -574,7 +574,7 @@ static void dm_crtc_high_irq(void *interrupt_params)
> if (!acrtc)
> return;
>
> -   vrr_active = amdgpu_dm_vrr_active_irq(acrtc);
> +   vrr_active = amdgpu_dm_crtc_vrr_active_irq(acrtc);
>
> DC_LOG_VBLANK("crtc:%d, vupdate-vrr:%d, planes:%d\n", acrtc->crtc_id,
>   vrr_active, acrtc->dm_irq_params.active_planes);
> @@ -586,7 +586,7 @@ static void dm_crtc_high_irq(void *interrupt_params)
>  * to dm_vupdate_high_irq after end of front-porch.
>  */
> if (!vrr_active)
> -   dm_crtc_handle_vblank(acrtc);
> +   amdgpu_dm_crtc_handle_vblank(acrtc);
>
> /**
>  * Following stuff must happen at start of vblank, for crc
> @@ -2483,11 +2483,11 @@ static void dm_gpureset_toggle_interrupts(struct 
> amdgpu_device *adev,
>  enable ? "enable" : "disable");
>
> if (enable) {
> -   rc = dm_enable_vblank(&acrtc->base);
> +   rc = amdgpu_dm_crtc_enable_vblank(&acrtc->base);
> if (rc)
> DRM_WARN("Failed to enable vblank interrupts\n");
> } else {
> -   dm_disable_vblank(&acrtc->base);
> +   amdgpu_dm_crtc_disable_vblank(&acrtc->base);
> }
>
> }
> @@ -7746,7 +7746,7 @@ static void update_freesync_state_on_stream(
> &vrr_params);
>
> if (adev->family < AMDGPU_FAMILY_AI &&
> -   amdgpu_dm_vrr_active(new_crtc_state)) {
> +   amdgpu_dm_crtc_vrr_active(new_crtc_state)) {
> mod_freesync_handle_v_update(dm->freesync_module,
>  new_stream, &vrr_params);
>
> @@ -7864,8 +7864,8 @@ static void update_stream_irq_parameters(
>  static void amdgpu_dm_handle_vrr_transition(struct dm_crtc_state *old_state,
> struct dm_crtc_state *new_state)
>  {
> -   bool old_vrr_active = amdgpu_dm_vrr_active(old_state);
> -   bool new_vrr_active = amdgpu_dm_vrr_active(new_state);
> +   bool old_vrr_active = 

Re: [6.3][regression] commit a4e771729a51168bc36317effaa9962e336d4f5e lead to flood kernel logs with warning messages "at kernel/workqueue.c:3167 __flush_work+0x472/0x500"

2023-03-08 Thread Alex Deucher
On Wed, Mar 8, 2023 at 7:02 AM Mikhail Gavrilov
 wrote:
>
> Hi,
> I didn't hit the drm_bridge_hpd_enable+0x94/0x9c [drm] issue myself, but
> the fix for it leads to warning messages on my laptop, an ASUS ROG
> Strix G15 Advantage Edition G513QY-HQ007, which has two AMD GPUs:
> a discrete Radeon 6800M and a Vega 8 integrated in the Cezanne CPU.
>
> I found the bad commit by bisecting:
> ❯ git bisect bad
> a4e771729a51168bc36317effaa9962e336d4f5e is the first bad commit
> commit a4e771729a51168bc36317effaa9962e336d4f5e
> Author: Dmitry Baryshkov 
> Date:   Tue Jan 24 12:45:48 2023 +0200
>
> drm/probe_helper: sort out poll_running vs poll_enabled
>
> There are two flags attemting to guard connector polling:
> poll_enabled and poll_running. While poll_enabled semantics is clearly
> defined and fully adhered (mark that drm_kms_helper_poll_init() was
> called and not finalized by the _fini() call), the poll_running flag
> doesn't have such clearliness.
>
> This flag is used only in drm_helper_probe_single_connector_modes() to
> guard calling of drm_kms_helper_poll_enable, it doesn't guard the
> drm_kms_helper_poll_fini(), etc. Change it to only be set if the polling
> is actually running. Tie HPD enablement to this flag.
>
> This fixes the following warning reported after merging the HPD series:
>
> Hot plug detection already enabled
> WARNING: CPU: 2 PID: 9 at drivers/gpu/drm/drm_bridge.c:1257
> drm_bridge_hpd_enable+0x94/0x9c [drm]
> Modules linked in: videobuf2_memops snd_soc_simple_card
> snd_soc_simple_card_utils fsl_imx8_ddr_perf videobuf2_common
> snd_soc_imx_spdif adv7511 etnaviv imx8m_ddrc imx_dcss mc cec nwl_dsi
> gov
> CPU: 2 PID: 9 Comm: kworker/u8:0 Not tainted
> 6.2.0-rc2-15208-g25b283acd578 #6
> Hardware name: NXP i.MX8MQ EVK (DT)
> Workqueue: events_unbound deferred_probe_work_func
> pstate: 6005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : drm_bridge_hpd_enable+0x94/0x9c [drm]
> lr : drm_bridge_hpd_enable+0x94/0x9c [drm]
> sp : 89ef3740
> x29: 89ef3740 x28: 09331f00 x27: 1000
> x26: 0020 x25: 81148ed8 x24: 0a8fe000
> x23: fffd x22: 05086348 x21: 81133ee0
> x20: 0550d800 x19: 05086288 x18: 0006
> x17:  x16: 896ef008 x15: 972891004260
> x14: 2a1403e19400 x13: 972891004260 x12: 2a1403e19400
> x11: 7100385f29400801 x10: 0aa0 x9 : 88112744
> x8 : 00250b00 x7 : 0003 x6 : 0011
> x5 :  x4 : bd986a48 x3 : 0001
> x2 :  x1 :  x0 : 0025
> Call trace:
>  drm_bridge_hpd_enable+0x94/0x9c [drm]
>  drm_bridge_connector_enable_hpd+0x2c/0x3c [drm_kms_helper]
>  drm_kms_helper_poll_enable+0x94/0x10c [drm_kms_helper]
>  drm_helper_probe_single_connector_modes+0x1a8/0x510 [drm_kms_helper]
>  drm_client_modeset_probe+0x204/0x1190 [drm]
>  __drm_fb_helper_initial_config_and_unlock+0x5c/0x4a4 [drm_kms_helper]
>  drm_fb_helper_initial_config+0x54/0x6c [drm_kms_helper]
>  drm_fbdev_client_hotplug+0xd0/0x140 [drm_kms_helper]
>  drm_fbdev_generic_setup+0x90/0x154 [drm_kms_helper]
>  dcss_kms_attach+0x1c8/0x254 [imx_dcss]
>  dcss_drv_platform_probe+0x90/0xfc [imx_dcss]
>  platform_probe+0x70/0xcc
>  really_probe+0xc4/0x2e0
>  __driver_probe_device+0x80/0xf0
>  driver_probe_device+0xe0/0x164
>  __device_attach_driver+0xc0/0x13c
>  bus_for_each_drv+0x84/0xe0
>  __device_attach+0xa4/0x1a0
>  device_initial_probe+0x1c/0x30
>  bus_probe_device+0xa4/0xb0
>  deferred_probe_work_func+0x90/0xd0
>  process_one_work+0x200/0x474
>  worker_thread+0x74/0x43c
>  kthread+0xfc/0x110
>  ret_from_fork+0x10/0x20
> ---[ end trace  ]---
>
> Reported-by: Laurentiu Palcu 
> Fixes: c8268795c9a9 ("drm/probe-helper: enable and disable HPD on
> connectors")
> Tested-by: Marek Szyprowski 
> Tested-by: Chen-Yu Tsai 
> Acked-by: Laurentiu Palcu 
> Tested-by: Laurentiu Palcu 
> Tested-by: Laurent Pinchart 
> Signed-off-by: Dmitry Baryshkov 
> Signed-off-by: Neil Armstrong 
> Link: 
> https://patchwork.freedesktop.org/patch/msgid/20230124104548.3234554-2-dmitry.barysh...@linaro.org
> (cherry picked from commit d33a54e3991dfce88b4fc6d9c3360951c2c5660d)
> Signed-off-by: Thomas Zimmermann 
>
>  drivers/gpu/drm/drm_probe_helper.c | 42 
> +++---
>  1 file changed, 21 insertions(+), 21 deletions(-)
>
> Of course, I checked the bisect result by reverting this commit, and I
> can confirm that without commit
> a4e771729a51168bc36317effaa9962e336d4f5e the warning messages did not
> appear for a whole day.
>
> I attached a full kernel log if someone would be 

[PATCH] drm/amd/display: remove unused variable available

2023-03-08 Thread Tom Rix
With gcc and W=1, there is this error
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_dpia_bw.c:297:13:
 error:
  variable ‘available’ set but not used [-Werror=unused-but-set-variable]
  297 | int available = 0;
  | ^

Since available is unused, remove it.

Signed-off-by: Tom Rix 
---
 .../drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c   | 8 
 1 file changed, 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c 
b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c
index f14217cc16fd..2f0311c42f90 100644
--- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c
+++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_dp_dpia_bw.c
@@ -294,7 +294,6 @@ bool link_dp_dpia_set_dptx_usb4_bw_alloc_support(struct 
dc_link *link)
 void dpia_handle_bw_alloc_response(struct dc_link *link, uint8_t bw, uint8_t 
result)
 {
int bw_needed = 0;
-   int available = 0;
int estimated = 0;
int host_router_total_estimated_bw = 0;
 
@@ -373,20 +372,13 @@ void dpia_handle_bw_alloc_response(struct dc_link *link, 
uint8_t bw, uint8_t res
 
// 1. If due to unplug of other sink
if (estimated == host_router_total_estimated_bw) {
-
// First update the estimated & max_bw fields
if (link->dpia_bw_alloc_config.estimated_bw < 
estimated) {
-   available = estimated - 
link->dpia_bw_alloc_config.estimated_bw;
link->dpia_bw_alloc_config.estimated_bw = 
estimated;
}
}
// 2. If due to realloc bw btw 2 dpia due to plug OR realloc 
unused Bw
else {
-
-   // We took from another unplugged/problematic sink to 
give to us
-   if (link->dpia_bw_alloc_config.estimated_bw < estimated)
-   available = estimated - 
link->dpia_bw_alloc_config.estimated_bw;
-
// We lost estimated bw usually due to plug event of 
other dpia
link->dpia_bw_alloc_config.estimated_bw = estimated;
}
-- 
2.27.0



Re: [RFC] drm/amd/display: Pass proper parent for DM backlight device registration

2023-03-08 Thread Hans de Goede
Hi,

On 2/15/23 12:38, Hans de Goede wrote:
> The parent for the backlight device should be the drm-connector object,
> not the PCI device.
> 
> Userspace relies on this to be able to detect which backlight class device
> to use on hybrid gfx devices where there may be multiple native (raw)
> backlight devices registered.
> 
> Specifically gnome-settings-daemon expects the parent device to have
> an "enabled" sysfs attribute (as drm_connector devices do) and tests
> that this returns "enabled" when read.
> 
> This aligns the parent of the backlight device with i915, nouveau, radeon.
> Note that drivers/gpu/drm/amd/amdgpu/atombios_encoders.c also already
> uses the drm_connector as parent, only amdgpu_dm.c used the PCI device
> as parent before this change.
> 
> Note this is marked as a RFC because I don't have hw to test, so this
> has only been compile tested! If someone can test this on actual
> hw which hits the changed code path that would be great.
> 
> Link: https://gitlab.gnome.org/GNOME/gnome-settings-daemon/-/issues/730
> Signed-off-by: Hans de Goede 

Self NACK. This has now been tested by 2 reporters of:

https://gitlab.gnome.org/GNOME/gnome-settings-daemon/-/issues/730

and it does not work. Instead of setting the parent device pointer correctly,
this makes the backlight device not have a parent device at all any more.
I was already afraid this might happen, since the drm_connector object is not
yet registered at the time when the amdgpu code calls backlight_device_register().

Other drivers, e.g. nouveau, register the backlight later from
a drm_connector_funcs.late_register callback. I was hoping that doing it
the simple way, as this patch did, would work, but it looks like some bigger
changes to the amdgpu code (using a drm_connector_funcs.late_register callback)
are necessary.

I'll try to make some time to prepare a new patch.

Regards,

Hans



> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 31bce529f685..33b0e1de2770 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -4065,7 +4065,8 @@ static const struct backlight_ops 
> amdgpu_dm_backlight_ops = {
>  };
>  
>  static void
> -amdgpu_dm_register_backlight_device(struct amdgpu_display_manager *dm)
> +amdgpu_dm_register_backlight_device(struct amdgpu_display_manager *dm,
> + struct amdgpu_dm_connector *aconnector)
>  {
>   char bl_name[16];
>   struct backlight_properties props = { 0 };
> @@ -4088,7 +4089,7 @@ amdgpu_dm_register_backlight_device(struct 
> amdgpu_display_manager *dm)
>adev_to_drm(dm->adev)->primary->index + dm->num_of_edps);
>  
>   dm->backlight_dev[dm->num_of_edps] = backlight_device_register(bl_name,
> - adev_to_drm(dm->adev)->dev,
> + aconnector->base.kdev,
>  dm,
>  &amdgpu_dm_backlight_ops,
>  &props);
> @@ -4141,6 +4142,7 @@ static int initialize_plane(struct 
> amdgpu_display_manager *dm,
>  
>  
>  static void register_backlight_device(struct amdgpu_display_manager *dm,
> +   struct amdgpu_dm_connector *aconnector,
> struct dc_link *link)
>  {
>   if ((link->connector_signal & (SIGNAL_TYPE_EDP | SIGNAL_TYPE_LVDS)) &&
> @@ -4151,7 +4153,7 @@ static void register_backlight_device(struct 
> amdgpu_display_manager *dm,
>* is better then a black screen.
>*/
>   if (!dm->backlight_dev[dm->num_of_edps])
> - amdgpu_dm_register_backlight_device(dm);
> + amdgpu_dm_register_backlight_device(dm, aconnector);
>  
>   if (dm->backlight_dev[dm->num_of_edps]) {
>   dm->backlight_link[dm->num_of_edps] = link;
> @@ -4337,7 +4339,7 @@ static int amdgpu_dm_initialize_drm_device(struct 
> amdgpu_device *adev)
>  
>   if (ret) {
>   
> amdgpu_dm_update_connector_after_detect(aconnector);
> - register_backlight_device(dm, link);
> + register_backlight_device(dm, aconnector, link);
>  
>   if (dm->num_of_edps)
>   update_connector_ext_caps(aconnector);



[PATCH] drm/amd/display: remove unused variable res_pool

2023-03-08 Thread Tom Rix
With gcc and W=1, there is this error
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_mst_types.c:1214:31:
  error: variable ‘res_pool’ set but not used [-Werror=unused-but-set-variable]
 1214 | struct resource_pool *res_pool;
  |   ^~~~

Since res_pool is unused, remove it.

Signed-off-by: Tom Rix 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index 2739bef9b90c..4b9b5e4050fc 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -1211,7 +1211,6 @@ static int pre_compute_mst_dsc_configs_for_state(struct 
drm_atomic_state *state,
bool computed_streams[MAX_PIPES];
struct amdgpu_dm_connector *aconnector;
struct drm_dp_mst_topology_mgr *mst_mgr;
-   struct resource_pool *res_pool;
int link_vars_start_index = 0;
int ret = 0;
 
@@ -1220,7 +1219,6 @@ static int pre_compute_mst_dsc_configs_for_state(struct 
drm_atomic_state *state,
 
for (i = 0; i < dc_state->stream_count; i++) {
stream = dc_state->streams[i];
-   res_pool = stream->ctx->dc->res_pool;
 
if (stream->signal != SIGNAL_TYPE_DISPLAY_PORT_MST)
continue;
-- 
2.27.0



Re: [PATCH 2/2] drm/probe_helper: warning on poll_enabled for issue catching

2023-03-08 Thread Jani Nikula
On Wed, 08 Mar 2023, Guchun Chen  wrote:
> Add a warning to catch issues in other drivers and to ensure the proper
> call sequence of the polling functions.
>
> Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2411
> Fixes: a4e771729a51("drm/probe_helper: sort out poll_running vs poll_enabled")

How does an additional warning "fix" anything?

> Reported-by: Bert Karwatzki 
> Suggested-by: Dmitry Baryshkov 
> Signed-off-by: Guchun Chen 
> ---
>  drivers/gpu/drm/drm_probe_helper.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_probe_helper.c 
> b/drivers/gpu/drm/drm_probe_helper.c
> index 8127be134c39..85e0e80d4a52 100644
> --- a/drivers/gpu/drm/drm_probe_helper.c
> +++ b/drivers/gpu/drm/drm_probe_helper.c
> @@ -852,6 +852,8 @@ EXPORT_SYMBOL(drm_kms_helper_is_poll_worker);
>   */
>  void drm_kms_helper_poll_disable(struct drm_device *dev)
>  {
> + WARN_ON(!dev->mode_config.poll_enabled);

drm_WARN_ON()  please.

> +
>   if (dev->mode_config.poll_running)
>   drm_kms_helper_disable_hpd(dev);

-- 
Jani Nikula, Intel Open Source Graphics Center


[PATCH 2/2] drm/probe_helper: warning on poll_enabled for issue catching

2023-03-08 Thread Guchun Chen
Add a warning to catch issues in other drivers and to ensure the proper
call sequence of the polling functions.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2411
Fixes: a4e771729a51("drm/probe_helper: sort out poll_running vs poll_enabled")
Reported-by: Bert Karwatzki 
Suggested-by: Dmitry Baryshkov 
Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/drm_probe_helper.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/drm_probe_helper.c 
b/drivers/gpu/drm/drm_probe_helper.c
index 8127be134c39..85e0e80d4a52 100644
--- a/drivers/gpu/drm/drm_probe_helper.c
+++ b/drivers/gpu/drm/drm_probe_helper.c
@@ -852,6 +852,8 @@ EXPORT_SYMBOL(drm_kms_helper_is_poll_worker);
  */
 void drm_kms_helper_poll_disable(struct drm_device *dev)
 {
+   WARN_ON(!dev->mode_config.poll_enabled);
+
if (dev->mode_config.poll_running)
drm_kms_helper_disable_hpd(dev);
 
-- 
2.25.1



[PATCH 1/2] drm/amdgpu: add flag to enable/disable poll in suspend/resume path

2023-03-08 Thread Guchun Chen
Some AMD ASICs with reliable hotplug support don't call
drm_kms_helper_poll_init() in the driver init sequence. However,
the suspend/resume path is shared by all ASICs, and because
output_poll_work->func is not set for these ASICs, a warning
is triggered when suspending.

[   90.656049]  
[   90.656050]  ? console_unlock+0x4d/0x100
[   90.656053]  ? __irq_work_queue_local+0x27/0x60
[   90.656056]  ? irq_work_queue+0x2b/0x50
[   90.656057]  ? __wake_up_klogd+0x40/0x60
[   90.656059]  __cancel_work_timer+0xed/0x180
[   90.656061]  drm_kms_helper_poll_disable.cold+0x1f/0x2c [drm_kms_helper]
[   90.656072]  amdgpu_device_suspend+0x81/0x170 [amdgpu]
[   90.656180]  amdgpu_pmops_runtime_suspend+0xb5/0x1b0 [amdgpu]
[   90.656269]  pci_pm_runtime_suspend+0x61/0x1b0

So add a use_kms_poll flag that records whether polling was initialized, and
check it in amdgpu code before calling
drm_kms_helper_poll_disable()/drm_kms_helper_poll_enable() in the
suspend/resume path.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2411
Fixes: a4e771729a51("drm/probe_helper: sort out poll_running vs poll_enabled")
Reported-by: Bert Karwatzki 
Suggested-by: Dmitry Baryshkov 
Signed-off-by: Guchun Chen 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 --
 drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h   | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c   | 1 +
 drivers/gpu/drm/amd/amdgpu/dce_v10_0.c | 1 +
 drivers/gpu/drm/amd/amdgpu/dce_v11_0.c | 1 +
 drivers/gpu/drm/amd/amdgpu/dce_v6_0.c  | 1 +
 drivers/gpu/drm/amd/amdgpu/dce_v8_0.c  | 1 +
 7 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index c4a4e2fe6681..74af0b8c0d08 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4145,7 +4145,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool 
fbcon)
if (amdgpu_acpi_smart_shift_update(dev, AMDGPU_SS_DEV_D3))
DRM_WARN("smart shift update failed\n");
 
-   drm_kms_helper_poll_disable(dev);
+   if (adev->mode_info.use_kms_poll)
+   drm_kms_helper_poll_disable(dev);
 
if (fbcon)

drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, true);
@@ -4243,7 +4244,8 @@ int amdgpu_device_resume(struct drm_device *dev, bool 
fbcon)
if (fbcon)

drm_fb_helper_set_suspend_unlocked(adev_to_drm(adev)->fb_helper, false);
 
-   drm_kms_helper_poll_enable(dev);
+   if (adev->mode_info.use_kms_poll)
+   drm_kms_helper_poll_enable(dev);
 
amdgpu_ras_resume(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index 32fe05c810c6..d383ea3e8e94 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -343,6 +343,7 @@ struct amdgpu_mode_info {
int disp_priority;
const struct amdgpu_display_funcs *funcs;
const enum drm_plane_type *plane_type;
+   bool use_kms_poll;
 };
 
 #define AMDGPU_MAX_BL_LEVEL 0xFF
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
index 53ff91fc6cf6..3277799a80bb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c
@@ -518,6 +518,7 @@ static int amdgpu_vkms_sw_init(void *handle)
return r;
 
drm_kms_helper_poll_init(adev_to_drm(adev));
+   adev->mode_info.use_kms_poll = true;
 
adev->mode_info.mode_config_initialized = true;
return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c 
b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
index 9a24ed463abd..f4d0a7cf588b 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
@@ -2842,6 +2842,7 @@ static int dce_v10_0_sw_init(void *handle)
  amdgpu_display_hotplug_work_func);
 
drm_kms_helper_poll_init(adev_to_drm(adev));
+   adev->mode_info.use_kms_poll = true;
 
adev->mode_info.mode_config_initialized = true;
return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c 
b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
index c14b70350a51..25d0a866ca28 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
@@ -2961,6 +2961,7 @@ static int dce_v11_0_sw_init(void *handle)
  amdgpu_display_hotplug_work_func);
 
drm_kms_helper_poll_init(adev_to_drm(adev));
+   adev->mode_info.use_kms_poll = true;
 
adev->mode_info.mode_config_initialized = true;
return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c 
b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
index 7f85ba5b726f..3936c6bfe2e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
@@ -2720,6 +2720,7 @@ static int dce_v6_0_sw_init(void *handle)
  amdgpu_display_hotplug_work_func);
 

[6.3][regression] commit a4e771729a51168bc36317effaa9962e336d4f5e lead to flood kernel logs with warning messages "at kernel/workqueue.c:3167 __flush_work+0x472/0x500"

2023-03-08 Thread Mikhail Gavrilov
Hi,
I didn't hit the drm_bridge_hpd_enable+0x94/0x9c [drm] issue myself, but
the fix for it leads to warning messages on my laptop, an ASUS ROG
Strix G15 Advantage Edition G513QY-HQ007, which has two AMD GPUs:
a discrete Radeon 6800M and a Vega 8 integrated in the Cezanne CPU.

I found the bad commit by bisecting:
❯ git bisect bad
a4e771729a51168bc36317effaa9962e336d4f5e is the first bad commit
commit a4e771729a51168bc36317effaa9962e336d4f5e
Author: Dmitry Baryshkov 
Date:   Tue Jan 24 12:45:48 2023 +0200

drm/probe_helper: sort out poll_running vs poll_enabled

There are two flags attemting to guard connector polling:
poll_enabled and poll_running. While poll_enabled semantics is clearly
defined and fully adhered (mark that drm_kms_helper_poll_init() was
called and not finalized by the _fini() call), the poll_running flag
doesn't have such clearliness.

This flag is used only in drm_helper_probe_single_connector_modes() to
guard calling of drm_kms_helper_poll_enable, it doesn't guard the
drm_kms_helper_poll_fini(), etc. Change it to only be set if the polling
is actually running. Tie HPD enablement to this flag.

This fixes the following warning reported after merging the HPD series:

Hot plug detection already enabled
WARNING: CPU: 2 PID: 9 at drivers/gpu/drm/drm_bridge.c:1257
drm_bridge_hpd_enable+0x94/0x9c [drm]
Modules linked in: videobuf2_memops snd_soc_simple_card
snd_soc_simple_card_utils fsl_imx8_ddr_perf videobuf2_common
snd_soc_imx_spdif adv7511 etnaviv imx8m_ddrc imx_dcss mc cec nwl_dsi
gov
CPU: 2 PID: 9 Comm: kworker/u8:0 Not tainted
6.2.0-rc2-15208-g25b283acd578 #6
Hardware name: NXP i.MX8MQ EVK (DT)
Workqueue: events_unbound deferred_probe_work_func
pstate: 6005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : drm_bridge_hpd_enable+0x94/0x9c [drm]
lr : drm_bridge_hpd_enable+0x94/0x9c [drm]
sp : 89ef3740
x29: 89ef3740 x28: 09331f00 x27: 1000
x26: 0020 x25: 81148ed8 x24: 0a8fe000
x23: fffd x22: 05086348 x21: 81133ee0
x20: 0550d800 x19: 05086288 x18: 0006
x17:  x16: 896ef008 x15: 972891004260
x14: 2a1403e19400 x13: 972891004260 x12: 2a1403e19400
x11: 7100385f29400801 x10: 0aa0 x9 : 88112744
x8 : 00250b00 x7 : 0003 x6 : 0011
x5 :  x4 : bd986a48 x3 : 0001
x2 :  x1 :  x0 : 0025
Call trace:
 drm_bridge_hpd_enable+0x94/0x9c [drm]
 drm_bridge_connector_enable_hpd+0x2c/0x3c [drm_kms_helper]
 drm_kms_helper_poll_enable+0x94/0x10c [drm_kms_helper]
 drm_helper_probe_single_connector_modes+0x1a8/0x510 [drm_kms_helper]
 drm_client_modeset_probe+0x204/0x1190 [drm]
 __drm_fb_helper_initial_config_and_unlock+0x5c/0x4a4 [drm_kms_helper]
 drm_fb_helper_initial_config+0x54/0x6c [drm_kms_helper]
 drm_fbdev_client_hotplug+0xd0/0x140 [drm_kms_helper]
 drm_fbdev_generic_setup+0x90/0x154 [drm_kms_helper]
 dcss_kms_attach+0x1c8/0x254 [imx_dcss]
 dcss_drv_platform_probe+0x90/0xfc [imx_dcss]
 platform_probe+0x70/0xcc
 really_probe+0xc4/0x2e0
 __driver_probe_device+0x80/0xf0
 driver_probe_device+0xe0/0x164
 __device_attach_driver+0xc0/0x13c
 bus_for_each_drv+0x84/0xe0
 __device_attach+0xa4/0x1a0
 device_initial_probe+0x1c/0x30
 bus_probe_device+0xa4/0xb0
 deferred_probe_work_func+0x90/0xd0
 process_one_work+0x200/0x474
 worker_thread+0x74/0x43c
 kthread+0xfc/0x110
 ret_from_fork+0x10/0x20
---[ end trace  ]---

Reported-by: Laurentiu Palcu 
Fixes: c8268795c9a9 ("drm/probe-helper: enable and disable HPD on
connectors")
Tested-by: Marek Szyprowski 
Tested-by: Chen-Yu Tsai 
Acked-by: Laurentiu Palcu 
Tested-by: Laurentiu Palcu 
Tested-by: Laurent Pinchart 
Signed-off-by: Dmitry Baryshkov 
Signed-off-by: Neil Armstrong 
Link: https://patchwork.freedesktop.org/patch/msgid/20230124104548.3234554-2-dmitry.barysh...@linaro.org
(cherry picked from commit d33a54e3991dfce88b4fc6d9c3360951c2c5660d)
Signed-off-by: Thomas Zimmermann 

 drivers/gpu/drm/drm_probe_helper.c | 42 +++---
 1 file changed, 21 insertions(+), 21 deletions(-)

Of course I tried to check the bisect assumption by reverting this
commit. And I can confirm without commit
a4e771729a51168bc36317effaa9962e336d4f5e the warning messages do not
appear within a day.

I attached a full kernel log if someone would be interested to see it.

-- 
Best Regards,
Mike Gavrilov.
git bisect start
# status: waiting for both good and bad commits
# good: [5b7c4cabbb65f5c469464da6c5f614cbd7f730f2] Merge tag 'net-next-6.3' of 
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
git 

Re: [PATCH v3 00/17] Enable Colorspace connector property in amdgpu

2023-03-08 Thread Pekka Paalanen
On Tue, 7 Mar 2023 10:10:50 -0500
Harry Wentland  wrote:

> This patchset is based on Joshua's previous patchset [1], as well
> as my previous patchset [2].
> 
> It is
> - enabling support for the colorspace property in amdgpu, as well as
> - allowing drivers to specify the supported set of colorspaces, and
> - deprecating the BT2020_YCC and BT2020_RGB properties in favor of
>   a common BT2020 property. We leave the BT2020_CYCC property untouched
>   for now, same as the other _YCC properties. If they'll see use later
>   we might need to do something similar there, or allow userspace to
>   decide on the output encoding (RGB vs YUV).
> 
> Colorspace, Infoframes, and YCbCr matrix
> ---
> 
> Even though the initial intent of the colorspace property was to set the
> colorspace field in the respective HDMI AVI and DP SDP infoframes that
> is not sufficient in all scenarios. For DP the colorspace information
> also affects the MSA (main stream attribute) packet. For YUV output the
> colorspace affects the RGB-to-YCbCr conversion matrix. The colorspace
> field of the infopackets also depends on the encoding used, which is
> something that is decided by the driver and not known to userspace.
> 
> For these reasons a driver will need to be able to select the supported
> colorspaces at property creation.
> 
> Note: There seems to be an understanding that the colorspace property
> should ONLY modify the infoframe. While this is current behavior and
> sufficient in some cases it is nowhere specified that this should be the
> only use of this property. As outlined above this limitation is not
> going to work in all cases.
> 
> This patchset does not affect current behavior for the drivers that
> implement this property: i915 and vc4.
> 
> In the future we might want to give userspace control over the encoding
> format on the wire, in particular to avoid use of YUV420 when image
> fidelity is important. This work would likely go hand in hand with a
> min_bpc property and wouldn't conflict with the work done in this
> patchset.
> 
> Colorspace on crtc or connector?
> 
> 
> There have been suggestions of programming 'colorspace' on the drm_crtc
> but I don't think the crtc is the right place for this property. The
> drm_plane and drm_crtc will be used to offload color processing that
> would normally be done via the GFX or other pipelines. The drm_connector
> controls the signalling with the display and ensures the wire format is
> appropriate for the encoding by programming the RGB-to-YCbCr matrix.
> 
> [1] https://patchwork.freedesktop.org/series/113632/
> [2] https://patchwork.freedesktop.org/series/111865/

Hi Harry,

this is a really good cover letter.

I've given all the comments I have on this iteration.


Thanks,
pq




Re: [PATCH v3 15/17] drm/amd/display: Add default case for output_color_space switch

2023-03-08 Thread Pekka Paalanen
On Tue, 7 Mar 2023 10:11:05 -0500
Harry Wentland  wrote:

> Signed-off-by: Harry Wentland 
> Cc: Pekka Paalanen 
> Cc: Sebastian Wick 
> Cc: vitaly.pros...@amd.com
> Cc: Joshua Ashton 
> Cc: dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> Reviewed-By: Joshua Ashton 

Hi,

why?

Isn't the bitmask of supported values supposed to stop arbitrary values
from coming through?

Why handle unsupported values like DEFAULT instead of as a kernel bug?

If this is only to stop compiler warnings of not handling all enum
values in a switch, is the commit ordering in this series even
bisectable?


Thanks,
pq

> ---
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 43 ++-
>  1 file changed, 22 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index 7f77e226f1eb..a15b26962496 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -5308,7 +5308,29 @@ get_output_color_space(const struct dc_crtc_timing 
> *dc_crtc_timing,
>   enum dc_color_space color_space = COLOR_SPACE_SRGB;
>  
>   switch (connector_state->colorspace) {
> + case DRM_MODE_COLORIMETRY_BT601_YCC:
> + if (dc_crtc_timing->flags.Y_ONLY)
> + color_space = COLOR_SPACE_YCBCR601_LIMITED;
> + else
> + color_space = COLOR_SPACE_YCBCR601;
> + break;
> + case DRM_MODE_COLORIMETRY_BT709_YCC:
> + if (dc_crtc_timing->flags.Y_ONLY)
> + color_space = COLOR_SPACE_YCBCR709_LIMITED;
> + else
> + color_space = COLOR_SPACE_YCBCR709;
> + break;
> + case DRM_MODE_COLORIMETRY_OPRGB:
> + color_space = COLOR_SPACE_ADOBERGB;
> + break;
> + case DRM_MODE_COLORIMETRY_BT2020:
> + color_space = COLOR_SPACE_2020_RGB_FULLRANGE;
> + break;
> + case DRM_MODE_COLORIMETRY_BT2020_DEPRECATED:
> + color_space = COLOR_SPACE_2020_YCBCR;
> + break;
>   case DRM_MODE_COLORIMETRY_DEFAULT: // ITU601
> + default:
>   if (dc_crtc_timing->pixel_encoding == PIXEL_ENCODING_RGB) {
>   color_space = COLOR_SPACE_SRGB;
>   /*
> @@ -5330,27 +5352,6 @@ get_output_color_space(const struct dc_crtc_timing 
> *dc_crtc_timing,
>   color_space = COLOR_SPACE_YCBCR601;
>   }
>   break;
> - case DRM_MODE_COLORIMETRY_BT601_YCC:
> - if (dc_crtc_timing->flags.Y_ONLY)
> - color_space = COLOR_SPACE_YCBCR601_LIMITED;
> - else
> - color_space = COLOR_SPACE_YCBCR601;
> - break;
> - case DRM_MODE_COLORIMETRY_BT709_YCC:
> - if (dc_crtc_timing->flags.Y_ONLY)
> - color_space = COLOR_SPACE_YCBCR709_LIMITED;
> - else
> - color_space = COLOR_SPACE_YCBCR709;
> - break;
> - case DRM_MODE_COLORIMETRY_OPRGB:
> - color_space = COLOR_SPACE_ADOBERGB;
> - break;
> - case DRM_MODE_COLORIMETRY_BT2020:
> - color_space = COLOR_SPACE_2020_RGB_FULLRANGE;
> - break;
> - case DRM_MODE_COLORIMETRY_BT2020_DEPRECATED:
> - color_space = COLOR_SPACE_2020_YCBCR;
> - break;
>   }
>  
>   return color_space;





[PATCH 2/2] drm/amd/amdkfd: Fix build error with unmatched argument type

2023-03-08 Thread Qingqing Zhuo
[Why]
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_migrate.c: In function ‘svm_migrate_copy_to_vram’:
./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu.h:35:21:
error: format ‘%lx’ expects argument of type ‘long unsigned int’,
but argument 6 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Werror=format=]
   35 | #define pr_fmt(fmt) "amdgpu: " fmt
  | ^~

[How]
use %llx instead of %lx for ttm_res_offset.

Fixes: d5db9d377c021 ("drm/amdkfd: Fix BO offset for multi-VMA page migration")
Signed-off-by: Qingqing Zhuo 

Cc: Xiaogang Chen 
Cc: Felix Kuehling 

---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index 373cd7b0e1ca..fd54a00e7229 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -304,7 +304,7 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, struct 
svm_range *prange,
uint64_t i, j;
int r;
 
-   pr_debug("svms 0x%p [0x%lx 0x%lx 0x%lx]\n", prange->svms, prange->start,
+   pr_debug("svms 0x%p [0x%lx 0x%lx 0x%llx]\n", prange->svms, 
prange->start,
 prange->last, ttm_res_offset);
 
src = scratch;
-- 
2.34.1
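
A note on the %lx to %llx change above, as a minimal userspace sketch and not the kernel code itself. In the kernel a 64-bit unsigned value is, as far as I know, always unsigned long long, which is why %llx matches ttm_res_offset. In portable userspace C the same mismatch is usually avoided with a cast or with PRIx64 from <inttypes.h>. The helper names format_offset() and format_offset_prix64() are made up for this illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: print a 64-bit offset in hex.
 * The cast guarantees the argument type matches %llx, whatever
 * underlying type uint64_t has on the build host.
 */
static int format_offset(char *buf, size_t len, uint64_t offset)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "0x%llx", (unsigned long long)offset);
}

/* Same output, using the C99 format macro instead of a cast. */
static int format_offset_prix64(char *buf, size_t len, uint64_t offset)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "0x%" PRIx64, offset);
}
```

Both variants compile without a -Wformat warning on hosts where uint64_t is unsigned long as well as on hosts where it is unsigned long long.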



[PATCH 1/2] drm/vc4: Fix build error with undefined label

2023-03-08 Thread Qingqing Zhuo
[Why]
drivers/gpu/drm/vc4/vc4_hdmi.c: In function ‘vc4_hdmi_bind’:
drivers/gpu/drm/vc4/vc4_hdmi.c:3448:17: error: label ‘err_disable_runtime_pm’ used but not defined

[How]
update err_disable_runtime_pm to err_put_runtime_pm.

Signed-off-by: Qingqing Zhuo 
---
 drivers/gpu/drm/vc4/vc4_hdmi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index 9e145690c480..edf882360d24 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -3445,7 +3445,7 @@ static int vc4_hdmi_bind(struct device *dev, struct 
device *master, void *data)
 */
ret = pm_runtime_resume_and_get(dev);
if (ret)
-   goto err_disable_runtime_pm;
+   goto err_put_runtime_pm;
 
if ((of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi0") ||
 of_device_is_compatible(dev->of_node, "brcm,bcm2711-hdmi1")) &&
-- 
2.34.1



Re: [PATCH v3 14/17] drm/amd/display: Add debugfs for testing output colorspace

2023-03-08 Thread Pekka Paalanen
On Tue, 7 Mar 2023 10:11:04 -0500
Harry Wentland  wrote:

> In order to IGT test colorspace we'll want to print
> the currently enabled colorspace on a stream. We add
> a new debugfs to do so, using the same scheme as
> current bpc reporting.
> 
> This might also come in handy when debugging display
> issues.
> 
> Signed-off-by: Harry Wentland 
> Cc: Pekka Paalanen 
> Cc: Sebastian Wick 
> Cc: vitaly.pros...@amd.com
> Cc: Joshua Ashton 
> Cc: dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> Reviewed-By: Joshua Ashton 
> ---
>  .../amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 57 +++
>  1 file changed, 57 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> index 4a5dae578d97..f0022c16b708 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c
> @@ -906,6 +906,61 @@ static int amdgpu_current_bpc_show(struct seq_file *m, 
> void *data)
>  }
>  DEFINE_SHOW_ATTRIBUTE(amdgpu_current_bpc);
>  
> +/*
> + * Returns the current bpc for the crtc.

Hi,

bpc?

> + * Example usage: cat 
> /sys/kernel/debug/dri/0/crtc-0/amdgpu_current_colorspace
> + */
> +static int amdgpu_current_colorspace_show(struct seq_file *m, void *data)


Thanks,
pq

> +{
> + struct drm_crtc *crtc = m->private;
> + struct drm_device *dev = crtc->dev;
> + struct dm_crtc_state *dm_crtc_state = NULL;
> + int res = -ENODEV;
> +
> + mutex_lock(&dev->mode_config.mutex);
> + drm_modeset_lock(&crtc->mutex, NULL);
> + if (crtc->state == NULL)
> + goto unlock;
> +
> + dm_crtc_state = to_dm_crtc_state(crtc->state);
> + if (dm_crtc_state->stream == NULL)
> + goto unlock;
> +
> + switch (dm_crtc_state->stream->output_color_space) {
> + case COLOR_SPACE_SRGB:
> + seq_printf(m, "RGB");
> + break;
> + case COLOR_SPACE_YCBCR601:
> + case COLOR_SPACE_YCBCR601_LIMITED:
> + seq_printf(m, "BT601_YCC");
> + break;
> + case COLOR_SPACE_YCBCR709:
> + case COLOR_SPACE_YCBCR709_LIMITED:
> + seq_printf(m, "BT709_YCC");
> + break;
> + case COLOR_SPACE_ADOBERGB:
> + seq_printf(m, "opRGB");
> + break;
> + case COLOR_SPACE_2020_RGB_FULLRANGE:
> + seq_printf(m, "BT2020_RGB");
> + break;
> + case COLOR_SPACE_2020_YCBCR:
> + seq_printf(m, "BT2020_YCC");
> + break;
> + default:
> + goto unlock;
> + }
> + res = 0;
> +
> +unlock:
> + drm_modeset_unlock(&crtc->mutex);
> + mutex_unlock(&dev->mode_config.mutex);
> +
> + return res;
> +}
> +DEFINE_SHOW_ATTRIBUTE(amdgpu_current_colorspace);
> +
> +
>  /*
>   * Example usage:
>   * Disable dsc passthrough, i.e.,: have dsc decoding at converver, not 
> external RX
> @@ -3235,6 +3290,8 @@ void crtc_debugfs_init(struct drm_crtc *crtc)
>  #endif
>   debugfs_create_file("amdgpu_current_bpc", 0644, crtc->debugfs_entry,
>   crtc, &amdgpu_current_bpc_fops);
> + debugfs_create_file("amdgpu_current_colorspace", 0644, crtc->debugfs_entry,
> + crtc, &amdgpu_current_colorspace_fops);
>  }
>  
>  /*





Re: [PATCH v3 09/17] drm/amd/display: Register Colorspace property for DP and HDMI

2023-03-08 Thread Pekka Paalanen
On Tue, 7 Mar 2023 10:10:59 -0500
Harry Wentland  wrote:

> We want compositors to be able to set the output
> colorspace on DP and HDMI outputs, based on the
> caps reported from the receiver via EDID.
> 
> Signed-off-by: Harry Wentland 
> Cc: Pekka Paalanen 
> Cc: Sebastian Wick 
> Cc: vitaly.pros...@amd.com
> Cc: Joshua Ashton 
> Cc: dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> Reviewed-By: Joshua Ashton 
> ---
>  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> index f91b2ea13d96..2d883c6dae90 100644
> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
> @@ -7184,6 +7184,12 @@ static int amdgpu_dm_connector_get_modes(struct 
> drm_connector *connector)
>   return amdgpu_dm_connector->num_modes;
>  }
>  
> +static const u32 supported_colorspaces =
> + BIT(DRM_MODE_COLORIMETRY_BT709_YCC) |
> + BIT(DRM_MODE_COLORIMETRY_OPRGB) |
> + BIT(DRM_MODE_COLORIMETRY_BT2020) |
> + BIT(DRM_MODE_COLORIMETRY_BT2020_DEPRECATED);

No DEFAULT?
No BT.709 RGB, i.e. sRGB?

Doesn't DRM core reject enum uint values that are not listed in the enum
property?


Thanks,
pq

> +
>  void amdgpu_dm_connector_init_helper(struct amdgpu_display_manager *dm,
>struct amdgpu_dm_connector *aconnector,
>int connector_type,
> @@ -7264,6 +7270,15 @@ void amdgpu_dm_connector_init_helper(struct 
> amdgpu_display_manager *dm,
>   adev->mode_info.abm_level_property, 0);
>   }
>  
> + if (connector_type == DRM_MODE_CONNECTOR_HDMIA) {
> + if (!drm_mode_create_hdmi_colorspace_property(&aconnector->base, supported_colorspaces))
> + drm_connector_attach_colorspace_property(&aconnector->base);
> + } else if (connector_type == DRM_MODE_CONNECTOR_DisplayPort ||
> +connector_type == DRM_MODE_CONNECTOR_eDP) {
> + if (!drm_mode_create_dp_colorspace_property(&aconnector->base, supported_colorspaces))
> + drm_connector_attach_colorspace_property(&aconnector->base);
> + }
> +
>   if (connector_type == DRM_MODE_CONNECTOR_HDMIA ||
>   connector_type == DRM_MODE_CONNECTOR_DisplayPort ||
>   connector_type == DRM_MODE_CONNECTOR_eDP) {





Re: [PATCH v3 06/17] drm/connector: Print connector colorspace in state debugfs

2023-03-08 Thread Pekka Paalanen
On Tue, 7 Mar 2023 10:10:56 -0500
Harry Wentland  wrote:

> v3: Fix kerneldocs (kernel test robot)
> 
> Signed-off-by: Harry Wentland 
> Cc: Pekka Paalanen 
> Cc: Sebastian Wick 
> Cc: vitaly.pros...@amd.com
> Cc: Uma Shankar 
> Cc: Ville Syrjälä 
> Cc: Joshua Ashton 
> Cc: Jani Nikula 
> Cc: dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> Reviewed-By: Joshua Ashton 
> ---
>  drivers/gpu/drm/drm_atomic.c|  1 +
>  drivers/gpu/drm/drm_connector.c | 15 +++
>  include/drm/drm_connector.h |  1 +
>  3 files changed, 17 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> index c0dc5858a723..d6d04c4ccfc0 100644
> --- a/drivers/gpu/drm/drm_atomic.c
> +++ b/drivers/gpu/drm/drm_atomic.c
> @@ -1071,6 +1071,7 @@ static void drm_atomic_connector_print_state(struct 
> drm_printer *p,
>   drm_printf(p, "\tcrtc=%s\n", state->crtc ? state->crtc->name : 
> "(null)");
>   drm_printf(p, "\tself_refresh_aware=%d\n", state->self_refresh_aware);
>   drm_printf(p, "\tmax_requested_bpc=%d\n", state->max_requested_bpc);
> + drm_printf(p, "\tcolorspace=%s\n", 
> drm_get_colorspace_name(state->colorspace));
>  
>   if (connector->connector_type == DRM_MODE_CONNECTOR_WRITEBACK)
>   if (state->writeback_job && state->writeback_job->fb)
> diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> index 7649f0ac454f..7ed48f9cbb20 100644
> --- a/drivers/gpu/drm/drm_connector.c
> +++ b/drivers/gpu/drm/drm_connector.c
> @@ -1044,6 +1044,21 @@ static const char * const colorspace_names[] = {
>   [DRM_MODE_COLORIMETRY_BT601_YCC] = "BT601_YCC",
>  };
>  
> +/**
> + * drm_get_colorspace_name - return a string for color encoding
> + * @colorspace: color space to compute name of
> + *
> + * In contrast to the other drm_get_*_name functions this one here returns a
> + * const pointer and hence is threadsafe.
> + */
> +const char *drm_get_colorspace_name(enum drm_colorspace colorspace)
> +{
> + if (WARN_ON(colorspace >= ARRAY_SIZE(colorspace_names)))
> + return "unknown";
> +
> + return colorspace_names[colorspace];

Should this protect against returning NULL? Well, I suppose that cannot
happen right now, and probably holes will not be added in the enum. But
should kernel code still be more paranoid?


Thanks,
pq

> +}
> +
>  static const u32 hdmi_colorspaces =
>   BIT(DRM_MODE_COLORIMETRY_SMPTE_170M_YCC) |
>   BIT(DRM_MODE_COLORIMETRY_BT709_YCC) |
> diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
> index 46c064d9ffef..c77e42408522 100644
> --- a/include/drm/drm_connector.h
> +++ b/include/drm/drm_connector.h
> @@ -1970,6 +1970,7 @@ void drm_connector_list_iter_end(struct 
> drm_connector_list_iter *iter);
>  
>  bool drm_connector_has_possible_encoder(struct drm_connector *connector,
>   struct drm_encoder *encoder);
> +const char *drm_get_colorspace_name(enum drm_colorspace colorspace);
>  
>  /**
>   * drm_for_each_connector_iter - connector_list iterator macro
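
Regarding the NULL question above, a more paranoid lookup could look roughly like the following standalone sketch. It is only an illustration under assumed names (demo_names, demo_get_colorspace_name() and the DEMO_CS_* values are hypothetical), with a deliberate hole in the table to show the case being asked about.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enum with a gap at value 2. */
enum demo_cs {
	DEMO_CS_DEFAULT = 0,
	DEMO_CS_BT709_YCC = 1,
	DEMO_CS_BT2020 = 3
};

/* Index 2 has no initializer, so that slot is a NULL pointer. */
static const char * const demo_names[] = {
	[DEMO_CS_DEFAULT] = "Default",
	[DEMO_CS_BT709_YCC] = "BT709_YCC",
	[DEMO_CS_BT2020] = "BT2020",
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static_assert(DEMO_ARRAY_SIZE(demo_names) == DEMO_CS_BT2020 + 1,
	      "table must cover the highest enum value");

/* Return "unknown" for out-of-range values and for holes in the table. */
static const char *demo_get_colorspace_name(unsigned int colorspace)
{
	if (colorspace >= DEMO_ARRAY_SIZE(demo_names) || !demo_names[colorspace])
		return "unknown";
	return demo_names[colorspace];
}
```

The extra NULL test costs one comparison and means a caller printing with %s never sees a NULL pointer, even if a gap is introduced in the enum later.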





Re: [PATCH v3 05/17] drm/connector: Use common colorspace_names array

2023-03-08 Thread Pekka Paalanen
On Tue, 7 Mar 2023 10:10:55 -0500
Harry Wentland  wrote:

> We can use bitfields to track the supported ones for HDMI
> and DP. This allows us to print colorspaces in a consistent
> manner without needing to know whether we're dealing with
> DP or HDMI.
> 
> Signed-off-by: Harry Wentland 
> Cc: Pekka Paalanen 
> Cc: Sebastian Wick 
> Cc: vitaly.pros...@amd.com
> Cc: Uma Shankar 
> Cc: Ville Syrjälä 
> Cc: Joshua Ashton 
> Cc: Jani Nikula 
> Cc: dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> ---
>  drivers/gpu/drm/drm_connector.c | 131 +++-
>  include/drm/drm_connector.h |   1 +
>  2 files changed, 78 insertions(+), 54 deletions(-)
> 

...

> diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
> index 3e2e1bc7aa04..46c064d9ffef 100644
> --- a/include/drm/drm_connector.h
> +++ b/include/drm/drm_connector.h
> @@ -460,6 +460,7 @@ enum drm_colorspace {
>   DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED = 13,
>   DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT = 14,
>   DRM_MODE_COLORIMETRY_BT601_YCC  = 15,
> + DRM_MODE_COLORIMETRY_MAX

Maybe a comment to say that MAX is not a valid value?
Given that things like iccMAX exist (even though it makes no sense as a
colorspace), MAX could perhaps be confused with something.

Or call it DRM_MODE_COLORIMETRY__COUNT? or __END?


Thanks,
pq
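
As a rough sketch of the naming suggestion above: a sentinel that is clearly a count, with a comment saying it is not a value, plus a compile-time check that a name table stays in sync. All identifiers here (demo_colorimetry, DEMO_COLORIMETRY_COUNT, demo_colorimetry_is_valid()) are hypothetical and only illustrate the pattern.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_colorimetry {
	DEMO_COLORIMETRY_DEFAULT = 0,
	DEMO_COLORIMETRY_BT709_YCC,
	DEMO_COLORIMETRY_BT2020,
	/* Not a colorimetry: number of entries above. Must stay last. */
	DEMO_COLORIMETRY_COUNT
};

static const char * const demo_colorimetry_names[] = {
	"Default",
	"BT709_YCC",
	"BT2020",
};

/* Fails to compile if an enum entry is added without a matching name. */
static_assert(sizeof(demo_colorimetry_names) / sizeof(demo_colorimetry_names[0])
	      == DEMO_COLORIMETRY_COUNT,
	      "name table out of sync with enum");

/* The sentinel itself is never a valid value. */
static bool demo_colorimetry_is_valid(unsigned int value)
{
	return value < DEMO_COLORIMETRY_COUNT;
}
```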




Re: [PATCH 9/9] drm: move ttm_execbuf_util into vmwgfx

2023-03-08 Thread Christian König

On 08.03.23 at 06:14, Zack Rusin wrote:
> On Tue, 2023-02-28 at 09:34 +0100, Christian König wrote:
> > VMWGFX is the only remaining user of this and should probably be moved over
> > to drm_exec when it starts using GEM as well.
>
> Is this because vmwgfx piggybacks buffer-id relocations on top of ttm
> validations or did you just find it too hard to port it over? I'd prefer to
> avoid ttm moves to vmwgfx and at least have a clear idea of what we need to
> do to port.

I've just found it too hard to port it over because vmwgfx does some
strange things with the validation code here.

If you want we can take a deeper look at this together, but I need to
find some time.

Alternatively just tell me how to do it and I will add that to the patch
set :)

Regards,
Christian.

> z




Re: [PATCH v3 03/17] drm/connector: Deprecate split for BT.2020 in drm_colorspace enum

2023-03-08 Thread Pekka Paalanen
On Tue, 7 Mar 2023 10:10:53 -0500
Harry Wentland  wrote:

> From: Joshua Ashton 
> 
> Userspace has no way of controlling or knowing the pixel encoding
> currently, so there is no way for it to ever get the right values here.
> 
> When we do add pixel_encoding control from userspace, we can pick the
> right value for the colorimetry packet based on the
> pixel_encoding + the colorspace.
> 
> Let's deprecate these values, and have one BT.2020 colorspace entry
> that userspace can use.
> 
> v2:
>  - leave CYCC alone for now; it serves a purpose
>  - leave BT2020_RGB the new default BT2020
> 
> Signed-off-by: Joshua Ashton 
> Signed-off-by: Harry Wentland 
> Reviewed-by: Harry Wentland 
> 
> Cc: Pekka Paalanen 
> Cc: Sebastian Wick 
> Cc: vitaly.pros...@amd.com
> Cc: Uma Shankar 
> Cc: Ville Syrjälä 
> Cc: Joshua Ashton 
> Cc: dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> ---
>  drivers/gpu/drm/display/drm_hdmi_helper.c |  7 +++
>  drivers/gpu/drm/drm_connector.c   |  8 
>  drivers/gpu/drm/i915/display/intel_dp.c   | 14 +++---
>  include/drm/drm_connector.h   | 15 +--
>  4 files changed, 23 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/display/drm_hdmi_helper.c 
> b/drivers/gpu/drm/display/drm_hdmi_helper.c
> index faf5e9efa7d3..05a0d03ffcda 100644
> --- a/drivers/gpu/drm/display/drm_hdmi_helper.c
> +++ b/drivers/gpu/drm/display/drm_hdmi_helper.c
> @@ -97,8 +97,7 @@ EXPORT_SYMBOL(drm_hdmi_infoframe_set_hdr_metadata);
>  #define HDMI_COLORIMETRY_OPYCC_601   (C(3) | EC(3) | ACE(0))
>  #define HDMI_COLORIMETRY_OPRGB   (C(3) | EC(4) | ACE(0))
>  #define HDMI_COLORIMETRY_BT2020_CYCC (C(3) | EC(5) | ACE(0))
> -#define HDMI_COLORIMETRY_BT2020_RGB  (C(3) | EC(6) | ACE(0))
> -#define HDMI_COLORIMETRY_BT2020_YCC  (C(3) | EC(6) | ACE(0))
> +#define HDMI_COLORIMETRY_BT2020  (C(3) | EC(6) | ACE(0))
>  #define HDMI_COLORIMETRY_DCI_P3_RGB_D65  (C(3) | EC(7) | ACE(0))
>  #define HDMI_COLORIMETRY_DCI_P3_RGB_THEATER  (C(3) | EC(7) | ACE(1))
>  
> @@ -112,8 +111,8 @@ static const u32 hdmi_colorimetry_val[] = {
>   [DRM_MODE_COLORIMETRY_OPYCC_601] = HDMI_COLORIMETRY_OPYCC_601,
>   [DRM_MODE_COLORIMETRY_OPRGB] = HDMI_COLORIMETRY_OPRGB,
>   [DRM_MODE_COLORIMETRY_BT2020_CYCC] = HDMI_COLORIMETRY_BT2020_CYCC,
> - [DRM_MODE_COLORIMETRY_BT2020_RGB] = HDMI_COLORIMETRY_BT2020_RGB,
> - [DRM_MODE_COLORIMETRY_BT2020_YCC] = HDMI_COLORIMETRY_BT2020_YCC,
> + [DRM_MODE_COLORIMETRY_BT2020_DEPRECATED] = HDMI_COLORIMETRY_BT2020,
> + [DRM_MODE_COLORIMETRY_BT2020] = HDMI_COLORIMETRY_BT2020,
>  };
>  
>  #undef C
> diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
> index 61c29ce74b03..fe7eab15f727 100644
> --- a/drivers/gpu/drm/drm_connector.c
> +++ b/drivers/gpu/drm/drm_connector.c
> @@ -1031,9 +1031,9 @@ static const struct drm_prop_enum_list 
> hdmi_colorspaces[] = {
>   /* Colorimetry based on ITU-R BT.2020 */
>   { DRM_MODE_COLORIMETRY_BT2020_CYCC, "BT2020_CYCC" },
>   /* Colorimetry based on ITU-R BT.2020 */
> - { DRM_MODE_COLORIMETRY_BT2020_RGB, "BT2020_RGB" },
> + { DRM_MODE_COLORIMETRY_BT2020, "BT2020" },
>   /* Colorimetry based on ITU-R BT.2020 */
> - { DRM_MODE_COLORIMETRY_BT2020_YCC, "BT2020_YCC" },
> + { DRM_MODE_COLORIMETRY_BT2020_DEPRECATED, "BT2020_DEPRECATED" },
>   /* Added as part of Additional Colorimetry Extension in 861.G */
>   { DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65, "DCI-P3_RGB_D65" },
>   { DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER, "DCI-P3_RGB_Theater" },
> @@ -1054,7 +1054,7 @@ static const struct drm_prop_enum_list dp_colorspaces[] 
> = {
>   /* Colorimetry based on SMPTE RP 431-2 */
>   { DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65, "DCI-P3_RGB_D65" },
>   /* Colorimetry based on ITU-R BT.2020 */
> - { DRM_MODE_COLORIMETRY_BT2020_RGB, "BT2020_RGB" },
> + { DRM_MODE_COLORIMETRY_BT2020, "BT2020" },
>   { DRM_MODE_COLORIMETRY_BT601_YCC, "BT601_YCC" },
>   { DRM_MODE_COLORIMETRY_BT709_YCC, "BT709_YCC" },
>   /* Standard Definition Colorimetry based on IEC 61966-2-4 */
> @@ -1068,7 +1068,7 @@ static const struct drm_prop_enum_list dp_colorspaces[] 
> = {
>   /* Colorimetry based on ITU-R BT.2020 */
>   { DRM_MODE_COLORIMETRY_BT2020_CYCC, "BT2020_CYCC" },
>   /* Colorimetry based on ITU-R BT.2020 */
> - { DRM_MODE_COLORIMETRY_BT2020_YCC, "BT2020_YCC" },
> + { DRM_MODE_COLORIMETRY_BT2020_DEPRECATED, "BT2020_DEPRECATED" },

Let's hope no-one complains about missing the old string names in UABI. :-)

Actually, you should write in the commit message why removing old names
is fine.

>  };
>  
>  /**
> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
> b/drivers/gpu/drm/i915/display/intel_dp.c
> index c9be61d2348e..be100a193bf5 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp.c
> +++ 

Re: [PATCH] drm/amdkfd: Fixed kfd_process cleanup on module exit.

2023-03-08 Thread Christian König

On 07.03.23 at 16:28, Belanger, David wrote:

[AMD Official Use Only - General]


The test case is a python program that will load the driver, do some 
operations, then unload the driver.


What do you mean with unloading the driver? Removing the module? Or 
destroying the device?



When the driver exits, the python process space is still around, holding 
on to the address space.
When the python process space exits, the mmu_notifier gets called but the 
driver has already been unloaded.

The goal of the fix is to address case where there could be outstanding address 
space / worker threads for process
cleanup that needs to be cleared/completed at exit time.


Yeah and when the module is unloaded this is a completely futile effort.

The general upstream approach is to take references on the struct device 
and module and prevent unloading as long as those references exist.


The device might no longer be functional (for example because of hot 
plug), but the driver should never be unloaded before the python program 
exits.


Regards,
Christian.



Regards,
David B.


-Original Message-
From: Koenig, Christian 
Sent: Tuesday, March 7, 2023 2:05 AM
To: Belanger, David ; amd-
g...@lists.freedesktop.org
Subject: Re: [PATCH] drm/amdkfd: Fixed kfd_process cleanup on module
exit.

On 06.03.23 at 22:58, David Belanger wrote:

Handle case when module is unloaded (kfd_exit) before a process space
(mm_struct) is released.

Well that should never ever happen in the first place. It sounds like we are
missing grabbing module references.

Regards,
Christian.


Signed-off-by: David Belanger 
---
   drivers/gpu/drm/amd/amdkfd/kfd_module.c  |  4 ++
   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 57



   2 files changed, 61 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 09b966dc3768..8ef4bd9e4f7d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -26,6 +26,9 @@
   #include "kfd_priv.h"
   #include "amdgpu_amdkfd.h"

+void kfd_cleanup_processes(void);
+
+
   static int kfd_init(void)
   {
int err;
@@ -77,6 +80,7 @@ static int kfd_init(void)

   static void kfd_exit(void)
   {
+   kfd_cleanup_processes();
kfd_debugfs_fini();
kfd_process_destroy_wq();
kfd_procfs_shutdown();
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index ebabe92f7edb..b5b28a32639d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1181,6 +1181,17 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
return;

mutex_lock(&kfd_processes_mutex);
+   /*
+* Do early return if p is not in the table.
+*
+* This could potentially happen if this function is called concurrently
+* by mmu_notifier and by kfd_cleanup_pocesses.
+*
+*/
+   if (!hash_hashed(&p->kfd_processes)) {
+   mutex_unlock(&kfd_processes_mutex);
+   return;
+   }
hash_del_rcu(&p->kfd_processes);
mutex_unlock(&kfd_processes_mutex);
synchronize_srcu(&kfd_processes_srcu);
@@ -1200,6 +1211,52 @@ static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = {
.free_notifier = kfd_process_free_notifier,
   };

+
+void kfd_cleanup_processes(void)
+{
+   struct kfd_process *p;
+   unsigned int temp;
+
+   /*
+* Iterate over remaining processes in table, calling notifier release
+* to free up notifier and process resources.
+*
+* This code handles the case when driver is unloaded before all mm_struct
+* are released.
+*/
+   int idx = srcu_read_lock(&kfd_processes_srcu);
+
+   hash_for_each_rcu(kfd_processes_table, temp, p, kfd_processes) {
+   if (p) {
+   /*
+* Obtain a reference on p to avoid a late mmu_notifier release
+* call triggering freeing the process.
+*/
+
+   kref_get(&p->ref);
+
+   srcu_read_unlock(&kfd_processes_srcu, idx);
+
+   kfd_process_notifier_release(&p->mmu_notifier, p->mm);
+
+   kfd_unref_process(p);
+
+   idx = srcu_read_lock(&kfd_processes_srcu);
+   }
+   }
+   srcu_read_unlock(&kfd_processes_srcu, idx);
+
+   /*
+* Must be called after all mmu_notifier_put are done and before
+* kfd_process_wq is released.
+*
+* Ensures that all outstanding free_notifier gets called, triggering the release
+* of the process.
+*/
+   mmu_notifier_synchronize();
+}
+
+
   static int kfd_process_init_cwsr_apu(struct kfd_process *p, struct file

*filep)

   {
unsigned long  offset;




Re: [PATCH v3 02/17] drm/connector: Add enum documentation to drm_colorspace

2023-03-08 Thread Pekka Paalanen
On Tue, 7 Mar 2023 10:10:52 -0500
Harry Wentland  wrote:

> From: Joshua Ashton 
> 
> To match the other enums, and add more information about these values.
> 
> v2:
>  - Specify where an enum entry comes from
>  - Clarify DEFAULT and NO_DATA behavior
>  - BT.2020 CYCC is "constant luminance"
>  - correct type for BT.601
> 
> Signed-off-by: Joshua Ashton 
> Signed-off-by: Harry Wentland 
> Reviewed-by: Harry Wentland 

Hi,

this effort is really good, but of course I still find things to
nitpick about. If there is no answer to my questions, then I would
prefer the documentation to spell out the unknowns and ambiguities.

> Cc: Pekka Paalanen 
> Cc: Sebastian Wick 
> Cc: vitaly.pros...@amd.com
> Cc: Uma Shankar 
> Cc: Ville Syrjälä 
> Cc: Joshua Ashton 
> Cc: dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> ---
>  include/drm/drm_connector.h | 67 +++--
>  1 file changed, 65 insertions(+), 2 deletions(-)
> 
> diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
> index 6d6a53a6b010..bb078666dc34 100644
> --- a/include/drm/drm_connector.h
> +++ b/include/drm/drm_connector.h
> @@ -363,13 +363,76 @@ enum drm_privacy_screen_status {
>   PRIVACY_SCREEN_ENABLED_LOCKED,
>  };
>  
> -/*
> - * This is a consolidated colorimetry list supported by HDMI and
> +/**
> + * enum drm_colorspace - color space
> + *
> + * This enum is a consolidated colorimetry list supported by HDMI and
>   * DP protocol standard. The respective connectors will register
>   * a property with the subset of this list (supported by that
>   * respective protocol). Userspace will set the colorspace through
>   * a colorspace property which will be created and exposed to
>   * userspace.
> + *
> + * DP definitions come from the DP v2.0 spec
> + * HDMI definitions come from the CTA-861-H spec
> + *
> + * @DRM_MODE_COLORIMETRY_DEFAULT:
> + *   Driver specific behavior.
> + *   For DP:
> + *   RGB encoded: sRGB (IEC 61966-2-1)
> + *   YCbCr encoded: ITU-R BT.601 colorimetry format

Does this mean that HDMI behavior is driver-specific while DP behavior
is as defined?

Is it intentional that YCbCr encoding also uses different RGB-primaries
than RGB-encoded signal? (BT.601 vs. BT.709/sRGB)

Or do you need to be more explicit on which parts of each spec apply
(ColourPrimaries vs. TransferCharacteristics vs. MatrixCoefficients in
CICP parlance)?

E.g. BT.709/sRGB ColourPrimaries with BT.601 MatrixCoefficients.

> + * @DRM_MODE_COLORIMETRY_NO_DATA:
> + *   Driver specific behavior.
> + *   For HDMI:
> + *   Sets "No Data" in infoframe

Does DEFAULT mean that something other than "No Data" may be set in the
HDMI infoframe?

If so, since these two have the same value, where is the difference? Is
DEFAULT purely a UAPI token, and NO_DATA used internally? Or is NO_DATA
used only when crafting actual infoframe packets?

Should NO_DATA be documented to be a strictly driver-internal value,
and not documented with UAPI?

It is unclear to me whether userspace uses these enum values directly
or only the string names.
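
To make the aliasing concern above concrete, here is a minimal C sketch. It
assumes a trimmed copy of the enum from the patch under review; the helper
colorspace_name() is hypothetical and is not a kernel function. Because both
enumerators are the integer 0, code that only sees the value cannot tell
which one was meant; only a string name could carry that distinction.

```c
#include <assert.h>

/* Trimmed, illustrative copy of the enum from the patch under review. */
enum drm_colorspace {
	DRM_MODE_COLORIMETRY_DEFAULT        = 0,
	DRM_MODE_COLORIMETRY_NO_DATA        = 0,
	DRM_MODE_COLORIMETRY_SMPTE_170M_YCC = 1,
};

/* The two enumerators are the same integer constant. */
static_assert(DRM_MODE_COLORIMETRY_DEFAULT == DRM_MODE_COLORIMETRY_NO_DATA,
	      "DEFAULT and NO_DATA alias each other");

/* Hypothetical helper (not kernel API): map a value back to a name. */
static const char *colorspace_name(enum drm_colorspace cs)
{
	switch (cs) {
	case DRM_MODE_COLORIMETRY_DEFAULT:
		/*
		 * Adding "case DRM_MODE_COLORIMETRY_NO_DATA:" here would be
		 * a duplicate case value and would not compile.
		 */
		return "Default";
	case DRM_MODE_COLORIMETRY_SMPTE_170M_YCC:
		return "SMPTE_170M_YCC";
	}
	return "unknown";
}
```

Passing either DRM_MODE_COLORIMETRY_DEFAULT or DRM_MODE_COLORIMETRY_NO_DATA
to the helper yields the same name, which is why the documentation would
need to say where the two are meant to differ.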

> + * @DRM_MODE_COLORIMETRY_SMPTE_170M_YCC:
> + *   (HDMI)
> + *   SMPTE ST 170M colorimetry format

Does "colorimetry format" mean that the spec is used in full, for all
of ColourPrimaries, TransferCharacteristics and MatrixCoefficients?

If yes, good. If not, the wording misleads me.

> + * @DRM_MODE_COLORIMETRY_BT709_YCC:
> + *   (HDMI, DP)
> + *   ITU-R BT.709 colorimetry format
> + * @DRM_MODE_COLORIMETRY_XVYCC_601:
> + *   (HDMI, DP)
> + *   xvYCC601 colorimetry format
> + * @DRM_MODE_COLORIMETRY_XVYCC_709:
> + *   (HDMI, DP)
> + *   xvYCC709 colorimetry format

Btw. xvYCC are funny because they require limited quantization range
encoding, but use the foot- and headroom to encode out-of-nominal-range
values in order to expand the color gamut with negative and greater
than unity values.

Just out of curiosity, is it in any way possible today to make use of that
extended color gamut through KMS? Has it ever been possible?

I mean, the KMS color pipeline assumes full-range RGB, so I don't see
any way to make use of xvYCC.
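
The foot- and headroom point can be sketched in C. This assumes the usual
8-bit limited-range chroma scaling (nominal [-0.5, 0.5] mapped to codes
16..240) and that codes 0 and 255 are reserved; the function names are
hypothetical and nothing here is taken from the patch or from KMS.

```c
/*
 * Quantize a normalized chroma value to an 8-bit limited-range code.
 * Nominal input [-0.5, 0.5] lands on 16..240. Inputs outside that range
 * land in the footroom (1..15) or headroom (241..254), which is where
 * xvYCC carries its out-of-nominal-range values.
 */
static int quantize_chroma_8bit(double c)
{
	double v = 224.0 * c + 128.0;

	/* Assumption: codes 0 and 255 are reserved, so clamp to 1..254. */
	if (v < 1.0)
		return 1;
	if (v > 254.0)
		return 254;
	/* v is positive here, so adding 0.5 and truncating rounds to nearest. */
	return (int)(v + 0.5);
}

/* True if the code lies in the foot- or headroom rather than 16..240. */
static int is_out_of_nominal(int code)
{
	return code < 16 || code > 240;
}
```

A pipeline that clamps to the nominal range, or that expands to full-range
RGB in [0, 1], discards exactly the codes for which is_out_of_nominal()
returns true.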

> + * @DRM_MODE_COLORIMETRY_SYCC_601:
> + *   (HDMI, DP)
> + *   sYCC601 colorimetry format
> + * @DRM_MODE_COLORIMETRY_OPYCC_601:
> + *   (HDMI, DP)
> + *   opYCC601 colorimetry format
> + * @DRM_MODE_COLORIMETRY_OPRGB:
> + *   (HDMI, DP)
> + *   opRGB colorimetry format
> + * @DRM_MODE_COLORIMETRY_BT2020_CYCC:
> + *   (HDMI, DP)
> + *   ITU-R BT.2020 Y'c C'bc C'rc (constant luminance) colorimetry format
> + * @DRM_MODE_COLORIMETRY_BT2020_RGB:
> + *   (HDMI, DP)
> + *   ITU-R BT.2020 R' G' B' colorimetry format
> + * @DRM_MODE_COLORIMETRY_BT2020_YCC:
> + *   (HDMI, DP)
> + *   ITU-R BT.2020 Y' C'b C'r colorimetry format
> + * @DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65:
> + *   (HDMI)
> + *   SMPTE ST 2113 P3D65 colorimetry format
> + * @DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER:
> + *   (HDMI)
> + *   SMPTE ST 2113 P3DCI colorimetry format
> + * 

Re: [PATCH v4 01/17] drm/connector: Convert DRM_MODE_COLORIMETRY to enum

2023-03-08 Thread Pekka Paalanen
On Tue, 7 Mar 2023 10:29:34 -0500
Harry Wentland  wrote:

> This allows us to use strongly typed arguments.
> 
> v2:
>  - Bring NO_DATA back
>  - Provide explicit enum values
> 
> v4: Drop unnecessary '&' from kerneldoc (emersion)
> 
> Signed-off-by: Harry Wentland 
> Reviewed-by: Simon Ser 
> 
> Cc: Pekka Paalanen 
> Cc: Sebastian Wick 
> Cc: vitaly.pros...@amd.com
> Cc: Uma Shankar 
> Cc: Ville Syrjälä 
> Cc: Joshua Ashton 
> Cc: dri-de...@lists.freedesktop.org
> Cc: amd-gfx@lists.freedesktop.org
> ---
>  include/drm/display/drm_dp.h |  2 +-
>  include/drm/drm_connector.h  | 49 ++--
>  2 files changed, 26 insertions(+), 25 deletions(-)
> 
> diff --git a/include/drm/display/drm_dp.h b/include/drm/display/drm_dp.h
> index ed10e6b6f99d..dae5e9c201e4 100644
> --- a/include/drm/display/drm_dp.h
> +++ b/include/drm/display/drm_dp.h
> @@ -1623,7 +1623,7 @@ enum dp_pixelformat {
>   *
>   * This enum is used to indicate DP VSC SDP Colorimetry formats.
>   * It is based on DP 1.4 spec [Table 2-117: VSC SDP Payload for DB16 through
> - * DB18] and a name of enum member follows DRM_MODE_COLORIMETRY definition.
> + * DB18] and a name of enum member follows enum drm_colorimetry definition.
>   *
>   * @DP_COLORIMETRY_DEFAULT: sRGB (IEC 61966-2-1) or
>   *  ITU-R BT.601 colorimetry format
> diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
> index 4d830fc55a3d..6d6a53a6b010 100644
> --- a/include/drm/drm_connector.h
> +++ b/include/drm/drm_connector.h
> @@ -371,29 +371,30 @@ enum drm_privacy_screen_status {
>   * a colorspace property which will be created and exposed to
>   * userspace.
>   */
> -
> -/* For Default case, driver will set the colorspace */
> -#define DRM_MODE_COLORIMETRY_DEFAULT 0
> -/* CEA 861 Normal Colorimetry options */
> -#define DRM_MODE_COLORIMETRY_NO_DATA 0
> -#define DRM_MODE_COLORIMETRY_SMPTE_170M_YCC  1
> -#define DRM_MODE_COLORIMETRY_BT709_YCC   2
> -/* CEA 861 Extended Colorimetry Options */
> -#define DRM_MODE_COLORIMETRY_XVYCC_601   3
> -#define DRM_MODE_COLORIMETRY_XVYCC_709   4
> -#define DRM_MODE_COLORIMETRY_SYCC_601   5
> -#define DRM_MODE_COLORIMETRY_OPYCC_601   6
> -#define DRM_MODE_COLORIMETRY_OPRGB   7
> -#define DRM_MODE_COLORIMETRY_BT2020_CYCC 8
> -#define DRM_MODE_COLORIMETRY_BT2020_RGB  9
> -#define DRM_MODE_COLORIMETRY_BT2020_YCC  10
> -/* Additional Colorimetry extension added as part of CTA 861.G */
> -#define DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65  11
> -#define DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER  12
> -/* Additional Colorimetry Options added for DP 1.4a VSC Colorimetry Format */
> -#define DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED  13
> -#define DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT  14
> -#define DRM_MODE_COLORIMETRY_BT601_YCC   15
> +enum drm_colorspace {
> + /* For Default case, driver will set the colorspace */
> + DRM_MODE_COLORIMETRY_DEFAULT= 0,
> + DRM_MODE_COLORIMETRY_NO_DATA= 0,
> + /* CEA 861 Normal Colorimetry options */

This comment seems to be in the wrong place, NO_DATA should be under
this comment.

With that fixed:
Reviewed-by: Pekka Paalanen 


Thanks,
pq

> + DRM_MODE_COLORIMETRY_SMPTE_170M_YCC = 1,
> + DRM_MODE_COLORIMETRY_BT709_YCC  = 2,
> + /* CEA 861 Extended Colorimetry Options */
> + DRM_MODE_COLORIMETRY_XVYCC_601  = 3,
> + DRM_MODE_COLORIMETRY_XVYCC_709  = 4,
> + DRM_MODE_COLORIMETRY_SYCC_601   = 5,
> + DRM_MODE_COLORIMETRY_OPYCC_601  = 6,
> + DRM_MODE_COLORIMETRY_OPRGB  = 7,
> + DRM_MODE_COLORIMETRY_BT2020_CYCC= 8,
> + DRM_MODE_COLORIMETRY_BT2020_RGB = 9,
> + DRM_MODE_COLORIMETRY_BT2020_YCC = 10,
> + /* Additional Colorimetry extension added as part of CTA 861.G */
> + DRM_MODE_COLORIMETRY_DCI_P3_RGB_D65 = 11,
> + DRM_MODE_COLORIMETRY_DCI_P3_RGB_THEATER = 12,
> + /* Additional Colorimetry Options added for DP 1.4a VSC Colorimetry Format */
> + DRM_MODE_COLORIMETRY_RGB_WIDE_FIXED = 13,
> + DRM_MODE_COLORIMETRY_RGB_WIDE_FLOAT = 14,
> + DRM_MODE_COLORIMETRY_BT601_YCC  = 15,
> +};
>  
>  /**
>   * enum drm_bus_flags - bus_flags info for _display_info
> @@ -826,7 +827,7 @@ struct drm_connector_state {
>* colorspace change on Sink. This is most commonly used to switch
>* to wider color gamuts like BT2020.
>*/
> - u32 colorspace;
> + enum drm_colorspace colorspace;
>  
>   /**
>* @writeback_job: Writeback job for writeback connectors





[BUG 6.3-rc1] Bad lock in ttm_bo_delayed_delete()

2023-03-08 Thread Steven Rostedt


In a report for a regression in my code, I tried to run v6.3-rc1 through my
tests. It crashed at boot up on my first test (my start up tests do take a
long time, hence the 206 seconds of boot!).

[  206.238782] [ cut here ]
[  206.277786] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[  206.277946] WARNING: CPU: 0 PID: 332 at kernel/locking/mutex.c:582 __ww_mutex_lock.constprop.0+0x566/0xfec
[  206.313338] Modules linked in:
[  206.324732] CPU: 0 PID: 332 Comm: kworker/0:13H Not tainted 6.3.0-rc1-test-1-ga98bd42762ed-dirty #965
[  206.338273] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
[  206.353596] Workqueue: ttm ttm_bo_delayed_delete
[  206.370520] EIP: __ww_mutex_lock.constprop.0+0x566/0xfec
[  206.382855] Code: e8 ab 59 95 ff 85 c0 0f 84 25 fb ff ff 8b 0d 58 c0 3b cf 85 c9 0f 85 17 fb ff ff 68 e0 8d 07 cf 68 2b ac 05 cf e8 e6 e6 3f ff <0f> 0b 58 5a e9 ff fa ff ff e8 78 59 95 ff 85 c0 74 0e 8b 0d 58 c0
[  206.411247] EAX: 0028 EBX:  ECX: c3ae5dd8 EDX: 0002
[  206.425193] ESI:  EDI: c2d5f0bc EBP: c3ae5f00 ESP: c3ae5eac
[  206.439236] DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068 EFLAGS: 00010246
[  206.453597] CR0: 80050033 CR2: ff9ff000 CR3: 0f512000 CR4: 00150ef0
[  206.467841] Call Trace:
[  206.481059]  ? ttm_bo_delayed_delete+0x30/0x94
[  206.494980]  ww_mutex_lock+0x32/0x94
[  206.508699]  ttm_bo_delayed_delete+0x30/0x94
[  206.522371]  process_one_work+0x21a/0x538
[  206.536306]  worker_thread+0x146/0x398
[  206.549860]  kthread+0xea/0x10c
[  206.563141]  ? process_one_work+0x538/0x538
[  206.576835]  ? kthread_complete_and_exit+0x1c/0x1c
[  206.590652]  ret_from_fork+0x1c/0x28
[  206.604522] irq event stamp: 4219
[  206.617852] hardirqs last  enabled at (4219): [] _raw_spin_unlock_irqrestore+0x2d/0x58
[  206.633077] hardirqs last disabled at (4218): [] kvfree_call_rcu+0x155/0x2ec
[  206.648161] softirqs last  enabled at (3570): [] __do_softirq+0x2f3/0x48b
[  206.663025] softirqs last disabled at (3565): [] call_on_stack+0x45/0x4c
[  206.678065] ---[ end trace  ]---

Looks like there was a lock possibly used after free. But as commit
9bff18d13473a9fdf81d5158248472a9d8ecf2bd ("drm/ttm: use per BO cleanup
workers") changed a lot of this code, I figured it may be the culprit.

-- Steve


Re: [BUG 6.3-rc1] Bad lock in ttm_bo_delayed_delete()

2023-03-08 Thread Steven Rostedt
On Tue, 7 Mar 2023 21:22:23 -0500
Steven Rostedt  wrote:

> Looks like there was a lock possibly used after free. But as commit
> 9bff18d13473a9fdf81d5158248472a9d8ecf2bd ("drm/ttm: use per BO cleanup
> workers") changed a lot of this code, I figured it may be the culprit.

Had I bothered to look at the second warning after this one (I usually stop
after the first), I would have seen that it reports a use-after-free issue.

[  206.692285] [ cut here ]
[  206.706333] refcount_t: underflow; use-after-free.
[  206.720577] WARNING: CPU: 0 PID: 332 at lib/refcount.c:28 refcount_warn_saturate+0xb6/0xfc
[  206.735810] Modules linked in:
[  206.749493] CPU: 0 PID: 332 Comm: kworker/0:13H Tainted: GW  6.3.0-rc1-test-1-ga98bd42762ed-dirty #965
[  206.765833] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
[  206.781767] Workqueue: ttm ttm_bo_delayed_delete
[  206.796500] EIP: refcount_warn_saturate+0xb6/0xfc
[  206.811121] Code: 68 50 1c 0d cf e8 66 b3 a9 ff 0f 0b 58 c9 c3 90 80 3d 57 c6 38 cf 00 75 8a c6 05 57 c6 38 cf 01 68 7c 1c 0d cf e8 46 b3 a9 ff <0f> 0b 59 c9 c3 80 3d 55 c6 38 cf 00 0f 85 67 ff ff ff c6 05 55 c6
[  206.844560] EAX: 0026 EBX: c2d5f150 ECX: c3ae5e40 EDX: 0002
[  206.862109] ESI: c2d5f0bc EDI: f6f91200 EBP: c3ae5f18 ESP: c3ae5f14
[  206.878773] DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068 EFLAGS: 00010246
[  206.895665] CR0: 80050033 CR2: ff9ff000 CR3: 0f512000 CR4: 00150ef0
[  206.912303] Call Trace:
[  206.927940]  ttm_bo_delayed_delete+0x8c/0x94
[  206.944179]  process_one_work+0x21a/0x538
[  206.960605]  worker_thread+0x146/0x398
[  206.976839]  kthread+0xea/0x10c
[  206.992696]  ? process_one_work+0x538/0x538
[  207.008827]  ? kthread_complete_and_exit+0x1c/0x1c
[  207.025150]  ret_from_fork+0x1c/0x28
[  207.041307] irq event stamp: 4219
[  207.056883] hardirqs last  enabled at (4219): [] _raw_spin_unlock_irqrestore+0x2d/0x58
[  207.074298] hardirqs last disabled at (4218): [] kvfree_call_rcu+0x155/0x2ec
[  207.091461] softirqs last  enabled at (3570): [] __do_softirq+0x2f3/0x48b
[  207.107979] softirqs last disabled at (3565): [] call_on_stack+0x45/0x4c
[  207.123827] ---[ end trace  ]---


-- Steve


[PATCH] drm/amd/display: remove legacy fields of dc_plane_cap struct

2023-03-08 Thread David Tadokoro
The fields blends_with_above and blends_with_below of struct
dc_plane_cap (defined in dc/dc.h) are boolean and set to true by
default. All instances of a dc_plane_cap maintain the default values of
both. Also, there is only one if statement that checks those fields and
there would be the same effect if it was deleted (assuming that those
fields are always going to be true).

For this reason, considering both fields as legacy ones, this commit
removes them and the aforementioned if statement.

Signed-off-by: David Tadokoro 
---
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c   | 3 ---
 drivers/gpu/drm/amd/display/dc/dc.h | 2 --
 drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c | 3 ---
 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c   | 2 --
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c   | 2 --
 drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c | 2 --
 drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c   | 2 --
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c   | 2 --
 drivers/gpu/drm/amd/display/dc/dcn301/dcn301_resource.c | 2 --
 drivers/gpu/drm/amd/display/dc/dcn302/dcn302_resource.c | 2 --
 drivers/gpu/drm/amd/display/dc/dcn303/dcn303_resource.c | 2 --
 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c   | 2 --
 drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c | 2 --
 drivers/gpu/drm/amd/display/dc/dcn315/dcn315_resource.c | 2 --
 drivers/gpu/drm/amd/display/dc/dcn316/dcn316_resource.c | 2 --
 drivers/gpu/drm/amd/display/dc/dcn32/dcn32_resource.c   | 2 --
 drivers/gpu/drm/amd/display/dc/dcn321/dcn321_resource.c | 2 --
 17 files changed, 36 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index b472931cb7ca..fdcb375e908a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4354,9 +4354,6 @@ static int amdgpu_dm_initialize_drm_device(struct amdgpu_device *adev)
if (plane->type != DC_PLANE_TYPE_DCN_UNIVERSAL)
continue;
 
-   if (!plane->blends_with_above || !plane->blends_with_below)
-   continue;
-
if (!plane->pixel_format_support.argb)
continue;
 
diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h
index f0a1934ebf8c..ccc27d482640 100644
--- a/drivers/gpu/drm/amd/display/dc/dc.h
+++ b/drivers/gpu/drm/amd/display/dc/dc.h
@@ -82,8 +82,6 @@ enum det_size {
 
 struct dc_plane_cap {
enum dc_plane_type type;
-   uint32_t blends_with_above : 1;
-   uint32_t blends_with_below : 1;
uint32_t per_pixel_alpha : 1;
struct {
uint32_t argb : 1;
diff --git a/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c b/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
index f808315b2835..a4a45a6ce61e 100644
--- a/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce110/dce110_resource.c
@@ -401,8 +401,6 @@ static const struct resource_caps stoney_resource_cap = {
 
 static const struct dc_plane_cap plane_cap = {
.type = DC_PLANE_TYPE_DCE_RGB,
-   .blends_with_below = true,
-   .blends_with_above = true,
.per_pixel_alpha = 1,
 
.pixel_format_support = {
@@ -428,7 +426,6 @@ static const struct dc_plane_cap plane_cap = {
 
 static const struct dc_plane_cap underlay_plane_cap = {
.type = DC_PLANE_TYPE_DCE_UNDERLAY,
-   .blends_with_above = true,
.per_pixel_alpha = 1,
 
.pixel_format_support = {
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
index 6bfac8088ab0..2bb8e11f26e0 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_resource.c
@@ -504,8 +504,6 @@ static const struct resource_caps rv2_res_cap = {
 
 static const struct dc_plane_cap plane_cap = {
.type = DC_PLANE_TYPE_DCN_UNIVERSAL,
-   .blends_with_above = true,
-   .blends_with_below = true,
.per_pixel_alpha = true,
 
.pixel_format_support = {
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index 3af24ef9cb2d..00668df0938e 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -670,8 +670,6 @@ static const struct resource_caps res_cap_nv10 = {
 
 static const struct dc_plane_cap plane_cap = {
.type = DC_PLANE_TYPE_DCN_UNIVERSAL,
-   .blends_with_above = true,
-   .blends_with_below = true,
.per_pixel_alpha = true,
 
.pixel_format_support = {
diff --git a/drivers/gpu/drm/amd/display/dc/dcn201/dcn201_resource.c