Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang

2024-09-10 Thread Leo Li

Hi Mikhail,

Can you give this patch a try to see if it helps?
https://gist.github.com/leeonadoh/3271e90ec95d768424c572c970ada743

Thanks,
Leo

On 2024-09-10 11:47, Leo Li wrote:



On 2024-09-08 19:30, Mikhail Gavrilov wrote:

I have done additional tests:
1. The computer does not hang with 6900XT instead the screen flickers
when moving the cursor.
2. The computer does not hang with 7900XTX if I turn off VRR. But the
screen flickers when moving the cursor, as on 6900XT.
To enable VRR, please set 'variable-refresh-rate' in
experimental-features, and in the Display setting, enable Variable
Refresh Rate.
$ gsettings set org.gnome.mutter experimental-features
"['variable-refresh-rate', 'scale-monitor-framebuffer']"
https://postimg.cc/PvXYdvGR


Thanks Mikhail, I think I know what's going on now.

The `scale-monitor-framebuffer` experimental setting is what puts us down the
bad code path. It seems VRR has nothing to do with this issue, just setting
`scale-monitor-framebuffer` is enough to reproduce.

It seems that mutter with this setting is opting for HW scaling rather than GPU
scaling. I see that "Find the Orange Narwhal" sends out a 1080p buffer,
which with this setting, gets directly scanned out and scaled by DCN HW to 4k in
full screen.

An oddity with current gen DCN hardware is that the cursor inherits the scaling
of the HW plane underneath. So if mutter requests a hw cursor with a different
scaling than the game's plane, amdgpu will reject that, and likely force mutter
into SW cursor.

My offending patch changed this behavior by rerouting DCN HW pipes to
accommodate such a configuration. It essentially takes a full-fledged DCN
overlay plane, and uses that just for the cursor, and thereby freeing it from
inheriting things from the underlying hw plane.

My guess is this causes flickering due to how DC (display core driver) handles
updates; it needs all enabled planes in it's update state. However, a KMS cursor
update will only include the cursor plane. It's likely that amdgpu_dm only adds
the dedicated cursor plane to DC's update state, leaving the game's plane out.

The fix isn't exactly trivial. If I don't get anywhere before the fixes window,
I'll send out a revert.

Cheers,
Leo


Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang

2024-09-10 Thread Leo Li




On 2024-09-08 19:30, Mikhail Gavrilov wrote:

I have done additional tests:
1. The computer does not hang with 6900XT instead the screen flickers
when moving the cursor.
2. The computer does not hang with 7900XTX if I turn off VRR. But the
screen flickers when moving the cursor, as on 6900XT.
To enable VRR, please set 'variable-refresh-rate' in
experimental-features, and in the Display setting, enable Variable
Refresh Rate.
$ gsettings set org.gnome.mutter experimental-features
"['variable-refresh-rate', 'scale-monitor-framebuffer']"
https://postimg.cc/PvXYdvGR


Thanks Mikhail, I think I know what's going on now.

The `scale-monitor-framebuffer` experimental setting is what puts us down the
bad code path. It seems VRR has nothing to do with this issue, just setting
`scale-monitor-framebuffer` is enough to reproduce.

It seems that mutter with this setting is opting for HW scaling rather than GPU
scaling. I see that "Find the Orange Narwhal" sends out a 1080p buffer,
which with this setting, gets directly scanned out and scaled by DCN HW to 4k in
full screen.

An oddity with current gen DCN hardware is that the cursor inherits the scaling
of the HW plane underneath. So if mutter requests a hw cursor with a different
scaling than the game's plane, amdgpu will reject that, and likely force mutter
into SW cursor.

My offending patch changed this behavior by rerouting DCN HW pipes to
accommodate such a configuration. It essentially takes a full-fledged DCN
overlay plane, and uses that just for the cursor, and thereby freeing it from
inheriting things from the underlying hw plane.

My guess is this causes flickering due to how DC (display core driver) handles
updates; it needs all enabled planes in it's update state. However, a KMS cursor
update will only include the cursor plane. It's likely that amdgpu_dm only adds
the dedicated cursor plane to DC's update state, leaving the game's plane out.

The fix isn't exactly trivial. If I don't get anywhere before the fixes window,
I'll send out a revert.

Cheers,
Leo


Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang

2024-09-06 Thread Leo Li



Hi Mikhail,

I've tried to align my system with yours as best as I can, but so far, I've had
no luck reproducing the hang. A video of what I'm doing:
https://youtu.be/VeD-LPCnfWM?si=b2baF8MyDBuU4jRH
(Under the hood, the W7900 and 7900xt should be the same)

I have a few suggestions:

First, can you also open an issue on the amd gitlab tracker? It gives more
visibility to others, and makes working together a bit easier:
https://gitlab.freedesktop.org/drm/amd/-/issues

Second, can you try adding "amdgpu.dcdebugmask=0x40" to your kernel cmdline at
boot, and see if you can still repro the hang?

This setting disables hw planes. If it resolves the hang, then it's quite
interesting, because it suggests that gnome may be using direct-scanout via hw
planes. We may need to align our gnome configuration in that case, since I don't
see any additional hw planes being used on my setup.

Third, in case these two issues are related, can you give the attached patch on
this issue thread a try as well?
https://gitlab.freedesktop.org/drm/amd/-/issues/3569#note_2558359

Thanks,
Leo

On 2024-09-05 02:06, Mikhail Gavrilov wrote:

On Thu, Sep 5, 2024 at 4:06 AM Leo Li  wrote:


Can you delete ", new_cursor_state" on that line and try again? Seems to be a
unused variable warning being elevated to an error.



Thanks, I applied both patches and can confirm that this solved the issue.
The first patch was definitely not enough.

Tested-by: Mikhail Gavrilov 



Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang

2024-09-04 Thread Leo Li




On 2024-09-04 18:21, Mikhail Gavrilov wrote:

On Wed, Sep 4, 2024 at 4:15 AM Leo Li  wrote:

Hi Mike,

Super sorry for the ridiculous wait. Your first two emails slipped by my inbox,
which is really silly, given I'm first in the to field...

Thanks for bisecting and finding a free game to reproduce it on. I did not have
luck reproducing this today, but I am on sway and not gnome. While I get gnome
set up, will you be able to test which one of these reverts fixes the hang for
you? Whether just 1/2 is enough, or both 1/2 and 2/2 is required?

I applied them on top of Linus's v6.11-rc6 tag, so hopefully they'll git am
cleanly for you:

1/2:
https://gist.github.com/leeonadoh/69147b5fa8d815b39c5f4c3e005cca28#file-0001-revert-drm-amd-display-move-primary-plane-zpos-highe-patch
2/2:
https://gist.github.com/leeonadoh/69147b5fa8d815b39c5f4c3e005cca28#file-0002-revert-drm-amd-display-introduce-overlay-cursor-mode-patch



The first patch is not enough.
Yes, it fixes the system hang when I launch the game "Find the Orange Narwhal".
But it does not fix the issue completely.
Some RenPy games still can lead the system to hang.
For example "Innocence Or Money Season 1"
https://store.steampowered.com/app/1958390/Innocence_Or_Money_Season_1__Episodes_1_to_3/
on the language selection screen.

Unfortunately the kernel is not builded with both patches.
I have got compilation error after applying second patch:

   CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/fifo/chid.o
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c: In
function ‘amdgpu_dm_atomic_check’:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:11003:69:
error: unused variable ‘new_cursor_state’ [-Werror=unused-variable]
11003 | struct drm_plane_state *old_plane_state,
*new_plane_state, *new_cursor_state;


Can you delete ", new_cursor_state" on that line and try again? Seems to be a
unused variable warning being elevated to an error.

Thanks,
Leo


   |
  ^~~~
   CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/basics/conversion.o
***
   CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.o
cc1: all warnings being treated as errors
   CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/calcs/dcn_calc_auto.o
   CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/gr/ga102.o
   CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/gr/ad102.o
   CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/gr/r535.o
   CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/clk_mgr.o
   CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxnv40.o
   CC [M]  
drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dce60/dce60_clk_mgr.o
make[6]: *** [scripts/Makefile.build:244:
drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.o] Error 1
make[6]: *** Waiting for unfinished jobs
   CC [M]  drivers/gpu/drm/nouveau/nvkm/engine/gr/ctxnv50.o
***
make[5]: *** [scripts/Makefile.build:485: drivers/gpu/drm/amd/amdgpu] Error 2
make[4]: *** [scripts/Makefile.build:485: drivers/gpu/drm] Error 2
make[3]: *** [scripts/Makefile.build:485: drivers/gpu] Error 2
make[2]: *** [scripts/Makefile.build:485: drivers] Error 2
make[1]: *** [/home/mikhail/packaging-work/git/linux-3/Makefile:1925: .] Error 2
make: *** [Makefile:224: __sub-make] Error 2



Re: 6.11/regression/bisected - after commit 1b04dcca4fb1, launching some RenPy games causes computer hang

2024-09-03 Thread Leo Li





On 2024-09-03 02:35, Mikhail Gavrilov wrote:

On Sun, Aug 25, 2024 at 2:12 AM Mikhail Gavrilov
 wrote:


Hi,
Is anyone trying to look into it?
I continue to reproduce this issue on fresh kernel builds 6.11-rc4+.
In addition to the RenPy engine, the problem also reproduces on games
from Ubisoft, such as Far Cry 4.
A very important note that I missed in the first message.
To reproduce the problem, you need to enable scaling in Gnome for
HiDPI monitors.
I am using 4K resolution with 200% of fractional scaling.


Sorry for persistence, but I'm afraid there's no time left to fix this
regression.
There's a week left until the release.
A month later, no one has looked at what the problem is.



Hi Mike,

Super sorry for the ridiculous wait. Your first two emails slipped by my inbox,
which is really silly, given I'm first in the to field...

Thanks for bisecting and finding a free game to reproduce it on. I did not have
luck reproducing this today, but I am on sway and not gnome. While I get gnome
set up, will you be able to test which one of these reverts fixes the hang for
you? Whether just 1/2 is enough, or both 1/2 and 2/2 is required?

I applied them on top of Linus's v6.11-rc6 tag, so hopefully they'll git am
cleanly for you:

1/2:
https://gist.github.com/leeonadoh/69147b5fa8d815b39c5f4c3e005cca28#file-0001-revert-drm-amd-display-move-primary-plane-zpos-highe-patch
2/2:
https://gist.github.com/leeonadoh/69147b5fa8d815b39c5f4c3e005cca28#file-0002-revert-drm-amd-display-introduce-overlay-cursor-mode-patch

Thanks,
Leo


Re: [PATCH 2/2] Revert "drm/amd/display: add panel_power_savings sysfs entry to eDP connectors"

2024-08-07 Thread Leo Li




On 2024-08-07 01:13, Mario Limonciello wrote:

On 8/6/24 13:42, Sebastian Wick wrote:

From: Sebastian Wick 

This reverts commit 63d0b87213a0ba241b3fcfba3fe7b0aed0cd1cc5.

The panel_power_savings sysfs entry can be used to change the displayed
colorimetry which breaks color managed setups.

The "do not break userspace" rule which was violated here is enough
reason to revert this commit.


Hi Sebastian,

The default pane_power_savings sysfs value is 0, which is ABM disabled, so it
wouldn't break colorimetry *by default*. It would break colorimetry only if set
to a non-zero value by the user, or something in userspace.

That said, this sysfs opens a door to "break" colorimetry. But if we make it
such that it can only be set in a user-aware way, then the user can decide
whether they want color accuracy, or power. I don't think that's anything new.
User already decide between setting max backlight for better color, or lower
backlight for better battery. Setting max cpu freq, or capping it. Either can be
seen as breaking.

So I think the issue is really in ppd (power profiles daemon), which afaik is
the only user of this sysfs. It sets a non-zero value by default without the
user being aware.



The bigger problem is that this feature is part of the display chain
which is supposed to be controlled by KMS. This sysfs entry bypasses the
DRM master process and splits control to two independent processes which
do not know about each other. This is what caused the broken user space.
It also causes modesets which can be extremely confusing for the DRM
master process, causing unexpected timings.

We should in general not allow anything other than KMS to control the
display path. If we make an exception to this rule, this must be first
discussed on dri-devel with all the stakeholders approving the
exception.

This has not happened which is the second reason to revert this commit.


I also agree that ABM/backlight related things that affect colorimetry should be
under KMS control. However, from the way things are going, getting there will
take a while, and an interim solution is desired.

We (mostly Mario) proposed a KMS connector property to limit control to KMS as a
way to improve on the sysfs, but that needs more work with compositor folks.
After that, each compositor will need to pipe it through to users. So I think
having something like ppd provide generic support is beneficial in the interim.
It just needs to be 100% opt in.

If we decide to revert this, I think it's worth noting that ABM will be out of
reach for users for a very long while.

On the modeset thing, it's not clear to me why ABM needs a modeset, and I wonder
if it's really necessary. I'll go and find out.

Thanks,
Leo



Signed-off-by: Sebastian Wick 


For anyone who hasn't seen it, there has been a bunch of discussions that have 
transpired on this topic and what to do about it on [1] as well as some other 
linked places on that bug.


Also FWIW there was a discussion on the merits of the sysfs file on dri-devel 
during the initial patch submission [2].


If this revert ends up going through, please also revert 
0887054d14ae23061e28e28747cdea7e40be9224 in the same series so the feature can 
"at least" be accessed by the compositor and changed at runtime like the sysfs 
file had allowed.


[1] https://gitlab.freedesktop.org/upower/power-profiles-daemon/-/issues/159
[2] 
https://lore.kernel.org/dri-devel/20240202152837.7388-1-hamza.mahf...@amd.com/


---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 80 ---
  1 file changed, 80 deletions(-)

diff --git ./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
../drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index 4d4c75173fc3..16c9051d9ccf 100644
--- ./drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ ../drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6772,82 +6772,10 @@ int amdgpu_dm_connector_atomic_get_property(struct 
drm_connector *connector,

  return ret;
  }
-/**
- * DOC: panel power savings
- *
- * The display manager allows you to set your desired **panel power savings**
- * level (between 0-4, with 0 representing off), e.g. using the following::
- *
- *   # echo 3 > /sys/class/drm/card0-eDP-1/amdgpu/panel_power_savings
- *
- * Modifying this value can have implications on color accuracy, so tread
- * carefully.
- */
-
-static ssize_t panel_power_savings_show(struct device *device,
-    struct device_attribute *attr,
-    char *buf)
-{
-    struct drm_connector *connector = dev_get_drvdata(device);
-    struct drm_device *dev = connector->dev;
-    u8 val;
-
-    drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);
-    val = to_dm_connector_state(connector->state)->abm_level ==
-    ABM_LEVEL_IMMEDIATE_DISABLE ? 0 :
-    to_dm_connector_state(connector->state)->abm_level;
-    drm_modeset_unlock(&dev->mode_config.connection_mutex);
-
-    return sysfs_emit(buf, "%u\n", val);
-}
-
-static ssize_t panel_powe

Re: [PATCH v4 2/2] drm/amd: Add power_saving_policy drm property to eDP connectors

2024-07-10 Thread Leo Li




On 2024-07-03 01:17, Mario Limonciello wrote:

When the `power_saving_policy` property is set to bit mask
"Require color accuracy" ABM should be disabled immediately and
any requests by sysfs to update will return an -EBUSY error.

When the `power_saving_policy` property is set to bit mask
"Require low latency" PSR should be disabled.

When the property is restored to an empty bit mask ABM and PSR
can be enabled again.

Signed-off-by: Mario Limonciello 


Reviewed-by: Leo Li 

Thanks!


---
v3->v4:
  * Fix enabling again after disable (Xaver)
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  4 ++
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 50 +--
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  2 +
  3 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 3ecc7ef95172..cfb5220cf182 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1350,6 +1350,10 @@ int amdgpu_display_modeset_create_props(struct 
amdgpu_device *adev)
 "dither",
 amdgpu_dither_enum_list, sz);
  
+	if (adev->dc_enabled)

+   drm_mode_create_power_saving_policy_property(adev_to_drm(adev),
+
DRM_MODE_POWER_SAVING_POLICY_ALL);
+
return 0;
  }
  
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index f866a02f4f48..34da987350f5 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6417,6 +6417,13 @@ int amdgpu_dm_connector_atomic_set_property(struct 
drm_connector *connector,
} else if (property == adev->mode_info.underscan_property) {
dm_new_state->underscan_enable = val;
ret = 0;
+   } else if (property == dev->mode_config.power_saving_policy) {
+   dm_new_state->abm_forbidden = val & 
DRM_MODE_REQUIRE_COLOR_ACCURACY;
+   dm_new_state->abm_level = (dm_new_state->abm_forbidden || 
!dm_old_state->abm_level) ?
+   ABM_LEVEL_IMMEDIATE_DISABLE :
+   dm_old_state->abm_level;
+   dm_new_state->psr_forbidden = val & 
DRM_MODE_REQUIRE_LOW_LATENCY;
+   ret = 0;
}
  
  	return ret;

@@ -6459,6 +6466,13 @@ int amdgpu_dm_connector_atomic_get_property(struct 
drm_connector *connector,
} else if (property == adev->mode_info.underscan_property) {
*val = dm_state->underscan_enable;
ret = 0;
+   } else if (property == dev->mode_config.power_saving_policy) {
+   *val = 0;
+   if (dm_state->psr_forbidden)
+   *val |= DRM_MODE_REQUIRE_LOW_LATENCY;
+   if (dm_state->abm_forbidden)
+   *val |= DRM_MODE_REQUIRE_COLOR_ACCURACY;
+   ret = 0;
}
  
  	return ret;

@@ -6485,9 +6499,12 @@ static ssize_t panel_power_savings_show(struct device 
*device,
u8 val;
  
  	drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);

-   val = to_dm_connector_state(connector->state)->abm_level ==
-   ABM_LEVEL_IMMEDIATE_DISABLE ? 0 :
-   to_dm_connector_state(connector->state)->abm_level;
+   if (to_dm_connector_state(connector->state)->abm_forbidden)
+   val = 0;
+   else
+   val = to_dm_connector_state(connector->state)->abm_level ==
+   ABM_LEVEL_IMMEDIATE_DISABLE ? 0 :
+   to_dm_connector_state(connector->state)->abm_level;
drm_modeset_unlock(&dev->mode_config.connection_mutex);
  
  	return sysfs_emit(buf, "%u\n", val);

@@ -6511,10 +6528,16 @@ static ssize_t panel_power_savings_store(struct device 
*device,
return -EINVAL;
  
  	drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);

-   to_dm_connector_state(connector->state)->abm_level = val ?:
-   ABM_LEVEL_IMMEDIATE_DISABLE;
+   if (to_dm_connector_state(connector->state)->abm_forbidden)
+   ret = -EBUSY;
+   else
+   to_dm_connector_state(connector->state)->abm_level = val ?:
+   ABM_LEVEL_IMMEDIATE_DISABLE;
drm_modeset_unlock(&dev->mode_config.connection_mutex);
  
+	if (ret)

+   return ret;
+
drm_kms_helper_hotplug_event(dev);
  
  	return count;

@@ -7685,6 +7708,13 @@ void amdgpu_dm_connector_init_helper(struct 
amdgpu_display_manager *dm,
aconnector->base.state->max_bpc = 16;
a

Re: [PATCH v3 2/2] drm/amd: Add power_saving_policy drm property to eDP connectors

2024-06-18 Thread Leo Li




On 2024-06-05 22:04, Mario Limonciello wrote:

When the `power_saving_policy` property is set to bit mask
"Require color accuracy" ABM should be disabled immediately and
any requests by sysfs to update will return an -EBUSY error.

When the `power_saving_policy` property is set to bit mask
"Require low latency" PSR should be disabled.

When the property is restored to an empty bit mask the previous
value of ABM and pSR will be restored.

Signed-off-by: Mario Limonciello 

Reviewed-by: Leo Li 

Thanks!


---
v2->v3:
  * Use `disallow_edp_enter_psr` instead
  * Drop case in dm_update_crtc_state()
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  4 ++
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 50 +--
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  2 +
  3 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 3ecc7ef95172..cfb5220cf182 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1350,6 +1350,10 @@ int amdgpu_display_modeset_create_props(struct 
amdgpu_device *adev)
 "dither",
 amdgpu_dither_enum_list, sz);
  
+	if (adev->dc_enabled)

+   drm_mode_create_power_saving_policy_property(adev_to_drm(adev),
+
DRM_MODE_POWER_SAVING_POLICY_ALL);
+
return 0;
  }
  
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index f1d67c6f4b98..5fd7210b2479 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6436,6 +6436,13 @@ int amdgpu_dm_connector_atomic_set_property(struct 
drm_connector *connector,
} else if (property == adev->mode_info.underscan_property) {
dm_new_state->underscan_enable = val;
ret = 0;
+   } else if (property == dev->mode_config.power_saving_policy) {
+   dm_new_state->abm_forbidden = val & 
DRM_MODE_REQUIRE_COLOR_ACCURACY;
+   dm_new_state->abm_level = (dm_new_state->abm_forbidden || 
!amdgpu_dm_abm_level) ?
+   ABM_LEVEL_IMMEDIATE_DISABLE :
+   amdgpu_dm_abm_level;
+   dm_new_state->psr_forbidden = val & 
DRM_MODE_REQUIRE_LOW_LATENCY;
+   ret = 0;
}
  
  	return ret;

@@ -6478,6 +6485,13 @@ int amdgpu_dm_connector_atomic_get_property(struct 
drm_connector *connector,
} else if (property == adev->mode_info.underscan_property) {
*val = dm_state->underscan_enable;
ret = 0;
+   } else if (property == dev->mode_config.power_saving_policy) {
+   *val = 0;
+   if (dm_state->psr_forbidden)
+   *val |= DRM_MODE_REQUIRE_LOW_LATENCY;
+   if (dm_state->abm_forbidden)
+   *val |= DRM_MODE_REQUIRE_COLOR_ACCURACY;
+   ret = 0;
}
  
  	return ret;

@@ -6504,9 +6518,12 @@ static ssize_t panel_power_savings_show(struct device 
*device,
u8 val;
  
  	drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);

-   val = to_dm_connector_state(connector->state)->abm_level ==
-   ABM_LEVEL_IMMEDIATE_DISABLE ? 0 :
-   to_dm_connector_state(connector->state)->abm_level;
+   if (to_dm_connector_state(connector->state)->abm_forbidden)
+   val = 0;
+   else
+   val = to_dm_connector_state(connector->state)->abm_level ==
+   ABM_LEVEL_IMMEDIATE_DISABLE ? 0 :
+   to_dm_connector_state(connector->state)->abm_level;
drm_modeset_unlock(&dev->mode_config.connection_mutex);
  
  	return sysfs_emit(buf, "%u\n", val);

@@ -6530,10 +6547,16 @@ static ssize_t panel_power_savings_store(struct device 
*device,
return -EINVAL;
  
  	drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);

-   to_dm_connector_state(connector->state)->abm_level = val ?:
-   ABM_LEVEL_IMMEDIATE_DISABLE;
+   if (to_dm_connector_state(connector->state)->abm_forbidden)
+   ret = -EBUSY;
+   else
+   to_dm_connector_state(connector->state)->abm_level = val ?:
+   ABM_LEVEL_IMMEDIATE_DISABLE;
drm_modeset_unlock(&dev->mode_config.connection_mutex);
  
+	if (ret)

+   return ret;
+
drm_kms_helper_hotplug_event(dev);
  
  	return count;

@@ -7704,6 +7727,13 @@ void amdgpu_dm_connector_init_helper(struct 
amdgpu_display_manager *dm,
aconnec

Re: [PATCH v2 4/4] tests/amdgpu/amd_psr: Add support for `power saving policy` property

2024-06-18 Thread Leo Li




On 2024-05-22 18:08, Mario Limonciello wrote:

Verify that the property has disabled PSR
---
  tests/amdgpu/amd_psr.c | 74 ++
  1 file changed, 74 insertions(+)

diff --git a/tests/amdgpu/amd_psr.c b/tests/amdgpu/amd_psr.c
index 9da161a09..a9f4a6aa5 100644
--- a/tests/amdgpu/amd_psr.c
+++ b/tests/amdgpu/amd_psr.c
@@ -338,6 +338,78 @@ static void run_check_psr(data_t *data, bool 
test_null_crtc) {
test_fini(data);
  }
  
+static void psr_forbidden(data_t *data)

+{
+   int edp_idx, ret, i, psr_state;
+   igt_fb_t ref_fb, ref_fb2;
+   igt_fb_t *flip_fb;
+   igt_output_t *output;
+
+   test_init(data);
+
+   edp_idx = check_conn_type(data, DRM_MODE_CONNECTOR_eDP);
+   igt_skip_on_f(edp_idx == -1, "no eDP connector found\n");
+
+   /* check if eDP support PSR1, PSR-SU.
+*/
+   igt_skip_on(!psr_mode_supported(data, PSR_MODE_1) && 
!psr_mode_supported(data, PSR_MODE_2));
+   for_each_connected_output(&data->display, output) {
+
+   if (output->config.connector->connector_type != 
DRM_MODE_CONNECTOR_eDP)
+   continue;
+   ret = clear_power_saving_policy(data->fd, output);
+   if (ret == -ENODEV) {
+   igt_skip("No power saving policy prop\n");
+   return;
+   }
+   igt_require(ret == 0);
+
+   /* try to engage PSR */
+   ret = set_panel_power_saving_policy(data->fd, output, 
DRM_MODE_REQUIRE_LOW_LATENCY);
+   igt_assert_eq(ret, 0);
+
+   igt_create_color_fb(data->fd, data->mode->hdisplay,
+   data->mode->vdisplay, DRM_FORMAT_XRGB, 
0, 1.0,
+   0.0, 0.0, &ref_fb);
+   igt_create_color_fb(data->fd, data->mode->hdisplay,
+   data->mode->vdisplay, DRM_FORMAT_XRGB, 
0, 0.0,
+   1.0, 0.0, &ref_fb2);
+
+   igt_plane_set_fb(data->primary, &ref_fb);
+
+   igt_display_commit_atomic(&data->display, 
DRM_MODE_ATOMIC_ALLOW_MODESET, 0);
+
+   for (i = 0; i < N_FLIPS; i++) {
+   if (i % 2 == 0)
+   flip_fb = &ref_fb2;
+   else
+   flip_fb = &ref_fb;
+
+   ret = drmModePageFlip(data->fd, 
output->config.crtc->crtc_id,
+ flip_fb->fb_id, 
DRM_MODE_PAGE_FLIP_EVENT, NULL);
+   igt_require(ret == 0);
+   kmstest_wait_for_pageflip(data->fd);
+   }
+
+   /* PSR state takes some time to settle its value on static 
screen */
+   sleep(PSR_SETTLE_DELAY);
+
+   psr_state =  igt_amd_read_psr_state(data->fd, output->name);
+   igt_require(psr_state == PSR_STATE3);


igt_fail_on* or igt_assert* should be used here, igt_require simply 'skips' the
test if the condition evaluates to false.

Should we be instead asserting psr_state == PSR_STATE0 here for disabled, since
we've set REQUIRE_LOW_LATENCY?

I think as part of this test, we can also check that PSR re-enables after
clearing the power saving policy. Something like

ret = clear_power_saving_policy(data->fd, output);
... do some flipping ...
sleep(PSR_SETTLE_DELAY);
psr_state = igt_amd_read_psr_state(data->fd, output->name);
igt_assert_f(psr_state == PSR_STATE3,
 "Panel not in PSR after clearing power saving policy\n");

Thanks,
Leo


+
+   igt_remove_fb(data->fd, &ref_fb);
+   igt_remove_fb(data->fd, &ref_fb2);
+
+   ret = clear_power_saving_policy(data->fd, output);
+   if (ret == -ENODEV) {
+   igt_skip("No power saving policy prop\n");
+   return;
+   }
+   igt_require(ret == 0);
+
+   }
+}
+
  static void run_check_psr_su_mpo(data_t *data, bool scaling, float 
scaling_ratio)
  {
int edp_idx = check_conn_type(data, DRM_MODE_CONNECTOR_eDP);
@@ -756,6 +828,8 @@ igt_main_args("", long_options, help_str, opt_handler, NULL)
igt_describe("Test to validate PSR SU enablement with Visual Confirm "
 "and to validate PSR SU disable/re-enable w/ primary scaling 
ratio 0.75");
igt_subtest("psr_su_mpo_scaling_0_75") run_check_psr_su_mpo(&data, 
true, .75);
+   igt_describe("Test whether PSR can be forbidden");
+   igt_subtest("psr_forbidden") psr_forbidden(&data);
  
  	igt_fixture

{


Re: [PATCH v2 3/4] tests/amdgpu/amd_abm: Add support for panel_power_saving property

2024-06-18 Thread Leo Li



Thanks for the tests! FYI IGT patches should also cc 
igt-...@lists.freedesktop.org

Some comments inline:

On 2024-05-22 18:08, Mario Limonciello wrote:

From: Mario Limonciello 

When the "panel power saving" property is set to forbidden the
compositor has indicated that userspace prefers to have color
accuracy and fidelity instead of power saving.

Verify that the sysfs file behaves as expected in this situation.

Signed-off-by: Mario Limonciello 
---
  tests/amdgpu/amd_abm.c | 39 +++
  1 file changed, 39 insertions(+)

diff --git a/tests/amdgpu/amd_abm.c b/tests/amdgpu/amd_abm.c
index f74c3012c..3fa1366fa 100644
--- a/tests/amdgpu/amd_abm.c
+++ b/tests/amdgpu/amd_abm.c
@@ -365,6 +365,43 @@ static void abm_gradual(data_t *data)
}
  }
  
+

+static void abm_forbidden(data_t *data)
+{
+   igt_output_t *output;
+   enum pipe pipe;
+   int target, r;
+
+   for_each_pipe_with_valid_output(&data->display, pipe, output) {
+   if (output->config.connector->connector_type != 
DRM_MODE_CONNECTOR_eDP)
+   continue;
+
+   r = clear_power_saving_policy(data->drm_fd, output);
+   if (r == -ENODEV) {
+   igt_skip("No power saving policy prop\n");
+   return;
+   }
+   igt_assert_eq(r, 0);
+
+   target = 3;
+   r = set_abm_level(data, output, target);
+   igt_assert_eq(r, 0);
+
+   r = set_panel_power_saving_policy(data->drm_fd, output, 
DRM_MODE_REQUIRE_COLOR_ACCURACY);
+   igt_assert_eq(r, 0);
+
+   target = 0;


Is there a purpose of setting target abm to 0 (disabled) here?

I suppose it should fail given that we've set REQUIRE_COLOR_ACCURACY. Though I'm
not sure why we can't keep target = 3.

Thanks,
Leo


+   r = set_abm_level(data, output, target);
+   igt_assert_eq(r, -1);
+
+   r = clear_power_saving_policy(data->drm_fd, output);
+   igt_assert_eq(r, 0);
+
+   r = set_abm_level(data, output, target);
+   igt_assert_eq(r, 0);
+   }
+}
+
  igt_main
  {
data_t data = {};
@@ -393,6 +430,8 @@ igt_main
abm_enabled(&data);
igt_subtest("abm_gradual")
abm_gradual(&data);
+   igt_subtest("abm_forbidden")
+   abm_forbidden(&data);
  
  	igt_fixture {

igt_display_fini(&data.display);


Re: [PATCH v2 1/2] drm: Introduce 'power saving policy' drm property

2024-06-04 Thread Leo Li




On 2024-05-22 18:05, Mario Limonciello wrote:

The `power saving policy` DRM property is an optional property that
can be added to a connector by a driver.

This property is for compositors to indicate intent of policy of
whether a driver can use power saving features that may compromise
the experience intended by the compositor.

Signed-off-by: Mario Limonciello 


Acked-by: Leo Li 

Thanks!

---
  drivers/gpu/drm/drm_connector.c | 46 +
  include/drm/drm_connector.h |  2 ++
  include/drm/drm_mode_config.h   |  5 
  include/uapi/drm/drm_mode.h |  7 +
  4 files changed, 60 insertions(+)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 4d2df7f64dc5..088a5874c7a4 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -961,6 +961,11 @@ static const struct drm_prop_enum_list 
drm_scaling_mode_enum_list[] = {
{ DRM_MODE_SCALE_ASPECT, "Full aspect" },
  };
  
+static const struct drm_prop_enum_list drm_power_saving_policy_enum_list[] = {

+   { __builtin_ffs(DRM_MODE_REQUIRE_COLOR_ACCURACY) - 1, "Require color 
accuracy" },
+   { __builtin_ffs(DRM_MODE_REQUIRE_LOW_LATENCY) - 1, "Require low 
latency" },
+};
+
  static const struct drm_prop_enum_list drm_aspect_ratio_enum_list[] = {
{ DRM_MODE_PICTURE_ASPECT_NONE, "Automatic" },
{ DRM_MODE_PICTURE_ASPECT_4_3, "4:3" },
@@ -1499,6 +1504,16 @@ static const u32 dp_colorspaces =
   *
   *Drivers can set up these properties by calling
   *drm_mode_create_tv_margin_properties().
+ * power saving policy:
+ * This property is used to set the power saving policy for the connector.
+ * This property is populated with a bitmask of optional requirements set
+ * by the drm master for the drm driver to respect:
+ * - "Require color accuracy": Disable power saving features that will
+ *   affect color fidelity.
+ *   For example: Hardware assisted backlight modulation.
+ * - "Require low latency": Disable power saving features that will
+ *   affect latency.
+ *   For example: Panel self refresh (PSR)
   */
  
  int drm_connector_create_standard_properties(struct drm_device *dev)

@@ -1963,6 +1978,37 @@ int drm_mode_create_scaling_mode_property(struct 
drm_device *dev)
  }
  EXPORT_SYMBOL(drm_mode_create_scaling_mode_property);
  
+/**

+ * drm_mode_create_power_saving_policy_property - create power saving policy 
property
+ * @dev: DRM device
+ * @supported_policies: bitmask of supported power saving policies
+ *
+ * Called by a driver the first time it's needed, must be attached to desired
+ * connectors.
+ *
+ * Returns: %0
+ */
+int drm_mode_create_power_saving_policy_property(struct drm_device *dev,
+uint64_t supported_policies)
+{
+   struct drm_property *power_saving;
+
+   if (dev->mode_config.power_saving_policy)
+   return 0;
+   WARN_ON((supported_policies & DRM_MODE_POWER_SAVING_POLICY_ALL) == 0);
+
+   power_saving =
+   drm_property_create_bitmask(dev, 0, "power saving policy",
+   drm_power_saving_policy_enum_list,
+   
ARRAY_SIZE(drm_power_saving_policy_enum_list),
+   supported_policies);
+
+   dev->mode_config.power_saving_policy = power_saving;
+
+   return 0;
+}
+EXPORT_SYMBOL(drm_mode_create_power_saving_policy_property);
+
  /**
   * DOC: Variable refresh properties
   *
diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
index fe88d7fc6b8f..b0ec2ec48de7 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -2025,6 +2025,8 @@ int drm_mode_create_dp_colorspace_property(struct 
drm_connector *connector,
   u32 supported_colorspaces);
  int drm_mode_create_content_type_property(struct drm_device *dev);
  int drm_mode_create_suggested_offset_properties(struct drm_device *dev);
+int drm_mode_create_power_saving_policy_property(struct drm_device *dev,
+uint64_t supported_policies);
  
  int drm_connector_set_path_property(struct drm_connector *connector,

const char *path);
diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
index 973119a9176b..32077e701e2f 100644
--- a/include/drm/drm_mode_config.h
+++ b/include/drm/drm_mode_config.h
@@ -954,6 +954,11 @@ struct drm_mode_config {
 */
struct drm_atomic_state *suspend_state;
  
+	/**

+* @power_saving_policy: bitmask for power saving policy requests.
+*/
+   struct drm_property *power_saving_policy;
+
const struct drm_mode_config_helper_funcs *helper_private;
  };
  
diff --git a/inc

Re: [PATCH v2 2/2] drm/amd: Add power_saving_policy drm property to eDP connectors

2024-06-04 Thread Leo Li



On 2024-05-22 18:05, Mario Limonciello wrote:

When the `power_saving_policy` property is set to bit mask
"Require color accuracy" ABM should be disabled immediately and
any requests by sysfs to update will return an -EBUSY error.

When the `power_saving_policy` property is set to bit mask
"Require low latency" PSR should be disabled.

When the property is restored to an empty bit mask the previous
value of ABM and pSR will be restored.

Signed-off-by: Mario Limonciello 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   |  4 ++
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 51 +--
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  2 +
  3 files changed, 52 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 3ecc7ef95172..cfb5220cf182 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1350,6 +1350,10 @@ int amdgpu_display_modeset_create_props(struct 
amdgpu_device *adev)
 "dither",
 amdgpu_dither_enum_list, sz);
  
+	if (adev->dc_enabled)

+   drm_mode_create_power_saving_policy_property(adev_to_drm(adev),
+
DRM_MODE_POWER_SAVING_POLICY_ALL);
+
return 0;
  }
  
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index 01b0a331e4a6..a480e86892de 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6421,6 +6421,13 @@ int amdgpu_dm_connector_atomic_set_property(struct 
drm_connector *connector,
} else if (property == adev->mode_info.underscan_property) {
dm_new_state->underscan_enable = val;
ret = 0;
+   } else if (property == dev->mode_config.power_saving_policy) {
+   dm_new_state->abm_forbidden = val & 
DRM_MODE_REQUIRE_COLOR_ACCURACY;
+   dm_new_state->abm_level = (dm_new_state->abm_forbidden || 
!amdgpu_dm_abm_level) ?
+   ABM_LEVEL_IMMEDIATE_DISABLE :
+   amdgpu_dm_abm_level;
+   dm_new_state->psr_forbidden = val & 
DRM_MODE_REQUIRE_LOW_LATENCY;
+   ret = 0;
}
  
  	return ret;

@@ -6463,6 +6470,13 @@ int amdgpu_dm_connector_atomic_get_property(struct 
drm_connector *connector,
} else if (property == adev->mode_info.underscan_property) {
*val = dm_state->underscan_enable;
ret = 0;
+   } else if (property == dev->mode_config.power_saving_policy) {
+   *val = 0;
+   if (dm_state->psr_forbidden)
+   *val |= DRM_MODE_REQUIRE_LOW_LATENCY;
+   if (dm_state->abm_forbidden)
+   *val |= DRM_MODE_REQUIRE_COLOR_ACCURACY;
+   ret = 0;
}
  
  	return ret;

@@ -6489,9 +6503,12 @@ static ssize_t panel_power_savings_show(struct device 
*device,
u8 val;
  
  	drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);

-   val = to_dm_connector_state(connector->state)->abm_level ==
-   ABM_LEVEL_IMMEDIATE_DISABLE ? 0 :
-   to_dm_connector_state(connector->state)->abm_level;
+   if (to_dm_connector_state(connector->state)->abm_forbidden)
+   val = 0;
+   else
+   val = to_dm_connector_state(connector->state)->abm_level ==
+   ABM_LEVEL_IMMEDIATE_DISABLE ? 0 :
+   to_dm_connector_state(connector->state)->abm_level;
drm_modeset_unlock(&dev->mode_config.connection_mutex);
  
  	return sysfs_emit(buf, "%u\n", val);

@@ -6515,10 +6532,16 @@ static ssize_t panel_power_savings_store(struct device 
*device,
return -EINVAL;
  
  	drm_modeset_lock(&dev->mode_config.connection_mutex, NULL);

-   to_dm_connector_state(connector->state)->abm_level = val ?:
-   ABM_LEVEL_IMMEDIATE_DISABLE;
+   if (to_dm_connector_state(connector->state)->abm_forbidden)
+   ret = -EBUSY;
+   else
+   to_dm_connector_state(connector->state)->abm_level = val ?:
+   ABM_LEVEL_IMMEDIATE_DISABLE;
drm_modeset_unlock(&dev->mode_config.connection_mutex);
  
+	if (ret)

+   return ret;
+
drm_kms_helper_hotplug_event(dev);
  
  	return count;

@@ -7689,6 +7712,13 @@ void amdgpu_dm_connector_init_helper(struct 
amdgpu_display_manager *dm,
aconnector->base.state->max_bpc = 16;
aconnector->base.state->max_requested_bpc = 
aconnector->base.state->max_bpc;
  
+	if (connector_type == DRM_MODE_CONNECTOR_eDP &&

+   (dc_is_dmcu_initialized(adev->dm.dc) ||
+adev->dm.dc->ctx->dmub_srv)) {
+   drm_object_attach_property(&ac

Re: [PATCH 0/2] Add support for Panel Power Savings property

2024-05-21 Thread Leo Li





On 2024-05-21 13:32, Mario Limonciello wrote:

On 5/21/2024 12:27, Leo Li wrote:



On 2024-05-21 12:21, Mario Limonciello wrote:

On 5/21/2024 11:14, Xaver Hugl wrote:

Am Di., 21. Mai 2024 um 16:00 Uhr schrieb Mario Limonciello
:


On 5/21/2024 08:43, Simon Ser wrote:

This makes sense to me in general. I like the fact that it's simple and
vendor-neutral.

Do we want to hardcode "panel" in the name? Are we sure that this will
ever only apply to panels?

Do we want to use a name which reflects the intent, rather than the
mechanism? In other words, something like "color fidelity" = "preferred"
maybe? (I don't know, just throwing ideas around.)


In that vein, how about:

"power saving policy"
--> "power saving"
--> "color fidelity"


It's not just about colors though, is it? The compositor might want to
disable it to increase the backlight brightness in bright
environments, so "color fidelity" doesn't really sound right


Either of these better?

"power saving policy"
--> "power saving"
--> "accuracy"

"power saving policy"
--> "allowed"
--> "forbidden"

Or any other idea?


Another consideration in addition to accuracy is latency.

I suppose a compositor may want to disable features such as PSR for use-cases 
requiring low latency. Touch and stylus input are some examples.


I wonder if flags would work better than enums? A compositor can set something 
like `REQUIRE_ACCURACY & REQUIRE_LOW_LATENCY`, for example.




I thought we had said the PSR latency issue discussed at the hackfest was likely 
just a bug and that nominally it won't matter?


Ah, either my memory failed, or I failed to clarify. Let me fix that below.



If it really will matter enough then yeah I think you're right that flags would 
be better.  More drivers would probably want to opt into the property for the 
purpose of turning off PSR/PSR2/panel replay as well then.


Latency technically shouldn't be a problem for Panel Replay, but is a problem
for PSR/PSR2 due to their design. At least it is a problem for amdgpu, not sure
about other drivers.

FWIU, in short, when a panel (DPRX) goes into self-refresh, it's clock can drift
from the gpu (DPTX). Since there's no resync mechanism in PSR/PSR2, it is
possible for the panel to lag behind by 1-frame-time(ms) for a noticeable period
of time. Panel replay has a resync mechanism, so it theoretically can resync
within 1 frame.

Since, fwict, replay panels are still rare, I think it'll be nice for
compositors that care about latency to have the option to temporarily disable
PSR/PSR2.

- Leo




- Leo







Would be nice to add documentation for the property in the "standard
connector properties" section.


Ack.







Re: [PATCH 0/2] Add support for Panel Power Savings property

2024-05-21 Thread Leo Li




On 2024-05-21 12:21, Mario Limonciello wrote:

On 5/21/2024 11:14, Xaver Hugl wrote:

Am Di., 21. Mai 2024 um 16:00 Uhr schrieb Mario Limonciello
:


On 5/21/2024 08:43, Simon Ser wrote:

This makes sense to me in general. I like the fact that it's simple and
vendor-neutral.

Do we want to hardcode "panel" in the name? Are we sure that this will
ever only apply to panels?

Do we want to use a name which reflects the intent, rather than the
mechanism? In other words, something like "color fidelity" = "preferred"
maybe? (I don't know, just throwing ideas around.)


In that vein, how about:

"power saving policy"
--> "power saving"
--> "color fidelity"


It's not just about colors though, is it? The compositor might want to
disable it to increase the backlight brightness in bright
environments, so "color fidelity" doesn't really sound right


Either of these better?

"power saving policy"
--> "power saving"
--> "accuracy"

"power saving policy"
--> "allowed"
--> "forbidden"

Or any other idea?


Another consideration in addition to accuracy is latency.

I suppose a compositor may want to disable features such as PSR for use-cases 
requiring low latency. Touch and stylus input are some examples.


I wonder if flags would work better than enums? A compositor can set something 
like `REQUIRE_ACCURACY & REQUIRE_LOW_LATENCY`, for example.


- Leo







Would be nice to add documentation for the property in the "standard
connector properties" section.


Ack.





Re: [PATCH 0/2] drm/amdgpu/display: Make multi-plane configurations more flexible

2024-04-17 Thread Leo Li





On 2024-04-16 10:10, Harry Wentland wrote:



On 2024-04-16 04:01, Pekka Paalanen wrote:

On Mon, 15 Apr 2024 18:33:39 -0400
Leo Li  wrote:


On 2024-04-15 04:19, Pekka Paalanen wrote:

On Fri, 12 Apr 2024 16:14:28 -0400
Leo Li  wrote:
   

On 2024-04-12 11:31, Alex Deucher wrote:

On Fri, Apr 12, 2024 at 11:08 AM Pekka Paalanen
 wrote:


On Fri, 12 Apr 2024 10:28:52 -0400
Leo Li  wrote:
 

On 2024-04-12 04:03, Pekka Paalanen wrote:

On Thu, 11 Apr 2024 16:33:57 -0400
Leo Li  wrote:
 


...
 

That begs the question of what can be nailed down and what can left to
independent implementation. I guess things like which plane should be enabled
first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed)
can be defined. How to handle atomic test failures could be as well.


What room is there for the interpretation of zpos values?

I thought they are unambiguous already: only the relative numerical
order matters, and that uniquely defines the KMS plane ordering.


The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way
for vendors to communicate overlay, underlay, or mixed-arrangement support. I
don't think allowing OVERLAYs to be placed under the PRIMARY is currently
documented as a way to support underlay.


I always thought it's obvious that the zpos numbers dictate the plane
order without any other rules. After all, we have the universal planes
concept, where the plane type is only informational to aid heuristics
rather than defining anything.

Only if the zpos property does not exist, the plane types would come
into play.

Of course, if there actually exists userspace that fails if zpos allows
an overlay type plane to be placed below primary, or fails if primary
zpos is not zero, then DRM needs a new client cap.


Right, it wasn't immediately clear to me that the API allowed placement of
things beneath the PRIMARY. But reading the docs for drm_plane_create_zpos*,
there's nothing that forbids it.
  
 

libliftoff for example, assumes that the PRIMARY has the lowest zpos. So
underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY
for the underlay view.


That's totally ok. It works, right? Plane type does not matter if the
KMS driver accepts the configuration.

What is a "scanout plane"? Aren't all KMS planes by definition scanout
planes?


Pardon my terminology, I thought the scanout plane was where weston rendered
non-offloadable surfaces to. I guess it's more correct to call it the "render
plane". On weston, it seems to be always assigned to the PRIMARY.
  


The assignment restriction is just technical design debt. It is
limiting. There is no other good reason for it, than when lighting
up a CRTC for the first time, Weston should do it with the renderer FB
only, on the plane that is most likely to succeed i.e. PRIMARY. After
the CRTC is lit, there should be no built-in limitations in what can go
where.

The reason for this is that if a CRTC can be activated, it must always
be able to show the renderer FB without incurring a modeset. This is
important for ensuring that the fallback compositing (renderer) is
always possible. So we start with that configuration, and everything
else is optional bonus.


Genuinely curious - What exactly is limiting with keeping the renderer FB on
PRIMARY? IOW, what is the additional benefit of placing the renderer FB on
something other than PRIMARY?


The limitations come from a combination of hardware limitations.
Perhaps zpos is not mutable, or maybe other planes cannot arbitrarily
move between above and below the primary. This reduces the number of
possible configurations, which might cause off-loading to fail.

I think older hardware has more of these arbitrary restrictions.


I see. I was thinking that drivers can do under-the-hood stuff to present a
mutable zpos to clients, even if their hardware planes cannot be arbitrarily
rearranged, by mapping the PRIMARY to a different hardware plane. But not all
planes have the same function, so this sounds more complicated than helpful.





For libliftoff, using OVERLAYs as the render plane and PRIMARY as the underlay
plane would work. But I think keeping the render plane on PRIMARY (a la weston)
makes underlay arrangements easier to allocate, and would be nice to incorporate
into a shared algorithm.


If zpos exists, I don't think such limitation is a good idea. It will
just limit the possible configurations for no reason.

With zpos, the KMS plane type should be irrelevant for their
z-ordering. Underlay vs. overlay completely loses its meaning at the
KMS level.


Right, the plane types loose their meanings. But at least with the way
libliftoff builds the plane arrangement, where we first allocate the renderer fb
matters.

libliftoff incrementally builds the atomic state by adding a single plane to the
atomic state, then testing it. It essentially does a depth-first-search of all
p

Re: [PATCH 0/2] drm/amdgpu/display: Make multi-plane configurations more flexible

2024-04-15 Thread Leo Li





On 2024-04-15 04:19, Pekka Paalanen wrote:

On Fri, 12 Apr 2024 16:14:28 -0400
Leo Li  wrote:


On 2024-04-12 11:31, Alex Deucher wrote:

On Fri, Apr 12, 2024 at 11:08 AM Pekka Paalanen
 wrote:


On Fri, 12 Apr 2024 10:28:52 -0400
Leo Li  wrote:
  

On 2024-04-12 04:03, Pekka Paalanen wrote:

On Thu, 11 Apr 2024 16:33:57 -0400
Leo Li  wrote:
  


...
  

That begs the question of what can be nailed down and what can left to
independent implementation. I guess things like which plane should be enabled
first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed)
can be defined. How to handle atomic test failures could be as well.


What room is there for the interpretation of zpos values?

I thought they are unambiguous already: only the relative numerical
order matters, and that uniquely defines the KMS plane ordering.


The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way
for vendors to communicate overlay, underlay, or mixed-arrangement support. I
don't think allowing OVERLAYs to be placed under the PRIMARY is currently
documented as a way to support underlay.


I always thought it's obvious that the zpos numbers dictate the plane
order without any other rules. After all, we have the universal planes
concept, where the plane type is only informational to aid heuristics
rather than defining anything.

Only if the zpos property does not exist, the plane types would come
into play.

Of course, if there actually exists userspace that fails if zpos allows
an overlay type plane to be placed below primary, or fails if primary
zpos is not zero, then DRM needs a new client cap.


Right, it wasn't immediately clear to me that the API allowed placement of
things beneath the PRIMARY. But reading the docs for drm_plane_create_zpos*,
there's nothing that forbids it.

  

libliftoff for example, assumes that the PRIMARY has the lowest zpos. So
underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY
for the underlay view.


That's totally ok. It works, right? Plane type does not matter if the
KMS driver accepts the configuration.

What is a "scanout plane"? Aren't all KMS planes by definition scanout
planes?


Pardon my terminology, I thought the scanout plane was where weston rendered
non-offloadable surfaces to. I guess it's more correct to call it the "render
plane". On weston, it seems to be always assigned to the PRIMARY.



The assignment restriction is just technical design debt. It is
limiting. There is no other good reason for it, than when lighting
up a CRTC for the first time, Weston should do it with the renderer FB
only, on the plane that is most likely to succeed i.e. PRIMARY. After
the CRTC is lit, there should be no built-in limitations in what can go
where.

The reason for this is that if a CRTC can be activated, it must always
be able to show the renderer FB without incurring a modeset. This is
important for ensuring that the fallback compositing (renderer) is
always possible. So we start with that configuration, and everything
else is optional bonus.


Genuinely curious - What exactly is limiting with keeping the renderer FB on
PRIMARY? IOW, what is the additional benefit of placing the renderer FB on
something other than PRIMARY?





For libliftoff, using OVERLAYs as the render plane and PRIMARY as the underlay
plane would work. But I think keeping the render plane on PRIMARY (a la weston)
makes underlay arrangements easier to allocate, and would be nice to incorporate
into a shared algorithm.


If zpos exists, I don't think such limitation is a good idea. It will
just limit the possible configurations for no reason.

With zpos, the KMS plane type should be irrelevant for their
z-ordering. Underlay vs. overlay completely loses its meaning at the
KMS level.


Right, the plane types loose their meanings. But at least with the way
libliftoff builds the plane arrangement, where we first allocate the renderer fb
matters.

libliftoff incrementally builds the atomic state by adding a single plane to the
atomic state, then testing it. It essentially does a depth-first-search of all
possible arrangements, pruning the search on atomic test fail. The state that
offloads the most number of FBs will be the arrangement used.

Of course, it's unlikely that the entire DFS tree will traversed in time for a
frame. So the key is to search the most probable and high-benefit branches
first, while minimizing the # of atomic tests needed, before a hard-coded
deadline is hit.

Following this algorithm, the PRIMARY needs to be enabled first, followed by all
the secondary planes. After a plane is enabled, it's not preferred to change
it's assigned FB, since that can cause the state to be rejected (in actuality,
not just the FB, but also any color and transformation stuffs associated with
the surface). It is preferable to build on the state by enabling another
fb->

Re: [PATCH 0/2] drm/amdgpu/display: Make multi-plane configurations more flexible

2024-04-12 Thread Leo Li




On 2024-04-12 11:31, Alex Deucher wrote:

On Fri, Apr 12, 2024 at 11:08 AM Pekka Paalanen
 wrote:


On Fri, 12 Apr 2024 10:28:52 -0400
Leo Li  wrote:


On 2024-04-12 04:03, Pekka Paalanen wrote:

On Thu, 11 Apr 2024 16:33:57 -0400
Leo Li  wrote:



...


That begs the question of what can be nailed down and what can left to
independent implementation. I guess things like which plane should be enabled
first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed)
can be defined. How to handle atomic test failures could be as well.


What room is there for the interpretation of zpos values?

I thought they are unambiguous already: only the relative numerical
order matters, and that uniquely defines the KMS plane ordering.


The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way
for vendors to communicate overlay, underlay, or mixed-arrangement support. I
don't think allowing OVERLAYs to be placed under the PRIMARY is currently
documented as a way to support underlay.


I always thought it's obvious that the zpos numbers dictate the plane
order without any other rules. After all, we have the universal planes
concept, where the plane type is only informational to aid heuristics
rather than defining anything.

Only if the zpos property does not exist, the plane types would come
into play.

Of course, if there actually exists userspace that fails if zpos allows
an overlay type plane to be placed below primary, or fails if primary
zpos is not zero, then DRM needs a new client cap.


Right, it wasn't immediately clear to me that the API allowed placement of 
things beneath the PRIMARY. But reading the docs for drm_plane_create_zpos*, 
there's nothing that forbids it.





libliftoff for example, assumes that the PRIMARY has the lowest zpos. So
underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY
for the underlay view.


That's totally ok. It works, right? Plane type does not matter if the
KMS driver accepts the configuration.

What is a "scanout plane"? Aren't all KMS planes by definition scanout
planes?


Pardon my terminology, I thought the scanout plane was where weston rendered
non-offloadable surfaces to. I guess it's more correct to call it the "render
plane". On weston, it seems to be always assigned to the PRIMARY.


For libliftoff, using OVERLAYs as the render plane and PRIMARY as the underlay
plane would work. But I think keeping the render plane on PRIMARY (a la weston)
makes underlay arrangements easier to allocate, and would be nice to incorporate
into a shared algorithm.

In an underlay arrangement, pushing down an OVERLAY's zpos below the PRIMARY's
zpos is simpler than swapping their surfaces. If such an arrangement fails
atomic_test, we won't have to worry about swapping the surfaces back. Of course,
it's not that we can't keep track of that in the algorithm, but I think it does
make things easier.

It may help with reducing the amount of atomic tests. Assuming that the same DRM
plane provides the same format/color management/transformation support
regardless of it's zpos, we should be able to reasonably expect that changing
it's z-ordering will not cause atomic_test failures (or at least, expect less
causes for failure). In other words, swapping the render plane from the PRIMARY
to an OVERLAY might have more causes for an atomic_test fail, versus changing
their z-ordering. The driver might have to do more things under-the-hood to
provide this consistent behavior, but I think that's the right place for it.
After all, drivers should know more about their hardware's behavior.

The assumption that the PRIMARY has the lowest zpos isn't always true. I
was made aware that the imx8mq platform places all of their OVERLAYS beneath the
PRIMARY. Granted, the KMS code for enabling OVERLAYS is not upstream yet, but it
is available from this thread:
https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1258#note_2319898
. I guess this is more of a bad assumption that should be fixed in libliftoff.



IOW, if the KMS client understands zpos and can do a proper KMS
configuration search, and all planes have zpos property, then there is
no need to look at the plane type at all. That is the goal of the
universal planes feature.


The optimal configuration with DCN hardware is using underlays.  E.g.,
the desktop plane would be at the top and would have holes cut out of
it for videos or windows that want their own plane.  If you do it the
other way around, there are lots of limitations.

Alex


Right, patch 1/2 tries to work around one of these limitations (cursor-on-yuv). 
Others have mentioned we can do the same for scaling.


Thanks,
Leo






Thanks,
pq


Re: [PATCH 0/2] drm/amdgpu/display: Make multi-plane configurations more flexible

2024-04-12 Thread Leo Li




On 2024-04-12 04:03, Pekka Paalanen wrote:

On Thu, 11 Apr 2024 16:33:57 -0400
Leo Li  wrote:


On 2024-04-04 10:22, Marius Vlad wrote:

On Thu, Apr 04, 2024 at 09:59:03AM -0400, Harry Wentland wrote:
  

Hi all,


On 2024-04-04 06:24, Pekka Paalanen wrote:

On Wed, 3 Apr 2024 17:32:46 -0400
Leo Li  wrote:
  

On 2024-03-28 10:33, Pekka Paalanen wrote:

On Fri, 15 Mar 2024 13:09:56 -0400
 wrote:
  

From: Leo Li 

These patches aim to make the amdgpgu KMS driver play nicer with compositors
when building multi-plane scanout configurations. They do so by:

1. Making cursor behavior more sensible.
2. Allowing placement of DRM OVERLAY planes underneath the PRIMARY plane for
  'underlay' configurations (perhaps more of a RFC, see below).

Please see the commit messages for details.


For #2, the simplest way to accomplish this was to increase the value of the
immutable zpos property for the PRIMARY plane. This allowed OVERLAY planes with
a mutable zpos range of (0-254) to be positioned underneath the PRIMARY for an
underlay scanout configuration.

Technically speaking, DCN hardware does not have a concept of primary or overlay
planes - there are simply 4 general purpose hardware pipes that can be maped in
any configuration. So the immutable zpos restriction on the PRIMARY plane is
kind of arbitrary; it can have a mutable range of (0-254) just like the
OVERLAYs. The distinction between PRIMARY and OVERLAY planes is also somewhat
arbitrary. We can interpret PRIMARY as the first plane that should be enabled on
a CRTC, but beyond that, it doesn't mean much for amdgpu.

Therefore, I'm curious about how compositors devs understand KMS planes and
their zpos properties, and how we would like to use them. It isn't clear to me
how compositors wish to interpret and use the DRM zpos property, or
differentiate between OVERLAY and PRIMARY planes, when it comes to setting up
multi-plane scanout.


You already quoted me on the Weston link, so I don't think I have
anything to add. Sounds fine to me, and we don't have a standard plane
arrangement algorithm that the kernel could optimize zpos ranges
against, yet.
  

Ultimately, what I'd like to answer is "What can we do on the KMS driver and DRM
plane API side, that can make building multi-plane scanout configurations easier
for compositors?" I'm hoping we can converge on something, whether that be
updating the existing documentation to better define the usage, or update the
API to provide support for something that is lacking.


I think there probably should be a standardised plane arrangement
algorithm in userspace, because the search space suffers from
permutational explosion. Either there needs to be very few planes (max
4 or 5 at-all-possible per CRTC, including shareable ones) for an
exhaustive search to be feasible, or all planes should be more or less
equal in capabilities and userspace employs some simplified or
heuristic search.

If the search algorithm is fixed, then drivers could optimize zpos
ranges to have the algorithm find a solution faster.

My worry is that userspace already has heuristic search algorithms that
may start failing if drivers later change their zpos ranges to be more
optimal for another algorithm.

OTOH, as long as exhaustive search is feasible, then it does not matter
how DRM drivers set up the zpos ranges.

In any case, the zpos ranges should try to allow all possible plane
arrangements while minimizing the number of arrangements that won't
work. The absolute values of zpos are pretty much irrelevant, so I
think setting one plane to have an immutable zpos is a good idea, even
if it's not necessary by the driver. That is one less moving part, and
only the relative ordering between the planes matters.


Thanks,
pq


Right, thanks for your thoughts! I agree that there should be a common plane
arrangement algorithm. I think libliftoff is the most obvious candidate here. It
only handles overlay arrangements currently, but mixed-mode arrangements is
something I've been trying to look at.

Taking the driver's reported zpos into account could narrow down the search
space for mixed arrangements. We could tell whether underlay, or overlay, or
both, is supported by looking at the allowed zpos ranges.

I also wonder if it'll make underlay assignments easier. libliftoff has an
assumption that the PRIMARY plane has the lowest zpos (which now I realize, is
not always true). Therefore, the underlay buffer has to be placed on the
PRIMARY, with the render buffer on a higher OVERLAY. Swapping buffers between
planes when testing mixed-arrangements is kind of awkward, and simply setting
the OVERLAY's zpos to be lower or higher than the PRIMARY's sounds simpler.

Currently only gamescope makes use of libliftoff, but I'm curious if patches
hooking it up to Weston would be welcomed? If there are other ways to have a
common arrangement algorithm, I'd 

Re: [PATCH 0/2] drm/amdgpu/display: Make multi-plane configurations more flexible

2024-04-11 Thread Leo Li





On 2024-04-04 10:22, Marius Vlad wrote:

On Thu, Apr 04, 2024 at 09:59:03AM -0400, Harry Wentland wrote:



Hi all,


On 2024-04-04 06:24, Pekka Paalanen wrote:

On Wed, 3 Apr 2024 17:32:46 -0400
Leo Li  wrote:


On 2024-03-28 10:33, Pekka Paalanen wrote:

On Fri, 15 Mar 2024 13:09:56 -0400
 wrote:
   

From: Leo Li 

These patches aim to make the amdgpgu KMS driver play nicer with compositors
when building multi-plane scanout configurations. They do so by:

1. Making cursor behavior more sensible.
2. Allowing placement of DRM OVERLAY planes underneath the PRIMARY plane for
 'underlay' configurations (perhaps more of a RFC, see below).

Please see the commit messages for details.


For #2, the simplest way to accomplish this was to increase the value of the
immutable zpos property for the PRIMARY plane. This allowed OVERLAY planes with
a mutable zpos range of (0-254) to be positioned underneath the PRIMARY for an
underlay scanout configuration.

Technically speaking, DCN hardware does not have a concept of primary or overlay
planes - there are simply 4 general purpose hardware pipes that can be maped in
any configuration. So the immutable zpos restriction on the PRIMARY plane is
kind of arbitrary; it can have a mutable range of (0-254) just like the
OVERLAYs. The distinction between PRIMARY and OVERLAY planes is also somewhat
arbitrary. We can interpret PRIMARY as the first plane that should be enabled on
a CRTC, but beyond that, it doesn't mean much for amdgpu.

Therefore, I'm curious about how compositors devs understand KMS planes and
their zpos properties, and how we would like to use them. It isn't clear to me
how compositors wish to interpret and use the DRM zpos property, or
differentiate between OVERLAY and PRIMARY planes, when it comes to setting up
multi-plane scanout.


You already quoted me on the Weston link, so I don't think I have
anything to add. Sounds fine to me, and we don't have a standard plane
arrangement algorithm that the kernel could optimize zpos ranges
against, yet.
   

Ultimately, what I'd like to answer is "What can we do on the KMS driver and DRM
plane API side, that can make building multi-plane scanout configurations easier
for compositors?" I'm hoping we can converge on something, whether that be
updating the existing documentation to better define the usage, or update the
API to provide support for something that is lacking.


I think there probably should be a standardised plane arrangement
algorithm in userspace, because the search space suffers from
permutational explosion. Either there needs to be very few planes (max
4 or 5 at-all-possible per CRTC, including shareable ones) for an
exhaustive search to be feasible, or all planes should be more or less
equal in capabilities and userspace employs some simplified or
heuristic search.

If the search algorithm is fixed, then drivers could optimize zpos
ranges to have the algorithm find a solution faster.

My worry is that userspace already has heuristic search algorithms that
may start failing if drivers later change their zpos ranges to be more
optimal for another algorithm.

OTOH, as long as exhaustive search is feasible, then it does not matter
how DRM drivers set up the zpos ranges.

In any case, the zpos ranges should try to allow all possible plane
arrangements while minimizing the number of arrangements that won't
work. The absolute values of zpos are pretty much irrelevant, so I
think setting one plane to have an immutable zpos is a good idea, even
if it's not necessary by the driver. That is one less moving part, and
only the relative ordering between the planes matters.


Thanks,
pq


Right, thanks for your thoughts! I agree that there should be a common plane
arrangement algorithm. I think libliftoff is the most obvious candidate here. It
only handles overlay arrangements currently, but mixed-mode arrangements is
something I've been trying to look at.

Taking the driver's reported zpos into account could narrow down the search
space for mixed arrangements. We could tell whether underlay, or overlay, or
both, is supported by looking at the allowed zpos ranges.

I also wonder if it'll make underlay assignments easier. libliftoff has an
assumption that the PRIMARY plane has the lowest zpos (which now I realize, is
not always true). Therefore, the underlay buffer has to be placed on the
PRIMARY, with the render buffer on a higher OVERLAY. Swapping buffers between
planes when testing mixed-arrangements is kind of awkward, and simply setting
the OVERLAY's zpos to be lower or higher than the PRIMARY's sounds simpler.

Currently only gamescope makes use of libliftoff, but I'm curious if patches
hooking it up to Weston would be welcomed? If there are other ways to have a
common arrangement algorithm, I'd be happy to hear that as well.


A natural thing would be to document such an algorithm with the KMS
UAPI.

Re: [PATCH 0/2] drm/amdgpu/display: Make multi-plane configurations more flexible

2024-04-03 Thread Leo Li




On 2024-03-28 10:33, Pekka Paalanen wrote:

On Fri, 15 Mar 2024 13:09:56 -0400
 wrote:


From: Leo Li 

These patches aim to make the amdgpgu KMS driver play nicer with compositors
when building multi-plane scanout configurations. They do so by:

1. Making cursor behavior more sensible.
2. Allowing placement of DRM OVERLAY planes underneath the PRIMARY plane for
'underlay' configurations (perhaps more of a RFC, see below).

Please see the commit messages for details.


For #2, the simplest way to accomplish this was to increase the value of the
immutable zpos property for the PRIMARY plane. This allowed OVERLAY planes with
a mutable zpos range of (0-254) to be positioned underneath the PRIMARY for an
underlay scanout configuration.

Technically speaking, DCN hardware does not have a concept of primary or overlay
planes - there are simply 4 general purpose hardware pipes that can be maped in
any configuration. So the immutable zpos restriction on the PRIMARY plane is
kind of arbitrary; it can have a mutable range of (0-254) just like the
OVERLAYs. The distinction between PRIMARY and OVERLAY planes is also somewhat
arbitrary. We can interpret PRIMARY as the first plane that should be enabled on
a CRTC, but beyond that, it doesn't mean much for amdgpu.

Therefore, I'm curious about how compositors devs understand KMS planes and
their zpos properties, and how we would like to use them. It isn't clear to me
how compositors wish to interpret and use the DRM zpos property, or
differentiate between OVERLAY and PRIMARY planes, when it comes to setting up
multi-plane scanout.


You already quoted me on the Weston link, so I don't think I have
anything to add. Sounds fine to me, and we don't have a standard plane
arrangement algorithm that the kernel could optimize zpos ranges
against, yet.


Ultimately, what I'd like to answer is "What can we do on the KMS driver and DRM
plane API side, that can make building multi-plane scanout configurations easier
for compositors?" I'm hoping we can converge on something, whether that be
updating the existing documentation to better define the usage, or update the
API to provide support for something that is lacking.


I think there probably should be a standardised plane arrangement
algorithm in userspace, because the search space suffers from
permutational explosion. Either there needs to be very few planes (max
4 or 5 at-all-possible per CRTC, including shareable ones) for an
exhaustive search to be feasible, or all planes should be more or less
equal in capabilities and userspace employs some simplified or
heuristic search.

If the search algorithm is fixed, then drivers could optimize zpos
ranges to have the algorithm find a solution faster.

My worry is that userspace already has heuristic search algorithms that
may start failing if drivers later change their zpos ranges to be more
optimal for another algorithm.

OTOH, as long as exhaustive search is feasible, then it does not matter
how DRM drivers set up the zpos ranges.

In any case, the zpos ranges should try to allow all possible plane
arrangements while minimizing the number of arrangements that won't
work. The absolute values of zpos are pretty much irrelevant, so I
think setting one plane to have an immutable zpos is a good idea, even
if it's not necessary by the driver. That is one less moving part, and
only the relative ordering between the planes matters.


Thanks,
pq


Right, thanks for your thoughts! I agree that there should be a common plane
arrangement algorithm. I think libliftoff is the most obvious candidate here. It
only handles overlay arrangements currently, but mixed-mode arrangements is
something I've been trying to look at.

Taking the driver's reported zpos into account could narrow down the search
space for mixed arrangements. We could tell whether underlay, or overlay, or
both, is supported by looking at the allowed zpos ranges.

I also wonder if it'll make underlay assignments easier. libliftoff has an
assumption that the PRIMARY plane has the lowest zpos (which now I realize, is
not always true). Therefore, the underlay buffer has to be placed on the
PRIMARY, with the render buffer on a higher OVERLAY. Swapping buffers between
planes when testing mixed-arrangements is kind of awkward, and simply setting
the OVERLAY's zpos to be lower or higher than the PRIMARY's sounds simpler.

Currently only gamescope makes use of libliftoff, but I'm curious if patches
hooking it up to Weston would be welcomed? If there are other ways to have a
common arrangement algorithm, I'd be happy to hear that as well.

Note that libliftoff's algorithm is more complex than weston, since it searches
harder, and suffers from that permutational explosion. But it solves that by
trying high benefit arrangements first (offloading surfaces that update
frequently), and bailing out once the search reaches a hard-code

Re: [PATCH 1/2] drm/amd/display: Introduce overlay cursor mode

2024-04-01 Thread Leo Li




On 2024-03-28 11:52, Harry Wentland wrote:



On 2024-03-28 11:48, Robert Mader wrote:

Hi,

On 15.03.24 18:09, sunpeng...@amd.com wrote:

From: Leo Li 

[Why]

DCN is the display hardware for amdgpu. DRM planes are backed by DCN
hardware pipes, which carry pixel data from one end (memory), to the
other (output encoder).

Each DCN pipe has the ability to blend in a cursor early on in the
pipeline. In other words, there are no dedicated cursor planes in DCN,
which makes cursor behavior somewhat unintuitive for compositors.

For example, if the cursor is in RGB format, but the top-most DRM plane
is in YUV format, DCN will not be able to blend them. Because of this,
amdgpu_dm rejects all configurations where a cursor needs to be enabled
on top of a YUV formatted plane.

 From a compositor's perspective, when computing an allocation for
hardware plane offloading, this cursor-on-yuv configuration result in an
atomic test failure. Since the failure reason is not obvious at all,
compositors will likely fall back to full rendering, which is not ideal.

Instead, amdgpu_dm can try to accommodate the cursor-on-yuv
configuration by opportunistically reserving a separate DCN pipe just
for the cursor. We can refer to this as "overlay cursor mode". It is
contrasted with "native cursor mode", where the native DCN per-pipe
cursor is used.

[How]

On each crtc, compute whether the cursor plane should be enabled in
overlay mode (which currently, is iff the immediate plane below has a
YUV format). If it is, mark the CRTC as requesting overlay cursor mode.


iff -> if

IIRC another case where this would be needed is when the scale of the plane 
below doesn't match the cursor scale. This is especially common for videos and 
thus implicitly covered by the YUV check in most cases, but should probably be 
made explicit. Michel Dänzer might be able to comment here.


Another possible case that could be covered here is when some area is not 
covered by any plane and just shows a black background. This happens e.g. if a 
compositor puts a YUV surface on the primary plane that has a different 
width/high ratio than the display. The compositor can add black bars by just 
leaving the area uncovered and thus require only a single hardware plane for 
video playback ("Unless explicitly specified (..), the active area of a CRTC 
will be black by default." 
https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#plane-abstraction). If 
the cursor is visible over this black bars, AMD currently refuses the commit 
IIUC (see also Michel Dänzers comment at 
https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3177#note_1924544)




Thanks for mentioning both of these scenarios. I agree it would be
good if we can expand the use of the overlay cursor to these cases.

Harry


Thanks,

Robert Mader


All good points, thanks for the feedback. I'll take a peek at the scaling + no
underlying plane case and send a v2 when I get around to it.

Leo





During DC validation, attempt to enable a separate DCN pipe for the
cursor if it's in overlay mode. If that fails, or if no overlay mode is
requested, then fallback to native mode.

Signed-off-by: Leo Li 
---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 309 +++---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |   7 +
  .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c    |   1 +
  .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   |  13 +-
  4 files changed, 288 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index 21a61454c878..09ab330aed17 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -8359,8 +8359,19 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,

   * Disable the cursor first if we're disabling all the planes.
   * It'll remain on the screen after the planes are re-enabled
   * if we don't.
+ *
+ * If the cursor is transitioning from native to overlay mode, the
+ * native cursor needs to be disabled first.
   */
-    if (acrtc_state->active_planes == 0)
+    if (acrtc_state->cursor_mode == DM_CURSOR_OVERLAY_MODE &&
+    dm_old_crtc_state->cursor_mode == DM_CURSOR_NATIVE_MODE) {
+    struct dc_cursor_position cursor_position = {0};
+    dc_stream_set_cursor_position(acrtc_state->stream,
+  &cursor_position);
+    }
+
+    if (acrtc_state->active_planes == 0 &&
+    dm_old_crtc_state->cursor_mode == DM_CURSOR_NATIVE_MODE)
  amdgpu_dm_commit_cursors(state);
  /* update planes when needed */
@@ -8374,7 +8385,8 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
  struct dm_plane_state *dm_new_plane_state = 
to_dm_plane_state(new_plane_state);

  /* Cursor plane is han

Re: [regression][6.0] After commit b261509952bc19d1012cf732f853659be6ebc61e I see WARNING message at drivers/gpu/drm/drm_modeset_lock.c:276 drm_modeset_drop_locks+0x63/0x70

2023-02-09 Thread Leo Li
9801ca24bb00 R15: 
[ 2807.339611] FS:  7f57445b0600() GS:981198e0()
knlGS:
[ 2807.339613] CS:  0010 DS:  ES:  CR0: 80050033
[ 2807.339614] CR2: 7f574367f000 CR3: 0001236ae000 CR4: 00750ee0
[ 2807.339615] PKRU: 5554
[ 2807.339616] Call Trace:
[ 2807.339618]  
[ 2807.339621]  drm_mode_atomic_ioctl+0x3b9/0xac0
[ 2807.339627]  ? drm_atomic_set_property+0xb60/0xb60
[ 2807.339629]  drm_ioctl_kernel+0xac/0x160
[ 2807.339633]  drm_ioctl+0x22d/0x410
[ 2807.339635]  ? drm_atomic_set_property+0xb60/0xb60
[ 2807.339639]  amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
[ 2807.339834]  __x64_sys_ioctl+0x90/0xd0
[ 2807.339838]  do_syscall_64+0x5b/0x80
[ 2807.339843]  ? rcu_read_lock_sched_held+0x10/0x80
[ 2807.339846]  ? trace_hardirqs_on_prepare+0x55/0xe0
[ 2807.339849]  ? do_syscall_64+0x67/0x80
[ 2807.339851]  ? rcu_read_lock_sched_held+0x10/0x80
[ 2807.339853]  ? trace_hardirqs_on_prepare+0x55/0xe0
[ 2807.339855]  ? do_syscall_64+0x67/0x80
[ 2807.339857]  ? do_syscall_64+0x67/0x80
[ 2807.339859]  ? rcu_read_lock_sched_held+0x10/0x80
[ 2807.339861]  ? trace_hardirqs_on_prepare+0x55/0xe0
[ 2807.339863]  ? do_syscall_64+0x67/0x80
[ 2807.339864]  ? do_syscall_64+0x67/0x80
[ 2807.339867]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 2807.339870] RIP: 0033:0x7f5749e8d04f
[ 2807.339873] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24
10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00
00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28
00 00
[ 2807.339875] RSP: 002b:7ffecf4c6460 EFLAGS: 0246 ORIG_RAX:
0010
[ 2807.339877] RAX: ffda RBX: 55c222fb32f0 RCX: 7f5749e8d04f
[ 2807.339878] RDX: 7ffecf4c6500 RSI: c03864bc RDI: 000e
[ 2807.339880] RBP: 7ffecf4c6500 R08:  R09: 
[ 2807.339881] R10: 55c21e4b9010 R11: 0246 R12: c03864bc
[ 2807.339882] R13: 000e R14: 55c222e317e0 R15: 55c21f0a4080
[ 2807.339887]  
[ 2807.339889] irq event stamp: 171599
[ 2807.339890] hardirqs last  enabled at (171599):
[] asm_exc_page_fault+0x22/0x30
[ 2807.339893] hardirqs last disabled at (171598):
[] exc_page_fault+0x121/0x2b0
[ 2807.339896] softirqs last  enabled at (171482):
[] __irq_exit_rcu+0xed/0x160
[ 2807.339900] softirqs last disabled at (171371):
[] __irq_exit_rcu+0xed/0x160
[ 2807.339903] ---[ end trace  ]---

bisect points to this commit: b261509952bc19d1012cf732f853659be6ebc61e.

b261509952bc19d1012cf732f853659be6ebc61e is the first bad commit
commit b261509952bc19d1012cf732f853659be6ebc61e
Author: Leo Li 
Date:   Tue Aug 30 16:38:16 2022 -0400

 drm/amd/display: Fix double cursor on non-video RGB MPO

 [Why]

 DC makes use of layer_index (zpos) when picking the HW plane to enable
 HW cursor on. However, some compositors will not attach zpos information
 to each DRM plane. Consequently, in amdgpu, we default layer_index to 0
 and do not update it.

 This causes said DC logic to enable HW cursor on all planes of the same
 layer_index, which manifests as a double cursor issue if one of the
 planes is scaled (and hence scaling the cursor as well).

 [How]

 Use DRM core helpers to calculate a normalized_zpos value for each
 drm_plane_state under each crtc, within the atomic state.

 This helper will first consider existing zpos values, and if
 identical/unset, fallback to plane ID ordering.

 The normalized_zpos is then passed to dc_plane_info during atomic check
 for later use by the cursor logic.

 Reviewed-by: Bhawanpreet Lakha 
 Acked-by: Wayne Lin 
     Signed-off-by: Leo Li 
 Tested-by: Daniel Wheeler 
 Signed-off-by: Alex Deucher 

  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 12 ++--
  1 file changed, 10 insertions(+), 2 deletions(-)

After reverting this commit the WARNING messages described here disappeared.

Thanks.

--
Best Regards,
Mike Gavrilov.






Re: [PATCH v2 10/21] drm/amd/display: Signal mode_changed if colorspace changed

2023-01-24 Thread Leo Li




On 1/13/23 11:24, Harry Wentland wrote:

We need to signal mode_changed to make sure we update the output
colorspace.

v2: No need to call drm_hdmi_avi_infoframe_colorimetry as DC does its
 own infoframe packing.

Signed-off-by: Harry Wentland 
Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Uma Shankar 
Cc: Ville Syrjälä 
Cc: Joshua Ashton 
Cc: dri-devel@lists.freedesktop.org
Cc: amd-...@lists.freedesktop.org


Reviewed-by: Leo Li 


---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 +-
  1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index c311135f1e6f..f74462c282a6 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -6492,6 +6492,14 @@ amdgpu_dm_connector_atomic_check(struct drm_connector 
*conn,
if (!crtc)
return 0;
  
+	if (new_con_state->colorspace != old_con_state->colorspace) {

+   new_crtc_state = drm_atomic_get_crtc_state(state, crtc);
+   if (IS_ERR(new_crtc_state))
+   return PTR_ERR(new_crtc_state);
+
+   new_crtc_state->mode_changed = true;
+   }
+
if (!drm_connector_atomic_hdr_metadata_equal(old_con_state, 
new_con_state)) {
struct dc_info_packet hdr_infopacket;
  
@@ -6514,7 +6522,7 @@ amdgpu_dm_connector_atomic_check(struct drm_connector *conn,

 * set is permissible, however. So only force a
 * modeset if we're entering or exiting HDR.
 */
-   new_crtc_state->mode_changed =
+   new_crtc_state->mode_changed = new_crtc_state->mode_changed ||
!old_con_state->hdr_output_metadata ||
!new_con_state->hdr_output_metadata;
}


Re: [PATCH v2 17/21] drm/amd/display: Format input and output CSC matrix

2023-01-24 Thread Leo Li




On 1/13/23 11:24, Harry Wentland wrote:

Format the input and output CSC matrix so they
look like 3x4 matrixes. This will make parsing them
much easier and allows us to quickly spot potential
mistakes.

Signed-off-by: Harry Wentland 
Cc: Pekka Paalanen 
Cc: Sebastian Wick 
Cc: vitaly.pros...@amd.com
Cc: Joshua Ashton 
Cc: dri-devel@lists.freedesktop.org
Cc: amd-...@lists.freedesktop.org


Reviewed-by: Leo Li 


---
  .../drm/amd/display/dc/core/dc_hw_sequencer.c | 38 -
  drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h   | 54 +++
  2 files changed, 56 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_hw_sequencer.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_hw_sequencer.c
index 471078fc3900..a70f045fc5c1 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_hw_sequencer.c
@@ -73,28 +73,38 @@ struct out_csc_color_matrix_type {
  
  static const struct out_csc_color_matrix_type output_csc_matrix[] = {

{ COLOR_SPACE_RGB_TYPE,
-   { 0x2000, 0, 0, 0, 0, 0x2000, 0, 0, 0, 0, 0x2000, 0} },
+   { 0x2000, 0,  0,  0,
+ 0,  0x2000, 0,  0,
+ 0,  0,  0x2000, 0} },
{ COLOR_SPACE_RGB_LIMITED_TYPE,
-   { 0x1B67, 0, 0, 0x201, 0, 0x1B67, 0, 0x201, 0, 0, 0x1B67, 
0x201} },
+   { 0x1B67, 0,  0,  0x201,
+ 0,  0x1B67, 0,  0x201,
+ 0,  0,  0x1B67, 0x201} },
{ COLOR_SPACE_YCBCR601_TYPE,
-   { 0xE04, 0xF444, 0xFDB9, 0x1004, 0x831, 0x1016, 0x320, 0x201, 
0xFB45,
-   0xF6B7, 0xE04, 0x1004} },
+   { 0xE04,  0xF444, 0xFDB9, 0x1004,
+ 0x831,  0x1016, 0x320,  0x201,
+ 0xFB45, 0xF6B7, 0xE04,  0x1004} },
{ COLOR_SPACE_YCBCR709_TYPE,
-   { 0xE04, 0xF345, 0xFEB7, 0x1004, 0x5D3, 0x1399, 0x1FA,
-   0x201, 0xFCCA, 0xF533, 0xE04, 0x1004} },
+   { 0xE04,  0xF345, 0xFEB7, 0x1004,
+ 0x5D3,  0x1399, 0x1FA,  0x201,
+ 0xFCCA, 0xF533, 0xE04,  0x1004} },
/* TODO: correct values below */
{ COLOR_SPACE_YCBCR601_LIMITED_TYPE,
-   { 0xE00, 0xF447, 0xFDB9, 0x1000, 0x991,
-   0x12C9, 0x3A6, 0x200, 0xFB47, 0xF6B9, 0xE00, 
0x1000} },
+   { 0xE00,  0xF447, 0xFDB9, 0x1000,
+ 0x991,  0x12C9, 0x3A6,  0x200,
+ 0xFB47, 0xF6B9, 0xE00,  0x1000} },
{ COLOR_SPACE_YCBCR709_LIMITED_TYPE,
-   { 0xE00, 0xF349, 0xFEB7, 0x1000, 0x6CE, 0x16E3,
-   0x24F, 0x200, 0xFCCB, 0xF535, 0xE00, 0x1000} },
+   { 0xE00, 0xF349, 0xFEB7, 0x1000,
+ 0x6CE, 0x16E3, 0x24F,  0x200,
+ 0xFCCB, 0xF535, 0xE00, 0x1000} },
{ COLOR_SPACE_YCBCR2020_TYPE,
-   { 0x1000, 0xF149, 0xFEB7, 0x, 0x0868, 0x15B2,
-   0x01E6, 0x, 0xFB88, 0xF478, 0x1000, 0x} 
},
+   { 0x1000, 0xF149, 0xFEB7, 0x,
+ 0x0868, 0x15B2, 0x01E6, 0x,
+ 0xFB88, 0xF478, 0x1000, 0x} },
{ COLOR_SPACE_YCBCR709_BLACK_TYPE,
-   { 0x, 0x, 0x, 0x1000, 0x, 0x,
-   0x, 0x0200, 0x, 0x, 0x, 0x1000} 
},
+   { 0x, 0x, 0x, 0x1000,
+ 0x, 0x, 0x, 0x0200,
+ 0x, 0x, 0x, 0x1000} },
  };
  
  static bool is_rgb_type(

diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h 
b/drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h
index 131fcfa28bca..f4aa76e02518 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h
@@ -70,28 +70,38 @@ struct dpp_input_csc_matrix {
  };
  
  static const struct dpp_input_csc_matrix __maybe_unused dpp_input_csc_matrix[] = {

-   {COLOR_SPACE_SRGB,
-   {0x2000, 0, 0, 0, 0, 0x2000, 0, 0, 0, 0, 0x2000, 0} },
-   {COLOR_SPACE_SRGB_LIMITED,
-   {0x2000, 0, 0, 0, 0, 0x2000, 0, 0, 0, 0, 0x2000, 0} },
-   {COLOR_SPACE_YCBCR601,
-   {0x2cdd, 0x2000, 0, 0xe991, 0xe926, 0x2000, 0xf4fd, 0x10ef,
-   0, 0x2000, 0x38b4, 0xe3a6} },
-   {COLOR_SPACE_YCBCR601_LIMITED,
-   {0x3353, 0x2568, 0, 0xe400, 0xe5dc, 0x2568, 0xf367, 0x1108,
-   0, 0x2568, 0x40de, 0xdd3a} },
-   {COLOR_SPACE_YCBCR709,
-   {0x3265, 0x2000, 0, 0xe6ce, 0xf105, 0x2000, 0xfa01, 0xa7d, 0,
-   0x2000, 0x3b61, 0xe24f} },
-   {COLOR_SPACE_YCBCR709_LIMITED,
-   {0x39a6, 0x2568, 0, 0xe0d6, 0xeedd, 0x2568, 0xf925, 0x9a8, 0,
-   0x2568, 0x43ee

Re: [PATCH v2] drm/amd/display: fix PSR-SU/DSC interoperability support

2023-01-05 Thread Leo Li





On 1/5/23 15:07, Hamza Mahfooz wrote:

On 1/5/23 13:29, Harry Wentland wrote:



On 1/5/23 12:38, Hamza Mahfooz wrote:

Currently, there are issues with enabling PSR-SU + DSC. This stems from
the fact that DSC imposes a slice height on transmitted video data and
we are not conforming to that slice height in PSR-SU regions. So, pass
slice_height into su_y_granularity to feed the DSC slice height into
PSR-SU code.

Signed-off-by: Hamza Mahfooz 
---
v2: move code to modules/power.
---
  .../drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c |  3 ++
  .../amd/display/modules/power/power_helpers.c | 35 +++
  .../amd/display/modules/power/power_helpers.h |  3 ++
  3 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c

index 26291db0a3cf..872d06fe1436 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c
@@ -122,6 +122,9 @@ bool amdgpu_dm_link_setup_psr(struct 
dc_stream_state *stream)

  psr_config.allow_multi_disp_optimizations =
  (amdgpu_dc_feature_mask & DC_PSR_ALLOW_MULTI_DISP_OPT);
+    if (!psr_su_set_y_granularity(dc, link, stream, &psr_config))
+    return false;
+
  ret = dc_link_setup_psr(link, stream, &psr_config, 
&psr_context);

  }
diff --git 
a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c 
b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c

index 9b5d9b2c9a6a..4d27ad9f7370 100644
--- a/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
+++ b/drivers/gpu/drm/amd/display/modules/power/power_helpers.c
@@ -916,3 +916,38 @@ bool mod_power_only_edp(const struct dc_state 
*context, const struct dc_stream_s

  {
  return context && context->stream_count == 1 && 
dc_is_embedded_signal(stream->signal);

  }
+
+bool psr_su_set_y_granularity(struct dc *dc, struct dc_link *link,
+  struct dc_stream_state *stream,
+  struct psr_config *config)
+{
+    uint16_t pic_height;
+    uint8_t slice_height;
+
+    if (!dc->caps.edp_dsc_support ||
+    link->panel_config.dsc.disable_dsc_edp ||
+
!link->dpcd_caps.dsc_caps.dsc_basic_caps.fields.dsc_support.DSC_SUPPORT ||

+    !stream->timing.dsc_cfg.num_slices_v)


I'm not sure this condition is correct. We can have DSC but not eDP DSC
support.



AFAIK PSR-SU displays use eDP exclusively, so we shouldn't have to worry 
about this case.


Right, the dc_link here should only be eDP. I suppose that isn't quite
clear.

Maybe add this as part of the condition?

if (!(link->connector_signal & SIGNAL_TYPE_EDP))
return true;

Thanks,
Leo




+    return true;
+
+    pic_height = stream->timing.v_addressable +
+    stream->timing.v_border_top + stream->timing.v_border_bottom;
+    slice_height = pic_height / stream->timing.dsc_cfg.num_slices_v;
+
+    if (slice_height) {
+    if (config->su_y_granularity &&
+    (slice_height % config->su_y_granularity)) {
+    WARN(1,


We don't use WARN in display/dc or display/modules. DC_LOG_WARNING
might be better, or log it in the caller.

Harry


+ "%s: dsc: %d, slice_height: %d, num_slices_v: %d\n",
+ __func__,
+ stream->sink->dsc_caps.dsc_dec_caps.is_dsc_supported,
+ slice_height,
+ stream->timing.dsc_cfg.num_slices_v);
+    return false;
+    }
+
+    config->su_y_granularity = slice_height;
+    }
+
+    return true;
+}
diff --git 
a/drivers/gpu/drm/amd/display/modules/power/power_helpers.h 
b/drivers/gpu/drm/amd/display/modules/power/power_helpers.h

index 316452e9dbc9..bb16b37b83da 100644
--- a/drivers/gpu/drm/amd/display/modules/power/power_helpers.h
+++ b/drivers/gpu/drm/amd/display/modules/power/power_helpers.h
@@ -59,4 +59,7 @@ void mod_power_calc_psr_configs(struct psr_config 
*psr_config,

  const struct dc_stream_state *stream);
  bool mod_power_only_edp(const struct dc_state *context,
  const struct dc_stream_state *stream);
+bool psr_su_set_y_granularity(struct dc *dc, struct dc_link *link,
+  struct dc_stream_state *stream,
+  struct psr_config *config);
  #endif /* MODULES_POWER_POWER_HELPERS_H_ */






Re: [PATCH v2] drm/amd/display: add FB_DAMAGE_CLIPS support

2022-12-01 Thread Leo Li




On 11/18/22 16:51, Hamza Mahfooz wrote:

Currently, userspace doesn't have a way to communicate selective updates
to displays. So, enable support for FB_DAMAGE_CLIPS for DCN ASICs newer
than DCN301, convert DRM damage clips to dc dirty rectangles and fill
them into dirty_rects in fill_dc_dirty_rects().

Signed-off-by: Hamza Mahfooz 


Thanks for the patch, it LGTM.

Reviewed-by: Leo Li 

It would be good to add an IGT case to cover combinations of MPO & 
damage clip commits. Perhaps something that covers the usecase of moving 
a MPO video, while desktop UI updates. We'd expect the SU region to 
cover both the MPO plane + UI damage clips, or fallback to FFU.


Thanks,
Leo

---
v2: fallback to FFU if we run into too many dirty rectangles, consider
 dirty rectangles in non MPO case and always add a dirty rectangle
 for the new plane if there are bb changes in the MPO case.
---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 130 +++---
  .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   |   4 +
  2 files changed, 88 insertions(+), 46 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 4eb8201a2608..7af94a2c6237 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4842,6 +4842,35 @@ static int fill_dc_plane_attributes(struct amdgpu_device 
*adev,
return 0;
  }
  
+static inline void fill_dc_dirty_rect(struct drm_plane *plane,

+ struct rect *dirty_rect, int32_t x,
+ int32_t y, int32_t width, int32_t height,
+ int *i, bool ffu)
+{
+   if (*i > DC_MAX_DIRTY_RECTS)
+   return;
+
+   if (*i == DC_MAX_DIRTY_RECTS)
+   goto out;
+
+   dirty_rect->x = x;
+   dirty_rect->y = y;
+   dirty_rect->width = width;
+   dirty_rect->height = height;
+
+   if (ffu)
+   drm_dbg(plane->dev,
+   "[PLANE:%d] PSR FFU dirty rect size (%d, %d)\n",
+   plane->base.id, width, height);
+   else
+   drm_dbg(plane->dev,
+   "[PLANE:%d] PSR SU dirty rect at (%d, %d) size (%d, 
%d)",
+   plane->base.id, x, y, width, height);
+
+out:
+   (*i)++;
+}
+
  /**
   * fill_dc_dirty_rects() - Fill DC dirty regions for PSR selective updates
   *
@@ -4862,10 +4891,6 @@ static int fill_dc_plane_attributes(struct amdgpu_device 
*adev,
   * addition, certain use cases - such as cursor and multi-plane overlay (MPO) 
-
   * implicitly provide damage clips without any client support via the plane
   * bounds.
- *
- * Today, amdgpu_dm only supports the MPO and cursor usecase.
- *
- * TODO: Also enable for FB_DAMAGE_CLIPS
   */
  static void fill_dc_dirty_rects(struct drm_plane *plane,
struct drm_plane_state *old_plane_state,
@@ -4876,12 +4901,11 @@ static void fill_dc_dirty_rects(struct drm_plane *plane,
struct dm_crtc_state *dm_crtc_state = to_dm_crtc_state(crtc_state);
struct rect *dirty_rects = flip_addrs->dirty_rects;
uint32_t num_clips;
+   struct drm_mode_rect *clips;
bool bb_changed;
bool fb_changed;
uint32_t i = 0;
  
-	flip_addrs->dirty_rect_count = 0;

-
/*
 * Cursor plane has it's own dirty rect update interface. See
 * dcn10_dmub_update_cursor_data and dmub_cmd_update_cursor_info_data
@@ -4889,20 +4913,20 @@ static void fill_dc_dirty_rects(struct drm_plane *plane,
if (plane->type == DRM_PLANE_TYPE_CURSOR)
return;
  
-	/*

-* Today, we only consider MPO use-case for PSR SU. If MPO not
-* requested, and there is a plane update, do FFU.
-*/
+   num_clips = drm_plane_get_damage_clips_count(new_plane_state);
+   clips = drm_plane_get_damage_clips(new_plane_state);
+
if (!dm_crtc_state->mpo_requested) {
-   dirty_rects[0].x = 0;
-   dirty_rects[0].y = 0;
-   dirty_rects[0].width = dm_crtc_state->base.mode.crtc_hdisplay;
-   dirty_rects[0].height = dm_crtc_state->base.mode.crtc_vdisplay;
-   flip_addrs->dirty_rect_count = 1;
-   DRM_DEBUG_DRIVER("[PLANE:%d] PSR FFU dirty rect size (%d, 
%d)\n",
-new_plane_state->plane->base.id,
-dm_crtc_state->base.mode.crtc_hdisplay,
-dm_crtc_state->base.mode.crtc_vdisplay);
+   if (!num_clips || num_clips > DC_MAX_DIRTY_RECTS)
+   goto ffu;
+
+   for (; flip_addrs->dirty_rect_count < num_clips; clips++)
+   fill_dc_dirty_rect(new_plane_state->plane,
+ 

Re: [PATCH] drm/amd/display: add FB_DAMAGE_CLIPS support

2022-11-17 Thread Leo Li




On 11/15/22 15:24, Hamza Mahfooz wrote:

Currently, userspace doesn't have a way to communicate selective updates
to displays. So, enable support for FB_DAMAGE_CLIPS for DCN ASICs newer
than DCN301, convert DRM damage clips to dc dirty rectangles and fill
them into dirty_rects in fill_dc_dirty_rects().

Signed-off-by: Hamza Mahfooz 
---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 91 +++
  .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   |  4 +
  2 files changed, 58 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 185d09c760ba..18b710ba802d 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4842,6 +4842,31 @@ static int fill_dc_plane_attributes(struct amdgpu_device 
*adev,
return 0;
  }
  
+static inline void fill_dc_dirty_rect(struct drm_plane *plane,

+ struct rect *dirty_rect, int32_t x,
+ int32_t y, int32_t width, int32_t height,
+ int *i, bool ffu)
+{
+   WARN_ON(*i >= DC_MAX_DIRTY_RECTS);


Hi Hamza,

Since dc_flip_addrs->dirty_rects has a fixed length of 
DC_MAX_DIRTY_RECTS per pipe (a restriction by DMUB FW), I think it makes 
more sense to fallback to a full-frame-update (FFU) if num_clips > 
DC_MAX_DIRTY_RECTS. An alternative would be to reject the atomic commit, 
but that sounds heavy handed.




+
+   dirty_rect->x = x;
+   dirty_rect->y = y;
+   dirty_rect->width = width;
+   dirty_rect->height = height;
+
+   if (ffu)
+   drm_dbg(plane->dev,
+   "[PLANE:%d] PSR FFU dirty rect size (%d, %d)\n",
+   plane->base.id, width, height);
+   else
+   drm_dbg(plane->dev,
+   "[PLANE:%d] PSR SU dirty rect at (%d, %d) size (%d, 
%d)",
+   plane->base.id, x, y, width, height);
+
+   (*i)++;
+}
+
+
  /**
   * fill_dc_dirty_rects() - Fill DC dirty regions for PSR selective updates
   *
@@ -4862,10 +4887,6 @@ static int fill_dc_plane_attributes(struct amdgpu_device 
*adev,
   * addition, certain use cases - such as cursor and multi-plane overlay (MPO) 
-
   * implicitly provide damage clips without any client support via the plane
   * bounds.
- *
- * Today, amdgpu_dm only supports the MPO and cursor usecase.
- *
- * TODO: Also enable for FB_DAMAGE_CLIPS
   */
  static void fill_dc_dirty_rects(struct drm_plane *plane,
struct drm_plane_state *old_plane_state,
@@ -4876,12 +4897,11 @@ static void fill_dc_dirty_rects(struct drm_plane *plane,
struct dm_crtc_state *dm_crtc_state = to_dm_crtc_state(crtc_state);
struct rect *dirty_rects = flip_addrs->dirty_rects;
uint32_t num_clips;
+   struct drm_mode_rect *clips;
bool bb_changed;
bool fb_changed;
uint32_t i = 0;
  
-	flip_addrs->dirty_rect_count = 0;

-
/*
 * Cursor plane has it's own dirty rect update interface. See
 * dcn10_dmub_update_cursor_data and dmub_cmd_update_cursor_info_data
@@ -4894,15 +4914,11 @@ static void fill_dc_dirty_rects(struct drm_plane *plane,
 * requested, and there is a plane update, do FFU.
 */
if (!dm_crtc_state->mpo_requested) {
-   dirty_rects[0].x = 0;
-   dirty_rects[0].y = 0;
-   dirty_rects[0].width = dm_crtc_state->base.mode.crtc_hdisplay;
-   dirty_rects[0].height = dm_crtc_state->base.mode.crtc_vdisplay;
-   flip_addrs->dirty_rect_count = 1;
-   DRM_DEBUG_DRIVER("[PLANE:%d] PSR FFU dirty rect size (%d, 
%d)\n",
-new_plane_state->plane->base.id,
-dm_crtc_state->base.mode.crtc_hdisplay,
-dm_crtc_state->base.mode.crtc_vdisplay);
+   fill_dc_dirty_rect(new_plane_state->plane, &dirty_rects[0], 0,
+  0, dm_crtc_state->base.mode.crtc_hdisplay,
+  dm_crtc_state->base.mode.crtc_vdisplay, &i,
+  true);
+   flip_addrs->dirty_rect_count = i;
return;
}


Previously, we always do FFU on plane updates if no MPO has been 
requested. Now, since we want to look at user-provided DRM damage clips, 
this bit needs a bit of a rework.


In short, if there are valid clips for this plane 
(drm_plane_get_damage_clips_count() > 0), they should be added to 
dc_dirty_rects. Otherwise, fallback to old FFU logic.


With MPO, the damage clips are more interesting, since the entire 
plane's bounding box can be moved. I wonder if that is reflected in 
DRM's damage clips. Do you know if a plane bb change will be reflected 
in drm_plane_get_damage_clips()?


Thanks,
Leo
  
@@ -4914,6 +49

Re: [PATCH v2] drm/amd/display: only fill dirty rectangles when PSR is enabled

2022-11-09 Thread Leo Li




On 11/9/22 13:20, Hamza Mahfooz wrote:

Currently, we are calling fill_dc_dirty_rects() even if PSR isn't
supported by the relevant link in amdgpu_dm_commit_planes(), this is
undesirable especially because when drm.debug is enabled we are printing
messages in fill_dc_dirty_rects() that are only useful for debugging PSR
(and confusing otherwise). So, we can instead limit the filling of dirty
rectangles to only when PSR is enabled.

Signed-off-by: Hamza Mahfooz 


Reviewed-by: Leo Li 
Thanks


---
v2: give a more concrete reason.
---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 66eb16fbe09f..956a6e494709 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -7697,9 +7697,10 @@ static void amdgpu_dm_commit_planes(struct 
drm_atomic_state *state,
bundle->surface_updates[planes_count].plane_info =
&bundle->plane_infos[planes_count];
  
-		fill_dc_dirty_rects(plane, old_plane_state, new_plane_state,

-   new_crtc_state,
-   &bundle->flip_addrs[planes_count]);
+   if (acrtc_state->stream->link->psr_settings.psr_feature_enabled)
+   fill_dc_dirty_rects(plane, old_plane_state,
+   new_plane_state, new_crtc_state,
+   &bundle->flip_addrs[planes_count]);
  
  		/*

 * Only allow immediate flips for fast updates that don't


RE: [BUG] ls1046a: eDMA does not transfer data from I2C

2022-09-20 Thread Leo Li
> >
> > Despite the DMA completing successfully, no data was copied into the
> > buffer, leaving the original (now junk) contents. I probed the I2C bus
> > with an oscilloscope, and I verified that the transfer did indeed occur.
> > The timing between submission and completion seems reasonable for the
> > bus speed (50 kHz for whatever reason).
> >
> > I had a look over the I2C driver, and nothing looked obviously
> > incorrect. If anyone has ideas on what to try, I'm more than willing.
> 
> Is the DMA controller cache-coherent? I see the mainline LS1046A DT doesn't
> have a "dma-coherent" property for it, but the behaviour is entirely
> consistent with that being wrong - dma_map_single() cleans the cache,
> coherent DMA write hits the still-present cache lines,

So the coherent DMA write only gets data into the cache not also the DRAM?  
Otherwise a read back would get the updated data too.

- Leo

> dma_unmap_single() invalidates the cache, and boom, the data is gone and
> you read back the previous content of the buffer that was cleaned out to
> DRAM beforehand.
> 
> Robin.
> 
> > --Sean



RE: [BUG] ls1046a: eDMA does not transfer data from I2C

2022-09-20 Thread Leo Li


> -Original Message-
> From: Sean Anderson 
> Sent: Tuesday, September 20, 2022 11:21 AM
> To: Robin Murphy ; Oleksij Rempel
> ; Pengutronix Kernel Team
> ; linux-...@vger.kernel.org; linux-arm-kernel
> ; Vinod Koul ;
> dmaeng...@vger.kernel.org; Leo Li ; Laurentiu Tudor
> 
> Cc: Linux Kernel Mailing List ; dri-
> de...@lists.freedesktop.org; Christian König ;
> linaro-mm-...@lists.linaro.org; Shawn Guo ; Sumit
> Semwal ; Joy Zou ; linux-
> me...@vger.kernel.org
> Subject: Re: [BUG] ls1046a: eDMA does not transfer data from I2C
> 
> 
> 
> On 9/20/22 11:44 AM, Sean Anderson wrote:
> >
> >
> > On 9/20/22 11:24 AM, Sean Anderson wrote:
> >>
> >>
> >> On 9/20/22 6:07 AM, Robin Murphy wrote:
> >>> On 2022-09-19 23:24, Sean Anderson wrote:
> >>>> Hi all,
> >>>>
> >>>> I discovered a bug in either imx_i2c or fsl-edma on the LS1046A
> >>>> where no data is read in i2c_imx_dma_read except for the last two
> >>>> bytes (which are not read using DMA). This is perhaps best
> >>>> illustrated with the following example:
> >>>>
> >>>> # hexdump -C /sys/bus/nvmem/devices/0-00540/nvmem
> >>>> [  308.914884] i2c i2c-0: 00080938 0x00088938
> 0xf5401000 75401000
> >>>> [  308.923529] src= 2180004 dst=f5401000 attr=   0 soff=   0 nbytes=1
> slast=   0
> >>>> [  308.923529] citer=  7e biter=  7e doff=   1 dlast_sga=   0
> >>>> [  308.923529] major_int=1 disable_req=1 enable_sg=0 [  308.942113]
> >>>> fsl-edma 2c0.edma: vchan 1b4371fc: txd
> >>>> d9dd26c5[4]: submitted [  308.974049] fsl-edma
> >>>> 2c0.edma: txd d9dd26c5[4]: marked complete [
> >>>> 308.981339] i2c i2c-0: 00080938 = [2e 2e 2f 2e 2e 2f 2e 2e
> >>>> 2f 64 65 76 69 63 65 73 2f 70 6c 61 74 66 6f 72 6d 2f 73 6f 63 2f 32 31 
> >>>> 38 30
> 30 30 30 2e 69 32 63 2f 69 32 63 2d 30 2f 30 2d 30 30 35 34 2f 30 2d 30 30 35 
> 34
> 30 00 00] [  309.002226] i2c i2c-0: 75401000 = [2e 2e 2f 2e 2e 2f 2e 
> 2e 2f
> 64 65 76 69 63 65 73 2f 70 6c 61 74 66 6f 72 6d 2f 73 6f 63 2f 32 31 38 30 30 
> 30 30
> 2e 69 32 63 2f 69 32 63 2d 30 2f 30 2d 30 30 35 34 2f 30 2d 30 30 35 34 30 00 
> 00]
> [  309.024649] i2c i2c-0: 000809380080 0x000889380080
> 0xf5401800 75401800
> >>>> [  309.033270] src= 2180004 dst=f5401800 attr=   0 soff=   0 nbytes=1
> slast=   0
> >>>> [  309.033270] citer=  7e biter=  7e doff=   1 dlast_sga=   0
> >>>> [  309.033270] major_int=1 disable_req=1 enable_sg=0 [  309.051633]
> >>>> fsl-edma 2c0.edma: vchan 1b4371fc: txd
> >>>> d9dd26c5[5]: submitted [  309.083526] fsl-edma
> >>>> 2c0.edma: txd d9dd26c5[5]: marked complete [
> >>>> 309.090807] i2c i2c-0: 000809380080 = [00 00 00 00 00 00 00 00
> >>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>> 00 00 00 00 00 00 00 00 00 00 00 00] [  309.111694] i2c i2c-0:
> >>>> 75401800 = [00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >>>> 00 00 00 00]
> >>>>   2e 2e 2f 2e 2e 2f 2e 2e  2f 64 65 76 69 63 65 73
> >>>> |../../../devices|
> >>>> 0010  2f 70 6c 61 74 66 6f 72  6d 2f 73 6f 63 2f 32 31
> >>>> |/platform/soc/21|
> >>>> 0020  38 30 30 30 30 2e 69 32  63 2f 69 32 63 2d 30 2f
> >>>> |8.i2c/i2c-0/|
> >>>> 0030  30 2d 30 30 35 34 2f 30  2d 30 30 35 34 30 00 00
> >>>> |0-0054/0-00540..|
> >>>> 0040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
> >>>> ||
> >>>> *
> >>>> 0070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff ff
> >>>> ||
> >>>> 0080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
> >>>> ||
> >>>> *
> >>>> 00f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 ff 5b
> >>>> |...[|
> >>>> 0100
> >>>>
> >>>> (patch with my debug prints appended below)
> >>&g

RE: [BUG] ls1046a: eDMA does not transfer data from I2C

2022-09-19 Thread Leo Li


> -Original Message-
> From: Sean Anderson 
> Sent: Monday, September 19, 2022 5:24 PM
> To: Oleksij Rempel ; Pengutronix Kernel Team
> ; linux-...@vger.kernel.org; linux-arm-kernel
> ; Vinod Koul ;
> dmaeng...@vger.kernel.org
> Cc: Sumit Semwal ; Christian König
> ; Linux Kernel Mailing List  ker...@vger.kernel.org>; linux-me...@vger.kernel.org; dri-
> de...@lists.freedesktop.org; linaro-mm-...@lists.linaro.org; Joy Zou
> ; Peng Ma ; Robin Gong
> ; Shawn Guo ; Leo Li
> 
> Subject: [BUG] ls1046a: eDMA does not transfer data from I2C
> 
> Hi all,
> 
> I discovered a bug in either imx_i2c or fsl-edma on the LS1046A where no
> data is read in i2c_imx_dma_read except for the last two bytes (which are
> not read using DMA). This is perhaps best illustrated with the following
> example:

What is the kernel tree/tag that you are testing with?

Regards,
Leo


Re: [PATCH] drm: Get ref on CRTC commit object when waiting for flip_done

2018-10-12 Thread Leo Li



On 2018-10-12 03:34 AM, Daniel Vetter wrote:

On Thu, Oct 11, 2018 at 08:27:43PM -0400, sunpeng...@amd.com wrote:

From: Leo Li 

This fixes a general protection fault, caused by accessing the contents
of a flip_done completion object that has already been freed. It occurs
due to the preemption of a non-blocking commit worker thread W by
another commit thread X. X continues to clear its atomic state at the
end, destroying the CRTC commit object that W still needs. Switching
back to W and accessing the commit objects then leads to bad results.

Worker W becomes preemptable when waiting for flip_done to complete. At
this point, a frequently occurring commit thread X can take over. Here's
an example where W is a worker thread that flips on both CRTCs, and X
does a legacy cursor update on both CRTCs:

 ...
  1. W does flip work
  2. W runs commit_hw_done()
  3. W waits for flip_done on CRTC 1
  4. > flip_done for CRTC 1 completes
  5. W finishes waiting for CRTC 1
  6. W waits for flip_done on CRTC 2

  7. > Preempted by X
  8. > flip_done for CRTC 2 completes
  9. X atomic_check: hw_done and flip_done are complete on all CRTCs
 10. X updates cursor on both CRTCs
 11. X destroys atomic state
 12. X done

 13. > Switch back to W
 14. W waits for flip_done on CRTC 2
 15. W raises general protection fault

The error looks like so:

 general protection fault:  [#1] PREEMPT SMP PTI
 **snip**
 Call Trace:
  lock_acquire+0xa2/0x1b0
  _raw_spin_lock_irq+0x39/0x70
  wait_for_completion_timeout+0x31/0x130
  drm_atomic_helper_wait_for_flip_done+0x64/0x90 [drm_kms_helper]
  amdgpu_dm_atomic_commit_tail+0xcae/0xdd0 [amdgpu]
  commit_tail+0x3d/0x70 [drm_kms_helper]
  process_one_work+0x212/0x650
  worker_thread+0x49/0x420
  kthread+0xfb/0x130
  ret_from_fork+0x3a/0x50
 Modules linked in: x86_pkg_temp_thermal amdgpu(O) chash(O)
 gpu_sched(O) drm_kms_helper(O) syscopyarea sysfillrect sysimgblt
 fb_sys_fops ttm(O) drm(O)

Note that i915 has this issue masked, since hw_done is signaled after
waiting for flip_done. Doing so will block the cursor update from
happening until hw_done is signaled, preventing the cursor commit from
destroying the state.

Signed-off-by: Leo Li 


Great analysis!

Bugfix itself needs a bit more work though, see below.


---
  drivers/gpu/drm/drm_atomic_helper.c | 30 ++
  1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
b/drivers/gpu/drm/drm_atomic_helper.c
index 80be74d..efdf043 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1410,20 +1410,42 @@ void drm_atomic_helper_wait_for_flip_done(struct 
drm_device *dev,
  {
struct drm_crtc_state *new_crtc_state;
struct drm_crtc *crtc;
+   struct drm_crtc_commit **commits;
int i;
+   int num_crtc = dev->mode_config.num_crtc;
  
+	commits = kcalloc(num_crtc, sizeof(*commits), GFP_KERNEL);

+
+   /*
+* Because we open ourselves to preemption by calling
+* wait_for_completion_timeout(), we need to get our own references to
+* the commit objects. This is so that a preempting commit won't free
+* them.
+*/


The scheduler can preempt you at any point before here too, if you enable
CONFIG_PREEMPT. It definitely can preempt/block in the kcalloc above, even
if you have CONFIG_PREEMPT disabled. The scheduling point you identified
below is simply the most likely.



Ah, didn't think of this.


The underlying bug is that after drm_atomic_helper_commit_hw_done() we
unblock the next commit worker (thread X in your example), and cannot
assume anymore that the new state will stay around (since new state is
released by the subsequent commit work run in thread X).

Therefore your drm_crtc_commit_get here already can access freed memory.
The correct fix here is to store the temporary reference before we call
drm_atomic_helper_commit_hw_done(), and then use that in this function
here. The problem is that this is somewhat tricky to pull off, since some
drivers call wait_for_flip_done() before hw_done() (like i915).

Here's what I'd do:
- Add a new drm_crtc_commit pointer to struct __drm_crtcs_state.

- Fill that out in drm_atomic_helper_setup_commit, and make sure you
   release the reference for it in drm_atomic_state_default_clear().

- Use that pointer instead in drm_atomic_helper_wait_for_flip_done() here.


Thanks for the pointers, patch incoming :)

Leo



Cheers, Daniel

for_each_new_crtc_in_state(old_state, crtc, new_crtc_state, i) {
-   struct drm_crtc_commit *commit = new_crtc_state->commit;
+   commits[i] = new_crtc_state->commit;
+   if (commits[i])
+   drm_crtc_commit_get(commits[i]);
+   }
+
+   for (i = 0; i < n

Re: [PATCH 01/21] drm/amdgpu: Remove default best_encoder hook from DC

2018-10-05 Thread Leo Li



On 2018-10-05 11:41 AM, Harry Wentland wrote:

On 2018-10-04 04:24 PM, Daniel Vetter wrote:

For atomic driver this is the default, no need to reimplement it. We
still need to keep the copypasta for not-atomic drivers though, since
no one polished the legacy crtc helpers as much as the atomic ones.

v2: amdgpu uses ->best_encoder internally, give it a local copy. It
might be a good idea to merge the connector and encoder into one
amdgpu_dm_sink structure, that might match DC internals better. At
least for non-DPMST outputs. Kudos to Ville for spotting this.

v3: Rebase onto a487411a6481 ("drm/amd/display: Use DRM helper for
best_encoder").

Cc: Ville Syrjälä 
Signed-off-by: Daniel Vetter 
Cc: Alex Deucher 
Cc: Harry Wentland 
Cc: Andrey Grodzovsky 
Cc: Tony Cheng 
Cc: "Leo (Sunpeng) Li" 
Cc: Shirish S 


Acked-by: Harry Wentland 


Thx for rebasing :)

Reviewed-by: Leo Li 



Harry


---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 14 +++---
  1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 6a2342d72742..107e70658238 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -3189,7 +3189,6 @@ amdgpu_dm_connector_helper_funcs = {
 */
.get_modes = get_modes,
.mode_valid = amdgpu_dm_connector_mode_valid,
-   .best_encoder = drm_atomic_helper_best_encoder
  };
  
  static void dm_crtc_helper_disable(struct drm_crtc *crtc)

@@ -3592,14 +3591,17 @@ static int to_drm_connector_type(enum signal_type st)
}
  }
  
+static struct drm_encoder *amdgpu_dm_connector_to_encoder(struct drm_connector *connector)

+{
+   return drm_encoder_find(connector->dev, NULL, 
connector->encoder_ids[0]);
+}
+
  static void amdgpu_dm_get_native_mode(struct drm_connector *connector)
  {
-   const struct drm_connector_helper_funcs *helper =
-   connector->helper_private;
struct drm_encoder *encoder;
struct amdgpu_encoder *amdgpu_encoder;
  
-	encoder = helper->best_encoder(connector);

+   encoder = amdgpu_dm_connector_to_encoder(connector);
  
  	if (encoder == NULL)

return;
@@ -3726,14 +3728,12 @@ static void amdgpu_dm_connector_ddc_get_modes(struct 
drm_connector *connector,
  
  static int amdgpu_dm_connector_get_modes(struct drm_connector *connector)

  {
-   const struct drm_connector_helper_funcs *helper =
-   connector->helper_private;
struct amdgpu_dm_connector *amdgpu_dm_connector =
to_amdgpu_dm_connector(connector);
struct drm_encoder *encoder;
struct edid *edid = amdgpu_dm_connector->edid;
  
-	encoder = helper->best_encoder(connector);

+   encoder = amdgpu_dm_connector_to_encoder(connector);
  
  	if (!edid || !drm_edid_is_valid(edid)) {

amdgpu_dm_connector->num_modes =


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 08/14] drm/amdgpu: Remove default best_encoder hook from DC

2018-09-05 Thread Leo Li



On 2018-09-03 12:54 PM, Daniel Vetter wrote:

For atomic driver this is the default, no need to reimplement it. We
still need to keep the copypasta for not-atomic drivers though, since
no one polished the legacy crtc helpers as much as the atomic ones.


Thanks for the patch! It seems I made an identical change here:
https://lists.freedesktop.org/archives/amd-gfx/2018-August/026064.html

One line difference though. I wasn't aware that the default is used when
.best_encoder is null, so I just hooked it to the default helper.

If it's OK with you, I'll do a follow-up to drop the hook, so we don't
have two conflicting patches in flight.

Leo



Signed-off-by: Daniel Vetter 
Cc: Alex Deucher 
Cc: Harry Wentland 
Cc: Andrey Grodzovsky 
Cc: Tony Cheng 
Cc: "Leo (Sunpeng) Li" 
Cc: Shirish S 
---
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 23 ---
  1 file changed, 23 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index af6adffba788..333f9904f135 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2794,28 +2794,6 @@ static const struct drm_connector_funcs 
amdgpu_dm_connector_funcs = {
.atomic_get_property = amdgpu_dm_connector_atomic_get_property
  };
  
-static struct drm_encoder *best_encoder(struct drm_connector *connector)

-{
-   int enc_id = connector->encoder_ids[0];
-   struct drm_mode_object *obj;
-   struct drm_encoder *encoder;
-
-   DRM_DEBUG_DRIVER("Finding the best encoder\n");
-
-   /* pick the encoder ids */
-   if (enc_id) {
-   obj = drm_mode_object_find(connector->dev, NULL, enc_id, 
DRM_MODE_OBJECT_ENCODER);
-   if (!obj) {
-   DRM_ERROR("Couldn't find a matching encoder for our 
connector\n");
-   return NULL;
-   }
-   encoder = obj_to_encoder(obj);
-   return encoder;
-   }
-   DRM_ERROR("No encoder id\n");
-   return NULL;
-}
-
  static int get_modes(struct drm_connector *connector)
  {
return amdgpu_dm_connector_get_modes(connector);
@@ -2934,7 +2912,6 @@ amdgpu_dm_connector_helper_funcs = {
 */
.get_modes = get_modes,
.mode_valid = amdgpu_dm_connector_mode_valid,
-   .best_encoder = best_encoder
  };
  
  static void dm_crtc_helper_disable(struct drm_crtc *crtc)



___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] [v3] drm: amd: dc: don't use FP math when Kcov is enabled

2018-08-16 Thread Leo Li



On 2018-08-11 11:54 AM, Arnd Bergmann wrote:

Building the DCN 1.0 Raven display driver with CONFIG_KCOV_INSTRUMENT_ALL=y
and CONFIG_KCOV_ENABLE_COMPARISONS=y results in warnings about many functions
that do a comparison of floating-point variables:

drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.o: In function 
`dcn_bw_calc_rq_dlg_ttu':
dcn_calcs.c:(.text+0x263): undefined reference to `__sanitizer_cov_trace_cmpf'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.o: In function 
`hack_force_pipe_split':
dcn_calcs.c:(.text+0x155b): undefined reference to `__sanitizer_cov_trace_cmpf'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.o: In function 
`dcn_find_dcfclk_suits_all':
dcn_calcs.c:(.text+0x190e): undefined reference to `__sanitizer_cov_trace_cmpf'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.o: In function 
`dcn_validate_bandwidth':
dcn_calcs.c:(.text+0xe121): undefined reference to `__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_math.o: In function `dcn_bw_mod':
dcn_calc_math.c:(.text+0x22): undefined reference to 
`__sanitizer_cov_trace_cmpf'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_math.o: In function `dcn_bw_min2':
dcn_calc_math.c:(.text+0xb2): undefined reference to 
`__sanitizer_cov_trace_cmpf'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_math.o: In function 
`dcn_bw_ceil2':
dcn_calc_math.c:(.text+0x2a0): undefined reference to 
`__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_math.o: In function `dcn_bw_max3':
dcn_calc_math.c:(.text+0x325): undefined reference to 
`__sanitizer_cov_trace_cmpf'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_math.o: In function `dcn_bw_max5':
dcn_calc_math.c:(.text+0x3c3): undefined reference to 
`__sanitizer_cov_trace_cmpf'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_math.o: In function `dcn_bw_log':
dcn_calc_math.c:(.text+0x54e): undefined reference to 
`__sanitizer_cov_trace_cmpd'
dcn_calc_math.c:(.text+0x57c): undefined reference to 
`__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.o: In function 
`scaler_settings_calculation':
dcn_calc_auto.c:(.text+0x5c5): undefined reference to 
`__sanitizer_cov_trace_cmpf'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.o: In function 
`mode_support_and_system_configuration':
dcn_calc_auto.c:(.text+0x137c): undefined reference to 
`__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.o: In function 
`mode_support_and_system_configuration':
dcn_calc_auto.c:(.text+0x9233): undefined reference to 
`__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.o: In function 
`mode_support_and_system_configuration':
dcn_calc_auto.c:(.text+0xb70f): undefined reference to 
`__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.o: In function 
`mode_support_and_system_configuration':
dcn_calc_auto.c:(.text+0x121fd): undefined reference to 
`__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.o: In function 
`display_pipe_configuration':
dcn_calc_auto.c:(.text+0x15a2f): undefined reference to 
`__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.o: In function 
`dispclkdppclkdcfclk_deep_sleep_prefetch_parameters_watermarks_and_performance_calculation':
dcn_calc_auto.c:(.text+0x17c2d): undefined reference to 
`__sanitizer_cov_trace_cmpf'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.o: In function 
`dispclkdppclkdcfclk_deep_sleep_prefetch_parameters_watermarks_and_performance_calculation':
dcn_calc_auto.c:(.text+0x19362): undefined reference to 
`__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.o: In function 
`dispclkdppclkdcfclk_deep_sleep_prefetch_parameters_watermarks_and_performance_calculation':
dcn_calc_auto.c:(.text+0x25575): undefined reference to 
`__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_auto.o: In function 
`dispclkdppclkdcfclk_deep_sleep_prefetch_parameters_watermarks_and_performance_calculation':
dcn_calc_auto.c:(.text+0x27f33): undefined reference to 
`__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/dml/display_rq_dlg_calc.o: In function 
`get_refcyc_per_delivery':
display_rq_dlg_calc.c:(.text+0xb5): undefined reference to 
`__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/dml/display_rq_dlg_calc.o: In function 
`calculate_ttu_cursor.isra.1':
display_rq_dlg_calc.c:(.text+0x9f6): undefined reference to 
`__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/dml/display_rq_dlg_calc.o: In function 
`dml_rq_dlg_get_dlg_params':
display_rq_dlg_calc.c:(.text+0x82cc): undefined reference to 
`__sanitizer_cov_trace_cmpf'
drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.o: In function 
`get_refcyc_per_delivery.isra.0':
dml1_display_rq_dlg_calc.c:(.text+0x6c4): undefined reference to 
`__sanitizer_cov_trace_cmpd'
drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.o: In function 
`get_vratio_pre.isra.2'

Re: [bug report] drm/amd/display: Read AUX channel even if only status byte is returned

2018-07-31 Thread Leo Li



On 2018-07-31 02:24 PM, Dan Carpenter wrote:

[ Potential security issue, if I'm reading the code correctly.  I don't
   really know the code and I haven't looked at the larger context. -dan ]

Hello Leo (Sunpeng) Li,

The patch edf6ffe4f47e: "drm/amd/display: Read AUX channel even if
only status byte is returned" from Jun 26, 2018, leads to the
following static checker warning:

drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_ddc.c:673 
dc_link_aux_transfer()
warn: 'returned_bytes' is unsigned

drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_ddc.c
634  int dc_link_aux_transfer(struct ddc_service *ddc,
635   unsigned int address,
636   uint8_t *reply,
637   void *buffer,
638   unsigned int size,
639   enum aux_transaction_type type,
640   enum i2caux_transaction_action action)
641  {
642  struct ddc *ddc_pin = ddc->ddc_pin;
643  struct engine *engine;
644  struct aux_engine *aux_engine;
645  enum aux_channel_operation_result operation_result;
646  struct aux_request_transaction_data aux_req;
647  struct aux_reply_transaction_data aux_rep;
648  uint8_t returned_bytes = 0;
 ^^
returned_bytes is a u8.

649  int res = -1;
650  uint32_t status;
651
652  memset(&aux_req, 0, sizeof(aux_req));
653  memset(&aux_rep, 0, sizeof(aux_rep));
654
655  engine = 
ddc->ctx->dc->res_pool->engines[ddc_pin->pin_data->en];
656  aux_engine = engine->funcs->acquire(engine, ddc_pin);
657
658  aux_req.type = type;
659  aux_req.action = action;
660
661  aux_req.address = address;
662  aux_req.delay = 0;
663  aux_req.length = size;
664  aux_req.data = buffer;
665
666  aux_engine->funcs->submit_channel_request(aux_engine, 
&aux_req);
667  operation_result = 
aux_engine->funcs->get_channel_status(aux_engine, &returned_bytes);
668
669  switch (operation_result) {
670  case AUX_CHANNEL_OPERATION_SUCCEEDED:
671  res = returned_bytes;
 
res = 0-255

672
673  if (res <= size && res >= 0)
 ^^^
So obviously the res >= 0 check can be removed, but that's harmless.
The bigger problem is that the other test looks to be reversed.  Instead
of <= it should be >= size.  Otherwise we are reading beyond the end of
the returned bytes.


Right, the >= 0 is pointless. And upon closer look at the context, the
<= size check also seems to be pointless.

`size` refers to the size of the buffer we're given to save the reply
in. So although the `res <= size` check seems correct, it is redundant.
Calling read_channel_reply will read from hw the returned_bytes again,
and check for a buffer overflow. (At least all the current hooks do).

Thanks for the catch, will clean it up.
Leo



674  res = 
aux_engine->funcs->read_channel_reply(aux_engine, size,
675  
buffer, reply,
676  
&status);
677
678  break;

regards,
dan carpenter


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 08/10] drm/crc: Cleanup crtc_crc_open function

2018-07-03 Thread Leo Li



On 2018-07-02 07:07 AM, Mahesh Kumar wrote:

This patch make changes to allocate crc-entries buffer before
enabling CRC generation.
It moves all the failure check early in the function before setting
the source or memory allocation.
Now set_crc_source takes only two variable inputs, values_cnt we
already gets as part of verify_crc_source.

Signed-off-by: Mahesh Kumar 
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Maarten Lankhorst 


Acked-by: Leo Li 


---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h  |  3 +-
  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c  |  4 +-
  drivers/gpu/drm/drm_debugfs_crc.c  | 52 +-
  drivers/gpu/drm/i915/intel_drv.h   |  3 +-
  drivers/gpu/drm/i915/intel_pipe_crc.c  |  4 +-
  drivers/gpu/drm/rcar-du/rcar_du_crtc.c |  5 +--
  drivers/gpu/drm/rockchip/rockchip_drm_vop.c|  6 +--
  include/drm/drm_crtc.h |  3 +-
  8 files changed, 30 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index e43ed064dc46..54056d180003 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -258,8 +258,7 @@ amdgpu_dm_remove_sink_from_freesync_module(struct 
drm_connector *connector);
  
  /* amdgpu_dm_crc.c */

  #ifdef CONFIG_DEBUG_FS
-int amdgpu_dm_crtc_set_crc_source(struct drm_crtc *crtc, const char *src_name,
- size_t *values_cnt);
+int amdgpu_dm_crtc_set_crc_source(struct drm_crtc *crtc, const char *src_name);
  int amdgpu_dm_crtc_verify_crc_source(struct drm_crtc *crtc,
 const char *src_name,
 size_t *values_cnt);
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c
index dfcca594d52a..e7ad528f5853 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c
@@ -62,8 +62,7 @@ amdgpu_dm_crtc_verify_crc_source(struct drm_crtc *crtc, const 
char *src_name,
return 0;
  }
  
-int amdgpu_dm_crtc_set_crc_source(struct drm_crtc *crtc, const char *src_name,

-  size_t *values_cnt)
+int amdgpu_dm_crtc_set_crc_source(struct drm_crtc *crtc, const char *src_name)
  {
struct dm_crtc_state *crtc_state = to_dm_crtc_state(crtc->state);
struct dc_stream_state *stream_state = crtc_state->stream;
@@ -99,7 +98,6 @@ int amdgpu_dm_crtc_set_crc_source(struct drm_crtc *crtc, 
const char *src_name,
return -EINVAL;
}
  
-	*values_cnt = 3;

/* Reset crc_skipped on dm state */
crtc_state->crc_skip_count = 0;
return 0;
diff --git a/drivers/gpu/drm/drm_debugfs_crc.c 
b/drivers/gpu/drm/drm_debugfs_crc.c
index f4d76528d24c..739a813b4b09 100644
--- a/drivers/gpu/drm/drm_debugfs_crc.c
+++ b/drivers/gpu/drm/drm_debugfs_crc.c
@@ -124,11 +124,9 @@ static ssize_t crc_control_write(struct file *file, const 
char __user *ubuf,
if (source[len] == '\n')
source[len] = '\0';
  
-	if (crtc->funcs->verify_crc_source) {

-   ret = crtc->funcs->verify_crc_source(crtc, source, &values_cnt);
-   if (ret)
-   return ret;
-   }
+   ret = crtc->funcs->verify_crc_source(crtc, source, &values_cnt);
+   if (ret)
+   return ret;
  
  	spin_lock_irq(&crc->lock);
  
@@ -193,12 +191,15 @@ static int crtc_crc_open(struct inode *inode, struct file *filep)

return ret;
}
  
-	if (crtc->funcs->verify_crc_source) {

-   ret = crtc->funcs->verify_crc_source(crtc, crc->source,
-&values_cnt);
-   if (ret)
-   return ret;
-   }
+   ret = crtc->funcs->verify_crc_source(crtc, crc->source, &values_cnt);
+   if (ret)
+   return ret;
+
+   if (WARN_ON(values_cnt > DRM_MAX_CRC_NR))
+   return -EINVAL;
+
+   if (WARN_ON(values_cnt == 0))
+   return -EINVAL;
  
  	spin_lock_irq(&crc->lock);

if (!crc->opened)
@@ -210,30 +211,22 @@ static int crtc_crc_open(struct inode *inode, struct file 
*filep)
if (ret)
return ret;
  
-	ret = crtc->funcs->set_crc_source(crtc, crc->source, &values_cnt);

-   if (ret)
-   goto err;
-
-   if (WARN_ON(values_cnt > DRM_MAX_CRC_NR)) {
-   ret = -EINVAL;
-   goto err_disable;
-   }
-
-   if (WARN_ON(values_cnt == 0)) {
-   ret = -EINVAL;
-   goto err_disable;
-   }
-
entries = kcalloc(DRM_CRC_ENTRIES

Re: [PATCH 04/10] drm/amdgpu_dm/crc: Implement verify_crc_source callback

2018-07-03 Thread Leo Li



On 2018-07-03 06:19 AM, Maarten Lankhorst wrote:

Op 02-07-18 om 13:07 schreef Mahesh Kumar:

This patch implements "verify_crc_source" callback function for
AMD drm driver.

Signed-off-by: Mahesh Kumar 
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Maarten Lankhorst 


Acked-by: Leo Li 


---
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  1 +
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  4 
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c | 16 
  3 files changed, 21 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 66bd3cc3e387..dd97cbca0527 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -2563,6 +2563,7 @@ static const struct drm_crtc_funcs amdgpu_dm_crtc_funcs = 
{
.atomic_duplicate_state = dm_crtc_duplicate_state,
.atomic_destroy_state = dm_crtc_destroy_state,
.set_crc_source = amdgpu_dm_crtc_set_crc_source,
+   .verify_crc_source = amdgpu_dm_crtc_verify_crc_source,
.enable_vblank = dm_enable_vblank,
.disable_vblank = dm_disable_vblank,
  };
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index a29dc35954c9..e43ed064dc46 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -260,9 +260,13 @@ amdgpu_dm_remove_sink_from_freesync_module(struct 
drm_connector *connector);
  #ifdef CONFIG_DEBUG_FS
  int amdgpu_dm_crtc_set_crc_source(struct drm_crtc *crtc, const char *src_name,
  size_t *values_cnt);
+int amdgpu_dm_crtc_verify_crc_source(struct drm_crtc *crtc,
+const char *src_name,
+size_t *values_cnt);
  void amdgpu_dm_crtc_handle_crc_irq(struct drm_crtc *crtc);
  #else
  #define amdgpu_dm_crtc_set_crc_source NULL
+#define amdgpu_dm_crtc_verify_crc_source NULL
  #define amdgpu_dm_crtc_handle_crc_irq(x)
  #endif
  
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c

index 52f2c01349e3..dfcca594d52a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c
@@ -46,6 +46,22 @@ static enum amdgpu_dm_pipe_crc_source 
dm_parse_crc_source(const char *source)
return AMDGPU_DM_PIPE_CRC_SOURCE_INVALID;
  }
  
+int

+amdgpu_dm_crtc_verify_crc_source(struct drm_crtc *crtc, const char *src_name,
+size_t *values_cnt)
+{
+   enum amdgpu_dm_pipe_crc_source source = dm_parse_crc_source(src_name);
+
+   if (source < 0) {
+   DRM_DEBUG_DRIVER("Unknown CRC source %s for CRTC%d\n",
+src_name, crtc->index);
+   return -EINVAL;
+   }
+
+   *values_cnt = 3;
+   return 0;
+}
+
  int amdgpu_dm_crtc_set_crc_source(struct drm_crtc *crtc, const char *src_name,
   size_t *values_cnt)
  {


Ack for merging this and 8/10 through drm-misc?

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Intel-gfx] [PATCH] drm/atomic: Fix memleak on ERESTARTSYS during non-blocking commits

2018-01-29 Thread Leo Li

Updated IGT results seem sane:
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7698/shards.html

Would someone be able to apply this patch?

Thanks,
Leo

On 2018-01-17 03:18 PM, Sean Paul wrote:

On Wed, Jan 17, 2018 at 10:39 AM, Maarten Lankhorst
 wrote:

Op 17-01-18 om 19:29 schreef Sean Paul:

On Wed, Jan 17, 2018 at 12:51:08PM +0100, Maarten Lankhorst wrote:

From: "Leo (Sunpeng) Li" 

During a non-blocking commit, it is possible to return before the
commit_tail work is queued (-ERESTARTSYS, for example).

Since a reference on the crtc commit object is obtained for the pending
vblank event when preparing the commit, the above situation will leave
us with an extra reference.

Therefore, if the commit_tail worker has not consumed the event at the
end of a commit, release it's reference.

Changes since v1:
- Also check for state->event->base.completion being set, to
   handle the case where stall_checks() fails in setup_crtc_commit().
Changes since v2:
- Add a flag to drm_crtc_commit, to prevent dereferencing a freed event.
   i915 may unreference the state in a worker.

Fixes: 24835e442f28 ("drm: reference count event->completion")
Cc:  # v4.11+
Signed-off-by: Leo (Sunpeng) Li 
Acked-by: Harry Wentland  #v1
Signed-off-by: Maarten Lankhorst 
---
  drivers/gpu/drm/drm_atomic_helper.c | 15 +++
  include/drm/drm_atomic.h|  9 +
  2 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
b/drivers/gpu/drm/drm_atomic_helper.c
index ab4032167094..ae3cbfe9e01c 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1878,6 +1878,8 @@ int drm_atomic_helper_setup_commit(struct 
drm_atomic_state *state,
  new_crtc_state->event->base.completion = &commit->flip_done;
  new_crtc_state->event->base.completion_release = 
release_crtc_commit;
  drm_crtc_commit_get(commit);
+
+commit->abort_completion = true;
  }

  for_each_oldnew_connector_in_state(state, conn, old_conn_state, 
new_conn_state, i) {
@@ -3421,8 +3423,21 @@ EXPORT_SYMBOL(drm_atomic_helper_crtc_duplicate_state);
  void __drm_atomic_helper_crtc_destroy_state(struct drm_crtc_state *state)
  {
  if (state->commit) {
+/*
+ * In the event that a non-blocking commit returns
+ * -ERESTARTSYS before the commit_tail work is queued, we will
+ * have an extra reference to the commit object. Release it, if
+ * the event has not been consumed by the worker.
+ *
+ * state->event may be freed, so we can't directly look at
+ * state->event->base.completion.
+ */
+if (state->event && state->commit->abort_completion)
+drm_crtc_commit_put(state->commit);
+
  kfree(state->commit->event);
  state->commit->event = NULL;
+
  drm_crtc_commit_put(state->commit);
  }

diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
index 1c27526c499e..cf13842a6dbd 100644
--- a/include/drm/drm_atomic.h
+++ b/include/drm/drm_atomic.h
@@ -134,6 +134,15 @@ struct drm_crtc_commit {
   * &drm_pending_vblank_event pointer to clean up private events.
   */
  struct drm_pending_vblank_event *event;
+
+/**
+ * @abort_completion:
+ *
+ * A flag that's set after drm_atomic_helper_setup_commit takes a second
+ * reference for the completion of $drm_crtc_state.event. It's used by
+ * the free code to remove the second reference if commit fails.
+ */

Perhaps it's just me, or I'm oversimplifying the problem. I think this would
be easier to understand if we just dropped the additional reference at the point
of failure (ie: in swap_state). That way we don't have to add Yet Another Piece
Of State.


That's assuming nothing can fail between setup_commit() and swap_state() and
also that the driver implementing atomci puts no calls in between. And also
assumes that even setup_commit has proper rollback. I think it's overkill,
and we have no choice but to do it like this. :(



yeah, fair enough.

Reviewed-by: Sean Paul 


~Maarten


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: Non-blocking commits on -ERESTARTSYS

2017-12-14 Thread Leo Li



On 2017-12-13 02:24 PM, Leo Li wrote:



On 2017-12-13 12:23 PM, Maarten Lankhorst wrote:

Op 13-12-17 om 17:19 schreef Leo Li:

Hi Daniel, Maarten,

Just digging an old thread out of the grave:
https://lists.freedesktop.org/archives/dri-devel/2017-August/150495.html

It's suppose to fix a memory leak on the drm_commit object during
non-blocking commits. Within drm_atomic_helper_setup_commit, a reference
to the commit object is obtained by the new_crtc_state. This reference
is suppose to be 'put' once flip_done is signaled (via the
release_crtc_commit callback), but never happens if .prepare_fb returns
-ERESTARTSYS: drm_atomic_helper_commit early returns before the
commit_tail work is queued.

We're starting to bump into this issue again. Regarding Daniel's
suggestion for an IGT test, has there been any work done on it? I'd be
interested in taking a look otherwise. As a side note, I can also
reproduce this on i915.

Thanks,
Leo


I'm curious, isn't it better to handle this in 
__drm_atomic_helper_crtc_destroy_state with the attached patch?


Good point, it seems sane to me. I gave it a spin and it fixes the issue.

I was concerned that it'll contend with the worker thread, possibly
freeing the crtc_commit before the flip is done. It seems the
atomic_state_get before the work is queued will prevent that.

Leo



No idea if sane though, but drivers are supposed to clear 
crtc_state->event on success..


Hi Maarten,

If there are no objections, can I submit a patch with your change below?

Thanks,
Leo



diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
b/drivers/gpu/drm/drm_atomic_helper.c

index 593b30d38ce0..e71233b4c651 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -3435,6 +3435,8 @@ 
EXPORT_SYMBOL(drm_atomic_helper_crtc_duplicate_state);
 void __drm_atomic_helper_crtc_destroy_state(struct drm_crtc_state 
*state)

 {
 if (state->commit) {
+    if (state->event)
+    drm_crtc_commit_put(state->commit);
 kfree(state->commit->event);
 state->commit->event = NULL;
 drm_crtc_commit_put(state->commit);

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Non-blocking commits on -ERESTARTSYS

2017-12-13 Thread Leo Li

Hi Daniel, Maarten,

Just digging an old thread out of the grave:
https://lists.freedesktop.org/archives/dri-devel/2017-August/150495.html

It's suppose to fix a memory leak on the drm_commit object during
non-blocking commits. Within drm_atomic_helper_setup_commit, a reference
to the commit object is obtained by the new_crtc_state. This reference
is suppose to be 'put' once flip_done is signaled (via the
release_crtc_commit callback), but never happens if .prepare_fb returns
-ERESTARTSYS: drm_atomic_helper_commit early returns before the
commit_tail work is queued.

We're starting to bump into this issue again. Regarding Daniel's
suggestion for an IGT test, has there been any work done on it? I'd be
interested in taking a look otherwise. As a side note, I can also
reproduce this on i915.

Thanks,
Leo
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: Non-blocking commits on -ERESTARTSYS

2017-12-13 Thread Leo Li



On 2017-12-13 12:23 PM, Maarten Lankhorst wrote:

Op 13-12-17 om 17:19 schreef Leo Li:

Hi Daniel, Maarten,

Just digging an old thread out of the grave:
https://lists.freedesktop.org/archives/dri-devel/2017-August/150495.html

It's suppose to fix a memory leak on the drm_commit object during
non-blocking commits. Within drm_atomic_helper_setup_commit, a reference
to the commit object is obtained by the new_crtc_state. This reference
is suppose to be 'put' once flip_done is signaled (via the
release_crtc_commit callback), but never happens if .prepare_fb returns
-ERESTARTSYS: drm_atomic_helper_commit early returns before the
commit_tail work is queued.

We're starting to bump into this issue again. Regarding Daniel's
suggestion for an IGT test, has there been any work done on it? I'd be
interested in taking a look otherwise. As a side note, I can also
reproduce this on i915.

Thanks,
Leo


I'm curious, isn't it better to handle this in 
__drm_atomic_helper_crtc_destroy_state with the attached patch?


Good point, it seems sane to me. I gave it a spin and it fixes the issue.

I was concerned that it'll contend with the worker thread, possibly
freeing the crtc_commit before the flip is done. It seems the
atomic_state_get before the work is queued will prevent that.

Leo



No idea if sane though, but drivers are supposed to clear crtc_state->event on 
success..

diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
b/drivers/gpu/drm/drm_atomic_helper.c
index 593b30d38ce0..e71233b4c651 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -3435,6 +3435,8 @@ EXPORT_SYMBOL(drm_atomic_helper_crtc_duplicate_state);
 void __drm_atomic_helper_crtc_destroy_state(struct drm_crtc_state *state)
 {
if (state->commit) {
+   if (state->event)
+   drm_crtc_commit_put(state->commit);
kfree(state->commit->event);
state->commit->event = NULL;
drm_crtc_commit_put(state->commit);

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel