Re: [NOT A REGRESSION] firmware: framebuffer-coreboot: duplicate device name "simple-framebuffer.0"
Hi Javier, On Thu, Sep 12, 2024 at 06:33:58PM +0200, Javier Martinez Canillas wrote: > That's a very good point. I'm actually not familiar with Coreboot and I > used an educated guess (in the case of DT for example, that's the main > source of truth and I didn't know if a Core table was in a similar vein). > > Maybe something like the following (untested) patch then? Julius is more familiar with the Coreboot + payload ecosystem than me, but his explanations make sense to me, as does this patch. > From de1c32017006f4671d91b695f4d6b4e99c073ab2 Mon Sep 17 00:00:00 2001 > From: Javier Martinez Canillas > Date: Thu, 12 Sep 2024 18:31:55 +0200 > Subject: [PATCH] firmware: coreboot: Don't register a pdev if screen_info data > is available > > On Coreboot platforms, a system framebuffer may be provided to the Linux > kernel by filling a LB_TAG_FRAMEBUFFER entry in the Coreboot table. But > a Coreboot payload (e.g: SeaBIOS) could also provide this information to > the Linux kernel. > > If that the case, early arch x86 boot code will fill the global struct > screen_info data and that data used by the Generic System Framebuffers > (sysfb) framework to add a platform device with platform data about the > system framebuffer. Normally, these sorts of "early" and "later" ordering descriptions would set alarm bells when talking about independent drivers. But I suppose the "early arch" code has better ordering guaranteeds than drivers, so this should be fine. > But later then the framebuffer_coreboot driver will try to do the same > framebuffer (using the information from the Coreboot table), which will > lead to an error due a simple-framebuffer.0 device already registered: > > sysfs: cannot create duplicate filename > '/bus/platform/devices/simple-framebuffer.0' > ... > coreboot: could not register framebuffer > framebuffer coreboot8: probe with driver framebuffer failed with error -17 > > To prevent the issue, make the framebuffer_core driver to not register a > platform device if the global struct screen_info data has been filled. > > Reported-by: Brian Norris > Link: > https://lore.kernel.org/all/ZuCG-DggNThuF4pj@b20ea791c01f/T/#ma7fb65acbc1a56042258adac910992bb225a20d2 > Suggested-by: Julius Werner > Signed-off-by: Javier Martinez Canillas > --- > drivers/firmware/google/framebuffer-coreboot.c | 16 > 1 file changed, 16 insertions(+) > > diff --git a/drivers/firmware/google/framebuffer-coreboot.c > b/drivers/firmware/google/framebuffer-coreboot.c > index daadd71d8ddd..4e50da17cd7e 100644 > --- a/drivers/firmware/google/framebuffer-coreboot.c > +++ b/drivers/firmware/google/framebuffer-coreboot.c > @@ -15,6 +15,7 @@ > #include > #include > #include > +#include > > #include "coreboot_table.h" > > @@ -27,6 +28,7 @@ static int framebuffer_probe(struct coreboot_device *dev) > int i; > u32 length; > struct lb_framebuffer *fb = &dev->framebuffer; > + struct screen_info *si = &screen_info; > struct platform_device *pdev; > struct resource res; > struct simplefb_platform_data pdata = { > @@ -36,6 +38,20 @@ static int framebuffer_probe(struct coreboot_device *dev) > .format = NULL, > }; > > + /* > + * If the global screen_info data has been filled, the Generic > + * System Framebuffers (sysfb) will already register a platform Did you mean 'platform_device'? > + * and pass the screen_info as platform_data to a driver that > + * could scan-out using the system provided framebuffer. > + * > + * On Coreboot systems, the advertise LB_TAG_FRAMEBUFFER entry s/advertise/advertised/ ? > + * in the Coreboot table should only be used if the payload did > + * not set video mode info and passed it to the Linux kernel. s/passed/pass/ > + */ > + if (si->orig_video_isVGA == VIDEO_TYPE_VLFB || > +si->orig_video_isVGA == VIDEO_TYPE_EFI) This line is using spaces for indentation. It should use a tab, and then spaces for alignment. But presumably this will change based on Thomas's suggestions anyway. > + return -EINVAL; Is EINVAL right? IIUC, that will print a noisier error to the logs. I believe the "expected" sorts of return codes are ENODEV or ENXIO. (See call_driver_probe().) ENODEV seems like a fine choice, similar to several of the other return codes already used here. Anyway, this seems along the right track. Thanks for tackling, and feel free to carry a: Reviewed-by: Brian Norris > + > if (!fb->physical_address) > return -ENODEV; > > -- > Best regards, > > Javier Martinez Canillas > Core Platforms > Red Hat >
[NOT A REGRESSION] firmware: framebuffer-coreboot: duplicate device name "simple-framebuffer.0"
(Tweaking subject; this indeed isn't related to the regression at all) Hi, On Mon, Sep 09, 2024 at 10:02:00AM +0200, Borislav Petkov wrote: > Looking at your log, the first warn is in framebuffer_coreboot. Some mess in > the sysfs platform devices registration. > > Adding the relevant people for that: > > Aug 20 20:29:36 luna kernel: sysfs: cannot create duplicate filename > '/bus/platform/devices/simple-framebuffer.0' > Aug 20 20:29:36 luna kernel: CPU: 5 PID: 571 Comm: (udev-worker) Tainted: G > OE 6.10.6-arch1-1 #1 703d152c24f1971e36f16e505405e456fc9e23f8 > Aug 20 20:29:36 luna kernel: Hardware name: Purism Librem 14/Librem 14, BIOS > 4.14-Purism-1 06/18/2021 > Aug 20 20:29:36 luna kernel: Call Trace: > Aug 20 20:29:36 luna kernel: > Aug 20 20:29:36 luna kernel: dump_stack_lvl+0x5d/0x80 > Aug 20 20:29:36 luna kernel: sysfs_warn_dup.cold+0x17/0x23 > Aug 20 20:29:36 luna kernel: sysfs_do_create_link_sd+0xcf/0xe0 > Aug 20 20:29:36 luna kernel: bus_add_device+0x6b/0x130 > Aug 20 20:29:36 luna kernel: device_add+0x3b3/0x870 > Aug 20 20:29:36 luna kernel: platform_device_add+0xed/0x250 > Aug 20 20:29:36 luna kernel: platform_device_register_full+0xbb/0x140 > Aug 20 20:29:36 luna kernel: > platform_device_register_resndata.constprop.0+0x54/0x80 [framebuffer_coreboot > a587d2fc243ebaa0205c3badd33442a004d284e0] > Aug 20 20:29:36 luna kernel: framebuffer_probe+0x165/0x1b0 > [framebuffer_coreboot a587d2fc243ebaa0205c3badd33442a004d284e0] > Aug 20 20:29:36 luna kernel: really_probe+0xdb/0x340 > Aug 20 20:29:36 luna kernel: ? pm_runtime_barrier+0x54/0x90 > Aug 20 20:29:36 luna kernel: ? __pfx___driver_attach+0x10/0x10 > Aug 20 20:29:36 luna kernel: __driver_probe_device+0x78/0x110 > Aug 20 20:29:36 luna kernel: driver_probe_device+0x1f/0xa0 > Aug 20 20:29:36 luna kernel: __driver_attach+0xba/0x1c0 > Aug 20 20:29:36 luna kernel: bus_for_each_dev+0x8c/0xe0 > Aug 20 20:29:36 luna kernel: bus_add_driver+0x112/0x1f0 > Aug 20 20:29:36 luna kernel: driver_register+0x72/0xd0 > Aug 20 20:29:36 luna kernel: ? __pfx_framebuffer_driver_init+0x10/0x10 > [framebuffer_coreboot a587d2fc243ebaa0205c3badd33442a004d284e0] > Aug 20 20:29:36 luna kernel: do_one_initcall+0x58/0x310 > Aug 20 20:29:36 luna kernel: do_init_module+0x60/0x220 > Aug 20 20:29:36 luna kernel: init_module_from_file+0x89/0xe0 > Aug 20 20:29:36 luna kernel: idempotent_init_module+0x121/0x320 > Aug 20 20:29:36 luna kernel: __x64_sys_finit_module+0x5e/0xb0 > Aug 20 20:29:36 luna kernel: do_syscall_64+0x82/0x190 > Aug 20 20:29:36 luna kernel: ? __do_sys_newfstatat+0x3c/0x80 > Aug 20 20:29:36 luna kernel: ? syscall_exit_to_user_mode+0x72/0x200 > Aug 20 20:29:36 luna kernel: ? do_syscall_64+0x8e/0x190 > Aug 20 20:29:36 luna kernel: ? do_sys_openat2+0x9c/0xe0 > Aug 20 20:29:36 luna kernel: ? syscall_exit_to_user_mode+0x72/0x200 > Aug 20 20:29:36 luna kernel: ? do_syscall_64+0x8e/0x190 > Aug 20 20:29:36 luna kernel: ? clear_bhb_loop+0x25/0x80 > Aug 20 20:29:36 luna kernel: ? clear_bhb_loop+0x25/0x80 > Aug 20 20:29:36 luna kernel: ? clear_bhb_loop+0x25/0x80 > Aug 20 20:29:36 luna kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e > Aug 20 20:29:36 luna kernel: RIP: 0033:0x7b1bee2f81fd Looks like it might be a conflict with drivers/firmware/sysfb_simplefb.c, which also uses the "simple-framebuffer" name with a constant ID of 0. It's possible both drivers should be switched to use PLATFORM_DEVID_AUTO? Or at least one of them. Or they should use different base names. I'm not really sure what the best option is (does anyone rely on or care about the device naming?), and I don't actually use this driver. But here's an untested diff to try if you'd really like. If you test it, feel free to submit as a proper patch with my: Signed-off-by: Brian Norris diff --git a/drivers/firmware/google/framebuffer-coreboot.c b/drivers/firmware/google/framebuffer-coreboot.c index daadd71d8ddd..3f1b8f664c3f 100644 --- a/drivers/firmware/google/framebuffer-coreboot.c +++ b/drivers/firmware/google/framebuffer-coreboot.c @@ -62,7 +62,8 @@ static int framebuffer_probe(struct coreboot_device *dev) return -EINVAL; pdev = platform_device_register_resndata(&dev->dev, -"simple-framebuffer", 0, +"simple-framebuffer", +PLATFORM_DEVID_AUTO, &res, 1, &pdata, sizeof(pdata)); if (IS_ERR(pdev))
Re: [PATCH 1/1] drm/bridge: analogix_dp: add a quirk for Bob panel
Hi, You seem to have sent this twice, perhaps to adjust the To/CC list. I think I've picked the right one to reply to, but it's usually nice to use a "v2" notation or otherwise put a comment somewhere in the email. On Wed, Feb 08, 2023 at 12:44:06PM +0800, Kencp huang wrote: > From: zain wang > > Some panels' DP_PSR_STATUS (DPCD 0x2008) may be unstable when we > enable psr. If we get the unexpected psr state, We need try the > debounce to ensure the panel was in PSR > > Signed-off-by: zain wang > Signed-off-by: Chris Zhong > Commit-Ready: Kristian H. Kristensen 'Commit-Ready' isn't something that makes sense for upstream Linux. The other 'Tested-by' and 'Reviewed-by' *might* make sense to carry forward, even though these were from the Chromium Gerrit instance, but they also applied to a very old and different version of this patch, so probably not. I'd suggest starting over with only the Signed-off-by tags. > Tested-by: Kristian H. Kristensen > Reviewed-by: Kristian H. Kristensen > Tested-by: Kencp huang > Signed-off-by: Kencp huang > --- > .../gpu/drm/bridge/analogix/analogix_dp_reg.c | 71 +++ > 1 file changed, 42 insertions(+), 29 deletions(-) > > diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c > b/drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c > index 6a4f20fccf84..7b6e3f8f85b0 100644 > --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c > +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c > @@ -935,25 +935,54 @@ void analogix_dp_enable_psr_crc(struct > analogix_dp_device *dp) > writel(PSR_VID_CRC_ENABLE, dp->reg_base + ANALOGIX_DP_CRC_CON); > } > > -static ssize_t analogix_dp_get_psr_status(struct analogix_dp_device *dp) > +static int analogix_dp_get_psr_status(struct analogix_dp_device *dp, Probably could be called 'analogix_dp_wait_psr_status()', since it's no longer just a "getter" function. > + int status) This would probably make more sense as a 'bool psr_active' or some similar naming, since it doesn't really represent a "status" field now, but more of a "am I entering or exiting PSR?" parameter. > { > ssize_t val; > - u8 status; > + u8 reg, store = 0; > + int cnt = 0; > + > + /* About 3ms for a loop */ The commit description explains why you need this polling/debounce loop, but it's good to document such artifacts in the code too, when they're as strange as this one. Perhaps a short explanation about the "bouncing" behavior of some panels here? "Some panels' DP_PSR_STATUS register is unstable when entering PSR." Also, I already had a (pre mailing list) question about why this doesn't use readx_poll_timeout(), so I'll repeat for the record one reason not to: it's difficult to handle the stateful debouncing aspect with that macro, and keep the 'store' variable around. > + while (cnt < 100) { > + /* Read operation would takes 900us */ > + val = drm_dp_dpcd_readb(&dp->aux, DP_PSR_STATUS, ®); > + if (val < 0) { > + dev_err(dp->dev, "PSR_STATUS read failed ret=%zd", val); > + return val; > + } > + > + /* > + * Ensure the PSR_STATE should go to DP_PSR_SINK_ACTIVE_RFB > + * from DP_PSR_SINK_ACTIVE_SINK_SYNCED or > + * DP_PSR_SINK_ACTIVE_SRC_SYNCED. > + * Otherwise, if we get DP_PSR_SINK_ACTIVE_RFB twice in > + * succession, it show the Panel is stable PSR enabled state. > + */ > + if (status == DP_PSR_SINK_ACTIVE_RFB) { > + if ((reg == DP_PSR_SINK_ACTIVE_RFB) && > + ((store == DP_PSR_SINK_ACTIVE_SINK_SYNCED) || > + (store == DP_PSR_SINK_ACTIVE_SRC_SYNCED) || > + (store == DP_PSR_SINK_ACTIVE_RFB))) > + return 0; > + else > + store = reg; > + } else { You dropped the ACTIVE_RESYNC and INACTIVE comments from below. Those probably should move here. With those fixed, I think this would be fine: Reviewed-by: Brian Norris > + if ((reg == DP_PSR_SINK_INACTIVE) || > + (reg == DP_PSR_SINK_ACTIVE_RESYNC)) > + return 0; > + } > > - val = drm_dp_dpcd_readb(&dp->aux, DP_PSR_STATUS, &status); > - if (val < 0) { > - dev_err(dp->dev, "PSR_STATUS read failed ret=%z
Re: [PATCH] drm: bridge: Use devm_platform_get_and_ioremap_resource()
On Thu, Jan 19, 2023 at 03:59:01PM +0800, ye.xingc...@zte.com.cn wrote: > From: ye xingchen > > Convert platform_get_resource(), devm_ioremap_resource() to a single > call to devm_platform_get_and_ioremap_resource(), as this is exactly > what this function does. > > Signed-off-by: ye xingchen > --- > drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 5 + > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c > b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c > index df9370e0ff23..50f092b316d0 100644 > --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c > +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c > @@ -1686,7 +1686,6 @@ analogix_dp_probe(struct device *dev, struct > analogix_dp_plat_data *plat_data) > { > struct platform_device *pdev = to_platform_device(dev); > struct analogix_dp_device *dp; > - struct resource *res; > unsigned int irq_flags; > int ret; > > @@ -1740,9 +1739,7 @@ analogix_dp_probe(struct device *dev, struct > analogix_dp_plat_data *plat_data) > > clk_prepare_enable(dp->clock); > > - res = platform_get_resource(pdev, IORESOURCE_MEM, 0); > - > - dp->reg_base = devm_ioremap_resource(&pdev->dev, res); > + dp->reg_base = devm_platform_get_and_ioremap_resource(pdev, 0, NULL); Rather than a NULL 3rd argument, couldn't you just use devm_platform_ioremap_resource()? With that: Reviewed-by: Brian Norris > if (IS_ERR(dp->reg_base)) { > ret = PTR_ERR(dp->reg_base); > goto err_disable_clk; > -- > 2.25.1
[PATCH v3 2/2] drm/rockchip: vop: Leave vblank enabled in self-refresh
If we disable vblank when entering self-refresh, vblank APIs (like DRM_IOCTL_WAIT_VBLANK) no longer work. But user space is not aware when we enter self-refresh, so this appears to be an API violation -- that DRM_IOCTL_WAIT_VBLANK fails with EINVAL whenever the display is idle and enters self-refresh. The downstream driver used by many of these systems never used to disable vblank for PSR, and in fact, even upstream, we didn't do that until radically redesigning the state machine in commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR"). Thus, it seems like a reasonable API fix to simply restore that behavior, and leave vblank enabled. Note that this appears to potentially unbalance the drm_crtc_vblank_{off,on}() calls in some cases, but: (a) drm_crtc_vblank_on() documents this as OK and (b) if I do the naive balancing, I find state machine issues such that we're not in sync properly; so it's easier to take advantage of (a). This issue was exposed by IGT's kms_vblank tests, and reported by KernelCI. The bug has been around a while (longer than KernelCI noticed), but was only exposed once self-refresh was bugfixed more recently, and so KernelCI could properly test it. Some other notes in: https://lore.kernel.org/dri-devel/y6ocg9bpnjvim...@google.com/ Re: renesas/master bisection: igt-kms-rockchip.kms_vblank.pipe-A-wait-forked on rk3399-gru-kevin == Backporting notes: == Marking as 'Fixes' commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR"), but it probably depends on commit bed030a49f3e ("drm/rockchip: Don't fully disable vop on self refresh") as well. We also need the previous patch ("drm/atomic: Allow vblank-enabled + self-refresh "disable""), of course. v3: * no update v2: * skip unnecessary lock/unlock Fixes: 6c836d965bad ("drm/rockchip: Use the helpers for PSR") Cc: Link: https://lore.kernel.org/dri-devel/y5itf0+yniqa6...@sirena.org.uk/ Reported-by: "kernelci.org bot" Signed-off-by: Brian Norris --- drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c index fa1f4ee6d195..9fea03121247 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c @@ -717,13 +717,13 @@ static void vop_crtc_atomic_disable(struct drm_crtc *crtc, if (crtc->state->self_refresh_active) rockchip_drm_set_win_enabled(crtc, false); + if (crtc->state->self_refresh_active) + goto out; + mutex_lock(&vop->vop_lock); drm_crtc_vblank_off(crtc); - if (crtc->state->self_refresh_active) - goto out; - /* * Vop standby will take effect at end of current frame, * if dsp hold valid irq happen, it means standby complete. @@ -757,9 +757,9 @@ static void vop_crtc_atomic_disable(struct drm_crtc *crtc, vop_core_clks_disable(vop); pm_runtime_put(vop->dev); -out: mutex_unlock(&vop->vop_lock); +out: if (crtc->state->event && !crtc->state->active) { spin_lock_irq(&crtc->dev->event_lock); drm_crtc_send_vblank_event(crtc, crtc->state->event); -- 2.39.0.314.g84b9a713c41-goog
[PATCH v3 1/2] drm/atomic: Allow vblank-enabled + self-refresh "disable"
The self-refresh helper framework overloads "disable" to sometimes mean "go into self-refresh mode," and this mode activates automatically (e.g., after some period of unchanging display output). In such cases, the display pipe is still considered "on", and user-space is not aware that we went into self-refresh mode. Thus, users may expect that vblank-related features (such as DRM_IOCTL_WAIT_VBLANK) still work properly. However, we trigger the WARN_ONCE() here if a CRTC driver tries to leave vblank enabled. Add a different expectation: that CRTCs *should* leave vblank enabled when going into self-refresh. This patch is preparation for another patch -- "drm/rockchip: vop: Leave vblank enabled in self-refresh" -- which resolves conflicts between the above self-refresh behavior and the API tests in IGT's kms_vblank test module. == Some alternatives discussed: == It's likely that on many display controllers, vblank interrupts will turn off when the CRTC is disabled, and so in some cases, self-refresh may not support vblank. To support such cases, we might consider additions to the generic helpers such that we fire vblank events based on a timer. However, there is currently only one driver using the common self-refresh helpers (i.e., rockchip), and at least as of commit bed030a49f3e ("drm/rockchip: Don't fully disable vop on self refresh"), the CRTC hardware is powered enough to continue to generate vblank interrupts. So we chose the simpler option of leaving vblank interrupts enabled. We can reevaluate this decision and perhaps augment the helpers if/when we gain a second driver that has different requirements. v3: * include discussion summary v2: * add 'ret != 0' warning case for self-refresh * describe failing test case and relation to drm/rockchip patch better Cc: # dependency for "drm/rockchip: vop: Leave # vblank enabled in self-refresh" Signed-off-by: Brian Norris --- drivers/gpu/drm/drm_atomic_helper.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index d579fd8f7cb8..a22485e3e924 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -1209,7 +1209,16 @@ disable_outputs(struct drm_device *dev, struct drm_atomic_state *old_state) continue; ret = drm_crtc_vblank_get(crtc); - WARN_ONCE(ret != -EINVAL, "driver forgot to call drm_crtc_vblank_off()\n"); + /* +* Self-refresh is not a true "disable"; ensure vblank remains +* enabled. +*/ + if (new_crtc_state->self_refresh_active) + WARN_ONCE(ret != 0, + "driver disabled vblank in self-refresh\n"); + else + WARN_ONCE(ret != -EINVAL, + "driver forgot to call drm_crtc_vblank_off()\n"); if (ret == 0) drm_crtc_vblank_put(crtc); } -- 2.39.0.314.g84b9a713c41-goog
Re: [PATCH v2 1/2] drm/atomic: Allow vblank-enabled + self-refresh "disable"
On Fri, Jan 6, 2023 at 5:23 PM Brian Norris wrote: > v2: > * add 'ret != 0' warning case for self-refresh > * describe failing test case and relation to drm/rockchip patch better Ugh, there's always something you remember right after you hit send: I forgot to better summarize some of the other discussion from v1, and alternatives we didn't entertain. I'll write that up now (not sure whether in patch 1 or 2) and plan on sending a v3 for next week, in case there are any other comments I should address at the same time. Sorry for the noise, Brian
[PATCH v2 2/2] drm/rockchip: vop: Leave vblank enabled in self-refresh
If we disable vblank when entering self-refresh, vblank APIs (like DRM_IOCTL_WAIT_VBLANK) no longer work. But user space is not aware when we enter self-refresh, so this appears to be an API violation -- that DRM_IOCTL_WAIT_VBLANK fails with EINVAL whenever the display is idle and enters self-refresh. The downstream driver used by many of these systems never used to disable vblank for PSR, and in fact, even upstream, we didn't do that until radically redesigning the state machine in commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR"). Thus, it seems like a reasonable API fix to simply restore that behavior, and leave vblank enabled. Note that this appears to potentially unbalance the drm_crtc_vblank_{off,on}() calls in some cases, but: (a) drm_crtc_vblank_on() documents this as OK and (b) if I do the naive balancing, I find state machine issues such that we're not in sync properly; so it's easier to take advantage of (a). This issue was exposed by IGT's kms_vblank tests, and reported by KernelCI. Backporting notes: Marking as 'Fixes' commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR"), but it probably depends on commit bed030a49f3e ("drm/rockchip: Don't fully disable vop on self refresh") as well. We also need the previous patch ("drm/atomic: Allow vblank-enabled + self-refresh "disable""), of course. v2: * skip unnecessary lock/unlock Fixes: 6c836d965bad ("drm/rockchip: Use the helpers for PSR") Cc: Link: https://lore.kernel.org/dri-devel/y5itf0+yniqa6...@sirena.org.uk/ Reported-by: "kernelci.org bot" Signed-off-by: Brian Norris --- drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c index fa1f4ee6d195..9fea03121247 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c @@ -717,13 +717,13 @@ static void vop_crtc_atomic_disable(struct drm_crtc *crtc, if (crtc->state->self_refresh_active) rockchip_drm_set_win_enabled(crtc, false); + if (crtc->state->self_refresh_active) + goto out; + mutex_lock(&vop->vop_lock); drm_crtc_vblank_off(crtc); - if (crtc->state->self_refresh_active) - goto out; - /* * Vop standby will take effect at end of current frame, * if dsp hold valid irq happen, it means standby complete. @@ -757,9 +757,9 @@ static void vop_crtc_atomic_disable(struct drm_crtc *crtc, vop_core_clks_disable(vop); pm_runtime_put(vop->dev); -out: mutex_unlock(&vop->vop_lock); +out: if (crtc->state->event && !crtc->state->active) { spin_lock_irq(&crtc->dev->event_lock); drm_crtc_send_vblank_event(crtc, crtc->state->event); -- 2.39.0.314.g84b9a713c41-goog
[PATCH v2 1/2] drm/atomic: Allow vblank-enabled + self-refresh "disable"
The self-refresh helper framework overloads "disable" to sometimes mean "go into self-refresh mode," and this mode activates automatically (e.g., after some period of unchanging display output). In such cases, the display pipe is still considered "on", and user-space is not aware that we went into self-refresh mode. Thus, users may expect that vblank-related features (such as DRM_IOCTL_WAIT_VBLANK) still work properly. However, we trigger the WARN_ONCE() here if a CRTC driver tries to leave vblank enabled. Add a different expectation: that CRTCs *should* leave vblank enabled when going into self-refresh. This patch is preparation for another patch -- "drm/rockchip: vop: Leave vblank enabled in self-refresh" -- which resolves conflicts between the above self-refresh behavior and the API tests in IGT's kms_vblank test module. v2: * add 'ret != 0' warning case for self-refresh * describe failing test case and relation to drm/rockchip patch better Cc: # dependency for "drm/rockchip: vop: Leave # vblank enabled in self-refresh" Signed-off-by: Brian Norris --- drivers/gpu/drm/drm_atomic_helper.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index d579fd8f7cb8..a22485e3e924 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -1209,7 +1209,16 @@ disable_outputs(struct drm_device *dev, struct drm_atomic_state *old_state) continue; ret = drm_crtc_vblank_get(crtc); - WARN_ONCE(ret != -EINVAL, "driver forgot to call drm_crtc_vblank_off()\n"); + /* +* Self-refresh is not a true "disable"; ensure vblank remains +* enabled. +*/ + if (new_crtc_state->self_refresh_active) + WARN_ONCE(ret != 0, + "driver disabled vblank in self-refresh\n"); + else + WARN_ONCE(ret != -EINVAL, + "driver forgot to call drm_crtc_vblank_off()\n"); if (ret == 0) drm_crtc_vblank_put(crtc); } -- 2.39.0.314.g84b9a713c41-goog
Re: [PATCH 2/2] drm/rockchip: vop: Leave vblank enabled in self-refresh
On Fri, Jan 06, 2023 at 12:42:54PM +0100, Michel Dänzer wrote: > On 1/6/23 02:40, Brian Norris wrote: > > --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > > +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c > > @@ -719,11 +719,11 @@ static void vop_crtc_atomic_disable(struct drm_crtc > > *crtc, > > > > mutex_lock(&vop->vop_lock); > > > > - drm_crtc_vblank_off(crtc); > > - > > if (crtc->state->self_refresh_active) > > goto out; > > > > + drm_crtc_vblank_off(crtc); > > + > > /* > > * Vop standby will take effect at end of current frame, > > * if dsp hold valid irq happen, it means standby complete. > > The out label immediately unlocks vop->vop_lock again, seems a bit pointless. > :) Oops, of course. Will change that in v2. > AFAICT the self_refresh_active case should just return immediately, > the out label would also send any pending atomic commit completion > event, which seems wrong now that the vblank interrupt is left > enabled. If I return immediately and drop that completion, I hit a warning: [4.423876] WARNING: CPU: 5 PID: 58 at drivers/gpu/drm/drm_atomic_helper.c:2460 drm_atomic_helper_commit_hw_done+0xe0/0xe8 ... [4.424036] Call trace: [4.424039] drm_atomic_helper_commit_hw_done+0xe0/0xe8 [4.424045] drm_atomic_helper_commit_tail_rpm+0x58/0x7c ... I plan to leave it as-is for v2. > (I also wonder if drm_crtc_vblank_off doesn't take care of > that already, at least amdgpu doesn't do this explicitly AFAICT) I'm not sure. Brian
Re: [PATCH 1/2] drm/atomic: Allow vblank-enabled + self-refresh "disable"
On Fri, Jan 06, 2023 at 09:30:56PM +0100, Daniel Vetter wrote: > On Fri, Jan 06, 2023 at 11:33:06AM -0800, Brian Norris wrote: > > On Fri, Jan 06, 2023 at 07:17:53PM +0100, Daniel Vetter wrote: > > > - check that drivers which use self_refresh are not using > > > drm_atomic_helper_wait_for_vblanks(), because that would defeat the > > > point > > > > I'm a bit lost on this one. drm_atomic_helper_wait_for_vblanks() is part > > of the common drm_atomic_helper_commit_tail*() helpers, and so it's > > naturally used in many cases (including Rockchip/PSR). And how does it > > defeat the point? > > Yeah, but that's for backwards compat reasons, the much better function is > drm_atomic_helper_wait_for_flip_done(). And if you go into self refresh > that's really the better one. My knowledge is certainly going to diminish once you talk about backwards compat for drivers I'm very unfamiliar with. Are you suggesting I can change the default drm_atomic_helper_commit_tail{,_rpm}() to use drm_atomic_helper_wait_for_flip_done()? (Or, just when self_refresh_active==true?) I can try to read through drivers for compatibility, but I may be prone to breaking things. Otherwise, I'd be copy/paste/modifying the generic commit helpers. > > > - have a drm_crtc_vblank_off/on which take the crtc state, so they can > > > look at the self-refresh state > > > > And I suppose you mean this helper variant would kick off the next step > > (fake vblank timer)? > > Yeah, I figured that's the better way to implement this since it would be > driver agnostic. But rockchip is still the only driver using the > self-refresh helpers, so I guess it doesn't really matter. I've run into enough gotchas with these helpers that I feel like it might be difficult to ever get a second driver using this. Or at least, we'd have to learn what requirements they have when we get there. (Well, maybe you know better, but I certainly don't.) I'm tempted to just go with what's the simplest for Rockchip now, and look at some generic timer fallbacks later if the need arises. > > Also, I still haven't found that fake timer machinery, but maybe I just > > don't know what I'm looking for. > > I ... didn't find it either. I'm honestly not sure whether this works for > intel, or whether we do something silly like disable self-refresh when a > vblank interrupt is pending :-/ Nice to know I'm not the only one, I suppose :) > I think new proposal from me is to just respin this patch here with our > discussion all summarized (it's good to record this stuff for the next > person that comes around), and the WARN_ON adjusted so it also checks that > vblank interrupts keep working (per the ret value at least, it's not a > real functional check). And call that good enough. Sounds good. I'll try to summarize without immortalizing too much of my ignorance ;) And thanks for your thoughts. > Also maybe look into switching from wait_for_vblanks to > wait_for_flip_done, it's the right thing to do (see kerneldoc, it should > explain things a bit). I've read some, and will probably reread a few more times. And I left one question above. Brian
Re: [PATCH 1/2] drm/atomic: Allow vblank-enabled + self-refresh "disable"
On Fri, Jan 06, 2023 at 07:17:53PM +0100, Daniel Vetter wrote: > Ok I think I was a bit slow here, and it makes sense. Except this now > means we loose this check, and I'm also not sure whether we really want > drivers to implement this all. > > What I think we want here is a bit more: > - for the self-refresh case check that the vblank all still works You mean, keep the WARN_ONCE(), but invert it to ensure that 'ret == 0'? I did consider that, but I don't know why I stopped. > - check that drivers which use self_refresh are not using > drm_atomic_helper_wait_for_vblanks(), because that would defeat the > point I'm a bit lost on this one. drm_atomic_helper_wait_for_vblanks() is part of the common drm_atomic_helper_commit_tail*() helpers, and so it's naturally used in many cases (including Rockchip/PSR). And how does it defeat the point? > - have a drm_crtc_vblank_off/on which take the crtc state, so they can > look at the self-refresh state And I suppose you mean this helper variant would kick off the next step (fake vblank timer)? > - fake vblanks with hrtimer, because on most hw when you turn off the crtc > the vblanks are also turned off, and so your compositor would still > hang. The vblank machinery already has all the code to make this happen > (and if it's not all, then i915 psr code should have it). Is a timer better than an interrupt? I'm pretty sure the vblank interrupts still can fire on Rockchip CRTC (VOP) (see also the other branch of this thread), so this isn't really necessary. (IGT vblank tests pass without hanging.) Unless you simply prefer a fake timer for some reason. Also, I still haven't found that fake timer machinery, but maybe I just don't know what I'm looking for. > - I think kunit tests for this all would be really good, it's a rather > complex state machinery between modesets and vblank functionality. You > can speed up the kunit tests with some really high refresh rate, which > isn't possible on real hw. Last time I tried my hand at kunit in a subsystem with no prior kunit tests, I had a miserable time and gave up. At least DRM has a few already, so maybe this wouldn't be as terrible. Perhaps I can give this a shot, but there's a chance this will kick things to the back burner far enough that I simply don't get around to it at all. (So far, I'm only addressing this because KernelCI complained.) > I'm also wondering why we've had this code for years and only hit issues > now? I'd guess a few reasons: 1. drm_self_refresh_helper_init() is only used by one driver -- Rockchip 2. Rockchip systems are most commonly either Chromebooks, or else otherwise cheap embedded things, and may not have displays at all, let alone displays with PSR 3. Rockchip Chromebooks shipped with a kernel forked off of the earlier PSR support, before everything got refactored (and vblank handling regressed) for the self-refresh "helpers". They only upgraded to a newer upstream kernel within the last few months. 4. AFAICT, ChromeOS user space doesn't even exercise the vblank-related ioctls, so we don't actually notice that this is "broken". I suppose it would only be IGT tests that notice. 5. I fixed up various upstream PSR bugs are part of #3 [0], along the way I unborked PSR enough that KernelCI finally caught the bug. See my explanation in [1] for why the vblank bug was masked, and appeared to be a "regression" due to my more recent fixes. Brian [0] Combined with point #2: ChromeOS would be the first serious users of the refactored PSR support. All this was needed to make it actually usable: (2021) c4c6ef229593 drm/bridge: analogix_dp: Make PSR-exit block less (2022) ca871659ec16 drm/bridge: analogix_dp: Support PSR-exit to disable transition <--- KernelCI "blamed" this one, because PSR was less broken (2022) e54a4424925a drm/atomic: Force bridge self-refresh-exit on CRTC switch [1] https://lore.kernel.org/dri-devel/y6ocg9bpnjvim...@google.com/ Re: renesas/master bisection: igt-kms-rockchip.kms_vblank.pipe-A-wait-forked on rk3399-gru-kevin
Re: [PATCH 1/2] drm/atomic: Allow vblank-enabled + self-refresh "disable"
On Fri, Jan 06, 2023 at 07:20:40PM +0100, Daniel Vetter wrote: > On Fri, Jan 06, 2023 at 10:08:53AM -0800, Brian Norris wrote: > > OK! Then that sounds like it at least ACKs my general idea for this > > series. (Michel and I poked at a few ideas in the thread at [1] and > > landed on approx. this solution, or else a fake/timer like you suggest.) > > Yeah once I stopped looking at this the wrong way round it does make sense > what you're doing. See my other reply, I think with just this series here > the vblanks still stall out? Or does your hw actually keep generating > vblank irq with the display off? I might not be understanding all of the IGT tests that I've run through, but the display is not actually off -- it's sitting on a still frame. But yes, IRQs still come, and I see no hangs. This is also ref'd in patch 2: bed030a49f3e drm/rockchip: Don't fully disable vop on self refresh So, we're not even doing that much to power-down the CRTC/VOP. That's probably why IRQs are still active, contrary to your expectation? NB: this is how Rockchip Chromebooks implemented PSR from essentially day 1. > > On Fri, Jan 06, 2023 at 06:53:49PM +0100, Daniel Vetter wrote: > > > We might need a few more helpers. Also, probably more igt, or is this > > > something igt testing has uncovered? If so, please cite the igt testcase > > > which hits this. > > > > The current patch only fixes a warning that comes when I try to do the > > second patch. The second patch is a direct product of an IGT test > > failure (a few of kms_vblank's subtests), and I linked [1] the KernelCI > > report there. > > Ah yeah that makes sense. Would be good to cite that in this patch too, > because I guess the same kms_vblank tests can also hit this warning here? Well, before this series: no, those tests don't hit this warning. The warning is only uncovered after patch 2. If I do just patch 2, it's super-trivial to hit the warning. You just have to turn the display on and go idle long enough for PSR to activate once. I suppose that can count as "caught by a test", with a little stretch of the imagination. At any rate, I'll improve this description to refer precisely to the "next patch" (as Greg suggested already), and that might involve describing this test case a little. Brian
Re: [PATCH 1/2] drm/atomic: Allow vblank-enabled + self-refresh "disable"
Hi Daniel, On Fri, Jan 06, 2023 at 06:53:49PM +0100, Daniel Vetter wrote: > On Thu, Jan 05, 2023 at 05:40:17PM -0800, Brian Norris wrote: > > The self-refresh helper framework overloads "disable" to sometimes mean > > "go into self-refresh mode," and this mode activates automatically > > (e.g., after some period of unchanging display output). In such cases, > > the display pipe is still considered "on", and user-space is not aware > > that we went into self-refresh mode. Thus, users may expect that > > vblank-related features (such as DRM_IOCTL_WAIT_VBLANK) still work > > properly. > > > > However, we trigger the WARN_ONCE() here if a CRTC driver tries to leave > > vblank enabled here. > > > > Add a new exception, such that we allow CRTCs to be "disabled" (with > > self-refresh active) with vblank interrupts still enabled. > > > > Cc: # dependency for subsequent patch > > Signed-off-by: Brian Norris > > --- > > > > drivers/gpu/drm/drm_atomic_helper.c | 6 ++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c > > b/drivers/gpu/drm/drm_atomic_helper.c > > index d579fd8f7cb8..7b5eddadebd5 100644 > > --- a/drivers/gpu/drm/drm_atomic_helper.c > > +++ b/drivers/gpu/drm/drm_atomic_helper.c > > @@ -1207,6 +1207,12 @@ disable_outputs(struct drm_device *dev, struct > > drm_atomic_state *old_state) > > > > if (!drm_dev_has_vblank(dev)) > > continue; > > + /* > > +* Self-refresh is not a true "disable"; let vblank remain > > +* enabled. > > +*/ > > + if (new_crtc_state->self_refresh_active) > > + continue; > > This very fishy, because we check in crtc_needs_disable whether this > output should stay on due to self-refresh. Which means you should never > end up in here. That's not what crtc_needs_disable() does w.r.t. self-refresh. In fact, it's the opposite; see, for example, the |new_state->self_refresh_active| clause. That clause means that if we're entering self-refresh, we *intend* to disable (i.e., we return 'true'). That's because like I mention above, the self-refresh helpers overload what "disable" means. I'll also add my caveat again that I'm a bit new to DRM, so feel free to continue to correct me if I'm wrong :) Or perhaps Sean Paul could provide second opinions, as I believe he wrote this stuff. > And yes vblank better work in self refresh :-) If it doesn't, then you > need to fake it with a timer, that's at least what i915 has done for > transparent self-refresh. OK! Then that sounds like it at least ACKs my general idea for this series. (Michel and I poked at a few ideas in the thread at [1] and landed on approx. this solution, or else a fake/timer like you suggest.) > We might need a few more helpers. Also, probably more igt, or is this > something igt testing has uncovered? If so, please cite the igt testcase > which hits this. The current patch only fixes a warning that comes when I try to do the second patch. The second patch is a direct product of an IGT test failure (a few of kms_vblank's subtests), and I linked [1] the KernelCI report there. Brian [1] Link: https://lore.kernel.org/dri-devel/y5itf0+yniqa6...@sirena.org.uk/ Reported-by: "kernelci.org bot"
Re: renesas/master bisection: igt-kms-rockchip.kms_vblank.pipe-A-wait-forked on rk3399-gru-kevin
On Thu, Jan 05, 2023 at 04:59:55PM -0800, Brian Norris wrote: > On Wed, Jan 04, 2023 at 10:11:46AM +0100, Michel Dänzer wrote: > > On 1/4/23 03:11, Brian Norris wrote: > > > On Tue, Jan 03, 2023 at 07:04:00PM +0100, Michel Dänzer wrote: > > >> On 12/21/22 23:02, Brian Norris wrote: > > >>> 3. leave vblank enabled even in the presence of PSR > > > > > > I'm leaning toward this. > > > > If this means vblank interrupts will arrive as expected even while in PSR, > > that may be the ideal solution indeed. > > Yes. And I think I have a working patchset for this now. It passes all > the igt-gpu-tools/kms_vblank tests for me now, and I don't think it > causes any other issues. I'll send it out shortly, after a bit more > testing. For the record, that's here: https://lore.kernel.org/lkml/20230105174001.2.Ic07cba4ab9a7bd3618a9e4258b8f92ea7d10ae5a@changeid/ I didn't CC everyone who got this report. In included a: Reported-by: "kernelci.org bot" even though it didn't really bisect the right thing. Let me know if there's a different/better way to give credit. Brian
[PATCH 2/2] drm/rockchip: vop: Leave vblank enabled in self-refresh
If we disable vblank when entering self-refresh, vblank APIs (like DRM_IOCTL_WAIT_VBLANK) no longer work. But user space is not aware when we enter self-refresh, so this appears to be an API violation -- that DRM_IOCTL_WAIT_VBLANK fails with EINVAL whenever the display is idle and enters self-refresh. The downstream driver used by many of these systems never used to disable vblank for PSR, and in fact, even upstream, we didn't do that until radically redesigning the state machine in commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR"). Thus, it seems like a reasonable API fix to simply restore that behavior, and leave vblank enabled. Note that this appears to potentially unbalance the drm_crtc_vblank_{off,on}() calls in some cases, but: (a) drm_crtc_vblank_on() documents this as OK and (b) if I do the naive balancing, I find state machine issues such that we're not in sync properly; so it's easier to take advantage of (a). Backporting notes: Marking as 'Fixes' commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR"), but it probably depends on commit bed030a49f3e ("drm/rockchip: Don't fully disable vop on self refresh") as well. We also need the previous patch ("drm/atomic: Allow vblank-enabled + self-refresh "disable""), of course. Fixes: 6c836d965bad ("drm/rockchip: Use the helpers for PSR") Cc: Link: https://lore.kernel.org/dri-devel/y5itf0+yniqa6...@sirena.org.uk/ Reported-by: "kernelci.org bot" Signed-off-by: Brian Norris --- drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c index fa1f4ee6d195..c541967114b4 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c @@ -719,11 +719,11 @@ static void vop_crtc_atomic_disable(struct drm_crtc *crtc, mutex_lock(&vop->vop_lock); - drm_crtc_vblank_off(crtc); - if (crtc->state->self_refresh_active) goto out; + drm_crtc_vblank_off(crtc); + /* * Vop standby will take effect at end of current frame, * if dsp hold valid irq happen, it means standby complete. -- 2.39.0.314.g84b9a713c41-goog
[PATCH 1/2] drm/atomic: Allow vblank-enabled + self-refresh "disable"
The self-refresh helper framework overloads "disable" to sometimes mean "go into self-refresh mode," and this mode activates automatically (e.g., after some period of unchanging display output). In such cases, the display pipe is still considered "on", and user-space is not aware that we went into self-refresh mode. Thus, users may expect that vblank-related features (such as DRM_IOCTL_WAIT_VBLANK) still work properly. However, we trigger the WARN_ONCE() here if a CRTC driver tries to leave vblank enabled here. Add a new exception, such that we allow CRTCs to be "disabled" (with self-refresh active) with vblank interrupts still enabled. Cc: # dependency for subsequent patch Signed-off-by: Brian Norris --- drivers/gpu/drm/drm_atomic_helper.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index d579fd8f7cb8..7b5eddadebd5 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -1207,6 +1207,12 @@ disable_outputs(struct drm_device *dev, struct drm_atomic_state *old_state) if (!drm_dev_has_vblank(dev)) continue; + /* +* Self-refresh is not a true "disable"; let vblank remain +* enabled. +*/ + if (new_crtc_state->self_refresh_active) + continue; ret = drm_crtc_vblank_get(crtc); WARN_ONCE(ret != -EINVAL, "driver forgot to call drm_crtc_vblank_off()\n"); -- 2.39.0.314.g84b9a713c41-goog
Re: renesas/master bisection: igt-kms-rockchip.kms_vblank.pipe-A-wait-forked on rk3399-gru-kevin
On Wed, Jan 04, 2023 at 10:11:46AM +0100, Michel Dänzer wrote: > On 1/4/23 03:11, Brian Norris wrote: > > On Tue, Jan 03, 2023 at 07:04:00PM +0100, Michel Dänzer wrote: > >> On 12/21/22 23:02, Brian Norris wrote: > > > >>> 3. leave vblank enabled even in the presence of PSR > > > > I'm leaning toward this. > > If this means vblank interrupts will arrive as expected even while in PSR, > that may be the ideal solution indeed. Yes. And I think I have a working patchset for this now. It passes all the igt-gpu-tools/kms_vblank tests for me now, and I don't think it causes any other issues. I'll send it out shortly, after a bit more testing. Side note: I believe this vblank-disabled issue actually came in as an upstream regression at commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR"); before that, we were doing this very differently, and didn't touch vblank at all for PSR, similar to the "downstream" stuff I mentioned earlier. > >> 5. Go/stay out of PSR while vblank interrupts are enabled (probably want > >> to make sure vblankoffdelay is set up such that vblank interrupts are > >> disabled ASAP) > > > > That seems not extremely nice, to waste power. Based on the earlier > > downstream implementation (which left vblank interrupts enabled), I'd > > wager there's a much larger power win from PSR (on the display-transport > > and memory-bandwidth costs), relative to the power cost of vblank > > interrupts. > > > > Also, messing with vblankoffdelay sounds likely to trigger the races > > mentioned in the drm_vblank.c kerneldoc. > > Not sure how likely that is, quite a few drivers are setting > dev->vblank_disable_immediate = true. > > With that, vblank interrupts should generally be enabled only while there are > screen updates as well[0], in which case PSR shouldn't be possible anyway. > > [0] There may be user space which uses the vblank ioctls even while there are > no screen updates though, which would prevent PSR in this case. OK. I'm just reading docs here; definitely not an expert. > >>> [1] Or is it? I don't really know the DRM_IOCTL_WAIT_VBLANK ABI > >>> contract in the presence of PSR, but I believe the upstream PSR > >>> support has always worked this way. I could imagine > >>> DRM_IOCTL_WAIT_VBLANK users might not love seeing EINVAL here > >>> though. > >> > >> Yeah, that's pretty likely to cause issues with existing real-world user > >> space. > > > > OK. Any hints as to what kind of user space uses this? > > I don't have any specific example, just thinking about how user space could > respond to the vblank ioctls returning an error, and it would seem to be > generally either of: > > * Just run non-throttled, which might negate any energy savings from PSR > * Fail to work altogether > > > > I'm in part simply curious, but I'm also wondering what the > > error-handling and reliability (e.g., what if vblanks don't come?) > > expectations might be in practice. > > If vblank events never come, user space might freeze. Thanks. If my patchset works out like I expect, this should no longer be noticeable to user space, so I don't really have to test out your educated guesses :) Thanks for the idea-bouncing. Brian
Re: renesas/master bisection: igt-kms-rockchip.kms_vblank.pipe-A-wait-forked on rk3399-gru-kevin
Hi Michel, Thanks for your thoughts. I'll give my attempt at weighing my and your options, with the caveat that I'm still not much of a DRM expert. On Tue, Jan 03, 2023 at 07:04:00PM +0100, Michel Dänzer wrote: > On 12/21/22 23:02, Brian Norris wrote: > > So how to fix this? > > > > 1. ignore it; it's "just" a test failure, IIUC [1] Probably discarded, per Michel's notes. > > 2. change the test, to skip this test if the device has PSR Doesn't seem great, and not just because PSR is not directly detectable in user space. > > 3. leave vblank enabled even in the presence of PSR I'm leaning toward this. > > 4. otherwise modify the vblank ioctl interface, such that one can "wait" > >for a vblank that won't come (because the display is in self-refresh > >/ there are no new frames or vblanks) That doesn't sound so great of an API, to essentially induce hangs, pending other activity. (Assuming I understand the DRM APIs here correctly.) > FWIW, a couple more alternatives: > > 5. Go/stay out of PSR while vblank interrupts are enabled (probably want to > make sure vblankoffdelay is set up such that vblank interrupts are disabled > ASAP) That seems not extremely nice, to waste power. Based on the earlier downstream implementation (which left vblank interrupts enabled), I'd wager there's a much larger power win from PSR (on the display-transport and memory-bandwidth costs), relative to the power cost of vblank interrupts. Also, messing with vblankoffdelay sounds likely to trigger the races mentioned in the drm_vblank.c kerneldoc. > 6. Use fallback timers for vblank events while in PSR (there might be some > infrastructure for this, since some drivers don't have real vblank interrupts) There's drm_atomic_helper_fake_vblank(), but I don't think that's really hooked up to a timer. That's potentially promising though. > > [1] Or is it? I don't really know the DRM_IOCTL_WAIT_VBLANK ABI > > contract in the presence of PSR, but I believe the upstream PSR > > support has always worked this way. I could imagine > > DRM_IOCTL_WAIT_VBLANK users might not love seeing EINVAL here > > though. > > Yeah, that's pretty likely to cause issues with existing real-world user > space. OK. Any hints as to what kind of user space uses this? A cursory look doesn't show that any of the ChromeOS user space uses it, which may be why it was overlooked in the initial PSR development and upstreaming. I'm in part simply curious, but I'm also wondering what the error-handling and reliability (e.g., what if vblanks don't come?) expectations might be in practice. All in all, it's seeming like maybe option 3 or 6 are best. I'd lean toward 3, assuming I can hack my way through "driver forgot to call drm_crtc_vblank_off()" and similar problems, in part because it's ground that has partially been tread before. I hope that doesn't complicate the DRM state machine even more though, now that I'll have to better coordinate the "enabled"/"self_refresh_active" states with "vblank enabled"... Thoughts still welcome of course. Brian
Re: renesas/master bisection: igt-kms-rockchip.kms_vblank.pipe-A-wait-forked on rk3399-gru-kevin
Hi Mark, Sean, (and dri-devel) On Wed, Dec 14, 2022 at 07:04:37PM -0800, Brian Norris wrote: > On Tue, Dec 13, 2022 at 04:51:11PM +, Mark Brown wrote: > > On Tue, Dec 13, 2022 at 05:56:30AM -0800, KernelCI bot wrote: > > > > The KernelCI bisection bot found regressions in at least two KMS tests > > in the Renesas tree on rk3399-gru-kevin just after the Renesas tree > > merged up mainline: > > > >igt-kms-rockchip.kms_vblank.pipe-A-wait-forked > >igt-kms-rockchip.kms_vblank.pipe-A-query-busy > > > > which it bisected to ca871659ec16 ("drm/bridge: analogix_dp: Support > > PSR-exit to disable transition"). I'm not *100%* sure I trust the > > bisection but it sure is suspicous that two separate bisects for related > > issues landed on the same commit. ... > > | IGT-Version: 1.26-gf8a4a0b (aarch64) (Linux: 6.1.0 aarch64) > > | <14>[ 35.48] [IGT] drm_read: starting subtest short-buffer-wakeup > > | Starting subtest: short-buffer-wakeup > > | > > | (| drm_read:350) CRITICAL: Test assertion failure function > > generate_event, file ../tests/drm_read.c:65: > > | (drm_read:350) CRITICAL: <14>[ 36.155642] [IGT] drm_read: exiting, > > ret=98 > > | Failed assertion: kmstest_get_vblank(fd, pipe, DRM_VBLANK_EVENT) > > | > > | (drm_read:350) CRITICAL: Last errno: 22, Invalid argument > > | Stack trace: > > | > > | #0 ../lib/igt_core.c:1933 __igt_fail_assert() > > | #1 [+0xd5362770] > > | #2 [+0xd536193c] > > | #3 [__libc_start_main+0xe8] > > | #4 [+0xd5361974] > > | #5 [[ 36.162851] Console: switching to colour frame buffer > > device 300x100>+0xd5361974] > > | Subtest short-buffer-wakeup failed. ... > I'll look some more, but this might be a difference of test setup, such > that my setup has the issue before and after that commit, but your setup > sees a regression. I believe this is the case; things are broken both before and after, but depending on test sequencing, etc., they might have seemed less broken before. When we're failing, we're getting EINVAL from drm_vblank_get(). That essentially means that vblank has been disabled (drm_crtc_vblank_off()). As it happens, this is exactly what we do when we enter PSR (Panel Self Refresh). Now, these test cases (especially 'kms_vblank.pipe-A-wait-forked') seem to display a static image, and then wait for a bunch of vblank events. This is exactly the sort of case where we should enter PSR, and so we're likely to disable vblank events. Thus, if everything is working right, we will often hit EINVAL in ioctl(DRM_IOCTL_WAIT_VBLANK) ... -> drm_vblank_get(), and fail the test. As to why this appears to be a regression: like mentioned in my previous mail, these tests tend to hose the Analogix PSR state before my patch set. When PSR is hosed, we tend to stay with PSR disabled, and so drm_vblank_get() doesn't return EINVAL, and the test is happy. (Never mind that the display is likely non-functional.) [0] So how to fix this? 1. ignore it; it's "just" a test failure, IIUC [1] 2. change the test, to skip this test if the device has PSR 3. leave vblank enabled even in the presence of PSR 4. otherwise modify the vblank ioctl interface, such that one can "wait" for a vblank that won't come (because the display is in self-refresh / there are no new frames or vblanks) I don't know how to do #2, because this variant of PSR is intentionally opaque to user space. For #3: the downstream PSR support that these systems shipped with initially did not disable vblank in PSR. But that was massively rewritten/refactored by Sean Paul before it made it upstream, and now it does. I tried briefly to factor that part out (drivers/gpu/drm/rockchip/rockchip_drm_vop.c / drm_crtc_vblank_{on,off}()), but because drm_self_refresh_helper.c (ab?)uses the commit step to "disable" the CRTC for PSR (even though the CRTC is not fully disabled), I tend to hit this in drm_atomic_helper_commit_modeset_disables()->disable_outputs(): ret = drm_crtc_vblank_get(crtc); WARN_ONCE(ret != -EINVAL, "driver forgot to call drm_crtc_vblank_off()\n"); And I hit a few more problems too... ...I'm sure I could hack my way through this somehow, but this is all sounding like it could use some advice from someone more familiar with DRM and/or the drm_self_refresh_helper design. I've learned my way around this a bit by necessity, but I still feel like I don't know all the right answers here. (*cough*, Sean?) Brian [0] I definitely reproduce this part locally, before my patches. I can't find non-expired CI logs for "passing" runs to be sure that's what the CI is seeing though. [1] Or is it? I don't really know the DRM_IOCTL_WAIT_VBLANK ABI contract in the presence of PSR, but I believe the upstream PSR support has always worked this way. I could imagine DRM_IOCTL_WAIT_VBLANK users might not love seeing EINVAL here though.
Re: renesas/master bisection: igt-kms-rockchip.kms_vblank.pipe-A-wait-forked on rk3399-gru-kevin
On Tue, Dec 13, 2022 at 04:51:11PM +, Mark Brown wrote: > On Tue, Dec 13, 2022 at 05:56:30AM -0800, KernelCI bot wrote: > > The KernelCI bisection bot found regressions in at least two KMS tests > in the Renesas tree on rk3399-gru-kevin just after the Renesas tree > merged up mainline: > >igt-kms-rockchip.kms_vblank.pipe-A-wait-forked >igt-kms-rockchip.kms_vblank.pipe-A-query-busy > > which it bisected to ca871659ec16 ("drm/bridge: analogix_dp: Support > PSR-exit to disable transition"). I'm not *100%* sure I trust the > bisection but it sure is suspicous that two separate bisects for related > issues landed on the same commit. > > Below is the full report for the bisect for the first test, the bisect > for the latter looks identical. It's got links to full logs for the > test run and a Reported-by for the bot - I do see some backtraces from > userspace in the output, the first is: I think the backtraces are just because IGT calls assert(). > | IGT-Version: 1.26-gf8a4a0b (aarch64) (Linux: 6.1.0 aarch64) > | <14>[ 35.48] [IGT] drm_read: starting subtest short-buffer-wakeup > | Starting subtest: short-buffer-wakeup > | > | (| drm_read:350) CRITICAL: Test assertion failure function generate_event, > file ../tests/drm_read.c:65: > | (drm_read:350) CRITICAL: <14>[ 36.155642] [IGT] drm_read: exiting, ret=98 > | Failed assertion: kmstest_get_vblank(fd, pipe, DRM_VBLANK_EVENT) > | > | (drm_read:350) CRITICAL: Last errno: 22, Invalid argument > | Stack trace: > | > | #0 ../lib/igt_core.c:1933 __igt_fail_assert() > | #1 [+0xd5362770] > | #2 [+0xd536193c] > | #3 [__libc_start_main+0xe8] > | #4 [+0xd5361974] > | #5 [[ 36.162851] Console: switching to colour frame buffer > device 300x100>+0xd5361974] > | Subtest short-buffer-wakeup failed. > > Unfortunately we don't have current results from mainline or -next. Thanks for the notification. I'm running: 6e516faf0431 drm/panfrost: Job should reference MMU not file_priv which is allegedly a "good" kernel. And running this: ### First kill my display manager, etc. ## Then: IGT_FORCE_DRIVER=rockchip /usr/libexec/igt-gpu-tools/kms_vblank --run-subtest pipe-A-wait-forked Gives the appended log [1]. If I'm looking right, it seems like it's failing the same way as the "regression." I also tested this: # the 5.10.x backport, and its parent: dbe04e874d4fbd56be64fdfcb29410241b6ad08a^ -- i.e.: 61297ee0c329 Input: bcm5974 - set missing URB_NO_TRANSFER_DMA_MAP urb flag and saw the same failures, and I also see the failures I was trying to avoid in this series (see e54a4424925a ("drm/atomic: Force bridge self-refresh-exit on CRTC switch")). Perhaps that's because of the "First kill my display manager, etc." -- because that step means we might end up switching CRTCs (VOPs) when the test starts, and triggering the bug. I'll look some more, but this might be a difference of test setup, such that my setup has the issue before and after that commit, but your setup sees a regression. Small tangent: It's possible this is similar to stuff I had to debug a while back in this space: atomictest: Disable CRTCs before test https://chromium-review.googlesource.com/c/chromiumos/platform/drm-tests/+/3309960 null_platform_test: Disable CRTCs first https://chromium-review.googlesource.com/c/chromiumos/platform/drm-tests/+/3315478 In that case, there was actually an underlying kernel regression due to: 846c7dfc1193 drm/atomic: Try to preserve the crtc enabled state in drm_atomic_remove_fb, v2. But our tests were broken too (assuming an initial state that wasn't guaranteed), so we just fixed the tests. Anyway, I'll confirm when I get some fresh eyes and can try a few more things (like neutering the ChromeOS display framework for my tests). Brian [1] IGT-Version: 1.26-gf8a4a0b5 (arm) (Linux: 5.18.0-rc6+ aarch64) Starting subtest: pipe-A-wait-forked Beginning pipe-A-wait-forked on pipe A, connector eDP-1 (kms_vblank:2532) CRITICAL: Test assertion failure function vblank_wait, file ../igt-gpu-tools-1.25/tests/kms_vblank.c:319: (kms_vblank:2535) CRITICAL: Test assertion failure function vblank_wait, file ../igt-gpu-tools-1.25/tests/kms_vblank.c:319: (kms_vblank:2531) CRITICAL: Test assertion failure function vblank_wait, file ../igt-gpu-tools-1.25/tests/kms_vblank.c:319: (kms_vblank:2532) CRITICAL: Failed assertion: wait_vblank(fd, &vbl) == 0 (kms_vblank:2534) CRITICAL: Test assertion failure function vblank_wait, file ../igt-gpu-tools-1.25/tests/kms_vblank.c:319: (kms_vblank:2535) CRITICAL: Failed assertion: wait_vblank(fd, &vbl) == 0 (kms_vblank:2531) CRITICAL: Failed assertion: wait_vblank(fd, &vbl) == 0 (kms_vblank:2536) CRITICAL: Test assertion failure function vblank_wait, file ../igt-gpu-tools-1.25/tests/kms_vblank.c:319: (kms_vblank:2532) CRITICAL: Last errno: 22, Invalid argument (kms_vblank:2534) CRITICAL: Failed assertion: wait_vblank(fd, &vbl) == 0 (kms_vblank:2536) CRITICAL: Failed assertion: wait_vblank(fd,
Re: [PATCH] drm/i915/huc: fix leak of debug object in huc load fence on driver unload
Hi Daniele, On Thu, Nov 10, 2022 at 04:56:51PM -0800, Daniele Ceraolo Spurio wrote: > The fence is always initialized in huc_init_early, but the cleanup in > huc_fini is only being run if HuC is enabled. This causes a leaking of > the debug object when HuC is disabled/not supported, which can in turn > trigger a warning if we try to register a new debug offset at the same > address on driver reload. > > To fix the issue, make sure to always run the cleanup code. > > Reported-by: Tvrtko Ursulin > Reported-by: Brian Norris > Fixes: 27536e03271d ("drm/i915/huc: track delayed HuC load with a fence") > Signed-off-by: Daniele Ceraolo Spurio > Cc: Tvrtko Ursulin > Cc: Brian Norris > Cc: Alan Previn > Cc: John Harrison > --- > > Note: I didn't manage to repro the reported warning, but I did confirm > that we weren't correctly calling i915_sw_fence_fini and that this patch > fixes that. I *did* reproduce, and with this patch, I no longer reproduce. So: Tested-by: Brian Norris I see this differs very slightly from the draft version (which didn't work for me): https://lore.kernel.org/all/ac5fde11-c17d-8574-c938-c2278d53c...@intel.com/ so presumably that diff is the fix. Thanks a bunch! Brian > drivers/gpu/drm/i915/gt/uc/intel_huc.c | 12 +++- > drivers/gpu/drm/i915/gt/uc/intel_uc.c | 1 + > 2 files changed, 8 insertions(+), 5 deletions(-)
Re: [PATCH 1/2] drm/amdgpu: Move racy global PMU list into device
On Tue, Nov 08, 2022 at 11:50:04AM -0500, Felix Kuehling wrote: > While you're making the pmu list per-device, I'd suggest removing adev from > the pmu entry because it is now redundant. The device is implied by the list > that the entry is on. Instead, add an adev parameter to > init_pmu_entry_by_type_and_add. Or you could move the list_add_tail to > amdgpu_pmu_init and remove "_and_add" from the function name. Sorry if I'm being naive here, but does that mean trying to navigate the list pointers to move from a 'pmu_entry' to an 'adev' (list_first_entry(), etc.)? There are quite a few cases where we're trying to go between 'pmu_entry' and 'adev'. I guess I could turn that into a mini helper. I'll also need to scrounge around a bit to see if I have an amdgpu system around that actually supports PMU. I realized the one I tested on doesn't actually hit this code path... and this would be getting a little less obvious/trivial. > Other than that, the patch looks good to me. Thanks for looking! Brian
[PATCH] drm/rockchip: vop: Quiet always-warning AFBC log
The downstream code from which this was derived didn't ever run through this 'switch' block with non-AFBC formats, but the upstream code does -- we use this function to probe whether a given format is supported. Demote the warning to eliminate this sort of warning seen on every boot: [drm] unsupported AFBC format[3231564e] And make it warn more than once, because if we *actually* care to see what formats we're probing/rejecting and for what reasons, we probably care about more than just the first message. Drop the comment, because one of the two *is* commonly reachable. And lastly, drop the unreachable return; we'd do better to let the compiler complain if we start hitting this unexpectedly. Signed-off-by: Brian Norris --- drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c index fa1f4ee6d195..aab77eb6caa3 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c @@ -316,13 +316,10 @@ static int vop_convert_afbc_format(uint32_t format) case DRM_FORMAT_RGB565: case DRM_FORMAT_BGR565: return AFBC_FMT_RGB565; - /* either of the below should not be reachable */ default: - DRM_WARN_ONCE("unsupported AFBC format[%08x]\n", format); + DRM_DEBUG_KMS("unsupported AFBC format[%08x]\n", format); return -EINVAL; } - - return -EINVAL; } static uint16_t scl_vop_cal_scale(enum scale_mode mode, uint32_t src, -- 2.38.1.273.g43a17bfeac-goog
[PATCH 2/2] drm/amdgpu: Set PROBE_PREFER_ASYNCHRONOUS
This driver often takes over 200ms to start, so it can improve boot speed to probe it asynchronously. I did a short review of the driver, and apart from an issue fixed in the parent patch ("drm/amdgpu: Move racy global PMU list into device"), there don't appear to be many cross-device dependencies or racy accesses to global state, so this should be safe. This driver was pinpointed as part of a survey of top slowest initcalls (i.e., are built in, and probing synchronously) on a lab of ChromeOS systems. Signed-off-by: Brian Norris --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 3c9fecdd6b2f..2d180e48df1b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2793,7 +2793,10 @@ static struct pci_driver amdgpu_kms_pci_driver = { .probe = amdgpu_pci_probe, .remove = amdgpu_pci_remove, .shutdown = amdgpu_pci_shutdown, - .driver.pm = &amdgpu_pm_ops, + .driver = { + .pm = &amdgpu_pm_ops, + .probe_type = PROBE_PREFER_ASYNCHRONOUS, + }, .err_handler = &amdgpu_pci_err_handler, .dev_groups = amdgpu_sysfs_groups, }; -- 2.38.1.273.g43a17bfeac-goog
[PATCH 1/2] drm/amdgpu: Move racy global PMU list into device
If there are multiple amdgpu devices, this list processing can be racy. We're really treating this like a per-device list, so make that explicit and remove the global list. Signed-off-by: Brian Norris --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c | 12 +--- 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 0e6ddf05c23c..e968b7f2417c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1063,6 +1063,10 @@ struct amdgpu_device { struct work_struct reset_work; booljob_hang; + +#if IS_ENABLED(CONFIG_PERF_EVENTS) + struct list_head pmu_list; +#endif }; static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c index 71ee361d0972..24f2055a2f23 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c @@ -23,6 +23,7 @@ #include #include +#include #include "amdgpu.h" #include "amdgpu_pmu.h" @@ -72,9 +73,6 @@ static ssize_t amdgpu_pmu_event_show(struct device *dev, amdgpu_pmu_attr->event_str, amdgpu_pmu_attr->type); } -static LIST_HEAD(amdgpu_pmu_list); - - struct amdgpu_pmu_attr { const char *name; const char *config; @@ -558,7 +556,7 @@ static int init_pmu_entry_by_type_and_add(struct amdgpu_pmu_entry *pmu_entry, pr_info("Detected AMDGPU %d Perf Events.\n", total_num_events); - list_add_tail(&pmu_entry->entry, &amdgpu_pmu_list); + list_add_tail(&pmu_entry->entry, &pmu_entry->adev->pmu_list); return 0; err_register: @@ -579,9 +577,7 @@ void amdgpu_pmu_fini(struct amdgpu_device *adev) { struct amdgpu_pmu_entry *pe, *temp; - list_for_each_entry_safe(pe, temp, &amdgpu_pmu_list, entry) { - if (pe->adev != adev) - continue; + list_for_each_entry_safe(pe, temp, &adev->pmu_list, entry) { list_del(&pe->entry); perf_pmu_unregister(&pe->pmu); kfree(pe->pmu.attr_groups); @@ -623,6 +619,8 @@ int amdgpu_pmu_init(struct amdgpu_device *adev) int ret = 0; struct amdgpu_pmu_entry *pmu_entry, *pmu_entry_df; + INIT_LIST_HEAD(&adev->pmu_list); + switch (adev->asic_type) { case CHIP_VEGA20: pmu_entry_df = create_pmu_entry(adev, AMDGPU_PMU_PERF_TYPE_DF, -- 2.38.1.273.g43a17bfeac-goog
[PATCH] drm/i915: Set PROBE_PREFER_ASYNCHRONOUS
This driver often takes a good 100ms to start, but in some particularly bad cases takes more than 1 second. In surveying risk for this driver, I poked around for cross-device shared state, which can be a source of race conditions. GVT support (intel_gvt_devices) seems potentially suspect, but it has an appropriate mutex, and doesn't seem to care about ordering -- if devices are present when the kvmgt module loads, they'll get picked up; and if they probe later than kvmgt, they'll attach then. Additionally, I see past discussions about this patch [1], which concluded that there were problems at the time due to the way hdac_i915.c called request_module("i915") and expected it to complete probing [2]. Work has since been merged [3] to fix that problem. This driver was pinpointed as part of a survey of drivers that take more than 100ms in their initcall (i.e., are built in, and probing synchronously) on a lab of ChromeOS systems. [1] [RFC] i915: make the probe asynchronous https://lore.kernel.org/all/20180604053219.2040-1-feng.t...@intel.com/ [2] https://lore.kernel.org/all/s5hin4d1e3f.wl-ti...@suse.de/ [3] Commit f9b54e1961c7 ("ALSA: hda/i915: Allow delayed i915 audio component binding") Signed-off-by: Brian Norris --- drivers/gpu/drm/i915/i915_pci.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 38460a0bd7cb..1cb1f87aea86 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1371,7 +1371,10 @@ static struct pci_driver i915_pci_driver = { .probe = i915_pci_probe, .remove = i915_pci_remove, .shutdown = i915_pci_shutdown, - .driver.pm = &i915_pm_ops, + .driver = { + .pm = &i915_pm_ops, + .probe_type = PROBE_PREFER_ASYNCHRONOUS, + }, }; int i915_pci_register_driver(void) -- 2.38.1.273.g43a17bfeac-goog
[PATCH 2/2] drm/rockchip: dsi: Force synchronous probe
We can't safely probe a dual-DSI display asynchronously (driver_async_probe='*' or driver_async_probe='dw-mipi-dsi-rockchip' cmdline), because dw_mipi_dsi_rockchip_find_second() pokes one DSI device's drvdata from the other device without any locking. Request synchronous probe, at least until this driver learns some appropriate locking for dual-DSI initialization. Cc: Signed-off-by: Brian Norris --- drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c index d222c6811207..528ddce144e5 100644 --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c @@ -1689,5 +1689,11 @@ struct platform_driver dw_mipi_dsi_rockchip_driver = { .of_match_table = dw_mipi_dsi_rockchip_dt_ids, .pm = &dw_mipi_dsi_rockchip_pm_ops, .name = "dw-mipi-dsi-rockchip", + /* +* For dual-DSI display, one DSI pokes at the other DSI's +* drvdata in dw_mipi_dsi_rockchip_find_second(). This is not +* safe for asynchronous probe. +*/ + .probe_type = PROBE_FORCE_SYNCHRONOUS, }, }; -- 2.38.0.413.g74048e4d9e-goog
[PATCH 1/2] drm/rockchip: dsi: Clean up 'usage_mode' when failing to attach
If we fail to attach the first time (especially: EPROBE_DEFER), we fail to clean up 'usage_mode', and thus will fail to attach on any subsequent attempts, with "dsi controller already in use". Re-set to DW_DSI_USAGE_IDLE on attach failure. This is especially common to hit when enabling asynchronous probe on a duel-DSI system (such as RK3399 Gru/Scarlet), such that we're more likely to fail dw_mipi_dsi_rockchip_find_second() the first time. Fixes: 71f68fe7f121 ("drm/rockchip: dsi: add ability to work as a phy instead of full dsi") Cc: Signed-off-by: Brian Norris --- drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c index bf6948125b84..d222c6811207 100644 --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c @@ -1051,23 +1051,31 @@ static int dw_mipi_dsi_rockchip_host_attach(void *priv_data, if (ret) { DRM_DEV_ERROR(dsi->dev, "Failed to register component: %d\n", ret); - return ret; + goto out; } second = dw_mipi_dsi_rockchip_find_second(dsi); - if (IS_ERR(second)) - return PTR_ERR(second); + if (IS_ERR(second)) { + ret = PTR_ERR(second); + goto out; + } if (second) { ret = component_add(second, &dw_mipi_dsi_rockchip_ops); if (ret) { DRM_DEV_ERROR(second, "Failed to register component: %d\n", ret); - return ret; + goto out; } } return 0; + +out: + mutex_lock(&dsi->usage_mutex); + dsi->usage_mode = DW_DSI_USAGE_IDLE; + mutex_unlock(&dsi->usage_mutex); + return ret; } static int dw_mipi_dsi_rockchip_host_detach(void *priv_data, -- 2.38.0.413.g74048e4d9e-goog
Re: [PATCH] Revert "drm: bridge: analogix/dp: add panel prepare/unprepare in suspend/resume time"
On Thu, Aug 25, 2022 at 11:06 AM Brian Norris wrote: > On Thu, Aug 25, 2022 at 10:37 AM Doug Anderson wrote: > > Given that this is _not_ an area that I'm an expert in nor is it an > > area where the consequences are super easy to analyze, I'm a little > > hesitant to apply this to drm-misc-next myself. Ideally someone more > > familiar with the driver would do it. However, if nobody steps up > > after a few weeks and nobody has yelled about this patch, I'll apply > > it. For this particular patch, I'd be interested in whether Zhang Zekun has any feedback (even a Tested-by?), since they were patching this function in the first place, which is why I paid attention: Subject: [PATCH -next] drm/bridge: Add missing clk_disable_unprepare() in analogix_dp_resume() https://lore.kernel.org/lkml/20220816064231.60473-1-zhangzeku...@huawei.com/ But in absence of that...it has now been a few weeks :) I'll also mark this to come back to again in a week or two, in case somebody is still hoping to wait longer. Regards, Brian
Re: [PATCH] Revert "drm: bridge: analogix/dp: add panel prepare/unprepare in suspend/resume time"
Hi, On Thu, Aug 25, 2022 at 10:37 AM Doug Anderson wrote: > Reviewed-by: Douglas Anderson Thanks :) > Given that this is _not_ an area that I'm an expert in nor is it an > area where the consequences are super easy to analyze, I'm a little > hesitant to apply this to drm-misc-next myself. Ideally someone more > familiar with the driver would do it. However, if nobody steps up > after a few weeks and nobody has yelled about this patch, I'll apply > it. I'll take this opportunity to correct Andrzej's email address (my bad; I need to use the up-to-date MAINTAINERS / .mailmap when generating CC lists), in case he's one such expert. Brian
[PATCH] Revert "drm: bridge: analogix/dp: add panel prepare/unprepare in suspend/resume time"
This reverts commit 211f276ed3d96e964d2d1106a198c7f4a4b3f4c0. For quite some time, core DRM helpers already ensure that any relevant connectors/CRTCs/etc. are disabled, as well as their associated components (e.g., bridges) when suspending the system. Thus, analogix_dp_bridge_{enable,disable}() already get called, which in turn call drm_panel_{prepare,unprepare}(). This makes these drm_panel_*() calls redundant. Besides redundancy, there are a few problems with this handling: (1) drm_panel_{prepare,unprepare}() are *not* reference-counted APIs and are not in general designed to be handled by multiple callers -- although some panel drivers have a coarse 'prepared' flag that mitigates some damage, at least. So at a minimum this is redundant and confusing, but in some cases, this could be actively harmful. (2) The error-handling is a bit non-standard. We ignored errors in suspend(), but handled errors in resume(). And recently, people noticed that the clk handling is unbalanced in error paths, and getting *that* right is not actually trivial, given the current way errors are mostly ignored. (3) In the particular way analogix_dp_{suspend,resume}() get used (e.g., in rockchip_dp_*(), as a late/early callback), we don't necessarily have a proper PM relationship between the DP/bridge device and the panel device. So while the DP bridge gets resumed, the panel's parent device (e.g., platform_device) may still be suspended, and so any prepare() calls may fail. So remove the superfluous, possibly-harmful suspend()/resume() handling of panel state. Fixes: 211f276ed3d9 ("drm: bridge: analogix/dp: add panel prepare/unprepare in suspend/resume time") Link: https://lore.kernel.org/all/yv2cpbd3picg%2f...@google.com/ Signed-off-by: Brian Norris --- drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 13 - 1 file changed, 13 deletions(-) diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c index 8aadcc0aa90b..df9370e0ff23 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c @@ -1864,12 +1864,6 @@ EXPORT_SYMBOL_GPL(analogix_dp_remove); int analogix_dp_suspend(struct analogix_dp_device *dp) { clk_disable_unprepare(dp->clock); - - if (dp->plat_data->panel) { - if (drm_panel_unprepare(dp->plat_data->panel)) - DRM_ERROR("failed to turnoff the panel\n"); - } - return 0; } EXPORT_SYMBOL_GPL(analogix_dp_suspend); @@ -1884,13 +1878,6 @@ int analogix_dp_resume(struct analogix_dp_device *dp) return ret; } - if (dp->plat_data->panel) { - if (drm_panel_prepare(dp->plat_data->panel)) { - DRM_ERROR("failed to setup the panel\n"); - return -EBUSY; - } - } - return 0; } EXPORT_SYMBOL_GPL(analogix_dp_resume); -- 2.37.2.609.g9ff673ca1a-goog
Re: [PATCH -next] drm/bridge: Add missing clk_disable_unprepare() in analogix_dp_resume()
On Mon, Aug 15, 2022 at 11:46 PM Zhang Zekun wrote: > > Add the missing clk_disable_unprepare() before return from > analogix_dp_resume() in the error handling case. > > Fixes: 211f276ed3d9 ("drm: bridge: analogix/dp: add panel prepare/unprepare > in suspend/resume time") > Signed-off-by: Zhang Zekun Reviewed-by: Brian Norris
Re: [PATCH -next] drm/bridge: Add missing clk_disable_unprepare() in analogix_dp_resume()
Hi, On Wed, Aug 17, 2022 at 05:05:25PM -0700, Brian Norris wrote: > Hmm, actually I'm going to have to retract that now that I've given it > some more testing locally. I happen to have a system where I commonly > hit this error case, and I'm thinking commit 211f276ed3d9 is actually > wrong, and so we shouldn't be "fixing" its error handling -- we should > be reverting it. I've submitted that for review here: https://lore.kernel.org/all/20220822180729.1.I8ac5abe3a4c1c6fd5c061686c6e883c22f69022c@changeid/ [PATCH] Revert "drm: bridge: analogix/dp: add panel prepare/unprepare in suspend/resume time" I'd appreciate your review/testing. (NB: I failed to honor the .mailmap for Andrzej Hajda.) > Now separately, I have to figure out why I'm hitting this error case in > the first place... FWIW, I captured the reason in point 3 on the above Revert. The pm_runtime_*() handling in the panel driver fails (-EACCES) because the bridge driver is resuming before the panel. (The DRM suspend/resume helpers handle things in the correct order.) This problem is also resolved by simply reverting. Brian
Re: [PATCH -next] drm/bridge: Add missing clk_disable_unprepare() in analogix_dp_resume()
On Wed, Aug 17, 2022 at 01:34:13PM -0700, Brian Norris wrote: > On Mon, Aug 15, 2022 at 11:46 PM Zhang Zekun wrote: > > > > Add the missing clk_disable_unprepare() before return from > > analogix_dp_resume() in the error handling case. > > > > Fixes: 211f276ed3d9 ("drm: bridge: analogix/dp: add panel prepare/unprepare > > in suspend/resume time") > > Signed-off-by: Zhang Zekun > > Reviewed-by: Brian Norris Hmm, actually I'm going to have to retract that now that I've given it some more testing locally. I happen to have a system where I commonly hit this error case, and I'm thinking commit 211f276ed3d9 is actually wrong, and so we shouldn't be "fixing" its error handling -- we should be reverting it. In particular, drm_panel_prepare()/drm_panel_unprepare() are *not* reference-counted APIs, and this is already managed by analogix_dp_bridge_disable(), which is called by the core DRM helpers during suspend. Thus, analogix_dp_suspend()/analogix_dp_resume() is serving to be an unwanted *second* client trying to {un,}prepare the panel. The panel drivers tend to handle this OK to some extent, as they (e.g., panel-edp.c) guard against redundant operations, but *we* don't -- notice that analogix_dp_suspend() ignores drm_panel_unprepare() errors for one. Also, I don't believe device management really handles resume() failures quite right; in the end, this patch ends up un-balancing the clk count on the RK3399 Gru-Bob systems I'm testing. (Side note: every other bridge driver seems to ignore drm_panel_prepare() failures.) It's possible this was correct (or at least, not terribly broken) back when it was written, but then, the DRM core frameworks have evolved since then. Today, I think we do not need to manage the panel state directly in suspend()/resume(). All in all: Nacked-by: Brian Norris Tested-and-failed-by: Brian Norris Now separately, I have to figure out why I'm hitting this error case in the first place...
Re: [PATCH] drm/rockchip: vop: Don't crash for invalid duplicate_state()
On Fri, Jun 24, 2022 at 12:23 AM Heiko Stuebner wrote: > The interesting question would be, do we want some fixes tag for it? I'm not aware of any currently-upstream code that will hit this [1]. I've hit it in out-of-tree code (or, code that I submitted to dri-devel, but wasn't accepted as-is), and this is the "belt and braces" part -- the primary fix is that we should avoid calling things like drm_atomic_get_crtc_state() at inappropriate times. So, is the "extra safety" check really something that should go to -stable? (Because let's be honest, everything with a Fixes tag goes there.) Maybe? Anyway, if you want to "blame" anything, this commit actually dropped the safety check: 4e257d9eee23 drm/rockchip: get rid of rockchip_drm_crtc_mode_config Brian [1] But I'm not omniscient. So maybe it's good to have anyway.
[PATCH] drm/rockchip: vop: Don't crash for invalid duplicate_state()
It's possible for users to try to duplicate the CRTC state even when the state doesn't exist. drm_atomic_helper_crtc_duplicate_state() (and other users of __drm_atomic_helper_crtc_duplicate_state()) already guard this with a WARN_ON() instead of crashing, so let's do that here too. Signed-off-by: Brian Norris --- drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c index 74562d40f639..daf192881353 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c @@ -1570,6 +1570,9 @@ static struct drm_crtc_state *vop_crtc_duplicate_state(struct drm_crtc *crtc) { struct rockchip_crtc_state *rockchip_state; + if (WARN_ON(!crtc->state)) + return NULL; + rockchip_state = kzalloc(sizeof(*rockchip_state), GFP_KERNEL); if (!rockchip_state) return NULL; -- 2.36.1.476.g0c4daa206d-goog
Re: [PATCH v2 0/2] drm/bridge: analogix_dp: Self-refresh state machine fixes
On Mon, Jun 6, 2022 at 1:30 PM Doug Anderson wrote: > On Fri, Jun 3, 2022 at 8:17 AM Doug Anderson wrote: > > On Fri, Jun 3, 2022 at 8:11 AM Sean Paul wrote: > > > Apologies for the delay. Please in future ping on irc/chat if you're > > > waiting for review from me, my inbox is often neglected. OK, I'll try to keep that in mind. I can't help myself with the semi-relevant XKCD though ;) https://xkcd.com/1254/ > > > The set still looks good to me, > > > > > > Reviewed-by: Sean Paul Thanks! > > Unless someone yells, I'll plan to apply both patches to > > drm-misc-fixes early next week, possibly Monday. Seems like if someone > > was going to object to these they've had plenty of time up until now. > > As promised, I pushed these to drm-misc-fixes: > > e54a4424925a drm/atomic: Force bridge self-refresh-exit on CRTC switch > ca871659ec16 drm/bridge: analogix_dp: Support PSR-exit to disable transition And thanks, Doug. Brian
Re: [PATCH v2 0/2] drm/bridge: analogix_dp: Self-refresh state machine fixes
On Thu, Mar 10, 2022 at 3:50 PM Brian Norris wrote: > On Mon, Feb 28, 2022 at 12:25 PM Brian Norris > wrote: > Ping for review? Sean, perhaps? (You already reviewed this on the > Chromium tracker.) Ping
Re: [PATCH v2 0/2] drm/bridge: analogix_dp: Self-refresh state machine fixes
On Mon, Feb 28, 2022 at 12:25 PM Brian Norris wrote: > > Hi, > > I've been investigating several eDP issues on a Rockchip RK3399 system > and have two proposed bugfixes. RK3399 has two CRTCs, either of which > can be used for eDP output. For both fixes, we have bugs due to the > relationship between the generalized "self refresh helpers" and the > analogix-dp bridge driver which controls much of the PSR mechanics. > These bugs are most visible when switching between CRTCs. > > I'm not a DRM expert, but I've been poking at a lot of Rockchip display > drivers recently. I'd love some skeptical eyes, so feel free to ask > questions if I haven't explained issues well, or the proposals look > fishy. > > Regards, > Brian Ping for review? Sean, perhaps? (You already reviewed this on the Chromium tracker.) Brian
[PATCH v4 2/2] drm/bridge: analogix_dp: Enable autosuspend
DP AUX transactions can consist of many short operations. There's no need to power things up/down in short intervals. I pick an arbitrary 100ms; for the systems I'm testing (Rockchip RK3399), runtime-PM transitions only take a few microseconds. Signed-off-by: Brian Norris --- Changes in v4: - call pm_runtime_mark_last_busy() and pm_runtime_dont_use_autosuspend() - drop excess pm references around drm_get_edid(), now that we grab and hold in the dp-aux helper Changes in v3: - New in v3 drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c index 16be279aed2c..b248d352f2bd 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c @@ -1119,9 +1119,7 @@ static int analogix_dp_get_modes(struct drm_connector *connector) return 0; } - pm_runtime_get_sync(dp->dev); edid = drm_get_edid(connector, &dp->aux.ddc); - pm_runtime_put(dp->dev); if (edid) { drm_connector_update_edid_property(&dp->connector, edid); @@ -1642,7 +1640,8 @@ static ssize_t analogix_dpaux_transfer(struct drm_dp_aux *aux, ret = analogix_dp_transfer(dp, msg); out: - pm_runtime_put(dp->dev); + pm_runtime_mark_last_busy(dp->dev); + pm_runtime_put_autosuspend(dp->dev); return ret; } @@ -1775,6 +1774,8 @@ int analogix_dp_bind(struct analogix_dp_device *dp, struct drm_device *drm_dev) if (ret) return ret; + pm_runtime_use_autosuspend(dp->dev); + pm_runtime_set_autosuspend_delay(dp->dev, 100); pm_runtime_enable(dp->dev); ret = analogix_dp_create_bridge(drm_dev, dp); @@ -1786,6 +1787,7 @@ int analogix_dp_bind(struct analogix_dp_device *dp, struct drm_device *drm_dev) return 0; err_disable_pm_runtime: + pm_runtime_dont_use_autosuspend(dp->dev); pm_runtime_disable(dp->dev); drm_dp_aux_unregister(&dp->aux); @@ -1804,6 +1806,7 @@ void analogix_dp_unbind(struct analogix_dp_device *dp) } drm_dp_aux_unregister(&dp->aux); + pm_runtime_dont_use_autosuspend(dp->dev); pm_runtime_disable(dp->dev); } EXPORT_SYMBOL_GPL(analogix_dp_unbind); -- 2.35.1.574.g5d30c73bfb-goog
[PATCH v4 1/2] drm/bridge: analogix_dp: Grab runtime PM reference for DP-AUX
If the display is not enable()d, then we aren't holding a runtime PM reference here. Thus, it's easy to accidentally cause a hang, if user space is poking around at /dev/drm_dp_aux0 at the "wrong" time. Let's get a runtime PM reference, and check that we "see" the panel. Don't force any panel power-up, etc., because that can be intrusive, and that's not what other drivers do (see drivers/gpu/drm/bridge/ti-sn65dsi86.c and drivers/gpu/drm/bridge/parade-ps8640.c.) Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code") Cc: Cc: Tomeu Vizoso Signed-off-by: Brian Norris Reviewed-by: Douglas Anderson --- Changes in v4: - Add Doug's Reviewed-by Changes in v3: - Avoid panel power-up; just check for HPD state, and let the rest happen "as-is" (e.g., time out, if the caller hasn't prepared things properly) Changes in v2: - Fix spelling in Subject - DRM_DEV_ERROR() -> drm_err() - Propagate errors from un-analogix_dp_prepare_panel() drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c index b7d2e4449cfa..16be279aed2c 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c @@ -1632,8 +1632,19 @@ static ssize_t analogix_dpaux_transfer(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg) { struct analogix_dp_device *dp = to_dp(aux); + int ret; + + pm_runtime_get_sync(dp->dev); + + ret = analogix_dp_detect_hpd(dp); + if (ret) + goto out; - return analogix_dp_transfer(dp, msg); + ret = analogix_dp_transfer(dp, msg); +out: + pm_runtime_put(dp->dev); + + return ret; } struct analogix_dp_device * -- 2.35.1.574.g5d30c73bfb-goog
Re: [PATCH v3 2/2] drm/bridge: analogix_dp: Enable autosuspend
On Tue, Feb 22, 2022 at 2:10 PM Doug Anderson wrote: > On Thu, Feb 17, 2022 at 2:42 PM Brian Norris wrote: > > --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c > > +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c > > @@ -1121,7 +1121,7 @@ static int analogix_dp_get_modes(struct drm_connector > > *connector) > > > > pm_runtime_get_sync(dp->dev); > > edid = drm_get_edid(connector, &dp->aux.ddc); > > - pm_runtime_put(dp->dev); > > + pm_runtime_put_autosuspend(dp->dev); > > So I think you can fully get rid of these ones now and rely on the > ones in the aux transfer, right? Yep, good catch. > > if (edid) { > > drm_connector_update_edid_property(&dp->connector, > >edid); > > @@ -1642,7 +1642,7 @@ static ssize_t analogix_dpaux_transfer(struct > > drm_dp_aux *aux, > > > > ret = analogix_dp_transfer(dp, msg); > > out: > > - pm_runtime_put(dp->dev); > > + pm_runtime_put_autosuspend(dp->dev); > > > > return ret; > > } > > @@ -1775,6 +1775,8 @@ int analogix_dp_bind(struct analogix_dp_device *dp, > > struct drm_device *drm_dev) > > if (ret) > > return ret; > > > > + pm_runtime_use_autosuspend(dp->dev); > > + pm_runtime_set_autosuspend_delay(dp->dev, 100); > > It's explicitly listed in the Documentation that you need the > corresponding pm_runtime_dont_use_autosuspend(). Specifically, it > says: > > > Drivers in ->remove() callback should undo the runtime PM changes done > > in ->probe(). Usually this means calling pm_runtime_disable(), > > pm_runtime_dont_use_autosuspend() etc. > > Not that it's very common to see anyone actually get it right, but I > seem to remember running into an issue when I didn't do it. I think > ti-sn65dsi86 still has it wrong since I found out about this later. > Need to write a patch up for that... Basically you want to put it > right before the two calls in your driver to pm_runtime_disable(). Ack. Speaking of API misfeatures that we missed: I've failed to call pm_runtime_mark_last_busy(). I'll add that in the next rev, for the cases where we weren't already calling *_put_sync() (i.e., presumably we don't really care to wait around for autosuspend). *gripe* What a silly API. *gripe* Brian
[PATCH v2 2/2] drm/atomic: Force bridge self-refresh-exit on CRTC switch
It's possible to change which CRTC is in use for a given connector/encoder/bridge while we're in self-refresh without fully disabling the connector/encoder/bridge along the way. This can confuse the bridge encoder/bridge, because (a) it needs to track the SR state (trying to perform "active" operations while the panel is still in SR can be Bad(TM)); and (b) it tracks the SR state via the CRTC state (and after the switch, the previous SR state is lost). Thus, we need to either somehow carry the self-refresh state over to the new CRTC, or else force an encoder/bridge self-refresh transition during such a switch. I choose the latter, so we disable the encoder (and exit PSR) before attaching it to the new CRTC (where we can continue to assume a clean (non-self-refresh) state). This fixes PSR issues seen on Rockchip RK3399 systems with drivers/gpu/drm/bridge/analogix/analogix_dp_core.c. Change in v2: - Drop "->enable" condition; this could possibly be "->active" to reflect the intended hardware state, but it also is a little over-specific. We want to make a transition through "disabled" any time we're exiting PSR at the same time as a CRTC switch. (Thanks Liu Ying) Cc: Liu Ying Cc: Fixes: 1452c25b0e60 ("drm: Add helpers to kick off self refresh mode in drivers") Signed-off-by: Brian Norris --- drivers/gpu/drm/drm_atomic_helper.c | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index 9603193d2fa1..987e4b212e9f 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -1011,9 +1011,19 @@ crtc_needs_disable(struct drm_crtc_state *old_state, return drm_atomic_crtc_effectively_active(old_state); /* -* We need to run through the crtc_funcs->disable() function if the CRTC -* is currently on, if it's transitioning to self refresh mode, or if -* it's in self refresh mode and needs to be fully disabled. +* We need to disable bridge(s) and CRTC if we're transitioning out of +* self-refresh and changing CRTCs at the same time, because the +* bridge tracks self-refresh status via CRTC state. +*/ + if (old_state->self_refresh_active && + old_state->crtc != new_state->crtc) + return true; + + /* +* We also need to run through the crtc_funcs->disable() function if +* the CRTC is currently on, if it's transitioning to self refresh +* mode, or if it's in self refresh mode and needs to be fully +* disabled. */ return old_state->active || (old_state->self_refresh_active && !new_state->active) || -- 2.35.1.574.g5d30c73bfb-goog
[PATCH v2 1/2] drm/bridge: analogix_dp: Support PSR-exit to disable transition
Most eDP panel functions only work correctly when the panel is not in self-refresh. In particular, analogix_dp_bridge_disable() tends to hit AUX channel errors if the panel is in self-refresh. Given the above, it appears that so far, this driver assumes that we are never in self-refresh when it comes time to fully disable the bridge. Prior to commit 846c7dfc1193 ("drm/atomic: Try to preserve the crtc enabled state in drm_atomic_remove_fb, v2."), this tended to be true, because we would automatically disable the pipe when framebuffers were removed, and so we'd typically disable the bridge shortly after the last display activity. However, that is not guaranteed: an idle (self-refresh) display pipe may be disabled, e.g., when switching CRTCs. We need to exit PSR first. Stable notes: this is definitely a bugfix, and the bug has likely existed in some form for quite a while. It may predate the "PSR helpers" refactor, but the code looked very different before that, and it's probably not worth rewriting the fix. Cc: Fixes: 6c836d965bad ("drm/rockchip: Use the helpers for PSR") Signed-off-by: Brian Norris --- (no changes since v1) .../drm/bridge/analogix/analogix_dp_core.c| 42 +-- 1 file changed, 38 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c index b7d2e4449cfa..6ee0f62a7161 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c @@ -1268,6 +1268,25 @@ static int analogix_dp_bridge_attach(struct drm_bridge *bridge, return 0; } +static +struct drm_crtc *analogix_dp_get_old_crtc(struct analogix_dp_device *dp, + struct drm_atomic_state *state) +{ + struct drm_encoder *encoder = dp->encoder; + struct drm_connector *connector; + struct drm_connector_state *conn_state; + + connector = drm_atomic_get_old_connector_for_encoder(state, encoder); + if (!connector) + return NULL; + + conn_state = drm_atomic_get_old_connector_state(state, connector); + if (!conn_state) + return NULL; + + return conn_state->crtc; +} + static struct drm_crtc *analogix_dp_get_new_crtc(struct analogix_dp_device *dp, struct drm_atomic_state *state) @@ -1448,14 +1467,16 @@ analogix_dp_bridge_atomic_disable(struct drm_bridge *bridge, { struct drm_atomic_state *old_state = old_bridge_state->base.state; struct analogix_dp_device *dp = bridge->driver_private; - struct drm_crtc *crtc; + struct drm_crtc *old_crtc, *new_crtc; + struct drm_crtc_state *old_crtc_state = NULL; struct drm_crtc_state *new_crtc_state = NULL; + int ret; - crtc = analogix_dp_get_new_crtc(dp, old_state); - if (!crtc) + new_crtc = analogix_dp_get_new_crtc(dp, old_state); + if (!new_crtc) goto out; - new_crtc_state = drm_atomic_get_new_crtc_state(old_state, crtc); + new_crtc_state = drm_atomic_get_new_crtc_state(old_state, new_crtc); if (!new_crtc_state) goto out; @@ -1464,6 +1485,19 @@ analogix_dp_bridge_atomic_disable(struct drm_bridge *bridge, return; out: + old_crtc = analogix_dp_get_old_crtc(dp, old_state); + if (old_crtc) { + old_crtc_state = drm_atomic_get_old_crtc_state(old_state, + old_crtc); + + /* When moving from PSR to fully disabled, exit PSR first. */ + if (old_crtc_state && old_crtc_state->self_refresh_active) { + ret = analogix_dp_disable_psr(dp); + if (ret) + DRM_ERROR("Failed to disable psr (%d)\n", ret); + } + } + analogix_dp_bridge_disable(bridge); } -- 2.35.1.574.g5d30c73bfb-goog
[PATCH v2 0/2] drm/bridge: analogix_dp: Self-refresh state machine fixes
Hi, I've been investigating several eDP issues on a Rockchip RK3399 system and have two proposed bugfixes. RK3399 has two CRTCs, either of which can be used for eDP output. For both fixes, we have bugs due to the relationship between the generalized "self refresh helpers" and the analogix-dp bridge driver which controls much of the PSR mechanics. These bugs are most visible when switching between CRTCs. I'm not a DRM expert, but I've been poking at a lot of Rockchip display drivers recently. I'd love some skeptical eyes, so feel free to ask questions if I haven't explained issues well, or the proposals look fishy. Regards, Brian Changes in v2: - Drop "->enable" condition in crtc_needs_disable(); this could possibly be "->active" to reflect the intended hardware state, but it also is a little over-specific. We want to make a transition through "disabled" any time we're exiting PSR at the same time as a CRTC switch. (Thanks Liu Ying) Brian Norris (2): drm/bridge: analogix_dp: Support PSR-exit to disable transition drm/atomic: Force bridge self-refresh-exit on CRTC switch .../drm/bridge/analogix/analogix_dp_core.c| 42 +-- drivers/gpu/drm/drm_atomic_helper.c | 16 +-- 2 files changed, 51 insertions(+), 7 deletions(-) -- 2.35.1.574.g5d30c73bfb-goog
Re: [PATCH 2/2] drm/atomic: Force bridge self-refresh-exit on CRTC switch
Hi Liu, On Mon, Feb 28, 2022 at 1:02 AM Liu Ying wrote: > On Tue, 2022-02-15 at 15:54 -0800, Brian Norris wrote: > > --- a/drivers/gpu/drm/drm_atomic_helper.c > > +++ b/drivers/gpu/drm/drm_atomic_helper.c > > @@ -1011,9 +1011,19 @@ crtc_needs_disable(struct drm_crtc_state *old_state, > > return drm_atomic_crtc_effectively_active(old_state); > > > > /* > > - * We need to run through the crtc_funcs->disable() function if the > > CRTC > > - * is currently on, if it's transitioning to self refresh mode, or if > > - * it's in self refresh mode and needs to be fully disabled. > > + * We need to disable bridge(s) and CRTC if we're transitioning out of > > + * self-refresh and changing CRTCs at the same time, because the > > + * bridge tracks self-refresh status via CRTC state. > > + */ > > + if (old_state->self_refresh_active && new_state->enable && > > + old_state->crtc != new_state->crtc) > > + return true; > > I think 'new_state->enable' should be changed to 'new_state->active', > because 'active' is the one to enable/disable the CRTC while 'enable' > reflects whether a mode blob is set to CRTC state. The overall logic > added above is ok to me. Let's see if others have any comments. Thanks for the review, and good catch. This actually shows that most of my development was before commit 69e630016ef4 ("drm/atomic: Check new_crtc_state->active to determine if CRTC needs disable in self refresh mode"). In fact, the "state->enable" condition was included here mostly as a complement to the "!state->enable" condition that was present previously, and I didn't adapt it properly upon rebase. In practice, this portion of the condition is not needed at all; we really want to exit PSR on CRTC-switch regardless of the new-CRTC state. So rather than change "enable" to "active", I plan to remove it entirely. I'll give it some local tests and send v2 eventually. Thanks, Brian
[PATCH v3 2/2] drm/bridge: analogix_dp: Enable autosuspend
DP AUX transactions can consist of many short operations. There's no need to power things up/down in short intervals. I pick an arbitrary 100ms; for the systems I'm testing (Rockchip RK3399), runtime-PM transitions only take a few microseconds. Signed-off-by: Brian Norris --- Changes in v3: - New in v3 drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c index 16be279aed2c..d82a4ddf44e7 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c @@ -1121,7 +1121,7 @@ static int analogix_dp_get_modes(struct drm_connector *connector) pm_runtime_get_sync(dp->dev); edid = drm_get_edid(connector, &dp->aux.ddc); - pm_runtime_put(dp->dev); + pm_runtime_put_autosuspend(dp->dev); if (edid) { drm_connector_update_edid_property(&dp->connector, edid); @@ -1642,7 +1642,7 @@ static ssize_t analogix_dpaux_transfer(struct drm_dp_aux *aux, ret = analogix_dp_transfer(dp, msg); out: - pm_runtime_put(dp->dev); + pm_runtime_put_autosuspend(dp->dev); return ret; } @@ -1775,6 +1775,8 @@ int analogix_dp_bind(struct analogix_dp_device *dp, struct drm_device *drm_dev) if (ret) return ret; + pm_runtime_use_autosuspend(dp->dev); + pm_runtime_set_autosuspend_delay(dp->dev, 100); pm_runtime_enable(dp->dev); ret = analogix_dp_create_bridge(drm_dev, dp); -- 2.35.1.265.g69c8d7142f-goog
[PATCH v3 1/2] drm/bridge: analogix_dp: Grab runtime PM reference for DP-AUX
If the display is not enable()d, then we aren't holding a runtime PM reference here. Thus, it's easy to accidentally cause a hang, if user space is poking around at /dev/drm_dp_aux0 at the "wrong" time. Let's get a runtime PM reference, and check that we "see" the panel. Don't force any panel power-up, etc., because that can be intrusive, and that's not what other drivers do (see drivers/gpu/drm/bridge/ti-sn65dsi86.c and drivers/gpu/drm/bridge/parade-ps8640.c.) Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code") Cc: Cc: Tomeu Vizoso Signed-off-by: Brian Norris --- Changes in v3: - Avoid panel power-up; just check for HPD state, and let the rest happen "as-is" (e.g., time out, if the caller hasn't prepared things properly) Changes in v2: - Fix spelling in Subject - DRM_DEV_ERROR() -> drm_err() - Propagate errors from un-analogix_dp_prepare_panel() drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c index b7d2e4449cfa..16be279aed2c 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c @@ -1632,8 +1632,19 @@ static ssize_t analogix_dpaux_transfer(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg) { struct analogix_dp_device *dp = to_dp(aux); + int ret; + + pm_runtime_get_sync(dp->dev); + + ret = analogix_dp_detect_hpd(dp); + if (ret) + goto out; - return analogix_dp_transfer(dp, msg); + ret = analogix_dp_transfer(dp, msg); +out: + pm_runtime_put(dp->dev); + + return ret; } struct analogix_dp_device * -- 2.35.1.265.g69c8d7142f-goog
Re: [PATCH v2] drm/bridge: analogix_dp: Grab runtime PM reference for DP-AUX
On Tue, Feb 15, 2022 at 3:46 PM Doug Anderson wrote: > On Tue, Feb 15, 2022 at 2:52 PM Brian Norris wrote: > > It still makes me wonder what the point > > of the /dev/drm_dp_aux interface is though, because it seems like > > you're pretty much destined to not have reliable operation through > > that means. > > I can't say I have tons of history for those files. I seem to recall > maybe someone using them to have userspace tweak the embedded > backlight on some external DP connected panels? I think we also might > use it in Chrome OS to update the firmware of panels (dunno if > internal or external) in some cases too? I suspect that it works OK > for certain situations but it's really not going to work in all > cases... Yes, I believe I'm only submitting patches like this because fwupd apparently likes to indiscriminately whack at dpaux devices: https://github.com/fwupd/fwupd/tree/main/plugins/synaptics-mst#kernel-dp-aux-interface That seems like a bad idea. (We've already disabled that plugin on these systems, but it seems wise not to leave the stumbling block here for the next time.) > I suppose this just further proves the point that this is really not a > great interface to rely on. It's fine for debugging during hardware > bringup and I guess in limited situations it might be OK, but it's > really not something we want userspace tweaking with anyway, right? In > general I expect it's up to the kernel to be controlling peripherals > on the DP AUX bus. The kernel should have a backlight driver and > should do the AUX transfers needed. Having userspace in there mucking > with things is just a bad idea. I mean, userspace also doesn't know > when a panel has been power cycled and potentially lost any changes > that they might have written, right? > > I sorta suspect that most of the uses of these files are there because > there wasn't a kernel driver and someone thought that doing it in > userspace was the way to go? *shrug* beats me. Brian
[PATCH 1/2] drm/bridge: analogix_dp: Support PSR-exit to disable transition
Most eDP panel functions only work correctly when the panel is not in self-refresh. In particular, analogix_dp_bridge_disable() tends to hit AUX channel errors if the panel is in self-refresh. Given the above, it appears that so far, this driver assumes that we are never in self-refresh when it comes time to fully disable the bridge. Prior to commit 846c7dfc1193 ("drm/atomic: Try to preserve the crtc enabled state in drm_atomic_remove_fb, v2."), this tended to be true, because we would automatically disable the pipe when framebuffers were removed, and so we'd typically disable the bridge shortly after the last display activity. However, that is not guaranteed: an idle (self-refresh) display pipe may be disabled, e.g., when switching CRTCs. We need to exit PSR first. Stable notes: this is definitely a bugfix, and the bug has likely existed in some form for quite a while. It may predate the "PSR helpers" refactor, but the code looked very different before that, and it's probably not worth rewriting the fix. Cc: Fixes: 6c836d965bad ("drm/rockchip: Use the helpers for PSR") Signed-off-by: Brian Norris --- .../drm/bridge/analogix/analogix_dp_core.c| 42 +-- 1 file changed, 38 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c index b7d2e4449cfa..6ee0f62a7161 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c @@ -1268,6 +1268,25 @@ static int analogix_dp_bridge_attach(struct drm_bridge *bridge, return 0; } +static +struct drm_crtc *analogix_dp_get_old_crtc(struct analogix_dp_device *dp, + struct drm_atomic_state *state) +{ + struct drm_encoder *encoder = dp->encoder; + struct drm_connector *connector; + struct drm_connector_state *conn_state; + + connector = drm_atomic_get_old_connector_for_encoder(state, encoder); + if (!connector) + return NULL; + + conn_state = drm_atomic_get_old_connector_state(state, connector); + if (!conn_state) + return NULL; + + return conn_state->crtc; +} + static struct drm_crtc *analogix_dp_get_new_crtc(struct analogix_dp_device *dp, struct drm_atomic_state *state) @@ -1448,14 +1467,16 @@ analogix_dp_bridge_atomic_disable(struct drm_bridge *bridge, { struct drm_atomic_state *old_state = old_bridge_state->base.state; struct analogix_dp_device *dp = bridge->driver_private; - struct drm_crtc *crtc; + struct drm_crtc *old_crtc, *new_crtc; + struct drm_crtc_state *old_crtc_state = NULL; struct drm_crtc_state *new_crtc_state = NULL; + int ret; - crtc = analogix_dp_get_new_crtc(dp, old_state); - if (!crtc) + new_crtc = analogix_dp_get_new_crtc(dp, old_state); + if (!new_crtc) goto out; - new_crtc_state = drm_atomic_get_new_crtc_state(old_state, crtc); + new_crtc_state = drm_atomic_get_new_crtc_state(old_state, new_crtc); if (!new_crtc_state) goto out; @@ -1464,6 +1485,19 @@ analogix_dp_bridge_atomic_disable(struct drm_bridge *bridge, return; out: + old_crtc = analogix_dp_get_old_crtc(dp, old_state); + if (old_crtc) { + old_crtc_state = drm_atomic_get_old_crtc_state(old_state, + old_crtc); + + /* When moving from PSR to fully disabled, exit PSR first. */ + if (old_crtc_state && old_crtc_state->self_refresh_active) { + ret = analogix_dp_disable_psr(dp); + if (ret) + DRM_ERROR("Failed to disable psr (%d)\n", ret); + } + } + analogix_dp_bridge_disable(bridge); } -- 2.35.1.265.g69c8d7142f-goog
[PATCH 2/2] drm/atomic: Force bridge self-refresh-exit on CRTC switch
It's possible to change which CRTC is in use for a given connector/encoder/bridge while we're in self-refresh without fully disabling the connector/encoder/bridge along the way. This can confuse the bridge encoder/bridge, because (a) it needs to track the SR state (trying to perform "active" operations while the panel is still in SR can be Bad(TM)); and (b) it tracks the SR state via the CRTC state (and after the switch, the previous SR state is lost). Thus, we need to either somehow carry the self-refresh state over to the new CRTC, or else force an encoder/bridge self-refresh transition during such a switch. I choose the latter, so we disable the encoder (and exit PSR) before attaching it to the new CRTC (where we can continue to assume a clean (non-self-refresh) state). This fixes PSR issues seen on Rockchip RK3399 systems with drivers/gpu/drm/bridge/analogix/analogix_dp_core.c. Cc: Fixes: 1452c25b0e60 ("drm: Add helpers to kick off self refresh mode in drivers") Signed-off-by: Brian Norris --- drivers/gpu/drm/drm_atomic_helper.c | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c index 9603193d2fa1..74161d007894 100644 --- a/drivers/gpu/drm/drm_atomic_helper.c +++ b/drivers/gpu/drm/drm_atomic_helper.c @@ -1011,9 +1011,19 @@ crtc_needs_disable(struct drm_crtc_state *old_state, return drm_atomic_crtc_effectively_active(old_state); /* -* We need to run through the crtc_funcs->disable() function if the CRTC -* is currently on, if it's transitioning to self refresh mode, or if -* it's in self refresh mode and needs to be fully disabled. +* We need to disable bridge(s) and CRTC if we're transitioning out of +* self-refresh and changing CRTCs at the same time, because the +* bridge tracks self-refresh status via CRTC state. +*/ + if (old_state->self_refresh_active && new_state->enable && + old_state->crtc != new_state->crtc) + return true; + + /* +* We also need to run through the crtc_funcs->disable() function if +* the CRTC is currently on, if it's transitioning to self refresh +* mode, or if it's in self refresh mode and needs to be fully +* disabled. */ return old_state->active || (old_state->self_refresh_active && !new_state->active) || -- 2.35.1.265.g69c8d7142f-goog
[PATCH 0/2] drm/bridge: analogix_dp: Self-refresh state machine fixes
Hi, I've been investigating several eDP issues on a Rockchip RK3399 system and have two proposed bugfixes. RK3399 has two CRTCs, either of which can be used for eDP output. For both fixes, we have bugs due to the relationship between the generalized "self refresh helpers" and the analogix-dp bridge driver which controls much of the PSR mechanics. These bugs are most visible when switching between CRTCs. I'm not a DRM expert, but I've been poking at a lot of Rockchip display drivers recently. I'd love some skeptical eyes, so feel free to ask questions if I haven't explained issues well, or the proposals look fishy. Regards, Brian Brian Norris (2): drm/bridge: analogix_dp: Support PSR-exit to disable transition drm/atomic: Force bridge self-refresh-exit on CRTC switch .../drm/bridge/analogix/analogix_dp_core.c| 42 +-- drivers/gpu/drm/drm_atomic_helper.c | 16 +-- 2 files changed, 51 insertions(+), 7 deletions(-) -- 2.35.1.265.g69c8d7142f-goog
Re: [PATCH v2] drm/bridge: analogix_dp: Grab runtime PM reference for DP-AUX
On Tue, Feb 15, 2022 at 1:31 PM Doug Anderson wrote: > > Hi, Hi! > On Fri, Oct 1, 2021 at 2:50 PM Brian Norris wrote: > > > > If the display is not enable()d, then we aren't holding a runtime PM > > reference here. Thus, it's easy to accidentally cause a hang, if user > > space is poking around at /dev/drm_dp_aux0 at the "wrong" time. > > > > Let's get the panel and PM state right before trying to talk AUX. > > diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c > > b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c > > index b7d2e4449cfa..6fc46ac93ef8 100644 > > --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c > > +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c > > @@ -1632,8 +1632,27 @@ static ssize_t analogix_dpaux_transfer(struct > > drm_dp_aux *aux, ... > > + pm_runtime_get_sync(dp->dev); > > + ret = analogix_dp_transfer(dp, msg); > > + pm_runtime_put(dp->dev); > > I've spent an unfortunate amount of time digging around the DP AUX bus > recently, so I can at least say that I have some experience and some > opinions here. Thanks! Experience is welcome, and opinions too sometimes ;) > IMO: > > 1. Don't power the panel on. If the panel isn't powered on then the DP > AUX transfer will timeout. Tough nuggies. Think of yourself more like > an i2c controller and of this as an i2c transfer implementation. The > i2c controller isn't in charge of powering up the i2c devices on the > bus. If userspace does an "i2c detect" on an i2c bus and some of the > devices aren't powered then they won't be found. If you try to > read/write from a powered off device that won't work either. I guess this, paired with the driver examples below (ti-sn65dsi86.c, especially, which specifically throws errors if the panel isn't on), makes some sense. It's approximately (but more verbosely) what Andrzej was recommending too, I guess. It still makes me wonder what the point of the /dev/drm_dp_aux interface is though, because it seems like you're pretty much destined to not have reliable operation through that means. Also note: I found that the AUX bus is really not working properly at all (even with this patch) in some cases due to self-refresh. Not only do we need the panel enabled, but we need to not be in self-refresh mode. Self-refresh is not currently exposed to user space, so user space has no way of knowing the panel is currently active, aside from racily inducing artificial display activity. But if we're OK with "just throw errors" or "just let stuff time out", then I guess that's not a big deal. My purpose is to avoid hanging the system, not to make /dev/drm_dp_aux useful. > 2. In theory if the DP driver can read HPD (I haven't looked through > the analogix code to see how it handles it) then you can fail an AUX > transfer right away if HPD isn't asserted instead of timing out. If > this is hard, it's probably fine to just time out though. This driver does handle HPD, but it also has overrides because apparently it doesn't work on some systems. I might see if we can leverage it, or I might just follow the bridge-enabled state (similar to ti-sn65dsi86.c's 'comms_enabled'). > 3. Do the "pm_runtime" calls, but enable "autosuspend" with something > ~1 second autosuspend delay. When using the AUX bus to read an EDID > the underlying code will call your function 16 times in quick > succession. If you're powering up and down constantly that'll be a bit > of a waste. Does this part really matter? For properly active cases, the bridge remains enabled, and it holds a runtime PM reference. For "maybe active" (your "tough nuggies" situation above), you're probably right that it's inefficient, but does it matter, when it's going to be a slow timed-out operation anyway? The AUX failure will be much slower than the PM transition. I guess I can do this anyway, but frankly, I'll just be copy/pasting stuff from other drivers, because the runtime PM documentation still confuses me, and moreso once you involve autosuspend. > For a reference, you could look at > `drivers/gpu/drm/bridge/ti-sn65dsi86.c`. Also > `drivers/gpu/drm/bridge/parade-ps8640.c` Thanks for these. They look like reasonable patterns to follow. Brian
[PATCH] drm/rockchip: vop: Correct RK3399 VOP register fields
Commit 7707f7227f09 ("drm/rockchip: Add support for afbc") switched up the rk3399_vop_big[] register windows, but it did so incorrectly. The biggest problem is in rk3288_win23_data[] vs. rk3368_win23_data[] .format field: RK3288's format: VOP_REG(RK3288_WIN2_CTRL0, 0x7, 1) RK3368's format: VOP_REG(RK3368_WIN2_CTRL0, 0x3, 5) Bits 5:6 (i.e., shift 5, mask 0x3) are correct for RK3399, according to the TRM. There are a few other small differences between the 3288 and 3368 definitions that were swapped in commit 7707f7227f09. I reviewed them to the best of my ability according to the RK3399 TRM and fixed them up. This fixes IOMMU issues (and display errors) when testing with BG24 color formats. Fixes: 7707f7227f09 ("drm/rockchip: Add support for afbc") Cc: Andrzej Pietrasiewicz Cc: Signed-off-by: Brian Norris --- I'd appreciate notes or testing from Andrzej, since I'm not sure how he tested his original AFBC work. drivers/gpu/drm/rockchip/rockchip_vop_reg.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/rockchip/rockchip_vop_reg.c b/drivers/gpu/drm/rockchip/rockchip_vop_reg.c index 1f7353f0684a..798b542e5916 100644 --- a/drivers/gpu/drm/rockchip/rockchip_vop_reg.c +++ b/drivers/gpu/drm/rockchip/rockchip_vop_reg.c @@ -902,6 +902,7 @@ static const struct vop_win_phy rk3399_win01_data = { .enable = VOP_REG(RK3288_WIN0_CTRL0, 0x1, 0), .format = VOP_REG(RK3288_WIN0_CTRL0, 0x7, 1), .rb_swap = VOP_REG(RK3288_WIN0_CTRL0, 0x1, 12), + .x_mir_en = VOP_REG(RK3288_WIN0_CTRL0, 0x1, 21), .y_mir_en = VOP_REG(RK3288_WIN0_CTRL0, 0x1, 22), .act_info = VOP_REG(RK3288_WIN0_ACT_INFO, 0x1fff1fff, 0), .dsp_info = VOP_REG(RK3288_WIN0_DSP_INFO, 0x0fff0fff, 0), @@ -912,6 +913,7 @@ static const struct vop_win_phy rk3399_win01_data = { .uv_vir = VOP_REG(RK3288_WIN0_VIR, 0x3fff, 16), .src_alpha_ctl = VOP_REG(RK3288_WIN0_SRC_ALPHA_CTRL, 0xff, 0), .dst_alpha_ctl = VOP_REG(RK3288_WIN0_DST_ALPHA_CTRL, 0xff, 0), + .channel = VOP_REG(RK3288_WIN0_CTRL2, 0xff, 0), }; /* @@ -922,11 +924,11 @@ static const struct vop_win_phy rk3399_win01_data = { static const struct vop_win_data rk3399_vop_win_data[] = { { .base = 0x00, .phy = &rk3399_win01_data, .type = DRM_PLANE_TYPE_PRIMARY }, - { .base = 0x40, .phy = &rk3288_win01_data, + { .base = 0x40, .phy = &rk3368_win01_data, .type = DRM_PLANE_TYPE_OVERLAY }, - { .base = 0x00, .phy = &rk3288_win23_data, + { .base = 0x00, .phy = &rk3368_win23_data, .type = DRM_PLANE_TYPE_OVERLAY }, - { .base = 0x50, .phy = &rk3288_win23_data, + { .base = 0x50, .phy = &rk3368_win23_data, .type = DRM_PLANE_TYPE_CURSOR }, }; -- 2.34.1.703.g22d0c6ccf7-goog
Re: [PATCH v2 3/3] ASoC: rk3399_gru_sound: Wire up DP jack detection
Hi Chen-Yu, On Mon, Jan 17, 2022 at 05:01:52PM +0800, Chen-Yu Tsai wrote: > On Sat, Jan 15, 2022 at 7:03 AM Brian Norris wrote: > > > > Now that the cdn-dp driver supports plug-change callbacks, let's wire it > > up. > > > > Signed-off-by: Brian Norris > > --- > > > > (no changes since v1) > > > > sound/soc/rockchip/rk3399_gru_sound.c | 20 > > 1 file changed, 20 insertions(+) > > > > diff --git a/sound/soc/rockchip/rk3399_gru_sound.c > > b/sound/soc/rockchip/rk3399_gru_sound.c > > index e2d52d8d0ff9..eeef3ed70037 100644 > > --- a/sound/soc/rockchip/rk3399_gru_sound.c > > +++ b/sound/soc/rockchip/rk3399_gru_sound.c > > @@ -164,6 +164,25 @@ static int rockchip_sound_da7219_hw_params(struct > > snd_pcm_substream *substream, > > return 0; > > } > > > > +static struct snd_soc_jack cdn_dp_card_jack; > > + > > +static int rockchip_sound_cdndp_init(struct snd_soc_pcm_runtime *rtd) > > +{ > > + struct snd_soc_component *component = asoc_rtd_to_codec(rtd, > > 0)->component; > > Using snd_soc_card_get_codec_dai() might be a better choice throughout this > driver. While it will work for the cdn_dp case, because it is the first DAI > in |rockchip_dais[]|, all the invocations for the other codecs are likely > returning the wrong DAI. I'll admit, I'm not very familiar with the ASoC object model, so you may well be correct that there's something fishy in here. But I did trace through the objects involved here, and we *are* getting the correct DAI for both this case and the DA7219 case (preexisting code). It looks like we actually have a new runtime for each of our static dai_links: devm_snd_soc_register_card() ... for_each_card_prelinks() snd_soc_add_pcm_runtime() So I think this is valid to keep as-is. > For this particular patch it works either way, so > > Reviewed-by: Chen-Yu Tsai Thanks for looking! Brian
[PATCH v2 3/3] ASoC: rk3399_gru_sound: Wire up DP jack detection
Now that the cdn-dp driver supports plug-change callbacks, let's wire it up. Signed-off-by: Brian Norris --- (no changes since v1) sound/soc/rockchip/rk3399_gru_sound.c | 20 1 file changed, 20 insertions(+) diff --git a/sound/soc/rockchip/rk3399_gru_sound.c b/sound/soc/rockchip/rk3399_gru_sound.c index e2d52d8d0ff9..eeef3ed70037 100644 --- a/sound/soc/rockchip/rk3399_gru_sound.c +++ b/sound/soc/rockchip/rk3399_gru_sound.c @@ -164,6 +164,25 @@ static int rockchip_sound_da7219_hw_params(struct snd_pcm_substream *substream, return 0; } +static struct snd_soc_jack cdn_dp_card_jack; + +static int rockchip_sound_cdndp_init(struct snd_soc_pcm_runtime *rtd) +{ + struct snd_soc_component *component = asoc_rtd_to_codec(rtd, 0)->component; + struct snd_soc_card *card = rtd->card; + int ret; + + /* Enable jack detection. */ + ret = snd_soc_card_jack_new(card, "DP Jack", SND_JACK_LINEOUT, + &cdn_dp_card_jack, NULL, 0); + if (ret) { + dev_err(card->dev, "Can't create DP Jack %d\n", ret); + return ret; + } + + return snd_soc_component_set_jack(component, &cdn_dp_card_jack, NULL); +} + static int rockchip_sound_da7219_init(struct snd_soc_pcm_runtime *rtd) { struct snd_soc_component *component = asoc_rtd_to_codec(rtd, 0)->component; @@ -315,6 +334,7 @@ static const struct snd_soc_dai_link rockchip_dais[] = { [DAILINK_CDNDP] = { .name = "DP", .stream_name = "DP PCM", + .init = rockchip_sound_cdndp_init, .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | SND_SOC_DAIFMT_CBS_CFS, SND_SOC_DAILINK_REG(cdndp), -- 2.34.1.703.g22d0c6ccf7-goog
[PATCH v2 2/3] drm/rockchip: cdn-dp: Support HDMI codec plug-change callback
Some audio servers like to monitor a jack device (perhaps combined with EDID, for audio-presence info) to determine DP/HDMI audio presence. Signed-off-by: Brian Norris --- (no changes since v1) drivers/gpu/drm/rockchip/cdn-dp-core.c | 28 ++ drivers/gpu/drm/rockchip/cdn-dp-core.h | 4 2 files changed, 32 insertions(+) diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c b/drivers/gpu/drm/rockchip/cdn-dp-core.c index 16497c31d9f9..edd6a1fc46cd 100644 --- a/drivers/gpu/drm/rockchip/cdn-dp-core.c +++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c @@ -586,6 +586,13 @@ static bool cdn_dp_check_link_status(struct cdn_dp_device *dp) return drm_dp_channel_eq_ok(link_status, min(port->lanes, sink_lanes)); } +static void cdn_dp_audio_handle_plugged_change(struct cdn_dp_device *dp, + bool plugged) +{ + if (dp->codec_dev) + dp->plugged_cb(dp->codec_dev, plugged); +} + static void cdn_dp_encoder_enable(struct drm_encoder *encoder) { struct cdn_dp_device *dp = encoder_to_dp(encoder); @@ -641,6 +648,9 @@ static void cdn_dp_encoder_enable(struct drm_encoder *encoder) DRM_DEV_ERROR(dp->dev, "Failed to valid video %d\n", ret); goto out; } + + cdn_dp_audio_handle_plugged_change(dp, true); + out: mutex_unlock(&dp->lock); } @@ -651,6 +661,8 @@ static void cdn_dp_encoder_disable(struct drm_encoder *encoder) int ret; mutex_lock(&dp->lock); + cdn_dp_audio_handle_plugged_change(dp, false); + if (dp->active) { ret = cdn_dp_disable(dp); if (ret) { @@ -846,11 +858,27 @@ static int cdn_dp_audio_get_eld(struct device *dev, void *data, return 0; } +static int cdn_dp_audio_hook_plugged_cb(struct device *dev, void *data, + hdmi_codec_plugged_cb fn, + struct device *codec_dev) +{ + struct cdn_dp_device *dp = dev_get_drvdata(dev); + + mutex_lock(&dp->lock); + dp->plugged_cb = fn; + dp->codec_dev = codec_dev; + cdn_dp_audio_handle_plugged_change(dp, dp->connected); + mutex_unlock(&dp->lock); + + return 0; +} + static const struct hdmi_codec_ops audio_codec_ops = { .hw_params = cdn_dp_audio_hw_params, .audio_shutdown = cdn_dp_audio_shutdown, .mute_stream = cdn_dp_audio_mute_stream, .get_eld = cdn_dp_audio_get_eld, + .hook_plugged_cb = cdn_dp_audio_hook_plugged_cb, .no_capture_mute = 1, }; diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.h b/drivers/gpu/drm/rockchip/cdn-dp-core.h index 81ac9b658a70..d808a9de45ed 100644 --- a/drivers/gpu/drm/rockchip/cdn-dp-core.h +++ b/drivers/gpu/drm/rockchip/cdn-dp-core.h @@ -10,6 +10,7 @@ #include #include #include +#include #include "rockchip_drm_drv.h" @@ -101,5 +102,8 @@ struct cdn_dp_device { u8 dpcd[DP_RECEIVER_CAP_SIZE]; bool sink_has_audio; + + hdmi_codec_plugged_cb plugged_cb; + struct device *codec_dev; }; #endif /* _CDN_DP_CORE_H */ -- 2.34.1.703.g22d0c6ccf7-goog
[PATCH v2 1/3] arm64: dts: rockchip: Switch RK3399-Gru DP to SPDIF output
Commit b18c6c3c7768 ("ASoC: rockchip: cdn-dp sound output use spdif") switched the platform to SPDIF, but we didn't fix up the device tree. Drop the pinctrl settings, because the 'spdif_bus' pins are either: * unused (on kevin, bob), so the settings is ~harmless * used by a different function (on scarlet), which causes probe failures (!!) Fixes: b18c6c3c7768 ("ASoC: rockchip: cdn-dp sound output use spdif") Signed-off-by: Brian Norris --- Changes in v2: - (Un)set pinctrl, because the default assumes we're routing out to external pins arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 17 - 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi index 45a5ae5d2027..162f08bca0d4 100644 --- a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi @@ -286,7 +286,7 @@ max98357a: max98357a { sound: sound { compatible = "rockchip,rk3399-gru-sound"; - rockchip,cpu = <&i2s0 &i2s2>; + rockchip,cpu = <&i2s0 &spdif>; }; }; @@ -437,10 +437,6 @@ &i2s0 { status = "okay"; }; -&i2s2 { - status = "okay"; -}; - &io_domains { status = "okay"; @@ -537,6 +533,17 @@ &sdmmc { vqmmc-supply = <&ppvar_sd_card_io>; }; +&spdif { + status = "okay"; + + /* +* SPDIF is routed internally to DP; we either don't use these pins, or +* mux them to something else. +*/ + /delete-property/ pinctrl-0; + /delete-property/ pinctrl-names; +}; + &spi1 { status = "okay"; -- 2.34.1.703.g22d0c6ccf7-goog
[PATCH v2 0/3] (Re)enable DP/HDMI audio for RK3399 Gru
This series fixes DP/HDMI audio for RK3399 Gru systems. First, there was a regression with the switch to SPDIF. Patch 1 can be taken separately as a regression fix if desired. But it's not quite so useful (at least on Chrome OS systems) without the second part. Second, jack detection was never upstreamed, because the hdmi-codec dependencies were still being worked out when this platform was first supported. Patches cover a few subsystems. Perhaps this is something for arm-soc? Changes in v2: - (Un)set pinctrl, because the default assumes we're routing out to external pins Brian Norris (3): arm64: dts: rockchip: Switch RK3399-Gru DP to SPDIF output drm/rockchip: cdn-dp: Support HDMI codec plug-change callback ASoC: rk3399_gru_sound: Wire up DP jack detection arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 17 drivers/gpu/drm/rockchip/cdn-dp-core.c | 28 drivers/gpu/drm/rockchip/cdn-dp-core.h | 4 +++ sound/soc/rockchip/rk3399_gru_sound.c| 20 ++ 4 files changed, 64 insertions(+), 5 deletions(-) -- 2.34.1.703.g22d0c6ccf7-goog
Re: [PATCH 1/3] arm64: dts: rockchip: Switch RK3399-Gru DP to SPDIF output
Sorry to send a self-reply so quickly, but I noticed an error and want to make sure this doesn't get merged _too_ quickly before I get to send a revision! See below: On Fri, Jan 14, 2022 at 12:17 PM Brian Norris wrote: > > Commit b18c6c3c7768 ("ASoC: rockchip: cdn-dp sound output use spdif") > switched the platform to SPDIF, but we didn't fix up the device tree. > > Fixes: b18c6c3c7768 ("ASoC: rockchip: cdn-dp sound output use spdif") > Signed-off-by: Brian Norris > --- > > arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi > b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi > index 45a5ae5d2027..21ec073f4d51 100644 > --- a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi > +++ b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi > +&spdif { > + status = "okay"; I need to fix up the pinctrl settings here. rk3399.dtsi has a default that is incorrect. That's OK for several variants (Kevin and Bob, where the pin is actually unconnected), but it breaks Scarlet (where the pin in question is actually connected to something else). I'll send a v2 after waiting a bit, in case there are other comments worth addressing at the same time. Brian > +};
[PATCH 2/3] drm/rockchip: cdn-dp: Support HDMI codec plug-change callback
Some audio servers like to monitor a jack device (perhaps combined with EDID, for audio-presence info) to determine DP/HDMI audio presence. Signed-off-by: Brian Norris --- drivers/gpu/drm/rockchip/cdn-dp-core.c | 28 ++ drivers/gpu/drm/rockchip/cdn-dp-core.h | 4 2 files changed, 32 insertions(+) diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c b/drivers/gpu/drm/rockchip/cdn-dp-core.c index 16497c31d9f9..edd6a1fc46cd 100644 --- a/drivers/gpu/drm/rockchip/cdn-dp-core.c +++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c @@ -586,6 +586,13 @@ static bool cdn_dp_check_link_status(struct cdn_dp_device *dp) return drm_dp_channel_eq_ok(link_status, min(port->lanes, sink_lanes)); } +static void cdn_dp_audio_handle_plugged_change(struct cdn_dp_device *dp, + bool plugged) +{ + if (dp->codec_dev) + dp->plugged_cb(dp->codec_dev, plugged); +} + static void cdn_dp_encoder_enable(struct drm_encoder *encoder) { struct cdn_dp_device *dp = encoder_to_dp(encoder); @@ -641,6 +648,9 @@ static void cdn_dp_encoder_enable(struct drm_encoder *encoder) DRM_DEV_ERROR(dp->dev, "Failed to valid video %d\n", ret); goto out; } + + cdn_dp_audio_handle_plugged_change(dp, true); + out: mutex_unlock(&dp->lock); } @@ -651,6 +661,8 @@ static void cdn_dp_encoder_disable(struct drm_encoder *encoder) int ret; mutex_lock(&dp->lock); + cdn_dp_audio_handle_plugged_change(dp, false); + if (dp->active) { ret = cdn_dp_disable(dp); if (ret) { @@ -846,11 +858,27 @@ static int cdn_dp_audio_get_eld(struct device *dev, void *data, return 0; } +static int cdn_dp_audio_hook_plugged_cb(struct device *dev, void *data, + hdmi_codec_plugged_cb fn, + struct device *codec_dev) +{ + struct cdn_dp_device *dp = dev_get_drvdata(dev); + + mutex_lock(&dp->lock); + dp->plugged_cb = fn; + dp->codec_dev = codec_dev; + cdn_dp_audio_handle_plugged_change(dp, dp->connected); + mutex_unlock(&dp->lock); + + return 0; +} + static const struct hdmi_codec_ops audio_codec_ops = { .hw_params = cdn_dp_audio_hw_params, .audio_shutdown = cdn_dp_audio_shutdown, .mute_stream = cdn_dp_audio_mute_stream, .get_eld = cdn_dp_audio_get_eld, + .hook_plugged_cb = cdn_dp_audio_hook_plugged_cb, .no_capture_mute = 1, }; diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.h b/drivers/gpu/drm/rockchip/cdn-dp-core.h index 81ac9b658a70..d808a9de45ed 100644 --- a/drivers/gpu/drm/rockchip/cdn-dp-core.h +++ b/drivers/gpu/drm/rockchip/cdn-dp-core.h @@ -10,6 +10,7 @@ #include #include #include +#include #include "rockchip_drm_drv.h" @@ -101,5 +102,8 @@ struct cdn_dp_device { u8 dpcd[DP_RECEIVER_CAP_SIZE]; bool sink_has_audio; + + hdmi_codec_plugged_cb plugged_cb; + struct device *codec_dev; }; #endif /* _CDN_DP_CORE_H */ -- 2.34.1.703.g22d0c6ccf7-goog
[PATCH 3/3] ASoC: rk3399_gru_sound: Wire up DP jack detection
Now that the cdn-dp driver supports plug-change callbacks, let's wire it up. Signed-off-by: Brian Norris --- sound/soc/rockchip/rk3399_gru_sound.c | 20 1 file changed, 20 insertions(+) diff --git a/sound/soc/rockchip/rk3399_gru_sound.c b/sound/soc/rockchip/rk3399_gru_sound.c index e2d52d8d0ff9..eeef3ed70037 100644 --- a/sound/soc/rockchip/rk3399_gru_sound.c +++ b/sound/soc/rockchip/rk3399_gru_sound.c @@ -164,6 +164,25 @@ static int rockchip_sound_da7219_hw_params(struct snd_pcm_substream *substream, return 0; } +static struct snd_soc_jack cdn_dp_card_jack; + +static int rockchip_sound_cdndp_init(struct snd_soc_pcm_runtime *rtd) +{ + struct snd_soc_component *component = asoc_rtd_to_codec(rtd, 0)->component; + struct snd_soc_card *card = rtd->card; + int ret; + + /* Enable jack detection. */ + ret = snd_soc_card_jack_new(card, "DP Jack", SND_JACK_LINEOUT, + &cdn_dp_card_jack, NULL, 0); + if (ret) { + dev_err(card->dev, "Can't create DP Jack %d\n", ret); + return ret; + } + + return snd_soc_component_set_jack(component, &cdn_dp_card_jack, NULL); +} + static int rockchip_sound_da7219_init(struct snd_soc_pcm_runtime *rtd) { struct snd_soc_component *component = asoc_rtd_to_codec(rtd, 0)->component; @@ -315,6 +334,7 @@ static const struct snd_soc_dai_link rockchip_dais[] = { [DAILINK_CDNDP] = { .name = "DP", .stream_name = "DP PCM", + .init = rockchip_sound_cdndp_init, .dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | SND_SOC_DAIFMT_CBS_CFS, SND_SOC_DAILINK_REG(cdndp), -- 2.34.1.703.g22d0c6ccf7-goog
[PATCH 1/3] arm64: dts: rockchip: Switch RK3399-Gru DP to SPDIF output
Commit b18c6c3c7768 ("ASoC: rockchip: cdn-dp sound output use spdif") switched the platform to SPDIF, but we didn't fix up the device tree. Fixes: b18c6c3c7768 ("ASoC: rockchip: cdn-dp sound output use spdif") Signed-off-by: Brian Norris --- arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi index 45a5ae5d2027..21ec073f4d51 100644 --- a/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi +++ b/arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi @@ -286,7 +286,7 @@ max98357a: max98357a { sound: sound { compatible = "rockchip,rk3399-gru-sound"; - rockchip,cpu = <&i2s0 &i2s2>; + rockchip,cpu = <&i2s0 &spdif>; }; }; @@ -437,10 +437,6 @@ &i2s0 { status = "okay"; }; -&i2s2 { - status = "okay"; -}; - &io_domains { status = "okay"; @@ -537,6 +533,10 @@ &sdmmc { vqmmc-supply = <&ppvar_sd_card_io>; }; +&spdif { + status = "okay"; +}; + &spi1 { status = "okay"; -- 2.34.1.703.g22d0c6ccf7-goog
[PATCH 0/3] (Re)enable DP/HDMI audio for RK3399 Gru
This series fixes DP/HDMI audio for RK3399 Gru systems. First, there was a regression with the switch to SPDIF. Patch 1 can be taken separately as a regression fix if desired. But it's not quite so useful (at least on Chrome OS systems) without the second part. Second, jack detection was never upstreamed, because the hdmi-codec dependencies were still being worked out when this platform was first supported. Patches cover a few subsystems. Perhaps this is something for arm-soc? Brian Norris (3): arm64: dts: rockchip: Switch RK3399-Gru DP to SPDIF output drm/rockchip: cdn-dp: Support HDMI codec plug-change callback ASoC: rk3399_gru_sound: Wire up DP jack detection arch/arm64/boot/dts/rockchip/rk3399-gru.dtsi | 10 +++ drivers/gpu/drm/rockchip/cdn-dp-core.c | 28 drivers/gpu/drm/rockchip/cdn-dp-core.h | 4 +++ sound/soc/rockchip/rk3399_gru_sound.c| 20 ++ 4 files changed, 57 insertions(+), 5 deletions(-) -- 2.34.1.703.g22d0c6ccf7-goog
Re: [PATCH v2] drm/bridge: analogix_dp: Grab runtime PM reference for DP-AUX
Hi Andrzej, On Tue, Jan 11, 2022 at 5:26 AM Andrzej Hajda wrote: > I am not DP specialist so CC-ed people working with DP Thanks for the review regardless! I'll also not claim to be a DP specialist -- although I've had to learn my fair share to debug a good handful of issues on an SoC using this driver. > On 01.10.2021 23:42, Brian Norris wrote: > > If the display is not enable()d, then we aren't holding a runtime PM > > reference here. Thus, it's easy to accidentally cause a hang, if user > > space is poking around at /dev/drm_dp_aux0 at the "wrong" time. > > > > Let's get the panel and PM state right before trying to talk AUX. > > > > Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code") > > Cc: > > Cc: Tomeu Vizoso > > Signed-off-by: Brian Norris > > > Few questions/issues here: > > 1. If it is just to avoid accidental 'hangs' it would be better to just > check if the panel is working before transfer, if not, return error > code. If there is better reason for this pm dance, please provide it in > description. I'm not that familiar with DP-AUX, but I believe it can potentially provide a variety of useful information (e.g., EDID?) to users without the display and primary video link being active. So it doesn't sound like a good idea to me to purposely leave this interface uninitialized (and emitting errors) even when the user is asking for communication (via /dev/drm_dp_aux). Do you want me to document what /dev/drm_dp_aux does, and why someone would use it, in the commit message? > 2. Again I see an assumption that panel-prepare enables power for > something different than video transmission, accidentally it is true for > most devices, but devices having more fine grained power management will > break, or at least will be used inefficiently - but maybe in case of dp > it is OK ??? For this part, I'm less sure -- I wasn't sure what the general needs are for AUX communication, and whether we need the panel enabled or not. It seems logical that we need something powered, and I don't know of anything besides "prepare()" that ensures that for DP panels. (NB: the key to _my_ problem is the PM runtime reference. It's absolutely essential that we don't try to utilize the DP hardware without powering it up. The panel power state is less critical.) > 3. More general issue - I am not sure if this should not be handled > uniformly for all drm_dp devices. I'm not sure what precisely you mean by #3. But FWIW, this is at least partially documented ("make sure it's been properly enabled"): /** * @transfer: transfers a message representing a single AUX * transaction. * * This is a hardware-specific implementation of how * transactions are executed that the drivers must provide. ... * Also note that this callback can be called no matter the * state @dev is in. Drivers that need that device to be powered * to perform this operation will first need to make sure it's * been properly enabled. */ ssize_t (*transfer)(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg); But maybe the definition of "properly enabled" is what you're unsure about? (I'm also a little unsure.) Regards, Brian
Re: [PATCH v2] drm/bridge: analogix_dp: Grab runtime PM reference for DP-AUX
(updating Andrzej's email) On Fri, Oct 1, 2021 at 2:50 PM Brian Norris wrote: > If the display is not enable()d, then we aren't holding a runtime PM > reference here. Thus, it's easy to accidentally cause a hang, if user > space is poking around at /dev/drm_dp_aux0 at the "wrong" time. > > Let's get the panel and PM state right before trying to talk AUX. > > Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code") > Cc: > Cc: Tomeu Vizoso > Signed-off-by: Brian Norris > --- > > Changes in v2: > - Fix spelling in Subject > - DRM_DEV_ERROR() -> drm_err() > - Propagate errors from un-analogix_dp_prepare_panel() > > .../drm/bridge/analogix/analogix_dp_core.c| 21 ++- > 1 file changed, 20 insertions(+), 1 deletion(-) Ping? Do I need to do anything more here? Thanks, Brian
Re: [PATCH v2 1/2] drm/input_helper: Add new input-handling helper
Hi Pekka, On Fri, Nov 19, 2021 at 12:38:41PM +0200, Pekka Paalanen wrote: > On Thu, 18 Nov 2021 17:46:10 -0800 > Brian Norris wrote: > > On Thu, Nov 18, 2021 at 12:39:28PM +0200, Pekka Paalanen wrote: > > > On Wed, 17 Nov 2021 14:48:40 -0800 > > > Brian Norris wrote: > > > If KMS gets a pageflip or modeset in no time after an input event, then > > > what's the gain. OTOH, if the display server is locking on to vblank, > > > there might be a delay worth avoiding. But then, is it worth > > > short-circuiting the wake-up in kernel vs. adding a new ioctl that > > > userspace could hit to start the warming up process? > > > > Rob responded to the first part to some extent (there is definitely gain > > to be had). > > > > To the last part: I wrote a simple debugfs hook to allow user space to > > force a PSR exit, and then a simple user space program to read input > > events and smash that debugfs file whenever it sees one. Testing in the > > same scenarios, this appears to lose less than 100 microseconds versus > > the in-kernel approach, which is negligible for this use case. (I'm not > > sure about the other use cases.) > > > > So, this is technically doable in user space. > > This is crucial information I would like you to include in some commit > message. I think it is very interesting for the reviewers. Maybe also > copy that in the cover letter. > > In my opinion there is a clear and obvious decision due that > measurement: Add the new ioctl for userspace to hit, do not try to > hardcode or upload the wake-up policy into the kernel. Perhaps. I'll admit, I'm not eager to go write the fd-passing solutions that others are designing on the fly. I'm currently torn on whether I'll just live with this current patch set out-of-tree (or, y'all can decide that a simple, 99% working solution is better than no solution), because it's simple; or possibly figuring out how to utilize such an ioctl cleanly within our display manager. I'm not super hopeful on the latter. IOW, I'm approximately in line with Doug's thoughts: https://lore.kernel.org/all/CAD=FV=XARhZoj+0p-doxcbC=4K+NuMc=ur6wqg6kwk-mkpk...@mail.gmail.com/ But then, we're obviously biased. > > I can't speak to the ease of _actually_ integrating this into even our > > own Chrome display manager, but I highly doubt it will get integrated > > into others. I'd posit this should weigh into the relative worth, but > > otherwise can't really give you an answer there. > > I think such a thing would be very simple to add to any display server. > They already have hooks for things like resetting idle timeout timers on > any relevant input event. > > > I'd also note, software-directed PSR is so far designed to be completely > > opaque to user space. There's no way to disable it; no way to know it's > > active; and no way to know anything about the parameters it's computing > > (like average entry/exit delay). Would you suggest a whole set of new > > IOCTLs for this? > > Just one ioctl on the DRM device: "Hey, wake up!". Because that's what > your patch does in-kernel, right? Well, we'd at least want something to advertise that the feature does something ("is supported") I think, otherwise we're just asking user space to do useless work. > If there are use case specific parameters, then how did you intend to > allow adjusting those in your proposal? Another commenter mentioned the latency tradeoff -- it's possible that there are panels/eDP-links that resume fast enough that one doesn't care to use this ioctl. For an in-kernel solution, one has all the data available and could use hardware information to make decisions, if needed. For a user space solution, we won't have any of that, and we'd have to work to expose that information. I suppose we could ignore that problem and only expose a minimal UAPI until we need something more, but it feels like exposing a UAPI for something is a critical point where one should make sure it's reasonably descriptive and useful. > > > How do you know userspace is using this input device at all? If > > > userspace is not using the input device, then DRM should not be opening > > > it either, as it must have no effect on anything. > > > > > > If you open an input device that userspace does not use, you also cause > > > a power consumption regression, because now the input device itself is > > > active and possibly flooding the kernel with events (e.g. an > > > accelerometer). > > > > Well, I don't think accelerometers show up
Re: [PATCH v2 1/2] drm/input_helper: Add new input-handling helper
Hi Daniel, On Fri, Nov 19, 2021 at 11:01:18AM +0100, Daniel Vetter wrote: > On Thu, Nov 18, 2021 at 11:30:43AM -0800, Brian Norris wrote: > > On Thu, Nov 18, 2021 at 10:05:11AM +0100, Daniel Vetter wrote: > > > On Wed, Nov 17, 2021 at 02:48:40PM -0800, Brian Norris wrote: > > > > --- a/drivers/gpu/drm/Kconfig > > > > +++ b/drivers/gpu/drm/Kconfig > > > > @@ -79,9 +79,15 @@ config DRM_DEBUG_SELFTEST > > > > > > > > If in doubt, say "N". > > > > > > > > +config DRM_INPUT_HELPER > > > > + def_bool y > > > > + depends on DRM_KMS_HELPER > > > > + depends on INPUT > > > > > > Uh please no configs for each thing, it just makes everything more > > > complex. Do we _really_ need this? > > > > First, it's not a configurable option (a user will never see this nor > > have to answer Y/N to it); it only serves as an intermediary to express > > the CONFIG_INPUT dependency (which is necessary) without making > > DRM_KMS_HELPER fully depend on CONFIG_INPUT. (We should be able to run > > display stacks without the input subsystem.) > > I'm not so much worried about the user cost, but the maintenance cost. > Kbuild config complexity is ridiculous, anything that adds even a bit is > really silly. > > > The closest alternative I can think of with fewer Kconfig symbols is to > > just use CONFIG_INPUT directly in the code, to decide whether to provide > > the helpers or else just stub them out. But that has a problem of not > > properly expressing the =m vs. =y necessity: if, for example, > > CONFIG_DRM_KMS_HELPER=y and CONFIG_INPUT=m, then we'll have linker > > issues. > > Usually this is done by providing static inline dummy implementations in > the headers. That avoids having to sprinkle new Kconfig symbols all over. Right, I already did that, and I'm not sprinkling CONFIG_DRM_INPUT_HELPER much. (I do include one around the module parameter, because it doesn't make much sense to have the module parameter even exist, if the underlying feature is stubbed out.) But that doesn't solve the problem in my last sentence, involving tristates. The "stub inline" approach only works well for boolean features -- either built-in, or disabled. Once your feature is in a module, you need to ensure that no built-in code depends on it. Do you want DRM_KMS_HELPER to unconditionally depend on CONFIG_INPUT? If so, I can just add a 'select' or 'depend' and drop this intermediate symbol. If not, then what do you expect to happen with DRM_KMS_HELPER=y and CONFIG_INPUT=m? Brian
Re: [PATCH v2 1/2] drm/input_helper: Add new input-handling helper
Hi Pekka, Thanks for the thoughts and review. I've tried to respond below: On Thu, Nov 18, 2021 at 12:39:28PM +0200, Pekka Paalanen wrote: > On Wed, 17 Nov 2021 14:48:40 -0800 > Brian Norris wrote: > > > A variety of applications have found it useful to listen to > > user-initiated input events to make decisions within a DRM driver, given > > that input events are often the first sign that we're going to start > > doing latency-sensitive activities: > > > > * Panel self-refresh: software-directed self-refresh (e.g., with > >Rockchip eDP) is especially latency sensitive. In some cases, it can > >take 10s of milliseconds for a panel to exit self-refresh, which can > >be noticeable. Rockchip RK3399 Chrome OS systems have always shipped > >with an input_handler boost, that preemptively exits self-refresh > >whenever there is input activity. > > > > * GPU drivers: on GPU-accelerated desktop systems, we may need to > >render new frames immediately after user activity. Powering up the > >GPU can take enough time that it is worthwhile to start this process > >as soon as there is input activity. Many Chrome OS systems also ship > >with an input_handler boost that powers up the GPU. > > > > This patch provides a small helper library that abstracts some of the > > input-subsystem details around picking which devices to listen to, and > > some other boilerplate. This will be used in the next patch to implement > > the first bullet: preemptive exit for panel self-refresh. > > > > Bits of this are adapted from code the Android and/or Chrome OS kernels > > have been carrying for a while. > > > > Signed-off-by: Brian Norris > > --- > > Thanks Simon for the CC. > > Hi Brian, > > while this feature in general makes sense and sounds good, to start > warming up display hardware early when something might start to happen, > this particular proposal has many problems from UAPI perspective (as it > has none). Comments below. > > Btw. if PSR is that slow to wake up from, how much do you actually gain > from this input event watching? I would imagine the improvement to not > be noticeable. Patch 2 has details. It's not really about precisely how slow PSR is, but how much foresight we can gain: in patch 2, I note that with my particular user space and system, I can start PSR-exit 50ms earlier than I would otherweise. (FWIW, this measurement is exactly the same it was with the original version written 4 years ago.) For how long PSR-exit takes: the measurements I'm able to do (via ftrace) show that drm_self_refresh_transition() takes between 35 and 55 ms. That's noticeable at 60 fps. And quite conveniently, the input-boost manages to hide nearly 100% of that latency. Typical use cases where one notices PSR latency (and where this 35-55ms matters) involve simply moving a cursor; it's very noticeable when you have more than a few frames of latency to "get started". > I think some numbers about how much this feature helps would be really > good, even if they are quite specific use cases. You also need to > identify the userspace components, because I think different display > servers are very different in their reaction speed. If my email address isn't obvious, I'm testing Chrome OS. I'm frankly not that familiar with the user space display stack, but for what I know, it's rather custom, developed within the Chromium project. Others on CC here could probably give you more detail, if you want specific answers, besides docs like this: https://chromium.googlesource.com/chromium/src/+/HEAD/docs/ozone_overview.md > If KMS gets a pageflip or modeset in no time after an input event, then > what's the gain. OTOH, if the display server is locking on to vblank, > there might be a delay worth avoiding. But then, is it worth > short-circuiting the wake-up in kernel vs. adding a new ioctl that > userspace could hit to start the warming up process? Rob responded to the first part to some extent (there is definitely gain to be had). To the last part: I wrote a simple debugfs hook to allow user space to force a PSR exit, and then a simple user space program to read input events and smash that debugfs file whenever it sees one. Testing in the same scenarios, this appears to lose less than 100 microseconds versus the in-kernel approach, which is negligible for this use case. (I'm not sure about the other use cases.) So, this is technically doable in user space. I can't speak to the ease of _actually_ integrating this into even our own Chrome display manager, but I highly doubt it will get integrated into others. I'd posit this should weigh into the relative worth, but otherw
Re: [PATCH v2 1/2] drm/input_helper: Add new input-handling helper
Hi Daniel, Thanks for the review. Lots to address elsewhere, but I can respond here first: On Thu, Nov 18, 2021 at 10:05:11AM +0100, Daniel Vetter wrote: > On Wed, Nov 17, 2021 at 02:48:40PM -0800, Brian Norris wrote: > > --- a/drivers/gpu/drm/Kconfig > > +++ b/drivers/gpu/drm/Kconfig > > @@ -79,9 +79,15 @@ config DRM_DEBUG_SELFTEST > > > > If in doubt, say "N". > > > > +config DRM_INPUT_HELPER > > + def_bool y > > + depends on DRM_KMS_HELPER > > + depends on INPUT > > Uh please no configs for each thing, it just makes everything more > complex. Do we _really_ need this? First, it's not a configurable option (a user will never see this nor have to answer Y/N to it); it only serves as an intermediary to express the CONFIG_INPUT dependency (which is necessary) without making DRM_KMS_HELPER fully depend on CONFIG_INPUT. (We should be able to run display stacks without the input subsystem.) The closest alternative I can think of with fewer Kconfig symbols is to just use CONFIG_INPUT directly in the code, to decide whether to provide the helpers or else just stub them out. But that has a problem of not properly expressing the =m vs. =y necessity: if, for example, CONFIG_DRM_KMS_HELPER=y and CONFIG_INPUT=m, then we'll have linker issues. In short, yes, I think we really need this. But I'm not a Kbuild expert. > > diff --git a/include/drm/drm_input_helper.h b/include/drm/drm_input_helper.h > > new file mode 100644 > > index ..7904f397b934 > > --- /dev/null > > +++ b/include/drm/drm_input_helper.h > > @@ -0,0 +1,41 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > +/* > > + * Copyright (C) 2021 Google, Inc. > > + */ > > +#ifndef __DRM_INPUT_HELPER_H__ > > +#define __DRM_INPUT_HELPER_H__ > > + > > +#include > > + > > +struct drm_device; > > + > > +struct drm_input_handler { > > + /* > > +* Callback to call for input activity. Will be called in an atomic > > +* context. > > How atomic? Like hardirq, and nasty spinlocks held? Maybe I should have just cribbed off the doc: * @event: event handler. This method is being called by input core with * interrupts disabled and dev->event_lock spinlock held and so * it may not sleep I probably don't want to propagate the subsystem details about which locks, but I guess I can be specific about "interrupts disabled" and "don't sleep". > > +*/ > > + void (*callback)(struct drm_input_handler *handler); > > + > > + struct input_handler handler; > > +}; > > + > > +#if defined(CONFIG_DRM_INPUT_HELPER) > > + > > +int drm_input_handle_register(struct drm_device *dev, > > + struct drm_input_handler *handler); > > +void drm_input_handle_unregister(struct drm_input_handler *handler); > > + > > +#else /* !CONFIG_DRM_INPUT_HELPER */ > > + > > +static inline int drm_input_handle_register(struct drm_device *dev, > > + struct drm_input_handler *handler) > > +{ > > + return 0; > > +} > > I guess the reason behind the helper is that you also want to use this in > drivers or maybe drm/sched? I think my reasoning is heavily described in both the cover letter and the commit message. If that's not clear, can you point out which part? I'd gladly improve it :) But specifically, see the 2nd bullet from the commit message, which I've re-quoted down here: > > * GPU drivers: on GPU-accelerated desktop systems, we may need to > >render new frames immediately after user activity. Powering up the > >GPU can take enough time that it is worthwhile to start this process > >as soon as there is input activity. Many Chrome OS systems also ship > >with an input_handler boost that powers up the GPU. Rob Clark has patches to drm/msm to boost GPU power-up via a similar helper. > Anyway I think it looks all reasonable. Definitely need an ack from input > people I realized I failed to carry Dmitry's Ack from version 1 [1]. If this has a v3 in similar form, I'll carry it there. > that the event list you have is a good choice, I have no idea what > that all does. Maybe also document that part a bit more. I'm admittedly not an expert there, and this is actually one reason why we hoped to make this a library (that nobody wants to keep figuring out whether all those flags, etc., are really doing the right thing), but there are comments about what each entry is _trying_ to do. Are you suggesting more, as in, why "BTN_LEFT + EV_KEY" means "pointer"? Or why we match certain devices (because they represent likely user activity that will affect the display pipeline)? Or both? Anyway, I'll give it a shot, if we keep this. Brian [1] https://lore.kernel.org/all/yyw6fwsenmk25...@google.com/
[PATCH v2 1/2] drm/input_helper: Add new input-handling helper
A variety of applications have found it useful to listen to user-initiated input events to make decisions within a DRM driver, given that input events are often the first sign that we're going to start doing latency-sensitive activities: * Panel self-refresh: software-directed self-refresh (e.g., with Rockchip eDP) is especially latency sensitive. In some cases, it can take 10s of milliseconds for a panel to exit self-refresh, which can be noticeable. Rockchip RK3399 Chrome OS systems have always shipped with an input_handler boost, that preemptively exits self-refresh whenever there is input activity. * GPU drivers: on GPU-accelerated desktop systems, we may need to render new frames immediately after user activity. Powering up the GPU can take enough time that it is worthwhile to start this process as soon as there is input activity. Many Chrome OS systems also ship with an input_handler boost that powers up the GPU. This patch provides a small helper library that abstracts some of the input-subsystem details around picking which devices to listen to, and some other boilerplate. This will be used in the next patch to implement the first bullet: preemptive exit for panel self-refresh. Bits of this are adapted from code the Android and/or Chrome OS kernels have been carrying for a while. Signed-off-by: Brian Norris --- Changes in v2: - Honor CONFIG_INPUT dependency, via new CONFIG_DRM_INPUT_HELPER - Remove void*; users should use container_of() - Document the callback context drivers/gpu/drm/Kconfig| 6 ++ drivers/gpu/drm/Makefile | 2 + drivers/gpu/drm/drm_input_helper.c | 143 + include/drm/drm_input_helper.h | 41 + 4 files changed, 192 insertions(+) create mode 100644 drivers/gpu/drm/drm_input_helper.c create mode 100644 include/drm/drm_input_helper.h diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index fb144617055b..381476b10a9d 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -79,9 +79,15 @@ config DRM_DEBUG_SELFTEST If in doubt, say "N". +config DRM_INPUT_HELPER + def_bool y + depends on DRM_KMS_HELPER + depends on INPUT + config DRM_KMS_HELPER tristate depends on DRM + select DRM_INPUT_HELPER if INPUT help CRTC helpers for KMS drivers. diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index 1c41156deb5f..9a6494aa45e6 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -56,6 +56,8 @@ drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \ drm_atomic_state_helper.o drm_damage_helper.o \ drm_format_helper.o drm_self_refresh_helper.o drm_rect.o +drm_kms_helper-$(CONFIG_DRM_INPUT_HELPER) += drm_input_helper.o + drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o drm_kms_helper-$(CONFIG_DRM_FBDEV_EMULATION) += drm_fb_helper.o drm_kms_helper-$(CONFIG_DRM_KMS_CMA_HELPER) += drm_fb_cma_helper.o diff --git a/drivers/gpu/drm/drm_input_helper.c b/drivers/gpu/drm/drm_input_helper.c new file mode 100644 index ..470f90865c7c --- /dev/null +++ b/drivers/gpu/drm/drm_input_helper.c @@ -0,0 +1,143 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2021 Google, Inc. + */ +#include +#include + +#include +#include + +/** + * DOC: overview + * + * This helper library provides a thin wrapper around input handles, so that + * DRM drivers can easily perform domain-specific actions in response to user + * activity. e.g., if someone is moving a mouse, we're likely to want to + * display something soon, and we should exit panel self-refresh. + */ + +static void drm_input_event(struct input_handle *handle, unsigned int type, + unsigned int code, int value) +{ + struct drm_input_handler *handler = handle->handler->private; + + handler->callback(handler); +} + +static int drm_input_connect(struct input_handler *handler, +struct input_dev *dev, +const struct input_device_id *id) +{ + struct input_handle *handle; + int error; + + handle = kzalloc(sizeof(struct input_handle), GFP_KERNEL); + if (!handle) + return -ENOMEM; + + handle->dev = dev; + handle->handler = handler; + handle->name = "drm-input-helper"; + + error = input_register_handle(handle); + if (error) + goto err2; + + error = input_open_device(handle); + if (error) + goto err1; + + return 0; + +err1: + input_unregister_handle(handle); +err2: + kfree(handle); + return error; +} + +static void drm_input_disconnect(struct input_handle *handle) +{ + input_close_device(handle); + input_unregister_handle(handle); + kfree(handle); +} + +s
[PATCH v2 2/2] drm/self_refresh: Disable self-refresh on input events
To improve panel self-refresh exit latency, we speculatively start exiting when we receive input events. Occasionally, this may lead to false positives, but most of the time we get a head start on coming out of PSR. Depending on how userspace takes to produce a new frame in response to the event, this can completely hide the exit latency. In local tests on Chrome OS (Rockchip RK3399 eDP), we've found that the input notifier gives us about a 50ms head start over the fb-update-initiated exit. Leverage a new drm_input_helper library to get easy access to likely-relevant input event callbacks. Inspired-by: Kristian H. Kristensen Signed-off-by: Brian Norris --- This was in part picked up from: https://lore.kernel.org/all/20180405095000.9756-25-enric.balle...@collabora.com/ [PATCH v6 24/30] drm/rockchip: Disable PSR on input events with significant rewrites/reworks: - moved to common drm_input_helper and drm_self_refresh_helper implementation - track state only through crtc->state->self_refresh_active Note that I'm relatively unfamiliar with DRM locking expectations, but I believe access to drm_crtc->state (which helps us track redundant transitions) is OK under the locking provided by drm_atomic_get_crtc_state(). Changes in v2: - Delay PSR re-entry, when already disabled - Allow default configuration via Kconfig and modparam - Replace void* with container_of() drivers/gpu/drm/Kconfig | 16 drivers/gpu/drm/drm_self_refresh_helper.c | 98 +++ 2 files changed, 100 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 381476b10a9d..698924ed9b6b 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -84,6 +84,22 @@ config DRM_INPUT_HELPER depends on DRM_KMS_HELPER depends on INPUT +config DRM_SELF_REFRESH_INPUT_BOOST_DEFAULT + bool "Preemptively exit panel self-refresh on input device activity" if EXPERT + default y + depends on DRM_INPUT_HELPER + help + Allows the generic DRM panel self-refresh helpers to factor in user + input activity to preemptively exit panel self-refresh, in order to + reduce potentially-visible latency when displaying new display + content. This is an optimization which often will do the right thing, + but can be disabled for experimentation or similar. + + Saying Y enables the feature by default; this can also be configured + by module parameter, drm_kms_helper.self_refresh_input_boost. + + If in doubt, say "Y". + config DRM_KMS_HELPER tristate depends on DRM diff --git a/drivers/gpu/drm/drm_self_refresh_helper.c b/drivers/gpu/drm/drm_self_refresh_helper.c index dd33fec5aabd..ba4881e683b7 100644 --- a/drivers/gpu/drm/drm_self_refresh_helper.c +++ b/drivers/gpu/drm/drm_self_refresh_helper.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include @@ -15,6 +16,7 @@ #include #include #include +#include #include #include #include @@ -58,17 +60,41 @@ DECLARE_EWMA(psr_time, 4, 4) struct drm_self_refresh_data { struct drm_crtc *crtc; struct delayed_work entry_work; + struct work_struct exit_work; + struct drm_input_handler input_handler; + bool input_handler_registered; struct mutex avg_mutex; struct ewma_psr_time entry_avg_ms; struct ewma_psr_time exit_avg_ms; }; -static void drm_self_refresh_helper_entry_work(struct work_struct *work) +static bool self_refresh_input_boost = + IS_ENABLED(CONFIG_DRM_SELF_REFRESH_INPUT_BOOST_DEFAULT); +#if defined(CONFIG_DRM_INPUT_HELPER) +module_param(self_refresh_input_boost, bool, 0644); +MODULE_PARM_DESC(self_refresh_input_boost, +"Enable panel self-refresh input boost [default=" +__stringify(CONFIG_DRM_SELF_REFRESH_INPUT_BOOST_DEFAULT) "]"); +#endif /* CONFIG_DRM_INPUT_HELPER */ + + +static void drm_self_refresh_reschedule(struct drm_self_refresh_data *sr_data) +{ + unsigned int delay; + + mutex_lock(&sr_data->avg_mutex); + delay = (ewma_psr_time_read(&sr_data->entry_avg_ms) + +ewma_psr_time_read(&sr_data->exit_avg_ms)) * 2; + mutex_unlock(&sr_data->avg_mutex); + + mod_delayed_work(system_wq, &sr_data->entry_work, +msecs_to_jiffies(delay)); +} + +static void drm_self_refresh_transition(struct drm_self_refresh_data *sr_data, + bool enable) { - struct drm_self_refresh_data *sr_data = container_of( - to_delayed_work(work), - struct drm_self_refresh_data, entry_work); struct drm_crtc *crtc = sr_data->crtc; struct drm_device *dev = crtc->dev; struct drm_modeset_acquire_c
[PATCH v2 0/2] drm: Support input-boosted panel self-refresh exit
A variety of applications have found it useful to listen to user-initiated input events to make decisions within a DRM driver, given that input events are often the first sign that we're going to start doing latency-sensitive activities: * Panel self-refresh: software-directed self-refresh (e.g., with Rockchip eDP) is especially latency sensitive. In some cases, it can take 10s of milliseconds for a panel to exit self-refresh, which can be noticeable. Rockchip RK3399 Chrome OS systems have always shipped with an input_handler boost, that preemptively exits self-refresh whenever there is input activity. * GPU drivers: on GPU-accelerated desktop systems, we may need to render new frames immediately after user activity. Powering up the GPU can take enough time that it is worthwhile to start this process as soon as there is input activity. Many Chrome OS systems also ship with an input_handler boost that powers up the GPU. I implement the first bullet in this series, and I also compared with some out-of-tree patches for the second, to ensure this could be useful there too. Past work on upstreaming a variety of Chromebook display patches got held up for this particular feature, as there was some desire to make it a bit more generic, for one. See the latest here: https://lore.kernel.org/all/20180405095000.9756-25-enric.balle...@collabora.com/ [PATCH v6 24/30] drm/rockchip: Disable PSR on input events I significantly rewrote this to adapt it to the new common drm_self_refresh_helpers and to add a new drm_input_helper thin library, so I only carry my own authorship on this series. Admittedly, this "drm_input_helper" library is barely DRM-specific at all, except that all display- and GPU-related input-watchers are likely to want to watch similar device behavior (unlike, say, rfkill or led input_handler code). The approximate consensus so far seems to be that (a) this isn't much code; if we need it for other subsystems (like, cpufreq-boost), it's easy to implement similar logic (b) input subsystem maintainers think the existing input_handler abstraction is good enough So, I keep the thin input helper in drivers/gpu/drm/. v1: https://lore.kernel.org/all/20211103234018.4009771-1-briannor...@chromium.org/ Changes in v2: - Honor CONFIG_INPUT dependency, via new CONFIG_DRM_INPUT_HELPER - Remove void*; users should use container_of() - Document the callback context - Delay PSR re-entry, when already disabled - Allow default configuration via Kconfig and modparam - really CC dri-devel@lists.freedesktop.org (oops!) Brian Norris (2): drm/input_helper: Add new input-handling helper drm/self_refresh: Disable self-refresh on input events drivers/gpu/drm/Kconfig | 22 drivers/gpu/drm/Makefile | 2 + drivers/gpu/drm/drm_input_helper.c| 143 ++ drivers/gpu/drm/drm_self_refresh_helper.c | 98 --- include/drm/drm_input_helper.h| 41 +++ 5 files changed, 292 insertions(+), 14 deletions(-) create mode 100644 drivers/gpu/drm/drm_input_helper.c create mode 100644 include/drm/drm_input_helper.h -- 2.34.0.rc1.387.gb447b232ab-goog
[PATCH v3] drm/bridge: analogix_dp: Make PSR-exit block less
Prior to commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR"), "PSR exit" used non-blocking analogix_dp_send_psr_spd(). The refactor started using the blocking variant, for a variety of reasons -- quoting Sean Paul's potentially-faulty memory: """ - To avoid racing a subsequent PSR entry (if exit takes a long time) - To avoid racing disable/modeset - We're not displaying new content while exiting PSR anyways, so there is minimal utility in allowing frames to be submitted - We're lying to userspace telling them frames are on the screen when we're just dropping them on the floor """ However, I'm finding that this blocking transition is causing upwards of 60+ ms of unneeded latency on PSR-exit, to the point that initial cursor movements when leaving PSR are unbearably jumpy. It turns out that we need to meet in the middle somewhere: Sean is right that we were "lying to userspace" with a non-blocking PSR-exit, but the new blocking behavior is also waiting too long: According to the eDP specification, the sink device must support PSR entry transitions from both state 4 (ACTIVE_RESYNC) and state 0 (INACTIVE). It also states that in ACTIVE_RESYNC, "the Sink device must display the incoming active frames from the Source device with no visible glitches and/or artifacts." Thus, for our purposes, we only need to wait for ACTIVE_RESYNC before moving on; we are ready to display video, and subsequent PSR-entry is safe. Tested on a Samsung Chromebook Plus (i.e., Rockchip RK3399 Gru Kevin), where this saves about 60ms of latency, for PSR-exit that used to take about 80ms. Fixes: 6c836d965bad ("drm/rockchip: Use the helpers for PSR") Cc: Cc: Zain Wang Cc: Tomasz Figa Cc: Heiko Stuebner Cc: Sean Paul Signed-off-by: Brian Norris Reviewed-by: Sean Paul --- CC list is partially constructed from the commit message of the Fixed commit Changes in v3: - Fix the exiting/entering comment (a reviewer noticed off-mailing-list that I got it backwards) - Add Reviewed-by Changes in v2: - retitled subject (previous: "drm/bridge: analogix_dp: Make PSR-disable non-blocking") - instead of completely non-blocking, make this "less"-blocking - more background (thanks Sean!) - more specification details drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c index cab6c8b92efd..6a4f20fccf84 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c @@ -998,11 +998,21 @@ int analogix_dp_send_psr_spd(struct analogix_dp_device *dp, if (!blocking) return 0; + /* +* db[1]!=0: entering PSR, wait for fully active remote frame buffer. +* db[1]==0: exiting PSR, wait for either +* (a) ACTIVE_RESYNC - the sink "must display the +* incoming active frames from the Source device with no visible +* glitches and/or artifacts", even though timings may still be +* re-synchronizing; or +* (b) INACTIVE - the transition is fully complete. +*/ ret = readx_poll_timeout(analogix_dp_get_psr_status, dp, psr_status, psr_status >= 0 && ((vsc->db[1] && psr_status == DP_PSR_SINK_ACTIVE_RFB) || - (!vsc->db[1] && psr_status == DP_PSR_SINK_INACTIVE)), 1500, - DP_TIMEOUT_PSR_LOOP_MS * 1000); + (!vsc->db[1] && (psr_status == DP_PSR_SINK_ACTIVE_RESYNC || +psr_status == DP_PSR_SINK_INACTIVE))), + 1500, DP_TIMEOUT_PSR_LOOP_MS * 1000); if (ret) { dev_warn(dp->dev, "Failed to apply PSR %d\n", ret); return ret; -- 2.34.0.rc0.344.g81b53c2807-goog
[PATCH v2] drm/bridge: analogix_dp: Make PSR-exit block less
Prior to commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR"), "PSR exit" used non-blocking analogix_dp_send_psr_spd(). The refactor started using the blocking variant, for a variety of reasons -- quoting Sean Paul's potentially-faulty memory: """ - To avoid racing a subsequent PSR entry (if exit takes a long time) - To avoid racing disable/modeset - We're not displaying new content while exiting PSR anyways, so there is minimal utility in allowing frames to be submitted - We're lying to userspace telling them frames are on the screen when we're just dropping them on the floor """ However, I'm finding that this blocking transition is causing upwards of 60+ ms of unneeded latency on PSR-exit, to the point that initial cursor movements when leaving PSR are unbearably jumpy. It turns out that we need to meet in the middle somewhere: Sean is right that we were "lying to userspace" with a non-blocking PSR-exit, but the new blocking behavior is also waiting too long: According to the eDP specification, the sink device must support PSR entry transitions from both state 4 (ACTIVE_RESYNC) and state 0 (INACTIVE). It also states that in ACTIVE_RESYNC, "the Sink device must display the incoming active frames from the Source device with no visible glitches and/or artifacts." Thus, for our purposes, we only need to wait for ACTIVE_RESYNC before moving on; we are ready to display video, and subsequent PSR-entry is safe. Tested on a Samsung Chromebook Plus (i.e., Rockchip RK3399 Gru Kevin), where this saves about 60ms of latency, for PSR-exit that used to take about 80ms. Fixes: 6c836d965bad ("drm/rockchip: Use the helpers for PSR") Cc: Cc: Zain Wang Cc: Tomasz Figa Cc: Heiko Stuebner Cc: Sean Paul Signed-off-by: Brian Norris --- CC list is partially constructed from the commit message of the Fixed commit Changes in v2: - retitled subject (previous: "drm/bridge: analogix_dp: Make PSR-disable non-blocking") - instead of completely non-blocking, make this "less"-blocking - more background (thanks Sean!) - more specification details drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c index cab6c8b92efd..f8e119e84ae2 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_reg.c @@ -998,11 +998,21 @@ int analogix_dp_send_psr_spd(struct analogix_dp_device *dp, if (!blocking) return 0; + /* +* db[1]==0: entering PSR, wait for fully active remote frame buffer. +* db[1]!=0: exiting PSR, wait for either +* (a) ACTIVE_RESYNC - the sink "must display the +* incoming active frames from the Source device with no visible +* glitches and/or artifacts", even though timings may still be +* re-synchronizing; or +* (b) INACTIVE - the transition is fully complete. +*/ ret = readx_poll_timeout(analogix_dp_get_psr_status, dp, psr_status, psr_status >= 0 && ((vsc->db[1] && psr_status == DP_PSR_SINK_ACTIVE_RFB) || - (!vsc->db[1] && psr_status == DP_PSR_SINK_INACTIVE)), 1500, - DP_TIMEOUT_PSR_LOOP_MS * 1000); + (!vsc->db[1] && (psr_status == DP_PSR_SINK_ACTIVE_RESYNC || +psr_status == DP_PSR_SINK_INACTIVE))), + 1500, DP_TIMEOUT_PSR_LOOP_MS * 1000); if (ret) { dev_warn(dp->dev, "Failed to apply PSR %d\n", ret); return ret; -- 2.33.1.1089.g2158813163f-goog
Re: [PATCH] drm/bridge: analogix_dp: Make PSR-disable non-blocking
On Wed, Oct 20, 2021 at 06:23:35PM -0700, Brian Norris wrote: > On Wed, Oct 20, 2021 at 5:40 PM Sean Paul wrote: > > The actual latency gains from doing this synchronously are minimal since the > > panel will display new content as soon as it can regardless of whether the > > kernel is blocking. There is likely a perceptual difference, but that's only > > because kernel is lying to userspace and skipping frames without consent. > > Hmm, you might well be right about some of the first points (I'm still > learning the DRM framework), but I'm a bit skeptical that the > perceptual difference is "only" because we're cheating in some way. > I'm not doing science here, and it's certainly not a blinded test, but > I'm nearly certain this patch cuts out approx 50-80% of the cursor lag > I see without this patch (relative to the current Chrome OS kernel). I > don't see how cheating would produce a smoother cursor movement -- > we'd still be dropping frames, and the movement would appear jumpy > somewhere along the way. Aha, so I think I found {a,the} reason for some disagreement here: looking at the eDP PSR spec, I see that while the current implementation is looking for psr_state==DP_PSR_SINK_INACTIVE to signal PSR-exit completion, the spec shows an intermediate state (DP_PSR_SINK_ACTIVE_RESYNC == 4), where among other things, "the Sink device must display the incoming active frames from the Source device with no visible glitches and/or artifacts." And it happens that we move to DP_PSR_SINK_ACTIVE_RESYNC somewhat quickly (on the order of 20-40ms), while the move to DP_PSR_SINK_INACTIVE is a good chunk longer (approx 60ms more). So pre-commit-6c836d965bad might have been cheating a little (we'd claim we're "done" about 20-40ms too early), but post-commit-6c836d965bad we're waiting about 60ms too long. I'll send v2 to make this block for DP_PSR_SINK_ACTIVE_RESYNC || DP_PSR_SINK_INACTIVE, which gets much or all of the same latency win, and I'll try to document the reasons, etc., better. I'll probably also include a patch to drop the 'blocking' parameter, since it's unused, and gives the wrong idea about this state machine. > In any case, I'm absolutely certain that mainline Linux performs much > much worse with PSR than the current CrOS kernel, but there are some > other potential reasons for that, such as the lack of an > input-notifier [1]. ... > [1] This got locked up in "controversy": > https://patchwork.kernel.org/project/linux-arm-kernel/patch/20180405095000.9756-25-enric.balle...@collabora.com/ While I'm here: I also played with this a bit, and I still haven't gotten all the details right, but I don't believe this alone will get the latency wins we'd like. We still need something like the above. Brian
Re: [PATCH] drm/bridge: analogix_dp: Make PSR-disable non-blocking
(Dropping Andrzej, because that address keeps bouncing. Does MAINTAINERS and/or .mailmap need updating?) Apologies for the double reply here, but I forgot to mention one last thing for now: On Wed, Oct 20, 2021 at 5:40 PM Sean Paul wrote: > > On Wed, Oct 20, 2021 at 04:17:28PM -0700, Brian Norris wrote: > > Prior to commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR"), > > "PSR disable" used non-blocking analogix_dp_send_psr_spd(). The refactor > > accidentally (?) set blocking=true. > > IIRC this wasn't accidental. One other tip that made me think it was accidental was that today, the |blocking| argument to analogix_dp_send_psr_spd() is always true. If non-blocking support was intentionally dropped, it seemed like you should have dropped the non-blocking code too. But that's a weak proof of your intentions :) Brian
Re: [PATCH] drm/bridge: analogix_dp: Make PSR-disable non-blocking
On Wed, Oct 20, 2021 at 5:40 PM Sean Paul wrote: > On Wed, Oct 20, 2021 at 04:17:28PM -0700, Brian Norris wrote: > > Prior to commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR"), > > "PSR disable" used non-blocking analogix_dp_send_psr_spd(). The refactor > > accidentally (?) set blocking=true. > > IIRC this wasn't accidental. > > The reason it became synchronous was: > - To avoid racing a subsequent PSR entry (if exit takes a long time) How did this work pre-commit-6c836d965bad then? I don't see any provision for avoiding subsequent PSR entry. Or I guess that was implicitly covered by PSR_FLUSH_TIMEOUT_MS, which means we allowed at least 100ms between exit/entry each time, which was good enough? And in the "new" implementation, that turned into a running average that gets measured on each commit? So we're no longer guaranteed 100ms, and it's even worse if we cheat the timing measurement? I'm still not sure that "race" is truly a problem without consulting some kind of hardware documentation though. It wouldn't surprise me if these things are cancelable. > - To avoid racing disable/modeset > - We're not displaying new content while exiting PSR anyways, so there is >minimal utility in allowing frames to be submitted > - We're lying to userspace telling them frames are on the screen when we're >just dropping them on the floor > > The actual latency gains from doing this synchronously are minimal since the > panel will display new content as soon as it can regardless of whether the > kernel is blocking. There is likely a perceptual difference, but that's only > because kernel is lying to userspace and skipping frames without consent. Hmm, you might well be right about some of the first points (I'm still learning the DRM framework), but I'm a bit skeptical that the perceptual difference is "only" because we're cheating in some way. I'm not doing science here, and it's certainly not a blinded test, but I'm nearly certain this patch cuts out approx 50-80% of the cursor lag I see without this patch (relative to the current Chrome OS kernel). I don't see how cheating would produce a smoother cursor movement -- we'd still be dropping frames, and the movement would appear jumpy somewhere along the way. In any case, I'm absolutely certain that mainline Linux performs much much worse with PSR than the current CrOS kernel, but there are some other potential reasons for that, such as the lack of an input-notifier [1]. > Going back to the first line, it's entirely possible my memory is failing > and this was accidental! Well either way, thanks for the notes. I'll see if I can get anywhere on proving/disproving that they are relevant, or if they can be worked around some other way; or perhaps I can regain the lost performance some other way. It'll be a few days before I get around to that. Brian [1] This got locked up in "controversy": https://patchwork.kernel.org/project/linux-arm-kernel/patch/20180405095000.9756-25-enric.balle...@collabora.com/
[PATCH] drm/bridge: analogix_dp: Make PSR-disable non-blocking
Prior to commit 6c836d965bad ("drm/rockchip: Use the helpers for PSR"), "PSR disable" used non-blocking analogix_dp_send_psr_spd(). The refactor accidentally (?) set blocking=true. This can cause upwards of 60-100ms of unneeded latency when exiting self-refresh, which can cause very noticeable lag when, say, moving a cursor. Presumbaly it's OK to let the display finish exiting refresh in parallel with clocking out the next video frames, so we shouldn't hold up the atomic_enable() step. This also brings behavior in line with the downstream ("mainline-derived") variant of the driver currently deployed to Chrome OS Rockchip systems. Tested on a Samsung Chromebook Plus (i.e., Rockchip RK3399 Gru Kevin). Fixes: 6c836d965bad ("drm/rockchip: Use the helpers for PSR") Cc: Cc: Zain Wang Cc: Tomasz Figa Cc: Heiko Stuebner Cc: Sean Paul Signed-off-by: Brian Norris --- CC list is partially constructed from the commit message of the Fixed commit drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c index b7d2e4449cfa..fbe6eb9df310 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c @@ -1055,7 +1055,7 @@ static int analogix_dp_disable_psr(struct analogix_dp_device *dp) psr_vsc.db[0] = 0; psr_vsc.db[1] = 0; - return analogix_dp_send_psr_spd(dp, &psr_vsc, true); + return analogix_dp_send_psr_spd(dp, &psr_vsc, false); } /* -- 2.33.0.1079.g6e70778dc9-goog
[PATCH v2] MAINTAINERS: Fixup drm-misc website link
https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-misc.html gives HTTP 404, and https://01.org/linuxgraphics/gfx-docs/maintainer-tools/ redirects to freedesktop.org now. Let's save people the pain of figuring that out. Signed-off-by: Brian Norris Reviewed-by: Sean Paul --- Changes in v2: - Correct the patch description text MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 100d7f93a15b..811d8d3e35fb 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6158,7 +6158,7 @@ M:Maarten Lankhorst M: Maxime Ripard M: Thomas Zimmermann S: Maintained -W: https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-misc.html +W: https://drm.pages.freedesktop.org/maintainer-tools/drm-misc.html T: git git://anongit.freedesktop.org/drm/drm-misc F: Documentation/gpu/ F: drivers/gpu/drm/* -- 2.33.0.1079.g6e70778dc9-goog
Re: [PATCH] MAINTAINERS: Fixup drm-misc website link
On Tue, Oct 19, 2021 at 5:59 PM Brian Norris wrote: > > https://drm.pages.freedesktop.org/maintainer-tools/drm-misc.html gives > HTTP 404, and https://drm.pages.freedesktop.org/maintainer-tools/ > redirects to freedesktop.org now. With a dash of irony, I actually listed the wrong URLs in the description. (I used the properly-redirected ones, and claimed that they were the broken ones.) I'll send a v2, so a maintainer doesn't have to fix that up for me. Brian
[PATCH] MAINTAINERS: Fixup drm-misc website link
https://drm.pages.freedesktop.org/maintainer-tools/drm-misc.html gives HTTP 404, and https://drm.pages.freedesktop.org/maintainer-tools/ redirects to freedesktop.org now. Let's save people the pain of figuring that out. Signed-off-by: Brian Norris --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 100d7f93a15b..811d8d3e35fb 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6158,7 +6158,7 @@ M:Maarten Lankhorst M: Maxime Ripard M: Thomas Zimmermann S: Maintained -W: https://01.org/linuxgraphics/gfx-docs/maintainer-tools/drm-misc.html +W: https://drm.pages.freedesktop.org/maintainer-tools/drm-misc.html T: git git://anongit.freedesktop.org/drm/drm-misc F: Documentation/gpu/ F: drivers/gpu/drm/* -- 2.33.0.1079.g6e70778dc9-goog
[PATCH] drm/rockchip: vop: Add timeout for DSP hold
If hardware is malfunctioning (e.g., misconfigured clocks?), we can get stuck here forever, holding various DRM locks and eventually locking up the entire system. It's better to complain loudly and move on, than to lock up the system. In local tests, this operation takes less than 20ms. Signed-off-by: Brian Norris --- drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c index ba9e14da41b4..b28ea5d6a3e2 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c @@ -726,7 +726,9 @@ static void vop_crtc_atomic_disable(struct drm_crtc *crtc, spin_unlock(&vop->reg_lock); - wait_for_completion(&vop->dsp_hold_completion); + if (!wait_for_completion_timeout(&vop->dsp_hold_completion, +msecs_to_jiffies(200))) + WARN(1, "%s: timed out waiting for DSP hold", crtc->name); vop_dsp_hold_valid_irq_disable(vop); -- 2.33.0.882.g93a45727a2-goog
[PATCH v2] drm/bridge: analogix_dp: Grab runtime PM reference for DP-AUX
If the display is not enable()d, then we aren't holding a runtime PM reference here. Thus, it's easy to accidentally cause a hang, if user space is poking around at /dev/drm_dp_aux0 at the "wrong" time. Let's get the panel and PM state right before trying to talk AUX. Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code") Cc: Cc: Tomeu Vizoso Signed-off-by: Brian Norris --- Changes in v2: - Fix spelling in Subject - DRM_DEV_ERROR() -> drm_err() - Propagate errors from un-analogix_dp_prepare_panel() .../drm/bridge/analogix/analogix_dp_core.c| 21 ++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c index b7d2e4449cfa..6fc46ac93ef8 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c @@ -1632,8 +1632,27 @@ static ssize_t analogix_dpaux_transfer(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg) { struct analogix_dp_device *dp = to_dp(aux); + int ret, ret2; - return analogix_dp_transfer(dp, msg); + ret = analogix_dp_prepare_panel(dp, true, false); + if (ret) { + drm_err(dp->drm_dev, "Failed to prepare panel (%d)\n", ret); + return ret; + } + + pm_runtime_get_sync(dp->dev); + ret = analogix_dp_transfer(dp, msg); + pm_runtime_put(dp->dev); + + ret2 = analogix_dp_prepare_panel(dp, false, false); + if (ret2) { + drm_err(dp->drm_dev, "Failed to unprepare panel (%d)\n", ret2); + /* Prefer the analogix_dp_transfer() error, if it exists. */ + if (!ret) + ret = ret2; + } + + return ret; } struct analogix_dp_device * -- 2.33.0.800.g4c38ced690-goog
Re: [PATCH] drm/brdige: analogix_dp: Grab runtime PM reference for DP-AUX
On Fri, Oct 1, 2021 at 1:37 PM Sean Paul wrote: > On Wed, Sep 29, 2021 at 02:41:03PM -0700, Brian Norris wrote: > > --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c > > +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c > > @@ -1632,8 +1632,23 @@ static ssize_t analogix_dpaux_transfer(struct > > drm_dp_aux *aux, > > struct drm_dp_aux_msg *msg) > > { > > struct analogix_dp_device *dp = to_dp(aux); > > + int ret, ret2; > > > > - return analogix_dp_transfer(dp, msg); > > + ret = analogix_dp_prepare_panel(dp, true, false); > > + if (ret) { > > + DRM_DEV_ERROR(dp->dev, "Failed to prepare panel (%d)\n", ret); > > s/DRM_DEV_ERROR/drm_err/ Sure. Now that I'm looking a second time, I see the header recommends this. > > + return ret; > > + } > > + > > + pm_runtime_get_sync(dp->dev); > > + ret = analogix_dp_transfer(dp, msg); > > + pm_runtime_put(dp->dev); > > + > > + ret2 = analogix_dp_prepare_panel(dp, false, false); > > + if (ret2) > > + DRM_DEV_ERROR(dp->dev, "Failed to unprepare panel (%d)\n", > > ret2); > > What's the reasoning for not propagating unprepare failures? I feel like that > should be fair game. I suppose the underlying reason is laziness, sorry. But a related reason is the we probably should prefer propagating the analogix_dp_transfer() error, if it's non-zero, rather than the unprepare error. That's not too hard to do though, even if it's slightly more awkward. > > + > > + return ret; > > } > > > > struct analogix_dp_device * v2 coming. Regards, Brian
[PATCH] drm/brdige: analogix_dp: Grab runtime PM reference for DP-AUX
If the display is not enable()d, then we aren't holding a runtime PM reference here. Thus, it's easy to accidentally cause a hang, if user space is poking around at /dev/drm_dp_aux0 at the "wrong" time. Let's get the panel and PM state right before trying to talk AUX. Fixes: 0d97ad03f422 ("drm/bridge: analogix_dp: Remove duplicated code") Cc: Cc: Tomeu Vizoso Signed-off-by: Brian Norris --- .../gpu/drm/bridge/analogix/analogix_dp_core.c | 17 - 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c index b7d2e4449cfa..a1b553904b85 100644 --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c @@ -1632,8 +1632,23 @@ static ssize_t analogix_dpaux_transfer(struct drm_dp_aux *aux, struct drm_dp_aux_msg *msg) { struct analogix_dp_device *dp = to_dp(aux); + int ret, ret2; - return analogix_dp_transfer(dp, msg); + ret = analogix_dp_prepare_panel(dp, true, false); + if (ret) { + DRM_DEV_ERROR(dp->dev, "Failed to prepare panel (%d)\n", ret); + return ret; + } + + pm_runtime_get_sync(dp->dev); + ret = analogix_dp_transfer(dp, msg); + pm_runtime_put(dp->dev); + + ret2 = analogix_dp_prepare_panel(dp, false, false); + if (ret2) + DRM_DEV_ERROR(dp->dev, "Failed to unprepare panel (%d)\n", ret2); + + return ret; } struct analogix_dp_device * -- 2.33.0.685.g46640cef36-goog
[PATCH v3 4/4] drm/rockchip: dsi: Disable PLL clock on bind error
Fix some error handling here noticed in review of other changes. Fixes: 2d4f7bdafd70 ("drm/rockchip: dsi: migrate to use dw-mipi-dsi bridge driver") Signed-off-by: Brian Norris Reported-by: Chen-Yu Tsai Reviewed-by: Chen-Yu Tsai --- Changes in v3: - Add Fixes, Reviewed-by Changes in v2: - New drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c index 8ea852880d1c..59c3d8ef6bf9 100644 --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c @@ -945,7 +945,7 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev, ret = clk_prepare_enable(dsi->grf_clk); if (ret) { DRM_DEV_ERROR(dsi->dev, "Failed to enable grf_clk: %d\n", ret); - goto out_pm_runtime; + goto out_pll_clk; } dw_mipi_dsi_rockchip_config(dsi); @@ -957,19 +957,21 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev, ret = rockchip_dsi_drm_create_encoder(dsi, drm_dev); if (ret) { DRM_DEV_ERROR(dev, "Failed to create drm encoder\n"); - goto out_pm_runtime; + goto out_pll_clk; } ret = dw_mipi_dsi_bind(dsi->dmd, &dsi->encoder); if (ret) { DRM_DEV_ERROR(dev, "Failed to bind: %d\n", ret); - goto out_pm_runtime; + goto out_pll_clk; } dsi->dsi_bound = true; return 0; +out_pll_clk: + clk_disable_unprepare(dsi->pllref_clk); out_pm_runtime: pm_runtime_put(dsi->dev); if (dsi->slave) -- 2.33.0.685.g46640cef36-goog
[PATCH v3 3/4] drm/rockchip: dsi: Fix unbalanced clock on probe error
Our probe() function never enabled this clock, so we shouldn't disable it if we fail to probe the bridge. Noted by inspection. Fixes: 2d4f7bdafd70 ("drm/rockchip: dsi: migrate to use dw-mipi-dsi bridge driver") Signed-off-by: Brian Norris Reviewed-by: Chen-Yu Tsai --- (no changes since v1) drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c index 21c67343cc6c..8ea852880d1c 100644 --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c @@ -1434,14 +1434,10 @@ static int dw_mipi_dsi_rockchip_probe(struct platform_device *pdev) if (ret != -EPROBE_DEFER) DRM_DEV_ERROR(dev, "Failed to probe dw_mipi_dsi: %d\n", ret); - goto err_clkdisable; + return ret; } return 0; - -err_clkdisable: - clk_disable_unprepare(dsi->pllref_clk); - return ret; } static int dw_mipi_dsi_rockchip_remove(struct platform_device *pdev) -- 2.33.0.685.g46640cef36-goog
[PATCH v3 1/4] drm/rockchip: dsi: Hold pm-runtime across bind/unbind
In commit 43c2de1002d2, we moved most HW configuration to bind(), but we didn't move the runtime PM management. Therefore, depending on initial boot state, runtime-PM workqueue delays, and other timing factors, we may disable our power domain in between the hardware configuration (bind()) and when we enable the display. This can cause us to lose hardware state and fail to configure our display. For example: dw-mipi-dsi-rockchip ff968000.mipi: failed to write command FIFO panel-innolux-p079zca ff96.mipi.0: failed to write command 0 or: dw-mipi-dsi-rockchip ff968000.mipi: failed to write command FIFO panel-kingdisplay-kd097d04 ff96.mipi.0: failed write init cmds: -110 We should match the runtime PM to the lifetime of the bind()/unbind() cycle. Tested on Acer Chrometab 10 (RK3399 Gru-Scarlet), with panel drivers built either as modules or built-in. Side notes: it seems one is more likely to see this problem when the panel driver is built into the kernel. I've also seen this problem bisect down to commits that simply changed Kconfig dependencies, because it changed the order in which driver init functions were compiled into the kernel, and therefore the ordering and timing of built-in device probe. Fixes: 43c2de1002d2 ("drm/rockchip: dsi: move all lane config except LCDC mux to bind()") Link: https://lore.kernel.org/linux-rockchip/9aedfb528600ecf871885f7293ca4207c84d16c1.ca...@gmail.com/ Reported-by: Cc: Signed-off-by: Brian Norris Tested-by: Nícolas F. R. A. Prado --- (no changes since v2) Changes in v2: - Clean up pm-runtime state in error cases. - Correct git hash for Fixes. .../gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 37 ++- 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c index a2262bee5aa4..45676b23c019 100644 --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c @@ -773,10 +773,6 @@ static void dw_mipi_dsi_encoder_enable(struct drm_encoder *encoder) if (mux < 0) return; - pm_runtime_get_sync(dsi->dev); - if (dsi->slave) - pm_runtime_get_sync(dsi->slave->dev); - /* * For the RK3399, the clk of grf must be enabled before writing grf * register. And for RK3288 or other soc, this grf_clk must be NULL, @@ -795,20 +791,10 @@ static void dw_mipi_dsi_encoder_enable(struct drm_encoder *encoder) clk_disable_unprepare(dsi->grf_clk); } -static void dw_mipi_dsi_encoder_disable(struct drm_encoder *encoder) -{ - struct dw_mipi_dsi_rockchip *dsi = to_dsi(encoder); - - if (dsi->slave) - pm_runtime_put(dsi->slave->dev); - pm_runtime_put(dsi->dev); -} - static const struct drm_encoder_helper_funcs dw_mipi_dsi_encoder_helper_funcs = { .atomic_check = dw_mipi_dsi_encoder_atomic_check, .enable = dw_mipi_dsi_encoder_enable, - .disable = dw_mipi_dsi_encoder_disable, }; static int rockchip_dsi_drm_create_encoder(struct dw_mipi_dsi_rockchip *dsi, @@ -938,10 +924,14 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev, put_device(second); } + pm_runtime_get_sync(dsi->dev); + if (dsi->slave) + pm_runtime_get_sync(dsi->slave->dev); + ret = clk_prepare_enable(dsi->pllref_clk); if (ret) { DRM_DEV_ERROR(dev, "Failed to enable pllref_clk: %d\n", ret); - return ret; + goto out_pm_runtime; } /* @@ -953,7 +943,7 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev, ret = clk_prepare_enable(dsi->grf_clk); if (ret) { DRM_DEV_ERROR(dsi->dev, "Failed to enable grf_clk: %d\n", ret); - return ret; + goto out_pm_runtime; } dw_mipi_dsi_rockchip_config(dsi); @@ -965,16 +955,23 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev, ret = rockchip_dsi_drm_create_encoder(dsi, drm_dev); if (ret) { DRM_DEV_ERROR(dev, "Failed to create drm encoder\n"); - return ret; + goto out_pm_runtime; } ret = dw_mipi_dsi_bind(dsi->dmd, &dsi->encoder); if (ret) { DRM_DEV_ERROR(dev, "Failed to bind: %d\n", ret); - return ret; + goto out_pm_runtime; } return 0; + +out_pm_runtime: + pm_runtime_put(dsi->dev); + if (dsi->slave) + pm_runtime_put(dsi->slave->dev); + + return ret; } static void dw_mipi_dsi_rockchip_unbind(struct device *dev, @@ -989,6 +986,10 @@ static void dw_mipi_dsi_rockchip_unbind(struct device *dev, dw_mipi_dsi_unbind(dsi->dmd); clk_d
[PATCH v3 2/4] drm/rockchip: dsi: Reconfigure hardware on resume()
Since commit 43c2de1002d2, we perform most HW configuration in the bind() function. This configuration may be lost on suspend/resume, so we need to call it again. That may lead to errors like this after system suspend/resume: dw-mipi-dsi-rockchip ff968000.mipi: failed to write command FIFO panel-kingdisplay-kd097d04 ff96.mipi.0: failed write init cmds: -110 Tested on Acer Chromebook Tab 10 (RK3399 Gru-Scarlet). Note that early mailing list versions of this driver borrowed Rockchip's downstream/BSP solution, to do HW configuration in mode_set() (which *is* called at the appropriate pre-enable() times), but that was discarded along the way. I've avoided that still, because mode_set() documentation doesn't suggest this kind of purpose as far as I can tell. Fixes: 43c2de1002d2 ("drm/rockchip: dsi: move all lane config except LCDC mux to bind()") Cc: Signed-off-by: Brian Norris --- Changes in v3: - New patch, to fix related PM issue discovered after patch 1 .../gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 37 +++ 1 file changed, 37 insertions(+) diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c index 45676b23c019..21c67343cc6c 100644 --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c @@ -268,6 +268,8 @@ struct dw_mipi_dsi_rockchip { struct dw_mipi_dsi *dmd; const struct rockchip_dw_dsi_chip_data *cdata; struct dw_mipi_dsi_plat_data pdata; + + bool dsi_bound; }; struct dphy_pll_parameter_map { @@ -964,6 +966,8 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev, goto out_pm_runtime; } + dsi->dsi_bound = true; + return 0; out_pm_runtime: @@ -983,6 +987,8 @@ static void dw_mipi_dsi_rockchip_unbind(struct device *dev, if (dsi->is_slave) return; + dsi->dsi_bound = false; + dw_mipi_dsi_unbind(dsi->dmd); clk_disable_unprepare(dsi->pllref_clk); @@ -1277,6 +1283,36 @@ static const struct phy_ops dw_mipi_dsi_dphy_ops = { .exit = dw_mipi_dsi_dphy_exit, }; +static int __maybe_unused dw_mipi_dsi_rockchip_resume(struct device *dev) +{ + struct dw_mipi_dsi_rockchip *dsi = dev_get_drvdata(dev); + int ret; + + /* +* Re-configure DSI state, if we were previously initialized. We need +* to do this before rockchip_drm_drv tries to re-enable() any panels. +*/ + if (dsi->dsi_bound) { + ret = clk_prepare_enable(dsi->grf_clk); + if (ret) { + DRM_DEV_ERROR(dsi->dev, "Failed to enable grf_clk: %d\n", ret); + return ret; + } + + dw_mipi_dsi_rockchip_config(dsi); + if (dsi->slave) + dw_mipi_dsi_rockchip_config(dsi->slave); + + clk_disable_unprepare(dsi->grf_clk); + } + + return 0; +} + +static const struct dev_pm_ops dw_mipi_dsi_rockchip_pm_ops = { + SET_LATE_SYSTEM_SLEEP_PM_OPS(NULL, dw_mipi_dsi_rockchip_resume) +}; + static int dw_mipi_dsi_rockchip_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; @@ -1594,6 +1630,7 @@ struct platform_driver dw_mipi_dsi_rockchip_driver = { .remove = dw_mipi_dsi_rockchip_remove, .driver = { .of_match_table = dw_mipi_dsi_rockchip_dt_ids, + .pm = &dw_mipi_dsi_rockchip_pm_ops, .name = "dw-mipi-dsi-rockchip", }, }; -- 2.33.0.685.g46640cef36-goog
[PATCH v3 0/4] Fix Rockchip MIPI DSI display init timeouts
The Rockchip DSI driver has had a number of bugs over time and has usually only worked by chance. A number of users have noticed that things regressed with commit 43c2de1002d2 ("drm/rockchip: dsi: move all lane config except LCDC mux to bind()"), although it was fixing several real issues of its own that had been present forever in the upstream driver (but notably, not in the downstream/BSP driver). Patch 1 and 2 are the most important fixes here. See their description for more info. While I'm at it, fix a few error handling and cleanup issues too. Changes in v3: - New patch, to fix related PM issue discovered after patch 1 Changes in v2: - Clean up pm-runtime state in error cases. - Correct git hash for Fixes. Brian Norris (4): drm/rockchip: dsi: Hold pm-runtime across bind/unbind drm/rockchip: dsi: Reconfigure hardware on resume() drm/rockchip: dsi: Fix unbalanced clock on probe error drm/rockchip: dsi: Disable PLL clock on bind error .../gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 82 +-- 1 file changed, 59 insertions(+), 23 deletions(-) -- 2.33.0.685.g46640cef36-goog
Re: [PATCH v2 3/3] drm/rockchip: dsi: Disable PLL clock on bind error
On Mon, Sep 27, 2021 at 9:16 PM Chen-Yu Tsai wrote: > > On Tue, Sep 28, 2021 at 2:00 AM Brian Norris wrote: > > > > Fix some error handling here noticed in review of other changes. > > > > Reported-by: Chen-Yu Tsai > > Signed-off-by: Brian Norris > > Fixes: 2d4f7bdafd70 ("drm/rockchip: dsi: migrate to use dw-mipi-dsi > bridge driver") > > Otherwise, > > Reviewed-by: Chen-Yu Tsai I'll add these tags, thanks. > Additionally, I would move patch 2 and this patch before the first patch, > as these two fix a commit introduced in v5.0, while the first patch fixes > something introduced in v5.14. This would make automatic backporting cleaner. Personally, I prioritize putting user-visible fixes first, and more cosmetic things (like error handling that we ~never hit) later. Also, the buggy commit was already backported to all still-supported stable releases where it was relevant (i.e., 5.4+), so if these get backported, they all should. Regards, Brian [1] 43c2de1002d2 drm/rockchip: dsi: move all lane config except LCDC mux to bind()
Re: [PATCH v2 1/3] drm/rockchip: dsi: Hold pm-runtime across bind/unbind
On Mon, Sep 27, 2021 at 10:59:42AM -0700, Brian Norris wrote: > In commit 43c2de1002d2, we moved most HW configuration to bind(), but we > didn't move the runtime PM management. Therefore, depending on initial > boot state, runtime-PM workqueue delays, and other timing factors, we > may disable our power domain in between the hardware configuration > (bind()) and when we enable the display. This can cause us to lose > hardware state and fail to configure our display. For example: > > dw-mipi-dsi-rockchip ff968000.mipi: failed to write command FIFO > panel-innolux-p079zca ff96.mipi.0: failed to write command 0 > > or: > > dw-mipi-dsi-rockchip ff968000.mipi: failed to write command FIFO > panel-kingdisplay-kd097d04 ff96.mipi.0: failed write init cmds: -110 > > We should match the runtime PM to the lifetime of the bind()/unbind() > cycle. Hmm, sorry to reply to my own patch so quickly, but after a bit more testing I'm finding we still have yet another problem here -- that suspend/resume does not work. For suspend/resume, drm_mode_config_helper_{suspend,resume}() are expecting to only do teardown/setup via disable()/enable() -- there is no re-bind() (which makes sense). But the DSI hardware state may be lost, so the resume-time enable() may find the panel initialization timing out yet again. Possible solutions: (1) I can add PM suspend()/resume() operations just to call dw_mipi_dsi_rockchip_config(). (2) Switch back to using mode_set() for HW configuration, like the downstream/BSP driver does (and the initial versions Rockchip and later Heiko were working on did the same), since that's always called at the right time before both panel and encoder enable(). That also happens to be where some other DSI drivers [1] do similar init. Have we been avoiding (2) just because that doesn't really match the intended purpose of the callback? I can't find any cleaner callback for this at the moment, and I'd rather not try to introduce entirely new drm helper callbacks just for this particularly-unfriendly sequence. I have a patch written for option (1), and may send a v3 soon to include that as well (because that's also a regression from the same commit). Brian [1] e.g., drivers/gpu/drm/bridge/nwl-dsi.c
Re: [PATCH v2 1/3] drm/rockchip: dsi: Hold pm-runtime across bind/unbind
On Mon, Sep 27, 2021 at 12:18 PM Tom Hebb wrote: > Reviewed-by: Thomas Hebb Thanks! > Thank you for catching this, and sorry that my original fix broke things. > There had actually been a report of this breakage from my patch, but I > missed that email until it had already been merged and then didn't have > time to follow up on it. Totally my bad. No worries. It was a 1 step forward, 1 step backward kind of a thing anyway -- things were broken in many cases before your patch too (with very similar-looking symptoms) -- so the net result is still good, having both issues fixed. I'm not sure how that ideally should have been handled [1], but it's totally fair to not have time to follow up on everything. At the worst, we could have reverted things; but again, I'm pretty sure things were broken just as well without your fix, just with a different root cause. Regards, Brian [1] Don't accept (or, revert?) your bugfix until what may or may not have been a regression is fixed? I'm not sure.
Re: [PATCH 1/2] drm/rockchip: dsi: hold pm-runtime across bind/unbind
On Mon, Sep 27, 2021 at 12:10 AM Chen-Yu Tsai wrote: > On Sat, Sep 25, 2021 at 7:24 AM Brian Norris wrote: > > We should match the runtime PM to the lifetime of the bind()/unbind() > > cycle. > > I'm not too familiar with MIPI DSI, but it seems that the subsystem expects > the DSI link to be always available, and in LPM if power saving is required? > If so then this change matches that expectation, though we might lose some > power savings compared to the previous non-conforming behavior. Yeah, I was a little torn on whether we should care about any possible lost power savings here, because now we stay runtime-enabled even if the display is not enable()d. But I'm not aware of a good hook for handling this kind of a sequence, and I'm not convinced there is much savings by disabling the power domain in that case. > > Fixes: 59eb7193bef2 ("drm/rockchip: dsi: move all lane config except LCDC > > mux to bind()") > > This hash is from some stable branch. The mainline one is: > > 43c2de1002d2 drm/rockchip: dsi: move all lane config except LCDC mux to bind() Oops, good catch. I've been doing too much debugging/development on 5.10.y stable. Fixed in v2. > The bind function is missing an error cleanup path. We might end up with > unbalanced runtime PM references. (And also possibly an enabled pllref clk.) > This is a pre-existing issue though. The code changes here look correct. In v2, I've performed cleanup for the runtime PM state in this patch, and added an additional patch to fix the other existing issues you noted. Thanks. Brian
[PATCH v2 3/3] drm/rockchip: dsi: Disable PLL clock on bind error
Fix some error handling here noticed in review of other changes. Reported-by: Chen-Yu Tsai Signed-off-by: Brian Norris --- Changes in v2: - New drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c index fa4080176719..0ed13d81fe60 100644 --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c @@ -943,7 +943,7 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev, ret = clk_prepare_enable(dsi->grf_clk); if (ret) { DRM_DEV_ERROR(dsi->dev, "Failed to enable grf_clk: %d\n", ret); - goto out_pm_runtime; + goto out_pll_clk; } dw_mipi_dsi_rockchip_config(dsi); @@ -955,17 +955,19 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev, ret = rockchip_dsi_drm_create_encoder(dsi, drm_dev); if (ret) { DRM_DEV_ERROR(dev, "Failed to create drm encoder\n"); - goto out_pm_runtime; + goto out_pll_clk; } ret = dw_mipi_dsi_bind(dsi->dmd, &dsi->encoder); if (ret) { DRM_DEV_ERROR(dev, "Failed to bind: %d\n", ret); - goto out_pm_runtime; + goto out_pll_clk; } return 0; +out_pll_clk: + clk_disable_unprepare(dsi->pllref_clk); out_pm_runtime: pm_runtime_put(dsi->dev); if (dsi->slave) -- 2.33.0.685.g46640cef36-goog
[PATCH v2 2/3] drm/rockchip: dsi: Fix unbalanced clock on probe error
Our probe() function never enabled this clock, so we shouldn't disable it if we fail to probe the bridge. Noted by inspection. Fixes: 2d4f7bdafd70 ("drm/rockchip: dsi: migrate to use dw-mipi-dsi bridge driver") Signed-off-by: Brian Norris Reviewed-by: Chen-Yu Tsai --- (no changes since v1) drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c index 45676b23c019..fa4080176719 100644 --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c @@ -1398,14 +1398,10 @@ static int dw_mipi_dsi_rockchip_probe(struct platform_device *pdev) if (ret != -EPROBE_DEFER) DRM_DEV_ERROR(dev, "Failed to probe dw_mipi_dsi: %d\n", ret); - goto err_clkdisable; + return ret; } return 0; - -err_clkdisable: - clk_disable_unprepare(dsi->pllref_clk); - return ret; } static int dw_mipi_dsi_rockchip_remove(struct platform_device *pdev) -- 2.33.0.685.g46640cef36-goog
[PATCH v2 1/3] drm/rockchip: dsi: Hold pm-runtime across bind/unbind
In commit 43c2de1002d2, we moved most HW configuration to bind(), but we didn't move the runtime PM management. Therefore, depending on initial boot state, runtime-PM workqueue delays, and other timing factors, we may disable our power domain in between the hardware configuration (bind()) and when we enable the display. This can cause us to lose hardware state and fail to configure our display. For example: dw-mipi-dsi-rockchip ff968000.mipi: failed to write command FIFO panel-innolux-p079zca ff96.mipi.0: failed to write command 0 or: dw-mipi-dsi-rockchip ff968000.mipi: failed to write command FIFO panel-kingdisplay-kd097d04 ff96.mipi.0: failed write init cmds: -110 We should match the runtime PM to the lifetime of the bind()/unbind() cycle. Tested on Acer Chrometab 10 (RK3399 Gru-Scarlet), with panel drivers built either as modules or built-in. Side notes: it seems one is more likely to see this problem when the panel driver is built into the kernel. I've also seen this problem bisect down to commits that simply changed Kconfig dependencies, because it changed the order in which driver init functions were compiled into the kernel, and therefore the ordering and timing of built-in device probe. Fixes: 43c2de1002d2 ("drm/rockchip: dsi: move all lane config except LCDC mux to bind()") Link: https://lore.kernel.org/linux-rockchip/9aedfb528600ecf871885f7293ca4207c84d16c1.ca...@gmail.com/ Reported-by: Cc: Signed-off-by: Brian Norris Tested-by: Nícolas F. R. A. Prado --- Changes in v2: - Clean up pm-runtime state in error cases. - Correct git hash for Fixes. .../gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 37 ++- 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c index a2262bee5aa4..45676b23c019 100644 --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c @@ -773,10 +773,6 @@ static void dw_mipi_dsi_encoder_enable(struct drm_encoder *encoder) if (mux < 0) return; - pm_runtime_get_sync(dsi->dev); - if (dsi->slave) - pm_runtime_get_sync(dsi->slave->dev); - /* * For the RK3399, the clk of grf must be enabled before writing grf * register. And for RK3288 or other soc, this grf_clk must be NULL, @@ -795,20 +791,10 @@ static void dw_mipi_dsi_encoder_enable(struct drm_encoder *encoder) clk_disable_unprepare(dsi->grf_clk); } -static void dw_mipi_dsi_encoder_disable(struct drm_encoder *encoder) -{ - struct dw_mipi_dsi_rockchip *dsi = to_dsi(encoder); - - if (dsi->slave) - pm_runtime_put(dsi->slave->dev); - pm_runtime_put(dsi->dev); -} - static const struct drm_encoder_helper_funcs dw_mipi_dsi_encoder_helper_funcs = { .atomic_check = dw_mipi_dsi_encoder_atomic_check, .enable = dw_mipi_dsi_encoder_enable, - .disable = dw_mipi_dsi_encoder_disable, }; static int rockchip_dsi_drm_create_encoder(struct dw_mipi_dsi_rockchip *dsi, @@ -938,10 +924,14 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev, put_device(second); } + pm_runtime_get_sync(dsi->dev); + if (dsi->slave) + pm_runtime_get_sync(dsi->slave->dev); + ret = clk_prepare_enable(dsi->pllref_clk); if (ret) { DRM_DEV_ERROR(dev, "Failed to enable pllref_clk: %d\n", ret); - return ret; + goto out_pm_runtime; } /* @@ -953,7 +943,7 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev, ret = clk_prepare_enable(dsi->grf_clk); if (ret) { DRM_DEV_ERROR(dsi->dev, "Failed to enable grf_clk: %d\n", ret); - return ret; + goto out_pm_runtime; } dw_mipi_dsi_rockchip_config(dsi); @@ -965,16 +955,23 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev, ret = rockchip_dsi_drm_create_encoder(dsi, drm_dev); if (ret) { DRM_DEV_ERROR(dev, "Failed to create drm encoder\n"); - return ret; + goto out_pm_runtime; } ret = dw_mipi_dsi_bind(dsi->dmd, &dsi->encoder); if (ret) { DRM_DEV_ERROR(dev, "Failed to bind: %d\n", ret); - return ret; + goto out_pm_runtime; } return 0; + +out_pm_runtime: + pm_runtime_put(dsi->dev); + if (dsi->slave) + pm_runtime_put(dsi->slave->dev); + + return ret; } static void dw_mipi_dsi_rockchip_unbind(struct device *dev, @@ -989,6 +986,10 @@ static void dw_mipi_dsi_rockchip_unbind(struct device *dev, dw_mipi_dsi_unbind(dsi->dmd); clk_disable_unprepare(dsi
[PATCH v2 0/3] Fix Rockchip MIPI DSI display init timeouts
The Rockchip DSI driver has had a number of bugs over time and has usually only worked by chance. A number of users have noticed that things regressed with commit 43c2de1002d2 ("drm/rockchip: dsi: move all lane config except LCDC mux to bind()"), although it was fixing several real issues of its own that had been present forever in the upstream driver (but notably, not in the downstream/BSP driver). Patch 1 is the most important fix here. See its description for more info. While I'm at it, fix a few error handling and cleanup issues too. Changes in v2: - Clean up pm-runtime state in error cases. - Correct git hash for Fixes. Brian Norris (3): drm/rockchip: dsi: Hold pm-runtime across bind/unbind drm/rockchip: dsi: Fix unbalanced clock on probe error drm/rockchip: dsi: Disable PLL clock on bind error .../gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 45 +-- 1 file changed, 22 insertions(+), 23 deletions(-) -- 2.33.0.685.g46640cef36-goog