Re: [PATCH] drm/atomic-helpers: remove legacy_cursor_update hacks

2024-01-31 Thread Daniel Vetter
On Wed, Jan 31, 2024 at 12:26:45PM +0200, Dmitry Baryshkov wrote:
> On Wed, 31 Jan 2024 at 11:11, Daniel Vetter  wrote:
> >
> > On Wed, Jan 31, 2024 at 05:17:08AM +, Jason-JH Lin (林睿祥) wrote:
> > > On Thu, 2024-01-25 at 19:17 +0100, Daniel Vetter wrote:
> > > >
> > > > External email : Please do not click links or open attachments until
> > > > you have verified the sender or the content.
> > > >  On Tue, Jan 23, 2024 at 06:09:05AM +, Jason-JH Lin (林睿祥) wrote:
> > > > > Hi Maxime, Daniel,
> > > > >
> > > > > We encountered similar issue with mediatek SoCs.
> > > > >
> > > > > We have found that in drm_atomic_helper_commit_rpm(), when
> > > > disabling
> > > > > the cursor plane, the old_state->legacy_cursor_update in
> > > > > drm_atomic_wait_for_vblank() is set to true.
> > > > > As the result, we are not actually waiting for a vlbank to wait for
> > > > our
> > > > > hardware to close the cursor plane. Subsequently, the execution
> > > > > proceeds to drm_atomic_helper_cleanup_planes() to  free the cursor
> > > > > buffer. This can lead to use-after-free issues with our hardware.
> > > > >
> > > > > Could you please apply this patch to fix our problem?
> > > > > Or are there any considerations for not applying this patch?
> > > >
> > > > Mostly it needs someone to collect a pile of acks/tested-by and then
> > > > land
> > > > it.
> > > >
> > >
> > > Got it. I would add tested-by tag for mediatek SoC.
> > >
> > > > I'd be _very_ happy if someone else can take care of that ...
> > > >
> > > > There's also the potential issue that it might slow down some of the
> > > > legacy X11 use-cases that really needed a non-blocking cursor, but I
> > > > think
> > > > all the drivers where this matters have switched over to the async
> > > > plane
> > > > update stuff meanwhile. So hopefully that's good.
> > > >
> > >
> > > I think all the drivers should have switched to async plane update.
> > >
> > > Can we add the checking condition to see if atomic_async_update/check
> > > function are implemented?
> >
> > Pretty sure not all have done that, so really it boils down to whether we
> > break a real user's use-case. Which pretty much can only be checked by
> > merging the patch (hence the requirement to get as many acks as possible
> > from display drivers) and then being willing to handle any fallout that's
> > reported as regressions for a specific driver.
> >
> > It's a pile of work, at least when it goes south, that's why I'm looking
> > for volunteers.
> 
> I can check this on all sensible msm generations, including mdp4, but
> it will be next week, after the FOSDEM.
> 
> BTW, for technical reasons one of the msm platforms still has the
> legacy cursor implementation might it be related?

Yeah, msm is one of the drivers I had to change with some hacks to avoid
really bad fallout. It should still work like before, but that's one that
definitely needs testing.
-Sima

> 
> >
> > Note that handling the fallout doesn't mean you have to fix that specific
> > driver, the only realistic option might be to reinstate the legacy cursor
> > behaviour, but as an explicit opt-in that only that specific driver
> > enables.
> >
> > So maybe for next round of that patch it might be good to have a 2nd patch
> > which implements this fallback plan in the shared atomic modeset code?
> >
> > Cheers, Sima
> 
> 
> -- 
> With best wishes
> Dmitry

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Re: Re: [PATCH] drm/atomic-helpers: remove legacy_cursor_update hacks

2024-01-31 Thread mrip...@kernel.org
Hi,

On Wed, Jan 31, 2024 at 05:27:14AM +, Jason-JH Lin (林睿祥) wrote:
> 
> On Sun, 2024-01-28 at 10:24 +0100, Maxime Ripard wrote:
> > On Thu, Jan 25, 2024 at 07:17:21PM +0100, Daniel Vetter wrote:
> > > On Tue, Jan 23, 2024 at 06:09:05AM +, Jason-JH Lin (林睿祥) wrote:
> > > > Hi Maxime, Daniel,
> > > > 
> > > > We encountered similar issue with mediatek SoCs.
> > > > 
> > > > We have found that in drm_atomic_helper_commit_rpm(), when
> > > > disabling
> > > > the cursor plane, the old_state->legacy_cursor_update in
> > > > drm_atomic_wait_for_vblank() is set to true.
> > > > As the result, we are not actually waiting for a vlbank to wait
> > > > for our
> > > > hardware to close the cursor plane. Subsequently, the execution
> > > > proceeds to drm_atomic_helper_cleanup_planes() to  free the
> > > > cursor
> > > > buffer. This can lead to use-after-free issues with our hardware.
> > > > 
> > > > Could you please apply this patch to fix our problem?
> > > > Or are there any considerations for not applying this patch?
> > > 
> > > Mostly it needs someone to collect a pile of acks/tested-by and
> > > then land
> > > it.
> > > 
> > > I'd be _very_ happy if someone else can take care of that ...
> > > 
> > > There's also the potential issue that it might slow down some of
> > > the
> > > legacy X11 use-cases that really needed a non-blocking cursor, but
> > > I think
> > > all the drivers where this matters have switched over to the async
> > > plane
> > > update stuff meanwhile. So hopefully that's good.
> > 
> > I think there was also a regression with msm no one really figured
> > out?
> 
> OK...
> But I am only available on MediaTek platform.

I think most of us are in that situation, and which is part of the
reason it kind of stalled :)

> Does it also causes a regression with msn if I re-send a patch for
> drm_atomic_helper.c only?

Yes, that's my recollection at least.

Fortunately, Dmitry might be able to clear that up.

Maxime


signature.asc
Description: PGP signature


Re: [PATCH] drm/atomic-helpers: remove legacy_cursor_update hacks

2024-01-31 Thread Dmitry Baryshkov
On Wed, 31 Jan 2024 at 11:11, Daniel Vetter  wrote:
>
> On Wed, Jan 31, 2024 at 05:17:08AM +, Jason-JH Lin (林睿祥) wrote:
> > On Thu, 2024-01-25 at 19:17 +0100, Daniel Vetter wrote:
> > >
> > > External email : Please do not click links or open attachments until
> > > you have verified the sender or the content.
> > >  On Tue, Jan 23, 2024 at 06:09:05AM +, Jason-JH Lin (林睿祥) wrote:
> > > > Hi Maxime, Daniel,
> > > >
> > > > We encountered similar issue with mediatek SoCs.
> > > >
> > > > We have found that in drm_atomic_helper_commit_rpm(), when
> > > disabling
> > > > the cursor plane, the old_state->legacy_cursor_update in
> > > > drm_atomic_wait_for_vblank() is set to true.
> > > > As the result, we are not actually waiting for a vlbank to wait for
> > > our
> > > > hardware to close the cursor plane. Subsequently, the execution
> > > > proceeds to drm_atomic_helper_cleanup_planes() to  free the cursor
> > > > buffer. This can lead to use-after-free issues with our hardware.
> > > >
> > > > Could you please apply this patch to fix our problem?
> > > > Or are there any considerations for not applying this patch?
> > >
> > > Mostly it needs someone to collect a pile of acks/tested-by and then
> > > land
> > > it.
> > >
> >
> > Got it. I would add tested-by tag for mediatek SoC.
> >
> > > I'd be _very_ happy if someone else can take care of that ...
> > >
> > > There's also the potential issue that it might slow down some of the
> > > legacy X11 use-cases that really needed a non-blocking cursor, but I
> > > think
> > > all the drivers where this matters have switched over to the async
> > > plane
> > > update stuff meanwhile. So hopefully that's good.
> > >
> >
> > I think all the drivers should have switched to async plane update.
> >
> > Can we add the checking condition to see if atomic_async_update/check
> > function are implemented?
>
> Pretty sure not all have done that, so really it boils down to whether we
> break a real user's use-case. Which pretty much can only be checked by
> merging the patch (hence the requirement to get as many acks as possible
> from display drivers) and then being willing to handle any fallout that's
> reported as regressions for a specific driver.
>
> It's a pile of work, at least when it goes south, that's why I'm looking
> for volunteers.

I can check this on all sensible msm generations, including mdp4, but
it will be next week, after the FOSDEM.

BTW, for technical reasons one of the msm platforms still has the
legacy cursor implementation might it be related?

>
> Note that handling the fallout doesn't mean you have to fix that specific
> driver, the only realistic option might be to reinstate the legacy cursor
> behaviour, but as an explicit opt-in that only that specific driver
> enables.
>
> So maybe for next round of that patch it might be good to have a 2nd patch
> which implements this fallback plan in the shared atomic modeset code?
>
> Cheers, Sima


-- 
With best wishes
Dmitry


Re: [PATCH] drm/atomic-helpers: remove legacy_cursor_update hacks

2024-01-31 Thread Daniel Vetter
On Wed, Jan 31, 2024 at 05:17:08AM +, Jason-JH Lin (林睿祥) wrote:
> On Thu, 2024-01-25 at 19:17 +0100, Daniel Vetter wrote:
> >  
> > External email : Please do not click links or open attachments until
> > you have verified the sender or the content.
> >  On Tue, Jan 23, 2024 at 06:09:05AM +, Jason-JH Lin (林睿祥) wrote:
> > > Hi Maxime, Daniel,
> > > 
> > > We encountered similar issue with mediatek SoCs.
> > > 
> > > We have found that in drm_atomic_helper_commit_rpm(), when
> > disabling
> > > the cursor plane, the old_state->legacy_cursor_update in
> > > drm_atomic_wait_for_vblank() is set to true.
> > > As the result, we are not actually waiting for a vlbank to wait for
> > our
> > > hardware to close the cursor plane. Subsequently, the execution
> > > proceeds to drm_atomic_helper_cleanup_planes() to  free the cursor
> > > buffer. This can lead to use-after-free issues with our hardware.
> > > 
> > > Could you please apply this patch to fix our problem?
> > > Or are there any considerations for not applying this patch?
> > 
> > Mostly it needs someone to collect a pile of acks/tested-by and then
> > land
> > it.
> > 
> 
> Got it. I would add tested-by tag for mediatek SoC.
> 
> > I'd be _very_ happy if someone else can take care of that ...
> > 
> > There's also the potential issue that it might slow down some of the
> > legacy X11 use-cases that really needed a non-blocking cursor, but I
> > think
> > all the drivers where this matters have switched over to the async
> > plane
> > update stuff meanwhile. So hopefully that's good.
> > 
> 
> I think all the drivers should have switched to async plane update.
> 
> Can we add the checking condition to see if atomic_async_update/check
> function are implemented?

Pretty sure not all have done that, so really it boils down to whether we
break a real user's use-case. Which pretty much can only be checked by
merging the patch (hence the requirement to get as many acks as possible
from display drivers) and then being willing to handle any fallout that's
reported as regressions for a specific driver.

It's a pile of work, at least when it goes south, that's why I'm looking
for volunteers.

Note that handling the fallout doesn't mean you have to fix that specific
driver, the only realistic option might be to reinstate the legacy cursor
behaviour, but as an explicit opt-in that only that specific driver
enables.

So maybe for next round of that patch it might be good to have a 2nd patch
which implements this fallback plan in the shared atomic modeset code?

Cheers, Sima

> 
> Regards,
> Jason-JH.Lin
> 
> > Cheers, Sima
> > > 
> > > Regards,
> > > Jason-JH.Lin
> > > 
> > > On Tue, 2023-03-07 at 15:56 +0100, Maxime Ripard wrote:
> > > > Hi,
> > > > 
> > > > On Thu, Feb 16, 2023 at 12:12:13PM +0100, Daniel Vetter wrote:
> > > > > The stuff never really worked, and leads to lots of fun because
> > it
> > > > > out-of-order frees atomic states. Which upsets KASAN, among
> > other
> > > > > things.
> > > > > 
> > > > > For async updates we now have a more solid solution with the
> > > > > ->atomic_async_check and ->atomic_async_commit hooks. Support
> > for
> > > > > that
> > > > > for msm and vc4 landed. nouveau and i915 have their own commit
> > > > > routines, doing something similar.
> > > > > 
> > > > > For everyone else it's probably better to remove the use-after-
> > free
> > > > > bug, and encourage folks to use the async support instead. The
> > > > > affected drivers which register a legacy cursor plane and don't
> > > > > either
> > > > > use the new async stuff or their own commit routine are:
> > amdgpu,
> > > > > atmel, mediatek, qxl, rockchip, sti, sun4i, tegra, virtio, and
> > > > > vmwgfx.
> > > > > 
> > > > > Inspired by an amdgpu bug report.
> > > > 
> > > > Thanks for submitting that patch. It's been in the downstream RPi
> > > > tree
> > > > for a while, so I'd really like it to be merged eventually :)
> > > > 
> > > > Acked-by: Maxime Ripard 
> > > > 
> > > > Maxime
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Re: [PATCH] drm/atomic-helpers: remove legacy_cursor_update hacks

2024-01-30 Thread 林睿祥


Re: [PATCH] drm/atomic-helpers: remove legacy_cursor_update hacks

2024-01-30 Thread 林睿祥


Re: Re: [PATCH] drm/atomic-helpers: remove legacy_cursor_update hacks

2024-01-28 Thread Maxime Ripard
On Thu, Jan 25, 2024 at 07:17:21PM +0100, Daniel Vetter wrote:
> On Tue, Jan 23, 2024 at 06:09:05AM +, Jason-JH Lin (林睿祥) wrote:
> > Hi Maxime, Daniel,
> > 
> > We encountered similar issue with mediatek SoCs.
> > 
> > We have found that in drm_atomic_helper_commit_rpm(), when disabling
> > the cursor plane, the old_state->legacy_cursor_update in
> > drm_atomic_wait_for_vblank() is set to true.
> > As the result, we are not actually waiting for a vlbank to wait for our
> > hardware to close the cursor plane. Subsequently, the execution
> > proceeds to drm_atomic_helper_cleanup_planes() to  free the cursor
> > buffer. This can lead to use-after-free issues with our hardware.
> > 
> > Could you please apply this patch to fix our problem?
> > Or are there any considerations for not applying this patch?
> 
> Mostly it needs someone to collect a pile of acks/tested-by and then land
> it.
> 
> I'd be _very_ happy if someone else can take care of that ...
> 
> There's also the potential issue that it might slow down some of the
> legacy X11 use-cases that really needed a non-blocking cursor, but I think
> all the drivers where this matters have switched over to the async plane
> update stuff meanwhile. So hopefully that's good.

I think there was also a regression with msm no one really figured out?

Maxime


signature.asc
Description: PGP signature


Re: [PATCH] drm/atomic-helpers: remove legacy_cursor_update hacks

2024-01-25 Thread Daniel Vetter
On Tue, Jan 23, 2024 at 06:09:05AM +, Jason-JH Lin (林睿祥) wrote:
> Hi Maxime, Daniel,
> 
> We encountered similar issue with mediatek SoCs.
> 
> We have found that in drm_atomic_helper_commit_rpm(), when disabling
> the cursor plane, the old_state->legacy_cursor_update in
> drm_atomic_wait_for_vblank() is set to true.
> As the result, we are not actually waiting for a vlbank to wait for our
> hardware to close the cursor plane. Subsequently, the execution
> proceeds to drm_atomic_helper_cleanup_planes() to  free the cursor
> buffer. This can lead to use-after-free issues with our hardware.
> 
> Could you please apply this patch to fix our problem?
> Or are there any considerations for not applying this patch?

Mostly it needs someone to collect a pile of acks/tested-by and then land
it.

I'd be _very_ happy if someone else can take care of that ...

There's also the potential issue that it might slow down some of the
legacy X11 use-cases that really needed a non-blocking cursor, but I think
all the drivers where this matters have switched over to the async plane
update stuff meanwhile. So hopefully that's good.

Cheers, Sima
> 
> Regards,
> Jason-JH.Lin
> 
> On Tue, 2023-03-07 at 15:56 +0100, Maxime Ripard wrote:
> > Hi,
> > 
> > On Thu, Feb 16, 2023 at 12:12:13PM +0100, Daniel Vetter wrote:
> > > The stuff never really worked, and leads to lots of fun because it
> > > out-of-order frees atomic states. Which upsets KASAN, among other
> > > things.
> > > 
> > > For async updates we now have a more solid solution with the
> > > ->atomic_async_check and ->atomic_async_commit hooks. Support for
> > > that
> > > for msm and vc4 landed. nouveau and i915 have their own commit
> > > routines, doing something similar.
> > > 
> > > For everyone else it's probably better to remove the use-after-free
> > > bug, and encourage folks to use the async support instead. The
> > > affected drivers which register a legacy cursor plane and don't
> > > either
> > > use the new async stuff or their own commit routine are: amdgpu,
> > > atmel, mediatek, qxl, rockchip, sti, sun4i, tegra, virtio, and
> > > vmwgfx.
> > > 
> > > Inspired by an amdgpu bug report.
> > 
> > Thanks for submitting that patch. It's been in the downstream RPi
> > tree
> > for a while, so I'd really like it to be merged eventually :)
> > 
> > Acked-by: Maxime Ripard 
> > 
> > Maxime

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/atomic-helpers: remove legacy_cursor_update hacks

2024-01-25 Thread 林睿祥


Re: [Freedreno] [PATCH] drm/atomic-helpers: remove legacy_cursor_update hacks

2023-03-07 Thread Maxime Ripard
Hi,

On Thu, Feb 16, 2023 at 12:12:13PM +0100, Daniel Vetter wrote:
> The stuff never really worked, and leads to lots of fun because it
> out-of-order frees atomic states. Which upsets KASAN, among other
> things.
> 
> For async updates we now have a more solid solution with the
> ->atomic_async_check and ->atomic_async_commit hooks. Support for that
> for msm and vc4 landed. nouveau and i915 have their own commit
> routines, doing something similar.
> 
> For everyone else it's probably better to remove the use-after-free
> bug, and encourage folks to use the async support instead. The
> affected drivers which register a legacy cursor plane and don't either
> use the new async stuff or their own commit routine are: amdgpu,
> atmel, mediatek, qxl, rockchip, sti, sun4i, tegra, virtio, and vmwgfx.
> 
> Inspired by an amdgpu bug report.

Thanks for submitting that patch. It's been in the downstream RPi tree
for a while, so I'd really like it to be merged eventually :)

Acked-by: Maxime Ripard 

Maxime


signature.asc
Description: PGP signature


Re: [Freedreno] [PATCH] drm/atomic-helpers: remove legacy_cursor_update hacks

2023-02-22 Thread Rob Clark
On Wed, Feb 22, 2023 at 3:14 PM Rob Clark  wrote:
>
> On Thu, Feb 16, 2023 at 3:12 AM Daniel Vetter  wrote:
> >
> > The stuff never really worked, and leads to lots of fun because it
> > out-of-order frees atomic states. Which upsets KASAN, among other
> > things.
> >
> > For async updates we now have a more solid solution with the
> > ->atomic_async_check and ->atomic_async_commit hooks. Support for that
> > for msm and vc4 landed. nouveau and i915 have their own commit
> > routines, doing something similar.
> >
> > For everyone else it's probably better to remove the use-after-free
> > bug, and encourage folks to use the async support instead. The
> > affected drivers which register a legacy cursor plane and don't either
> > use the new async stuff or their own commit routine are: amdgpu,
> > atmel, mediatek, qxl, rockchip, sti, sun4i, tegra, virtio, and vmwgfx.
> >
> > Inspired by an amdgpu bug report.
> >
> > v2: Drop RFC, I think with amdgpu converted over to use
> > atomic_async_check/commit done in
> >
> > commit 674e78acae0dfb4beb56132e41cbae5b60f7d662
> > Author: Nicholas Kazlauskas 
> > Date:   Wed Dec 5 14:59:07 2018 -0500
> >
> > drm/amd/display: Add fast path for cursor plane updates
> >
> > we don't have any driver anymore where we have userspace expecting
> > solid legacy cursor support _and_ they are using the atomic helpers in
> > their fully glory. So we can retire this.
> >
> > v3: Paper over msm and i915 regression. The complete_all is the only
> > thing missing afaict.
> >
> > v4: Fixup i915 fixup ...
> >
> > v5: Unallocate the crtc->event in msm to avoid hitting a WARN_ON in
> > dpu_crtc_atomic_flush(). This is a bit a hack, but simplest way to
> > untangle this all. Thanks to Abhinav Kumar for the debug help.
>
> Hmm, are you sure about that double-put?
>
> [  +0.501263] [ cut here ]
> [  +0.32] refcount_t: underflow; use-after-free.
> [  +0.33] WARNING: CPU: 6 PID: 1854 at lib/refcount.c:28
> refcount_warn_saturate+0xf8/0x134
> [  +0.43] Modules linked in: uinput rfcomm algif_hash
> algif_skcipher af_alg veth venus_dec venus_enc xt_cgroup xt_MASQUERADE
> qcom_spmi_temp_alarm qcom_spmi_adc_tm5 qcom_spmi_adc5 qcom_vadc_common
> cros_ec_typec typec 8021q hci_uart btqca qcom_stats venus_core
> coresight_etm4x coresight_tmc snd_soc_lpass_sc7180
> coresight_replicator coresight_funnel coresight snd_soc_sc7180
> ip6table_nat fuse ath10k_snoc ath10k_core ath mac80211 iio_trig_sysfs
> bluetooth cros_ec_sensors cfg80211 cros_ec_sensors_core
> industrialio_triggered_buffer kfifo_buf ecdh_generic ecc
> cros_ec_sensorhub lzo_rle lzo_compress r8153_ecm cdc_ether usbnet
> r8152 mii zram hid_vivaldi hid_google_hammer hid_vivaldi_common joydev
> [  +0.000189] CPU: 6 PID: 1854 Comm: DrmThread Not tainted
> 5.15.93-16271-g5ecce40dbcd4 #46
> cf9752a1c9e5b13fd13216094f52d77fa5a5f8f3
> [  +0.16] Hardware name: Google Wormdingler rev1+ INX panel board (DT)
> [  +0.08] pstate: 6049 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [  +0.13] pc : refcount_warn_saturate+0xf8/0x134
> [  +0.11] lr : refcount_warn_saturate+0xf8/0x134
> [  +0.11] sp : ffc012e43930
> [  +0.08] x29: ffc012e43930 x28: ff80d31aa300 x27: 
> 024e
> [  +0.17] x26: 03bd x25: 0040 x24: 
> 0040
> [  +0.14] x23: ff8083eb1000 x22: 0002 x21: 
> ff80845bc800
> [  +0.13] x20: 0040 x19: ff80d0cecb00 x18: 
> 60014024
> [  +0.12] x17:  x16: 003c x15: 
> ffd97e21a1c0
> [  +0.12] x14: 0003 x13: 0004 x12: 
> 0001
> [  +0.14] x11: c000dfff x10: ffd97f560f50 x9 : 
> 5749cdb403550d00
> [  +0.14] x8 : 5749cdb403550d00 x7 :  x6 : 
> 372e31332020205b
> [  +0.12] x5 : ffd97f7b8b24 x4 :  x3 : 
> ffc012e43588
> [  +0.13] x2 : ffc012e43590 x1 : dfff x0 : 
> 0026
> [  +0.14] Call trace:
> [  +0.08]  refcount_warn_saturate+0xf8/0x134
> [  +0.13]  drm_crtc_commit_put+0x54/0x74
> [  +0.13]  __drm_atomic_helper_plane_destroy_state+0x64/0x68
> [  +0.13]  dpu_plane_destroy_state+0x24/0x3c
> [  +0.17]  drm_atomic_state_default_clear+0x13c/0x2d8
> [  +0.15]  __drm_atomic_state_free+0x88/0xa0
> [  +0.15]  drm_atomic_helper_update_plane+0x158/0x188
> [  +0.14]  __setplane_atomic+0xf4/0x138
> [  +0.12]  drm_mode_cursor_common+0x2e8/0x40c
> [  +0.09]  drm_mode_cursor_ioctl+0x48/0x70
> [  +0.08]  drm_ioctl_kernel+0xe0/0x158
> [  +0.14]  drm_ioctl+0x214/0x480
> [  +0.12]  __arm64_sys_ioctl+0x94/0xd4
> [  +0.10]  invoke_syscall+0x4c/0x100
> [  +0.13]  do_el0_svc+0xa4/0x168
> [  +0.12]  el0_svc+0x20/0x50
> [  +0.09]  el0t_64_sync_handler+0x20/0x110
> [  +0.08]  el0t_64_sync+0x1a4/0x1a8
> [  +0.10] ---[ end trace 35bb2d245a684c9a ]---
>

without the 

Re: [Freedreno] [PATCH] drm/atomic-helpers: remove legacy_cursor_update hacks

2023-02-22 Thread Rob Clark
On Thu, Feb 16, 2023 at 3:12 AM Daniel Vetter  wrote:
>
> The stuff never really worked, and leads to lots of fun because it
> out-of-order frees atomic states. Which upsets KASAN, among other
> things.
>
> For async updates we now have a more solid solution with the
> ->atomic_async_check and ->atomic_async_commit hooks. Support for that
> for msm and vc4 landed. nouveau and i915 have their own commit
> routines, doing something similar.
>
> For everyone else it's probably better to remove the use-after-free
> bug, and encourage folks to use the async support instead. The
> affected drivers which register a legacy cursor plane and don't either
> use the new async stuff or their own commit routine are: amdgpu,
> atmel, mediatek, qxl, rockchip, sti, sun4i, tegra, virtio, and vmwgfx.
>
> Inspired by an amdgpu bug report.
>
> v2: Drop RFC, I think with amdgpu converted over to use
> atomic_async_check/commit done in
>
> commit 674e78acae0dfb4beb56132e41cbae5b60f7d662
> Author: Nicholas Kazlauskas 
> Date:   Wed Dec 5 14:59:07 2018 -0500
>
> drm/amd/display: Add fast path for cursor plane updates
>
> we don't have any driver anymore where we have userspace expecting
> solid legacy cursor support _and_ they are using the atomic helpers in
> their fully glory. So we can retire this.
>
> v3: Paper over msm and i915 regression. The complete_all is the only
> thing missing afaict.
>
> v4: Fixup i915 fixup ...
>
> v5: Unallocate the crtc->event in msm to avoid hitting a WARN_ON in
> dpu_crtc_atomic_flush(). This is a bit a hack, but simplest way to
> untangle this all. Thanks to Abhinav Kumar for the debug help.

Hmm, are you sure about that double-put?

[  +0.501263] [ cut here ]
[  +0.32] refcount_t: underflow; use-after-free.
[  +0.33] WARNING: CPU: 6 PID: 1854 at lib/refcount.c:28
refcount_warn_saturate+0xf8/0x134
[  +0.43] Modules linked in: uinput rfcomm algif_hash
algif_skcipher af_alg veth venus_dec venus_enc xt_cgroup xt_MASQUERADE
qcom_spmi_temp_alarm qcom_spmi_adc_tm5 qcom_spmi_adc5 qcom_vadc_common
cros_ec_typec typec 8021q hci_uart btqca qcom_stats venus_core
coresight_etm4x coresight_tmc snd_soc_lpass_sc7180
coresight_replicator coresight_funnel coresight snd_soc_sc7180
ip6table_nat fuse ath10k_snoc ath10k_core ath mac80211 iio_trig_sysfs
bluetooth cros_ec_sensors cfg80211 cros_ec_sensors_core
industrialio_triggered_buffer kfifo_buf ecdh_generic ecc
cros_ec_sensorhub lzo_rle lzo_compress r8153_ecm cdc_ether usbnet
r8152 mii zram hid_vivaldi hid_google_hammer hid_vivaldi_common joydev
[  +0.000189] CPU: 6 PID: 1854 Comm: DrmThread Not tainted
5.15.93-16271-g5ecce40dbcd4 #46
cf9752a1c9e5b13fd13216094f52d77fa5a5f8f3
[  +0.16] Hardware name: Google Wormdingler rev1+ INX panel board (DT)
[  +0.08] pstate: 6049 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  +0.13] pc : refcount_warn_saturate+0xf8/0x134
[  +0.11] lr : refcount_warn_saturate+0xf8/0x134
[  +0.11] sp : ffc012e43930
[  +0.08] x29: ffc012e43930 x28: ff80d31aa300 x27: 024e
[  +0.17] x26: 03bd x25: 0040 x24: 0040
[  +0.14] x23: ff8083eb1000 x22: 0002 x21: ff80845bc800
[  +0.13] x20: 0040 x19: ff80d0cecb00 x18: 60014024
[  +0.12] x17:  x16: 003c x15: ffd97e21a1c0
[  +0.12] x14: 0003 x13: 0004 x12: 0001
[  +0.14] x11: c000dfff x10: ffd97f560f50 x9 : 5749cdb403550d00
[  +0.14] x8 : 5749cdb403550d00 x7 :  x6 : 372e31332020205b
[  +0.12] x5 : ffd97f7b8b24 x4 :  x3 : ffc012e43588
[  +0.13] x2 : ffc012e43590 x1 : dfff x0 : 0026
[  +0.14] Call trace:
[  +0.08]  refcount_warn_saturate+0xf8/0x134
[  +0.13]  drm_crtc_commit_put+0x54/0x74
[  +0.13]  __drm_atomic_helper_plane_destroy_state+0x64/0x68
[  +0.13]  dpu_plane_destroy_state+0x24/0x3c
[  +0.17]  drm_atomic_state_default_clear+0x13c/0x2d8
[  +0.15]  __drm_atomic_state_free+0x88/0xa0
[  +0.15]  drm_atomic_helper_update_plane+0x158/0x188
[  +0.14]  __setplane_atomic+0xf4/0x138
[  +0.12]  drm_mode_cursor_common+0x2e8/0x40c
[  +0.09]  drm_mode_cursor_ioctl+0x48/0x70
[  +0.08]  drm_ioctl_kernel+0xe0/0x158
[  +0.14]  drm_ioctl+0x214/0x480
[  +0.12]  __arm64_sys_ioctl+0x94/0xd4
[  +0.10]  invoke_syscall+0x4c/0x100
[  +0.13]  do_el0_svc+0xa4/0x168
[  +0.12]  el0_svc+0x20/0x50
[  +0.09]  el0t_64_sync_handler+0x20/0x110
[  +0.08]  el0t_64_sync+0x1a4/0x1a8
[  +0.10] ---[ end trace 35bb2d245a684c9a ]---


BR,
-R



> Cc: Abhinav Kumar 
> Cc: Thomas Zimmermann 
> Cc: Maxime Ripard 
> References: https://bugzilla.kernel.org/show_bug.cgi?id=199425
> References: 
> https://lore.kernel.org/all/20220221134155.125447-9-max...@cerno.tech/
> References: 

[Freedreno] [PATCH] drm/atomic-helpers: remove legacy_cursor_update hacks

2023-02-16 Thread Daniel Vetter
The stuff never really worked, and leads to lots of fun because it
out-of-order frees atomic states. Which upsets KASAN, among other
things.

For async updates we now have a more solid solution with the
->atomic_async_check and ->atomic_async_commit hooks. Support for that
for msm and vc4 landed. nouveau and i915 have their own commit
routines, doing something similar.

For everyone else it's probably better to remove the use-after-free
bug, and encourage folks to use the async support instead. The
affected drivers which register a legacy cursor plane and don't either
use the new async stuff or their own commit routine are: amdgpu,
atmel, mediatek, qxl, rockchip, sti, sun4i, tegra, virtio, and vmwgfx.

Inspired by an amdgpu bug report.

v2: Drop RFC, I think with amdgpu converted over to use
atomic_async_check/commit done in

commit 674e78acae0dfb4beb56132e41cbae5b60f7d662
Author: Nicholas Kazlauskas 
Date:   Wed Dec 5 14:59:07 2018 -0500

drm/amd/display: Add fast path for cursor plane updates

we don't have any driver anymore where we have userspace expecting
solid legacy cursor support _and_ they are using the atomic helpers in
their fully glory. So we can retire this.

v3: Paper over msm and i915 regression. The complete_all is the only
thing missing afaict.

v4: Fixup i915 fixup ...

v5: Unallocate the crtc->event in msm to avoid hitting a WARN_ON in
dpu_crtc_atomic_flush(). This is a bit a hack, but simplest way to
untangle this all. Thanks to Abhinav Kumar for the debug help.

Cc: Abhinav Kumar 
Cc: Thomas Zimmermann 
Cc: Maxime Ripard 
References: https://bugzilla.kernel.org/show_bug.cgi?id=199425
References: 
https://lore.kernel.org/all/20220221134155.125447-9-max...@cerno.tech/
References: https://bugzilla.kernel.org/show_bug.cgi?id=199425
Cc: Maxime Ripard 
Tested-by: Maxime Ripard 
Cc: mikita.lip...@amd.com
Cc: Michel Dänzer 
Cc: harry.wentl...@amd.com
Cc: Rob Clark 
Cc: "Kazlauskas, Nicholas" 
Cc: Dmitry Osipenko 
Cc: Maarten Lankhorst 
Cc: Dmitry Baryshkov 
Cc: Sean Paul 
Cc: Matthias Brugger 
Cc: AngeloGioacchino Del Regno 
Cc: "Ville Syrjälä" 
Cc: Jani Nikula 
Cc: Lucas De Marchi 
Cc: Imre Deak 
Cc: Manasi Navare 
Cc: linux-arm-...@vger.kernel.org
Cc: freedreno@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-media...@lists.infradead.org
Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/drm_atomic_helper.c  | 13 -
 drivers/gpu/drm/i915/display/intel_display.c | 14 ++
 drivers/gpu/drm/msm/msm_atomic.c | 15 +++
 3 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
b/drivers/gpu/drm/drm_atomic_helper.c
index d579fd8f7cb8..f6b4c3a00684 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1587,13 +1587,6 @@ drm_atomic_helper_wait_for_vblanks(struct drm_device 
*dev,
int i, ret;
unsigned int crtc_mask = 0;
 
-/*
- * Legacy cursor ioctls are completely unsynced, and userspace
- * relies on that (by doing tons of cursor updates).
- */
-   if (old_state->legacy_cursor_update)
-   return;
-
for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, 
new_crtc_state, i) {
if (!new_crtc_state->active)
continue;
@@ -2244,12 +2237,6 @@ int drm_atomic_helper_setup_commit(struct 
drm_atomic_state *state,
continue;
}
 
-   /* Legacy cursor updates are fully unsynced. */
-   if (state->legacy_cursor_update) {
-   complete_all(>flip_done);
-   continue;
-   }
-
if (!new_crtc_state->event) {
commit->event = kzalloc(sizeof(*commit->event),
GFP_KERNEL);
diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 3479125fbda6..2454451fcf95 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -7651,6 +7651,20 @@ static int intel_atomic_commit(struct drm_device *dev,
intel_runtime_pm_put(_priv->runtime_pm, state->wakeref);
return ret;
}
+
+   /*
+* FIXME: Cut over to (async) commit helpers instead of hand-rolling
+* everything.
+*/
+   if (state->base.legacy_cursor_update) {
+   struct intel_crtc_state *new_crtc_state;
+   struct intel_crtc *crtc;
+   int i;
+
+   for_each_new_intel_crtc_in_state(state, crtc, new_crtc_state, i)
+   complete_all(_crtc_state->uapi.commit->flip_done);
+   }
+
intel_shared_dpll_swap_state(state);
intel_atomic_track_fbs(state);
 
diff --git a/drivers/gpu/drm/msm/msm_atomic.c