Re: UI layer threading and locking strategy; memory_region_snapshot_and_clear_dirty() races

Peter Maydell Thu, 17 Nov 2022 05:07:21 -0800

On Tue, 1 Nov 2022 at 14:17, Peter Maydell <peter.mayd...@linaro.org> wrote:
>
> Hi; I'm trying to find out what the UI layer's threading and
> locking strategy is, at least as far as it applies to display
> device models.


Ping! :-) I'm still looking for information about this,
and about what threads call_rcu() callbacks might be run on...

thanks
-- PMM

> Specifically:
>  * is the device's GraphicHwOps::gfx_update method always called
>    from one specific thread, or might it be called from any thread?
>  * is that method called with any locks guaranteed held? (eg the
>    iothread lock)
>  * is the caller of the gfx_update method OK if an implementation
>    of the method drops the iothread lock temporarily while it is
>    executing? (my guess would be "no")
>  * for a gfx_update_async = true device, what are the requirements
>    on calling graphic_hw_update_done()? Does the caller need to hold
>    any particular lock? Does the call need to be done from any
>    particular thread?
>
> The background to this is that I'm looking again at the race
> condition involving the memory_region_snapshot_and_clear_dirty()
> function, as described here:
>  
> https://lore.kernel.org/qemu-devel/CAFEAcA9odnPo2LPip295Uztri7JfoVnQbkJ=wn+k8dqneb_...@mail.gmail.com/T/#u
>
> Having worked through what is going on, as far as I can see:
>  (1) in order to be sure that we have the right data to match
>  the snapshotted dirty bitmap state, we must wait for all TCG
>  vCPUs to leave their current TB
>  (2) a vCPU might block waiting for the iothread lock mid-TB
>  (3) therefore we cannot wait for the TCG vCPUs without dropping
>  the iothread lock one way or another
>  (4) but none of the callers expect that and various things break
>
> My tentative idea for a fix is a bit of an upheaval:
>  * have the display devices set gfx_update_async = true
>  * instead of doing everything synchronously in their gfx_update
>    method, they do the initial setup and call an 'async' version
>    of memory_region_snapshot_and_clear_dirty()
>  * that async version of the function will do what it does today,
>    but without trying to wait for TCG vCPUs
>  * instead the caller arranges (via call_rcu(), probably) a
>    callback that will happen once all the TCG CPUs have finished
>    executing their current TB
>  * that callback does the actual copy-from-guest-ram-to-display
>    and then calls graphic_hw_update_done()
>
> This seems like an awful pain in the neck but I couldn't see
> anything better :-(
>
> Paolo: what (if any) guarantee does call_rcu() make about
> which thread the callback function gets executed on, and what
> locks are/are not held when it's called?
>
> (I haven't looked at the migration code's use of
> memory_global_after_dirty_log_sync() but I suspect it's
> similarly broken.)
>
> thanks
> -- PMM

Re: UI layer threading and locking strategy; memory_region_snapshot_and_clear_dirty() races

Reply via email to