On Mon, Dec 04, 2023 at 02:50:17PM -0500, Stefan Hajnoczi wrote:
> On Mon, 4 Dec 2023 at 14:40, Philippe Mathieu-Daudé <phi...@linaro.org> wrote:
> >
> > Unplugging vCPU triggers the following assertion in
> 
> Unplugging leaks the tcg context refcount but does not trigger the
> assertion directly. Maybe clarify that by changing the wording:
> 
> "Plugging a vCPU after it has been unplugged triggers..."
> 
> > tcg_register_thread():
> >
> >  796 void tcg_register_thread(void)
> >  797 {
> >  ...
> >  812     /* Claim an entry in tcg_ctxs */
> >  813     n = qatomic_fetch_inc(&tcg_cur_ctxs);
> >  814     g_assert(n < tcg_max_ctxs);
> >
> > Implement and use tcg_unregister_thread() so when a
> > vCPU is unplugged, the tcg_cur_ctxs refcount is
> > decremented.
> >
> > Reported-by: Michal Suchánek <msucha...@suse.de>
> > Suggested-by: Stefan Hajnoczi <stefa...@gmail.com>
> > Signed-off-by: Philippe Mathieu-Daudé <phi...@linaro.org>
> > ---
> > RFC: untested
> > Report: https://lore.kernel.org/qemu-devel/20231204183638.gz9...@kitsune.suse.cz/
> > ---
> >  include/tcg/startup.h           |  5 +++++
> >  accel/tcg/tcg-accel-ops-mttcg.c |  1 +
> >  accel/tcg/tcg-accel-ops-rr.c    |  1 +
> >  tcg/tcg.c                       | 17 +++++++++++++++++
> >  4 files changed, 24 insertions(+)
> >
> > diff --git a/include/tcg/startup.h b/include/tcg/startup.h
> > index f71305765c..520942a4a1 100644
> > --- a/include/tcg/startup.h
> > +++ b/include/tcg/startup.h
> > @@ -45,6 +45,11 @@ void tcg_init(size_t tb_size, int splitwx, unsigned max_cpus);
> >   */
> >  void tcg_register_thread(void);
> >
> > +/**
> > + * tcg_unregister_thread: Unregister this thread with the TCG runtime
> > + */
> > +void tcg_unregister_thread(void);
> > +
> >  /**
> >   * tcg_prologue_init(): Generate the code for the TCG prologue
> >   *
> > diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
> > index fac80095bb..88d7427aad 100644
> > --- a/accel/tcg/tcg-accel-ops-mttcg.c
> > +++ b/accel/tcg/tcg-accel-ops-mttcg.c
> > @@ -120,6 +120,7 @@ static void *mttcg_cpu_thread_fn(void *arg)
> >
> >      tcg_cpus_destroy(cpu);
> >      qemu_mutex_unlock_iothread();
> > +    tcg_unregister_thread();
> >      rcu_remove_force_rcu_notifier(&force_rcu.notifier);
> >      rcu_unregister_thread();
> >      return NULL;
> > diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
> > index 611932f3c3..c2af3aad21 100644
> > --- a/accel/tcg/tcg-accel-ops-rr.c
> > +++ b/accel/tcg/tcg-accel-ops-rr.c
> > @@ -302,6 +302,7 @@ static void *rr_cpu_thread_fn(void *arg)
> >          rr_deal_with_unplugged_cpus();
> >      }
> >
> > +    tcg_unregister_thread();
> >      rcu_remove_force_rcu_notifier(&force_rcu);
> >      rcu_unregister_thread();
> >      return NULL;
> > diff --git a/tcg/tcg.c b/tcg/tcg.c
> > index d2ea22b397..5125342d70 100644
> > --- a/tcg/tcg.c
> > +++ b/tcg/tcg.c
> > @@ -781,11 +781,18 @@ static void alloc_tcg_plugin_context(TCGContext *s)
> >   * modes.
> >   */
> >  #ifdef CONFIG_USER_ONLY
> > +
> >  void tcg_register_thread(void)
> >  {
> >      tcg_ctx = &tcg_init_ctx;
> >  }
> > +
> > +void tcg_unregister_thread(void)
> > +{
> > +}
> > +
> >  #else
> > +
> >  void tcg_register_thread(void)
> >  {
> >      TCGContext *s = g_malloc(sizeof(*s));
> > @@ -814,6 +821,16 @@ void tcg_register_thread(void)
> >
> >      tcg_ctx = s;
> >  }
> > +
> > +void tcg_unregister_thread(void)
> > +{
> > +    unsigned int n;
> > +
> > +    n = qatomic_fetch_dec(&tcg_cur_ctxs);
> > +    g_free(tcg_ctxs[n]);
> > +    qatomic_set(&tcg_ctxs[n], NULL);
> > +}
> 
> tcg_ctxs[n] may not be our context, so this looks like it could free
> another thread's context and lead to undefined behavior.

There is cpu->thread_id, so perhaps cpu->thread_ctx could be added as
well, letting each vCPU thread free its own context on unplug. That
would require a bitmap of used thread contexts rather than a counter,
though.
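
To illustrate the bitmap idea, here is a rough standalone sketch, not
QEMU code. All names in it (MAX_CTXS, used_slots, ctx_slot_alloc,
ctx_slot_free) are made up for the example, and it assumes at most 64
contexts so that one 64-bit word is enough. Each thread would remember
the slot it was given (for instance in the proposed cpu->thread_ctx)
and hand back exactly that slot when it exits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical limit standing in for tcg_max_ctxs; must be <= 64 here. */
#define MAX_CTXS 64

/* Bit i set means slot i is currently owned by some thread. */
static _Atomic uint64_t used_slots;

/* Claim the lowest free slot. Returns -1 if every slot is in use. */
static int ctx_slot_alloc(void)
{
    uint64_t old = atomic_load(&used_slots);

    for (;;) {
        int slot = -1;

        for (int i = 0; i < MAX_CTXS; i++) {
            if (!(old & (UINT64_C(1) << i))) {
                slot = i;
                break;
            }
        }
        if (slot < 0) {
            return -1;
        }
        /* On failure 'old' is reloaded and the search is repeated. */
        if (atomic_compare_exchange_weak(&used_slots, &old,
                                         old | (UINT64_C(1) << slot))) {
            return slot;
        }
    }
}

/* Release a slot previously returned by ctx_slot_alloc(). */
static void ctx_slot_free(int slot)
{
    uint64_t bit, prev;

    assert(slot >= 0 && slot < MAX_CTXS);
    bit = UINT64_C(1) << slot;
    prev = atomic_fetch_and(&used_slots, ~bit);
    assert(prev & bit);   /* catches a double free of the same slot */
}
```

Unlike the counter, releasing a slot in the middle leaves the other
slots alone, and the next hotplugged vCPU simply reuses the hole.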

Thanks

Michal
