Hi, Kristian,

On Oct 20, Kristian Nielsen wrote:
> Hi Serg,
> 
> I came upon this commit by you:
> 
> > commit 6b685ea7b0776430d45b095cb4be3ef0739a3c04
> > Author: Sergei Golubchik <s...@mariadb.org>
> > Date:   Wed Sep 28 18:55:15 2022 +0200
> > 
> >     correctness assert
> >     
> >     thd_get_ha_data() can be used without a lock, but only from the
> >     current thd thread, when calling from anoher thread it *must*
> >     be protected by thd->LOCK_thd_data
> >     
> >     * fix group commit code to take thd->LOCK_thd_data
> >     * remove innobase_close_connection() from the innodb background thread,
> >       it's not needed after 87775402cd0c and was failing the assert with
> >       current_thd==0
> 
> > @@ -8512,7 +8512,11 @@ 
> > MYSQL_BIN_LOG::trx_group_commit_leader(group_commit_entry *leader)
> >      ++num_commits;
> >      if (current->cache_mngr->using_xa && likely(!current->error) &&
> >          DBUG_EVALUATE_IF("skip_commit_ordered", 0, 1))
> > +    {
> > +      mysql_mutex_lock(&current->thd->LOCK_thd_data);
> >        run_commit_ordered(current->thd, current->all);
> > +      mysql_mutex_unlock(&current->thd->LOCK_thd_data);
> > +    }
> 
> This seems _really_ expensive :-(
> This code runs for every transaction commit in replication, and it
> runs under a global LOCK_commit_ordered which needs to be held for as
> short as possible.

Monty asserted many times that taking an uncontended mutex is
cheap. Did you think it's expensive or did you benchmark that?

> So what is the actual reason for this change?

This is just a protocol for accessing mostly-read-only thread-local
variable. Either we can protect all accesses with a mutex. Which will
heavily penalize the thread owner of the variable.

This commit enforced a different protocol: the owner can read the value
without a mutex, and only take the mutex when modifying it (which is
infequent). A different thread can only read the variable and only with
a mutex (this is also very infequent).

I don't remember what code was violating it, likely something
wsrep-specific.

> Note that at this point, the group commit leader thread _is_
> effectively acting as if it was each of the participating threads, the
> other threads are blocked from running during commit_ordered.

I'm not sure it's good enough. Practically it might work, but if one
thread has modified some variable under a mutex, you cannot simply read
it from another thread without a mutex, there's no guarantee that
another thread will see the new value. You'll need an appropriate memory
barrier for that.

> If there is some reason, then the correct fix could be to set
> current_thd for the duration of run_commit_ordered, which will satisfy
> the assertion and presumably the actual bug?

Setting current_thd is cheating, it will trick the assert, but won't fix
the underlying problem as outlined above.

Regards,
Sergei
Chief Architect, MariaDB Server
and secur...@mariadb.org
_______________________________________________
developers mailing list -- developers@lists.mariadb.org
To unsubscribe send an email to developers-le...@lists.mariadb.org

Reply via email to