Hi Serg,

I came upon this commit by you:

> commit 6b685ea7b0776430d45b095cb4be3ef0739a3c04
> Author: Sergei Golubchik <s...@mariadb.org>
> Date:   Wed Sep 28 18:55:15 2022 +0200
> 
>     correctness assert
>     
>     thd_get_ha_data() can be used without a lock, but only from the
>     current thd thread, when calling from anoher thread it *must*
>     be protected by thd->LOCK_thd_data
>     
>     * fix group commit code to take thd->LOCK_thd_data
>     * remove innobase_close_connection() from the innodb background thread,
>       it's not needed after 87775402cd0c and was failing the assert with
>       current_thd==0

> @@ -8512,7 +8512,11 @@ 
> MYSQL_BIN_LOG::trx_group_commit_leader(group_commit_entry *leader)
>      ++num_commits;
>      if (current->cache_mngr->using_xa && likely(!current->error) &&
>          DBUG_EVALUATE_IF("skip_commit_ordered", 0, 1))
> +    {
> +      mysql_mutex_lock(&current->thd->LOCK_thd_data);
>        run_commit_ordered(current->thd, current->all);
> +      mysql_mutex_unlock(&current->thd->LOCK_thd_data);
> +    }

This seems _really_ expensive :-(
This code runs for every transaction commit in replication, and it runs
under a global LOCK_commit_ordered which needs to be held for as short as
possible.

The commit message doesn't mention anything about what goes wrong during the
group commit code. And the patch doesn't have any test case that shows the
problem, it just adds an assertion that will trigger during the group
commit.

So what is the actual reason for this change? Note that at this point, the
group commit leader thread _is_ effectively acting as if it was each of the
participating threads, the other threads are blocked from running during
commit_ordered.

If it is just so that we can have this assertion, then that needs to be
rolled back. We _really_ don't want to take/release N mutexes in release
builds during LOCK_commit_ordered, just to have an assertion in debug
builds.

If there is some reason, then the correct fix could be to set current_thd
for the duration of run_commit_ordered, which will satisfy the assertion and
presumably the actual bug? Well, it depends _what_ the actual reason/bug is.

 - Kristian.
_______________________________________________
developers mailing list -- developers@lists.mariadb.org
To unsubscribe send an email to developers-le...@lists.mariadb.org

Reply via email to