Kristian Nielsen via developers <developers@lists.mariadb.org> writes:

> (For in-order parallel replication itself, the deadlock detection problem is
> much simpler, because we _know_ that any T_(i+1) will eventually have to
> wait for T_(i). So a deadlock will occur if-and-only-if there is some T_j
> that waits for T_k, with j < k. But this is not true when considering
> transactions other than the in-order sequence within one replication
> domain_id).

It's good to consider these issues again. It's a pity we didn't have this
discussion in 10.0/10.1 when I implemented parallel replication, or later in
10.6 when you improved the InnoDB locking.

Maybe we could make things much simpler? If we expose the gtid_sub_id
directly, the InnoDB deadlock detector can just check whether the sub_id
increases while traversing the wait-for path. If it does, it can stop there,
declare a deadlock, and choose the transaction with the higher sub_id as the
victim.
We will have to roll back that transaction anyway at some point to preserve
commit order, and doing so now will break any cycle if it's there.
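
To make the idea concrete, here is a rough sketch of the traversal in Python.
It is only a toy model under stated assumptions, not InnoDB code: every
transaction belongs to the same replication domain and is identified by its
sub_id, each transaction waits for at most one other, and the names
find_victim() and waits_for are made up for the illustration.

```python
from typing import Dict, Optional


def find_victim(start: int, waits_for: Dict[int, int]) -> Optional[int]:
    """Walk the wait-for path from `start` and look for a sub_id increase.

    `waits_for` maps the sub_id of a waiting transaction to the sub_id of
    the transaction it waits for (hypothetical, simplified model: a single
    replication domain, at most one waited-for transaction per waiter).

    Returns the sub_id to roll back, or None if no increase is found.
    """
    seen = {start}
    waiter = start
    while waiter in waits_for:
        holder = waits_for[waiter]
        if holder > waiter:
            # An earlier transaction waits for a later one. The later one
            # must be rolled back at some point to preserve commit order,
            # so pick it as the victim and stop traversing.
            return holder
        if holder in seen:
            # Defensive guard against malformed input such as a self-wait.
            return None
        seen.add(holder)
        waiter = holder
    return None


# T3 waits for T2, which waits for T1: in commit order, no victim needed.
print(find_victim(3, {3: 2, 2: 1}))
# T3 waits for T1, but T1 waits for T2: T2 is chosen as the victim.
print(find_victim(3, {3: 1, 1: 2}))
```

Note that the sketch never has to complete a cycle before picking a victim,
which is the point of the proposal. How transactions outside the in-order
sequence would fit in is not modelled here.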

Then we get rid of trying to hint InnoDB which victim to prefer with
thd_deadlock_victim_preference(). And we don't need the
thd_rpl_deadlock_check() for every wait, which then has to queue a
background task to later kill the waited-for transaction.

This needs some more thought to be sure all the details are right. I think I
should still fix just the regression with the missing
thd_deadlock_victim_preference() for now, but it is something to think about
to make the code both simpler and more efficient.

>> Thank you for the explanation. Can you please also post it to
>> https://jira.mariadb.org/browse/MDEV-24948 for future reference?

I added some more explanation there, and ideas for how to mostly eliminate
the overhead.

>> It would also be useful if you could check the following changes in
>> MySQL 8.0 that are linked from MDEV-24948:

>> https://github.com/mysql/mysql-server/commit/30ead1f6966128cbcd32c7b6029ea2170aeef5f9

Interesting. This seems similar to
https://jira.mariadb.org/browse/MDEV-31840, which I discovered recently.
The MySQL code uses a function thd_report_lock_wait(), which looks very
similar to the MariaDB thd_rpl_deadlock_check().

>> https://github.com/mysql/mysql-server/commit/3859219875b62154b921e8c6078c751198071b9c

This is a large patch; I have not yet managed to read through it all. But from
reading some of the comments, it sounds like they move the deadlock
detection to be done asynchronously as a background task, rather than
synchronously whenever a wait is needed. This is quite interesting, and
something I've earlier idly speculated could reduce the cost of deadlock
detection.

 - Kristian.