[MariaDB developers] Re: [Commits] cb06612a9da: MDEV-31655: Parallel replication deadlock victim preference code errorneously removed

Marko Mäkelä via developers Thu, 03 Aug 2023 03:11:27 -0700

Hi Kristian,

On Tue, Jul 11, 2023 at 1:35 AM Kristian Nielsen
<kniel...@knielsen-hq.org> wrote:
>
> Hi Marko,
>
> I'd like to ask you to review the below patch for MDEV-31655.
>
>   https://jira.mariadb.org/browse/MDEV-31655
>   
> https://github.com/MariaDB/server/commit/cb06612a9da09a7981ada84768793f2ff3ef842c


I am sorry for the delay; last month I was mostly on vacation.

I sent some minor comments that apply to the InnoDB code changes.
I think that this needs to be reviewed by the replication team as well.
Whoever reviews that should pay attention to the stability of the tests.
We have a few replication tests that fail rather often.

> This patch restores code in InnoDB that was originally part of the parallel
> replication implementation. It makes InnoDB choose the right victim when two
> transactions in parallel replication deadlock with each other. The code was
> erroneously removed as part of the VATS work in InnoDB. I saw that you later
> deprecated and eventually removed the VATS code.

As far as I can tell, innodb_lock_schedule_algorithm=VATS consistently
reproduced debug assertion failures during stress testing. It was
finally removed from 10.6. We might introduce something similar later,
provided that it passes tests. In MySQL 8.0.20,
https://dev.mysql.com/worklog/task/?id=13468 reimplemented this as
CATS (Contention-Aware Transaction Scheduling).

> I saw that the InnoDB deadlock code is substantially changed in 10.6, so I
> will need to rework the patch in that version. But I wanted to ask your
> opinion about this 10.4 version first.

I must trust your judgement here; I have no fundamental objection. For
better testing of this, I think that we really need a 10.6 based
branch that contains both this and MDEV-31482.

> In the patch, note in particular the modification to has_higher_priority(),
> which I think has the effect of choosing which of two waiting transactions
> to grant a lock to. I don't remember adding this myself, so it might have
> been added as part of VATS work. But it sounds appropriate; if T1 and T2 are
> both waiting for a lock, there is no point in granting it to T2 over T1, as
> then T2 will just be immediately killed and rolled back by replication.

Because innodb_lock_schedule_algorithm=VATS is known to be broken even
without any involvement of replication, I do not think that we have to
care too much about it.

Marko
-- 
Marko Mäkelä, Lead Developer InnoDB
MariaDB plc
_______________________________________________
developers mailing list -- developers@lists.mariadb.org
To unsubscribe send an email to developers-le...@lists.mariadb.org

[MariaDB developers] Re: [Commits] cb06612a9da: MDEV-31655: Parallel replication deadlock victim preference code errorneously removed

Reply via email to