Hi Sergei, I have implemented PlanE as agreed on branch bb-10.2-MDEV-25114-planE-galera and mostly regression testing looks promising. However, I have problems with MDL-locks. For example test case galera.galera_toi_lock_exclusive hangs and I have not yet found out why. I will ask help from Seppo.
R: Jan On Fri, Oct 15, 2021 at 11:36 AM Sergei Golubchik <s...@mariadb.org> wrote: > Hi, Jan! > > On Oct 15, Jan Lindström wrote: > > Few questions: > > > > (1) Is this review for a full patch or just problems on > > wsrep_abort_transaction ? > > a full patch > > > (2) In case at wsrep_abort_transaction we do not have a transaction idea > is > > that we do not anymore want to enter InnoDB i.e. innobase_kill_query, > > that is the reason we set MUST_ABORT to wsrep_conflict_state so that no > > more mutexes from InnoDB is acquired. > > right, sorry. I thought I've removed too much from > wsrep_abort_transaction. > > > (3) In wsrep_close_connections code used LOCK_thread_lock, is that > > enough to protect THDs on list from concurrent disconnect and delete? > > (I added locking for THD::LOCK_thd_data but I was not sure is it > > necessary) > > Isn't it a separate issue? > > Anyway, as far as I understand, LOCK_thread_count protects the list of > threads, a THD cannot be added to or deleted from the list. > > but a THD can be disconnected or killed and it'll wait somewhere at the > very end of the destructor to remove itself from the list. > So, yes, you need to take LOCK_thd_data or LOCK_thd_kill too. > > > (4) Do we still keep manual KILL as TOI ? > > Not in my patch, no. > > > (5) What I do with thr_lock.c there we pass victim THD pointer to > > wsrep_abort_thd and it is as you pointed out mostly unprotected on > > that code, LOCK TABLE is TOI > > It's fine. Same logic as in wsrep_innobase_kill_one_trx(). > You kill a victim thd that owns thr locks. This THD cannot just > disappear, it has to release those locks first. And you own the > lock->mutex. So the THD isn't going anywhere until you return from > thr_lock(). > > > (6) Please note that fix will not be ready 19th of October as we again > > change critical path, Ramesh will run QA with pquery > > sure > > > On Thu, Oct 14, 2021 at 9:49 PM Sergei Golubchik <s...@mariadb.org> > wrote: > > > Hi, Jan! > > > > > > Here's an idea of the fix: > > > > > > Let's always use the KILL mutex locking order, that is > > > > > > victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex > > > > > > For this we need to fix wsrep_abort_transaction(), which is called > from the > > > server, and wsrep_innobase_kill_one_trx(), which is called from BF > > > thread. > > > > > > wsrep_abort_transaction() can be fixed by not invoking > > > wsrep_innobase_kill_one_trx() and always using KILL code path (that is > > > wsrep_thd_awake) and forcing rollback after the kill. > > > > > > wsrep_innobase_kill_one_trx() can be fixed by not locking LOCK_thd_data > > > at all, just don't lock it. We know that the victim waits on a lock > > > inside InnoDB and we've locked trx mutex and lock_sys mutex. The victim > > > cannot go away, cannot modify its data, it cannot do anything. So, > > > LOCK_thd_data doesn't seem to be necessary at that point. > > > > > > I've attached a demo patch. It compiles, but I didn't try to run it, > > > it's only to show the idea, not a working fix (I already suspect I > > > removed too much from wsrep_abort_transaction()). Note it's the patch > > > for 10.2 at the commit 29bbcac0ee8^ - that is one commit before my fix. > > > > > > On Oct 12, Jan Lindström wrote: > > > > Hi Sergei, > > > > > > > > Update on wsrep_close_connections problem. My suggestion to fix this > > > issue > > > > is on > > > > > > > > https://github.com/MariaDB/server/commit/99cbe03a44cc95e6f548550df51e7201ebea3b9d > > > > > > > > If you have a better solution, please advise. > > > > > > > > R: Jan > > > > Regards, > Sergei > VP of MariaDB Server Engineering > and secur...@mariadb.org >
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp