> andrei.el...@pp.inet.fi writes: > >>> Now consider the second case, where T_1 only sets rgi->killed_for_retry >>> _after_ T_2 has passed the above code point: >>> >>> 1. T_2 acquires some lock inside InnoDB >>> >>> 2. T_2 passes the above code point with the check for rgi->killed_for_retry, >>> it calls mark_start_commit() and starts the commit process. >>> >>> 3. T_1 only now gets a lock conflict with T_2, it calls >>> thd_rpl_deadlock_check(), sets rgi->killed_for_retry, schedules a task for >>> the manager thread to kill T_2 >>> >>> 4. T_1 goes to wait on the lock. >>> >>> In this situation, the lock in InnoDB was still held by T_2 when it ran >>> mark_start_commit(). >> >> Hmm, this is unclear. I thought you had a case in which T_2 >> releases an innodb stats lock at the end of its current statement. >> And T_1 would try to acquire the lock within that timeframe. > > I was describing two different cases, "first case" and "second case". > > The "first case" is the case where the bug occurred, and the one my patch > fixes. > It is just as you describe, T_2 releases an innodb stats lock at the end of > its current statement. > > The above "second case" describes a different scenario, trying to explain > why this scenario does not cause a bug. This case differs from the "first > case" > in that the lock taken by T_2 is _not_ released at the end of its current > statement, it is only released at commit/rollback. > > Sorry if this was not clear. > >> Am I confused in the above? It'd be great to have a T_2 worker stack >> at time of the lock is granted to it, but it's not provided.. > > I have included below a stack trace from my notes, where an insert goes to > take a dict stats lock. > >> Obvisoulsy under this perception T_2 would execute mark_start_commit() >> after the lock is released. > > For the scenario where T_2 executes mark_start_commit() after the lock is > released, and the bug occurred, you should look at the "first case", > described in points 1-8. > > The "second case" that you found unclear you can just ignore.
Got it. Thanks for spelling out! I see I should've asked specifically on the first case. I meant the following. What if p.4 lock release by T_2 happened within p.2, and before T_2::rgi->killed_for_retry is changed. Something like 1. T_2 gets a lock inside InnoDB, on the dict_stats table in this case. 2. T_1 requests a conflicting lock on dict_stats. It calls thd_rpl_deadlock_check() ... + 4 ... which sets rgi->killed_for_retry and schedules a task for the manager thread to kill T_2. But the manager thread does not service the request yet. - 3. T_1 goes to wait on the lock. + 3. T_1 is granted the lock immediately. - 4. T_2 later releases the lock on dict_stats - this is a special case, normally DML locks are held until COMMIT. But I see now that the step 4 relocation is impossible due to Innodb locking protocol. So that proves T_2 can't miss to find its killing status at mark_start_commit(). > that was to explain what happens in a different scenario, where T_2 does not > release its lock until the transaction is rolled back. In this case we avoid > starting D_3 too early, because T_2 does unmark_start_commit() before > rolling back and releasing its lock, so there is no bug (with or without the > patch). > >> Please clear it out. > > I hope this helps. No more question, as we're done :-). Thanks for this piece of analysis and work! Andrei > > - Kristian. _______________________________________________ developers mailing list -- developers@lists.mariadb.org To unsubscribe send an email to developers-le...@lists.mariadb.org