[MariaDB developers] Re: [MariaDB commits] [PATCH 1/6] MDEV-34696: do_gco_wait() completes too early on InnoDB dict stats updates

Andrei via developers Fri, 23 Aug 2024 10:01:17 -0700

> andrei.el...@pp.inet.fi writes:
>
>>> Now consider the second case, where T_1 only sets rgi->killed_for_retry
>>> _after_ T_2 has passed the above code point:
>>>
>>> 1. T_2 acquires some lock inside InnoDB
>>>
>>> 2. T_2 passes the above code point with the check for rgi->killed_for_retry,
>>> it calls mark_start_commit() and starts the commit process.
>>>
>>> 3. T_1 only now gets a lock conflict with T_2, it calls
>>> thd_rpl_deadlock_check(), sets rgi->killed_for_retry, schedules a task for
>>> the manager thread to kill T_2
>>>
>>> 4. T_1 goes to wait on the lock.
>>>
>>> In this situation, the lock in InnoDB was still held by T_2 when it ran
>>> mark_start_commit().
>>
>> Hmm, this is unclear. I thought you had a case in which T_2
>> releases an innodb stats lock at the end of its current statement.
>> And T_1 would try to acquire the lock within that timeframe.
>
> I was describing two different cases, "first case" and "second case".
>
> The "first case" is the case where the bug occurred, and the one my patch 
> fixes.
> It is just as you describe, T_2 releases an innodb stats lock at the end of
> its current statement.
>
> The above "second case" describes a different scenario, trying to explain
> why this scenario does not cause a bug. This case differs from the "first 
> case"
> in that the lock taken by T_2 is _not_ released at the end of its current
> statement, it is only released at commit/rollback.
>
> Sorry if this was not clear.
>
>> Am I confused in the above? It'd be great to have a T_2 worker stack 
>> at time of the lock is granted to it, but it's not provided..
>
> I have included below a stack trace from my notes, where an insert goes to
> take a dict stats lock.
>
>> Obvisoulsy under this perception T_2 would execute mark_start_commit()
>> after the lock is released.
>
> For the scenario where T_2 executes mark_start_commit() after the lock is
> released, and the bug occurred, you should look at the "first case",
> described in points 1-8.
>
> The "second case" that you found unclear you can just ignore.


Got it. Thanks for spelling out!

I see I should've asked specifically on the first case.
I meant the following. What if p.4 lock release by T_2 happened within
p.2, and before T_2::rgi->killed_for_retry is changed.
Something like

  1. T_2 gets a lock inside InnoDB, on the dict_stats table in this case.

  2. T_1 requests a conflicting lock on dict_stats. It calls
  thd_rpl_deadlock_check() ...

+ 4

  ... which sets rgi->killed_for_retry and schedules a
  task for the manager thread to kill T_2. But the manager thread does not
  service the request yet.

- 3. T_1 goes to wait on the lock.
+ 3. T_1 is granted the lock immediately.

- 4. T_2 later releases the lock on dict_stats - this is a special case,
  normally DML locks are held until COMMIT.

But I see now that the step 4 relocation is impossible due to Innodb
locking protocol.

So that proves T_2 can't miss to find its killing status at mark_start_commit().

> that was to explain what happens in a different scenario, where T_2 does not
> release its lock until the transaction is rolled back. In this case we avoid
> starting D_3 too early, because T_2 does unmark_start_commit() before
> rolling back and releasing its lock, so there is no bug (with or without the
> patch).
>
>> Please clear it out.
>
> I hope this helps.

No more question, as we're done :-).

Thanks for this piece of analysis and work!

Andrei

>
>  - Kristian.
_______________________________________________
developers mailing list -- developers@lists.mariadb.org
To unsubscribe send an email to developers-le...@lists.mariadb.org

[MariaDB developers] Re: [MariaDB commits] [PATCH 1/6] MDEV-34696: do_gco_wait() completes too early on InnoDB dict stats updates

Reply via email to