Hello Kristian,
See the attached snapshot of slave threads and tokudb locks.  Thread 15 is
waiting for a tokudb lock held by thread 16, which is waiting for a tokudb
lock held by thread 14.  Thread 14 is waiting for a prior transaction to
complete, presumably either thread 15 or 16.  So, we have a deadlock that
tokudb cannot detect because the ordering constraint is not available to
tokudb.  I assume that the optimistic scheduler killed thread 16, but since
tokudb does not implement the kill_query function, the deadlock is only
resolved when the tokudb lock timer expires.
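
To illustrate why tokudb alone cannot see this deadlock, here is a minimal
Python sketch of a wait-for graph.  The lock-wait edges come from the
lock_waits rows in the attached snapshot.  The commit-order edge (thread 14
waiting on thread 15) is an assumption, since the prior transaction could
also be thread 16.  find_cycle is a hypothetical helper for illustration,
not part of tokudb or the server.

```python
# Wait-for edges taken from the lock_waits snapshot:
# thread 15 waits on thread 16, thread 16 waits on thread 14.
lock_waits = {15: [16], 16: [14]}

# ASSUMPTION: thread 14 waits for thread 15 to commit first. This edge is
# known to the replication scheduler but is not visible to tokudb.
commit_order_waits = {14: [15]}


def find_cycle(*edge_sets):
    """Return one cycle in the merged wait-for graph as a list, or None."""
    graph = {}
    for edges in edge_sets:
        for waiter, holders in edges.items():
            graph.setdefault(waiter, []).extend(holders)

    def visit(node, path):
        # Reaching a node that is already on the current path closes a cycle.
        if node in path:
            return path[path.index(node):]
        for holder in graph.get(node, []):
            cycle = visit(holder, path + [node])
            if cycle:
                return cycle
        return None

    for start in graph:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


print(find_cycle(lock_waits))                      # tokudb's view only
print(find_cycle(lock_waits, commit_order_waits))  # with the ordering edge
```

With only the storage-engine edges there is no cycle; adding the ordering
edge closes it.  If the prior transaction were thread 16 instead, the cycle
would be shorter but still present.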

On Mon, Aug 15, 2016 at 8:16 AM, Rich Prohaska <[email protected]> wrote:

> Hello Kristian,
> The simplest kill_query implementation for tokudb would just signal all of
> the pending lock requests' condition variables.  This would cause the
> killed callback to be called.  A performance refinement, if necessary,
> would allow thread A (executing the kill_query function) to identify and
> signal a condition variable for a blocked thread B.
>
> On Mon, Aug 15, 2016 at 5:42 AM, Kristian Nielsen <
> [email protected]> wrote:
>
>> Rich Prohaska <[email protected]> writes:
>>
>> > tokudb lock timeouts are resolving the replication stall. unfortunately,
>> > the tokudb lock timeout is 4 seconds, so the throughput is almost zero.
>>
>> Yes. Sorry for not making it clear that my proof-of-concept patch was
>> incomplete...
>>
>> >> > I suspect that the poor slave replication performance for optimistic
>> >> > replication occurs because TokuDB does not implement the kill_query
>> >> > handlerton function.  kill_handlerton gets called to resolve lock wait
>>
>> >> Possibly, but I'm not sure it's that important. The kill will be effective
>> >> as soon as the wait is over.
>>
>> No, you're absolutely right, after testing (and thinking) some more, I
>> realise that indeed the kill_query functionality is important.
>>
>> A possible scenario is, given transactions T1, T2, and T3 in that order:
>>
>> T3 acquires a lock on row R3, T2 similarly acquires R2.
>> Now T3 tries to acquire R2, but has to wait for T2 to release it.
>> Later T1 tries to acquire R3, also has to wait.
>>
>> At this point, we kill T3, since it is holding a lock (R3) needed by an
>> earlier transaction T1. However, T3 will not notice the kill until its own
>> wait (on R2 held by T2) times out. T2 cannot release the lock because it is
>> waiting for T1 to commit first. So we have a deadlock :-/
>>
>> With InnoDB, the kill causes T3 to wake up immediately and roll back, so
>> that T1 can proceed without much delay.
>>
>> Ok, so something more is needed here. I see there is a killed_callback()
>> which seems to check for the kill, so I'm hoping that can be used with a
>> suitable wakeup of the offending lock_request (or all requests,
>> perhaps). But as I'm completely new to TokuDB, I still need some more time
>> to read the code and try to understand how everything fits together...
>>
>> > TokuFT implements pessimistic locking and 2 phase locking algorithms. This
>> > wiki describes locking and concurrency in a little more detail:
>> > https://github.com/percona/tokudb-engine/wiki/Transactions-and-Concurrency.
>>
>> Thanks, this was quite helpful.
>>
>> > Yes, I think they are false positives since the thd_report_wait_for API is
>> > called but it does NOT call the THD::awake function.
>>
>> Ah. Then it's probably normal, caused by the group-commit optimisation. In
>> conservative mode, if two transactions T1 and T2 did not group commit on the
>> master, then they cannot be started in parallel on the slave. But T2 can
>> start as soon as T1 has reached COMMIT. Thus, if T2 happens to conflict with
>> T1, there is a small window where T2 may need to wait on T1 until T1 has
>> completed its commit.
>>
>> Thanks,
>>
>>  - Kristian.
>>
>
>
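
The "signal all pending lock requests" idea quoted above can be sketched in
Python as follows.  This is only an illustration under stated assumptions:
LockRequest, pending, and kill_query here are hypothetical stand-ins, not
the TokuFT classes or the real handlerton hook.  The 4 second value is the
lock timeout mentioned in the thread.

```python
import threading
import time

LOCK_TIMEOUT = 4.0  # seconds; the tokudb lock timeout mentioned in the thread


class LockRequest:
    """Hypothetical stand-in for a pending lock request (not the TokuFT class)."""

    def __init__(self, killed_callback):
        self.cond = threading.Condition()
        self.granted = False
        self.killed_callback = killed_callback

    def wait(self):
        """Block until granted, killed, or timed out; return the outcome."""
        deadline = time.monotonic() + LOCK_TIMEOUT
        with self.cond:
            while not self.granted:
                # Re-check the kill flag on every wakeup.
                if self.killed_callback():
                    return "killed"
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return "timeout"
                self.cond.wait(remaining)
            return "granted"


pending = []                # all pending lock requests
killed = threading.Event()  # stands in for the per-thread kill flag


def kill_query():
    """Simplest variant: set the kill flag, then wake every pending waiter."""
    killed.set()
    for request in list(pending):
        with request.cond:
            request.cond.notify_all()


request = LockRequest(killed.is_set)
pending.append(request)
result = []
waiter = threading.Thread(target=lambda: result.append(request.wait()))
start = time.monotonic()
waiter.start()
time.sleep(0.1)             # give the waiter time to block
kill_query()
waiter.join()
elapsed = time.monotonic() - start
print(result[0], "after", round(elapsed, 2), "seconds")
```

The waiter returns as soon as it is signalled instead of sleeping for the
full lock timeout.  The refinement described above would replace the loop
over all pending requests with a lookup of the one request that belongs to
the killed thread.
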
Id  User         Host       db                  Command  Time  State                                          Info              Progress
13  system user             NULL                Connect  62    Waiting for master to send event               NULL              0.000
14  system user             NULL                Connect  15    Waiting for prior transaction to commit        NULL              0.000
15  system user             NULL                Connect  15    Delete_rows_log_event::find_row(-1)            NULL              0.000
16  system user             NULL                Killed   15    Delete_rows_log_event::find_row(-1)            NULL              0.000
17  system user             NULL                Connect  62    Waiting for room in worker thread event queue  NULL              0.000
21  root         localhost  information_schema  Query    0     init                                           show processlist  0.000

trx_id    trx_mysql_thread_id  trx_time
41844148  15                   16
41844151  14                   16
41844154  16                   16

locks_trx_id  locks_mysql_thread_id  locks_dname               locks_key_left      locks_key_right     locks_table_schema  locks_table_name  locks_table_dictionary_name
41844148      15                     ./sbtest/sbtest1-main     00f1010000          00f1010000          sbtest              sbtest1           main
41844148      15                     ./sbtest/sbtest1-main     00f1010000          00f1010000          sbtest              sbtest1           main
41844148      15                     ./sbtest/sbtest1-main     00f1010000          00f1010000          sbtest              sbtest1           main
41844148      15                     ./sbtest/sbtest1-main     00f7010000          00f7010000          sbtest              sbtest1           main
41844148      15                     ./sbtest/sbtest1-main     00f7010000          00f7010000          sbtest              sbtest1           main
41844148      15                     ./sbtest/sbtest1-key-k_1  00f2010000f1010000  00f2010000f1010000  sbtest              sbtest1           key-k_1
41844148      15                     ./sbtest/sbtest1-key-k_1  00f3010000f1010000  00f3010000f1010000  sbtest              sbtest1           key-k_1
41844151      14                     ./sbtest/sbtest1-main     00f8010000          00f8010000          sbtest              sbtest1           main
41844151      14                     ./sbtest/sbtest1-main     00f8010000          00f8010000          sbtest              sbtest1           main
41844151      14                     ./sbtest/sbtest1-main     00f0010000          00f0010000          sbtest              sbtest1           main
41844151      14                     ./sbtest/sbtest1-main     00f0010000          00f0010000          sbtest              sbtest1           main
41844151      14                     ./sbtest/sbtest1-key-k_1  00df010000f8010000  00df010000f8010000  sbtest              sbtest1           key-k_1
41844151      14                     ./sbtest/sbtest1-key-k_1  00e0010000f8010000  00e0010000f8010000  sbtest              sbtest1           key-k_1
41844154      16                     ./sbtest/sbtest1-main     00f3010000          00f3010000          sbtest              sbtest1           main
41844154      16                     ./sbtest/sbtest1-main     00f3010000          00f3010000          sbtest              sbtest1           main
41844154      16                     ./sbtest/sbtest1-main     00f9010000          00f9010000          sbtest              sbtest1           main
41844154      16                     ./sbtest/sbtest1-main     00f9010000          00f9010000          sbtest              sbtest1           main
41844154      16                     ./sbtest/sbtest1-key-k_1  00f5010000f3010000  00f5010000f3010000  sbtest              sbtest1           key-k_1
41844154      16                     ./sbtest/sbtest1-key-k_1  00f6010000f3010000  00f6010000f3010000  sbtest              sbtest1           key-k_1
41844154      16                     ./sbtest/sbtest1-key-k_1  00f2010000f9010000  00f2010000f9010000  sbtest              sbtest1           key-k_1
41844154      16                     ./sbtest/sbtest1-key-k_1  00f3010000f9010000  00f3010000f9010000  sbtest              sbtest1           key-k_1

requesting_trx_id  blocking_trx_id  lock_waits_dname       lock_waits_key_left  lock_waits_key_right  lock_waits_start_time  lock_waits_table_schema  lock_waits_table_name  lock_waits_table_dictionary_name
41844148           41844154         ./sbtest/sbtest1-main  00f9010000           00f9010000            1471275642868          sbtest                   sbtest1                main
41844154           41844151         ./sbtest/sbtest1-main  00f0010000           00f0010000            1471275642868          sbtest                   sbtest1                main
_______________________________________________
Mailing list: https://launchpad.net/~maria-developers
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp
