Hello Kristian,

The simplest kill_query implementation for tokudb would just signal the condition variables of all pending lock requests. Each woken waiter would then call its killed callback. A performance refinement, if necessary, would allow thread A (executing the kill_query function) to identify and signal only the condition variable of the blocked thread B.
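To make the idea concrete, here is a rough standalone C++ sketch, not TokuFT code. LockRequest, LockManager, and every function name in it are hypothetical stand-ins that I made up for illustration; only the general shape (a condition variable per pending request, plus a callback that reports whether the query was killed) follows what is described above.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <list>
#include <mutex>
#include <thread>

// Hypothetical model of a pending lock request; NOT the real TokuFT class.
enum class WaitResult { Granted, Killed, TimedOut };

struct LockRequest {
    std::condition_variable cv;
    bool granted = false;                   // set when the lock is handed over
    std::function<bool()> killed_callback;  // true if the owning query was killed
};

class LockManager {
public:
    // Block until the request is granted, killed, or the timeout expires.
    WaitResult wait(LockRequest &req, std::chrono::milliseconds timeout) {
        assert(timeout >= std::chrono::milliseconds::zero());
        std::unique_lock<std::mutex> lk(mutex_);
        pending_.push_back(&req);
        const auto deadline = std::chrono::steady_clock::now() + timeout;
        WaitResult result = WaitResult::TimedOut;
        for (;;) {
            if (req.granted) { result = WaitResult::Granted; break; }
            if (req.killed_callback && req.killed_callback()) {
                result = WaitResult::Killed;
                break;
            }
            if (std::chrono::steady_clock::now() >= deadline) break;
            req.cv.wait_until(lk, deadline);  // woken by grant, kill, or timeout
        }
        pending_.remove(&req);
        return result;
    }

    // Hand the lock to a waiter.
    void grant(LockRequest &req) {
        std::lock_guard<std::mutex> lk(mutex_);
        req.granted = true;
        req.cv.notify_one();
    }

    // Simplest kill_query: wake every waiter; each re-checks its own callback.
    void kill_query_wake_all() {
        std::lock_guard<std::mutex> lk(mutex_);
        for (LockRequest *req : pending_) req->cv.notify_one();
    }

    // Refinement: thread A wakes only the request of the blocked thread B.
    void kill_query_wake_one(LockRequest &req) {
        std::lock_guard<std::mutex> lk(mutex_);
        req.cv.notify_one();
    }

    std::size_t pending_count() const {
        std::lock_guard<std::mutex> lk(mutex_);
        return pending_.size();
    }

private:
    mutable std::mutex mutex_;
    std::list<LockRequest *> pending_;
};

// Thread B blocks with a 4 second timeout; thread A (the caller) kills it.
WaitResult demo_kill_wakes_waiter() {
    LockManager mgr;
    std::atomic<bool> killed{false};
    LockRequest req;
    req.killed_callback = [&killed] { return killed.load(); };
    WaitResult result = WaitResult::TimedOut;
    std::thread b([&] { result = mgr.wait(req, std::chrono::seconds(4)); });
    // Wait until B is really blocked so that the wakeup path is exercised.
    while (mgr.pending_count() == 0)
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    killed.store(true);         // mark the query as killed first ...
    mgr.kill_query_wake_all();  // ... then wake the waiters
    b.join();
    return result;
}
```

In this sketch the kill flag is set before the wakeup and the notify is issued while holding the same mutex the waiter uses, so the waiter either sees the flag on its next check or is already blocked when the signal arrives. The real lock tree may need a different arrangement.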

On Mon, Aug 15, 2016 at 5:42 AM, Kristian Nielsen <[email protected]> wrote:

> Rich Prohaska <[email protected]> writes:
>
> > tokudb lock timeouts are resolving the replication stall. unfortunately,
> > the tokudb lock timeout is 4 seconds, so the throughput is almost zero.
>
> Yes. Sorry for not making it clear that my proof-of-concept patch was
> incomplete...
>
> >> > I suspect that the poor slave replication performance for optimistic
> >> > replication occurs because TokuDB does not implement the kill_query
> >> > handlerton function. kill_handlerton gets called to resolve lock wait
>
> >> Possibly, but I'm not sure it's that important. The kill will be effective
> >> as soon as the wait is over.
>
> No, you're absolutely right, after testing (and thinking) some more, I
> realise that indeed the kill_query functionality is important.
>
> A possible scenario is, given transactions T1, T2, and T3 in that order:
>
> T3 acquires a lock on row R3, T2 similarly acquires R2.
> Now T3 tries to acquire R2, but has to wait for T2 to release it.
> Later T1 tries to acquire R3, also has to wait.
>
> At this point, we kill T3, since it is holding a lock (R3) needed by an
> earlier transaction T1. However, T3 will not notice the kill until its own
> wait (on R2 held by T2) times out. T2 cannot release the lock because it is
> waiting for T1 to commit first. So we have a deadlock :-/
>
> With InnoDB, the kill causes T3 to wake up immediately and roll back, so
> that T1 can proceed without much delay.
>
> Ok, so something more is needed here. I see there is a killed_callback()
> which seems to check for the kill, so I'm hoping that can be used with a
> suitable wakeup of the offending lock_request (or all requests,
> perhaps). But as I'm completely new to TokuDB, I still need some more time
> to read the code and try to understand how everything fits together...
>
> > TokuFT implements pessimistic locking and 2 phase locking algorithms. This
> > wiki describes locking and concurrency in a little more detail:
> > https://github.com/percona/tokudb-engine/wiki/Transactions-and-Concurrency.
>
> Thanks, this was quite helpful.
>
> > Yes, I think they are false positives since the thd_report_wait_for API is
> > called but it does NOT call the THD::awake function.
>
> Ah. Then it's probably normal, caused by the group-commit optimisation. In
> conservative mode, if two transactions T1 and T2 did not group commit on the
> master, then cannot be started in parallel on the slave. But T2 can start as
> soon as T1 has reached COMMIT. Thus, if T2 happens to conflict with T1,
> there is a small window where T2 can need to wait on T1 until T1 has
> completed its commit.
>
> Thanks,
>
> - Kristian.

_______________________________________________
Mailing list: https://launchpad.net/~maria-developers
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp

