I've pushed to 10.0 and 10.1 the change I described: with parallel replication, Seconds_Behind_Master is now updated only after a transaction commits. (I have not yet changed the behaviour in the non-parallel case.)
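For the docs, the difference between the two semantics can be illustrated with a small sketch. This is not the server code, only a toy model: the names Event, lag_on_read and lag_on_commit are made up for illustration, timestamps are plain integer seconds, and clock differences between master and slave are ignored.

```python
from dataclasses import dataclass


@dataclass
class Event:
    master_ts: int   # timestamp assigned on the master, in seconds
    is_commit: bool  # True for the event that ends a transaction


def lag_on_read(events_read, now):
    """Read-time semantics: lag follows the last event read from the relay log."""
    if not events_read:
        return None
    return now - events_read[-1].master_ts


def lag_on_commit(events_committed, now):
    """Commit-time semantics: lag follows the last transaction actually committed."""
    commits = [e for e in events_committed if e.is_commit]
    if not commits:
        return None
    return now - commits[-1].master_ts


# Hypothetical scenario: three single-event transactions from the master.
events = [Event(100, True), Event(110, True), Event(120, True)]
now = 125

# The SQL thread has read and queued all three, but the worker threads
# have only committed the first one so far.
read_lag = lag_on_read(events, now)
commit_lag = lag_on_commit(events[:1], now)
print(read_lag, commit_lag)
```

In this scenario the read-based value looks small while the commit-based value still covers the two queued transactions, which is the queuing delay users were missing.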
Ian, do you have enough information from this thread that you could update the docs in the knowledgebase accordingly?

 - Kristian.

Ian Gilfillan <[email protected]> writes:

> From a user's perspective, I like the idea of introducing the change
> for both parallel and non-parallel in 10.1.
>
> On 15/10/2015 08:16, Kristian Nielsen wrote:
>> It was brought to my attention an issue with parallel replication and the
>> Seconds_Behind_Master field of SHOW SLAVE STATUS. I have a possible patch
>> for this, but I wanted to discuss it on the list, as it changes semantics
>> compared to the non-parallel case.
>>
>> Each binlog event contains a timestamp (**) of when the event was created on
>> the master. Whenever the slave SQL thread reads an event from the relay log,
>> it updates the value of Seconds_Behind_Master to the difference between the
>> slave's current time and the event's timestamp.
>>
>> Now in parallel replication, the SQL thread can read a large number of
>> events from the relay log and queue them in-memory for the worker threads.
>> So a small value of Seconds_Behind_Master means only that recent events have
>> been queued - it might still be a long time before the worker threads have
>> had time to actually execute all the queued events. Apparently the problem
>> is (justified) user confusion about this queuing delay not being reflected
>> in Seconds_Behind_Master.
>>
>> The same problem actually exists in the non-parallel case. In case of a
>> large transaction, the Seconds_Behind_Master can be small even though there
>> is still a large amount of execution time remaining for the transaction to
>> complete on the slave. However, in the non-parallel case, at most one
>> transaction can be involved. In the parallel case, the problem is amplified
>> by the potential of thousands of queued transactions awaiting execution.
>>
>> So how to solve it? Attached is a patch that implements one possible
>> solution: the Seconds_Behind_Master is only updated after a transaction
>> commits, with the timestamp of the commit events. This seems more intuitive
>> anyway. But it does introduce a semantic difference between the non-parallel
>> and parallel behaviour for Seconds_Behind_Master. The value will in general
>> be larger on a parallel slave than on a non-parallel slave, for the same
>> actual slave lag.
>>
>> Monty suggested changing the behaviour also for non-parallel mode - letting
>> Seconds_Behind_Master reflect only events actually committed, not just read
>> from the relay log. This would introduce an incompatible behaviour for
>> Seconds_Behind_Master, but could perhaps be done for 10.1, if desired. Doing
>> it in stable 10.0 would be more drastic.
>>
>> So any opinions on this?
>>
>> - Should Seconds_Behind_Master be changed as per above in parallel
>> replication (from 10.0 on)?
>>
>> - If not, any suggestion for another semantics for Seconds_Behind_Master in
>> parallel replication?
>>
>> - If so, should the change to Seconds_Behind_Master also be done in the
>> non-parallel case in 10.1? What about 10.0?
>>
>> - Any comments on the patch?
>>
>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~maria-developers
> Post to : [email protected]
> Unsubscribe : https://launchpad.net/~maria-developers
> More help : https://help.launchpad.net/ListHelp

