Hi Kristian, On Thu, Jun 23, 2016 at 3:39 AM, Kristian Nielsen <[email protected]> wrote:
> Nirbhay Choubey <[email protected]> writes: > > > While copying the last 2 binlog files would have solved this, I have > worked > > out > > a solution where the donor node waits for binlog checkpoint event for > last > > binlog > > file to get logged before proceeding with file transfer. > > > > http://lists.askmonty.org/pipermail/commits/2016-June/009483.html > > Urgh, please don't do this, seems there are multiple problems with this > patch (insufficient locking, introducing a new redundant wait mechanism, > comparing binlog file names rather than ids, ...). > > > By the way, I initially tried reusing > > is_xidlist_idle_nolock()/COND_xid_list to implement the > > waiting mechanism. But since binlog checkpoint events are written > > asynchronously after > > xid_count falls to 0, that did not work. So later came up with the above > > I think it should work if you follow the chained locking of LOCK_xid_list > and LOCK_log. First wait under LOCK_xid_list for the binlog_xid_count_list > to become empty. Then release LOCK_xid_list and take and immediately > release > LOCK_log. mark_xid_done() will hold onto LOCK_log until the checkpoint > event > has been written. > > Note that there is already a similar wait mechanism, used by RESET > MASTER. RESET MASTER also needs to wait for checkpoint events to be > completed before running, so we should reuse that mechanism. > Right. > > Also, it seems reasonable that FTWRL in general could wait for checkpoint > events so that other backup mechanisms similarly could avoid binlog files > changing during backup. So please fix this in FTWRL, in 10.2. (If you feel > you need to fix the galera bug in 10.1, you can implement it only for > galera > in 10.1). > That sound good to me. But, considering Percona's backup locks, it seems more logical to implement this in Backup locks instead, whenever they get ported/implemented in MariaDB. Also, in this particular case, the problem lies in reload_acl_and_cache(REFRESH_BINARY_LOG), (executed after FTWRL while preparing for SST) that rotates the binary log. So, FTWRL is not directly linked to this issue. And as you rightly pointed, I will refrain from altering FTWRL's behavior in 10.1 at least. > So in more detail, here is suggested way to fix: > > In FTWRL (somewhere near the end, after commits are blocked), wait for > checkpoint events to be written using a similar mechanism as RESET MASTER: > > if (mysql_bin_log.is_open()) > { > mysql_mutex_lock(&LOCK_xid_list); > for (;;) > { > if (binlog_xid_count_list.is_last(binlog_xid_count_list.head())) > break; > mysql_cond_wait(&COND_xid_list, &LOCK_xid_list); > } > mysql_mutex_unlock(&LOCK_xid_list); > /* > LOCK_xid_list and LOCK_log are chained, so the LOCK_log will only be > obtained after mark_xid_done() has written the last checkpoint event. > */ > mysql_mutex_lock(&LOCK_log); > mysql_mutex_unlock(&LOCK_log); > } > > Now, since FTWRL is a bit different from RESET MASTER, we need a couple > other changes: > > - Use mysql_cond_broadcast(&COND_xid_list) instead of mysql_cond_signal() > in mark_xid_done() (to allow multiple waiters). > > - The second (but not the first mysql_cond_broadcast() in mark_xid_done() > should be unconditional, so remove the if() here: > > if (unlikely(reset_master_pending)) > mysql_cond_signal(&COND_xid_list); > > - Also add mysql_cond_broadcast(&COND_xid_list) in two other places that > the binlog_xid_count_list is modified. One in MYSQL_BIN_LOG::open(): > > while ((b= binlog_xid_count_list.head()) && b->xid_count == 0) > my_free(binlog_xid_count_list.get()); > > And one in reset_logs(): > > my_free(binlog_xid_count_list.get()); > > This should make FTWRL wait for all pending binlog checkpoint events to be > written. And with commits blocked, no new checkpoints should become > pending. > > Does it seem reasonable to you? Let me know if some things are unclear or > if > you see any potential problems with it. > Yes, it worked. But, to solve this issue in 10.1, I have added this wait to REFRESH_BINARY_LOG (as explained above) only when the server is acting as a Galera node. > By the way, how to you intend to handle the case where RESET MASTER is run > during SST? I just checked, FTWRL does not seem to block RESET MASTER. Or > do > you have another mechanism to prevent RESET MASTER from running during SST? > Thinking more, you should be holding LOCK_log while copying the binlog > files (I'm guessing your not currently, right?) You are right. > This will block RESET > MASTER, I am now taking LOG_log during the duration of file transfer as protection against the above commands. > and it also makes the extra lock/unlock of LOCK_log above redundant. Not quite. The wait logic (that includes LOCK_log, as the snippet above) is to pause REFRESH_BINARY_LOG and an additional use of LOCK_log to block the RESET/ FLUSH commands while file transfer is in progress. > Also, FTWRL has really complex semantics. You should get Monty's opinion > (or > maybe Serg?) on whether there are any potentials for deadlocks to waiting > inside FTWRL for binlog checkpoints. > As explained above, FTWRL remains unchanged, but will still check if Monty/Serg can take a look at the fix. http://lists.askmonty.org/pipermail/commits/2016-June/009494.html Best, Nirbhay > - Kristian. >
_______________________________________________ Mailing list: https://launchpad.net/~maria-developers Post to : [email protected] Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp

