Great! Thank you! I've just found this issue at Jira. Will try to understand more. What is strange - sometimes it synchronizes, but mostly - don't.
It might be related to the last version of MariaDB 10.1.13 But, I remember, that I had some similar issue last summer (there was MariaDB 10.1.8) - but at that time, simple restart helped and I forgot about an issue. 2016-04-14 11:28 GMT+03:00 Andrew Garner <[email protected]>: > Sorry for the noise. > > XtraBackup does flush binary logs and does copy the last binary log > into the backup when gtids are detected (i.e. always on mariadb). > That's what's rotating your logs and it does only copy the last log > (post-rotate). I don't think that's configurable at all. > > I was able to reproduce this in my environment with identical results > to yours. MariaDB 10.1.13 / Xtrabackup 2.3.4 on Debian 8. I just > started a busy loop inserting a row constantly into a table and forced > SST: > > [Note] Recovering after a crash using /var/log/mysql/mysql-bin > [ERROR] Binlog file '/var/log/mysql/mysql-bin.000149' not found in > binlog index, needed for recovery. Aborting. > [ERROR] Crash recovery failed. Either correct the problem (if it's, > for example, out of memory error and restart, or delete (or rename) > binary log and start mysqld with > --tc-heuristic-recover={commit|rollback} > [ERROR] Can't init tc log > ERROR] Aborting > > Actual binlog changes each time while i'm writing to the cluster. > > I did find that commenting out the bits from wsrep_sst_xtrabackup-v2 > that copies the binlog back to /var/log/mysql/ avoids this error - I > guess the presence of a binlog here starts the recovery process. I'm > not sure what the ramifications are of skipping that recovery. > > This issue looks like it's already reported here: > https://jira.mariadb.org/browse/MDEV-9423 > > I will add my comments there and hope we can find a solution. > > > On Wed, Apr 13, 2016 at 11:50 PM, Igor Mazur <[email protected]> > wrote: > > Sorry, I've forgotten to say that I use xtrabackup. > > > > 2016-04-14 2:04 GMT+03:00 Andrew Garner <[email protected]>: > >> > >> What SST method are you using? Based on this I expect you are using > rsync. > >> > >> I imagine someone knows more detailed answers here, but from my > >> understanding: > >> > >> Internally the bin/wsrep_sst_rsync script indicates it needs "flush > >> tables" to the server and the server runs this "sst_flush_tables()" > >> function that effectively runs: > >> > >> FLUSH TABLES WITH READ LOCK > >> + reload_acl_and_cache(thd, REFRESH_ENGINE_LOG | REFRESH_BINARY_LOG, > ...) > >> > >> Which I think is equivalent to FLUSH ENGINE LOGS / FLUSH BINARY LOGS. > >> I think around this point it also sets innodb_disallow_writes to stop > >> InnoDB's background I/O (so rsync can be consistent). > >> > >> At that point the bin/wsrep_sst_rsync script graps the last > >> $BINLOG_N_FILES binary logs, which I believe is hardcoded to '1' and > >> seems to be evaluated after this flush condition. But internally I > >> guess InnoDB can have a prepared transaction related to a previous > >> binlog. > >> > >> I've wrestled with similar problems writing backup recovery routines > >> for MariaDB + snapshot backups (typically via LVM snapshots), where I > >> typically have to go 2 logs back and it seems to eliminate this kind > >> of failure. You might be able to bump up BINLOG_N_FILES in > >> /usr/bin/wsrep_sst_rsync (or wherever it is located) and see if that > >> resolves your race condition. > >> > >> Alternatively I think percona-xtrabackup has solved this problem[1] a > >> while back as well, so a different wsrep-sst-method may be an option. > >> Although xtrabackup has its own problems - in particularly I've found > >> that xtrabackup currently doesn't handle lost+found directories too > >> well[2], so you either have to avoid having the datadir directly under > >> the mountpoint or use some other filesystem (such as XFS) that doesn't > >> create lost+found. I suppose if you're using encryption, xtrabackup > >> is also not currently an option. :) > >> > >> ~Andrew > >> > >> [1] https://bugs.launchpad.net/percona-xtrabackup/+bug/1254227 > >> [2] https://bugs.launchpad.net/percona-xtrabackup/+bug/1272329 > >> > >> On Wed, Apr 13, 2016 at 4:27 PM, Igor Mazur <[email protected]> > wrote: > >> > Hi, > >> > > >> > Almost everytime when I join node to Galera cluster (MariaDB 10.1.13) > >> > I'm > >> > getting an error after SST: > >> > > >> > 2016-04-13 21:09:37 139932281419712 [Note] Recovering after a crash > >> > using > >> > /var/log/mysql/mariadb-bin > >> > 2016-04-13 21:09:37 139932281419712 [ERROR] Binlog file > >> > '/var/log/mysql/mariadb-bin.000023' not found in binlog index, needed > >> > for > >> > recovery. Aborting. > >> > 2016-04-13 21:09:37 139932281419712 [ERROR] Crash recovery failed. > >> > Either > >> > correct the problem (if it's, for example, out of memory error) and > >> > restart, > >> > or delete (or rename) binary log and start mysqld with > >> > --tc-heuristic-recover={commit|rollback} > >> > 2016-04-13 21:09:37 139932281419712 [ERROR] Can't init tc log > >> > 2016-04-13 21:09:37 139932281419712 [ERROR] Aborting > >> > > >> > As I see, almost every SST rotates a log file and downloads last (in > >> > example > >> > above I have mariadb-bin.000024), and not previous. But for recovering > >> > it > >> > needs previous. > >> > Sometimes I'm lucky and it restores. > >> > > >> > Anyone knows why it rotates a log and is it possible to force it to > load > >> > more than last log file? > >> > > >> > Igor Mazur > >> > > >> > _______________________________________________ > >> > Mailing list: https://launchpad.net/~maria-discuss > >> > Post to : [email protected] > >> > Unsubscribe : https://launchpad.net/~maria-discuss > >> > More help : https://help.launchpad.net/ListHelp > >> > > > > > >
_______________________________________________ Mailing list: https://launchpad.net/~maria-discuss Post to : [email protected] Unsubscribe : https://launchpad.net/~maria-discuss More help : https://help.launchpad.net/ListHelp

