That second note is horribly misleading (note the bold):

-- orig --
- I was able to take a full physical backup that runs every morning from the VPC 1 (prod cluster) directly from S3 and restore it to the server where I can not restore backups of any of the prod nodes. After this successfully started, I ran mysqlcheck --all-databases (online) against this newly restored database and there was no corruption.

-- should be --
- I was able to take the full physical backup *from a slave whose master is a prod cluster node,* which runs every morning, from S3 to the server where I can not restore backups of any of the prod nodes. After this successfully started, I ran mysqlcheck --all-databases (online) against this newly restored database and there was no corruption.
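Roughly, the restore-and-verify sequence that note describes looks like this. This is only an illustrative sketch, not our actual tooling: the S3 path, the tarball name, and the DRY_RUN guard are placeholders I'm adding here for clarity.

```shell
#!/usr/bin/env bash
# Sketch of the restore-and-verify flow described above.
# The S3 path, tarball name, and DRY_RUN guard are placeholders.
set -euo pipefail

BACKUP_S3="s3://example-bucket/mariadb/latest.tar.gz"  # placeholder path
RESTORE_DIR="/home/jeff/restore"

run() {
    # With DRY_RUN=1 (the default) just print each step instead of running it.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run aws s3 cp "$BACKUP_S3" /tmp/backup.tar.gz
run mkdir -p "$RESTORE_DIR"
run tar -xzf /tmp/backup.tar.gz -C "$RESTORE_DIR"
run sudo service mariadb stop
run sudo rm -rf /var/lib/mysql/*                  # datadir must be empty for --copy-back
run sudo mariabackup --prepare --target-dir="$RESTORE_DIR"
run sudo mariabackup --copy-back --target-dir="$RESTORE_DIR"
run sudo chown -R mysql:mysql /var/lib/mysql
run sudo service mariadb start
run mysqlcheck --all-databases                    # the online check mentioned above
```

The point is that this slave-sourced backup goes through the exact same prepare/copy-back path that fails for backups taken on the prod cluster nodes themselves.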
Sorry, and thanks.

On Mon, Apr 27, 2020 at 2:38 PM Jeff Dyke <[email protected]> wrote:
> Hello again, peoples. I'm still at it, but have taken some time for higher priority projects and now have some follow-up data; not sure if I have any questions, we'll see.
>
> In the staging VPC, referred to above/below as VPC 2, I can restore any combination of tables, right down to selecting only mysql and performance_schema. In the prod VPC (VPC 1), I can not restore any combination of databases. So, as was thought before, I may have corruption in one of the tables, but...
>
> - I have checked all tables using innochecksum (offline) and aria_chk (offline), and there are no indications of corruption.
> - I was able to take a full physical backup that runs every morning from the VPC 1 (prod cluster) directly from S3 and restore it to the server where I can not restore backups of any of the prod nodes. After this successfully started, I ran mysqlcheck --all-databases (online) against this newly restored database and there was no corruption.
> - If I take the production node that I'm using for backup testing out of the cluster, run mysqlcheck --all-databases, and put it back in the cluster, no corrupt tables are reported.
>
> I would prefer not to build an entire new cluster, though I realize I could have done so by now, because I would really like to know why. With the backups and snapshots I can use the slave for, along with 3 cluster nodes, my data is safe and being written to daily.
>
> All of these databases were upgraded from 10.3 to 10.4 (in the last two months) according to the instructions on mariadb.com. I ran staging first, let it sit a couple of weeks, and then upgraded production; both are at the same exact release:
> Ver 10.4.12-MariaDB-1:10.4.12+maria~bionic-log for debian-linux-gnu on x86_64 (mariadb.org binary distribution)
>
> I've considered compiling a debug-enabled version of MariaDB, but that seems it would only be useful if it were on the Galera nodes, and I'm not too keen on doing that.
>
> Any and all thoughts are welcome. I didn't want to repeat any information, so new readers will have to read from the start of the thread, and I do indeed appreciate it.
>
> Jeff
>
> On Sat, Apr 18, 2020 at 1:22 PM Jeff Dyke <[email protected]> wrote:
>> Hi Karl, someone else just saw that as well; it was indeed a typo in the email. To answer your questions: the plugins are the same, sql_error_log and socket_plugin (which is no longer needed in 10.4 and was removed after the upgrade). Also, I can restore a new node from a staging backup in either VPC.
>>
>> Databases are a good question; no, I have not. Perhaps I should try a complete set of combinations, which may be a better use of time than bringing the cluster up to 5 and then seeing if I can take a physical backup of one of the new nodes.
>>
>> On Sat, Apr 18, 2020 at 1:11 PM Karl Erik Levik <[email protected]> wrote:
>>> Well, what I mean is, you're missing the "q" in "mysql", so I was thinking perhaps that command doesn't actually do what you think it does?
>>>
>>> If that's not the issue, then I guess you have to look again at your prod nodes and figure out what, if anything, is special/unusual about them. Do you have any special plugins enabled? Any special storage engines used that you don't have on staging? (Wouldn't think so, since only InnoDB is properly supported by Galera...) Do you have any crud in your datadir that might create problems for mariabackup?
>>>
>>> Have you tried doing a partial backup/restore instead, i.e. only some of the databases?
>>>
>>> Cheers,
>>> Karl
>>>
>>> On Sat, 18 Apr 2020 at 17:25, Jeff Dyke <[email protected]> wrote:
>>>> Thanks Karl, that is just there so everything under mysql is deleted and not the directory itself, as mariabackup needs the datadir clean. And I'm using the same "script" on both VPCs.
>>>>
>>>> Appreciate the read!
>>>>
>>>> On Sat, Apr 18, 2020 at 6:01 AM Karl Erik Levik <[email protected]> wrote:
>>>>> Hi Jeff,
>>>>>
>>>>> Probably not helpful, but I dunno if this is a typo just in your email or also in the script?
>>>>>
>>>>> > sudo rm -rf /var/lib/mysl/*
>>>>>
>>>>> Cheers,
>>>>> Karl
>>>>>
>>>>> On Thu, 16 Apr 2020 at 00:47, Jeff Dyke <[email protected]> wrote:
>>>>>> This has been biting me for a bit, and I have tried a litany of things. Here is the issue:
>>>>>>
>>>>>> VPC 1 (prod): MariaDB 10.4.12, Galera in a 3-node cluster, cluster functioning perfectly.
>>>>>>
>>>>>> VPC 2 (staging): MariaDB 10.4.12, Galera in a 3-node cluster, also running smoothly.
>>>>>>
>>>>>> -- backup script (clipped to only the essentials) --
>>>>>> set -e
>>>>>> DIR="/home/jeff/restore"
>>>>>> rm -rf $DIR
>>>>>> mkdir $DIR
>>>>>> AUTH='--defaults-extra-file=/root/db.cnf -u root'
>>>>>> sudo mysql ${AUTH} -e "SET GLOBAL wsrep_desync = ON;"
>>>>>> sudo mysql ${AUTH} -e "SET GLOBAL wsrep_on = OFF;"
>>>>>> sudo mariabackup ${AUTH} --backup --binlog-info=ON --target-dir="${DIR}" 2>&1 | grep -vE "Copying|done|log scanned"
>>>>>> sudo mysql ${AUTH} -e "SET GLOBAL wsrep_on = ON;"
>>>>>> sudo mysql ${AUTH} -e "SET GLOBAL wsrep_desync = OFF;"
>>>>>> -- end script --
>>>>>>
>>>>>> -- restore process --
>>>>>> sudo service mariadb stop   (or kill -9 PID, only when the server won't start)
>>>>>> sudo rm -rf /var/lib/mysl/*
>>>>>> sudo mariabackup --prepare --target-dir=restore
>>>>>> sudo mariabackup --copy-back --target-dir=restore
>>>>>> sudo chown -R mysql:mysql /var/lib/mysql/*
>>>>>> sudo service mariadb start
>>>>>> -- end restore --
>>>>>>
>>>>>> I can run this backup script on any of the *staging* nodes and restore to a freshly built server within that VPC. I can also take the backup from *staging* and load it into a newly built machine in the *prod* VPC.
>>>>>>
>>>>>> What I can not do is run the script on any of the prod members and restore it to a newly built server, regardless of location.
>>>>>> (really, the locations don't matter at all)
>>>>>>
>>>>>> The following is the log from a failing restore of a prod backup:
>>>>>>
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Using Linux native AIO
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Uses event mutexes
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Number of pools: 1
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Using SSE2 crc32 instructions
>>>>>> 2020-04-15 22:11:10 0 [Note] mysqld: O_TMPFILE is not supported on /tmp (disabling future attempts)
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Initializing buffer pool, total size = 2G, instances = 8, chunk size = 128M
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Completed initialization of buffer pool
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 50331648 bytes
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Setting log file ./ib_logfile1 size to 50331648 bytes
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: New log files created, LSN=42006460184
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Creating shared tablespace for temporary tables
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Waiting for purge to start
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: 10.4.12 started; log sequence number 42006460428; transaction id 23858673
>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
>>>>>> 2020-04-15 22:11:10 0 [Note] Plugin 'FEEDBACK' is disabled.
>>>>>> 2020-04-15 22:11:11 0 [Note] InnoDB: Buffer pool(s) load completed at 200415 22:11:11
>>>>>>
>>>>>> -- hangs here for about an hour before timing out --
>>>>>>
>>>>>> Logfile from a successful start:
>>>>>>
>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Using Linux native AIO
>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Uses event mutexes
>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Number of pools: 1
>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Using SSE2 crc32 instructions
>>>>>> 2020-04-15 23:13:15 0 [Note] mysqld: O_TMPFILE is not supported on /tmp (disabling future attempts)
>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Initializing buffer pool, total size = 2G, instances = 8, chunk size = 128M
>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Completed initialization of buffer pool
>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 50331648 bytes
>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Setting log file ./ib_logfile1 size to 50331648 bytes
>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: New log files created, LSN=43323668550
>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Creating shared tablespace for temporary tables
>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Waiting for purge to start
>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: 10.4.12 started; log sequence number 43323669004; transaction id 22422468
>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
>>>>>> 2020-04-15 23:13:16 0 [Note] Plugin 'FEEDBACK' is disabled.
>>>>>> 2020-04-15 23:13:16 0 [Note] Server socket created on IP: '0.0.0.0'.
>>>>>> 2020-04-15 23:13:16 0 [Note] Reading of all Master_info entries succeeded
>>>>>> 2020-04-15 23:13:16 0 [Note] Added new Master_info '' to hash table
>>>>>> 2020-04-15 23:13:16 0 [Note] /usr/sbin/mysqld: ready for connections.
>>>>>> Version: '10.4.12-MariaDB-1:10.4.12+maria~bionic' socket: '/var/run/mysqld/mysqld.sock' port: 3306 mariadb.org binary distribution
>>>>>> 2020-04-15 23:13:22 0 [Note] InnoDB: Buffer pool(s) load completed at 200415 23:13:22
>>>>>>
>>>>>> ------------------------
>>>>>> The main differences between the two are that in the successful start:
>>>>>> - the server socket is logged
>>>>>> - Master_info is logged
>>>>>> - and it binds to the port/socket
>>>>>>
>>>>>> Before that, they *both* end with:
>>>>>> 2020-04-15 23:13:22 0 [Note] InnoDB: Buffer pool(s) load completed at 200415 23:13:22
>>>>>>
>>>>>> -- Other Notes --
>>>>>> - Config files are the same except for IPs, both on the newly built servers and the cluster, as they are managed through Salt. I ran a diff -r against prod vs staging and only saw
>>>>>>
>>>>>> There has to be some corruption in the mariabackup output that is still reported OK, since I can put a staging backup on a prod server, but I can not put a prod backup on a staging server.
>>>>>>
>>>>>> If you've gotten this far, thank you. If you have any ideas on what I could try, I'd be grateful. aria_chk and all other table checks return correctly. I'm not sure what makes a server write out the master_info, so possibly that is a pointer.
>>>>>>
>>>>>> I'm happy to answer any questions. I've tried way too many things to post here, as I didn't want to write until I had enough of the dumb stuff figured out. I could rebuild the prod cluster but am really looking to understand this. I don't really want to build my own with debug info on, because why would the two VPCs be different?
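As a footnote to Karl's partial backup/restore suggestion: trying the databases one combination at a time, as discussed earlier in the thread, could be scripted along these lines. This is a sketch only; the database list and the DRY_RUN guard are placeholders, under the assumption that mariabackup's `--databases` option (which takes a space-separated list) is used for the partial backups.

```shell
#!/usr/bin/env bash
# Sketch: take a separate partial backup per database to bisect which one
# (if any) breaks the restore. The database names here are placeholders.
set -euo pipefail

DBS="mysql performance_schema"   # placeholder list, not the real schema set

run() {
    # With DRY_RUN=1 (the default) just print each step instead of running it.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

for db in $DBS; do
    # --databases limits the physical backup to the named schema
    run sudo mariabackup --backup --databases="$db" \
        --target-dir="/home/jeff/restore-$db"
done
```

Restoring each partial backup in turn to a throwaway server would narrow down whether one particular database in the prod backups is what hangs the startup.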
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp

