Sorry to anyone just catching up. The TL;DR: I migrated a set of servers from 10.3 to 10.4 and they would not start up, even after the migration, because of the `plugin_load_add` line. I just want to know why.
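For anyone who wants to poke at this with me: the first thing I'm checking is whether sql_errlog is already registered in mysql.plugin from an earlier INSTALL SONAME, because plugin_load_add in the config asks the server to load the same library again at startup. I'm not claiming that's the cause yet, just that it's the first thing worth ruling out. Rough sketch of the checks, reusing the AUTH line from my backup script quoted below; paths/credentials are from my setup, adjust for yours:

-- plugin check sketch --
AUTH='--defaults-extra-file=/root/db.cnf -u root'

# is SQL_ERROR_LOG already registered from a past INSTALL SONAME 'sql_errlog'?
sudo mysql ${AUTH} -e "SELECT name, dl FROM mysql.plugin;"

# list every plugin the library provides, installed or not
sudo mysql ${AUTH} -e "SHOW PLUGINS SONAME 'sql_errlog.so';"

# if it is already registered, keep only one mechanism: either drop the
# plugin_load_add line from the config, or drop the registration instead:
#   sudo mysql ${AUTH} -e "UNINSTALL SONAME 'sql_errlog';"
-- end sketch --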
On Tue, May 26, 2020 at 6:44 PM Jeff Dyke <[email protected]> wrote:

> I want to resurrect this, b/c I have found what I see to be the broken
> bit of code/configuration. If I don't include the commented-out portion
> in the following, everything starts up.
>
> -- Starts --
> /etc/mysql/conf.conf.d/100_sqlerr.log
> [mysqld]
> #plugin_load_add = sql_errlog
> sql_error_log=ON
> sql_error_log_rotate=1
> sql_error_log_size_limit=100000000
>
> So if I remove the comment from "plugin_load_add" so that it runs, the
> server never starts. I will produce a reproducible case, but can someone
> explain why the server would not start if I add a plugin that was
> already added?
>
> This journey sucked, but I learned a lot, even though I've been using
> MySQL since 3.x. Is there different syntax for 10.4? Should I just add
> it to the startup args and then add the variables after it starts? This
> worked in 10.3.
>
> Thanks all!
>
>
> On Mon, Apr 27, 2020 at 3:32 PM Jeff Dyke <[email protected]> wrote:
>
>> That second note is horribly misleading (note the bold):
>> -- orig --
>> - I was able to take a full physical backup that runs every morning from
>> the VPC 1 (prod cluster) directly from S3 and restore it to the server
>> where I can not restore backups of any of the prod nodes. After this
>> successfully started, I ran mysqlcheck --all-databases (online) against
>> this newly restored database and there was no corruption.
>> -- should be --
>> - I was able to take the full physical backup *from a slave whose master
>> is a prod cluster node,* which runs every morning, from S3 to the server
>> where I can not restore backups of any of the prod nodes. After this
>> successfully started, I ran mysqlcheck --all-databases (online) against
>> this newly restored database and there was no corruption.
>>
>> Sorry and Thanks
>>
>> On Mon, Apr 27, 2020 at 2:38 PM Jeff Dyke <[email protected]> wrote:
>>
>>> Hello again peoples. I'm still at it, but have taken some time for
>>> higher-priority projects and now have some follow-up data; not sure if
>>> I have any questions, we'll see.
>>>
>>> In the staging VPC, referred to above/below as VPC 2, I can restore any
>>> combination of databases, right down to only selecting mysql and
>>> performance_schema. In the prod VPC (VPC 1), I can not restore any
>>> combination of databases. So, as was thought before, I may have
>>> corruption in one of the tables, but....
>>>
>>> - I have checked all tables using innochecksum (offline) and aria_chk
>>> (offline), and there are no indications of corruption.
>>> - I was able to take a full physical backup that runs every morning from
>>> the VPC 1 (prod cluster) directly from S3 and restore it to the server
>>> where I can not restore backups of any of the prod nodes. After this
>>> successfully started, I ran mysqlcheck --all-databases (online) against
>>> this newly restored database and there was no corruption.
>>> - If I take the production node that I'm using for backup testing out
>>> of the cluster, run mysqlcheck --all-databases, and put it back in the
>>> cluster, no corrupt tables are reported.
>>>
>>> I would prefer not to build an entire new cluster, though I realize I
>>> could have done so by now, b/c I would really like to know why; and
>>> with the backups and snapshots that I can use the slave for, along with
>>> 3 cluster nodes, my data is safe and being written to daily.
>>>
>>> All of these databases were upgraded from 10.3 to 10.4 (in the last two
>>> months) according to the instructions on mariadb.com.
>>> I ran staging first, let it sit for a couple of weeks, and then
>>> upgraded production; both are at the same exact release:
>>> Ver 10.4.12-MariaDB-1:10.4.12+maria~bionic-log for debian-linux-gnu on
>>> x86_64 (mariadb.org binary distribution)
>>>
>>> I've considered compiling a debug-enabled version of mariadb, but it
>>> seems that would only be useful if it were on the galera nodes, and I'm
>>> not too keen on doing that.
>>>
>>> Any and all thoughts are welcome. I didn't want to repeat any
>>> information, so new readers will have to read from the start of the
>>> thread, and I do indeed appreciate it.
>>>
>>> Jeff
>>>
>>>
>>> On Sat, Apr 18, 2020 at 1:22 PM Jeff Dyke <[email protected]> wrote:
>>>
>>>> Hi Karl, someone else just saw that as well; it was indeed a typo in
>>>> the email. To answer your questions: the plugins are the same,
>>>> sql_error_log and socket_plugin (which is no longer needed in 10.4 and
>>>> was removed after the upgrade). Also, I can restore a new node from a
>>>> staging backup in either VPC.
>>>>
>>>> Databases are a good question; no, I have not. Perhaps I should try a
>>>> complete set of combinations, which may be a better use of time than
>>>> bringing the cluster up to 5 and then seeing if I can take a physical
>>>> backup of one of the new nodes.
>>>>
>>>> On Sat, Apr 18, 2020 at 1:11 PM Karl Erik Levik <[email protected]>
>>>> wrote:
>>>>
>>>>> Well, what I mean is, you're missing the "q" in "mysql", so I was
>>>>> thinking perhaps that command doesn't actually do what you think it
>>>>> does?
>>>>>
>>>>> If that's not the issue, then I guess you have to look again at your
>>>>> prod nodes and figure out what, if anything, is special/unusual about
>>>>> them. Do you have any special plugins enabled? Any special storage
>>>>> engines used that you don't have on staging? (Wouldn't think so since
>>>>> only InnoDB is properly supported by Galera ...) Do you have any crud
>>>>> in your datadir that might create problems for mariabackup?
>>>>>
>>>>> Have you tried doing a partial backup/restore instead, i.e. only some
>>>>> of the databases?
>>>>>
>>>>> Cheers,
>>>>> Karl
>>>>>
>>>>> On Sat, 18 Apr 2020 at 17:25, Jeff Dyke <[email protected]> wrote:
>>>>>
>>>>>> Thanks Karl, that is just there so everything under mysql is deleted
>>>>>> and not the directory itself, as mariabackup needs the datadir clean.
>>>>>> And I'm using the same "script" on both VPCs.
>>>>>>
>>>>>> Appreciate the read!
>>>>>>
>>>>>> On Sat, Apr 18, 2020 at 6:01 AM Karl Erik Levik <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Jeff,
>>>>>>>
>>>>>>> Probably not helpful, but I dunno if this is a typo just in your
>>>>>>> email or also in the script?
>>>>>>>
>>>>>>> > sudo rm -rf /var/lib/mysl/*
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Karl
>>>>>>>
>>>>>>> On Thu, 16 Apr 2020 at 00:47, Jeff Dyke <[email protected]> wrote:
>>>>>>>
>>>>>>>> This has been biting me for a bit, and I have tried a litany of
>>>>>>>> things. Here is the issue:
>>>>>>>> VPC 1: (prod)
>>>>>>>> Mariadb 10.4.12 - galera in a 3 node cluster, cluster functioning
>>>>>>>> perfectly
>>>>>>>>
>>>>>>>> VPC 2: (staging)
>>>>>>>> Mariadb 10.4.12 galera in a 3 node cluster, also running smoothly.
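Coming back to Karl's partial backup/restore suggestion a few replies up: mariabackup can take a physical backup of just a subset of databases, which could help narrow down which schema trips the failure. A rough sketch with placeholder database names is below; as far as I understand it, a partial backup can't simply be copied back like a full one, the tables have to be re-imported on a running server, so I'd only use it to bisect the problem. My original full-backup script follows in the quoted mail.

-- partial backup sketch --
AUTH='--defaults-extra-file=/root/db.cnf -u root'
DIR="/home/jeff/restore-partial"
rm -rf "$DIR" && mkdir -p "$DIR"

# back up only the listed databases (names here are placeholders)
sudo mariabackup ${AUTH} --backup \
    --databases="app_db1 app_db2" \
    --target-dir="$DIR"

# --export writes the per-table .cfg files needed to import the tablespaces later
sudo mariabackup --prepare --export --target-dir="$DIR"
-- end sketch --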
>>>>>>>> -- backup script (clipped for only the essentials) --
>>>>>>>> set -e
>>>>>>>> DIR="/home/jeff/restore"
>>>>>>>> rm -rf $DIR
>>>>>>>> mkdir $DIR
>>>>>>>> AUTH='--defaults-extra-file=/root/db.cnf -u root'
>>>>>>>> sudo mysql ${AUTH} -e "SET GLOBAL wsrep_desync = ON;"
>>>>>>>> sudo mysql ${AUTH} -e "SET GLOBAL wsrep_on = OFF";
>>>>>>>> sudo mariabackup ${AUTH} --backup --binlog-info=ON --target-dir="${DIR}" 2>&1 | grep -vE "Copying|done|log scanned"
>>>>>>>> sudo mysql ${AUTH} -e "SET GLOBAL wsrep_on = ON;"
>>>>>>>> sudo mysql ${AUTH} -e "SET GLOBAL wsrep_desync = OFF";
>>>>>>>> -- end script --
>>>>>>>>
>>>>>>>> -- restore process --
>>>>>>>> sudo service mariadb stop (or kill -9 PID only when the server won't start)
>>>>>>>> sudo rm -rf /var/lib/mysl/*
>>>>>>>> sudo mariabackup --prepare --target-dir=restore
>>>>>>>> sudo mariabackup --copy-back --target-dir=restore
>>>>>>>> sudo chown -R mysql:mysql /var/lib/mysql/*
>>>>>>>> sudo service mariadb start
>>>>>>>> -- end restore --
>>>>>>>>
>>>>>>>> I can run this script and backup process on any of the *staging*
>>>>>>>> nodes and restore them to a freshly built server within that VPC.
>>>>>>>> I can also take the backup from *staging* and load it into a newly
>>>>>>>> built machine in the *prod* VPC.
>>>>>>>>
>>>>>>>> What I can not do is run the script on any of the prod members and
>>>>>>>> restore it to a newly built server, regardless of location (really,
>>>>>>>> the locations don't matter at all).
>>>>>>>>
>>>>>>>> The following is the log from a failing start after restoring a
>>>>>>>> prod backup:
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Using Linux native AIO
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Uses event mutexes
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Number of pools: 1
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Using SSE2 crc32 instructions
>>>>>>>> 2020-04-15 22:11:10 0 [Note] mysqld: O_TMPFILE is not supported on /tmp (disabling future attempts)
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Initializing buffer pool, total size = 2G, instances = 8, chunk size = 128M
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Completed initialization of buffer pool
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 50331648 bytes
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Setting log file ./ib_logfile1 size to 50331648 bytes
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: New log files created, LSN=42006460184
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Creating shared tablespace for temporary tables
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Waiting for purge to start
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: 10.4.12 started; log sequence number 42006460428; transaction id 23858673
>>>>>>>> 2020-04-15 22:11:10 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
>>>>>>>> 2020-04-15 22:11:10 0 [Note] Plugin 'FEEDBACK' is disabled.
>>>>>>>> 2020-04-15 22:11:11 0 [Note] InnoDB: Buffer pool(s) load completed at 200415 22:11:11
>>>>>>>> -- hangs here for about an hour before timing out --
>>>>>>>>
>>>>>>>>
>>>>>>>> Logfile from a successful start
>>>>>>>>
>>>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Using Linux native AIO
>>>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
>>>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Uses event mutexes
>>>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
>>>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Number of pools: 1
>>>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Using SSE2 crc32 instructions
>>>>>>>> 2020-04-15 23:13:15 0 [Note] mysqld: O_TMPFILE is not supported on /tmp (disabling future attempts)
>>>>>>>> 2020-04-15 23:13:15 0 [Note] InnoDB: Initializing buffer pool, total size = 2G, instances = 8, chunk size = 128M
>>>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Completed initialization of buffer pool
>>>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
>>>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 50331648 bytes
>>>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Setting log file ./ib_logfile1 size to 50331648 bytes
>>>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
>>>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: New log files created, LSN=43323668550
>>>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
>>>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Creating shared tablespace for temporary tables
>>>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
>>>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
>>>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Waiting for purge to start
>>>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: 10.4.12 started; log sequence number 43323669004; transaction id 22422468
>>>>>>>> 2020-04-15 23:13:16 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
>>>>>>>> 2020-04-15 23:13:16 0 [Note] Plugin 'FEEDBACK' is disabled.
>>>>>>>> 2020-04-15 23:13:16 0 [Note] Server socket created on IP: '0.0.0.0'.
>>>>>>>> 2020-04-15 23:13:16 0 [Note] Reading of all Master_info entries succeeded
>>>>>>>> 2020-04-15 23:13:16 0 [Note] Added new Master_info '' to hash table
>>>>>>>> 2020-04-15 23:13:16 0 [Note] /usr/sbin/mysqld: ready for connections.
>>>>>>>> Version: '10.4.12-MariaDB-1:10.4.12+maria~bionic' socket: '/var/run/mysqld/mysqld.sock' port: 3306 mariadb.org binary distribution
>>>>>>>> 2020-04-15 23:13:22 0 [Note] InnoDB: Buffer pool(s) load completed at 200415 23:13:22
>>>>>>>>
>>>>>>>> ------------------------
>>>>>>>> The main differences between the two are that:
>>>>>>>> - the server socket is logged
>>>>>>>> - master_info is logged
>>>>>>>> - and it binds to the port/socket
>>>>>>>>
>>>>>>>> All of which happens before the point where they *both* end with
>>>>>>>> 2020-04-15 23:13:22 0 [Note] InnoDB: Buffer pool(s) load completed at 200415 23:13:22
>>>>>>>>
>>>>>>>> -- Other Notes --
>>>>>>>> - Config files are the same except for IPs, both on the newly built
>>>>>>>> servers and the cluster, as they are managed through Salt. I ran a
>>>>>>>> diff -r against prod vs staging and only saw the expected IP
>>>>>>>> differences.
>>>>>>>>
>>>>>>>> There has to be some corruption in the mariabackup output, even
>>>>>>>> though it is still reported as OK, since I can put a staging backup
>>>>>>>> on a prod server but I can not put a prod backup on a staging
>>>>>>>> server.
>>>>>>>>
>>>>>>>> If you've gotten this far, thank you. If you have any ideas on what
>>>>>>>> I could try, I'd be grateful. aria_chk and all other table checks
>>>>>>>> return correctly. I'm not sure what makes a server write out the
>>>>>>>> master_info, so possibly that is a pointer.
>>>>>>>>
>>>>>>>> I'm happy to answer any questions. I've tried way too many things
>>>>>>>> to post here, as I didn't want to write until I had enough of the
>>>>>>>> dumb stuff figured out. I could rebuild the prod cluster but am
>>>>>>>> really looking to understand this. I don't really want to build my
>>>>>>>> own with debug info on, b/c why would they be different between the
>>>>>>>> two VPCs?
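One thing I still plan to try the next time a restored prod backup wedges between "Buffer pool(s) load completed" and "Server socket created" is to grab a backtrace of every mysqld thread, so I can at least see what it is waiting on before the timeout. Rough sketch only; it assumes gdb, strace and coreutils' timeout are installed, and the output paths are placeholders:

-- hang diagnosis sketch --
# attach to the hung mysqld and dump a backtrace of every thread
PID="$(pidof mysqld)"
sudo gdb -p "$PID" -batch -ex "thread apply all bt" > /tmp/mysqld-hang-backtrace.txt

# lighter-weight alternative: watch which syscalls the threads are blocked in
# for 30 seconds, then detach
sudo timeout 30 strace -f -p "$PID" -o /tmp/mysqld-hang-strace.txt
-- end sketch --

If the backtraces show everything idle except a thread stuck in wsrep or replication code, that would at least narrow down whether this is InnoDB recovery or something Galera/Master_info related.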
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~maria-discuss
More help   : https://help.launchpad.net/ListHelp

