Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31
On Sunday 20 February 2022 02:23, Otto Kekäläinen wrote:
> Is the issue https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=996028
> still affecting people?

We just hit it again on our server after upgrading just now, and I am not sure why.
We had to resort to enabling «innodb_force_recovery = 5» just to get the MariaDB server started.

I am not sure how we are supposed to fix it. There is no clear guidance as to how we are supposed to fix this, not that I can see anyway.
Is there a _Clear_ and _simple_ explanation/guide as to how we are supposed to fix this on our end?

--
Johnny A. Solbu
web site, https://www.solbu.net
PGP key ID: 0x4F5AD64DFA687324
Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31
Control: reassign -1 mariadb-server-10.3
Control: tags moreinfo

Hello!

Is the issue https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=996028 still affecting people?

Did anybody figure out the root cause or what upstream issue it was, or what version it was fixed in?
Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31
On 2021-10-25 19:37, Otto Kekäläinen wrote:
> We have two users who have experienced a potentially corrupted
> database (out of hundreds of thousands or even potentially millions of
> users, depending how one wants to extrapolate the popcon data). A bug
> report has been filed and it is kept open in case somebody could
> provide a way to reproduce the bug or report something actionable.
> Otherwise neither the Debian packagers nor upstream developers (and
> upstream does not even know about this bug, since it is still vague
> and no bug report has been filed upstream) will do anything about the
> bug report.
>
> I am now downgrading this bug report severity to "normal" so that it
> will not raise false alarms for random users.

Okay, thanks for clarifying. I initially thought the upgrade *caused* the data corruption. But rather, it *exposes* a potentially existing database corruption, which means most people should not be affected. This misunderstanding, coupled with the "grave" severity, simply raised alarms and caution before proceeding on my side.

> You should have a backup anyway, that is just good practice while
> maintaining database systems.

As previously stated, I do have backups of course, this is not in question. But a scheduled maintenance for a simple apt upgrade is not the same level of preparation and expected downtime compared to a scheduled maintenance dedicated to test backups restoration. Thus the reluctance to go the disaster recovery route versus waiting for a fix* for a simple apt upgrade.

(* yes, now it's clear there is no fix to expect, but that wasn't the initial understanding)

Again, thanks for the clarification. I'll upgrade with caution in the next regular maintenance window and hope I'm not affected.
Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31
Hello Ondrej!

I sent the below message on Oct 10th but I am not sure if you read it. The gist is:

> I recommend that you file a bug about this upstream, and try to attach
> relevant info from the error log, maybe a strace output etc. Upstream
> devs will guide you on what to debug next.

The bug report https://jira.mariadb.org/browse/MDEV-265377 was investigated and fixed because the submitter provided a trace and core dump. If you are hitting a bug and you cannot provide steps to reproduce it, you should provide other data. Don't waste time on recompiling your own versions; that path won't lead to any fixes upstream or in Debian.

If you represent an enterprise and are running MariaDB in a critical environment, you might want to consider hiring a consultant or getting a support contract so somebody else can investigate database crashes/corruption for you.

On Mon, Oct 11, 2021 at 12:58 AM Otto Kekäläinen wrote:
>
> > The problem is in the ibdata1 file (about 450MB). Deleted other database
> > directories and it still crashes, deleted ibdata1 and it runs.
> >
> > How to bisect mariadb from git? Tried:
> > $ git bisect good mariadb-10.3.29
> > $ git bisect bad mariadb-10.3.31
> > the build process showed version 10.2 so I aborted it.
> >
> > Checked out mariadb-10.3.30 but dpkg-buildpackage failed with:
> > dh_install: mariadb-plugin-cassandra missing files:
> > etc/mysql/conf.d/cassandra.cnf
>
> Some dependency was missing and Cassandra was not built. Note that the
> upstream repository is not identical to the one in Debian regarding
> the contents of debian/ directory. MariaDB builds without a cache take
> 30 mins each and there are all kinds of things going on. Doing bisect
> (fully correctly) on MariaDB is hard even for experienced developers.
> Your time is probably better spent doing some other kind of debugging.
>
> I recommend that you file a bug about this upstream, and try to attach
> relevant info from the error log, maybe a strace output etc. Upstream
> devs will guide you on what to debug next.
>
> One thing you could also try is to start the server with 10.3.29 and
> ensure that you have a clean shutdown (SET GLOBAL
> innodb_fast_shutdown=0; SHUTDOWN) and only after that start with the
> new 10.3.31 binary.
>
> Ref
> - https://mariadb.com/kb/en/shutdown/#see-also
> - https://www.percona.com/blog/2020/05/07/prepare-mysql-for-a-safe-shutdown/
Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31
Control: severity -1 normal

On Thu, Oct 21, 2021 at 9:24 PM Marc Gallet wrote:
> I've been brought to this bug by apt-listbugs while doing upgrades
> on my buster install, warning me of a grave bug.

We have two users who have experienced a potentially corrupted database (out of hundreds of thousands or even potentially millions of users, depending how one wants to extrapolate the popcon data). A bug report has been filed and it is kept open in case somebody could provide a way to reproduce the bug or report something actionable. Otherwise neither the Debian packagers nor upstream developers (and upstream does not even know about this bug, since it is still vague and no bug report has been filed upstream) will do anything about the bug report.

I am now downgrading this bug report severity to "normal" so that it will not raise false alarms for random users.

> I have not attempted the upgrade yet, since, after reading this bug, I
> see a risk of data corruption and I would like to avoid going into
> recovery procedures (from backups) as a result of what should be a
> stable upgrade.

You should have a backup anyway, that is just good practice while maintaining database systems.

When you want to upgrade, run 'apt upgrade'. If your database is already broken/corrupted, the upgrade will not fix it. You can easily test your database by restarting it (and see that it restarts), read the logs and related documentation.

Official Debian package documentation is the README files in the packaging, and they contain more tips about best practices. I recommend you use them as the primary source of information and don't put too much weight on a single bug report.
Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31
On Tue, 2021-10-12 at 21:58 +0200, Ondrej Zary wrote:
> > One thing you could also try is to start the server with 10.3.29 and
> > ensure that you have a clean shutdown (SET GLOBAL
> > innodb_fast_shutdown=0; SHUTDOWN) and only after that start with the
> > new 10.3.31 binary.
>
> With fast shutdown disabled, mysqld (10.3.29) seems to be stuck in an infinite
> loop. 100% CPU usage, no I/O for an hour until I killed it.
> Even dropping all the databases did not help - killed it after it has been
> running for 5 minutes.

I tried that as well and for now the shutdown process seems a bit stuck:

2021-10-13 8:50:01 0 [Note] /usr/sbin/mysqld (initiated by: unknown): Normal shutdown
2021-10-13 8:50:01 0 [Note] Event Scheduler: Purging the queue. 0 events
2021-10-13 8:50:01 0 [Note] InnoDB: FTS optimize thread exiting.
2021-10-13 8:50:01 4 [Note] InnoDB: to purge 5 transactions
2021-10-13 8:50:01 0 [Note] InnoDB: Starting shutdown...
2021-10-13 8:50:01 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool
2021-10-13 8:50:01 0 [Note] InnoDB: Buffer pool(s) dump completed at 211013 8:50:01
2021-10-13 8:51:02 0 [Note] InnoDB: Waiting for master thread to exit
2021-10-13 8:51:03 0 [Note] InnoDB: Waiting for change buffer merge to complete number of bytes of change buffer just merged: 1251
2021-10-13 8:52:02 0 [Note] InnoDB: Waiting for master thread to exit
[...]
2021-10-13 9:02:05 0 [Note] InnoDB: Waiting for master thread to exit
2021-10-13 9:02:14 0 [Note] InnoDB: Waiting for change buffer merge to complete number of bytes of change buffer just merged: 746
2021-10-13 9:03:05 0 [Note] InnoDB: Waiting for master thread to exit
2021-10-13 9:03:15 0 [Note] InnoDB: Waiting for change buffer merge to complete number of bytes of change buffer just merged: 1040

I don't think anything still uses MySQL/MariaDB (I've stopped apache2 just in case). I'll let it run a bit but I'm not really confident with that shutdown process.

I've tried to rebuild mariadb-10.3 with the patch upstream committed, but it doesn't apply to 10.3.31 directly so I stalled a bit.

Regards,
--
Yves-Alexis
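The shutdown log above does show small change buffer merges between the "Waiting for master thread to exit" lines, so one way to judge whether a slow shutdown is still making progress is to total those merge reports. The following is only a minimal sketch: the function name `summarize_merges` is made up for illustration, and it assumes the error log uses the exact "just merged: N" wording quoted in this message.

```shell
#!/bin/sh
# Hypothetical helper: summarize change buffer merge reports from a
# MariaDB error log read on stdin. Assumes the "just merged: N" wording
# shown in the log excerpt in this thread.
summarize_merges() {
    awk '
        {
            # A physical line may hold several reports, so scan repeatedly.
            while (match($0, /just merged: [0-9]+/)) {
                # "just merged: " is 13 characters; the rest is the byte count.
                total += substr($0, RSTART + 13, RLENGTH - 13)
                count++
                $0 = substr($0, RSTART + RLENGTH)
            }
        }
        END { printf "%d reports, %d bytes merged\n", count, total }
    '
}
```

It could be run as `summarize_merges < /var/log/mysql/error.log` (that path is an assumption about the local setup). Running it a few minutes apart and comparing the totals gives a rough idea of whether the merge is advancing or the server is spinning without doing work.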
Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31
On Sunday 10 October 2021 23:58:12 Otto Kekäläinen wrote:
> > The problem is in the ibdata1 file (about 450MB). Deleted other database
> > directories and it still crashes, deleted ibdata1 and it runs.
> >
> > How to bisect mariadb from git? Tried:
> > $ git bisect good mariadb-10.3.29
> > $ git bisect bad mariadb-10.3.31
> > the build process showed version 10.2 so I aborted it.
> >
> > Checked out mariadb-10.3.30 but dpkg-buildpackage failed with:
> > dh_install: mariadb-plugin-cassandra missing files:
> > etc/mysql/conf.d/cassandra.cnf
>
> Some dependency was missing and Cassandra was not built. Note that the
> upstream repository is not identical to the one in Debian regarding
> the contents of debian/ directory. MariaDB builds without a cache take
> 30 mins each and there are all kinds of things going on. Doing bisect
> (fully correctly) on MariaDB is hard even for experienced developers.
> Your time is probably better spent doing some other kind of debugging.
>
> I recommend that you file a bug about this upstream, and try to attach
> relevant info from the error log, maybe a strace output etc. Upstream
> devs will guide you on what to debug next.
>
> One thing you could also try is to start the server with 10.3.29 and
> ensure that you have a clean shutdown (SET GLOBAL
> innodb_fast_shutdown=0; SHUTDOWN) and only after that start with the
> new 10.3.31 binary.

With fast shutdown disabled, mysqld (10.3.29) seems to be stuck in an infinite loop. 100% CPU usage, no I/O for an hour until I killed it.
Even dropping all the databases did not help - killed it after it had been running for 5 minutes.

Deleted ibdata1 (and ib_logfile0, ib_logfile1), then shutdown ended immediately.

Seems that the file structure is corrupted somehow - probably because of a previous bug. One table is also affected by the "ERROR 1118 (42000): Row size too large (> 8126)" bug.

When a complete SQL dump is restored with a new ibdata1, everything works (upgrade to 10.3.31 and also clean shutdown).

--
Ondrej Zary
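The recovery described in this message (take a complete SQL dump, start over with a new ibdata1, restore) can be sketched roughly as follows. This is not a procedure given in the thread: the function names, the dump location, the `/var/lib/mysql` datadir and the `mariadb` service name are all assumptions for illustration, and it presumes the server can still be started far enough to read the data (for example with innodb_force_recovery set, as reported elsewhere in this bug).

```shell
#!/bin/sh
# Sketch of a dump-and-reload recovery. Everything below is an assumption
# about the local setup, not something prescribed in this bug report.
DATADIR=/var/lib/mysql             # assumed datadir
DUMP=/root/all-databases.sql       # hypothetical dump location

dump_all() {
    # Logical dump of every database while the old data is still readable.
    mysqldump --all-databases --routines --events > "$DUMP"
}

reset_datadir() {
    systemctl stop mariadb                              # assumed service name
    mv "$DATADIR" "$DATADIR.corrupt.$(date +%Y%m%d)"    # keep the old files
    mysql_install_db --user=mysql --datadir="$DATADIR"  # new ibdata1 and system tables
    systemctl start mariadb
}

restore_all() {
    # Replay the dump into the freshly initialized server.
    mysql < "$DUMP"
}

# Intended order, run manually and checked after each step:
#   dump_all && reset_datadir && restore_all
```

Moving the old datadir aside instead of deleting it keeps the corrupted ibdata1 available in case upstream asks for it. Any innodb_force_recovery setting would need to be removed from the configuration before the restore, and the dump is worth checking for completeness before the old files are touched.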
Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31
> The problem is in the ibdata1 file (about 450MB). Deleted other database
> directories and it still crashes, deleted ibdata1 and it runs.
>
> How to bisect mariadb from git? Tried:
> $ git bisect good mariadb-10.3.29
> $ git bisect bad mariadb-10.3.31
> the build process showed version 10.2 so I aborted it.
>
> Checked out mariadb-10.3.30 but dpkg-buildpackage failed with:
> dh_install: mariadb-plugin-cassandra missing files:
> etc/mysql/conf.d/cassandra.cnf

Some dependency was missing and Cassandra was not built. Note that the upstream repository is not identical to the one in Debian regarding the contents of debian/ directory. MariaDB builds without a cache take 30 mins each and there are all kinds of things going on. Doing bisect (fully correctly) on MariaDB is hard even for experienced developers. Your time is probably better spent doing some other kind of debugging.

I recommend that you file a bug about this upstream, and try to attach relevant info from the error log, maybe a strace output etc. Upstream devs will guide you on what to debug next.

One thing you could also try is to start the server with 10.3.29 and ensure that you have a clean shutdown (SET GLOBAL innodb_fast_shutdown=0; SHUTDOWN) and only after that start with the new 10.3.31 binary.

Ref
- https://mariadb.com/kb/en/shutdown/#see-also
- https://www.percona.com/blog/2020/05/07/prepare-mysql-for-a-safe-shutdown/
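The clean-shutdown suggestion above could look roughly like this from a shell. It is a minimal sketch: the function names are invented for illustration, the `mysql` client is assumed to be able to connect as an administrative user without extra options, and the package name is taken from the reassignment in this bug.

```shell
#!/bin/sh
# Sketch: force a slow (full) InnoDB shutdown on the old binary, then upgrade.
# Function names are hypothetical; credentials and package name are assumptions.

slow_shutdown() {
    # innodb_fast_shutdown=0 asks InnoDB to finish purge and change buffer
    # merge before exiting; SHUTDOWN then stops the server.
    mysql -e "SET GLOBAL innodb_fast_shutdown=0; SHUTDOWN;"
}

upgrade_server() {
    # Upgrade only the already-installed server package.
    apt-get install --only-upgrade mariadb-server-10.3
}

# Intended order, while still running 10.3.29:
#   slow_shutdown    # wait until mysqld has really exited, check the error log
#   upgrade_server
```

A slow shutdown can take much longer than a normal one, so it is worth watching the error log for the final shutdown message before starting the upgrade.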
Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31
On Sunday 10 October 2021 16:55:45 Otto Kekäläinen wrote:
> Hello!
>
> Thanks for reporting. Could you please check if this has been reported
> upstream at jira.mariadb.org?
>
> There isn't much we can do about InnoDB internals in Debian packaging.
>

The problem is in the ibdata1 file (about 450MB). Deleted other database directories and it still crashes, deleted ibdata1 and it runs.

How to bisect mariadb from git? Tried:
$ git bisect good mariadb-10.3.29
$ git bisect bad mariadb-10.3.31
the build process showed version 10.2 so I aborted it.

Checked out mariadb-10.3.30 but dpkg-buildpackage failed with:
dh_install: mariadb-plugin-cassandra missing files:
etc/mysql/conf.d/cassandra.cnf

--
Ondrej Zary
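On the bisect question: a plausible reason for the build reporting 10.2 is that the history between the two 10.3 tags includes commits merged up from the 10.2 branch, and a plain bisect may check those out. This is an assumption about the upstream repository layout, not something confirmed in this thread. With Git 2.29 or newer, `git bisect start --first-parent` restricts the search to first-parent commits. The wrapper below is a hypothetical sketch with an invented function name.

```shell
#!/bin/sh
# Sketch: bisect only along first-parent history (needs Git 2.29 or newer).
# bisect_first_parent is a hypothetical helper, not an upstream script.
bisect_first_parent() {
    good=${1:-}
    bad=${2:-}
    if [ -z "$good" ] || [ -z "$bad" ]; then
        echo "usage: bisect_first_parent GOOD BAD" >&2
        return 2
    fi
    # git bisect start takes the bad revision first, then the good one.
    git bisect start --first-parent "$bad" "$good"
}

# Usage, inside a clone of the MariaDB server repository:
#   bisect_first_parent mariadb-10.3.29 mariadb-10.3.31
#   ... build and test, then mark each step with:
#   git bisect good    # or: git bisect bad
```

Even with this restriction, each step still needs a full build, so it does not change the practical cost of bisecting.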
Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31
Haven't found this exact problem. This seems to be closest but the error messages are different:
https://jira.mariadb.org/browse/MDEV-25981

I'm going to copy the datadir to another machine and debug it further.

On Sunday 10 October 2021 16:55:45 Otto Kekäläinen wrote:
> Hello!
>
> Thanks for reporting. Could you please check if this has been reported
> upstream at jira.mariadb.org?
>
> There isn't much we can do about InnoDB internals in Debian packaging.

--
Ondrej Zary
Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31
Hello! Thanks for reporting. Could you please check if this has been reported upstream at jira.mariadb.org? There isn't much we can do about InnoDB internals in Debian packaging.