Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31

2022-03-29 Thread Johnny A. Solbu
On Sunday 20 February 2022 02:23, Otto Kekäläinen wrote:
> Is the issue https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=996028
> still affecting people?

We just hit it again on our server after upgrading just now, and I am not 
sure why.
We had to resort to enabling «innodb_force_recovery = 5» just to get the 
mariadb server started.
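
For reference, a minimal sketch of how such a temporary override could be
written as a drop-in configuration file. The function name and the file name
zz-force-recovery.cnf are hypothetical, and the directory
/etc/mysql/mariadb.conf.d is assumed to be the Debian drop-in location. The
setting is meant for emergencies only, so the file should be removed again
once the data has been dumped.

```shell
#!/bin/sh
# Sketch only: write a temporary drop-in that sets innodb_force_recovery.
# The function name and the file name are hypothetical.
write_force_recovery_conf() {
    conf_dir=$1   # e.g. /etc/mysql/mariadb.conf.d on Debian (assumption)
    level=$2      # 1..6; the report above needed 5
    case $level in
        [1-6]) ;;
        *) echo "level must be between 1 and 6" >&2; return 1 ;;
    esac
    # Overwrites any previous file of the same name in conf_dir.
    printf '[mysqld]\ninnodb_force_recovery = %s\n' "$level" \
        > "$conf_dir/zz-force-recovery.cnf"
}

# Example (as root), followed by a server restart:
#   write_force_recovery_conf /etc/mysql/mariadb.conf.d 5
```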


I am not sure how we are supposed to fix it.
There is no clear guidance as to how we are supposed to fix this, not that I 
can see anyway.

Is there a _Clear_ and _simple_ explanation/guide as to how we are supposed to 
fix this on our end?


-- 
Johnny A. Solbu
web site,   https://www.solbu.net
PGP key ID: 0x4F5AD64DFA687324



Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31

2022-02-19 Thread Otto Kekäläinen
Control: reassign -1 mariadb-server-10.3
Control: tags moreinfo

Hello!

Is the issue https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=996028
still affecting people? Did anybody figure out the root cause or what
upstream issue it was, or what version it was fixed in?



Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31

2021-10-25 Thread Zertrin

On 2021-10-25 19:37, Otto Kekäläinen wrote:

> We have two users who have experienced a potentially corrupted
> database (out of hundreds of thousands or even potentially millions of
> users, depending how one wants to extrapolate the popcon data). A bug
> report has been filed and it is kept open in case somebody could
> provide a way to reproduce the bug or report something actionable.
> Otherwise neither the Debian packagers nor upstream developers (and
> upstream does not even know about this bug, since it is still vague
> and no bug report has been filed upstream) will do anything about the
> bug report.
>
> I am now downgrading this bug report severity to "normal" so that it
> will not raise false alarms for random users.

Okay, thanks for clarifying. I initially thought the upgrade *caused* the
data corruption. But rather, it *exposes* a potentially existing database
corruption, which means most people should not be affected.

This misunderstanding, coupled with the "grave" severity, simply raised
alarms and caution before proceeding on my side.


> You should have a backup anyway, that is just good practice while
> maintaining database systems.

As previously stated, I do have backups of course, this is not in question.

But a scheduled maintenance for a simple apt upgrade is not the same level
of preparation and expected downtime compared to a scheduled maintenance
dedicated to test backups restoration. Thus the reluctance to go the disaster
recovery route versus waiting for a fix* for a simple apt upgrade.

(* yes, now it's clear there is no fix to expect, but that wasn't the
initial understanding)

Again, thanks for the clarification. I'll upgrade with caution in the next
regular maintenance window and hope I'm not affected.



Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31

2021-10-25 Thread Otto Kekäläinen
Hello Ondrej!

I sent the below message on Oct 10th but I am not sure if you read it.
The gist is:

> I recommend that you file a bug about this upstream, and try to attach
> relevant info from the error log, maybe a strace output etc. Upstream
> devs will guide you on what to debug next.

The bug report https://jira.mariadb.org/browse/MDEV-265377 was
investigated and fixed because the submitter provided a trace and core
dump. If you are hitting a bug and you cannot provide steps to
reproduce it, you should provide other data. Don't waste time on
recompiling your own versions, that path won't lead to any fixes
upstream or in Debian.

If you represent an enterprise and are running MariaDB in a critical
environment, you might want to consider hiring a consultant or getting a
support contract so somebody else can investigate database
crashes/corruption for you.


On Mon, Oct 11, 2021 at 12:58 AM Otto Kekäläinen  wrote:
>
> > The problem is in the ibdata1 file (about 450MB). Deleted other database 
> > directories and it still crashes, deleted ibdata1 and it runs.
> >
> > How to bisect mariadb from git? Tried:
> > $ git bisect good mariadb-10.3.29
> > $ git bisect bad mariadb-10.3.31
> > the build process showed version 10.2 so I aborted it.
> >
> > Checked out mariadb-10.3.30 but dpkg-buildpackage failed with:
> > dh_install: mariadb-plugin-cassandra missing files: 
> > etc/mysql/conf.d/cassandra.cnf
>
> Some dependency was missing and Cassandra was not built. Note that the
> upstream repository is not identical to the one in Debian regarding
> the contents of debian/ directory. MariaDB builds without a cache take
> 30 mins each and there are all kinds of things going on. Doing bisect
> (fully correctly) on MariaDB is hard even for experienced developers.
> Your time is probably better spent doing some other kind of debugging.
>
> I recommend that you file a bug about this upstream, and try to attach
> relevant info from the error log, maybe a strace output etc. Upstream
> devs will guide you on what to debug next.
>
> One thing you could also try is to start the server with 10.3.29 and
> ensure that you have a clean shutdown (SET GLOBAL
> innodb_fast_shutdown=0; SHUTDOWN) and only after that start with the
> new 10.3.31 binary.
>
> Ref
> - https://mariadb.com/kb/en/shutdown/#see-also
> - https://www.percona.com/blog/2020/05/07/prepare-mysql-for-a-safe-shutdown/



Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31

2021-10-25 Thread Otto Kekäläinen
Control: severity -1 normal

On Thu, Oct 21, 2021 at 9:24 PM Marc Gallet  wrote:
> I've been brought to this bug by apt-listbugs while doing upgrades
> on my buster install, warning me of a grave bug.

We have two users who have experienced a potentially corrupted
database (out of hundreds of thousands or even potentially millions of
users, depending how one wants to extrapolate the popcon data). A bug
report has been filed and it is kept open in case somebody could
provide a way to reproduce the bug or report something actionable.
Otherwise neither the Debian packagers nor upstream developers (and
upstream does not even know about this bug, since it is still vague
and no bug report has been filed upstream) will do anything about the
bug report.

I am now downgrading this bug report severity to "normal" so that it
will not raise false alarms for random users.

> I have not attempted the upgrade yet, since, after reading this bug, I
> see a risk of data corruption and I would like to avoid going into
> recovery procedures (from backups) as a result of what should be a
> stable upgrade.

You should have a backup anyway, that is just good practice while
maintaining database systems.

When you want to upgrade, run 'apt upgrade'.

If your database is already broken/corrupted, the upgrade will not fix
it. You can easily test your database by restarting it (and see that
it restarts), read the logs and related documentation. Official Debian
package documentation is the README files in the packaging, and they
contain more tips about best practices. I recommend you use them as
the primary source of information and don't put too much weight on a
single bug report.



Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31

2021-10-13 Thread Yves-Alexis Perez
On Tue, 2021-10-12 at 21:58 +0200, Ondrej Zary wrote:
> > One thing you could also try is to start the server with 10.3.29 and
> > ensure that you have a clean shutdown (SET GLOBAL
> > innodb_fast_shutdown=0; SHUTDOWN) and only after that start with the
> > new 10.3.31 binary.
> 
> With fast shutdown disabled, mysqld (10.3.29) seems to be stuck in an infinite
> loop. 100% CPU usage, no I/O for an hour until I killed it.
> Even dropping all the databases did not help - killed it after it has been
> running for 5 minutes.

I tried that as well and for now the shutdown process seems a bit stuck:

2021-10-13  8:50:01 0 [Note] /usr/sbin/mysqld (initiated by: unknown): Normal shutdown
2021-10-13  8:50:01 0 [Note] Event Scheduler: Purging the queue. 0 events
2021-10-13  8:50:01 0 [Note] InnoDB: FTS optimize thread exiting.
2021-10-13  8:50:01 4 [Note] InnoDB: to purge 5 transactions
2021-10-13  8:50:01 0 [Note] InnoDB: Starting shutdown...
2021-10-13  8:50:01 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool
2021-10-13  8:50:01 0 [Note] InnoDB: Buffer pool(s) dump completed at 211013  8:50:01
2021-10-13  8:51:02 0 [Note] InnoDB: Waiting for master thread to exit
2021-10-13  8:51:03 0 [Note] InnoDB: Waiting for change buffer merge to complete number of bytes of change buffer just merged: 1251
2021-10-13  8:52:02 0 [Note] InnoDB: Waiting for master thread to exit
[...]
2021-10-13  9:02:05 0 [Note] InnoDB: Waiting for master thread to exit
2021-10-13  9:02:14 0 [Note] InnoDB: Waiting for change buffer merge to complete number of bytes of change buffer just merged: 746
2021-10-13  9:03:05 0 [Note] InnoDB: Waiting for master thread to exit
2021-10-13  9:03:15 0 [Note] InnoDB: Waiting for change buffer merge to complete number of bytes of change buffer just merged: 1040

I don't think anything still uses MySQL/MariaDB (I've stopped apache2 just in
case). I'll let it run a bit but I'm not really confident with that shutdown
process.
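
To judge whether a shutdown like the one logged above is still progressing, the
repeated notes can be summarised. This is a minimal sketch that assumes the
log line format shown in this message; the function name is hypothetical and
the log is read from standard input.

```shell
#!/bin/sh
# Sketch only: summarise how long a shutdown has been waiting, based on the
# "Waiting for master thread to exit" notes in the error log.
# The function name is hypothetical.
shutdown_wait_summary() {
    awk '
        # Field 2 is the time of day in the log format shown above.
        /InnoDB: Starting shutdown/         { start = $2 }
        /Waiting for master thread to exit/ { waits++; last = $2 }
        END {
            printf "start=%s last_wait=%s wait_messages=%d\n",
                   (start ? start : "-"), (last ? last : "-"), waits
        }'
}

# Example (the log path is an assumption and depends on the configuration):
#   shutdown_wait_summary < /var/log/mysql/error.log
```

A growing count together with changing "change buffer just merged" numbers
suggests slow progress rather than a hard hang, but that reading is an
interpretation, not something the log states.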

I've tried to rebuild mariadb-10.3 with the patch upstream committed, but it
doesn't apply to 10.3.31 directly so I stalled a bit.

Regards,
-- 
Yves-Alexis



Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31

2021-10-12 Thread Ondrej Zary
On Sunday 10 October 2021 23:58:12 Otto Kekäläinen wrote:
> > The problem is in the ibdata1 file (about 450MB). Deleted other database 
> > directories and it still crashes, deleted ibdata1 and it runs.
> >
> > How to bisect mariadb from git? Tried:
> > $ git bisect good mariadb-10.3.29
> > $ git bisect bad mariadb-10.3.31
> > the build process showed version 10.2 so I aborted it.
> >
> > Checked out mariadb-10.3.30 but dpkg-buildpackage failed with:
> > dh_install: mariadb-plugin-cassandra missing files: 
> > etc/mysql/conf.d/cassandra.cnf
> 
> Some dependency was missing and Cassandra was not built. Note that the
> upstream repository is not identical to the one in Debian regarding
> the contents of debian/ directory. MariaDB builds without a cache take
> 30 mins each and there are all kinds of things going on. Doing bisect
> (fully correctly) on MariaDB is hard even for experienced developers.
> Your time is probably better spent doing some other kind of debugging.
> 
> I recommend that you file a bug about this upstream, and try to attach
> relevant info from the error log, maybe a strace output etc. Upstream
> devs will guide you on what to debug next.
> 
> One thing you could also try is to start the server with 10.3.29 and
> ensure that you have a clean shutdown (SET GLOBAL
> innodb_fast_shutdown=0; SHUTDOWN) and only after that start with the
> new 10.3.31 binary.

With fast shutdown disabled, mysqld (10.3.29) seems to be stuck in an infinite 
loop. 100% CPU usage, no I/O for an hour until I killed it.
Even dropping all the databases did not help - killed it after it has been 
running for 5 minutes.

Deleted ibdata1 (and ib_logfile0, ib_logfile1), then shutdown ended 
immediately. Seems that the file structure is corrupted somehow - probably 
because of a previous bug. One table is also affected by the "ERROR 1118 
(42000): Row size too large (> 8126)" bug.

When the complete SQL dump is restored with a new ibdata1, everything works 
(upgrade to 10.3.31 and also clean shutdown).
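
The dump-and-restore route described above could look roughly like the
following. This is a sketch under assumptions, not the exact steps that were
run: the paths, the function names and the DRY_RUN switch are hypothetical, and
instead of deleting ibdata1 and the ib_logfile files individually it moves the
whole data directory aside so nothing is lost. With DRY_RUN=1 (the default)
the commands are only printed.

```shell
#!/bin/sh
# Sketch only: rebuild the data directory from a full SQL dump.
# Paths, function names and the DRY_RUN switch are assumptions.
DRY_RUN=${DRY_RUN:-1}
DATADIR=${DATADIR:-/var/lib/mysql}
DUMP=${DUMP:-/root/all-databases.sql}

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rebuild_from_dump() {
    # 1. Dump everything while the old server still runs.
    run mysqldump --all-databases --routines --events --result-file="$DUMP"
    # 2. Stop the server and keep the old files instead of deleting them.
    run systemctl stop mariadb
    run mv "$DATADIR" "$DATADIR.broken"
    # 3. Create a fresh data directory, which gets a new ibdata1.
    run mysql_install_db --user=mysql --datadir="$DATADIR"
    run systemctl start mariadb
    # 4. Load the dump into the new instance.
    run mysql -e "source $DUMP"
}

# Example: review the plan first, then run it for real as root.
#   rebuild_from_dump
#   DRY_RUN=0 rebuild_from_dump
```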

-- 
Ondrej Zary



Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31

2021-10-10 Thread Otto Kekäläinen
> The problem is in the ibdata1 file (about 450MB). Deleted other database 
> directories and it still crashes, deleted ibdata1 and it runs.
>
> How to bisect mariadb from git? Tried:
> $ git bisect good mariadb-10.3.29
> $ git bisect bad mariadb-10.3.31
> the build process showed version 10.2 so I aborted it.
>
> Checked out mariadb-10.3.30 but dpkg-buildpackage failed with:
> dh_install: mariadb-plugin-cassandra missing files: 
> etc/mysql/conf.d/cassandra.cnf

Some dependency was missing and Cassandra was not built. Note that the
upstream repository is not identical to the one in Debian regarding
the contents of debian/ directory. MariaDB builds without a cache take
30 mins each and there are all kinds of things going on. Doing bisect
(fully correctly) on MariaDB is hard even for experienced developers.
Your time is probably better spent doing some other kind of debugging.

I recommend that you file a bug about this upstream, and try to attach
relevant info from the error log, maybe a strace output etc. Upstream
devs will guide you on what to debug next.

One thing you could also try is to start the server with 10.3.29 and
ensure that you have a clean shutdown (SET GLOBAL
innodb_fast_shutdown=0; SHUTDOWN) and only after that start with the
new 10.3.31 binary.

Ref
- https://mariadb.com/kb/en/shutdown/#see-also
- https://www.percona.com/blog/2020/05/07/prepare-mysql-for-a-safe-shutdown/
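
The clean-shutdown suggestion above can be sketched as a small script. The
function names are hypothetical, and it is assumed that the invoking user can
connect with administrative rights (for example root via the unix socket on
Debian).

```shell
#!/bin/sh
# Sketch only: force a slow (full) InnoDB shutdown on the old binary before
# installing the new one. Function names are hypothetical.
slow_shutdown_sql() {
    # innodb_fast_shutdown=0 asks InnoDB to finish purge and change buffer
    # merge before exiting, which can take a long time.
    printf '%s\n' 'SET GLOBAL innodb_fast_shutdown=0; SHUTDOWN;'
}

slow_shutdown() {
    # Sends the statements to the running server via the mysql client.
    mysql -e "$(slow_shutdown_sql)"
}

# Intended order (not executed here):
#   slow_shutdown     # while still running 10.3.29
#   apt upgrade       # then install 10.3.31
```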



Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31

2021-10-10 Thread Ondrej Zary
On Sunday 10 October 2021 16:55:45 Otto Kekäläinen wrote:
> Hello!
> 
> Thanks for reporting. Could you please check if this has been reported
> upstream at jira.mariadb.org?
> 
> There isn't much we can do about InnoDB internals in Debian packaging.
> 

The problem is in the ibdata1 file (about 450MB). Deleted other database 
directories and it still crashes, deleted ibdata1 and it runs.

How to bisect mariadb from git? Tried:
$ git bisect good mariadb-10.3.29
$ git bisect bad mariadb-10.3.31
the build process showed version 10.2 so I aborted it.

Checked out mariadb-10.3.30 but dpkg-buildpackage failed with:
dh_install: mariadb-plugin-cassandra missing files: 
etc/mysql/conf.d/cassandra.cnf

-- 
Ondrej Zary



Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31

2021-10-10 Thread Ondrej Zary
Haven't found this exact problem. This seems to be closest but the error 
messages are different: https://jira.mariadb.org/browse/MDEV-25981

I'm going to copy the datadir to another machine and debug it further.

On Sunday 10 October 2021 16:55:45 Otto Kekäläinen wrote:
> Hello!
>
> Thanks for reporting. Could you please check if this has been reported
> upstream at jira.mariadb.org?
>
> There isn't much we can do about InnoDB internals in Debian packaging.


-- 
Ondrej Zary



Bug#996028: [debian-mysql] Bug#996028: InnoDB: corrupted TRX_NO after upgrading to 10.3.31

2021-10-10 Thread Otto Kekäläinen
Hello!

Thanks for reporting. Could you please check if this has been reported
upstream at jira.mariadb.org?

There isn't much we can do about InnoDB internals in Debian packaging.