Hello,
We are evaluating new hardware by reproducing real-life workload on
real-life data which works fine on an existing server.
We copied over the software used to ensure an apple-to-apple
comparison, which is based on MariaDB 10.2.23 and uses primarily
InnoDB tables, with a few Mroonga tables which do not seem involved in
the problem.
On the new hardware, we are seeing catastrophically bad performance,
especially chain-deadlocks happening and being resolved but producing
virtually no useful work at our usual parallelism level of 64 active
connections.
If we reduce parallelism to 2 active connections, queries start to
succeed in any meaningful proportion.
The database was freshly restored from a mysqldump (as we cannot
interrupt the original database), could this have any effect on
deadlocks ?
The main tables involved in the deadlocking queries (job queues) are
initially (in the test, and periodically in reality) empty, so it would
seem surprising.
What should I check to debug further ?
Note that these deadlocks happen in production but they're negligible.
See attachment for mariadb configuration file.
Some more details:
When we restore from mysqldump, we use the following extra options to
speed things up:
innodb_flush_log_at_trx_commit = 0
innodb_flush_method = nosync
innodb_doublewrite = 0
sync_frm = 0
as during that period nothing of value can be lost (worst case we
restart the restore from scratch).
BTW, despite these settings, we are still noticing a lot of fsyncs. Is
this expected ? Are we missing some other option ?
Also, after the import and without any connection, MariaDB was still
producing a non-trivial amount of activity on the machine: 5% CPU, read
<1MB/s, write 10MB/s, 60 fsync/s.
I could not identify what is causing these, where should I look ? How
can I tell when it will stabilise back to idle ?
When I interrupt the benchmark workload, I see a similar resource usage.
I saw https://jira.mariadb.org/browse/MDEV-18698
And InnoDB seems to throttle its background activity: is there a way to
tell InnoDB to perform its background tasks at maximum speed ?
(the goal is that after the import, we can make a clean tarball that we
extract when we want to launch the test case again)
Regards,
Julien
# ERP5 buildout my.cnf template based on my-huge.cnf shipped with mysql
# The MySQL server
[mysqld]
# ERP5 by default requires InnoDB storage. MySQL by default fallbacks to using
# different engine, like MyISAM. Such behaviour generates problems only, when
# tables requested as InnoDB are silently created with MyISAM engine.
#
# Loud fail is really required in such case.
sql_mode="NO_ENGINE_SUBSTITUTION"
skip_show_database
bind_address = 10.0.148.200
port = 2099
socket = /srv/slapgrid/slappart25/var/run/mariadb.sock
datadir = /srv/slapgrid/slappart25/srv/mariadb
tmpdir = /srv/slapgrid/slappart25/tmp
pid_file = /srv/slapgrid/slappart25/var/run/mariadb.pid
log_error = /srv/slapgrid/slappart25/var/log/mariadb_error.log
slow_query_log
slow_query_log_file = /srv/slapgrid/slappart25/var/log/mariadb_slowquery.log
long_query_time = 1
max_allowed_packet = 128M
query_cache_size = 32M
innodb_file_per_table = 0
plugin_load = ha_mroonga.so;handlersocket.so
# By default only 100 connections are allowed, when using zeo
# we may have much more connections
max_connections = 1000
innodb_buffer_pool_size = 17179869184
innodb_buffer_pool_instances = 16
innodb_log_file_size = 134217728
# very important to allow parallel indexing
# Note: this is compatible with binlog-based incremental backups, because ERP5
# doesn't use "insert ... select" (in any number of queries) pattern.
innodb_locks_unsafe_for_binlog = 1
# Some dangerous settings you may want to uncomment temporarily
# if you only want performance or less disk access.
#innodb_flush_log_at_trx_commit = 0
#innodb_flush_method = nosync
#innodb_doublewrite = 0
#sync_frm = 0
# Force utf8 usage
collation_server = utf8_unicode_ci
character_set_server = utf8
skip_character_set_client_handshake
[client]
socket = /srv/slapgrid/slappart25/var/run/mariadb.sock
user = root
[mysql]
no_auto_rehash
[mysqlhotcopy]
interactive_timeout
[mysqldump]
max_allowed_packet = 128M
_______________________________________________
Mailing list: https://launchpad.net/~maria-discuss
Post to : [email protected]
Unsubscribe : https://launchpad.net/~maria-discuss
More help : https://help.launchpad.net/ListHelp