Bug#881034: Sid: Galera-3 default configuration; nodes beyond primary will not connect.

Louis Schmitz Tue, 07 Nov 2017 02:03:46 -0800

Package: galera-3
Version: 25.3.19-2
Severity: minor
Tags: sid

The default configuration of Galera-3 systemd scripts for the mariaDB server in 
debian Sid is incorrect. The secondary (tertiary, etc.) nodes do not connect to 
the primary node.
The problem may also be with one of the 'systemd' or 'mariadb' packages, and 
the script may be provided upstream. This information is unknown to me.


Reproducing the issue:
1) Create 3 connected VMs or connect 3 physical machines on a LAN with a 
default debian install.
2) Install the galera-3, mariadb-client, mariadb-server, and rsync packages.
3) Configure /etc/mysql/conf.d/galera.cnf with the recommended settings. In my 
case, each of 3 servers had a configuration file like this (the two bottom 
lines are changed on a server-to-server basis):

[mysqld]
#mysql settings
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_doublewrite=1
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0

#galera settings
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="osdb_cluster"
wsrep_cluster_address=gcomm://10.0.40.111,10.0.40.112,10.0.40.113
wsrep_sst_method=rsync

wsrep_node_address="10.0.40.111"
wsrep_node_name="galera01"

4) Bootstrap the cluster by using the recommended script on the primary node, 
by calling
    $ galera_new_cluster
5) Using the recommended way, check whether the cluster is started:

MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 1     |
+--------------------+-------+
1 row in set (0.01 sec)

6) Now, on the second node, try the recommended way of starting this node (as 
described on galera's home page):
    $ systemctl start mysql
Here's where the bug happens. This operation fails with the mysql server not 
starting on the second node, while we're expecting a two-node cluster.  Below 
is the output passed to the file '/var/log/mysql/error.log'.

2017-11-06 17:10:13 139930521723456 [Note] WSREP: Read nil XID from storage 
engines, skipping position init
2017-11-06 17:10:13 139930521723456 [Note] WSREP: wsrep_load(): loading 
provider library '/usr/lib/galera/libgalera_smm.so'
2017-11-06 17:10:13 139930521723456 [Note] WSREP: wsrep_load(): Galera 
3.19(rb98f92f) by Codership Oy <[email protected]> loaded successfully.
2017-11-06 17:10:13 139930521723456 [Note] WSREP: CRC-32C: using hardware 
acceleration.
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Found saved state: 
00000000-0000-0000-0000-000000000000:-1, safe_to_bootsrap: 1
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Passing config to GCS: 
base_dir = /var/lib/mysql/; base_host = 10.0.40.113; base_port = 4567; 
cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = 
PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; 
evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; 
evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = 
PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; 
evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; 
gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = 
/var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; 
gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 
1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; 
gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; 
gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; 
gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ig
2017-11-06 17:10:13 139930521723456 [Note] WSREP: GCache history reset: 
old(00000000-0000-0000-0000-000000000000:0) -> 
new(00000000-0000-0000-0000-000000000000:-1)
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Assign initial position for 
certification: -1, protocol version: -1
2017-11-06 17:10:13 139930521723456 [Note] WSREP: wsrep_sst_grab()
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Start replication
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Setting initial position to 
00000000-0000-0000-0000-000000000000:-1
2017-11-06 17:10:13 139930521723456 [Note] WSREP: protonet asio version 0
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Using CRC-32C for message 
checksums.
2017-11-06 17:10:13 139930521723456 [Note] WSREP: backend: asio
2017-11-06 17:10:13 139930521723456 [Note] WSREP: gcomm thread scheduling 
priority set to other:0
2017-11-06 17:10:13 139930521723456 [Warning] WSREP: access 
file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
2017-11-06 17:10:13 139930521723456 [Note] WSREP: restore pc from disk failed
2017-11-06 17:10:13 139930521723456 [Note] WSREP: GMCast version 0
2017-11-06 17:10:13 139930521723456 [Note] WSREP: (f877f818, 
'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2017-11-06 17:10:13 139930521723456 [Note] WSREP: (f877f818, 
'tcp://0.0.0.0:4567') multicast: , ttl: 1
2017-11-06 17:10:13 139930521723456 [Note] WSREP: EVS version 0
2017-11-06 17:10:13 139930521723456 [Note] WSREP: gcomm: connecting to group 
'osdb_cluster', peer '10.0.40.111:,10.0.40.112:,10.0.40.113:'
2017-11-06 17:10:13 139930521723456 [Note] WSREP: (f877f818, 
'tcp://0.0.0.0:4567') connection established to f877f818 tcp://10.0.40.113:4567
2017-11-06 17:10:13 139930521723456 [Warning] WSREP: (f877f818, 
'tcp://0.0.0.0:4567') address 'tcp://10.0.40.113:4567' points to own listening 
address, blacklisting
2017-11-06 17:10:13 139930521723456 [Note] WSREP: (f877f818, 
'tcp://0.0.0.0:4567') connection established to f5e5497f tcp://10.0.40.112:4567
2017-11-06 17:10:13 139930521723456 [Note] WSREP: (f877f818, 
'tcp://0.0.0.0:4567') connection established to 71b33d37 tcp://10.0.40.111:4567
2017-11-06 17:10:13 139930521723456 [Note] WSREP: (f877f818, 
'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2017-11-06 17:10:13 139930521723456 [Note] WSREP: declaring 71b33d37 at 
tcp://10.0.40.111:4567 stable
2017-11-06 17:10:13 139930521723456 [Note] WSREP: declaring f5e5497f at 
tcp://10.0.40.112:4567 stable
2017-11-06 17:10:13 139930521723456 [Note] WSREP: 
view(view_id(NON_PRIM,71b33d37,360) memb {
        71b33d37,0
        f5e5497f,0
        f877f818,0
} joined {
} left {
} partitioned {
        1d6324f7,0
        1e18a4f1,0
        2416f453,0
        3355448a,0
        5fc6bb86,0
        62a46c5c,0
        b1235628,0
})
2017-11-06 17:10:16 139930521723456 [Note] WSREP: (f877f818, 
'tcp://0.0.0.0:4567') connection to peer f877f818 with addr 
tcp://10.0.40.113:4567 timed out, no messages seen in PT3S
2017-11-06 17:10:16 139930521723456 [Note] WSREP: (f877f818, 
'tcp://0.0.0.0:4567') turning message relay requesting off
2017-11-06 17:10:39 139930521723456 [Note] WSREP: (f877f818, 
'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: 
tcp://10.0.40.112:4567
2017-11-06 17:10:40 139930521723456 [Note] WSREP: (f877f818, 
'tcp://0.0.0.0:4567') reconnecting to f5e5497f (tcp://10.0.40.112:4567), 
attempt 0
2017-11-06 17:10:43 139930521723456 [ERROR] WSREP: failed to open gcomm backend 
connection: 110: failed to reach primary view: 110 (Connection timed out)
         at gcomm/src/pc.cpp:connect():158
2017-11-06 17:10:43 139930521723456 [ERROR] WSREP: 
gcs/src/gcs_core.cpp:gcs_core_open():208: Failed to open backend connection: 
-110 (Connection timed out)
2017-11-06 17:10:43 139930521723456 [ERROR] WSREP: 
gcs/src/gcs.cpp:gcs_open():1380: Failed to open channel 'osdb_cluster' at 
'gcomm://10.0.40.111,10.0.40.112,10.0.40.113': -110 (Connection timed out)
2017-11-06 17:10:43 139930521723456 [ERROR] WSREP: gcs connect failed: 
Connection timed out
2017-11-06 17:10:43 139930521723456 [ERROR] WSREP: 
wsrep::connect(gcomm://10.0.40.111,10.0.40.112,10.0.40.113) failed: 7
2017-11-06 17:10:43 139930521723456 [ERROR] Aborting

As one can see; the node will not connect to the base node. Normally, this 
issue would be caused by a network problem. I checked using 'nmap', and the 
ports galera wants to use by default are not in use by another service or 
blocked by a firewall. This is a simple test setup without any firewall or 
password settings.

WORKAROUND:
I believe the problem is caused by an error in the systemctl script for 
starting mysql. I found a workaround for the problem like so:

7) On the second node, manually start mysqld with
$ mysqld --wsrep-new-cluster
8) Connect with a second terminal and manually kill the mysqld process. (E.g. 
kill -9 12741, where 12741 is whatever PID got assigned to mysql)
9) Now start mysql via the recommended way on node 2:
$ systemctl start mysql
10) Back on node 1; we can run the query to check that the new node was added 
to the cluster;

MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 2     |
+--------------------+-------+
1 row in set (0.01 sec)

This time, there is no error, and the second cluster node starts up correctly 
(this process can be repeated for the 3rd, 4th, etc. nodes).

CONFIGURATION:

Apart from the configuration files mentioned above, no configuration files have 
been modified (e.g. the config files are the unmodified defaults shipped with 
mariadb, which sets up a basic database server with no password protection).

$ uname -a
Linux st01 4.9.0-4-amd64 #1 SMP Debian 4.9.51-1 (2017-09-28) x86_64 GNU/Linux

$ cat /etc/issue
Debian GNU/Linux 9 \n \l

Bug#881034: Sid: Galera-3 default configuration; nodes beyond primary will not connect.

Reply via email to