On 05.10.2018 15:00, Simon Bomm wrote:
> Hi all,
>
> Using pacemaker 1.1.18-11 and mysql resource agent (
> https://github.com/ClusterLabs/resource-agents/blob/RHEL6/heartbeat/mysql),
> I run into an unwanted behaviour. My point of view of course, maybe it's
> expected to be as it is that's why I ask.
>
> # My test case is the following :
>
> Everything is OK on my cluster, crm_mon output is as below (no failed
> actions)
>
> Master/Slave Set: ms_mysql-master [ms_mysql]
> Masters: [ db-master ]
> Slaves: [ db-slave ]
>
> 1. I insert in a table on master, no issue data is replicated.
> 2. I shut down net int on the master (vm),

What exactly does that mean? How do you shut down the network?

> pacemaker correctly start on the
> other node. Master is seen as offline, and db-slave is now master
>
> Master/Slave Set: ms_mysql-master [ms_mysql]
> Masters: [ db-slave ]
>
> 3. I bring back my net int up, pacemaker see the node online and set the
> old-master as a the new slave :
>
> Master/Slave Set: ms_mysql-master [ms_mysql]
> Masters: [ db-slave ]
> Slaves: [ db-master ]
>
> 4. From this point, my external monitoring bash script shows that SQL and
> IO thread are not running, but I can't see any error in the pcs
> status/crm_mon outputs.

Pacemaker just shows what the resource agents claim. If the resource agent
claims the resource is started, there is nothing pacemaker can do. You need
to debug what the resource agent does.

> Consequence is that I continue inserting on my new
> promoted master but the data is never consumed by my former master computer.
>
> # Questions :
>
> - Is this some kind of safety behaviour to avoid data corruption when a
> node is back online ?
> - When I want to manually start it like ocf does it returns this error :
>
> mysql -h localhost -u user-repl -pmysqlreplpw -e "START SLAVE"
> ERROR 1200 (HY000) at line 1: Misconfigured slave: MASTER_HOST was not set;
> Fix in config file or with CHANGE MASTER TO
>
> - I would expect the cluster to stop the slave and show a failed action, am
> I wrong here ?
>

I am not familiar with this specific application and its structure. From a
quick look, the monitor action mostly checks for a running process. Is the
MySQL process running?

> # Other details (not sure it matters a lot)
>
> No stonith enabled, no fencing or auto-failback.

How are you going to resolve split-brain without stonith? "Stopping net"
sounds exactly like split brain, in which case further investigation is
rather pointless.

Anyway, to give a non-hypothetical answer, the full configuration and logs
from both systems are needed.

> Symetric cluster
> configured.
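
For the replication threads specifically, a check that runs outside of the
resource agent can look at the Slave_IO_Running and Slave_SQL_Running fields
of SHOW SLAVE STATUS\G. Below is a minimal sketch, not taken from the
resource agent: the function name check_replication_threads is made up for
illustration, and it assumes the usual vertical (\G) output of the mysql
client arrives on stdin.

```shell
#!/bin/sh
# Sketch only (hypothetical helper, not part of the mysql resource agent).
# Reads `SHOW SLAVE STATUS\G` output on stdin and reports whether both
# replication threads are running. Exit status: 0 if both are "Yes", else 1.
check_replication_threads() {
    awk -F': *' '
        # Strip the leading padding that \G (vertical) output puts before names.
        { sub(/^[ \t]+/, "", $1) }
        $1 == "Slave_IO_Running"  { io = $2 }
        $1 == "Slave_SQL_Running" { sql = $2 }
        END {
            if (io == "Yes" && sql == "Yes") { print "OK"; exit 0 }
            # A missing field (for example an empty result) is shown as "unknown".
            printf "FAILED io=%s sql=%s\n", (io == "" ? "unknown" : io), (sql == "" ? "unknown" : sql)
            exit 1
        }
    '
}
```

It could be fed with something like
mysql -e 'SHOW SLAVE STATUS\G' | check_replication_threads
(connection options omitted; the account must be allowed to run that
statement). An empty result, which is what I would expect when no master is
configured on the node, is reported as unknown and treated as a failure.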
>
> Details of my pacemaker resource configuration is
>
> Master: ms_mysql-master
> Meta Attrs: master-node-max=1 clone_max=2 globally-unique=false
> clone-node-max=1 notify=true
> Resource: ms_mysql (class=ocf provider=heartbeat type=mysql)
> Attributes: binary=/usr/bin/mysqld_safe config=/etc/my.cnf.d/server.cnf
> datadir=/var/lib/mysql evict_outdated_slaves=false max_slave_lag=15
> pid=/var/lib/mysql/mysql.pid replication_passwd=mysqlreplpw
> replication_user=user-repl socket=/var/lib/mysql/mysql.sock
> test_passwd=mysqlrootpw test_user=root
> Operations: demote interval=0s timeout=120 (ms_mysql-demote-interval-0s)
> monitor interval=20 timeout=30 (ms_mysql-monitor-interval-20)
> monitor interval=10 role=Master timeout=30
> (ms_mysql-monitor-interval-10)
> monitor interval=30 role=Slave timeout=30
> (ms_mysql-monitor-interval-30)
> notify interval=0s timeout=90 (ms_mysql-notify-interval-0s)
> promote interval=0s timeout=120
> (ms_mysql-promote-interval-0s)
> start interval=0s timeout=120 (ms_mysql-start-interval-0s)
> stop interval=0s timeout=120 (ms_mysql-stop-interval-0s)
>
> Any things I'm missing on this ? Did not find a clearly similar usecase
> when googling around network outage and pacemaker.
>
> Thanks
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org