On 09/08/16 16:20 -0400, [email protected] wrote:
> I've got a 3-node CentOS6 cluster and I'm trying to add mysql 5.1 as
> a new service. Other cluster services (IP addresses, PostgreSQL,
> applications) work fine.
>
> The mysql config file and data files are located on shared,
> cluster-wide storage (GPFS), as are config/data files for other
> services which work correctly.
>
> On each node, I can successfully start mysql locally via:
>     service mysqld start
> and via:
>     rg_test test /etc/cluster/cluster.conf start service mysql
>
> (In each case, the corresponding command with the 'stop' option will
> also successfully shut down mysql.)
>
> However, attempting to start the mysql service with clusvcadm
> results in the service failing over from one node to the next, and
> being marked as "stopped" after the last node.
>
> Each failover happens very quickly, in about 5 seconds. I suspect
> that rgmanager isn't waiting long enough for mysql to start before
> checking whether it is running, and I have added startup delays in
> cluster.conf, but they don't seem to be honored. Nothing is written
> into the mysql log file at this time -- no startup or failure
> messages, which implies that mysqld never begins to run. The only
> log entries (/var/log/messages, /var/log/cluster/*, etc.) reference
> rgmanager, not the mysql process itself.
>
> Any suggestions?
see inline below...

> RHCS components:
> cman-3.0.12.1-78.el6.x86_64
> luci-0.26.0-78.el6.centos.x86_64
> rgmanager-3.0.12.1-26.el6_8.3.x86_64
> ricci-0.16.2-86.el6.x86_64
> corosync-1.4.7-5.el6.x86_64
>
> --------------------- /etc/cluster/cluster.conf (edited subset) -----------------
> <cluster config_version="63" name="example-rhcs">
>   <rm>
>     <resources>
>       <postgres-8 config_file="/var/lib/pgsql/data/postgresql.conf"
>           name="PostgreSQL8" postmaster_user="postgres" startup_wait="25"/>
>       <ip address="192.168.169.173" sleeptime="10"/>
>       <mysql config_file="/cluster_shared/mysql_centos6/etc/my.cnf"
>           listen_address="192.168.169.173" name="mysql" shutdown_wait="10"
>           startup_wait="30"/>
>     </resources>
>     <service max_restarts="3" name="mysql" recovery="restart"
>         restart_expire_time="180">
>       <ip ref="192.168.169.173">
>         <mysql ref="mysql"/>
>       </ip>
>     </service>
>   </rm>
> </cluster>
> --------------------------------------------------------------------------
>
> --------- /var/log/cluster/rgmanager.log from attempt to start mysql with clusvcadm ---------
> Aug 08 11:58:16 rgmanager Recovering failed service service:mysql
> Aug 08 11:58:16 rgmanager [ip] Link for eth2: Detected
> Aug 08 11:58:16 rgmanager [ip] Adding IPv4 address 192.168.169.173/24 to eth2
> Aug 08 11:58:16 rgmanager [ip] Pinging addr 192.168.169.173 from dev eth2
> Aug 08 11:58:18 rgmanager [ip] Sending gratuitous ARP: 192.168.169.173 c8:1f:66:e8:bb:34 brd ff:ff:ff:ff:ff:ff
> Aug 08 11:58:19 rgmanager [mysql] Verifying Configuration Of mysql:mysql
> Aug 08 11:58:19 rgmanager [mysql] Verifying Configuration Of mysql:mysql > Succeed
> Aug 08 11:58:19 rgmanager [mysql] Monitoring Service mysql:mysql
> Aug 08 11:58:19 rgmanager [mysql] Checking Existence Of File /var/run/cluster/mysql/mysql:mysql.pid [mysql:mysql] > Failed
> Aug 08 11:58:19 rgmanager [mysql] Monitoring Service mysql:mysql > Service Is Not Running
> Aug 08 11:58:19 rgmanager [mysql] Starting Service mysql:mysql
> Aug 08 11:58:19 rgmanager [mysql] Looking For IP Address > Succeed - IP Address Found
> Aug 08 11:58:20 rgmanager [mysql] Starting Service mysql:mysql > Succeed
> Aug 08 11:58:21 rgmanager [mysql] Monitoring Service mysql:mysql
> Aug 08 11:58:21 rgmanager 1 events processed
> Aug 08 11:58:21 rgmanager [mysql] Checking Existence Of File /var/run/cluster/mysql/mysql:mysql.pid [mysql:mysql] > Failed

The business of launching services used to be incredibly racy (and
often still is): the launch script finishes, which is taken to mean
the service is ready, while in truth it is still "just warming up"
(perhaps the PID file has not even been created by then). So I can
imagine that this hackish workaround

    127         # Sleep 1 sec before checking status so mysqld can start
    128         sleep 1

may not be enough in your deployment (large DB, high load from other
services, clustered or not, unlike in the rg_test scenario...), so I'd
start by raising that value in /usr/share/cluster/mysql.sh to some
higher figure and seeing whether it helps.
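For illustration, a minimal sketch of such a tweak, assuming the start
path of the agent still looks like the excerpt above -- the 30-second
ceiling is my placeholder, and the literal pid-file path is only what
your log shows (mysql.sh derives the real one from the service name):

    # Hypothetical replacement for the fixed "sleep 1": poll for the
    # pid file for up to $tries seconds before the status check runs.
    tries=30                                           # placeholder ceiling
    pidfile="/var/run/cluster/mysql/mysql:mysql.pid"   # path as seen in your log
    while [ "$tries" -gt 0 ] && [ ! -f "$pidfile" ]; do
        sleep 1
        tries=$((tries - 1))
    done

Even just bumping "sleep 1" to "sleep 10" (or whatever your slowest
node needs) should be enough to tell you whether this race is the
culprit.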
> Aug 08 11:58:21 rgmanager [mysql] Monitoring Service mysql:mysql > Service Is Not Running
> Aug 08 11:58:21 rgmanager start on mysql "mysql" returned 7 (unspecified)
> Aug 08 11:58:21 rgmanager #68: Failed to start service:mysql; return value: 1
> Aug 08 11:58:21 rgmanager Stopping service service:mysql
> Aug 08 11:58:21 rgmanager [mysql] Verifying Configuration Of mysql:mysql
> Aug 08 11:58:21 rgmanager [mysql] Verifying Configuration Of mysql:mysql > Succeed
> Aug 08 11:58:21 rgmanager [mysql] Stopping Service mysql:mysql
> Aug 08 11:58:21 rgmanager [mysql] Checking Existence Of File /var/run/cluster/mysql/mysql:mysql.pid [mysql:mysql] > Failed - File Doesn't Exist
> Aug 08 11:58:21 rgmanager [mysql] Stopping Service mysql:mysql > Succeed
> --------------------------------------------------------------------------------

-- 
Jan (Poki)
