Can you create a hb_report for this? There are a number of things that could be going on but we need that report to say for sure.
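
In the meantime, on Raoul's point (quoted below) about adding monitor actions for all applicable roles: here is a rough sketch of what that might look like, not a tested configuration. I am assuming the op element accepts a role attribute as described on the Actions page he links, and that each role's monitor needs its own distinct interval. The two monitor ids and the 30s interval are made up for illustration; the rest is copied from your snippet.

```xml
<operations>
  <!-- Sketch only: one recurring monitor per role (role attribute is an assumption). -->
  <!-- The intervals differ on purpose so the two monitors are treated as separate operations. -->
  <op id="mysqld-child-monitor-master" name="monitor" role="Master"
      interval="20s" timeout="19s" prereq="nothing"/>
  <!-- Hypothetical id and interval for the Slave-role monitor. -->
  <op id="mysqld-child-monitor-slave" name="monitor" role="Slave"
      interval="30s" timeout="19s" prereq="nothing"/>
  <op id="mysqld-child-start" name="start" prereq="nothing"/>
</operations>
```

If that is roughly what you already have, the report should show whether the monitors are actually being scheduled.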
On Tue, Feb 26, 2008 at 11:43 AM, Adrian Chapela <[EMAIL PROTECTED]> wrote:
> Raoul Bhatia [IPAX] wrote:
> > Adrian Chapela wrote:
> >> <master_slave id="MySQL_Server">
> > [snip]
> >> <operations>
> >> <op id="mysqld-child-monitor" name="monitor"
> >> interval="20s" timeout="19s" prereq="nothing"/>
> >> <op id="mysqld-child-start" name="start" prereq="nothing"/>
> >> </operations>
> >> </primitive>
> >> </master_slave>
> >>
> >> I think that this line: <op id="mysqld-child-monitor"
> >> name="monitor" interval="20s" timeout="19s" prereq="nothing"/> is the
> >> line that configures the monitor operation and how often it runs. I
> >> think the interval in this line is 20 seconds, but when I manually
> >> cause an error on the Master MySQL server to test failover, I see that
> >> the monitor operation isn't being executed and the error isn't
> >> detected by Heartbeat.
> >>
> >> If I run the script manually the error is detected, but Heartbeat is
> >> not running the script in monitor mode, so it doesn't know about the
> >> problem. This is the crm_mon output:
> >
> > [snip]
> Yes, I already did this and now I am testing more options. Now a Slave
> server fails over correctly, but I have some problems with my mysql
> script ( http://code.adrianchapela.net/heartbeat/mysql_slave_master ).
> One of them is the stop operation. After a failure, my mysql resource is
> stopped, but the MySQL monitor keeps reporting that the server is down
> and failed. Heartbeat knows the server has failed. When I stop the
> Heartbeat server, it can't stop cleanly. It says this:
>
> crmd[8531]: 2008/02/26_11:09:10 ERROR: verify_stopped: Resource
> mysqld-child:0 was active at shutdown. You may ignore this error if it
> is unmanaged.
> crmd[8531]: 2008/02/26_11:09:10 info: process_client_disconnect:
> Received HUP from tengine:[-1]
> crmd[8531]: 2008/02/26_11:09:10 ERROR: verify_stopped: Resource
> mysqld-child:0 was active at shutdown. You may ignore this error if it
> is unmanaged.
>
> And this:
> tengine[8566]: 2008/02/26_11:09:09 info: te_connect_stonith: Attempting
> connection to fencing daemon...
> crmd[8531]: 2008/02/26_11:09:09 info: stop_subsystem: Sent -TERM to
> tengine: [8566]
> tengine[8566]: 2008/02/26_11:09:09 ERROR: stonithd_signon: Can't
> initiate connection to stonithd
> crmd[8531]: 2008/02/26_11:09:09 info: do_shutdown: Waiting for
> subsystems to exit
> tengine[8566]: 2008/02/26_11:09:09 notice: Not currently connected.
> crmd[8531]: 2008/02/26_11:09:09 info: do_shutdown: All subsystems
> stopped, conti
>
> I am searching for information about these errors. How can I force the
> stop operation? Should the stonith daemon shut down the server
> automatically?
>
> > please refer to [1] and add more monitoring actions for all applicable
> > roles.
> >
> > cheers,
> > raoul
> >
> > [1]
> > http://www.linux-ha.org/ClusterInformationBase/Actions#head-951a50aae161c116d73c95aa0659873ee7a2973b
>
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
