Send Linux-HA mailing list submissions to
[email protected]
To subscribe or unsubscribe via the World Wide Web, visit
http://lists.linux-ha.org/mailman/listinfo/linux-ha
or, via email, send a message with subject or body 'help' to
[EMAIL PROTECTED]
You can reach the person managing the list at
[EMAIL PROTECTED]
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Linux-HA digest..."
Today's Topics:
1. Re: heartbeat dying (Gary Schlachter)
2. monitor mysql + prevent splitbrain (2node-cluster) (Lino Moragon)
3. RE: Get resource location by C/C++ program (API) (Stephan Berlet)
4. Re: monitor mysql + prevent splitbrain (2node-cluster)
(Michael Brennen)
----------------------------------------------------------------------
Message: 1
Date: Mon, 14 Jan 2008 11:58:17 -0500
From: Gary Schlachter <[EMAIL PROTECTED]>
Subject: Re: [Linux-HA] heartbeat dying
To: General Linux-HA mailing list <[email protected]>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Dejan,
I started there. However, I could not install 2.1.3 on Fedora Core 1
because it needs later versions of other RPMs. I can build 2.1.3 on FC1,
but when I try to package heartbeat, I get missing dependencies:
libnet-devel, openhpi-devel, gnutls-devel, and OpenIPMI-devel.
Is there a way around this?
Gary
Dejan Muhamedagic wrote:
Hi,
On Fri, Jan 11, 2008 at 10:22:48AM -0500, Gary Schlachter wrote:
I have a problem with heartbeat dying. I have a 3 node cluster running
HA 2.0.8 on Fedora Core 1. They are providing a single IP address
resource. They are using eth0 as the heartbeat mechanism. If I disconnect
the eth0 cable from the node which is providing the IP address, one of the
other nodes correctly begins providing it. However, shortly after
disconnecting the eth0 cable, the heartbeat process (and others) die.
This was fixed a few months ago. The fix is in the 2.1.3
release. Could you please use the new release?
Thanks,
Dejan
The key area in the ha-debug log looks like the following:
pengine[4293]: 2008/01/11_09:50:22 info: determine_online_status: Node
loneranger.us.big.net is online
pengine[4293]: 2008/01/11_09:50:22 info: native_print: SharedIP
(heartbeat::ocf:IPaddr): Started loneranger.us.big.net
pengine[4293]: 2008/01/11_09:50:22 notice: StopRsc: loneranger.us.big.net
Stop SharedIP
crmd[9543]: 2008/01/11_09:50:22 info: do_state_transition:
loneranger.us.big.net: State transition S_POLICY_ENGINE
->S_TRANSITION_ENGINE [input=I_PE_SUCCESS cause=C_IPC_MESSAGE
origin=route_message ]
pengine[4293]: 2008/01/11_09:50:22 info: process_pe_message: Transition 0:
PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-137.bz2
tengine[4292]: 2008/01/11_09:50:22 info: unpack_graph: Unpacked transition
0: 1 actions in 1 synapses
tengine[4292]: 2008/01/11_09:50:22 info: send_rsc_command: Initiating
action 3: SharedIP_stop_0 on loneranger.us.big.net
crmd[9543]: 2008/01/11_09:50:22 info: do_lrm_rsc_op: Performing
op=SharedIP_stop_0 key=3:0:994066a9-4cae-49a4-abad-37f3e0b84b3e)
IPaddr[4300]: 2008/01/11_09:50:22 INFO: /sbin/ifconfig eth0:0 10.1.2.50
down
lrmd[9540]: 2008/01/11_09:50:22 info: RA output: (SharedIP:stop:stderr)
SIOCDELRT: No such process
crmd[9543]: 2008/01/11_09:50:22 info: process_lrm_event: LRM operation
SharedIP_stop_0 (call=4, rc=0) complete
cib[9539]: 2008/01/11_09:50:22 info: cib_diff_notify: Update (client: 9543,
call:32): 0.30.317 -> 0.30.318 (ok)
cib[4315]: 2008/01/11_09:50:22 info: write_cib_contents: Wrote version
0.30.318 of the CIB to disk (digest: ad7329b3cddc6a9bbd96deb332a3d08f)
tengine[4292]: 2008/01/11_09:50:22 info: te_update_diff: Processing diff
(cib_update): 0.30.317 -> 0.30.318
tengine[4292]: 2008/01/11_09:50:22 info: match_graph_event: Action
SharedIP_stop_0 (3) confirmed on c8608d41-66b2-4115-9043-4a8423b0d562
tengine[4292]: 2008/01/11_09:50:22 info: run_graph: Transition 0:
(Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0)
tengine[4292]: 2008/01/11_09:50:22 info: notify_crmd: Transition 0 status:
te_complete - <null>
crmd[9543]: 2008/01/11_09:50:22 info: do_state_transition:
loneranger.us.big.net: State transition S_TRANSITION_ENGINE -> S_IDLE [
input=I_TE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Cannot write to media pipe 0:
Resource temporarily unavailable
heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Shutting down.
heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Cannot write to media pipe 0:
Resource temporarily unavailable
heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Shutting down.
heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Cannot write to media pipe 0:
Resource temporarily unavailable
heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Shutting down.
The last messages repeat for a very long time, then most daemons eventually
stop.
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
------------------------------
Message: 2
Date: Mon, 14 Jan 2008 19:33:15 +0100
From: Lino Moragon <[EMAIL PROTECTED]>
Subject: [Linux-HA] monitor mysql + prevent splitbrain (2node-cluster)
To: [email protected]
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Hi list,
I've got 2 questions concerning the prevention of splitbrain and
monitoring MySQL Server 5.
I'm testing a MySQL Server 5 with 3 instances on a CentOS 5.1 with
Heartbeat and DRBD on a 2 Node Cluster (active / passive)
At the moment my 2 Nodes are running on a VMware Server.
I use the following Versions:
heartbeat v. 2.0.8-1
DRBD v. 8.0.6
For heartbeat style I'm using Release 1.
Each node has 2 NICs configured: one for DRBD sync and heartbeat, and
another one for heartbeat only.
haresources:
mysql1 drbddisk::r0 Filesystem::/dev/drbd0::/pool/mysql/::ext3
172.16.100.110 mysqld_multi
My Questions:
1. If I unplug both NICs of the active node, I get a split brain after I
reconnect them.
Is there any way to prevent this with heartbeat R1, or which
options would I have with R2?
2. How can I tell heartbeat to fail over automatically to my passive
node if any of my MySQL processes hangs or terminates?
Can these processes be monitored so that a failure triggers an
automatic failover? If yes, which tools would I have to use?
I dug around the linux-ha site and other mailing-list articles, but so
far without success.
Has anyone used this combination yet?
I'd be very thankful for any ideas / suggestions.
Lino
------------------------------
Message: 3
Date: Mon, 14 Jan 2008 19:40:36 +0100
From: "Stephan Berlet" <[EMAIL PROTECTED]>
Subject: RE: [Linux-HA] Get resource location by C/C++ program (API)
To: "'General Linux-HA mailing list'" <[email protected]>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain;charset="iso-8859-1"
Hello again,
I've worked on these things. I'm not finished yet, but now I
understand a couple of things better.
On Jan 10, 2008, at 0:23 PM, Andrew Beekhof wrote:
On Jan 9, 2008, at 7:30 PM, Stephan Berlet wrote:
When I try to compile crm_mon.c, the compiler complains that it can't
find the headers "lha_internal.h" and "lib/crm/pengine/unpack.h"
crm_mon.c can only be built from within the project
in particular, the name of the first header should tell you something
about who should be including it :-)
you're better off starting from scratch and copying in only what you
need
That is what I've done in the meantime.
Neither file exists on my filesystem (I searched for them
using 'locate'). Is that because I installed heartbeat from RPMs?
right, they're both internal files which are not installed. you
shouldn't be using them.
I've simply omitted these two files, and I hope it works anyway.
Another problem for me is that there are some conflicts with C++
keywords
someone had a nice solution for this previously on the mailing list.
i forget the details but google should be able to help
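That solution works fine; here is my code for it:
#ifdef __cplusplus
extern "C" {
// Rename C identifiers that collide with C++ keywords
# define delete __fake_delete
# define private __fake_private
# define new __fake_new
# define class __fake_class
// Add other defines for any conflicting C++ keyword
#endif
/*** include heartbeat headers here ***/
#ifdef __cplusplus
}
// Restore the keywords for the C++ code that follows
# undef delete
# undef private
# undef new
# undef class
#endif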
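For readers who want to try the keyword trick in isolation, here is a
minimal sketch. It assumes a made-up struct (legacy_op) standing in for
the C headers, and made-up replacement names (c_class_ and so on); none
of these come from heartbeat. Compilers generally accept redefining
keywords this way, but the C++ standard does not sanction it, so no
standard headers should be included while the macros are active.

```cpp
// Hypothetical stand-in for a C header whose declarations use identifiers
// that are keywords in C++. In real use this region would hold the
// #include lines for the C headers instead of the struct below.
extern "C" {
#define class  c_class_
#define new    c_new_
#define delete c_delete_

struct legacy_op {
    int class;   /* the C++ compiler sees this member as c_class_  */
    int new;     /* seen as c_new_    */
    int delete;  /* seen as c_delete_ */
};

#undef delete
#undef new
#undef class
}

// From here on the keywords behave normally again; the members are
// reachable under their replacement names.
inline int legacy_op_sum(const legacy_op& op) {
    return op.c_class_ + op.c_new_ + op.c_delete_;
}

inline int* make_counter(int start) {
    return new int(start);   // plain C++ 'new' works after the #undef
}
```

The #undef lines matter: without them, every later use of new, delete
or class in the same translation unit would be rewritten by the
preprocessor as well.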
and invalid transformations (e.g. void* to resource_t*)
Is it possible to make the macro "slist_iter" C++ compliant?
probably
but not being a c++ guy i'd not know how. i'm happy to take patches
though...
I worked out a solution for this, too. Just modify one line
in the definition of slist_iter
(more precisely, line 196 of /include/crm/crm.h, version 2.1.2-3):
-- child = __crm_iter_head->data; \
++ child = (child_type *) __crm_iter_head->data; \
That works for my purposes.
Similar changes are needed for the xml_child_iter macro in xml.h.
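The reason the cast is needed: C converts void* to any object pointer
type implicitly, while C++ requires an explicit cast. A small sketch of
the same idea follows. The list type demo_node and the macro
DEMO_LIST_ITER are invented for illustration and are NOT the
definitions from crm.h.

```cpp
// Hypothetical list node carrying an untyped payload, as C code often does.
struct demo_node {
    void*      data;
    demo_node* next;
};

// C accepts "child = iter_->data;" because void* converts implicitly to
// any object pointer type. C++ rejects that, so the macro casts explicitly.
#define DEMO_LIST_ITER(child, child_type, head, code)                      \
    for (demo_node* iter_ = (head); iter_ != nullptr; iter_ = iter_->next) { \
        child_type* child = (child_type*) iter_->data;                     \
        code;                                                              \
    }

// Example use: sum a list whose payloads are ints.
inline int sum_ints(demo_node* head) {
    int total = 0;
    DEMO_LIST_ITER(value, int, head, total += *value);
    return total;
}
```

Because the explicit cast is also valid C, a macro written this way
should compile unchanged from both languages.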
On Jan 8, 2008, at 3:20 AM, Andrew Beekhof wrote:
On Jan 7, 2008, at 2:54 PM, Stephan Berlet wrote:
Hello,
First of all, please excuse my bad English!
We use heartbeat 2.1.2-3 in a 2 node cluster, just to manage the
virtual IP address 172.30.4.170. We have a network service that has to
run on both nodes to make sure they have a synchronous data set.
Therefore both nodes have to know which one holds the virtual IP.
I would like to implement that with the heartbeat API.
If you're using the crm, then the correct API to use is from the
Policy Engine.
For an example, check out the source code for crm_mon.
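As a point of comparison only, and not a use of the heartbeat or Policy
Engine API: a node can also ask the operating system whether an address
is configured locally. The sketch below uses the getifaddrs() call
available on Linux and the BSDs; the function name host_has_ipv4 is made
up for this example. It only reports what the local kernel knows and
says nothing about cluster state.

```cpp
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <sys/socket.h>

// Returns true if the given dotted-quad IPv4 address is currently
// configured on any local interface.
inline bool host_has_ipv4(const char* dotted_quad) {
    in_addr wanted;
    if (inet_pton(AF_INET, dotted_quad, &wanted) != 1) {
        return false;                      // not a valid IPv4 literal
    }
    ifaddrs* list = nullptr;
    if (getifaddrs(&list) != 0) {
        return false;                      // could not enumerate interfaces
    }
    bool found = false;
    for (ifaddrs* ifa = list; ifa != nullptr && !found; ifa = ifa->ifa_next) {
        if (ifa->ifa_addr == nullptr || ifa->ifa_addr->sa_family != AF_INET) {
            continue;                      // skip non-IPv4 entries
        }
        const sockaddr_in* sin =
            reinterpret_cast<const sockaddr_in*>(ifa->ifa_addr);
        found = (sin->sin_addr.s_addr == wanted.s_addr);
    }
    freeifaddrs(list);
    return found;
}
```

A service could call host_has_ipv4("172.30.4.170") periodically to see
whether its own node currently holds the virtual IP.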
Maybe I will report my final results on this subject,
or I will ask you many more questions ;)
Best regards and many thanks,
Stephan
HELPING HEADS for Hard- and Software
-------------------------------------------------------------------------
We develop tailor-made solutions for your projects: fast, flexible,
and on site. Our well-practiced team of experienced hardware and
software specialists supports you wherever you need us.
--------------------------------------------------------------------------
SysDesign GmbH
Säntisstrasse 25
D-88079 Kressbronn am Bodensee
Managing directors: Franz Kleiner, Achim Solle
Commercial register: Ulm 632138
--------------------------------------------------------------------------
------------------------------
Message: 4
Date: Mon, 14 Jan 2008 12:47:07 -0600
From: Michael Brennen <[EMAIL PROTECTED]>
Subject: Re: [Linux-HA] monitor mysql + prevent splitbrain
(2node-cluster)
To: [email protected]
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset="iso-8859-1"
On Monday 14 January 2008 12:33, Lino Moragon wrote:
Hi list,
I've got 2 questions concerning the prevention of splitbrain and
monitoring MySQL Server 5.
I'm testing a MySQL Server 5 with 3 instances on a CentOS 5.1 with
Heartbeat and DRBD on a 2 Node Cluster (active / passive)
At the moment my 2 Nodes are running on a VMware Server.
I use the following Versions:
heartbeat v. 2.0.8-1
DRBD v. 8.0.6
For heartbeat style I'm using Release 1.
Each node has 2 NICs configured: one for DRBD sync and heartbeat, and
another one for heartbeat only.
haresources:
mysql1 drbddisk::r0 Filesystem::/dev/drbd0::/pool/mysql/::ext3
172.16.100.110 mysqld_multi
My Questions:
1. If I unplug both NICs of the active node, I get a split brain after I
reconnect them.
Is there any way to prevent this with heartbeat R1, or which
options would I have with R2?
That sounds normal, as both machines then think they can become primary.
Do you have a fence mechanism in place so the secondary can forcibly take the
former primary out of service?
2. How can I tell heartbeat to fail over automatically to my passive
node if any of my MySQL processes hangs or terminates?
Can these processes be monitored so that a failure triggers an
automatic failover? If yes, which tools would I have to use?
I'm not sure about that; I will be awaiting the answer myself. :)