On Friday 30 March 2007 01:25, Alan Robertson wrote:
> Max Hofer wrote:
> > I have a question regarding the heartbeat message exchange.
> >
> > Currently I have 2 cluster systems, each consisting of 2 nodes:
> > - cluster A consists of nodes A1, A2
> > - cluster B consists of nodes B1, B2
> >
> > All 4 nodes are attached with bonded interfaces to two LAN
> > switches SW1 and SW2 (let's call it the normal LAN).
> >
> > A1 and A2 (and B1 and B2) have a direct interconnection over which
> > the DRBD devices are synchronized, plus a serial cable (let's call
> > it the DRBD LAN).
> >
> > Thus cluster A (and B) currently uses 3 different ways to exchange
> > the heartbeat packets:
> > - bcast over the DRBD LAN
> > - ucast over the normal LAN
> > - the serial cable
> >
> > Now I have figured out that cluster A needs states/data from
> > cluster B (and vice versa) for some fail-over decisions.
> >
> > I see 2 possible solutions:
> > a) writing a resource agent which polls the state from the other
> >    cluster, and I use this state
> > b) I configure 1 single cib.xml with 2 "sub-clusters"
> >
> > By sub-cluster I mean that certain resources run only on cluster A
> > and other resources run only on B.
> >
> > My question now:
> > * What will happen if one of the nodes is disconnected from the
> >   normal LAN - is the information tunneled over the redundant
> >   connections?
> >
> > Scenario: A1 is disconnected from SW1. A2 still receives HB packets
> > via the serial line and the DRBD LAN. Do B1 and B2 see A1 as dead,
> > or do they get the information about A1 via A2?
> >
> > Maybe a strange scenario, but I have it (unfortunately I cannot
> > test it because of some external constraints ... managers! ;-)
>
> I'd suggest starting your thinking by having a single cluster, and using
> the heartbeat APIs to exchange messages. You can even do this at a
> shell script level ;-).

Is there documentation somewhere about the messages?
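(For what option b) could look like: a single cib.xml can pin resources to one "sub-cluster" with a location constraint. A minimal sketch in Heartbeat 2 CIB syntax, assuming a node attribute - I'm calling it "subcluster" purely for illustration - has been set to "A" on A1/A2 and "B" on B1/B2; all ids and the resource name are placeholders, not from a real configuration.)

```xml
<!-- Hypothetical sketch: forbid resource_A on every node whose
     "subcluster" node attribute is not "A". Names and ids are
     placeholders. -->
<rsc_location id="loc-resource_A-subcluster-A" rsc="resource_A">
  <rule id="loc-resource_A-rule" score="-INFINITY">
    <expression id="loc-resource_A-expr"
                attribute="subcluster" operation="ne" value="A"/>
  </rule>
</rsc_location>
```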
> Or you could write status information into CIB node attributes - which
> are automatically distributed... And, clone resources notify you when
> your peers come and go.

I already write the status into the CIB. When I was referring to "states"
above, I meant CIB node status attributes. I would like to control
resources on A1/A2 based on node attributes located on B1/B2.

My main question was: if A1 is not connected over the normal LAN and
B1/B2 change an attribute in the CIB, is this change propagated from
B1/B2 to A2, with A2 then passing it on to A1 over the serial cable or
the DRBD LAN? If this is the case, I'm fine and I can write my
constraints to start/stop resources on A1/A2.

> So, a thought would be a clone resource combined with writing short,
> simple state into the node attributes.

I have a set of resource agents which create quite a lot of attributes
(currently I have approx. 50 status attributes for each cluster node
---> the combined cluster would have more than 100 attributes). But I
have noticed the following heartbeat behaviour:

* attrd seems to be a bottleneck. One resource agent starts a process
  monitoring daemon (something like pingd, except that instead of the
  network it monitors processes I'm interested in). To keep it short,
  multiple threads write to the CIB using attrd_updater. During the
  daemon's start/shutdown phase a lot of attributes are written to the
  CIB, and attrd_updater sometimes returns an error code saying attrd
  is not accessible. I'm not sure attrd was designed to handle changes
  from multiple threads. (For now I help myself by calling
  attrd_updater repeatedly when it returns an error code.)

* On cluster startup the log gets spammed like hell: each time an
  attribute is set, the transition engine re-applies the rules based on
  the attributes.

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
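(The "repeated call of attrd_updater" workaround mentioned in the mail can be sketched as a generic shell retry wrapper. The wrapper itself is plain POSIX sh; the attrd_updater invocation, the attribute name, and the retry count in the usage comment are only illustrative.)

```shell
#!/bin/sh
# Sketch: retry a command up to $1 times, sleeping 1s between attempts.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
    max=$1; shift
    i=0
    until "$@"; do
        i=$((i + 1))
        [ "$i" -ge "$max" ] && return 1
        sleep 1
    done
    return 0
}

# Hypothetical usage on a cluster node (attribute name made up):
#   retry 5 attrd_updater -n proc_status -v up
```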
