On Friday 30 March 2007 01:25, Alan Robertson wrote:
> Max Hofer wrote:
> > I have a question regarding the heartbeat message exchange.
> >
> > Currently I have 2 cluster systems, each consisting of 2 nodes:
> > - cluster A consists of nodes A1, A2
> > - cluster B consists of nodes B1, B2
> >
> > All 4 nodes are attached with bonded interfaces to two LAN
> > switches SW1 and SW2 (let's call it the normal LAN).
> >
> > A1 and A2 (and B1 and B2) have a direct interconnection over which
> > the DRBD devices are synchronized, plus a serial cable (let's call
> > it the DRBD LAN).
> >
> > Thus cluster A (and B) currently uses 3 different ways to exchange
> > the heartbeat packets:
> > - bcast over the DRBD LAN
> > - ucast over the normal LAN
> > - the serial cable
> >
> > Now I have figured out that cluster A needs states/data from
> > cluster B (and vice versa) for some fail-over decisions.
> >
> > I see 2 possible solutions:
> > a) writing a resource agent which polls the state from the other
> >    cluster, and I use this state
> > b) I configure 1 single cib.xml with 2 "sub-clusters"
> >
> > By sub-cluster I mean that certain resources run only on cluster A
> > and other resources run only on B.
> >
> > My question now:
> > * What will happen if one of the nodes is disconnected from the
> >   normal LAN - is the information tunneled over the redundant
> >   connections?
> >
> > Scenario: A1 is disconnected from SW1. A2 still receives HB packets
> > via the serial line and the DRBD LAN. Do B1 and B2 see A1 as dead,
> > or do they get the information about A1 via A2?
> >
> > Maybe a strange scenario, but I have it (unfortunately I cannot
> > test it because of some external constraints ... managers! ;-)
>
> I'd suggest starting your thinking by having a single cluster, and using
> the heartbeat APIs to exchange messages. You can even do this at a
> shell script level ;-).

Is there documentation somewhere about the messages?
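(For what option b) could look like: a single cib.xml can pin resources to one "sub-cluster" with a location constraint. A minimal sketch in Heartbeat 2 CIB syntax, assuming a node attribute - I'm calling it "subcluster" purely for illustration - has been set to "A" on A1/A2 and "B" on B1/B2; all ids and the resource name are placeholders, not from a real configuration.)

```xml
<!-- Hypothetical sketch: forbid resource_A on every node whose
     "subcluster" node attribute is not "A". Names and ids are
     placeholders. -->
<rsc_location id="loc-resource_A-subcluster-A" rsc="resource_A">
  <rule id="loc-resource_A-rule" score="-INFINITY">
    <expression id="loc-resource_A-expr"
                attribute="subcluster" operation="ne" value="A"/>
  </rule>
</rsc_location>
```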
> Or you could write status information into CIB node attributes - which
> are automatically distributed... And, clone resources notify you when
> your peers come and go.

I already write the status into the CIB. When I was referring to "states"
above, I meant CIB node status attributes. I would like to control
resources on A1/A2 based on node attributes located on B1/B2.

My main question was: if A1 is not connected over the normal LAN and
B1/B2 change an attribute in the CIB, is this change propagated from
B1/B2 to A2, with A2 then passing it on to A1 over the serial cable or
the DRBD LAN? If this is the case, I'm fine and I can write my
constraints to start/stop resources on A1/A2.

> So, a thought would be a clone resource combined with writing short,
> simple state into the node attributes.

I have a set of resource agents which create quite a lot of attributes
(currently I have approx. 50 status attributes for each cluster node
---> the combined cluster would have more than 100 attributes). But I
have noticed the following heartbeat behaviour:

* attrd seems to be a bottleneck. One resource agent starts a process
  monitoring daemon (something like pingd, except that instead of the
  network it monitors processes I'm interested in). To keep it short,
  multiple threads write to the CIB using attrd_updater. During the
  daemon's start/shutdown phase a lot of attributes are written to the
  CIB, and attrd_updater sometimes returns an error code saying attrd
  is not accessible. I'm not sure attrd was designed to handle changes
  from multiple threads. (For now I help myself by calling
  attrd_updater repeatedly when it returns an error code.)

* On cluster startup the log gets spammed like hell: each time an
  attribute is set, the transition engine re-applies the rules based on
  the attributes.

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
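(The "repeated call of attrd_updater" workaround mentioned in the mail can be sketched as a generic shell retry wrapper. The wrapper itself is plain POSIX sh; the attrd_updater invocation, the attribute name, and the retry count in the usage comment are only illustrative.)

```shell
#!/bin/sh
# Sketch: retry a command up to $1 times, sleeping 1s between attempts.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
    max=$1; shift
    i=0
    until "$@"; do
        i=$((i + 1))
        [ "$i" -ge "$max" ] && return 1
        sleep 1
    done
    return 0
}

# Hypothetical usage on a cluster node (attribute name made up):
#   retry 5 attrd_updater -n proc_status -v up
```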
