[EMAIL PROTECTED] wrote on 09/12/2007 04:51:38 PM:

> -          A shelf manager with IPMI service (In fact, there is 2 
> shelf-manager, but the second one is in standby mode)
> -          2 blades running each an openhpi daemon, each of them 
> connected to the shelf-manager to retrieve IPMI data
> -          A SMS connected to OpenHPI with the libopenhpi.so
> 
> 
> Probably using a Master/Slave model, I think that both daemons 
> should have the data in sync in order to be able to provide HPI 
> service in case one of the HPI daemons disappears.  Right?  How hard
> do you think it would be to keep necessary data in sync?  We would 
> probably add a data pipe between the 2 daemons to transfer 
> synchronization data and heartbeat?  Should we also sync some files 
> (Domain Event Log, etc??)
> 
>  What do you think about this?  I am trying to evaluate how much of 
> effort it will be required by us to add these features (in order for
> me to get the goahead)?  

I think that if both domains have a peer relationship and access the same 
hardware through a plugin, you don't need to take any steps to keep the 
domains in sync other than what is specified in the spec as the HPI 
user's responsibility (e.g. HPI B.02.01 spec; Peer Domains, page 23; 
last paragraph, page 83).

Now, peer domains do not require:
- that the same resources (by Resource ID) have the same tag, severity, 
failed flag, or entry ID (order in the RPT).
- that events are received in the same order.
- that internal domain events (Domain Event Log) and user events show 
up in the other domain.
- the same alarm ID or acknowledge status between DATs.
- the same user alarms.
- that alarms resulting from domain conditions appear in the other 
domain.
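
To illustrate the first point, here is a minimal sketch (plain Python, not 
OpenHPI code) of how a client could pair up resources seen through two peer 
domains. The RptEntry class, its field names, and the sample values are all 
made up for illustration; the only thing taken from the list above is that 
the Resource ID is the one attribute that can be relied on to match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RptEntry:
    """Hypothetical stand-in for an RPT entry; not the real HPI structure."""
    entry_id: int      # position in the RPT, may differ between peers
    resource_id: int   # the only key expected to match between peers
    tag: str           # may differ between peers
    severity: str      # may differ between peers


def match_by_resource_id(rpt_a, rpt_b):
    """Pair entries from two peer-domain RPTs using the Resource ID only."""
    by_id_b = {entry.resource_id: entry for entry in rpt_b}
    return {
        entry.resource_id: (entry, by_id_b[entry.resource_id])
        for entry in rpt_a
        if entry.resource_id in by_id_b
    }


# Invented sample data: same resources, different order, tags and severity.
domain_a = [RptEntry(1, 100, "blade-1", "MAJOR"),
            RptEntry(2, 200, "blade-2", "MINOR")]
domain_b = [RptEntry(1, 200, "Blade Two", "MAJOR"),
            RptEntry(2, 100, "Blade One", "MAJOR")]

pairs = match_by_resource_id(domain_a, domain_b)
```

Anything that compares the two views by entry ID or by tag would mismatch 
these resources, which is why the matching is done on Resource ID alone.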

You probably want some or all of these things if you are talking about 
master/slave and heartbeats. In that case, you don't necessarily need a 
peer domain (second paragraph, page 25, spec). A master/slave 
configuration could work. libopenhpi.so could switch between them 
transparently if it loses its connection with one of them. To synchronize 
information between the domains, the slave domain could act like a client 
and get all its information from the master domain. Once libopenhpi.so 
needs to switch to the slave domain, it can send an extra message to let 
the slave know that it must get its information from the plugin instead 
of the master domain. Or this could be automatic, if we let the slave 
switch to the plugin once it detects connection loss with the master. 
Later, the slave polls the master and stops answering HPI requests from 
libopenhpi.so once it finds the master is back up. That would make 
libopenhpi.so switch back to the master, or something along those lines.
An effort like this would take a good chunk of time. The main thing is 
that it depends on having the domains live within their own daemons first, 
which is another good chunk of time (maybe greater). I'm probably going to 
be doing the latter next year.
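
The automatic variant described above can be sketched roughly as a small 
state machine. This is only a simulation in plain Python under my own 
assumptions: SlaveDomain, on_master_poll, choose_daemon and simulate are 
hypothetical names, none of this exists in OpenHPI, and real code would need 
actual connections, timeouts and the data transfer itself.

```python
from enum import Enum


class Source(Enum):
    MASTER = "master"   # slave mirrors its data from the master domain
    PLUGIN = "plugin"   # slave reads the hardware through the plugin


class SlaveDomain:
    """Hypothetical slave daemon state; not an OpenHPI class."""

    def __init__(self):
        self.source = Source.MASTER
        self.serving = False  # whether it answers HPI requests from clients

    def on_master_poll(self, master_alive):
        """Update state from the result of one poll of the master."""
        if not master_alive and self.source is Source.MASTER:
            # Connection to the master lost: switch to the plugin and serve.
            self.source = Source.PLUGIN
            self.serving = True
        elif master_alive and self.source is Source.PLUGIN:
            # Master is back: stop serving so that clients switch back.
            self.source = Source.MASTER
            self.serving = False
        return self.serving


def choose_daemon(master_alive, slave):
    """What the client library would end up talking to (assumed policy)."""
    if slave.serving:
        return "slave"
    return "master" if master_alive else None


def simulate(master_health):
    """Feed a sequence of master up/down observations through the model."""
    slave = SlaveDomain()
    chosen = []
    for alive in master_health:
        slave.on_master_poll(alive)
        chosen.append(choose_daemon(alive, slave))
    return chosen
```

For example, simulate([True, False, False, True]) walks through the master 
going down for two polls and then coming back. The sketch leaves out the 
hard part, which is keeping the slave's copy of the data current while the 
master is up.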

In my personal opinion, this level of failover redundancy, which goes above 
and beyond the HPI spec, is best done in higher-level middleware above 
the HPI layer. For example, Linux-HA supports a stonith plugin for the 
BladeCenter using OpenHPI. 
However, though the primary goal of OpenHPI is to implement and comply 
with the SAF HPI spec, secondary goals are guided by its volunteers and 
their interests. So I would support any contributions as long as they are 
clean and integrate well.

        --Renier

 
_______________________________________________
Openhpi-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openhpi-devel
