Hello,

Following local experience with IPMI STONITH, we've found that certain
systems (e.g. Sun X4200) would occasionally fail to perform a reset.
Using a power off followed by a power on proved far more reliable. So if
scripts are likely to find their way into mixed environments, then going
the route of the lowest common denominator is probably the best and
safest way to go. NB: the script we use has been previously submitted to
this list.
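
For illustration only, here is a minimal sketch of the off-then-on idea.
It is not the script submitted to the list. It assumes ipmitool is
installed and that the BMC is reachable over the lan interface; the
function names, the retry count, the delay, and the host/user/password
arguments are all hypothetical. It deliberately ignores the exit code of
the power off command and relies on a follow-up status query instead.

```python
import subprocess
import time


def ipmi_cmd(host, user, password, *args):
    """Build an ipmitool command line for the lan interface."""
    return ["ipmitool", "-I", "lan", "-H", host, "-U", user, "-P", password, *args]


def run_ipmitool(cmd):
    """Run ipmitool and return (exit_code, stdout)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout


def fence_off_on(host, user, password, run=run_ipmitool, sleep=time.sleep,
                 retries=10, delay=2):
    """Power the node off, confirm it is off, then power it back on.

    Returns True only if the off state was confirmed and the power on
    command exited with 0. The retries and delay values are assumptions.
    """
    # Exit code of "power off" is ignored; the status query below decides.
    run(ipmi_cmd(host, user, password, "chassis", "power", "off"))
    for _ in range(retries):
        rc, out = run(ipmi_cmd(host, user, password, "chassis", "power", "status"))
        if rc == 0 and "is off" in out.lower():
            break
        sleep(delay)
    else:
        # Never confirmed off: report the fencing attempt as failed.
        return False
    rc, _ = run(ipmi_cmd(host, user, password, "chassis", "power", "on"))
    return rc == 0
```

A caller would use something like fence_off_on("node1-bmc", "admin",
"secret") and treat a False result as a failed fencing attempt. The run
and sleep parameters exist only so the logic can be exercised without
real hardware.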

Cautionary footnote: for more recent server systems that claim IPMI
compatibility, it appears that the support for IPMI is at the kernel
level only (and not via the iLO itself), so if the system becomes
unresponsive and a fencing action is sent, it probably won't have any
effect. For these systems we've resorted to using non-IPMI based
scripts (e.g. a modified rilo).

IPMI is a useful standard, but as with any standard it would help end
users (us) if manufacturers implemented things in a consistent manner :-).


Pete
> Hi, Dejan
>
> It seems that I have the same hardware as you: some HP ProLiant DL145
> servers with a QLogic BMC which (claims to) support IPMI 1.5.
>
> I tried the IPMI power cycle function and got no response from my
> server. I think there may be something wrong between the ACPI
> interface and the server's BMC, so that the OS never learned that a
> soft reset had been requested.
>
> As for STONITH use, I think you should always use power reset, as
> the stonith script external/ipmi does. A power reset performs a quick,
> "real" reset of the server hardware and does not depend on OS behavior,
> which is what we need STONITH to do.
>
> Recently I built a 4-node HA cluster for my site's LVS, with pingd and
> STONITH support. I use Debian's Heartbeat 2.1.3 package with a few
> modifications, and I found some issues with IPMI:
>
> 1. stonith2/ipmilan does not work: the OpenIPMI library throws a
> segmentation fault.
> 2. external/ipmi does not work, because ipmitool power reset sometimes
> successfully resets my server but does not exit with 0, so I had to
> modify the script to exit 0 when called with reset ...
>
> Regards,
>
> Chun TIAN (binghe)
>
> On 2008-1-23, at 11:40 PM, Dejan Muhamedagic wrote:
>



-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 