Hi,

On Mon, Sep 22, 2008 at 01:47:40PM +0100, Peter Clapham wrote:
> Dejan Muhamedagic wrote:
>> Hi,
>>
>> On Mon, Sep 22, 2008 at 12:51:20PM +0200, Christian Lox wrote:
>>   
>>> Hi there,
>>>
>>> can anyone here share best practices for setting up heartbeat
>>> communication on an IBM BladeCenter (8677 with HS20 blades)?
>>> We plan on using eth0 for the servers' "normal" payload and eth1 for 
>>> connecting to our iSCSI storage and an extra VLAN for heartbeat.
>>> Is this recommended?
>>>     
>>
>> Use both ethernets for heartbeats.
>>
>>   
>>> Same goes for stonith. Is it safe to use the BMC (if so, which plugin
>>> should we use), or is an external device recommended?
>>>     
>>
>> Does it support IPMI? In that case you can try either ipmilan or
>> external/ipmi. There's also bladehpi, but I'm not sure if your
>> management modules support it (HPI). If you want to try bladehpi,
>> then you'll need to compile the plugin (it's missing for some
>> reason from the 2.99 packages).
>>
>> Thanks,
>>
>> Dejan
>>
>>   
>
> The chassis uses IBM's own IBMmpcli to communicate with the MMC. I'm aware
> of a STONITH resource script that makes use of this and that has been
> discussed on this list before (I apologise, I can't recall all the
> details). I think it may actually be part of the distributed resource
> scripts (although I'm not currently sure), so it's worth checking :)

That'd be external/ibmrsa. There's also external/ibmrsa-telnet,
an implementation which avoids mpcli. Being a Java app, mpcli
uses inordinate amounts of memory and perhaps even fails at
times too (I can't remember the details either).

> If you get nowhere there, ping me and I can attach a copy of our local 
> script, which we have been using for approx. 18 months in production
> without STONITH issues on basically the same setup as yours.
>
> One useful caveat... we have found that the MMCs do not like a high 
> communication / connect rate. This can cause the unit to go AWOL, so be 
> gentle with the monitoring etc :)

This should be a general warning. Many STONITH devices become
temperamental if one pushes them too hard. Furthermore, it's more
than enough to monitor a stonith device every half an hour or so:
remember that it is a service which is needed very seldom. The
probability that it'll be needed within the same 30-minute window
in which it failed is low.
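
To put a rough number on "low", here is a back-of-the-envelope
sketch. It assumes fencing requests arrive independently at a
constant average rate (a Poisson model), and the rate of 12
fencing events per year is a made-up figure for illustration
only, not something measured on any real cluster. The function
name is likewise just for this example.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60


def prob_needed_in_window(fence_events_per_year: float,
                          window_minutes: float) -> float:
    """Chance that at least one fencing request falls inside one window.

    Assumes Poisson arrivals: P(at least one) = 1 - exp(-expected_count).
    """
    expected_count = fence_events_per_year * window_minutes / MINUTES_PER_YEAR
    return 1.0 - math.exp(-expected_count)


# Hypothetical inputs: 12 fencing events a year, 30-minute monitor interval.
p = prob_needed_in_window(fence_events_per_year=12, window_minutes=30)
print(f"chance stonith is needed in a given 30-minute window: {p:.4%}")
```

Under those assumptions the chance comes out well below 0.1% per
window, and that is before multiplying by the (also small) chance
that the device broke in that very window.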

Thanks,

Dejan

> Other than that... all is good :) so good luck and enjoy.
>
> Pete
>>> Thanks for any hint,
>>> Christian
>>> _______________________________________________
>>> Linux-HA mailing list
>>> [email protected]
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
>>>     
>>   
>
>
>
> -- 
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited, 
> a charity registered in England with number 1021457 and a company 
> registered in England with number 2742969, whose registered office is 215 
> Euston Road, London, NW1 2BE. 