Dejan Muhamedagic wrote:
Hi,
On Fri, Dec 14, 2007 at 01:29:51PM +0100, Miguel Araujo wrote:
Hello HA list!
I have spent 2 or 3 days on the IRC channel asking questions and
probably abusing your patience a little bit ;) which wasn't my intention
at all. I got into the HA world about 6 days ago and I have been gobbling
Welcome!
Thank you!
up documentation. First I would like to say that I find the
documentation on the HA site scattered and hard to follow. The wiki,
however, is easier to read and to search.
That said, I would like to understand heartbeat in depth and, if
possible, contribute in the future to the documentation or by patching
the software. I came across HA because I'm working on a virtualization
project. Basically what I have to do is set up 4 machines with Xen 3.1
using block devices exported from a SAN over iSCSI.
There's a brand new iSCSI ocf RA.
I'm not interested in doing high availability on iSCSI. Thanks for the
info anyway.
That's already done, but now I have been asked to make the service
highly available. This is how I found heartbeat and started reading
about this well-tested software, with many years of experience in the
field, that so many people rely on.
What I want to do is monitor domUs so that if they fail I can move
them to other nodes of my 4-node cluster. On IRC people have already
explained some of the issues I didn't understand before, for example
that you cannot monitor the whole dom0; you monitor resources (domUs).
After that I would like to build a stacked cluster that monitors the
services the domUs run, but this may take ages.
Recently the Xen ocf RA saw some improvements. In particular,
there's a way to hook scripts to monitor resources within the
DomU which may allow you to keep the heartbeat running in the
Dom0. See recent discussion on Xen and the bugzilla entry.
Perhaps you could aid in testing it.
I have been reading it.
I have a 2-node cluster with a SAN for testing and having fun, so don't
worry about warranties. Dominik Klein kindly passed me his cib file for
Xen; I understand almost all of it. The problem now is that I would
like to add fencing to the cluster. Here come the questions:
1.- Are fencing and STONITH different technologies that let you avoid
split-brain? Or are they just two different concepts, both achieved the
STONITH way?
Fencing is a term, STONITH a technology. STONITH makes fencing
possible.
OK, this is the first time I have understood the real difference.
2.- How does STONITH know how to differentiate between a communication
network failure and a crash on the node?
It can't. BTW, STONITH is just a way to reset the node. Other
components (pengine in particular) decide when to reset the node.
So, what happens if the stonith device can't do its work and then the
supposedly dead node comes back to life and screws up the resource?
I mean if the node's network
fails, how can the STONITH device kill it?
There are STONITH devices (mainly UPSes) which are controlled over
serial. Another class is the lights-out style devices, such as
iLO (HP) or RSA (IBM).
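
To make the ordering discussed above concrete, here is a minimal Python sketch of a "fence first, then recover" policy. It is not heartbeat code: the Cluster class, the fence callback and the node and resource names are all hypothetical, and the rule shown (move nothing until the reset is confirmed) is only one conservative way to handle a STONITH device that cannot do its work.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Cluster:
    # Hypothetical callback: returns True only if the node reset was confirmed.
    fence: Callable[[str], bool]
    # Maps each resource (for example a domU) to the node currently running it.
    placement: Dict[str, str] = field(default_factory=dict)

    def handle_silent_node(self, lost: str, survivor: str) -> List[str]:
        """Return the resources moved to `survivor`, or [] if fencing failed."""
        # A silent node may have crashed or may only be unreachable;
        # from the surviving node both cases look the same.
        if not self.fence(lost):
            # Reset not confirmed: leave resources alone rather than risk
            # two nodes using the same block device.
            return []
        moved = [res for res, node in self.placement.items() if node == lost]
        for res in moved:
            self.placement[res] = survivor
        return moved
```

With a fence callback that returns False, the placement is left unchanged, so a node that later "comes back to life" still owns its resources and nothing has been started twice.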
3.- As my SAN is not like a ServeRAID (it is not self-fencing as a
resource, so to speak), I would like to run a different fencing script
for every node of my cluster, since each node initially has different
block devices mounted (one for every virtual machine). I know how to
block nodes from accessing the SAN using its CLI. Can it be done? Could
you send me some example cib files?
CIB files won't help here. Managing access is not so easy to
implement. There's some code around for that, I believe. Don't
know about its state though. Look for "shared disk access" or
similar in the dev list archives.
OK. So if I'm not mistaken, there is currently no way to do what I want
using heartbeat. If I can neither fence my nodes this way nor buy
special hardware, heartbeat is useless for my purpose, isn't it?
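
For illustration only, the per-node fencing idea from question 3 could be sketched in Python as below. Everything here is an assumption: the NODE_DEVICES inventory, the node and LUN names, and above all the "san-cli deny --initiator ... --lun ..." command line, which stands in for whatever the real SAN CLI expects. The function only builds the commands; it does not run them and is not tied to heartbeat in any way.

```python
from typing import Dict, List

# Hypothetical inventory: the SAN block devices each node has mounted.
NODE_DEVICES: Dict[str, List[str]] = {
    "node1": ["lun-vm1", "lun-vm2"],
    "node2": ["lun-vm3"],
}


def fence_commands(node: str, cli: str = "san-cli") -> List[List[str]]:
    """Build one argv list per block device to revoke `node`'s access.

    The command name and its options are placeholders, not a real SAN CLI.
    """
    if node not in NODE_DEVICES:
        raise KeyError(f"no device list known for node {node!r}")
    return [
        [cli, "deny", "--initiator", node, "--lun", lun]
        for lun in NODE_DEVICES[node]
    ]
```

Keeping the node-to-device mapping in one table means the same script serves every node, instead of one hand-written script per node.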
4.- Would you mind listing some STONITH devices available in the market?
Take a look at the output of stonith -L. Those are supported.
Thanks,
No, thank you! I was wondering if you know of any way I can solve my
fencing problem using heartbeat or other HA software. I'm starting to
run out of time for testing.
I'm starting to see myself writing some scripts to do HA my way,
although they would be just a specific solution that would have to be
reworked for every situation, which is neither flexible nor scalable.
Regards and thanks for your patience.
Miguel Araujo
Dejan
Finally I want to thank you all for your time and effort. It's very
likely you will receive more mails asking more of these questions;
thanks in advance.
Regards,
Miguel Araujo
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems