Re: [Linux-HA] Hostname & ignore error question

Rob Morin Mon, 14 Apr 2008 10:23:07 -0700

Thanks for the reply Dejan....

Our company has a policy to use Debian apt-get packages only. To my knowledge, what I am running is the latest available for AMD64 in apt-get; I know it's not the latest.


|The point of having a service in a cluster is to make
|it more available, right? So, if your service is unstable, it
|should first be fixed.

Yes, I agree with the above, but is it not better to have pop, imap, mysql and mail delivery working while apache is down? Clients who can access some services will be happier than if everything is down...

Implementing heartbeat & drbd ended up being more complicated on a production system than I expected... I had to integrate into a live system what I had tested on a dev system... it's not as easy as 1 2 3. I found the HA website a bit confusing, as there seems to be a mix of documentation between versions... so this is why I am asking questions here... I wanted the heartbeat implementation to be as simple as possible... we use one IP for web, pop, mysql and imap...

I did realize an error I had in my haresources file, however: I had the wrong server name in there as primary. I should have had Joe rather than Stewie; I changed that just before I sent my email... I prefer to use the haresources file for config, as I found the XML config a bit confusing when I tried to use it...

I do not understand when you say I need 2 comm links... do you mean for replication of data, or for the heartbeat itself...?


I very much appreciate your input, thanks again...

Rob Morin
Dido Internet Inc.
Montreal,Canada
http://www.dido.ca
514-990-4444



Dejan Muhamedagic wrote:

Hi,

On Mon, Apr 14, 2008 at 09:22:08AM -0400, Rob Morin wrote:
Hello all my first post here so be gentle....  :)
I have already set up DRBD and Heartbeat-2 on 2 Debian Etch servers: primary named Joe, secondary named Stewie.
DRBD is version 8 and heartbeat-2 is version 2.0.7-2, both via apt-get.
2.0.7-2 is rather old. You would want to upgrade, in particular
if you run v2/crm style configurations.
I am using 2 NICs: eth0, which is private, for DRBD replication and heartbeat, and eth1, used for my real public IP address, which outsiders connect to for the services.
See below.
I am not using heartbeat yet, but I am using drbd, as I am having trouble getting heartbeat to take over on the secondary server (Stewie). The problem is Apache is dying for some reason... however I would like the other resources to start, such as pop and mail and a couple of others... I figure it's better to have only one server dead, such as web, rather than all services dead...
My question is: is it possible to have heartbeat ignore a problem or error that occurs when starting up a service?
In v1 probably not, but other services/groups shouldn't be
affected. The point of having a service in a cluster is to make
it more available, right? So, if your service is unstable, it
should first be fixed.
It is hard to troubleshoot a problem when it occurs, as heartbeat gives up if it encounters one error....
Why should it be hard to troubleshoot? There are logs I guess.
Also, I noticed in the ha.cf file there is a comment that says "# Nodename must be same as uname -r."
So I have "Joe" and "Stewie" as my hostnames, but if I do a uname -r on either host I get this in return:
2.6.18-6-amd64
That must be a typo. It should read 'uname -n'.
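
A quick way to check this on each node is sketched below. It is only an illustration, not part of Heartbeat: the helper name node_names is made up here, the sample text stands in for the real ha.cf, and the comparison is an exact string match (whether Heartbeat itself treats case differently is not covered).

```python
import os

def node_names(ha_cf_text):
    """Return the names listed on uncommented `node` lines of an ha.cf text."""
    names = []
    for line in ha_cf_text.splitlines():
        # Drop anything after '#', then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "node":
            names.extend(fields[1:])
    return names

# Stand-in for the contents of ha.cf; on a real node you would read the file.
SAMPLE_HA_CF = """\
node    joe      # Node name must be same as uname -n.
node    stewie   # Node name must be same as uname -n.
"""

configured = node_names(SAMPLE_HA_CF)
local_name = os.uname().nodename  # the value that `uname -n` prints (Unix only)

print("nodes in ha.cf:", configured)
print("uname -n here: ", local_name)
if local_name not in configured:
    print("warning: this host's node name is not listed in ha.cf")
```

Running it on both machines shows at a glance whether the names in the node lines match what each host reports.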
Could this be an issue... here are my conf files....


ha.cf file
-------------------------------------
logfacility     daemon        # This is deprecated
keepalive 2                   # Interval between heartbeat (HB) packets.
deadtime 60                   # How quickly HB determines a dead node.
warntime 5                    # Time HB will issue a late HB.
initdead 120                  # Time delay needed by HB to report a dead node.
udpport 694                   # UDP port HB uses to communicate between nodes.
#ping 192.168.5.1             # Ping VMware Server host to simulate network resource.
bcast eth0
You need at least two comm links for production servers. Another
link could be your public network interface.
#baud 115200
#serial /dev/ttyS0              # Which interface to use for HB packets.
coredumps true
auto_failback off             # Auto promotion of primary node upon return to cluster.
node    joe      # Node name must be same as uname -r.
node    stewie      # Node name must be same as uname -r.
###
respawn hacluster /usr/lib/heartbeat/ipfail
# Specifies which programs to run at startup

------------------------------------------------------------
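
The comment about needing at least two comm links can be illustrated with a small check that counts the heartbeat paths declared in an ha.cf text. This is a rough sketch under stated assumptions: comm_links is a made-up helper, only the bcast, ucast, mcast and serial directives are considered, and the "suggested" variant (adding eth1 to the bcast line) is one possible reading of the advice, not a tested configuration.

```python
# Directives that declare a heartbeat communication path in ha.cf.
LINK_DIRECTIVES = ("bcast", "ucast", "mcast", "serial")

def comm_links(ha_cf_text):
    """List (directive, target) pairs for every uncommented comm link."""
    links = []
    for line in ha_cf_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2 or fields[0] not in LINK_DIRECTIVES:
            continue
        if fields[0] == "bcast":
            # A bcast line may name several interfaces; count each one.
            links.extend(("bcast", dev) for dev in fields[1:])
        else:
            links.append((fields[0], fields[1]))
    return links

# The link-related lines as posted: one broadcast path, serial commented out.
POSTED = """\
bcast eth0
#serial /dev/ttyS0
"""

# Hypothetical variant: heartbeat over the public interface as well.
SUGGESTED = """\
bcast eth0 eth1
"""

for label, text in (("posted", POSTED), ("suggested", SUGGESTED)):
    links = comm_links(text)
    verdict = "ok" if len(links) >= 2 else "only one heartbeat path"
    print(label, links, verdict)
```

With a single path, a failure of eth0 alone would leave the two nodes unable to see each other even though both are still up.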


haresources  file
------------------------------------------------------
joe IPaddr::xxx.xxx.xxx.150 \
drbddisk::mail Filesystem::/dev/drbd0::/var/mail/virtual::ext3::defaults apache2 mysql ispcp_daemon \
drbddisk::web Filesystem::/dev/drbd1::/var/www::ext3::defaults postfix courier-authdaemon courier-pop courier-imap
Looks like you put everything in a single group. You should try
to split them into several, if possible. For example, I'd assume
that drbddisk::mail and drbddisk::web don't depend on each other
and that various services depend on either the former or the
latter. Then create at least two groups. If all depend on the
IP address, then all have to be in a single group if you're
running a v1/haresources based configuration. In that case, you
would want to consider a v2/crm configuration. At any rate, you
may consider introducing an extra IP address for the second group
of services.
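
To make the grouping point concrete, the sketch below parses haresources text into groups (one group per logical line, with backslash continuations joined) and compares the posted single group with a hypothetical two-group layout. The helper resource_groups is made up for illustration, the second address yyy.yyy.yyy.yyy is a placeholder for an extra IP, and the assignment of services to the mail or web disk is a guess that would need to be checked against the real dependencies.

```python
def resource_groups(haresources_text):
    """Return (preferred_node, [resources]) for each logical haresources line."""
    groups, pending = [], ""
    for raw in haresources_text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue
        if line.endswith("\\"):
            # Continuation: keep collecting until a line without a backslash.
            pending += line[:-1] + " "
            continue
        fields = (pending + line).split()
        pending = ""
        if fields:
            groups.append((fields[0], fields[1:]))
    return groups

# The posted file: everything hangs off one line, so it is one group.
SINGLE_GROUP = r"""
joe IPaddr::xxx.xxx.xxx.150 \
drbddisk::mail Filesystem::/dev/drbd0::/var/mail/virtual::ext3::defaults apache2 mysql ispcp_daemon \
drbddisk::web Filesystem::/dev/drbd1::/var/www::ext3::defaults postfix courier-authdaemon courier-pop courier-imap
"""

# Hypothetical split: one group per DRBD device, each with its own IP address.
TWO_GROUPS = r"""
joe IPaddr::xxx.xxx.xxx.150 drbddisk::mail \
    Filesystem::/dev/drbd0::/var/mail/virtual::ext3::defaults \
    postfix courier-authdaemon courier-pop courier-imap
joe IPaddr::yyy.yyy.yyy.yyy drbddisk::web \
    Filesystem::/dev/drbd1::/var/www::ext3::defaults \
    apache2 mysql ispcp_daemon
"""

for label, text in (("posted", SINGLE_GROUP), ("split", TWO_GROUPS)):
    groups = resource_groups(text)
    print(label, "->", len(groups), "group(s)")
    for node, resources in groups:
        print("   ", node, len(resources), "resources, first:", resources[0])
```

In the split layout a problem starting apache2 would be confined to the web group, which is the separation described above; it comes at the cost of a second IP address.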

See http://linux-ha.org/LearningAboutHeartbeat,
http://linux-ha.org/HeartbeatTutorials, and
http://linux-ha.org/GettingStartedV2 for more information.

HTH,

Dejan
----------------------------------------------------------------------------------------------------------------------

Thanks to all for your help and have a great day!

--

Rob Morin
Dido Internet Inc.
Montreal,Canada
http://www.dido.ca
514-990-4444

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems