On Sun, 2008-06-01 at 10:10 +0200, Lars Ellenberg wrote:
> On Sat, May 31, 2008 at 03:59:12PM -0400, Rubin Bennett wrote:
> > Hello all!
> >
> > After lurking for a long time on this list, I have a question about a
> > failover pair I've been tasked with building up.
> >
> > It's a super straightforward setup with an IP address and one service
> > as shared resources, with one significant twist.
> >
> > The shared resource is a Scalix mailserver installation. It is managed
> > by 3 init scripts (/etc/init.d/(scalix|scalix-tomcat|scalix-postgres))
> > and runs on a shared IP address managed by heartbeat.
> > The message store is replicated via drbd and/or rsync done from LVM
> > snapshots on the primary node.
> >
> > The trick is the hostname: it *must* be identical on both nodes in order
> > for the failover to be successful. Due to the internals of a Scalix
> > message store, changing the hostname is next to impossible and very
> > strongly discouraged (even though it is documented on their website,
> > with strong wording like "This *may* work for you, but it will more
> > likely hose your message store. Remember, we told you not to try this!").
> >
> > So my dilemma is that I need to run heartbeat on 2 systems, both
> > configured with the same hostname (mail.domain.com). I have added
> > entries in /etc/hosts for the "native" ip addresses of the NICs and the
> > extra hostnames that correspond to each, but as far as I can tell, this
> > is not compatible with Heartbeat, as both nodes have the same output
> > from uname -n
>
> depending on what scalix does exactly,
> you may be able to do a trick:
>
> if you in fact can tell it to bind to "yo.ur.ip.ha",
> it is likely that it does a gethostbyaddr() on that ip,
> so you could just put that into your hosts on both nodes.
> you may need to put the node names as aliases on that line.
>
> I would expect it to work when you put in /etc/hosts:
> the order is important, put that ha-address last!
>
> 127.0.0.1 localhost
> official.ip1 fqdn.node1 node1
> official.ip2 fqdn.node2 node2
> scalix.ip.ha fqdn.scalix.ha node1 node2
>
> and in the scalix config only ever reference
> scalix.ip.ha or fqdn.scalix.ha
>
> > Has anyone done this, and do I have a snowball's chance in hell of
> > getting it to fly?
>
> It worked for me.
> well, not for scalix, though, did not do that yet.
> but for something similarly brain damaged,
> where it also was documented it was impossible...
>
> :)
>
> if it does not work,
> you need to put it into a virtual machine,
> and switch over that one.
> xen domU, kvm, openvz should all work.
>
> may be easier than to persuade a stubborn application
> to become "cluster aware".
>
Thanks for the suggestion! Unfortunately that doesn't work with Scalix: the
Scalix setup was done prior to Heartbeat, and the hostname must be set to
'scalix.domain.com' on the machine, as that string exists in myriad places
inside the Scalix message store.

I did get it to work though; here's how:

I set the hostname on both systems to scalix.domain.com.

In /etc/hosts I added the following:

    # The static address on eth0, primary host
    192.168.1.100   scalix1.domain.com   scalix1
    # The static address on eth0, secondary host
    192.168.1.110   scalix2.domain.com   scalix2
    # The shared address
    192.168.1.105   scalix.domain.com    scalix

In /etc/ha.d/ha.cf, I ran a standard setup, using the non-shared hostnames:

    node scalix1.domain.com
    node scalix2.domain.com

Then, I told heartbeat not to run on startup:

    chkconfig heartbeat off

Finally, I wrote a machine-specific wrapper script in /etc/init.d that
changes the hostname to the appropriate node name just long enough for
Heartbeat to start, then changes it back to scalix.domain.com:

heartbeat-wrapper:

    hostname scalix1.domain.com
    /etc/init.d/heartbeat start
    hostname scalix.domain.com

I've tested the failover on both sides once heartbeat is running and it
works perfectly.

And I *know* this is an ugly, ugly hack, but it also works, and saved me a
*ton* of work writing custom failover/IP monitoring stuff, so... whoohoo!

Ah, one more Heartbeat installation out in the wild :)

Thanks again to all who have contributed for a rockin' piece of software!

Rubin

--
Rubin Bennett
High Commander and Janitor
RB Technologies
http://thatitguy.com
[EMAIL PROTECTED]
(802)223-4448

"They that can give up essential liberty to obtain a little temporary
security deserve neither liberty nor safety"
  --Benjamin Franklin, Historical Review of Pennsylvania, 1759
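
For anyone who wants to try the same hack, here is a slightly more defensive
sketch of the wrapper for the primary node. It follows the three steps
described above; the function name, the NODE_NAME / SERVICE_NAME variables,
and the overridable HOSTNAME_CMD / HEARTBEAT_INIT settings are my own
additions for illustration (they let the script be dry-run without root),
not part of the script actually in use. It assumes, as in the setup above,
that Heartbeat only needs to see the per-node name while it starts.

```shell
#!/bin/sh
# Hypothetical sketch of /etc/init.d/heartbeat-wrapper for the primary node.
# On the secondary node, NODE_NAME would be scalix2.domain.com instead.
NODE_NAME="${NODE_NAME:-scalix1.domain.com}"      # name listed in ha.cf
SERVICE_NAME="${SERVICE_NAME:-scalix.domain.com}" # name Scalix expects

# Overridable for dry runs (an assumption added for this sketch).
HOSTNAME_CMD="${HOSTNAME_CMD:-hostname}"
HEARTBEAT_INIT="${HEARTBEAT_INIT:-/etc/init.d/heartbeat}"

start_heartbeat_as_node() {
    # Temporarily present the per-node name so it matches the ha.cf entry.
    $HOSTNAME_CMD "$NODE_NAME" || return 1

    # Start heartbeat and remember whether the init script succeeded.
    $HEARTBEAT_INIT start
    rc=$?

    # Always switch back to the shared name, even if heartbeat failed.
    $HOSTNAME_CMD "$SERVICE_NAME" || return 1
    return $rc
}

# Only act when explicitly asked to, so the file can also be sourced safely.
if [ "${1:-}" = "start" ]; then
    start_heartbeat_as_node
fi
```

A dry run that only prints the commands would be:

    HOSTNAME_CMD="echo hostname" HEARTBEAT_INIT="echo heartbeat" \
        sh heartbeat-wrapper start

The one design choice worth noting is that the hostname is restored whether
or not the init script succeeds, so a failed Heartbeat start does not leave
the machine carrying a name Scalix does not recognise.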
